We are going to parse fdinfo for getting mnt_id,
so we can take there pos and flags and don't call
fcntl and lseek for that.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
An mmaped file is opened O_RDONLY or O_RDWR depending on the permissions
on the first vma dump_task_mm() encounters mapping that file. This
causes two problems:
1. If a file has multiple MAP_SHARED mappings, some of which are
read-only and some of which are read-write, and the first encountered
mapping happens to be read-only, the file will be opened O_RDONLY
during restore, and mmap(PROT_WRITE) will fail with EACCES, causing
the restore to fail.
2. If a file is opened read-write and mapped read-only, it will be
opened O_RDONLY during restore, so restore will succeed, but
mprotect(PROT_WRITE) on the read-only mapping after restore will
fail.
To fix both of these, record open flags per-vma based on the presence of
VM_MAYWRITE in smaps.
Signed-off-by: Jamie Liu <jamieliu@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
ID: 0
signal: 26/ (null)
notify: signal/pid.5954
ClockID: 1
fscanf "%p" doesn't handle "(null)".
https://bugzilla.openvz.org/show_bug.cgi?id=2894
v2: make the original scanf be %d/%s and then additionally
parse the obtained string
v3: don't use strstr
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We spend a lot of time reading the /proc/$pid/smaps file. The time
is spent in two places:
1 kernel puts too many info into it
2 fgets pulls info in 1024-bytes chunks, info about one vma is
typically bigger (up to 3k bytes) thus we call read() ~3 times
per one vma, which increases the amount of time spent in kernel
to re-fill this info
Setting the internal buffer to PAGE_SIZE size reduces the amount of
read()-s on ~60% during basic container dump. Setting bigger buffer
doesn't work, as kernel's seq file engine feeds at most one page of
data per read syscall regardless of the buffer size.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
On restore we only need to know currnet task mappings' start and end
to find where to put the restorer blob. And since the smaps file in
/proc/pid is up to 3 times slower, than the maps one, it makes
perfect sense just to parse the latter one.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On restore we will read all VmaEntries in one big MmEntry object,
so to avoif copying them all into vma_areas, make them be pointable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In /proc/<pid>/smaps/ output we may omit testing
for capital hex letters, since we know the format
kernel provides.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When parsing mappings in proc, we fstat vm file, later,
when dumping it, we stat it again to fill fd_parms.
The 2nd stat is not required, we can keep the stat in
vma_area.
This removed 35% of all stat calls on dump of basic container.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Quite a lot of VMAs in tasks map the same file with different
perms. In that case we may skip opening all these files, but
"borrow" one from the previous VMA parsed.
There's little sense in seeking more that just previous VMA,
as same files are rarely (can be though) mapped in different
locations.
After this on a basic Centos6 container the number of opens and
stats in this function drops from ~1500 to ~500.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Bug was introduced by on-disk-irmap-cache patch. The proc-parse
routine allocates memory for handle, calls ->cb then frees handle.
The problem is that the cb in case of pre-dump saves the handle
for future reference. So, in this future handle's mem happen to
be corrupted.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The existing code opens "self" and parses what's in there,
just twist the code a little to accept generic pid.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This helper will cause BTRFS engine to collect all subvolumes.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If we fail in xmalloc the function occasionally return
0 meaning that everything is fine. Don't do that, wait
until routine complete.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need it for additional handling once parsing
of mount entry is complete (in particular btrfs requires
additional processing to figure out subvolumes names).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
crtools.h is too heavy to be included in many sources
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This vma looks like VSYSCALL on x86. We don't need to dump and restore it.
Currently this vma is dumped and restored as a private vma, but it is not
remmaped in a correct place:
Restore
--- dump/pipe00/6392/1/dump.maps 2013-09-23 12:49:19.436816192 +0000
+++ dump/pipe00/6392/1/restore.maps 2013-09-23 12:49:20.276766356 +0000
@@ -6,5 +6,6 @@ e05000-e26000
4009d000-4009f000
400a0000-400aa000
400b8000-401e7000
+b6d6f000-b6d70000
be838000-be859000
ffff0000-ffff1000
ERROR: Sets of mappings differ:
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
parse_thread allocated a buffer for threads and then it initialized read
pid for each thread.
Now we want to use it on restore and in this moment we already have
a buffer with initialized virt pid-s, so we need to initialize read
pid-s only.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used in cr-restore.c for stopping threads on the exit from
sigreturn.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The kernel logic is described in v2.6.36-rc1-161-g7798330:
"If we've split the stack vma, only the lowest one has the guard page."
https://bugzilla.openvz.org/show_bug.cgi?id=2715
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
timer is not freed in case of eof.
CID 1042301 (#1 of 1): Resource leak (RESOURCE_LEAK)
15. leaked_storage: Variable timer going out of scope leaks the storage it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
parse_posix_timers should not call ferror if fopen returned NULL.
Reported-by: Adrian Reber <adrian@lisas.de>
Cc: Pavel Tikhomirov <snorcht@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
1st stage is -- creating the timers. It may fail if kernel
allocated IDs in a manner we don't expect or runs out of
memory.
2nd stage is -- arm the timers. It cannot fail, since we've
validated the timespecs in advance and should happen after
we've waited for all the other tasks to complete the restore.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The [vdso] mark in procfs output is not reliable,
so since we know which prot it should has, escape
obvious mishints.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When a kernel didn't show vma flags, we set MAP_GROWSDOWN for stack
vmas, but it's not reliable. E.g. thread stacks are mapped without
MAP_GROWSDOWN.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The size of vma can be changed after parsing flags. For example we need
to add a guard page for vma with MAP_GROWSDOWN.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
when is_blocked is seted, we should free file_lock
Signed-off-by: Libo Chen <libo.chen@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
"crtools check" crashes on ubuntu 12.10
(00.011275) Error (proc_parse.c:1049): No records of type 6 found in fdinfo file
(00.011281) Error (proc_parse.c:1052): parse_fdinfo: error parsing [flags: 02 ] for 6 : Operation not permitted
*** glibc detected *** /home/vvs/devel/criu/crtools/crtools: double free or corruption (top): 0x000000000068a5a0 **
Signed-off-by: Vasily Averin <vvs@parallels.com>
diff-double-fclose-in-parse_fdinfo
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
One of such things we use right now is the device for anon shmem
mappings backing. In the furure this can be extended to check for
various kernel features.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On error paths we don't explicitly close procfile.
CID 996191 (#5 of 6): Resource leak (RESOURCE_LEAK)
22. leaked_storage: Variable "f" going out of scope leaks the storage it points to.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
mnt_entry contains a few strings and they should be release too
CID 996198 (#4 of 4): Resource leak (RESOURCE_LEAK)
20. leaked_storage: Variable "pm" going out of scope leaks the storage
it points to.
CID 996190 (#1 of 1): Resource leak (RESOURCE_LEAK)
13. leaked_storage: Variable "new" going out of scope leaks the storage
it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 996207 (#1 of 1): Out-of-bounds access (OVERRUN)
5. alloc_strlen: Allocating insufficient memory for the terminating null of the string.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I will have to push some sort of map of pages to dump into parasite.
For this, I need to have estimation of how much memory I'd need for
than in parasite args. These two values will help with it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are really depends on CPU we're running on.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>