mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 21:07:43 +00:00

224 Commits

Author SHA1 Message Date
Andrey Vagin
8bcffef6b9 proc_parse: parse fdinfo to get pos and flags
We are going to parse fdinfo for getting mnt_id,
so we can take there pos and flags and don't call
fcntl and lseek for that.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-09 16:42:53 +04:00
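The idea in 8bcffef6b9, reading `pos` and `flags` from fdinfo text instead of calling lseek and fcntl, can be sketched roughly as below. This is a minimal illustration, not CRIU's parser: `struct fdinfo_basic` and `parse_fdinfo_basic` are hypothetical names, and the input is assumed to be the fdinfo contents already read into a string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the two values; not CRIU's real structure. */
struct fdinfo_basic {
	unsigned long long pos;
	unsigned int flags;	/* open flags; fdinfo prints them in octal */
};

/*
 * Scan fdinfo text line by line for "pos:" and "flags:".
 * Returns 0 when both fields were found, -1 otherwise.
 */
static int parse_fdinfo_basic(const char *text, struct fdinfo_basic *out)
{
	int have_pos = 0, have_flags = 0;
	const char *line = text;

	while (line && *line) {
		if (sscanf(line, "pos: %llu", &out->pos) == 1)
			have_pos = 1;
		else if (sscanf(line, "flags: %o", &out->flags) == 1)
			have_flags = 1;

		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}

	return (have_pos && have_flags) ? 0 : -1;
}
```

Unknown lines such as `mnt_id:` are simply skipped, so the same pass could be extended to pick up further fields.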
Jamie Liu
efe594f8f4 criu: fix filemap open permissions
An mmaped file is opened O_RDONLY or O_RDWR depending on the permissions
on the first vma dump_task_mm() encounters mapping that file. This
causes two problems:

1. If a file has multiple MAP_SHARED mappings, some of which are
   read-only and some of which are read-write, and the first encountered
   mapping happens to be read-only, the file will be opened O_RDONLY
   during restore, and mmap(PROT_WRITE) will fail with EACCES, causing
   the restore to fail.

2. If a file is opened read-write and mapped read-only, it will be
   opened O_RDONLY during restore, so restore will succeed, but
   mprotect(PROT_WRITE) on the read-only mapping after restore will
   fail.

To fix both of these, record open flags per-vma based on the presence of
VM_MAYWRITE in smaps.

Signed-off-by: Jamie Liu <jamieliu@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-04 20:35:48 +04:00
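The per-vma decision described in efe594f8f4 might look roughly like the sketch below. It assumes the `VmFlags:` line of smaps, where the kernel reports VM_MAYWRITE as the two-letter token `mw`; the function name `open_flags_from_vmflags` is hypothetical and this is not the code from the commit.

```c
#include <fcntl.h>
#include <string.h>

/* Return nonzero if c ends a token on the VmFlags line. */
static int is_token_end(char c)
{
	return c == ' ' || c == '\t' || c == '\n' || c == '\0';
}

/*
 * Given a "VmFlags:" line from smaps, choose the open flags for the
 * mapped file: O_RDWR when the "mw" (may write) token is present,
 * O_RDONLY otherwise.
 */
static int open_flags_from_vmflags(const char *vmflags_line)
{
	const char *p = strchr(vmflags_line, ':');

	if (!p)
		return O_RDONLY;
	p++;

	while (*p) {
		/* skip separators before the next token */
		while (*p == ' ' || *p == '\t' || *p == '\n')
			p++;
		if (!*p)
			break;
		if (p[0] == 'm' && p[1] == 'w' && is_token_end(p[2]))
			return O_RDWR;
		/* not "mw": skip to the end of this token */
		while (!is_token_end(*p))
			p++;
	}

	return O_RDONLY;
}
```

Keying on "may write" rather than the current protection covers both cases from the message: a later mprotect(PROT_WRITE) only works if the file was opened read-write.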
Cyrill Gorcunov
8ed215d252 proc_parse: Borrow vmi iif there is file referenced
Otherwise we might propagate the previous vfi status
to vmas which actually don't match.

 | (00.005471) 0x2b79227d6000-0x2b79227d8000 (8K) prot 0x5 flags 0x22 off 0 reg vdso ap  shmid: 0
 | (00.005473) 0x2b79227d8000-0x2b79227da000 (8K) prot 0x3 flags 0x22 off 0 reg vdso ap  shmid: 0
 | (00.005475) 0x2b79227f1000-0x2b79227f2000 (4K) prot 0x3 flags 0x22 off 0 reg vdso ap  shmid: 0
 | (00.005476) 0x2b79227f2000-0x2b79227f4000 (8K) prot 0x3 flags 0x22 off 0 reg vdso ap  shmid: 0

Tested-by: Pavel Tikhomirov <snorcht@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-06 21:32:15 +04:00
Andrey Vagin
1934e1a963 posix-timer: take into account that sival_ptr can be NULL (v3)
ID: 0
signal: 26/          (null)
notify: signal/pid.5954
ClockID: 1

fscanf "%p" doesn't handle "(null)".

https://bugzilla.openvz.org/show_bug.cgi?id=2894

v2: make the original scanf be %d/%s and then additionally
    parse the obtained string
v3: don't use strstr

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-18 19:35:14 +04:00
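The v2/v3 approach from 1934e1a963 (scan the line as %d/%s, then parse the string separately) can be sketched as follows. The function name `parse_timer_signal` and the 64-byte buffer are assumptions made for illustration; the actual CRIU code may differ.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse a "signal: <signo>/<sival_ptr>" line from /proc/<pid>/timers.
 * The kernel prints "(null)" for a NULL sival_ptr, which "%p" in
 * scanf does not accept, so the pointer is first read as a string.
 * Returns 0 on success, -1 on a malformed line.
 */
static int parse_timer_signal(const char *line, int *signo, void **sival_ptr)
{
	char buf[64];

	if (sscanf(line, "signal: %d/%63s", signo, buf) != 2)
		return -1;

	if (strcmp(buf, "(null)") == 0) {
		*sival_ptr = NULL;
		return 0;
	}

	return sscanf(buf, "%p", sival_ptr) == 1 ? 0 : -1;
}
```

The `%63s` conversion skips the run of spaces after the slash, so the padded form shown in the commit message is handled without strstr.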
Pavel Emelyanov
95c2a4d641 proc_parse: Set bigger buffer for smaps FILE
We spend a lot of time reading the /proc/$pid/smaps file. The time
is spent in two places:

1 kernel puts too much info into it
2 fgets pulls info in 1024-bytes chunks, info about one vma is
  typically bigger (up to 3k bytes) thus we call read() ~3 times
  per one vma, which increases the amount of time spent in kernel
  to re-fill this info

Setting the internal buffer to PAGE_SIZE reduces the number of
read()-s by ~60% during a basic container dump. Setting a bigger buffer
doesn't help, as the kernel's seq file engine feeds at most one page of
data per read syscall regardless of the buffer size.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-02-14 16:46:15 +04:00
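Giving a stdio stream a page-sized buffer, as 95c2a4d641 describes, can be done with the standard `setvbuf` call. The helper below is a sketch under assumptions: `fopen_page_buffered` is a hypothetical name and the buffer is heap-allocated here, which is not necessarily what CRIU does.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Open a file for reading with a page-sized, fully buffered stream.
 * On success the caller owns *bufp and must free() it after fclose().
 */
static FILE *fopen_page_buffered(const char *path, char **bufp)
{
	long page = sysconf(_SC_PAGESIZE);
	FILE *f;

	*bufp = NULL;
	if (page <= 0)
		page = 4096;	/* fallback if the page size is unavailable */

	f = fopen(path, "r");
	if (!f)
		return NULL;

	*bufp = malloc((size_t)page);
	/* setvbuf must be called before any other operation on the stream */
	if (!*bufp || setvbuf(f, *bufp, _IOFBF, (size_t)page) != 0) {
		fclose(f);
		free(*bufp);
		*bufp = NULL;
		return NULL;
	}

	return f;
}
```

After this, each `fgets` on the stream is served from the page-sized buffer, so one read() can cover a whole vma record instead of several 1024-byte refills.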
Pavel Emelyanov
0b98d87bf1 proc-parse: Fix 32-bit printing of vma addresses
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-07 20:51:21 +04:00
Pavel Emelyanov
fd41201975 restore: Parse /proc/self/maps for self mappings
On restore we only need to know the current task mappings' start and end
to find where to put the restorer blob. And since the smaps file in
/proc/pid is up to 3 times slower than the maps one, it makes
perfect sense just to parse the latter one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-07 13:32:21 +04:00
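Since only the start and end of each mapping are needed on restore (fd41201975), a maps line can be reduced to a single `sscanf` call. The sketch below uses a hypothetical helper name, `parse_maps_range`, and ignores the remaining fields of the line.

```c
#include <stdio.h>

/*
 * Extract the "start-end" address range from one /proc/<pid>/maps line.
 * Returns 0 on success, -1 if the line does not start with a valid range.
 */
static int parse_maps_range(const char *line,
			    unsigned long *start, unsigned long *end)
{
	if (sscanf(line, "%lx-%lx", start, end) != 2)
		return -1;

	return *start < *end ? 0 : -1;
}
```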
Pavel Emelyanov
44a0fe499a proc-parse: Fix 32-bit compilation
Broken by bbab13eb

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 20:40:02 +04:00
Pavel Emelyanov
eb1ae0a025 vma: Turn embedded VmaEntry on vma_area into pointer
On restore we will read all VmaEntries in one big MmEntry object,
so to avoid copying them all into vma_areas, make them be pointable.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 11:44:01 +04:00
Cyrill Gorcunov
c643ed76e7 proc_parse: Speedup VMA range parsing
In /proc/<pid>/smaps/ output we may omit testing
for capital hex letters, since we know the format
kernel provides.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 18:26:24 +04:00
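The shortcut in c643ed76e7 relies on the kernel printing addresses in lower-case hex. A hand-rolled parser that accepts only `0-9` and `a-f` could look like the sketch below; `parse_lower_hex` is a hypothetical name, and overflow checking is deliberately left out.

```c
/*
 * Parse a lower-case hexadecimal number and report where parsing stopped.
 * Capital letters are intentionally not handled: the input is assumed
 * to come from /proc/<pid>/smaps, which uses lower-case digits.
 */
static unsigned long parse_lower_hex(const char *s, const char **end)
{
	unsigned long v = 0;

	for (;; s++) {
		char c = *s;

		if (c >= '0' && c <= '9')
			v = (v << 4) | (unsigned long)(c - '0');
		else if (c >= 'a' && c <= 'f')
			v = (v << 4) | (unsigned long)(c - 'a' + 10);
		else
			break;	/* first non-hex character ends the number */
	}

	*end = s;
	return v;
}
```

Calling it once for the start address and once more from `end + 1` (past the dash) yields both ends of a range.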
Pavel Emelyanov
608db864a3 vmas: Don't call stat on vm file twice
When parsing mappings in proc, we fstat vm file, later,
when dumping it, we stat it again to fill fd_parms.
The 2nd stat is not required, we can keep the stat in
vma_area.

This removed 35% of all stat calls on dump of basic container.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 00:18:32 +04:00
Pavel Emelyanov
740eb9c101 proc-parse: Don't open and stat every single map_files link
Quite a lot of VMAs in tasks map the same file with different
perms. In that case we may skip opening all these files, but
"borrow" one from the previous VMA parsed.

There's little sense in seeking more than just the previous VMA,
as same files are rarely (can be though) mapped in different
locations.

After this on a basic Centos6 container the number of opens and
stats in this function drops from ~1500 to ~500.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-31 20:31:06 +04:00
Pavel Emelyanov
bbab13ebdb proc: Helper for opening vma's file
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-31 20:30:47 +04:00
Pavel Emelyanov
446ab1fc64 irmap: Don't let proc-parse free handle's mem before caching
Bug was introduced by on-disk-irmap-cache patch. The proc-parse
routine allocates memory for handle, calls ->cb then frees handle.

The problem is that the cb in case of pre-dump saves the handle
for future reference. So, when it is used later, the handle's mem
happens to be corrupted.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-31 08:50:28 +04:00
Pavel Emelyanov
c18c733b7c proc-parse: Parse pid's fdinfo entries
The existing code opens "self" and parses what's in there,
just twist the code a little to accept generic pid.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
c5d2386a2f btrfs: Remove volume parsing code
Now we have more robust and fs agnostic path-resolution
engine for resolving dev conflicts.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-11 17:18:59 +04:00
Pavel Emelyanov
3708ecb499 mount: Introduce generic FSs parsing callback
And make use of it for btrfs.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2013-12-04 20:03:17 +04:00
Cyrill Gorcunov
c9069ba09f proc_parse: Call for btrfs_parse_mountinfo on every mount
This helper will cause BTRFS engine to collect all subvolumes.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-04 19:23:31 +04:00
Cyrill Gorcunov
eeb21b8a34 proc_parse: Remember a type of FS provided by a kernel
We will need it for btrfs handling. Also print out the
FS type for easier debug

 | (00.003545)     type unsupported (cgroup) source cgroup 1c / @ /sys/fs/cgroup/blkio flags 30000e options blkio,
 | (00.003558)     type unsupported (cgroup) source cgroup 1d / @ /sys/fs/cgroup/perf_event flags 30000e options perf_event,
 | (00.003571)     type unsupported (cgroup) source cgroup 1e / @ /sys/fs/cgroup/hugetlb flags 30000e options hugetlb,
 | (00.003584)     type unsupported (ext4) source /dev/sda2 800002 / @ / flags 300000 options data=ordered,
 | (00.003670)     type tmpfs (tmpfs) source tmpfs 20 / @ /tmp flags 100000 options
 | (00.003696)     type unsupported (mqueue) source mqueue d / @ /dev/mqueue flags 300000 options

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-04 18:18:21 +04:00
Cyrill Gorcunov
38c34aae5e proc_parse: Don't setup ret = 0 early
If we fail in xmalloc, the function accidentally returns
0, meaning that everything is fine. Don't do that, wait
until the routine completes.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-03 22:49:30 +04:00
Cyrill Gorcunov
654131626e proc_parse: Delay freeing of kernel fs type
We will need it for additional handling once parsing
of mount entry is complete (in particular btrfs requires
additional processing to figure out subvolumes names).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-29 15:36:12 +04:00
Cyrill Gorcunov
ce3b4aaafa headers: Move MADV definitions to own mman.h
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-14 22:48:30 +04:00
Andrey Vagin
a434e7f075 crtools: move pid_rst_prio to pid.h
crtools.h is too heavy to be included in many sources

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 18:18:12 +04:00
Andrey Vagin
0d1dfc2e08 crtools: move all stuff about vma together
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:43:49 +04:00
Andrey Vagin
a6edbcf669 crtools: don't include restorer.h in proc_parse.h
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:37:55 +04:00
Cyrill Gorcunov
2503fd7da5 proc: parse -- Fix length for smaps maj/min parsing
Otherwise

 | Error (proc_parse.c:227): Can't parse: 555555554000-555555577000 r-xp 00000000 b6:d2f61 133545                  /sbin/init

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-14 13:20:35 +04:00
Cyrill Gorcunov
d393e5d137 flock: Don't assume device maj/min numbers are byte long
Igor reported the following file lock

 | (00.003139) lockinfo: 4:POSIX ADVISORY WRITE 46567 b6:5f0b1:524702 0 EOF
 | (00.003154) lockinfo: 5:POSIX ADVISORY WRITE 46559 b6:5f0b1:524661 0 EOF
 | (00.003172) lockinfo: 6:POSIX ADVISORY WRITE 46543 b6:5f0b1:524326 0 0
 | (00.003188) lockinfo: 7:POSIX ADVISORY WRITE 46543 b6:5f0b1:524367 0 EOF

where the device maj number is pretty big and parsing failed.
Fix it by removing the field width.

Reported-by: Igor Sukhih <igor@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-11 18:37:14 +04:00
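The effect of a field width on the lock lines quoted in d393e5d137 can be reproduced with plain `sscanf`. The two format strings below are assumptions chosen for illustration (the commit message does not quote the original format), and `struct lock_dev` is a hypothetical type.

```c
#include <stdio.h>

/* Hypothetical holder for the "maj:min:inode" triple of a lock entry. */
struct lock_dev {
	unsigned int maj, min;
	unsigned long ino;
};

/* Width-free parsing: accepts device numbers of any length. */
static int parse_lock_dev(const char *s, struct lock_dev *d)
{
	return sscanf(s, "%x:%x:%lu", &d->maj, &d->min, &d->ino) == 3 ? 0 : -1;
}

/* Two-digit field widths: fails once a number needs more than two digits. */
static int parse_lock_dev_fixed_width(const char *s, struct lock_dev *d)
{
	return sscanf(s, "%2x:%2x:%lu", &d->maj, &d->min, &d->ino) == 3 ? 0 : -1;
}
```

With `b6:5f0b1:524702`, the fixed-width variant consumes only `5f` for the second field and then fails to match the following colon, while the width-free variant reads all three values.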
Andrey Vagin
bf7c171a8b arm: mark the "[vectors]" vma as VMA_AREA_VSYSCALL
This vma looks like VSYSCALL on x86. We don't need to dump and restore it.

Currently this vma is dumped and restored as a private vma, but it is not
remapped to the correct place:
Restore
 --- dump/pipe00/6392/1/dump.maps	2013-09-23 12:49:19.436816192 +0000
 +++ dump/pipe00/6392/1/restore.maps	2013-09-23 12:49:20.276766356 +0000
 @@ -6,5 +6,6 @@ e05000-e26000
  4009d000-4009f000
  400a0000-400aa000
  400b8000-401e7000
 +b6d6f000-b6d70000
  be838000-be859000
  ffff0000-ffff1000
ERROR: Sets of mappings differ:

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 17:51:46 +04:00
Andrey Vagin
dbd41b522f proc: allow parse_thread to use an existent buffer
parse_thread allocated a buffer for threads and then it initialized the real
pid for each thread.

Now we want to use it on restore, and at this moment we already have
a buffer with initialized virt pid-s, so we need to initialize the real
pid-s only.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:04 +04:00
Andrey Vagin
3e5ad587f4 parse_proc: move parse_threads from cr-dump.c
It will be used in cr-restore.c for stopping threads on the exit from
sigreturn.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:03 +04:00
Andrey Vagin
8f18db5f6a mm: don't add a guard page if a prev vma grows-down too
The kernel logic is described in v2.6.36-rc1-161-g7798330:
"If we've split the stack vma, only the lowest one has the guard page."

https://bugzilla.openvz.org/show_bug.cgi?id=2715
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 14:22:35 +04:00
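The rule quoted in 8f18db5f6a, that only the lowest piece of a split stack carries the guard page, can be sketched like this. The structure, the 4096-byte page constant and the adjacency check are assumptions for illustration; they are not taken from the commit itself.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* Hypothetical, simplified vma description. */
struct vma_sketch {
	unsigned long start, end;
	bool grows_down;
};

/*
 * Extend a grows-down vma by one guard page, unless the previous vma
 * is itself grows-down and ends exactly where this one starts. In that
 * case the stack was split and only the lowest piece has the guard page.
 */
static void add_guard_page(struct vma_sketch *vma, const struct vma_sketch *prev)
{
	if (!vma->grows_down)
		return;

	if (prev && prev->grows_down && prev->end == vma->start)
		return;

	vma->start -= SKETCH_PAGE_SIZE;
}
```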
Andrey Vagin
0e0a398959 proc: fix memory leak
timer is not freed in case of eof.

CID 1042301 (#1 of 1): Resource leak (RESOURCE_LEAK)
15. leaked_storage: Variable timer going out of scope leaks the storage it points to.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-08 13:40:23 +04:00
Andrey Vagin
8bbf64ebfa posix-timer: make parser a bit more readable
Cc: Pavel Tikhomirov <snorcht@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-03 13:36:08 +04:00
Andrey Vagin
f70859e2fc posix-timers: don't call ferror for NULL
parse_posix_timers should not call ferror if fopen returned NULL.

Reported-by: Adrian Reber <adrian@lisas.de>
Cc: Pavel Tikhomirov <snorcht@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-02 20:25:14 +04:00
Pavel Emelyanov
f8464fdafe timers: Split posix timers restore into two stages
1st stage is -- creating the timers. It may fail if kernel
allocated IDs in a manner we don't expect or runs out of
memory.

2nd stage is -- arm the timers. It cannot fail, since we've
validated the timespecs in advance and should happen after
we've waited for all the other tasks to complete the restore.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-01 20:02:15 +04:00
Pavel Emelyanov
db77402ae0 proc: Use open_proc helper to open timers file
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-01 16:26:30 +04:00
Pavel Tikhomirov
d992960fa7 posix-timer: Parse proc /proc/<pid>/timers and save info in list
Signed-off-by: Pavel Tikhomirov <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-01 16:19:26 +04:00
Cyrill Gorcunov
20b39341ca proc: Don't mark mishinted vdso
The [vdso] mark in procfs output is not reliable,
so since we know which prot it should have, escape
obvious mishints.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-24 13:27:21 +04:00
Andrey Vagin
653053b40c proc: use vma flags for determining vmas with MAP_GROWSDOWN
When a kernel didn't show vma flags, we set MAP_GROWSDOWN for stack
vmas, but it's not reliable. E.g. thread stacks are mapped without
MAP_GROWSDOWN.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-21 19:03:57 +04:00
Andrey Vagin
84e84cbb65 proc: add vma_area in a list after parsing all parameters
The size of vma can be changed after parsing flags. For example we need
to add a guard page for vma with MAP_GROWSDOWN.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-21 19:03:48 +04:00
Libo Chen
7071d52088 filelock: fix potential fl memleak
When is_blocked is set, we should free the file_lock.

Signed-off-by: Libo Chen <libo.chen@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-16 17:07:00 +04:00
Cyrill Gorcunov
921dbf23de Don't use \Newline in pr_perror
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-02 22:44:24 +04:00
Vasily Averin
9c88d0cdd8 proc_parse: double fclose in parse_fdinfo
"crtools check" crashes on ubuntu 12.10
(00.011275) Error (proc_parse.c:1049): No records of type 6 found in fdinfo file
(00.011281) Error (proc_parse.c:1052): parse_fdinfo: error parsing [flags:  02 ] for 6 : Operation not permitted
*** glibc detected *** /home/vvs/devel/criu/crtools/crtools: double free or corruption (top): 0x000000000068a5a0 **

Signed-off-by: Vasily Averin <vvs@parallels.com>

diff-double-fclose-in-parse_fdinfo
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-29 16:10:50 +04:00
Pavel Emelyanov
5b343b40eb kerndat: Introduce the storage of kernel run-time info
One of such things we use right now is the device for anon shmem
mappings backing. In the future this can be extended to check for
various kernel features.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-15 13:02:09 +04:00
Cyrill Gorcunov
ea438d3f86 proc-parse: Rework error paths
On error paths we don't explicitly close procfile.

CID 996191 (#5 of 6): Resource leak (RESOURCE_LEAK)
22. leaked_storage: Variable "f" going out of scope leaks the storage it points to.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-12 20:50:44 +04:00
Andrey Vagin
6a49f82fb6 mount: free all parts of mnt entries
mnt_entry contains a few strings and they should be released too

CID 996198 (#4 of 4): Resource leak (RESOURCE_LEAK)
20. leaked_storage: Variable "pm" going out of scope leaks the storage
it points to.

CID 996190 (#1 of 1): Resource leak (RESOURCE_LEAK)
13. leaked_storage: Variable "new" going out of scope leaks the storage
it points to.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-10 01:26:59 +04:00
Andrey Vagin
4ef152447c proc_parse: allocate memory for the terminating null of the string
CID 996207 (#1 of 1): Out-of-bounds access (OVERRUN)
5. alloc_strlen: Allocating insufficient memory for the terminating null of the string.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-05 08:18:57 +04:00
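The overrun class reported for 4ef152447c comes from allocating `strlen(s)` bytes and then copying the terminator as well. A minimal sketch of the correct sizing, with a hypothetical helper name `dup_string`:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Duplicate a string. strlen() does not count the terminating null,
 * so the allocation must be one byte larger than the length.
 */
static char *dup_string(const char *s)
{
	size_t len = strlen(s);
	char *copy = malloc(len + 1);

	if (!copy)
		return NULL;

	memcpy(copy, s, len + 1);	/* copies the terminator too */
	return copy;
}
```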
Pavel Emelyanov
4d0b24b52a vma: Keep track of longest vma in list and sum of its lengths
I will have to push some sort of map of pages to dump into parasite.
For this, I need to have an estimation of how much memory I'd need for
that in parasite args. These two values will help with it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-03-01 20:12:33 +04:00
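Keeping the count, the longest vma and the total length together in one list object (4d0b24b52a, building on b71f9e80be) might be sketched as follows. The type and field names are hypothetical and byte lengths are used for simplicity; the real structure in CRIU may track different units.

```c
/* Hypothetical summary kept alongside a list of vmas. */
struct vma_list_sketch {
	unsigned int nr;		/* number of vmas accounted */
	unsigned long longest;		/* length of the longest vma, in bytes */
	unsigned long total_len;	/* sum of all vma lengths, in bytes */
};

/* Account one vma, given as [start, end), in the list summary. */
static void vma_list_account(struct vma_list_sketch *l,
			     unsigned long start, unsigned long end)
{
	unsigned long len = end - start;

	l->nr++;
	l->total_len += len;
	if (len > l->longest)
		l->longest = len;
}
```

Updating the summary while the list is built means a later size estimate needs no second pass over the vmas.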
Pavel Emelyanov
b71f9e80be vma: Introduce list-of-vmas object
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-03-01 20:11:51 +04:00
Cyrill Gorcunov
fcb9a9bfb1 cpu: Make cpu routines being per-arch
They really depend on the CPU we're running on.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-18 18:42:08 +04:00