mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 12:57:57 +00:00

208 Commits

Author SHA1 Message Date
Andrey Vagin
34cb65ce5d mount: handle a case when a source argument is empty (v2)
For example:
mount -t tmpfs "" test

v2: don't leak memory

Reported-by: Ross Boucher <boucher@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-21 16:44:23 +03:00
Tycho Andersen
209693d49b don't assume the kernel has CONFIG_SECCOMP
linux/seccomp.h may not be available, and the seccomp mode might not be
listed in /proc/pid/status, so let's not assume those two things are
present.

v2: add a seccomp.h with all the constants we use from linux/seccomp.h
v3: don't do a compile time check for PTRACE_O_SUSPEND_SECCOMP, just let
    ptrace return EINVAL for it; also add a checkskip to skip the
    seccomp_strict test if PTRACE_O_SUSPEND_SECCOMP or linux/seccomp.h
    aren't present.
v4: use criu check --feature instead of checkskip to check whether the
    kernel supports seccomp_suspend
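
The fallbacks described in v2 could be sketched roughly as below. This is not the actual patch: the constant values mirror the kernel's uapi headers, while parse_seccomp_mode() is a hypothetical helper that works on the text of /proc/pid/status and treats a missing "Seccomp:" line as "disabled".

```c
#include <stdio.h>
#include <string.h>

/* Fallback definitions for builds where linux/seccomp.h is unavailable.
 * The values mirror the kernel's uapi headers. */
#ifndef SECCOMP_MODE_DISABLED
#define SECCOMP_MODE_DISABLED 0
#endif
#ifndef SECCOMP_MODE_STRICT
#define SECCOMP_MODE_STRICT 1
#endif
#ifndef SECCOMP_MODE_FILTER
#define SECCOMP_MODE_FILTER 2
#endif
#ifndef PTRACE_O_SUSPEND_SECCOMP
#define PTRACE_O_SUSPEND_SECCOMP (1 << 21)
#endif

/* Hypothetical helper: scan the text of /proc/<pid>/status for the
 * "Seccomp:" line. A kernel without CONFIG_SECCOMP does not print it,
 * so a missing line means "disabled" rather than an error. */
static int parse_seccomp_mode(const char *status_text)
{
	const char *line = status_text;

	while (line && *line) {
		int mode;

		if (sscanf(line, "Seccomp: %d", &mode) == 1)
			return mode;

		/* Advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return SECCOMP_MODE_DISABLED;
}
```

With no compile time check, an unsupported PTRACE_O_SUSPEND_SECCOMP is simply reported by ptrace at run time, as v3 describes.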

Reported-by: Mr. Jenkins
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:50:35 +03:00
Tycho Andersen
e0b24e21d3 creds: fail to dump when creds in thread group don't match
Since we don't support dumping per-thread creds, let's at least fail to
dump if the creds don't match.
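
A minimal sketch of such a check, assuming a simplified and hypothetical struct thread_creds (the real credential set has more fields, and the names below are not taken from the patch):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-thread credential set */
struct thread_creds {
	unsigned int uid, euid, suid, fsuid;
	unsigned int gid, egid, sgid, fsgid;
};

static bool creds_equal(const struct thread_creds *a,
			const struct thread_creds *b)
{
	return a->uid == b->uid && a->euid == b->euid &&
	       a->suid == b->suid && a->fsuid == b->fsuid &&
	       a->gid == b->gid && a->egid == b->egid &&
	       a->sgid == b->sgid && a->fsgid == b->fsgid;
}

/* Returns 0 when every thread has the same creds as the first one,
 * -1 on the first mismatch so that the dump can be aborted. */
static int check_thread_creds(const struct thread_creds *creds, size_t nr)
{
	size_t i;

	for (i = 1; i < nr; i++)
		if (!creds_equal(&creds[0], &creds[i]))
			return -1;

	return 0;
}
```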

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:38:35 +03:00
Tycho Andersen
0d8aec0c3a seccomp: add initial support for SECCOMP_MODE_STRICT
Unfortunately, SECCOMP_MODE_FILTER is not currently exposed to userspace,
so we can't checkpoint that. In any case, this is what we need to do for
SECCOMP_MODE_STRICT, so let's do it.

This patch works by first disabling seccomp for any processes who are going
to have seccomp filters restored, then restoring the process (including the
seccomp filters), and finally resuming the seccomp filters before detaching
from the process.

v2 changes:

* update for kernel patch v2
* use protobuf enum for seccomp type
* don't parse /proc/pid/status twice

v3 changes:

* get rid of extra CR_STAGE_SECCOMP_SUSPEND stage
* only suspend seccomp in finalize_restore(), just before the unmap
* restore the (same) seccomp state in threads too; also add a note about
  how this is slightly wrong, and that we should at least check for a
  mismatch

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:38:32 +03:00
Pavel Emelyanov
b08f3fae5b vdso: Reduce the amount of in-code ifdef-s
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Reviewed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2015-06-08 23:34:33 +03:00
Andrey Vagin
5a9fe81b75 locks: print unknown file locks
Now it isn't clear which lock is not supported.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-05-30 00:32:16 +03:00
Andrew Vagin
641693f8f0 proc_parse: remove a debug message
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-29 17:24:01 +03:00
Andrew Vagin
84c65f00f9 proc_parse: handle errors for breadline()
00:03:27.746 (00.008815) Error (bfd.c:149): bfd: Error reading file: No such process

Reported-by: Mr Jenkins
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-29 17:23:55 +03:00
Oleg Nesterov
be4acd9d6e fix parse_mnt_flags() to dump/restore STRICTATIME correctly
CRIU always restores the mounts as MNT_RELATIME. This is because the
kernel uses this mode by default, so we need to pass MS_STRICTATIME
explicitly if we didn't see "noatime" or "MS_RELATIME".

While at it, make mnt_opt2flag[] and sb_opt2flag "static", otherwise
gcc actually creates these arrays on stack even if they are "const".
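
The idea can be sketched as follows. This is an illustration only: parse_mnt_flags_sketch() and its option table are hypothetical and much shorter than the real mnt_opt2flag[], but the MS_* constants are the ones from sys/mount.h.

```c
#include <string.h>
#include <sys/mount.h>

struct opt2flag {
	const char *opt;
	unsigned long flag;
};

/* Hypothetical, cut-down version of the option parser */
static unsigned long parse_mnt_flags_sketch(const char *opts)
{
	/* "static const" keeps the table in read-only data instead of
	 * rebuilding it on the stack on every call. */
	static const struct opt2flag table[] = {
		{ "ro",		MS_RDONLY },
		{ "nosuid",	MS_NOSUID },
		{ "nodev",	MS_NODEV },
		{ "noexec",	MS_NOEXEC },
		{ "noatime",	MS_NOATIME },
		{ "relatime",	MS_RELATIME },
	};
	unsigned long flags = 0;
	const char *p = opts;

	/* Walk the comma-separated option list without modifying it */
	while (*p) {
		size_t len = strcspn(p, ",");
		size_t i;

		for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
			if (strlen(table[i].opt) == len &&
			    !strncmp(p, table[i].opt, len))
				flags |= table[i].flag;

		p += len;
		if (*p == ',')
			p++;
	}

	/* The kernel defaults to relatime, so strict atime has to be
	 * requested explicitly when neither option was seen. */
	if (!(flags & (MS_NOATIME | MS_RELATIME)))
		flags |= MS_STRICTATIME;

	return flags;
}
```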

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-27 14:54:44 +03:00
Andrey Vagin
25267e5b30 lock: parse the lock field in fdinfo if it's available (v2)
/proc/locks can contain a wrong pid for a lock and we always need to
check this fact.  Starting with the 4.1 kernel, locks are reported
in fdinfo.

v2: rebase to the current master
    skip note_file_lock()

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-27 14:53:24 +03:00
Andrey Vagin
e59a81bab6 proc_parse: handle a return code of fopen_proc
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-22 15:36:20 +03:00
Andrey Vagin
4a36feacc7 parse_proc: take into account that breadline can return an error code
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-22 15:36:12 +03:00
Andrey Vagin
6b4ecb90db proc_parse: don't play with a function exit code
We should not have a chance to exit with a wrong code on error
paths.

|^^^\
|    \________________
| **                |_\
\_______/^^^^^^^/_____/
       /      /
      /     /
     /____/

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-22 15:34:37 +03:00
Oleg Nesterov
6400fc9516 simplify the "ignore filesystem-subtype" logic
We can simply overwrite the dot symbol right after the kernel reports
it to us.
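
A minimal sketch of that overwrite, assuming the filesystem name sits in a writable buffer; strip_fs_subtype() is a hypothetical name, not the function from the patch:

```c
#include <string.h>

/* Hypothetical helper: cut "type.subtype" down to "type" in place,
 * e.g. "fuse.sshfs" becomes "fuse". Names without a dot are left alone. */
static void strip_fs_subtype(char *fsname)
{
	char *dot = strchr(fsname, '.');

	if (dot)
		*dot = '\0';
}
```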

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-14 15:15:19 +03:00
Oleg Nesterov
eb518936d8 introduce --skip-mnt cli option
Which obviously can be used to "ignore" the mounts we do not want or
need to dump. The user should know what he does.

Note: this patch changes parse_mountinfo() to check should_skip_mount().
This is because imo we want to filter out the unwanted mounts asap, as
if they do not exist. This increases the chances the dumping will fail
if something else depends on this mount. Say, another mountpoint or an
opened file.

Perhaps it makes sense to teach should_skip_mount() to use fnmatch()
and/or look at the optional "(fs|mnt)=" prefix to skip by fsname too.
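
The fnmatch() variant floated above might look like this. It is purely hypothetical: the function name, the pattern array and its handling are assumptions, and the "(fs|mnt)=" prefix is not covered.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical glob-aware variant of should_skip_mount(): returns true
 * if the mountpoint matches any of the user-supplied patterns. */
static bool should_skip_mount_glob(const char *mountpoint,
				   const char *const *patterns, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		/* With flags == 0 a '*' also matches '/' characters */
		if (fnmatch(patterns[i], mountpoint, 0) == 0)
			return true;

	return false;
}
```

Passing FNM_PATHNAME instead of 0 would stop '*' from crossing directory boundaries, which may or may not be what the user of such an option expects.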

To me it would be better to force the user of this option to understand
what it does. Say, if "dump" fails because the child mount can't find
the skipped parent, he should add another --skip-mnt option or do not
dump. Otherwise, if we do this automagically the user can probably be
surprised, he might even miss the fact that we skip more than he asked.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-03 17:56:05 +03:00
Oleg Nesterov
9fee3dc817 pass "bool for_dump" argument down to collect_mntinfo() and parse_mountinfo()
Preparation.

1. Add the new "bool for_dump" arg to collect/parse_mntinfo().

2. Introduce "struct collect_mntns_arg" to pass the additional
   "bool for_dump" field to collect_mntinfo() and change it to
   pass this boolean to collect_mntinfo()->parse_mountinfo() path.

3. Change other callers of collect_mntinfo() to pass "false".

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-03 17:55:18 +03:00
Cyrill Gorcunov
9ce0254c04 vma: Unify private VMAs testing
We have two helpers for VMA type testing: privately_dump_vma() and vma_priv(). They
work with different types but basically do the same: check if we should dump VMA into
the image and restore it back then.

Let's unify them both into a common vma_entry_is_private() helper and vma_area_is_private()
for working with vma_area type.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-01 12:36:46 +03:00
Oleg Nesterov
79e0b37c4e parse_mountinfo_ent: xrealloc(new->mountpoint) can fail
This is purely theoretical, especially in this particular case when we
actually want to (likely) free the unused memory. Still the code which
ignores potential error doesn't look good.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-30 13:20:39 +03:00
Oleg Nesterov
69600335a9 parse_mountinfo_ent: fix the leakage of "opt"
1. parse_mountinfo_ent() mixes "return -1" and "goto err" on failure,
   this looks confusing and inconsistent.

2. And buggy. It forgets to free(opt) if parse_mnt_flags() fails.
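
The pattern being restored here is a single exit path that owns all cleanup. A sketch under stated assumptions: parse_flags_stub() stands in for parse_mnt_flags() and parse_entry_sketch() for parse_mountinfo_ent(); neither is the real code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for parse_mnt_flags(): rejects an empty string */
static int parse_flags_stub(const char *opt, unsigned int *flags)
{
	if (!*opt)
		return -1;

	*flags = (unsigned int)strlen(opt);
	return 0;
}

static int parse_entry_sketch(const char *src, unsigned int *flags)
{
	size_t len = strlen(src) + 1;
	char *opt;
	int ret = -1;

	opt = malloc(len);
	if (!opt)
		goto err;
	memcpy(opt, src, len);

	/* Every failure goes through "err", so opt cannot be forgotten */
	if (parse_flags_stub(opt, flags))
		goto err;

	ret = 0;
err:
	free(opt);	/* free(NULL) is a no-op */
	return ret;
}
```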

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-30 13:20:38 +03:00
Oleg Nesterov
8b5faee7c6 parse_mountinfo_ent: kill the wrong xfree(new->mountpoint)
The caller will do this on failure too. So this is unnecessary and wrong
because we do not nullify ->mountpoint.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-30 13:20:37 +03:00
Oleg Nesterov
b66728ef14 parse_mountinfo: fix and simplify the usage of r_fstype
1. parse_mountinfo() forgets to free(fst) if parse_mountinfo_ent()
   succeeds.

2. The usage of fst/r_fstype is overcomplicated for no reason.

Just change the parse_mountinfo() paths to populate/use/free this
fsname unconditionally, and move the ownership to the caller. There
is no reason to check FSTYPE__UNSUPPORTED and/or fallback to ->name.

Better yet, we could even turn fsname into the local "char []" and
avoid %ms and free(), but then we would need to pass the length of
this buffer to parse_mountinfo_ent().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-30 13:20:37 +03:00
Oleg Nesterov
2cfeeac465 parse_mountinfo: add the "end" block into the main loop
Preparation to simplify the review. parse_mountinfo() assumes that:

1. The "err:" block does all the necessary cleanups on failure.

   This is wrong, see the next patch.

2. We can never skip the mountpoint.

   This is true, but we are going to change this.

s/goto err/goto end/ in the main loop, add the "end:" label which inserts
the new mount_info into the list and then checks ret != 0 to figure out
whether we need to abort.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-30 13:20:36 +03:00
Andrey Vagin
54d0f24107 proc_parse: fixed format strings
Use the format specifier PRIx64 instead of %lx to print a uint64
integer.
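
For illustration, a portable way to format such a value; format_addr() is a hypothetical helper. %lx is only correct where long is 64 bits wide, whereas PRIx64 expands to the right length modifier on every target.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: print a 64-bit value in hex, independent of
 * whether uint64_t is "unsigned long" or "unsigned long long". */
static int format_addr(char *buf, size_t len, uint64_t addr)
{
	return snprintf(buf, len, "%" PRIx64, addr);
}
```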

Reported-by: Mr Travis CI
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-27 15:27:59 +03:00
Oleg Nesterov
57db932a0a mount: always report ->mnt_id as decimal
validate_mounts() prints ->mnt_id in hex when it reports the failure.
This complicates the understanding because this ->mnt_id is printed as
decimal elsewhere, including /proc/$pid/mountinfo.

parse_mountinfo() adds "0x" at least and this is just pr_info(), but
let's change it too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-27 14:03:04 +03:00
Andrey Vagin
36f0c16db0 proc: move logic about adding vma into a list in a separate function
parse_smaps() is too big for easy reading. In addition, we are
creating a new interface to get information about processes, which is
called taskdiag, so parse_smaps() should do only what its name says.
Everything else should be moved into separate functions which will be
reused to work with task_diag.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-27 14:01:19 +03:00
Andrey Vagin
046245dcb0 proc: move logic about filling vma structures into a separate function
parse_smaps() is too big for easy reading. In addition, we are
creating a new interface to get information about processes, which is
called taskdiag, so parse_smaps() should do only what its name says.
Everything else should be moved into separate functions which will be
reused to work with task_diag.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-27 14:00:50 +03:00
Pavel Emelyanov
7ede4697cf bfd: Don't leak image-open flags into bfdopen
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-16 15:58:14 +03:00
Saied Kazemi
e3fec5f8eb Ignore mnt_id value for AUFS file descriptors.
Starting with version 3.15, the kernel provides a mnt_id field in
/proc/<pid>/fdinfo/<fd>.  However, the value provided by the kernel for
AUFS file descriptors obtained by opening a file in /proc/<pid>/map_files
is incorrect.

Below is an example for a Docker container running Nginx.  The mntid
program below mimics CRIU by opening a file in /proc/1/map_files and
using the descriptor to obtain its mnt_id.  As shown below, mnt_id is
set to 22 by the kernel but it does not exist in the mount namespace of
the container.  Therefore, CRIU fails with the error:

	"Unable to look up the 22 mount"

In the global namespace, 22 is the root of AUFS (/var/lib/docker/aufs).

This patch sets the mnt_id of these AUFS descriptors to -1, mimicking
pre-3.15 kernel behavior.

	$ docker ps
	CONTAINER ID        IMAGE                    ...
	3850a63ee857        nginx-streaming:latest   ...
	$ docker exec -it 38 bash -i
	root@3850a63ee857:/# ps -e
	  PID TTY          TIME CMD
	    1 ?        00:00:00 nginx
	    7 ?        00:00:00 nginx
	   31 ?        00:00:00 bash
	   46 ?        00:00:00 ps
	root@3850a63ee857:/# ./mntid 1
	open("/proc/1/map_files/400000-4b8000") = 3
	cat /proc/49/fdinfo/3
	pos:	0
	flags:	0100000
	mnt_id:	22
	root@3850a63ee857:/# awk '{print $1 " " $2}' /proc/1/mountinfo
	87 58
	103 87
	104 87
	105 104
	106 104
	107 104
	108 87
	109 87
	110 87
	111 87
	root@3850a63ee857:/# exit
	$ grep 22 /proc/self/mountinfo
	22 21 8:1 /var/lib/docker/aufs /var/lib/docker/aufs ...
	44 22 0:35 / /var/lib/docker/aufs/mnt/<ID> ...
	$
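
The lookup and the override can be sketched like this. Both helpers are hypothetical and operate on the fdinfo text rather than on a file; how the AUFS case is detected is left out and passed in as a flag.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the mnt_id found in the fdinfo text, or
 * -1 if there is no such line (as on kernels older than 3.15). */
static int fdinfo_mnt_id(const char *fdinfo_text)
{
	const char *line = fdinfo_text;

	while (line && *line) {
		int id;

		if (sscanf(line, "mnt_id: %d", &id) == 1)
			return id;

		/* Advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}

/* For AUFS map_files descriptors the reported value is ignored and -1
 * is used instead, mimicking the pre-3.15 behavior. */
static int vma_mnt_id_sketch(const char *fdinfo_text, bool is_aufs)
{
	return is_aufs ? -1 : fdinfo_mnt_id(fdinfo_text);
}
```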

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-02-09 14:07:40 +03:00
Pavel Emelyanov
89d9578730 proc: Print parsed fstype for unsupported mounts
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-01-27 16:15:22 +03:00
Pavel Emelyanov
c9b6614eef proc: Remove now pointless debug
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-01-26 15:05:32 +03:00
Saied Kazemi
2749d9e6ea Rework fixup_aufs_vma_fd() for non-AUFS links
This patch reworks fixup_aufs_vma_fd() to let symbolic links in
/proc/<pid>/map_files that are not pointing to AUFS branch names follow
the non-AUFS application logic.

The use case that prompted this commit was an application mapping
/dev/zero as shared and writeable which shows up in map_files as:

lrw------- ... 7fc5c5a5f000-7fc5c5a60000 -> /dev/zero (deleted)

If the AUFS support code reads the link, it will have to strip off the
" (deleted)" string added by the kernel but core CRIU code already
does this.

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-01-22 14:56:40 +03:00
Pavel Emelyanov
08c204820f aio: Dump AIO rings
When AIO context is set up kernel does two things:

1. creates an in-kernel aioctx object
2. maps a ring into process memory

The 2nd thing gives us all the needed information
about how the AIO was set up. So, in order to dump
one we need to pick the ring in memory and get all
the information we need from it.

One thing to note -- we cannot dump tasks if there
are any AIO requests pending. So we also need to
go to parasite and check the ring to be empty.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-26 18:13:36 +03:00
Pavel Emelyanov
86c0c5fb99 proc: Allocate and get vma fstat in vma_get_mapfile
We will need to detect aio mappings soon, so this is a preparation,
that makes future patching simpler.

Also move aufs stat-ing into aufs code to keep more aufs logic in
one place.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-25 21:10:15 +03:00
Pavel Emelyanov
6a6cdb8d4a proc: Drop always true last argument of parse_smaps()
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-12-22 13:52:03 +03:00
Pavel Emelyanov
ab2c1e426c proc_parse: Invert supported VMA check
It's for more natural adding of new else-if branch for aio.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-17 18:11:40 +03:00
Pavel Emelyanov
d4c62b6b5c proc_parse: Print vma start in hex
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-17 18:10:54 +03:00
Cyrill Gorcunov
88031bf89e proc_parse: Convert parse_pid_status to BFD engine
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:18:27 +04:00
Pavel Emelyanov
19a76494a9 kerndat: Collect all global variables on one struct
Not to spoil the global namespace and unify the kerndat
data names.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:14:53 +04:00
Andrey Vagin
71a9cd0634 proc: delete parse_pid_stat_small() (v2)
It's unused now.

v2: remove the proc_pid_stat_small struct too.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:15:37 +04:00
Andrey Vagin
05943959a5 proc: parse state and ppid from /proc/pid/status (v2)
v2: don't leak FILE

CID 73423 (#1 of 1): Resource leak (RESOURCE_LEAK)
15. leaked_storage: Variable f going out of scope leaks the storage it points to.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:15:03 +04:00
Andrey Vagin
fc84aa581a flock: blocked processes are not interesting for us (v2)
All our processes are stopped at the moment when file locks are
collected, so they can't be waiting for any locks.

Here is a proof of this theory:
[root@avagin-fc19-cr ~]# flock xxx sleep 1000 &
[1] 23278
[root@avagin-fc19-cr ~]# flock xxx sleep 1000 &
[2] 23280
[root@avagin-fc19-cr ~]# cat /proc/locks
1: FLOCK  ADVISORY  WRITE 23278 08:03:280001 0 EOF
1: -> FLOCK  ADVISORY  WRITE 23280 08:03:280001 0 EOF
[root@avagin-fc19-cr ~]# gdb -p 23280
(gdb) ^Z
[3]+  Stopped                 gdb -p 23280
[root@avagin-fc19-cr ~]# cat /proc/locks
1: FLOCK  ADVISORY  WRITE 23278 08:03:280001 0 EOF
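
For reference, the "->" marker seen in the first listing is what identifies a blocked waiter. A rough sketch of recognizing it, assuming the line layout shown above; struct lock_line and parse_lock_line() are hypothetical names:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical holder for the leading fields of a /proc/locks line */
struct lock_line {
	long long id;
	bool blocked;		/* true for "N: -> ..." entries */
	char type[16];		/* e.g. FLOCK */
	char mode[16];		/* e.g. ADVISORY */
	char access[16];	/* e.g. WRITE */
	int pid;
};

static int parse_lock_line(const char *line, struct lock_line *l)
{
	const char *p;
	int off = 0;

	/* off stays 0 if the ':' after the number is missing */
	if (sscanf(line, "%lld:%n", &l->id, &off) != 1 || !off)
		return -1;

	p = line + off;
	while (*p == ' ')
		p++;

	/* A blocked waiter is listed with "->" right after the id */
	l->blocked = (p[0] == '-' && p[1] == '>');
	if (l->blocked)
		p += 2;

	if (sscanf(p, "%15s %15s %15s %d",
		   l->type, l->mode, l->access, &l->pid) != 4)
		return -1;

	return 0;
}
```

Entries with blocked set are the ones this patch considers uninteresting.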

Currently criu can dump nothing if we have one process which is
waiting for a lock. I don't see any reason to do this.

v2: typo fix

Cc: Qiang Huang <h.huangqiang@huawei.com>
Reported-by: Mr Jenkins
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-05 16:34:52 +04:00
Andrey Vagin
7cb829b78f proc: don't leak memory
CID 73370: Resource leak (RESOURCE_LEAK)
13. leaked_storage: Variable timer going out of scope leaks the storage it points to.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-05 15:46:59 +04:00
Pavel Emelyanov
2e91a9c814 bfd: Don't flush read-only images
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-05 15:38:17 +04:00
Andrey Vagin
2464ad08d6 locks: print a lock before reporting an error about it
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-30 15:16:17 +04:00
Cyrill Gorcunov
4135f6cd1c proc_parse: parse_smaps -- Use @file_path instead of strstr helper
strstr is a really heavy one, let's use the already defined
and filled @file_path variable instead.

Reported-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-27 21:28:18 +04:00
Cyrill Gorcunov
4ad462b459 mount: proc-parse -- Show @mnt_id on debug print as well
This is convenient when we need to look into debug prints
and check which mount point was used somewhere else
(in particular I will need @mnt_id in tty code so
 on error I can easily figure out which mountpoint has
 been used).

No func changes.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:21:16 +04:00
Cyrill Gorcunov
ae96d21a07 bfd: Use ERR_PTR and such instead of BREADERR
No need to invent new error codes here, simply
use ERR_PTR/IS_ERR_OR_NULL and such.
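
The convention borrowed from the kernel encodes a negative errno in the pointer itself. A userspace sketch, assuming definitions similar to the kernel's; CRIU's own macros may differ in detail, and read_line_sketch() is a hypothetical reader used only to show the three possible results.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Error codes live in the top MAX_ERRNO addresses, which no valid
 * pointer is expected to occupy. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical reader: NULL means end of file, an ERR_PTR() value
 * means failure, anything else is a valid line. */
static char *read_line_sketch(char *buf, int state)
{
	if (state < 0)
		return ERR_PTR(-EIO);
	if (state == 0)
		return NULL;
	return buf;
}
```

A caller can then tell "no more data" from "read failed" without a separate sentinel such as BREADERR.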

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:56:39 +04:00
Cyrill Gorcunov
c01efda8af bfd: timerfd -- Fix parsing typo
While converting the reading of the data stream
to bfd, the @buf member was left untouched, leading
to incorrect data being read. Fix it by setting up
the proper one, i.e. @str itself; otherwise dumping
of timerfd files fails.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-30 11:48:15 +04:00
Pavel Emelyanov
e651a6eba4 filemap: Get vma mnt_id early
We have a, well, issue with how we calculate the vma's mnt_id.

Right now we get one via the criu side file descriptor that it got by
opening the /proc/pid/map_files/ link. The problem is that these
descriptors are 'merged' or 'borrowed' by adjacent vmas from
previous ones. Thus, getting the mnt_id value for each of them
makes no sense -- these files are the same.

So move this mnt_id getting earlier into vma parsing code. This
brings a potential problem -- if we have two adjacent vmas
mapping the same inode (dev:ino pair) but living in different
mount namespaces -- this check would produce wrong result.
"Wrong" from the perspective that on restore correct file would
be opened from wrong namespace.

I propose to live with it, since this is not worse than the
--evasive-devices option, it's _very_ unlikely, but saves a lot
of openings.

Note, that in case app switched mount namespace and then mapped
some new library (with dlopen) things would work correctly -- new
vmas will likely be not adjacent and for different dev:ino.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-29 13:20:55 +04:00
Pavel Emelyanov
cf8c9ae870 vma: Reshuffle the struct vma_area
We have some fields, that are dump-only and some that
are restore only (quite a lot of them actually).

Reshuffle them on the vma_area to explicitly show which
one is which. And rename some of them for easier grep.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-29 13:19:55 +04:00