2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 13:28:27 +00:00

263 Commits

Author SHA1 Message Date
Tycho Andersen
1c79933301 remove some double ;;s
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-31 22:09:38 +03:00
Oleg Nesterov
57db932a0a mount: always report ->mnt_id as decimal
validate_mounts() prints ->mnt_id in hex when it reports the failure.
This complicates the understanding because this ->mnt_id is printed as
decimal elsewhere, including /proc/$pid/mountinfo.

parse_mountinfo() adds "0x" at least and this is just pr_info(), but
lets change it too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-27 14:03:04 +03:00
Pavel Emelyanov
f7f76d6ba6 img: Introduce empty images
When an image of a certian type is not found, CRIU sometimes
fails, sometimes ignores this fact. I propose to ignore this
fact always and treat absent images and those containing no
objects inside (i.e. -- empty). If the latter code flow will
_need_ objects, then criu will fail later.

Why object will be explicitly required? For example, due to
restoring code reading the image with pb_read_one, w/o the
_eof suffix thus required the object to be in the image.

Another example is objects dependencies. E.g. fdinfo objects
require various files objects. So missing image files will
result in non-resolved searches later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-03-13 14:42:54 +03:00
Pavel Emelyanov
e10b3ef407 mount: Don't ignore validation and shared resolving errors on dump
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-02-25 11:46:26 +03:00
Pavel Emelyanov
199619791d mnt: Factor out find-mount-by-s_dev code
And move the 2nd piece lower to avoid fwd declaration and
keep similar calls close to each other.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-19 13:17:30 +04:00
Andrey Vagin
104d3b84d5 mount: rework can_mount_now() to support bind-mounts of shared mounts
Fedora bind-mounts a part of the root mount to itself. Currently we
don't allow to mount children of a shared mount, if other mount from
this shared group are not mounted.

This patch adds an exclusion for cases, when a child has the same
group. We allow to mount a child, if wider mounts are mounted.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-12 12:54:39 +04:00
Andrey Vagin
beeabc3b2b mount: add the mnt_roots mount in the mount tree on restore
Currently we connect roots of sub-namespaces to the root of the root
mount namespace. And we get problems, if the root of the root mntns is
shared, because all children of a shared mount must be propagated to
other mounts in this group.

Actually we mount tmpfs in mnt_roots and here is nothing wrong to add it
in a tree.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-12 12:54:37 +04:00
Pavel Emelyanov
4e7064cd7e mount: Remove the len variable
And use the expression for it, it's quite short. This
makes the amount of variables in the code fit into brains.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:33 +04:00
Pavel Emelyanov
0fbee68f26 mount: Add helper for searching for shared peer
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:32 +04:00
Pavel Emelyanov
cedd10254c mount: Sanitize the path recalculation between submounts
Do paths conversions and checks step-by-step and add many comments
what we do in each step and why.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:31 +04:00
Pavel Emelyanov
991c02874a mount: Use pre-calculated ct->mountpoint + t_mpnt_l value
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:30 +04:00
Pavel Emelyanov
ffe5f9b422 mount: Use issubpath() when checking for submount visibility
When we check whether a submount of a mount is visible in another
mount (shared peer of the latter), we can and should use the new
issubpath helper.

Should because the used strncmp may scan beyond ct_mpnt_rpath if
its length is smaller (no checks for this in the code).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:29 +04:00
Pavel Emelyanov
e9512502d1 mount: Move some variables out of search loop
These are constant for given m, so calculate them outside
of the loop. Also rename them to reflect what they are.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:28 +04:00
Pavel Emelyanov
028d6355e6 mount: Rename paths' lengths to reflect whose lengths they are
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:27 +04:00
Pavel Emelyanov
6b0059049d mount: Add helper for path length calculations
The path lenght is zero for the "/" one and strlen(path)
for all the others. This is done so to make it possible
to use this length to get tail-paths: if path_1 starts
with path_2 and both are absolute, then

   path_1 + path_length(path_2)

would give the tail of the tail of path_1 relative to
path_2 even if the path_2 is just "/".

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:26 +04:00
Pavel Emelyanov
64c95bf586 mount: Add helper to search for widest shared peer
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-12 12:54:25 +04:00
Andrey Vagin
1cf5168cab mntns: rework validation to support non-root shared bind-mounts (v2)
A problem which is solved in this path is that some children can be
unaccessiable (unvisiable) for non-root bind-mounts

root	mount point
-------------------
/	/a (shared:1)
/	/a/x
/	/a/x/y
/	/a/z
/x	/b (shared:1)
/	/b/y

/b is a non-root bind-mount of /a
/y is visiable to both mounts
/z is vidiable only for /a

Before this patch we checked that the set of children is the same for
all mount in a shared group. Now we check that a visiable set of mounts
is the same for all mounts in a shared group.

Now we take the next mount in the shared group, which is wider or equal
to current and compare children between them.

Before this patch validate_shared(m) validates the m->parent mount.
Now it validates the "m" mount. So you can find following lines in the
patch:
-               if (m->parent->shared_id && validate_shared(m))
+               if (m->shared_id && validate_shared(m))

We doesn't support shared mounts with different set of children.
Here is an example of such case can be created:
mount tmpfs a /a
mount --make-shared /a
mkdir /a/b
mount tmpfs b /a/b
mount --bind /a /c

In this case /c doesn't have the /b child. To support such cases,
we need to sort all shared mounts accoding with a set of children.

v2: If root is equal to "/", its len should be zero. We expect that the
last symbol in a path is not "/".

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-12 12:54:21 +04:00
Pavel Emelyanov
00770a91c1 kerndat: Handle errors from devtmpfs virtualized checks
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:15:17 +04:00
Pavel Emelyanov
69bffe26d3 kerndat: Make fs-virtualized check report yes/no
Right now it returns the whole struct stat which is excessive.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:15:09 +04:00
Pavel Emelyanov
f33908a897 ns: Rename "created" futex and comment what it is
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:11:58 +04:00
Pavel Emelyanov
32f58742ca mnt: Introduce and use issubpath helper
When we validate the mount tree not to have overmounts we need to
check one path to be the sub-path of another. Here's a helper for
this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-07 17:39:23 +04:00
Andrey Vagin
4ed63afa00 mount: bind-mount root into itself if processes are restored in userns
When we create a new mntns in a userns, all inhereted mounts are marked
as locked. pivot_root() returns EINVAL if a new root is locked.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:00:13 +04:00
Pavel Emelyanov
3bb7731c2e mnt: Helper and comment for bind mount validation
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-05 15:41:09 +04:00
Pavel Emelyanov
defdb96b9b mount: Invert check for shared mounts check
Introduced by eb214be2, the empty mnt_share list cannot
produce the list_first_entry element :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-05 15:40:41 +04:00
Pavel Emelyanov
b1a8e41dd0 mnt: Don't validate mounts on pre-dump
This is for two reasons. First, validation can meet external mount
and will call plugins, which is not correct on pre-dump and actually
crashes on uninitilized plugins lists. Second, even if on pre-dump
mount tree is not "supported" this can be a temporary situation (yes,
yes, unlikely, but still).

On the other hand, it's better to fail earlier, but that's another
story.

Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-10-30 15:15:30 +04:00
Cyrill Gorcunov
a1d7c7c593 mount: Dump/restore devtmpfs if it's virtualized
In case if we meet virtualized devtmpfs on dump
(which means its s_dev is different from one obtained
 during mountpoints dump procedure) we should dump it
with tar help. Thus on restore it get filled from the
image.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-30 15:10:32 +04:00
Cyrill Gorcunov
48d81eb48a kerndat: Transform kerndat_get_devpts_stat into general form
We will need devtmpfs as well so make it general.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-30 15:10:31 +04:00
Andrey Vagin
eb214be2d0 mount: move code to validate shared mounts in a separate function
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-29 16:19:03 +04:00
Andrey Vagin
ffe7f01d29 mount: don't add extra / between a temporary root and mountpoint
Currenlty a generated path contains two slashes successively.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-29 16:18:52 +04:00
Andrey Vagin
4862246c10 mntns: pivot_root() can move the current root to a non-shared mount
So we need to create a temporary private mount for the old root.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-29 16:18:46 +04:00
Andrey Vagin
41aa981dde mount: don't mark mounts as private twice
We do the same action twice. It's typo.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-29 16:18:17 +04:00
Andrey Vagin
377205c147 Revert "mount: don't create a temporary directory for pivot_root()"
This reverts commit 21d1b2fdb97facf88436d083896311367e4398af.

pivot_root() can move the current root to a non-shared mount. So we are
going to create a temporary private mount in put_old.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-29 16:18:16 +04:00
Cyrill Gorcunov
b12072b979 mount: Allow to c/r empty mqueues
We don't support posix mqueues at the moment
but in case if this fs is simply mounted and
not used lets proceed without errors.

In case if someone is using it we detect it
because fs won't be empty and refuse to dump.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 20:06:43 +04:00
Pavel Emelyanov
87d68fcafd mnt: Use ns walking helper
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-14 18:01:56 +04:00
Pavel Emelyanov
a36917d4f2 mnt: Clean up namespaces walking code
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-14 18:01:44 +04:00
Andrey Vagin
21d1b2fdb9 mount: don't create a temporary directory for pivot_root()
I found this solution in the LXC code. We can open the old root, call
pivot_root(".", "."), call fchdir to the old root and call umount(".").

Now restore will not fail, if the root is read-only.
In addition it's a bit faster.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-09 19:19:28 +04:00
Pavel Emelyanov
295090c1ea img: Introduce the struct cr_img
We want to have buffered images to speed up dump and,
slightly, restore. Right now we use plan file descriptors
to write and read images to/from. Making them buffered
cannot be gracefully done on plain fds, so introduce
a new class.

This will also help if (when?) we will want to do more
complex changes with images, e.g. store them all in one
file or send them directly to the network.

For now the cr_img just contains one int _fd variable.

This patch chages the prototype of open_image() to
return struct cr_img *, pb_(read|write)* to accept one
and fixes the compilation of the rest of the code :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:13 +04:00
Andrey Vagin
92ee123386 mntns: don't dump criu's namespace
Reported-by: Mr Jenkins
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-24 17:30:23 +04:00
Pavel
867bcd2196 mnt: Shorten the mntns dumping loop
We currently have all mouninfo-s from all mnt namespaces collected
in one big list. On dump we scan through it to find the namespaces
we need to dump.

This can be optimized by walking the list of namespaces instead.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-09-23 20:37:32 +04:00
Andrey Vagin
5a101d83af mount: skip the criu's mount namespace if tasks live in another mntns
Currently here is a bug, because when we see criu's mount namespace,
we go to the "out" mark and don't validate mounts.

Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-19 17:40:26 +04:00
Andrey Vagin
7fa98a30bc mount: handle a circular reference in mount tree
$ cat /proc/self/mountinfo
...
1 1 0:2 / / rw - rootfs rootfs rw,size=373396k,nr_inodes=93349
...

You can see that mnt_id and parent_mnt_id are equals here.
This patch interpretes this case as a root mount in a tree.

0'th mount is rootfs, which is mounted in init_mount_tree().

We don't see it in cases when system makes chroot, because of

static int show_mountinfo(struct seq_file *m, struct vfsmount *mnt)
	...
	/* mountpoints outside of chroot jail will give SEQ_SKIP on this */
	err = seq_path_root(m, &mnt_path, &root, " \t\n\\");

Cc: beproject criu <beprojectcriu@gmail.com>
Cc: Christopher Covington <cov@codeaurora.org>
Reported-by: beproject criu <beprojectcriu@gmail.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-16 23:17:55 +04:00
Andrey Vagin
aadc309a4f mount: validate mounts only once on dump (v3)
mntinfo contains mounts from all namespaces, so we can validate it only
once after collecting mounts.

v2: add a fake comment about goto
v3: add a real comment about goto
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-15 16:15:50 +04:00
Andrey Vagin
f88d72d0bd mount: strip options for all mounts
Currently we stript options only one of brothers, but
mount_equal() thinks that two brothers should have the same options.

Execute zdtm/live/static/mountpoints
./mountpoints --pidfile=mountpoints.pid --outfile=mountpoints.out
Dump 2737
WARNING: mountpoints returned 1 and left running for debug needs
Test: zdtm/live/static/mountpoints, Result: FAIL
==================================== ERROR ====================================
Test: zdtm/live/static/mountpoints, Namespace:
Dump log   : /root/git/criu/test/dump/static/mountpoints/2737/1/dump.log
--------------------------------- grep Error ---------------------------------
(00.146444) Error (mount.c:399): Two shared mounts 50, 67 have different sets of children
(00.146460) Error (mount.c:402): 67:./zdtm_mpts/dev/share-1 doesn't have a proper point for 54:./zdtm_mpts/dev/share-3/test.mnt.share
(00.146820) Error (cr-dump.c:1921): Dumping FAILED.
------------------------------------- END -------------------------------------
================================= ERROR OVER =================================

Reported-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Tested-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-10 18:33:08 +04:00
Andrey Vagin
cb738520e3 mount: don't skip checks in validate_mounts()
"continue" is called by mistake, so we skip a few checks for shared
mounts without siblings.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-10 18:32:20 +04:00
Cyrill Gorcunov
3146f58317 plugin: Rework plugins API, v2
Here we define new api to be used in plugins.

 - Plugin should provide a descriptor with help of
   CR_PLUGIN_REGISTER macro, or in case if plugin require
   no init/exit functions -- with CR_PLUGIN_REGISTER_DUMMY.

 - Plugin should define a plugin hook with help of
   CR_PLUGIN_REGISTER_HOOK macro.

 - Now init/exit functions of plugins takes @stage
   argument which tells plugin which stage of criu
   it's been called on dump/restore. For exit it
   also takes @ret which allows plugin to know if
   something went wrong and it needs to cleanup
   own resources.

The idea behind is to not limit plugins authors with names
of functions they might need to use for particular hook.

Such new API deprecates olds plugins structure but to keep
backward compatibility we will provide a tiny layer of
additional code to support old plugins for at least a couple
of release cycles.

For example a trivial plugin might look like

 | #include <sys/types.h>
 | #include <sys/stat.h>
 | #include <fcntl.h>
 | #include <libgen.h>
 | #include <errno.h>
 |
 | #include <sys/socket.h>
 | #include <linux/un.h>
 |
 | #include <stdio.h>
 | #include <stdlib.h>
 | #include <string.h>
 | #include <unistd.h>
 |
 | #include "criu-plugin.h"
 | #include "criu-log.h"
 |
 | static int dump_ext_file(int fd, int id)
 | {
 |	pr_info("dump_ext_file: fd %d id %d\n", fd, id);
 |	return 0;
 | }
 |
 | CR_PLUGIN_REGISTER_DUMMY("trivial")
 | CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__DUMP_EXT_FILE, dump_ext_file)

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-03 20:48:36 +04:00
Andrey Vagin
8f17b34abb criu: Drop redundant newline from pr_perror
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-22 19:22:39 +04:00
Saied Kazemi
d8b41b6525 Added AUFS support.
The AUFS support code handles the "bad" information that we get from
the kernel in /proc/<pid>/map_files and /proc/<pid>/mountinfo files.
For details see comments in sysfs_parse.c.

The main motivation for this work was dumping and restoring Docker
containers which by default use the AUFS graph driver.  For dump,
--aufs-root <container_root> should be added to the command line options.
For restore, there is no need for AUFS-specific command line options
but the container's AUFS filesystem should already be set up before
calling criu restore.

[ xemul: With AUFS files sometimes, in particular -- in case of a
  mapping of an executable file (likekely the one created at elf load),
  in the /proc/pid/map_files/xxx link target we see not the path
  by which the file is seen in AUFS, but the path by which AUFS
  accesses this file from one of its "branches". In order to fix
  the path we get the info about branches from sysfs and when we
  meet such a file, we cut the branch part of the path. ]

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-21 18:35:22 +04:00
Andrey Vagin
c9228dd809 restore: use /proc/self/mountinfo for collecting mounts fo the root task (v3)
If the root task is forked in a new pidns, it can't use its pid for
accessing /proc, because this proc belongs to the source pidns.

v2: don't copy a static string.
v3: take a bright part of Tycho's patch

Reported-by: Tycho Andersen <tycho.andersen@canonical.com>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-12 14:35:25 +04:00
Pavel Emelyanov
8b019e0bb4 mnt: Don't delay external mount points
It looks like criu constantly postpones external bind mounts. I'm trying to resolve
when we manage to break this (when I did ext-mount-map they for some reason didn't).
Meanwhile, this patch fixes it back.

Reported-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-12 14:30:39 +04:00
Andrey Vagin
9ba0baabd5 mount: fix dereference after null check
CID 1168169 (#1 of 1): Dereference after null check (FORWARD_NULL)
7. var_deref_model: Passing "mi" to function "do_bind_mount(struct
   mount_info *)", which dereferences null "mi->bind"

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-07 10:25:53 +04:00