In the rest of this series we need to walk all the namespaces to autodetect
which mounts are master/shared/private bind mounts, so we need the information
from criu's namespace in the case when the namespaces are not the same.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently if /tmp does not exist, CRIU fails because it will not be
able to create a temporary directory there. But when checkpointing
and restoring containers, we cannot rely on the existence of /tmp.
For such containers, we should use root (/). The temporary directory
will be removed after CRIU is done.
Signed-off-by: Saied Kazemi <saied@google.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
setns(fd, CLONE_NEWNS) resets cwd and root, so we need to
restore them back.
Without this patch stats-dump isn't saved in the work dir:
-rw-r--r-- 1 root root 32 Apr 2 14:21 /stats-dump
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Preparation.
1. Add the new "bool for_dump" arg to collect/parse_mntinfo().
2. Introduce "struct collect_mntns_arg" to pass the additional
"bool for_dump" field to collect_mntinfo() and change it to
pass this boolean to collect_mntinfo()->parse_mountinfo() path.
3. Change other callers of collect_mntinfo() to pass "false".
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
validate_mounts() prints ->mnt_id in hex when it reports the failure.
This complicates the understanding because this ->mnt_id is printed as
decimal elsewhere, including /proc/$pid/mountinfo.
parse_mountinfo() adds "0x" at least and this is just pr_info(), but
lets change it too.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When an image of a certian type is not found, CRIU sometimes
fails, sometimes ignores this fact. I propose to ignore this
fact always and treat absent images and those containing no
objects inside (i.e. -- empty). If the latter code flow will
_need_ objects, then criu will fail later.
Why object will be explicitly required? For example, due to
restoring code reading the image with pb_read_one, w/o the
_eof suffix thus required the object to be in the image.
Another example is objects dependencies. E.g. fdinfo objects
require various files objects. So missing image files will
result in non-resolved searches later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Fedora bind-mounts a part of the root mount to itself. Currently we
don't allow to mount children of a shared mount, if other mount from
this shared group are not mounted.
This patch adds an exclusion for cases, when a child has the same
group. We allow to mount a child, if wider mounts are mounted.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we connect roots of sub-namespaces to the root of the root
mount namespace. And we get problems, if the root of the root mntns is
shared, because all children of a shared mount must be propagated to
other mounts in this group.
Actually we mount tmpfs in mnt_roots and here is nothing wrong to add it
in a tree.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
And use the expression for it, it's quite short. This
makes the amount of variables in the code fit into brains.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Do paths conversions and checks step-by-step and add many comments
what we do in each step and why.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
When we check whether a submount of a mount is visible in another
mount (shared peer of the latter), we can and should use the new
issubpath helper.
Should because the used strncmp may scan beyond ct_mpnt_rpath if
its length is smaller (no checks for this in the code).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
These are constant for given m, so calculate them outside
of the loop. Also rename them to reflect what they are.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
The path lenght is zero for the "/" one and strlen(path)
for all the others. This is done so to make it possible
to use this length to get tail-paths: if path_1 starts
with path_2 and both are absolute, then
path_1 + path_length(path_2)
would give the tail of the tail of path_1 relative to
path_2 even if the path_2 is just "/".
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
A problem which is solved in this path is that some children can be
unaccessiable (unvisiable) for non-root bind-mounts
root mount point
-------------------
/ /a (shared:1)
/ /a/x
/ /a/x/y
/ /a/z
/x /b (shared:1)
/ /b/y
/b is a non-root bind-mount of /a
/y is visiable to both mounts
/z is vidiable only for /a
Before this patch we checked that the set of children is the same for
all mount in a shared group. Now we check that a visiable set of mounts
is the same for all mounts in a shared group.
Now we take the next mount in the shared group, which is wider or equal
to current and compare children between them.
Before this patch validate_shared(m) validates the m->parent mount.
Now it validates the "m" mount. So you can find following lines in the
patch:
- if (m->parent->shared_id && validate_shared(m))
+ if (m->shared_id && validate_shared(m))
We doesn't support shared mounts with different set of children.
Here is an example of such case can be created:
mount tmpfs a /a
mount --make-shared /a
mkdir /a/b
mount tmpfs b /a/b
mount --bind /a /c
In this case /c doesn't have the /b child. To support such cases,
we need to sort all shared mounts accoding with a set of children.
v2: If root is equal to "/", its len should be zero. We expect that the
last symbol in a path is not "/".
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we validate the mount tree not to have overmounts we need to
check one path to be the sub-path of another. Here's a helper for
this.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
When we create a new mntns in a userns, all inhereted mounts are marked
as locked. pivot_root() returns EINVAL if a new root is locked.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Introduced by eb214be2, the empty mnt_share list cannot
produce the list_first_entry element :)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
This is for two reasons. First, validation can meet external mount
and will call plugins, which is not correct on pre-dump and actually
crashes on uninitilized plugins lists. Second, even if on pre-dump
mount tree is not "supported" this can be a temporary situation (yes,
yes, unlikely, but still).
On the other hand, it's better to fail earlier, but that's another
story.
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
In case if we meet virtualized devtmpfs on dump
(which means its s_dev is different from one obtained
during mountpoints dump procedure) we should dump it
with tar help. Thus on restore it get filled from the
image.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need devtmpfs as well so make it general.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
So we need to create a temporary private mount for the old root.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This reverts commit 21d1b2fdb9.
pivot_root() can move the current root to a non-shared mount. So we are
going to create a temporary private mount in put_old.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We don't support posix mqueues at the moment
but in case if this fs is simply mounted and
not used lets proceed without errors.
In case if someone is using it we detect it
because fs won't be empty and refuse to dump.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I found this solution in the LXC code. We can open the old root, call
pivot_root(".", "."), call fchdir to the old root and call umount(".").
Now restore will not fail, if the root is read-only.
In addition it's a bit faster.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We want to have buffered images to speed up dump and,
slightly, restore. Right now we use plan file descriptors
to write and read images to/from. Making them buffered
cannot be gracefully done on plain fds, so introduce
a new class.
This will also help if (when?) we will want to do more
complex changes with images, e.g. store them all in one
file or send them directly to the network.
For now the cr_img just contains one int _fd variable.
This patch chages the prototype of open_image() to
return struct cr_img *, pb_(read|write)* to accept one
and fixes the compilation of the rest of the code :)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
We currently have all mouninfo-s from all mnt namespaces collected
in one big list. On dump we scan through it to find the namespaces
we need to dump.
This can be optimized by walking the list of namespaces instead.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Currently here is a bug, because when we see criu's mount namespace,
we go to the "out" mark and don't validate mounts.
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
$ cat /proc/self/mountinfo
...
1 1 0:2 / / rw - rootfs rootfs rw,size=373396k,nr_inodes=93349
...
You can see that mnt_id and parent_mnt_id are equals here.
This patch interpretes this case as a root mount in a tree.
0'th mount is rootfs, which is mounted in init_mount_tree().
We don't see it in cases when system makes chroot, because of
static int show_mountinfo(struct seq_file *m, struct vfsmount *mnt)
...
/* mountpoints outside of chroot jail will give SEQ_SKIP on this */
err = seq_path_root(m, &mnt_path, &root, " \t\n\\");
Cc: beproject criu <beprojectcriu@gmail.com>
Cc: Christopher Covington <cov@codeaurora.org>
Reported-by: beproject criu <beprojectcriu@gmail.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
mntinfo contains mounts from all namespaces, so we can validate it only
once after collecting mounts.
v2: add a fake comment about goto
v3: add a real comment about goto
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we stript options only one of brothers, but
mount_equal() thinks that two brothers should have the same options.
Execute zdtm/live/static/mountpoints
./mountpoints --pidfile=mountpoints.pid --outfile=mountpoints.out
Dump 2737
WARNING: mountpoints returned 1 and left running for debug needs
Test: zdtm/live/static/mountpoints, Result: FAIL
==================================== ERROR ====================================
Test: zdtm/live/static/mountpoints, Namespace:
Dump log : /root/git/criu/test/dump/static/mountpoints/2737/1/dump.log
--------------------------------- grep Error ---------------------------------
(00.146444) Error (mount.c:399): Two shared mounts 50, 67 have different sets of children
(00.146460) Error (mount.c:402): 67:./zdtm_mpts/dev/share-1 doesn't have a proper point for 54:./zdtm_mpts/dev/share-3/test.mnt.share
(00.146820) Error (cr-dump.c:1921): Dumping FAILED.
------------------------------------- END -------------------------------------
================================= ERROR OVER =================================
Reported-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Tested-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>