If criu and the dumpee live in the same mount namespace, there is no
need to get the namespace's root from the init task. We can get it from criu and
(!) avoid the root == "/" check, which is required for the namespace case.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we check that all shared mounts have an identical set of
children and that each non-root mount has a proper root mount.
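
A minimal sketch of the first check (identical children within a shared
group), written in Python rather than the project's C. The dict keys
(mnt_id, parent_id, mountpoint, shared_id) and the function name are
assumptions made for illustration, not the real data structures; the
root-mount check is not shown.

```python
import os
from collections import defaultdict


def check_shared_children(mounts):
    """Return True if all members of every shared group have the same children.

    Children are compared by their path relative to the parent's mountpoint.
    A shared_id of 0 means "not shared" (assumption of this sketch).
    """
    by_id = {m["mnt_id"]: m for m in mounts}

    # Collect the relative paths of the children of each mount.
    children = defaultdict(set)
    for m in mounts:
        parent = by_id.get(m["parent_id"])
        if parent is None:
            continue  # the root of the tree has no parent in the list
        rel = os.path.relpath(m["mountpoint"], parent["mountpoint"])
        children[parent["mnt_id"]].add(rel)

    # Group mounts by shared_id and compare the child sets inside each group.
    groups = defaultdict(list)
    for m in mounts:
        if m["shared_id"]:
            groups[m["shared_id"]].append(m["mnt_id"])

    for members in groups.values():
        first = children[members[0]]
        if any(children[mnt_id] != first for mnt_id in members[1:]):
            return False
    return True
```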
v2: check that nobody is overmounted
check a tree before trying to restore it.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A non-root mount is bind-mounted from a proper root mount.
A non-root mount without a root mount is not supported yet.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The idea is simple. If a mount can't be mounted now, we will try to
mount it later.
v2: don't wait for slaves, they are unmounted anyway
v3: add a comment in do_bind_mount to explain restoring of shared groups
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A few sentences that are required for understanding this patch:
2a) A shared mount can be replicated to as many mountpoints and all the
replicas continue to be exactly same.
2b) A slave mount is like a shared mount except that mount and umount
events only propagate towards it.
2c) A private mount does not forward or receive propagation.
All the rules are in Documentation/filesystems/sharedsubtree.txt
If it's a first mount in a group, all group members should be
bind-mounted from this one.
Each mount propagates to all members of the parent's group. The group can
contain a few slaves.
Mounts that have propagated to slaves are unmounted, because we can't
be sure that they propagated in real life. For example:
mount --bind --make-slave /share /slave1
mount --bind --make-slave /share /slave2
mount /share/test
umount /slave2/test
mount --make-shared /slave1/test
mount --bind --make-shared /slave1/test /slave2/test
41 40 0:33 / /share rw,relatime shared:28 - tmpfs xxx rw
42 40 0:33 / /slave1 rw,relatime master:28 - tmpfs xxx rw
43 40 0:33 / /slave2 rw,relatime master:28 - tmpfs xxx rw
44 41 0:34 / /share/test rw,relatime shared:29 - tmpfs xxx rw
46 42 0:34 / /slave1/test rw,relatime shared:30 master:29 - tmpfs xxx rw
45 43 0:34 / /slave2/test rw,relatime shared:30 master:29 - tmpfs xxx rw
/slave1/test and /slave2/test depend on each other, and at least one of them
is not propagated from /share/test.
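
For reference, the shared:N and master:N tags in the lines above are the
optional fields of /proc/<pid>/mountinfo. A small sketch of how they could
be pulled out of one line, in Python; the function name and the returned
dict keys are assumptions made for this example.

```python
def parse_mountinfo_line(line):
    """Parse one mountinfo line into a dict, including sharing tags.

    Fields before the " - " separator: mount ID, parent ID, major:minor,
    root, mount point, mount options, then zero or more optional fields.
    Fields after it: filesystem type, source, super options.
    """
    left, right = line.split(" - ", 1)
    fields = left.split()

    shared_id = master_id = 0  # 0 means "tag not present" in this sketch
    for opt in fields[6:]:
        tag, _, value = opt.partition(":")
        if tag == "shared":
            shared_id = int(value)
        elif tag == "master":
            master_id = int(value)

    fstype, source = right.split()[:2]
    return {
        "mnt_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "root": fields[3],
        "mountpoint": fields[4],
        "shared_id": shared_id,
        "master_id": master_id,
        "fstype": fstype,
        "source": source,
    }
```

Applied to the /slave1/test line, this gives shared_id 30 and master_id 29,
i.e. the mount is a peer in group 30 and a slave of group 29.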
v2: use false and true for bool
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Try to restore mounts while the postpone list isn't empty, and check
that each iteration makes some progress; otherwise fail, to
prevent an infinite loop.
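
The loop can be sketched as follows, in Python rather than C. The names
mount_all, can_mount and do_mount are hypothetical and only illustrate the
idea of retrying postponed mounts and failing when a pass makes no progress.

```python
def mount_all(mounts, can_mount, do_mount):
    """Mount everything in `mounts`, postponing entries that are not ready.

    can_mount(m) says whether m can be mounted right now;
    do_mount(m) performs the mount. Both are supplied by the caller.
    Raises RuntimeError if a whole pass mounts nothing.
    """
    pending = list(mounts)
    while pending:
        postponed = []
        for m in pending:
            if can_mount(m):
                do_mount(m)
            else:
                postponed.append(m)  # try again on the next pass

        # No entry was mounted in this pass: retrying would loop forever.
        if len(postponed) == len(pending):
            raise RuntimeError("no progress, cannot mount: %r" % (postponed,))
        pending = postponed
```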
v2: rework logic about postpone list
add more comments
v3: one more attempt to make it more readable
v4: Here is a master class from Pavel how to write self-documented code.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are required for restoring shared and slave mounts
v2: use the same names of variables in image and in code
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
All shared mounts from one group are connected into a circular list.
All slaves are added to the proper master's list.
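
A rough illustration of the resulting linkage, in Python. It uses plain
dicts instead of the kernel-style lists in the code, and the key names
(mnt_id, shared_id, master_id) are assumptions of this sketch; the first
mount seen in a group is arbitrarily treated as the master that holds the
slave list.

```python
from collections import defaultdict


def collect_shared(mounts):
    """Return (next_peer, slaves) maps keyed by mount id.

    next_peer[id] is the following member of the same shared group, with
    the last member wrapping around to the first (a circular list).
    slaves[id] lists the mounts whose master_id names id's shared group.
    """
    groups = defaultdict(list)
    for m in mounts:
        if m["shared_id"]:
            groups[m["shared_id"]].append(m["mnt_id"])

    next_peer = {}
    for members in groups.values():
        for i, mnt_id in enumerate(members):
            next_peer[mnt_id] = members[(i + 1) % len(members)]

    slaves = defaultdict(list)
    for m in mounts:
        if m["master_id"] in groups:
            master = groups[m["master_id"]][0]
            slaves[master].append(m["mnt_id"])
    return next_peer, dict(slaves)
```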
v2: change variable name and fix a bug about adding shared mounts in a
circular list.
v3: handle errors of collect_shared
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have a generic do_pb_show() call and tons of show_foo
routines that just call it with the proper args. Compact
the code by putting the args into an array and calling
do_pb_show() in one place.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These contain the linkage between the number, data type and routines
for the pb messages we write to and read from image files. Most of them
have a simple number-type-routines mapping, so introduce a generating
script for that.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are trivial and these functions will be used in many places
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
for that the mount point is bind-mounted in a temporary place.
v2: * check, that the fs you get access to at the end is _really_ the
one you wanted to
* use switch_ns/restore_ns helpers
v3: reuse code of __open_mountpoint and a few small cleanups
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When CRIU wants to dump content, it checks that nothing is overmounted.
The list of children for such mounts must be empty, but these lists are
filled in while the tree of mounts is constructed.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Actually, for dumping, tmpfs should be remounted somewhere else to
avoid overmounts.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If a parent mount point is shared with an external mntns, a child will be
umounted from the external mntns too.
For example:
$ mount -t tmpfs xxx /root/tmp/
$ mount --make-shared tmp
$ mkdir tmp/xxx
$ mount -t tmpfs xxx /root/tmp/xxx
$ touch tmp/xxx/a
$ unshare -m umount tmp/xxx
$ ls -l tmp/xxx/a
ls: cannot access tmp/xxx/a: No such file or directory
This patch changes the parent mount to private before unmounting its children.
v2: exit if a mount point cannot be marked as private
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If we meet a shared mount point whose share master does not belong
to us, we might fail on restore; thus require both
master and slave mount peers to be collected on dump.
In other words, the output will be like
| (00.077025) Error (mount.c:421): Mount 49 (master_id: 2 shared_id: 0) has unreachable sharing
| (00.077123) Error (mount.c:472): Can't proceed 4237's mountinfo
| (00.077865) Error (namespaces.c:442): Namespaces dumping finished with error 65280
https://bugzilla.openvz.org/show_bug.cgi?id=2608
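
The condition behind the "unreachable sharing" error can be sketched like
this, in Python. The function name and dict keys are assumptions made for
illustration; the real check lives in mount.c.

```python
def find_unreachable_sharing(mounts):
    """Return ids of mounts whose master group has no member among `mounts`.

    Each mount is a dict with mnt_id, shared_id and master_id;
    an id of 0 means the mount has no such group.
    """
    shared_ids = {m["shared_id"] for m in mounts if m["shared_id"]}
    return [
        m["mnt_id"]
        for m in mounts
        if m["master_id"] and m["master_id"] not in shared_ids
    ]
```

With the mount from the log above (49, master_id 2, shared_id 0) and no
collected mount in shared group 2, the function reports mount 49.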
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Allocate it with xzalloc instead of a mass of
NULL assignments. Moreover, don't forget to
initialize @siblings.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This will be needed for fast parsing of procfs ns references.
[ xemul: Add user_ns_desc here ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When the readlinkat call in mntns_collect_root fails, we should close pdf.
Signed-off-by: Libo Chen <libo.chen@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Information about mount points is used for dumping fanotify.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
mnt_entry contains a few strings and they should be released too
CID 996198 (#4 of 4): Resource leak (RESOURCE_LEAK)
20. leaked_storage: Variable "pm" going out of scope leaks the storage
it points to.
CID 996190 (#1 of 1): Resource leak (RESOURCE_LEAK)
13. leaked_storage: Variable "new" going out of scope leaks the storage
it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
simfs is used in OpenVZ containers, so let's recognize it
and not fail when we meet it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise we will clean up the root mntns too.
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise if the root is mounted with MS_SHARED, pivot_root fails with EINVAL.
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
mnt_entry::fstype is a part of the image ABI, thus we need
to provide some "common" encoding so that outside tools
know how this field is encoded.
Thus we introduce an fstype enum in the .proto file and use it
in the source code as well.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These are structs that (now) tie together ns string
and the CLONE_ flag. It's nice to have one (some code
becomes simpler) and will help us with auto-namespaces
detection.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need to look up mount points by mount id
and device for fanotify restore.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Because /proc cannot be unmounted while any of its files is open.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The following files go into the directory arch/x86/include/asm unmodified:
- include/atomic.h,
- include/linkage.h,
- include/memcpy_64.h,
- include/types.h,
- include/bitops.h,
- pie/parasite-head-x86-64.S,
- include/processor-flags.h,
- include/syscall-x86-64.def.
* Changed include directives in the source files that include the headers
listed above.
* Modified build scripts to reflect the source moves.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Many image files opened by open_image_ro weren't closed before return; fix
them all in this patch.
Signed-off-by: Huang Qiang <h.huangqiang@huawei.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We don't need to call clean_mnt_ns() if we are going to do pivot_root().
"""
pivot_root moves the root file system of the current process to the
directory put_old and makes new_root the new root file system.
"""
So I suggest doing pivot_root() and then detaching the old root; all
other mount points will be unmounted automatically.
This patch fixes a problem that occurs when a new root is mounted on top of
a non-root mount point, which is the default configuration for OpenVZ.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Restoring namespaces requires executing external tools
(ip, tar, etc). We want to know their return codes, so we should
block the default SIGCHLD handler. Previously we did that for each
command; I suggest blocking SIGCHLD once, then restoring namespaces and
unblocking SIGCHLD.
The default SIGCHLD handler is used for catching target processes,
but all these processes (except the current one) are started after
restoring namespaces.
Currently we forgot to block SIGCHLD before executing "ip",
and this bug was caught.
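
The block-once pattern, sketched in Python rather than C. The function name
is hypothetical; signal.pthread_sigmask and subprocess.run are standard
library calls. It only illustrates the ordering (block, run all tools,
restore the mask), not the actual restore code.

```python
import signal
import subprocess


def run_tools_with_sigchld_blocked(commands):
    """Run each command and return the list of exit codes.

    SIGCHLD is blocked once around the whole batch, so no SIGCHLD handler
    runs while the helpers are being waited for, and the previous signal
    mask is restored afterwards even if a command cannot be started.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        return [subprocess.run(cmd).returncode for cmd in commands]
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```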
Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The devpts fs should be mounted and its content restored
when crtools restores terminals.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>