Each sub-namespace is restored as a sub-tree of the root mntns, so
the parent of a sub-mntns root is the root of the root mntns.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we restore nested mount namespaces, all but the root one (the sub-namespaces)
will be restored as sub-mounts in the root mount namespace. So mi->mountpoint
will not be '/' even if a mount is the root of its mntns.
v2: s/is_root/is_ns_root/
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to restore nested mount namespaces and we will need to
change root for each of them.
v2: don't call chdir a second time, because the path may be relative
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently it's initialized for the root mount namespace, but we are
going to dump nested mount namespaces.
It's used in open_mountpoint(), which is used in dump_tmpfs() and in
other callbacks.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We're about to collect root several times in a row, so keeping
the old one isn't required.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
BTRFS returns subvolume dev-id instead of superblock dev-id,
so we need to know which mounts are btrfs.
The mi->fstype->name is "unsupported" here, because the fstype->code
is saved in an image:
{
.name = "unsupported",
.code = FSTYPE__UNSUPPORTED,
},
{
.name = "btrfs",
.code = FSTYPE__UNSUPPORTED,
}
A second reason is that processes can be migrated from some other filesystem to btrfs.
This all can happen _only_ for the root mount and for bind mounts of
the root mount...
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
"relative path" is absolute path with dot at the beginning.
We already use relative paths on restore. In this patch we add "."
on dump too. It's convenient, because otherwise we need to add the dot each time
we want to access this mount point.
Before this patch we had to create a temporary copy.
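To illustrate the dot-prefixed form, here is a minimal sketch. The helper
name make_relative_path() is hypothetical and is not taken from the CRIU
sources; it only shows how an absolute mountpoint turns into a path that
is resolved against the current directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from CRIU): store "." followed by the
 * absolute path @abs in @buf. Returns 0 on success, -1 if @abs is
 * not absolute or the result does not fit into @size bytes.
 */
static int make_relative_path(const char *abs, char *buf, size_t size)
{
	size_t len;

	assert(abs != NULL && buf != NULL);

	len = strlen(abs);
	/* one byte for the leading dot, one for the trailing NUL */
	if (abs[0] != '/' || len + 2 > size)
		return -1;

	buf[0] = '.';
	memcpy(buf + 1, abs, len + 1);
	return 0;
}
```

With the current directory set to the root of the mount tree, "./dev/pts"
then names the same mountpoint as "/dev/pts" does inside that tree.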
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now we have two functions which do mostly the same thing, so this patch merges
them.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
st_dev and s_dev have different formats.
st_dev is (MAJOR(dev) << 8) | MINOR(dev)
s_dev is (MAJOR(dev) << 20) | MINOR(dev)
so we need to convert one of them
v2: use kdev_to_odev
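A rough sketch of such a conversion follows. Only the name kdev_to_odev
comes from the note above; the body, the KDEV_* constants and the
kdev_major()/kdev_minor() helpers are assumptions made for illustration.
The sketch relies on makedev() from <sys/sysmacros.h> to build the
encoding that stat() reports.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/* Assumed layout of the kernel-internal s_dev: major above 20 minor bits. */
#define KDEV_MINORBITS	20
#define KDEV_MINORMASK	((1U << KDEV_MINORBITS) - 1)

static inline unsigned int kdev_major(uint32_t kdev)
{
	return kdev >> KDEV_MINORBITS;
}

static inline unsigned int kdev_minor(uint32_t kdev)
{
	return kdev & KDEV_MINORMASK;
}

/* Convert a kernel-style device number into the dev_t seen via stat(). */
static inline dev_t kdev_to_odev(uint32_t kdev)
{
	return makedev(kdev_major(kdev), kdev_minor(kdev));
}
```

For example, device 8:3 is (8 << 20) | 3 in the kernel format, and the
converted value can be compared directly with st_dev from stat().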
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Only one function uses DIR, so I don't see a reason to return it
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's already used for dumping files and it will be used for restoring,
so it should be a service fd to avoid clashing with restored
descriptors.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The root mount is an external mount and its source is not necessarily '/'.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The source of the root mount may differ from "/" and we need to take
this into account when we bind-mount it somewhere.
For example:
11877 ? Ss 0:00 ./bind-mount --pidfile=bind-mount.pid --outfile=bind-mount.out --dirname=bind-mount.test
11880 ? Ss 0:00 \_ ./bind-mount --pidfile=bind-mount.pid --outfile=bind-mount.out --dirname=bind-mount.test
[root@avagin-fc19-cr crtools]# cat /proc/11880/mountinfo
68 42 8:3 /root/git/crtools/test / rw,relatime - ext4 /dev/sda3 rw,data=ordered
43 68 0:33 / /proc rw,relatime - proc proc rw
44 68 0:34 / /dev/pts rw,relatime - devpts pts rw,mode=666,ptmxmode=666
45 68 8:3 /root/git/crtools/test/zdtm/live/static/bind-mount.test/test /zdtm/live/static/bind-mount.test/bind rw,relatime - ext4 /dev/sda3 rw,data=ordered
Mount 45 is a bind-mount of mount 68.
mi(45)->root = /root/git/crtools/test/zdtm/live/static/bind-mount.test/test
mi(68)->root = /root/git/crtools/test
so the common part is "/root/git/crtools/test" and the command is
mount --bind /zdtm/live/static/bind-mount.test/test /zdtm/live/static/bind-mount.test/bind
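The prefix stripping from the example can be sketched as below. The
function name bind_source_in_ns() is hypothetical; it is a simplified
illustration, not the code from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given the root of a bind-mount (@bind_root) and
 * the root of the namespace root mount (@ns_root), both as shown in
 * mountinfo, return the bind source as seen from inside the namespace,
 * or NULL if @bind_root does not lie under @ns_root.
 */
static const char *bind_source_in_ns(const char *bind_root, const char *ns_root)
{
	size_t len;

	assert(bind_root != NULL && ns_root != NULL);

	len = strlen(ns_root);
	if (len == 1 && ns_root[0] == '/')
		return bind_root;		/* nothing to strip */
	if (strncmp(bind_root, ns_root, len) != 0)
		return NULL;
	if (bind_root[len] == '\0')
		return "/";			/* the bind source is the root itself */
	if (bind_root[len] != '/')
		return NULL;			/* e.g. "/ab" is not under "/a" */
	return bind_root + len;
}
```

For mounts 45 and 68 above this yields
"/zdtm/live/static/bind-mount.test/test", which is the source used in the
mount --bind command.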
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The devpts instance was mounted w/o the newinstance option if
its device number is equal to that of the root /dev/pts.
I think this condition is strong enough to not mount devpts in a
temporary place.
v2: move the host.bla-bla-bla in kerndat.c
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we mark all mounts as private before restoring mntns. We do
this to avoid problems with pivot_root.
It's wrong, because the root mount can be a slave of an external shared
group. The root mount is not mounted by CRIU, so there is nothing wrong with that.
Now look at the pivot_root code in the kernel:
if (IS_MNT_SHARED(old_mnt) ||
IS_MNT_SHARED(new_mnt->mnt_parent) ||
IS_MNT_SHARED(root_mnt->mnt_parent))
goto out4;
So we don't need to change options for all mounts. We only need to remount
/ and the parent of the new root. It's safe, because we are already in another
mntns.
v2: simplify code
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The root mount isn't always private. For example, it is mounted
as a slave in LXC 1.0 containers. So we need to execute the propagation
logic for the root mount too.
v2: move all logic about the root mount in a separate function
v3: make code more readable
v4: do_mount_root() looks like other do_*_root() functions
Reported-by: David Shwatrz <dshwatrz@gmail.com>
Cc: David Shwatrz <dshwatrz@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The current code thinks that /vz/lxc/centos-6-x86_64-root is
in /vz/lxc/centos-6-x86_64.
If the path is not equal to the mountpoint, we need to check that
the path contains a slash right after the mountpoint.
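The check described above can be sketched as follows. The function name
path_in_mountpoint() is hypothetical and the code is an illustration of
the rule, not the patched function itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if @path is @mountpoint itself or lies
 * below it. A plain prefix comparison is not enough, because
 * "/a/b-root" starts with "/a/b" but is not inside it.
 */
static bool path_in_mountpoint(const char *path, const char *mountpoint)
{
	size_t len;

	assert(path != NULL && mountpoint != NULL);

	len = strlen(mountpoint);
	if (strncmp(path, mountpoint, len) != 0)
		return false;
	/* a mountpoint ending in '/' (such as "/") already ends the component */
	if (len > 0 && mountpoint[len - 1] == '/')
		return true;
	/* either the same path, or a slash must follow the mountpoint */
	return path[len] == '\0' || path[len] == '/';
}
```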
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This should be done before restoring a mount tree. This patch is a part
of the series about moving pivot_root, which has been committed.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are using tar for restoring tmpfs. Currently we execute tar from a
restored root, but nobody guarantees that it is there and that it's
really tar.
We don't have a reason to change root too early. Let's stay in the source
root as long as we can, because we can be sure that it's consistent.
https://bugzilla.openvz.org/show_bug.cgi?id=2870
v2: remove redundant chdir()
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to restore mounts before changing root. For that the
current dir is changed to the new root and mounts are restored by
relative paths.
v2: don't use snprintf
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to do pivot_root after restoring the mount namespace,
so relative paths will be used for mountpoints.
v2: print the correct root in an error message
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If a process is in a different pidns than /proc, the link /proc/self doesn't
work.
(00.061569) Error (mount.c:558): Can't bind-mount
46:/zdtm/live/static/tempfs.test to /tmp/cr-tmpfs.gBVwTb: No such file
or directory
But since we've switched to the mount namespace (with setns) we
can just go and open the path by its name.
Reported-by: Urgen Sherpa <urgen.sherpa@nepallink.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some filesystems do not provide open-by-handle functionality. For those,
we should abort fsnotifies dumping, not restoring.
The open_mount() changes are about opening mountpoints inside another
mount namespace.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If a mount is a slave and also has a shared group, crtools must first make it
a slave, and only then can crtools make it shared.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The expression in if () becomes quite complex and
deserves a helper with proper explanation of what's
going on.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
All the entries with with_plugin set will be mounted by a plugin.
The interesting case is when we do the pivot-root restore. In this
case we call the restore callback very early (before we unmount the old
tree) and ask it to create the mountpoint at a temporary location.
Later we move the mount to its proper place.
The old_root argument of the callback is where it can find files
in the original mount namespace.
The is_file is a return argument. Since files and directories cannot be
bind-mounted to each other, the callback should create the mountpoint
itself and report whether it created a file or a directory.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
External bind mounts are those with source sitting outside of the
current FS view. Such are detected in validate_mounts(), so we
just go ahead and call plugins.
The plugin is provided with the mountpoint to decide whether it's
its own or not (what else does the guy need?) and an ID with which it
can identify the mountpoint in /proc. The same ID will be used at
restore time to find the needed restore info.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need images at hand while we do pivot_root (see further patches),
so prepare the images reading routine.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Remove whitespace at EOL (found by git grep ' $')
(the character before $ is a real tab, typed in the shell using Ctrl+V Tab)
To people using vim, I'd suggest adding the following code to ~/.vimrc:
let c_space_errors = 1
highlight FormatError ctermbg=darkred guibg=darkred
match FormatError /\s\+$\|\ \+\t\|\%80v.\|\ \{8\}/
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If we specify a new root for restore, old mounts get destroyed
with pivot_root + umount calls and the tree umount is omitted. In this
case mi-s are leaked.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Validation means -- check that we can _restore_ this tree.
Those read from proc can be in any state -- we're going to
umount them (can do anything) and do path resolution (works
for any knots as well).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Remove whitespace at EOL (found by git grep ' $')
To people using vim, I'd suggest adding the following code to ~/.vimrc:
let c_space_errors = 1
highlight FormatError ctermbg=darkred guibg=darkred
match FormatError /\s\+$\|\ \+\t\|\%80v.\|\ \{8\}/
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is more correct, since if the st_dev == phys_dev check fails
we have to treat phys_dev as kdev for the path resolve device
comparison.
However, this is not the case for non-btrfs FSs, and for btrfs
it doesn't change anything, as it uses anon devices
which are equal for kdev and odev cases.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When dumping a ghost file we put the real device in its header,
not the (btrfs) virtual one. This is done since we put real
devices into fsnotify images (we get them from proc). That
said, on fsnotify ghost restore we don't need to do path
resolution, just compare devices.
And one more thing. When dumping the device of a ghost file in the
_non_ btrfs case we have to convert the stat dev_t into the kernel
dev_t, as all the other places in criu manipulate the latter
ones.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Instead of scanning btrfs subvolumes (which can even be inaccessible
if the mount point lies on a directory instead of the subvolume itself) we use
the path resolving feature here -- once we need to figure out if some
device number needs to be altered up to the mount point (as we know, stat()
called on a subvolume returns st_dev for the subvolume itself, not
the one that is associated with the superblock and shown in /proc/self/mountinfo
output).
This as well implies that we need to check whether device numbers for ghost
files are to be updated to match mountinfo, thus we use the phys_stat_resolve_dev
helper here.
After this patch the previously merged btrfs engine is no longer needed
(at least it seems so) and can be dropped.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This routine is aimed at finding the mount point on which
the path passed as an argument lies. We walk over
all mount points and see which one matches.
Once found (in the worst case it will be the root mount point,
so the function never fails) we check whether this is
btrfs and if so return the subvolume0 device id.
See commit 921cf873f3
for details on what the hell we're doing here.
v2: rewrite mount_resolve_path w/o recursion
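The lookup can be sketched as a non-recursive scan that keeps the deepest
matching mount point. This is a simplified illustration over a flat array:
struct mnt_entry, match_len() and resolve_path() are hypothetical names,
the real code works on the mount tree, and the btrfs device handling is
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a mount entry; not the real struct mount_info. */
struct mnt_entry {
	const char *mountpoint;	/* absolute path, no trailing slash except "/" */
	unsigned int s_dev;	/* device id as shown in mountinfo */
};

/* Length of the match if @path lies on @mp, 0 otherwise. */
static size_t match_len(const char *path, const char *mp)
{
	size_t len = strlen(mp);

	if (strncmp(path, mp, len) != 0)
		return 0;
	if (len == 1 && mp[0] == '/')
		return 1;	/* the root matches every absolute path */
	if (path[len] != '\0' && path[len] != '/')
		return 0;	/* "/a/b-root" is not on "/a/b" */
	return len;
}

/* Return the entry with the longest matching mountpoint, or NULL. */
static const struct mnt_entry *resolve_path(const struct mnt_entry *tab,
					    size_t n, const char *path)
{
	const struct mnt_entry *best = NULL;
	size_t best_len = 0, i;

	assert(tab != NULL && path != NULL);

	for (i = 0; i < n; i++) {
		size_t len = match_len(path, tab[i].mountpoint);

		if (len > best_len) {
			best = &tab[i];
			best_len = len;
		}
	}
	return best;
}
```

As long as the table contains the root mount, every absolute path
resolves to some entry, which matches the "never fails" remark above.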
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For path resolution we will need the mount tree to be parsed
and built, but it's not that simple -- the current code
implies that once parsed the tree must not be re-parsed,
so we pass the @parse argument from the caller: if the task
we're restoring does not use a mount namespace, we should parse
the mount tree early, otherwise defer this action until the mount
tree is read from the image.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>