There's no such thing as fchroot() in Linux, but we need to do
chroot() into existing file descriptor. Before this patch we did
this by chroot()-ing into /proc/self/fd/$fd. W/o proc mounted it's
no longer possible, so do this like
fchdir(proc_service_fd);
chroot("./self/fd/$root_fd");
fchdir($cwd_fd);
Thanks to Andrey Vagin for this trick ;)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
We have a set of routines that open /proc/$pid files via proc service
descriptor. Teach them to accept non-pids as pids to open /proc/self/*
and /proc/* files via the same engine.
Signed-f-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When running test with ns/ prefix zdth.sh does complex preparations.
Make it possible to make them and let started process ready for
manual investigation.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
New vDSO are in stripped format so use dynamic
symbols instead of sectioned ones.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We're not sharing the code anymore so drop it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This fixes the support for fifo-s in mount namespaces and
makes it easier to control the correct open_path() usage in
the future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I will need to make cgroup test behave slightly differently
when it's in and out of ns/ run. To do so it's handy to use
the ZDTM_NEWNS variable set by zdtm.sh
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch consists of 3 unsplittable (from my POV) fixes.
1. Remove messy check from dump_one_mountpoint() -- we have
validate_mounts to check whether we can dump the tree
or not.
2. Other than being in the wron place the mentioned check
is wrong. Comparing of the length of the mp->source-s
makes no sense -- it should be mp->root, but even this
would be wrong...
3. ... instead, we should check for bind mount root path
being accessible from the target mount root path, i.e.
the bind->root should start with src->root.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
The memcpy() in devpts_dump() just overwrites part of them.
Fix this and move the whole code into sub-routine for future.
v2: Fix off-by-one error spotted by Filipe.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Filipe Brandenburger <filbranden@google.com>
We use config.h in vDSO handling code so arch
targets should depend on it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On restore find out in which sets tasks live in and move
them there.
Optimization note -- move tasks into cgroups _before_ fork
kids to make them inherit cgroups if required. This saves
a lot of time.
Accessibility note -- when moving tasks into cgroups don't
search for existing host mounts (they may be not available)
and don't mount temporary ones (may be impossible due to
user namespaces). Instead introduce service fd with a yard
of mounts.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Each task points to a single ID of cgroup-set it lives in. This
is done so to save some space in the image, as tasks likely
live in the same set of cgroups.
Other than this we keep track of what cgroup set we dump the
subtree from. If it happens, that root task lives in the same
cgroup set as criu does, we don't allow for any other sub-cgroups
and make restore (next patch) much simpler and faster.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The exact structure of the image will be revealed in the
next patch(es). What is important here, is that cgroup
image is somewhat new.
It will likely contain arrays of objects of different types,
so I introduce the "header" object, that will link these
arrays using pb repeated fields. This will help us to avoid
many image files for different cgroup objects and will make
the amount of write()-s required be 1.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we build vDSO handling code for all archs provided
in the source code having some "common" parts inside pie/vdso.c,
pie/vdso-stub.c, vdso-stub.c and vdso.c. This were more or
less well but in new linux kernels (starting from 3.16 presumably)
the vDSO has been significantly reworked so every architecture
must have own vDSO handling engine (just like the kernel does).
So in this patch we move vDSO code to arch specific and because
aarch64 actually doesn't implement proxification yet due to
kernel restrictions -- we drops it out. When there will be
kernel support we bring it back in proper arch/aarch64
implementation.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Guard vDSO code with CONFIG_VDSO, no need to even build it
on archs which do not support vDSO handling.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need it to figure out if architecture
needs vDSO handling code to be built. Note
currently only x86 is exporting vDSO simply
because ARM support is not yet ready.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is required when linking statically. It is also consistent with
the rules for other zdtm tests that link with -lrt.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Without the added dependencies, "make criu" will fail when trying to
build arch/x86/crtools.d because it can not find include/config.h, the
extra dependencies force the "config" rule to be processed before the
dependencies of "criu" are evaluated.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is needed for lib/rpc.pb-c.{d,o} to be removed by "make clean".
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
vmsplice doesn't work for such VMA-s.
This flags is set in a kernel function remap_pfn_range()
(remap kernel memory to userspace), which is widely used by device
drivers to provide direct access to a device memory.
Reported-by: J F <jgmb45@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When there is a bind mount present on same mountpoint
where mark is laying the device of both mountpoints
is the same so we might end up getting wrong mountpoint.
Instead we should used mount id which is unique among
all mounpoints.
Note it's a fast fix, I need to review fsnotify code
more and make sure all corner cases are covered.
https://bugzilla.openvz.org/show_bug.cgi?id=2974
Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Add extra troubleshooting messages for errors and failures so that the
output can help troubleshoot the issue.
Make error handling more uniform by using -1 for errors (e.g. opening
files, reading from pagemap, etc.) and 1 for failures (data mismatch,
pages not COWed when expected to be, etc.)
Tested: Ran this test with criu versions known to have problems with COW
and with problems to read from pagemap due to the dumpable flag not
being preserved.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The test proces opens master and slave points. The slave point becomes
the control terminal for this process. Then the test process forks a
second process and the first process closes the master point. So when a
master point is closed in a second process, the first one will be killed
by SIGHUP.
Reported-by: Victor Konyashkin <vkonyashkin@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
test/Makefile knows better how to execute tests.
For example it allows to execute tests simultaneously
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Preserve the dumpable flag, which affects whether a core dump will be
generated, but also affects the ownership of the virtual files under
/proc/$pid after restoring a process.
Tested: Restored a process with a criu including this patch and looked
at /proc/$pid to confirm that the virtual files were no longer all owned
by root:root.
zdtm tests pass except for cow01 which seems to be broken.
(see https://bugzilla.openvz.org/show_bug.cgi?id=2967 for details.)
This patch fixes https://bugzilla.openvz.org/show_bug.cgi?id=2968
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Change-Id: I8c386508448a84368a86666f2d7500b252a78bbf
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We stop searching if vma->start is bigger than a required one.
The coursor is set on the last examined vma. When we are searching a
parent vma for the next vma, we start examine vma-s starting from
coursor->next, so we don't examine the vma, which is pointed by cursor.
This patch replaces list_for_each_entry_continue on list_for_each_entry_from.
Reported-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A parent vma can be only one.
Fixes: 57d25e7cea12 ("mm: fix expression to determine which vma-s can be shared")
Reported-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Look at this hunk from 7659c995f58f:
- paddr = decode_pointer(vma_premmaped_start(&p->vma));
+ paddr = decode_pointer(vma->premmaped_addr);
Obviously we want to use p->premmaped_addr instead of
vma->premmaped_addr.
Fixes: 7659c995f58f ("vm: don't overwrite vma->shmid for private mappings")
Reported-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Here we test a big number of small VMAs to survive c/r. The main
idea is to use a big number of IOVs needed for page transferring.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In a worst scenario we need one IOV for every page we're transferring
from the parasite, thus don't divide by two here.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* VDSO is never mapped with MAP_GROWSDOWN
* The first page of growsdown vma may be a guard page, so any attempt
to read it is suppressed by SIGBUS.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We explicitly setns() every single task by hands when restoring
mount namespaces, they can be created without the NEWNS flag.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrey Vagin <avagin@parallels.com>
This test case creates two consecutive grows down vmas with a hole
between them.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>