mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 21:38:16 +00:00

4661 Commits

Author SHA1 Message Date
Pavel Emelyanov
bd7bddb889 zdtm: Add test for mount namespace w/o mountpoints
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-09 15:29:52 +04:00
Pavel Emelyanov
701f883765 restore: Do fchroot() via proc helpers
There's no such thing as fchroot() in Linux, but we need to do
chroot() into existing file descriptor. Before this patch we did
this by chroot()-ing into /proc/self/fd/$fd. W/o proc mounted it's
no longer possible, so do this like

fchdir(proc_service_fd);
chroot("./self/fd/$root_fd");
fchdir($cwd_fd);

Thanks to Andrey Vagin for this trick ;)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-09 15:29:50 +04:00
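The three-call sequence from 701f883765 can be sketched as a standalone helper. This is a minimal illustration, not the code from the patch: the function name `fchroot_via_proc` and its parameters are assumptions, and `proc_fd` is assumed to be a descriptor already opened on a procfs root.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: chroot() into the directory referred to by root_fd
 * without needing /proc mounted in the current mount namespace.
 * proc_fd is assumed to be open on a procfs root; cwd_fd is the directory
 * to return to afterwards.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
static int fchroot_via_proc(int proc_fd, int root_fd, int cwd_fd)
{
	char path[64];

	assert(root_fd >= 0);

	/* Step 1: make the procfs root the current directory. */
	if (fchdir(proc_fd) < 0)
		return -1;

	/* Step 2: chroot through the relative ./self/fd/<root_fd> link. */
	snprintf(path, sizeof(path), "./self/fd/%d", root_fd);
	if (chroot(path) < 0)
		return -1;

	/* Step 3: restore the working directory. */
	if (fchdir(cwd_fd) < 0)
		return -1;

	return 0;
}
```

Note that chroot() requires CAP_SYS_CHROOT, so the sketch only succeeds for a sufficiently privileged caller.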
Pavel Emelyanov
3659d60ab7 restore: Open /proc/sys/kernel/ns_last_pid via helpers
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-09 15:29:49 +04:00
Pavel Emelyanov
0066d5e813 restore: Open /proc/self/maps via helpers
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-09 15:29:47 +04:00
Pavel Emelyanov
8644ce9628 util: Prepare proc opening helpers to open any files
We have a set of routines that open /proc/$pid files via proc service
descriptor. Teach them to accept non-pids as pids to open /proc/self/*
and /proc/* files via the same engine.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-09 15:29:46 +04:00
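One way to picture "accepting non-pids as pids" from 8644ce9628 is a path builder that maps special values to /proc/self/* and /proc/* names. The sentinel names `PROC_SELF_ID` and `PROC_GENERIC_ID` and the function `proc_rel_path` are hypothetical and chosen only for this sketch; the actual helpers in criu may look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sentinel values meaning "not a real pid". */
enum { PROC_SELF_ID = 0, PROC_GENERIC_ID = -1 };

/*
 * Build a path relative to the procfs root:
 *   pid > 0          -> "<pid>/<name>"  (/proc/<pid>/<name>)
 *   PROC_SELF_ID     -> "self/<name>"   (/proc/self/<name>)
 *   PROC_GENERIC_ID  -> "<name>"        (/proc/<name>)
 * Returns the snprintf() result, or -1 for an unknown sentinel.
 */
static int proc_rel_path(char *buf, size_t size, int pid, const char *name)
{
	assert(buf != NULL && name != NULL);

	if (pid > 0)
		return snprintf(buf, size, "%d/%s", pid, name);
	if (pid == PROC_SELF_ID)
		return snprintf(buf, size, "self/%s", name);
	if (pid == PROC_GENERIC_ID)
		return snprintf(buf, size, "%s", name);
	return -1;
}
```

The resulting string could then be passed to openat() together with the proc service descriptor, so that no absolute /proc path is ever needed.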
Pavel Emelyanov
d9e7a5f155 zdtm: Add ability just to start the test
When a test is run with the ns/ prefix, zdtm.sh does complex preparations.
Make it possible to perform them and leave the started process ready for
manual investigation.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-09 15:29:44 +04:00
Cyrill Gorcunov
c09b7c2f37 vdso: x86 -- Use dynamic symbols for parsing
Newer vDSOs come in stripped format, so use dynamic
symbols instead of section-based ones.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-06 15:26:31 +04:00
Cyrill Gorcunov
3ca8b12ee7 vdso: x86 -- Drop DECLARE_VDSO macro
We're not sharing the code anymore so drop it.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-06 15:26:14 +04:00
Pavel Emelyanov
8a07349388 files: Fix open_path() to provide mntns root fd to callbacks
This fixes the support for fifo-s in mount namespaces and
makes it easier to control the correct open_path() usage in
the future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-06 12:20:02 +04:00
Pavel Emelyanov
b9c6cf3dd3 mnt: Strip commas from options string
Not all filesystems accept them. Besides, the options in the
image just look cleaner this way.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-06 12:19:15 +04:00
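The commit message for b9c6cf3dd3 does not say which commas are stripped. The sketch below assumes the goal is to drop leading, trailing, and repeated commas from an options string in place; the function name `strip_commas` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: normalise a mount options string in place so that
 * it has no leading, trailing, or doubled commas.
 */
static char *strip_commas(char *opts)
{
	char *src, *dst;

	assert(opts != NULL);

	for (src = dst = opts; *src; src++) {
		/* Skip a comma that is leading or follows another comma. */
		if (*src == ',' && (dst == opts || dst[-1] == ','))
			continue;
		*dst++ = *src;
	}

	/* Drop a single trailing comma, if one is left. */
	if (dst > opts && dst[-1] == ',')
		dst--;
	*dst = '\0';

	return opts;
}
```

With this sketch, an input such as ",rw,,mode=620," becomes "rw,mode=620".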
Pavel Emelyanov
0457c94c69 zdtm: Make it possible for test to get ZDTM_NEWNS variable
I will need to make cgroup test behave slightly differently
when it's in and out of ns/ run. To do so it's handy to use
the ZDTM_NEWNS variable set by zdtm.sh

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-06 12:18:50 +04:00
Pavel Emelyanov
30e95be264 mnt: Fix validation of dumpable mountpoints
This patch consists of 3 unsplittable (from my POV) fixes.

1. Remove messy check from dump_one_mountpoint() -- we have
   validate_mounts to check whether we can dump the tree
   or not.

2. Other than being in the wrong place, the mentioned check
   is itself wrong. Comparing the lengths of the mp->source-s
   makes no sense -- it should be mp->root, but even this
   would be wrong...

3. ... instead, we should check for bind mount root path
   being accessible from the target mount root path, i.e.
   the bind->root should start with src->root.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-04 19:34:54 +04:00
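Point 3 of 30e95be264 boils down to a string prefix test on the two root paths. A minimal sketch follows, assuming a trimmed-down `struct mount_entry` with only a `root` field; the struct and both function names are hypothetical, and the helper is a plain prefix comparison that does not check path-component boundaries.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Plain prefix test; criu's own helper may be implemented differently. */
static bool starts_with(const char *str, const char *prefix)
{
	assert(str != NULL && prefix != NULL);
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Hypothetical, trimmed-down mount record: only the field used here. */
struct mount_entry {
	const char *root;	/* root of the mount within its filesystem */
};

/* The bind mount is reachable if bind->root starts with src->root. */
static bool bind_root_reachable(const struct mount_entry *bind,
				const struct mount_entry *src)
{
	return starts_with(bind->root, src->root);
}
```

For example, a bind mount with root "/a/b" is reachable from a source mount with root "/a", while one with root "/c" is not.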
Pavel Emelyanov
3635f2c4b9 mnt: Relax checks for top-mount in validate_mounts
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-04 19:34:53 +04:00
Pavel Emelyanov
c75b7ab61c mnt: Devpts options get corrupted on dump (v2)
The memcpy() in devpts_dump() just overwrites part of them.
Fix this and move the whole code into sub-routine for future.

v2: Fix off-by-one error spotted by Filipe.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Filipe Brandenburger <filbranden@google.com>
2014-06-02 13:07:11 +04:00
Cyrill Gorcunov
06f559fcc7 vdso: make -- Arch targets depends on config
We use config.h in vDSO handling code so arch
targets should depend on it.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-28 00:51:07 +04:00
Pavel Emelyanov
441b9b9ee5 zdtm: Stupid test for task-in-cgroup
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-28 00:24:54 +04:00
Pavel Emelyanov
203c291467 cg: Restore tasks into proper cgroups
On restore, find out which sets tasks live in and move
them there.

Optimization note -- move tasks into cgroups _before_ forking
kids so that they inherit cgroups if required. This saves
a lot of time.

Accessibility note -- when moving tasks into cgroups don't
search for existing host mounts (they may be not available)
and don't mount temporary ones (may be impossible due to
user namespaces). Instead introduce service fd with a yard
of mounts.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:48:06 +04:00
Pavel Emelyanov
1ba9d2cae9 cg: Dump cgroups tasks live in
Each task points to a single ID of cgroup-set it lives in. This
is done so to save some space in the image, as tasks likely
live in the same set of cgroups.

Other than this, we keep track of which cgroup set we dump the
subtree from. If the root task happens to live in the same
cgroup set as criu does, we don't allow any other sub-cgroups,
which makes restore (next patch) much simpler and faster.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:48:06 +04:00
Pavel Emelyanov
8b8eb53a0a cg: Skeleton for cgroup code
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:48:06 +04:00
Pavel Emelyanov
06f7243380 image: Add bits and pieces for cgroups image
The exact structure of the image will be revealed in the
next patch(es). What is important here, is that cgroup
image is somewhat new.

It will likely contain arrays of objects of different types,
so I introduce the "header" object, that will link these
arrays using pb repeated fields. This will help us to avoid
many image files for different cgroup objects and will make
the amount of write()-s required be 1.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:48:06 +04:00
Pavel Emelyanov
b48e4cbfb8 proc: Introduce helper for parsing /proc/$pid/cgroup file
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:48:06 +04:00
Pavel Emelyanov
e5eb73ea48 util: Introduce strstartswith helper
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:48:06 +04:00
Cyrill Gorcunov
c473461d24 vdso: Make it arch specific
Currently we build vDSO handling code for all archs provided
in the source code, having some "common" parts inside pie/vdso.c,
pie/vdso-stub.c, vdso-stub.c and vdso.c. This worked more or
less well, but in new Linux kernels (starting from 3.16 presumably)
the vDSO has been significantly reworked, so every architecture
must have its own vDSO handling engine (just like the kernel does).

So in this patch we make the vDSO code arch specific, and because
aarch64 actually doesn't implement proxification yet due to
kernel restrictions -- we drop it. When kernel support appears,
we will bring it back in a proper arch/aarch64
implementation.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:41:31 +04:00
Cyrill Gorcunov
676708e3b3 vdso: Put CONFIG_VDSO where needed
Guard vDSO code with CONFIG_VDSO, no need to even build it
on archs which do not support vDSO handling.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:40:07 +04:00
Cyrill Gorcunov
89faae1e9b vdso: dump -- Drop duplicated include
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:40:00 +04:00
Cyrill Gorcunov
46661cf8b2 vdso: make -- Export VDSO and CONFIG_VDSO
We will need it to figure out if architecture
needs vDSO handling code to be built. Note
currently only x86 is exporting vDSO simply
because ARM support is not yet ready.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:39:51 +04:00
Filipe Brandenburger
ff1cf6b2ca zdtm: add missing entries to test/zdtm/.gitignore
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-26 17:38:56 +04:00
Filipe Brandenburger
5b14a32e35 zdtm: keep entries in test/zdtm/.gitignore sorted
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-26 17:38:54 +04:00
Filipe Brandenburger
733bbb337c make: use -pthread when linking with -lrt
This is required when linking statically.  It is also consistent with
the rules for other zdtm tests that link with -lrt.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-26 17:38:53 +04:00
Filipe Brandenburger
81fe07e54d make: respect USERCFLAGS in zdtm makefiles
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-26 17:38:52 +04:00
Filipe Brandenburger
2351d188c3 make: fix dependencies for "make criu" to work
Without the added dependencies, "make criu" will fail when trying to
build arch/x86/crtools.d because it can not find include/config.h, the
extra dependencies force the "config" rule to be processed before the
dependencies of "criu" are evaluated.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-26 17:38:51 +04:00
Filipe Brandenburger
c1c7d8984c make: clean up obj-ext-src-y objects on "make clean"
This is needed for lib/rpc.pb-c.{d,o} to be removed by "make clean".

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-26 17:38:49 +04:00
Andrey Vagin
f0cbc301fc mm: mark VM_IO and VM_PFNMAP VMA-s as unsupported
vmsplice doesn't work for such VMA-s.

These flags are set in the kernel function remap_pfn_range()
(which remaps kernel memory to userspace). It is widely used by device
drivers to provide direct access to device memory.

Reported-by: J F <jgmb45@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-23 13:34:16 +04:00
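From userspace, one place where VM_IO and VM_PFNMAP are visible is the "VmFlags:" line of /proc/<pid>/smaps, where they appear as the two-letter codes "io" and "pf". The sketch below assumes that line has already been read into a string; the function name `vma_is_unsupported` is hypothetical and this is not necessarily how f0cbc301fc detects such VMA-s.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * flags is the value part of a "VmFlags:" line, i.e. a space-separated
 * list of two-letter codes such as "rd wr mr mw me io pf".
 * Returns true if the VMA carries VM_IO ("io") or VM_PFNMAP ("pf").
 */
static bool vma_is_unsupported(const char *flags)
{
	const char *p = flags;

	assert(flags != NULL);

	while (*p) {
		size_t len;

		p += strspn(p, " \t\n");	/* skip separators */
		len = strcspn(p, " \t\n");	/* length of the next token */
		if (len == 2 && (!strncmp(p, "io", 2) || !strncmp(p, "pf", 2)))
			return true;
		p += len;
	}
	return false;
}
```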
Cyrill Gorcunov
19906f9182 fsnotify: Use mnt_id when restoring mount mark
When there is a bind mount present on the same mountpoint
where the mark lies, the device of both mountpoints
is the same, so we might end up getting the wrong mountpoint.
Instead we should use the mount id, which is unique among
all mountpoints.

Note it's a fast fix, I need to review fsnotify code
more and make sure all corner cases are covered.

https://bugzilla.openvz.org/show_bug.cgi?id=2974

Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-23 13:31:32 +04:00
Filipe Brandenburger
9f415abdb7 zdtm: fix cow01 test to fail on system errors
Add extra troubleshooting messages for errors and failures so that the
output can help troubleshoot the issue.

Make error handling more uniform by using -1 for errors (e.g. opening
files, reading from pagemap, etc.) and 1 for failures (data mismatch,
pages not COWed when expected to be, etc.)

Tested: Ran this test with criu versions known to have problems with COW
and with problems to read from pagemap due to the dumpable flag not
being preserved.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-21 00:51:44 +04:00
Andrey Vagin
8e1842aec1 zdtm/pty02: ignore SIGHUP
The test process opens master and slave points. The slave point becomes
the controlling terminal for this process. Then the test process forks a
second process and the first process closes the master point. So when the
master point is closed in the second process, the first one will be killed
by SIGHUP.

Reported-by: Victor Konyashkin <vkonyashkin@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-14 17:35:53 +04:00
Andrey Vagin
5a14695777 make: "make test" executes make in the test directory
test/Makefile knows better how to execute tests.
For example, it allows tests to be executed simultaneously.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-14 17:35:20 +04:00
Andrey Vagin
be10e0e157 git: update .gitignore
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-14 17:34:41 +04:00
Filipe Brandenburger
d5bb7e9748 dump: preserve the dumpable flag on criu dump/restore
Preserve the dumpable flag, which affects whether a core dump will be
generated, but also affects the ownership of the virtual files under
/proc/$pid after restoring a process.

Tested: Restored a process with a criu including this patch and looked
at /proc/$pid to confirm that the virtual files were no longer all owned
by root:root.

zdtm tests pass except for cow01 which seems to be broken.
(see https://bugzilla.openvz.org/show_bug.cgi?id=2967 for details.)

This patch fixes https://bugzilla.openvz.org/show_bug.cgi?id=2968

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Change-Id: I8c386508448a84368a86666f2d7500b252a78bbf
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-14 01:02:37 +04:00
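The dumpable flag discussed in d5bb7e9748 is read and written with prctl(). The following is a minimal sketch of saving and re-applying it, not the patch itself; the wrapper names are hypothetical. It only covers the values 0 and 1, since the kernel rejects other values passed to PR_SET_DUMPABLE.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Read the current dumpable flag; returns -1 on error. */
static int get_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}

/*
 * Apply a previously saved dumpable value (0 or 1).
 * Returns 0 on success, -1 on error.
 */
static int set_dumpable(int value)
{
	assert(value >= 0);
	return prctl(PR_SET_DUMPABLE, value, 0, 0, 0);
}
```

In a dump/restore flow, the value returned by get_dumpable() would be stored at dump time and handed to set_dumpable() in the restored task.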
Andrey Vagin
1a1b50168d vma: don't skip vmas during searching a parent vma
We stop searching if vma->start is bigger than the required one.
The cursor is set on the last examined vma. When we are searching for a
parent vma for the next vma, we start examining vma-s starting from
cursor->next, so we don't examine the vma which is pointed to by cursor.

This patch replaces list_for_each_entry_continue with list_for_each_entry_from.

Reported-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-14 01:00:31 +04:00
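The difference between the two iterators named in 1a1b50168d can be shown on a plain singly linked list. This sketch does not use the kernel-style list macros; `struct vma` here is a hypothetical two-field stand-in, and the two functions only mimic where each iterator starts walking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma list node. */
struct vma {
	unsigned long start;
	struct vma *next;
};

/* Mimics list_for_each_entry_continue: the walk begins AFTER the cursor. */
static struct vma *find_continue(struct vma *cursor, unsigned long start)
{
	struct vma *v;

	assert(cursor != NULL);

	for (v = cursor->next; v; v = v->next)
		if (v->start == start)
			return v;
	return NULL;
}

/* Mimics list_for_each_entry_from: the walk begins AT the cursor. */
static struct vma *find_from(struct vma *cursor, unsigned long start)
{
	struct vma *v;

	for (v = cursor; v; v = v->next)
		if (v->start == start)
			return v;
	return NULL;
}
```

When the wanted vma is the one the cursor points at, find_continue() misses it while find_from() returns it, which is the bug the commit describes.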
Andrey Vagin
c838494034 mem: stop searching a parent vma if we found one
There can be only one parent vma.

Fixes: 57d25e7cea12 ("mm: fix expression to determine which vma-s can be shared")
Reported-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-14 01:00:22 +04:00
Andrey Vagin
b57497ade5 mem: fix typo in determining an address of parent vma
Look at this hunk from 7659c995f58f:
-    paddr = decode_pointer(vma_premmaped_start(&p->vma));
+    paddr = decode_pointer(vma->premmaped_addr);

Obviously we want to use p->premmaped_addr instead of
vma->premmaped_addr.

Fixes: 7659c995f58f ("vm: don't overwrite vma->shmid for private mappings")
Reported-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-14 01:00:13 +04:00
Cyrill Gorcunov
ebd2e88c9a zdtm: maps05 -- Test a flood of small VMAs
Here we test that a big number of small VMAs survives c/r. The main
idea is to use up a big number of the IOVs needed for page transferring.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-12 14:46:48 +04:00
Cyrill Gorcunov
d8e5f617b5 mem: Don't shrink the number of IOVs needed for page transferring
In the worst-case scenario we need one IOV for every page we're transferring
from the parasite, thus don't divide by two here.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-12 14:46:41 +04:00
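The worst case from d8e5f617b5 can be illustrated by counting how many iovecs a set of pages needs when only adjacent pages may share one. This is an illustrative sketch rather than criu code: `count_iovs` and `SKETCH_PAGE_SIZE` are hypothetical names, and a 4096-byte page is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for the example */

/*
 * Count the iovecs needed to describe the given pages, merging pages that
 * are adjacent in memory into one iovec. addrs must be sorted ascending.
 */
static size_t count_iovs(const unsigned long *addrs, size_t nr_pages)
{
	size_t i, nr_iovs = 0;

	assert(addrs != NULL || nr_pages == 0);

	for (i = 0; i < nr_pages; i++)
		if (i == 0 || addrs[i] != addrs[i - 1] + SKETCH_PAGE_SIZE)
			nr_iovs++;
	return nr_iovs;
}
```

Four contiguous pages fit into one iovec, while four pages with holes between them need four, which is why the buffer has to be sized for one IOV per page.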
Andrey Vagin
85d366e8e3 vdso: don't scan MAP_GROWSDOWN vmas
* VDSO is never mapped with MAP_GROWSDOWN
* The first page of a growsdown vma may be a guard page, so any attempt
  to read it results in SIGBUS.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-12 14:46:33 +04:00
Cyrill Gorcunov
8a9e81b646 parasite-syscall: Print which @syscall_ip is selected
Useful for debugging.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-12 14:46:24 +04:00
Pavel Emelyanov
d513098f3c mount: Don't create kids with CLONE_NEWNS
We explicitly setns() every single task by hand when restoring
mount namespaces, so they can be created without the NEWNS flag.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrey Vagin <avagin@parallels.com>
2014-05-12 14:20:17 +04:00
Andrey Vagin
63c9478a4a zdtm: add a new test case for growdown vma-s
This test case creates two consecutive grows down vmas with a hole
between them.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-07 21:03:50 +04:00
Andrey Vagin
20ec585916 mem: add a guard page only if here is enough space for it
Currently we don't add a guard page to a second consecutive growsdown vma,
even if there is enough space for it. This is wrong. Look at the following test
output:

Execute zdtm/live/static/grow_map03
./grow_map03 --pidfile=grow_map03.pid --outfile=grow_map03.out
Dump 3888
Restore
Test: zdtm/live/static/grow_map03, Result: FAIL
==================================== ERROR ====================================
Test: zdtm/live/static/grow_map03, Namespace:
Dump log   : /root/git/criu/test/dump/grow_map03/3888/1/dump.log
--------------------------------- grep Error ---------------------------------
------------------------------------- END -------------------------------------
Restore log: /root/git/criu/test/dump/grow_map03/3888/1/restore.log
--------------------------------- grep Error ---------------------------------
pie: Error (pie/restorer.c:465): Unable to remap 0x7f0da2c99000 -> 0x7f46425fc000
pie: Error (pie/restorer.c:969): Restorer fail 3888
(00.035621) Error (cr-restore.c:1590): Restoring FAILED.
------------------------------------- END -------------------------------------
================================= ERROR OVER =================================

strace:
mremap(0x7fc3de5b6000, 0, 0, MREMAP_MAYMOVE|MREMAP_FIXED, 0x7f38dd4e0000) = -1 EINVAL (Invalid argument)

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-07 21:03:34 +04:00
Andrey Vagin
3a9c6a3d37 util: use glibc macros to generate device numbers in the dev_t format
Our versions of the macros are wrong.

Our macros:
#define MINOR(dev)           ((dev) & 0xff)

Glibc function:
return (__dev & 0xff) | ((unsigned int) (__dev >> 12) & ~0xff);

Reported-by: Amey Deshpande <ameyd@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-07 21:02:35 +04:00
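The mismatch described in 3a9c6a3d37 shows up for any minor number above 255. The sketch below compares the truncating macro quoted in the commit message with glibc's makedev()/minor() from <sys/sysmacros.h>; `NAIVE_MINOR` and `naive_minor_is_wrong` are names made up for this example.

```c
#include <assert.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* The truncating variant quoted in the commit message. */
#define NAIVE_MINOR(dev) ((dev) & 0xff)

/* Returns nonzero when the naive macro loses bits of the minor number. */
static int naive_minor_is_wrong(unsigned int maj, unsigned int min)
{
	dev_t dev = makedev(maj, min);

	/* glibc round-trips both numbers through its dev_t encoding. */
	assert(major(dev) == maj && minor(dev) == min);

	return NAIVE_MINOR(dev) != minor(dev);
}
```

For device 8:1 both variants agree, but for 8:300 the naive macro yields 44 (300 & 0xff) while minor() returns 300.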