mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 05:48:05 +00:00

4661 Commits

Author SHA1 Message Date
Pavel Emelyanov
1a5a034413 libcriu: Add add_ext_mount_map call
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-27 14:22:33 +04:00
Pavel Emelyanov
3f4447d72e libcriu: Add simple missing criu_set_ calls
These just copy the value into the RPC message and
do nothing more.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-27 14:22:32 +04:00
Pavel Emelyanov
a09396cca5 libcriu: Add criu_set_root to header
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-27 14:22:31 +04:00
Filipe Brandenburger
2812d04d29 make: fix "make criu" after arch-specific vdso broke it
Building criu with "make criu" on a clean tree was not working, failing on:

  make[1]: *** No rule to make target `arch/x86/vdso-pie.o'.  Stop.
  make: *** [arch/x86/vdso-pie.o] Error 2

git bisect traced the regression to commit c473461d24fd (vdso: Make it arch
specific) which apparently dropped the rule to build $(ARCH_DIR)/vdso-pie.o
using the pie rule.  Restore the dependency for "make criu" to work again from
a clean tree.

Tested:
$ git clean -fdx
$ make criu

Fixes: c473461d24fdfcd25542b427829a37fd2f0facb5

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-26 16:32:53 +04:00
Chris J Arges
c05b7f4153 Ensure LDFLAGS is passed to CC not LD.
If we build with something like:
make LDFLAGS="-Wl,-Bsymbolic-functions"

We'll get an error because the LDFLAGS are being passed to LD when they
should be passed to CC.

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-26 16:29:28 +04:00
Filipe Brandenburger
64dc66c29f dump: do not fail dump when robust_lists are disabled
Robust lists may be disabled, for example if the "futex_cmpxchg_enabled"
variable in the kernel is unset.

Detect that case by checking that both "get_robust_list" and "set_robust_list"
syscalls return ENOSYS and do not make criu dump fail in that case, but simply
assume an empty list, which is consistent with the syscalls not being
available.
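
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `struct sys_result`, `struct robust_info` and `resolve_robust_list()` are hypothetical names, and the sketch assumes the caller has already attempted both syscalls and recorded each return value together with its errno.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical record of one raw syscall attempt: return value plus errno. */
struct sys_result {
	long ret;
	int err;
};

/* Hypothetical container for what would be stored in the image. */
struct robust_info {
	uint64_t head; /* address of the robust list head, 0 when empty */
	uint32_t len;  /* length reported by the kernel, 0 when empty */
};

/*
 * Decide what to record for a task.
 * Returns 0 when the dump may continue, -1 when it should fail.
 */
int resolve_robust_list(struct sys_result get, struct sys_result set,
			uint64_t head, uint32_t len, struct robust_info *out)
{
	if (get.ret == 0) {
		/* Normal case: the kernel told us where the list is. */
		out->head = head;
		out->len = len;
		return 0;
	}

	/* Both syscalls missing: treat it as "robust lists are disabled". */
	if (get.err == ENOSYS && set.ret != 0 && set.err == ENOSYS) {
		out->head = 0;
		out->len = 0;
		return 0;
	}

	/* Any other failure is a real error. */
	return -1;
}
```

Requiring ENOSYS from both syscalls, rather than from `get_robust_list` alone, keeps an unrelated failure of one call from being mistaken for a kernel without robust list support.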

Tested: Successfully ran the zdtm test suite on a kernel where the
"get_robust_list" and "set_robust_list" syscalls are disabled.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 19:57:32 +04:00
Saied Kazemi
bbb3299f03 cg: skip name= in cgroup named hierarchies
Skip the string "name=" when recreating cgroups directories in cgyard.
For example, systemd's entries in cgroup.img are:

	name: "name=systemd"
	path: "/user/1000.user/4.session"

When creating the systemd subdir, name= should not be part of the name.
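
A minimal sketch of the prefix handling, assuming the controller name is available as a plain C string; `cg_dir_name()` is a hypothetical helper, not a function from the commit:

```c
#include <string.h>

/*
 * Return the directory name to use for a cgroup hierarchy:
 * the controller name with a leading "name=" skipped, if present.
 * The returned pointer aliases the input string.
 */
const char *cg_dir_name(const char *controller)
{
	static const char prefix[] = "name=";
	size_t n = sizeof(prefix) - 1;

	if (strncmp(controller, prefix, n) == 0)
		return controller + n;

	return controller;
}
```

With the image entry quoted above, `cg_dir_name("name=systemd")` yields `systemd`, while a regular controller name is returned unchanged.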

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 19:56:37 +04:00
Andrey Vagin
6f45c38c18 mount: parse devpts options
The newinstance option isn't shown in mountinfo. Currently it is
detected in devpts_dump. It is added only for root mounts and it
isn't added for bind-mounts. So mounts_equal(a, b, true) returns false
for such mounts and criu doesn't understand that they should be
bind-mounted.

Reported-by: Tycho Andersen <tycho.andersen@canonical.com>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 19:51:01 +04:00
Andrey Vagin
44356b37f2 mount: simplify devpts_dump
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 19:50:59 +04:00
Cyrill Gorcunov
817bf78523 zdtm: posix_timers -- Add definition of CLOCK_BOOTTIME
On PI we've noticed that CLOCK_BOOTTIME might not be defined
in system headers, so ship our own definition.
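
A fallback definition of this kind usually looks like the sketch below. The value 7 is the Linux clock id for CLOCK_BOOTTIME; `read_boottime()` is a hypothetical helper added only to show the constant in use, and the exact form used in the test is not shown in this log.

```c
#include <time.h>

/* Older system headers may lack the constant; 7 is its Linux clock id. */
#ifndef CLOCK_BOOTTIME
#define CLOCK_BOOTTIME 7
#endif

/* Read the boot-time clock; returns 0 on success, -1 on error. */
int read_boottime(struct timespec *ts)
{
	return clock_gettime(CLOCK_BOOTTIME, ts);
}
```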

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 19:37:01 +04:00
Cyrill Gorcunov
1764db0fa1 zdtm: Make arch specific tests to have \Space at the end
Otherwise we might have a clash

| Execute zdtm/live/static/vdso01ns/static/pipe00

Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 16:49:16 +04:00
Andrew Vagin
8b58c98086 files: Fix compilation on PI (a2)
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 16:08:06 +04:00
Tycho Andersen
4a012f1478 Fix typo
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 14:13:27 +04:00
Andrey Vagin
f5b67f5148 files: Fix compilation on PI
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 14:13:00 +04:00
Cyrill Gorcunov
905ceac7c5 zdtm: Add arch specific tests
To be able to run specific tests depending on
architecture we're executing on.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 14:11:50 +04:00
Cyrill Gorcunov
3423e3492b zdtm: Add vdso01 test case
It parses the vDSO in memory (just like CRIU does) and
then uses direct calls to vDSO entries instead of the
.plt/.got bundle. The reason for that -- I must
be sure we're able to make calls without relying
on libc in any way.

Note the test is x86-64 specific so I don't turn it
on in the test suite by default.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-25 14:11:24 +04:00
Pavel Emelyanov
fac7befa6b files: Sanity check for reg file on restore is not corrupted
When opening a reg file on restore -- check that the size of the file we
opened matches the one we saw on dump. This is not bullet-proof protection,
but is helpful to protect against FS updates between dump/restore.
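
The check can be sketched as below, assuming the size recorded at dump time is passed in by the caller. `check_reg_file_size()` is a hypothetical name and the error message is illustrative only.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Compare the current size of an opened file with the size
 * recorded at dump time. Returns 0 on match, -1 otherwise.
 */
int check_reg_file_size(int fd, off_t dumped_size)
{
	struct stat st;

	if (fstat(fd, &st) < 0) {
		perror("fstat");
		return -1;
	}

	if (st.st_size != dumped_size) {
		fprintf(stderr, "File size mismatch: %lld now, %lld at dump\n",
			(long long)st.st_size, (long long)dumped_size);
		return -1;
	}

	return 0;
}
```

A size comparison is cheap but, as the message says, not bullet-proof: a file rewritten with the same length passes the check.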

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 23:38:48 +04:00
Cyrill Gorcunov
99cd969595 zdtm: Add CLOCK_BOOTTIME test into posix_timers
To test the CLOCK_BOOTTIME feature recently implemented in the OpenVZ kernel.
The vanilla kernel and CRIU pass it.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 23:37:18 +04:00
Pavel Emelyanov
f52efcce0a cg: Mark yard mount as private
Otherwise cgroups sub-mounts may propagate to other namespaces
and the directory would become unremovable.
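
Changing the propagation type of an existing mount is done with mount(2) and the MS_PRIVATE flag. The sketch below shows the call in isolation; `make_mount_private()` is a hypothetical wrapper, and whether the commit uses exactly this flag combination is an assumption.

```c
#include <stdio.h>
#include <sys/mount.h>

/*
 * Mark the mount at @path as private so that mounts created
 * under it do not propagate to peers. Needs CAP_SYS_ADMIN.
 * Returns 0 on success, -1 on error.
 */
int make_mount_private(const char *path)
{
	/* Source, fstype and data are ignored for propagation changes. */
	if (mount("none", path, NULL, MS_PRIVATE, NULL) < 0) {
		perror("mount MS_PRIVATE");
		return -1;
	}

	return 0;
}
```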

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:51:09 +04:00
Andrew Vagin
876def9546 test: add a target to execute non-zdtm tests
make -C test other

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:50:43 +04:00
Andrey Vagin
ac2dc9fc7f test/ext-links: fix path to the ip tool
run.sh: line 17: xip: command not found

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:50:17 +04:00
Cyrill Gorcunov
1c4f8478ee vdso: x86 -- Make sure the mark version matches
Otherwise we have met a somehow corrupted mark and
must abort dumping.

Reported-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:48:44 +04:00
Cyrill Gorcunov
fe7b8aeb8c vdso: x86 -- Add handling of vvar zones
The new 3.16 kernel will have the old vDSO zone split into two vmas:
one for the vdso code itself and a second one, named vvar, for the data
referenced from the vdso code.

Because I can't do the 'dump' and 'restore' parts of the code separately
(otherwise the tests would fail) the commit is a pretty big one and hard to
read, so here is a detailed explanation of what's going on.

 1) When we start dumping we detect the vvar zone by reading /proc/pid/smaps
    and looking for the "[vvar]" token. Note the vvar zone is mapped
    by the kernel with PF/IO flags so we should not fail here.

    Also it's assumed that, at least for now, the kernel won't change
    much and the [vvar] zone always follows the [vdso] zone; otherwise
    criu will print an error.

 2) In previous commits we disabled dumping the vvar area contents, so
    the restorer code never tries to read vvar data, but we still need
    to map the vvar zone, thus the vma entry remains in the image.

 3) As with the previous vdso format we might have 2 cases

    a) Dump and restore happen on the same kernel
    b) Dump and restore are done on different kernels

    To detect which case we have, we parse the vdso data from the image
    and find the symbol offsets, then compare their values with the runtime
    symbols provided to us by the kernel. If they match and (!!!) the
    size of the vvar zone is the same -- we simply remap both zones
    from the runtime kernel into the positions the dumpee had at checkpoint
    time. This is the so-called "inplace" remap (a).

    If this happens the vdso_proxify() routine drops VMA_AREA_REGULAR
    from the vvar area provided by the caller code and the restorer won't try
    to handle this vma. It looks somewhat strange and probably should
    be reworked, but for now I left it as is to minimize the patch.

    In case of (b) we need to generate a proxy. We do that in the same
    way as before, but include the vvar zone in the proxy and save the
    vvar proxy address inside the vdso mark injected into the vdso area. Thus
    on a subsequent checkpoint we can detect the proxy vvar zone and rip
    it off the list of vmas to handle.
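
The choice between cases (a) and (b) in step 3 can be sketched as a comparison of two layouts. Everything here is illustrative: `struct vdso_layout`, `NR_VDSO_SYMS`, `enum vdso_remap` and `pick_vdso_remap()` are hypothetical names, and the real code tracks more state than symbol offsets and the vvar size.

```c
#include <stddef.h>

#define NR_VDSO_SYMS 4 /* hypothetical number of tracked vdso symbols */

/* Hypothetical summary of a vdso/vvar pair. */
struct vdso_layout {
	unsigned long sym_offset[NR_VDSO_SYMS]; /* symbol offsets in the vdso */
	unsigned long vvar_size;                /* size of the vvar zone */
};

enum vdso_remap {
	VDSO_REMAP_INPLACE, /* case (a): same layout, remap runtime zones */
	VDSO_REMAP_PROXY,   /* case (b): layouts differ, build a proxy */
};

/* Compare the layout from the image with the one of the running kernel. */
enum vdso_remap pick_vdso_remap(const struct vdso_layout *image,
				const struct vdso_layout *runtime)
{
	size_t i;

	for (i = 0; i < NR_VDSO_SYMS; i++) {
		if (image->sym_offset[i] != runtime->sym_offset[i])
			return VDSO_REMAP_PROXY;
	}

	/* Matching symbols are not enough: the vvar size must match too. */
	if (image->vvar_size != runtime->vvar_size)
		return VDSO_REMAP_PROXY;

	return VDSO_REMAP_INPLACE;
}
```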

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:48:43 +04:00
Cyrill Gorcunov
154d1c6c2c vdso: parasite -- Prepare new vdso mark structure.
Because of the new vvar area we need to carry the
address of the vvar proxy inside the mark. Thus
add the needed members and update the routines.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:48:43 +04:00
Cyrill Gorcunov
993205e3be vdso: util -- Show 'vvar' abreviature when meet VMA_AREA_VVAR
This is for debug purpose mostly.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:48:42 +04:00
Cyrill Gorcunov
0bb002ce69 vdso: dump -- Don't dump contents of vvar zone
The vvar zone is mapped by the kernel and must never
be dumped into the image; the data present there is
valid on the running kernel only.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:48:41 +04:00
Cyrill Gorcunov
72ead490e4 vdso: image -- Add VMA_AREA_VVAR flag
Will need it to handle vvar zones in a special way.

Because VMA_UNSUPP never goes into the image file,
let's reuse bit 12 for VVAR.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:48:40 +04:00
Filipe Brandenburger
e0b3018b71 git: add /dev to test/.gitignore
The /dev directory is also created by zdtm when running ns/ enabled tests.
Add it to the list, together with entries such as /bin and /lib.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:44:42 +04:00
Filipe Brandenburger
340a246444 zdtm: add missing entries to test/zdtm/.gitignore
This adds new tests "cgroup00" and "clean_mntns" to the .gitignore file.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:44:40 +04:00
Filipe Brandenburger
6cf2906b0a zdtm: add new dumpable02 test to check that dumpable flag set to 0 or 2 works
This confirms that the fix to handle dumpable flag set to 2 still works after
restore.

To force dumpable flag set to 0 or 2 (whatever the fs.suid_dumpable is set to),
chmod the test binary to 0111 (executable, but not readable) and execv() it
while running as non-root.  The kernel will unset the dumpable flag to prevent
a core dump or ptrace from giving the user access to the pages of the binary
(which are supposedly not readable by that user.)

Tested:
- # test/zdtm.sh static/dumpable02
  Test: zdtm/live/static/dumpable02, Result: PASS
- # test/zdtm.sh ns/static/dumpable02
  Test: zdtm/live/static/dumpable02, Result: PASS
- Used -DDEBUG to confirm the value of the dumpable flag was 0 or 2 to match
  the fs.suid_dumpable sysctl in the tests (both in and out of namespaces.)
- Confirmed that the test fails if the commit that fixes handling of dumpable
  flag with value 2 is reverted and the fs.suid_dumpable sysctl is set to 2.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:44:39 +04:00
Filipe Brandenburger
f662df452d restore: preserve dumpable flag when it is set to 2
Commit d5bb7e9748fd started to preserve the dumpable flag across migration by
using prctl to get the value on dump and set it back on restore.

In some situations, the dumpable flag can be set to 2.  This happens when it is
not reset (with prctl) after using setuid() or after using execv() on a binary
that has executable but not read permissions, when the fs.suid_dumpable sysctl
is also set to 2.  However, it is not possible to set it to 2 using prctl,
which would make criu restore fail.

Fix this by checking for the value before passing it to prctl.  In case the
value of the dumpable flag was 2 at the source, check whether it is already 2
at the destination, which is likely to happen if the fs.suid_dumpable sysctl is
also set to 2 where restore is running.  In that case, preserve the value,
otherwise reset it to 0 which is the most secure fallback.
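
The decision described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: `pick_dumpable()` and `restore_dumpable()` are hypothetical names, and only the `PR_GET_DUMPABLE` and `PR_SET_DUMPABLE` prctl options are real.

```c
#include <sys/prctl.h>

/*
 * Pick the value the restored task should end up with.
 * @dumped  is the flag saved at dump time,
 * @current is the flag of the task being restored.
 */
int pick_dumpable(int dumped, int current)
{
	if (dumped == 0 || dumped == 1)
		return dumped;

	/* 2 cannot be set via prctl; keep it only if it is already there. */
	if (dumped == 2 && current == 2)
		return 2;

	/* Most secure fallback. */
	return 0;
}

/* Apply the decision to the calling task; returns 0 on success. */
int restore_dumpable(int dumped)
{
	int current = prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
	int want;

	if (current < 0)
		return -1;

	want = pick_dumpable(dumped, current);
	if (want == current)
		return 0; /* nothing to change, covers the 2-stays-2 case */

	/* want is 0 or 1 here, both accepted by prctl. */
	return prctl(PR_SET_DUMPABLE, want, 0, 0, 0);
}
```

Because `pick_dumpable()` returns 2 only when the current value is already 2, the value 2 is never handed to `PR_SET_DUMPABLE`.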

Fixes: d5bb7e9748fda52a8653edd2804a4b15c4e4a1e1

Tested:
- Using dumpable02 zdtm test after setting fs.suid_dumpable to 2.
  # sysctl -w fs.suid_dumpable=2
  # test/zdtm.sh ns/static/dumpable02
  4: DEBUG: before dump: dumpable=2
  4: DEBUG: after restore: dumpable=2
  4: PASS
  Test: zdtm/live/static/dumpable02, Result: PASS

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:44:38 +04:00
Filipe Brandenburger
9f30b9e7e3 Revert "pie: A quick workaround for PR_SET_DUMPABLE == 2 restore error."
This reverts commit 8870aa1e4f86da5fad087a31fe93bacd95144e09.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:44:35 +04:00
Filipe Brandenburger
1176081e1f zdtm: add new dumpable01 test to check that dumpable flag is preserved
This confirms that the fix in commit d5bb7e9748fd to preserve the dumpable flag
after migration is working as expected.

In this test case, the dumpable flag is expected to always be set to 1, as
test_init will use prctl to reset it to 1 after using setuid and setgid.

Tested:
- # test/zdtm.sh static/dumpable01
  Test: zdtm/live/static/dumpable01, Result: PASS
- # test/zdtm.sh ns/static/dumpable01
  Test: zdtm/live/static/dumpable01, Result: PASS
- Confirmed that the test fails after reverting commit d5bb7e9748fd.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-24 22:44:34 +04:00
Cyrill Gorcunov
7f3de2889e restore: Make sure the last_pid is writen with zero offset
Otherwise I see on 3.16-rc1 and higher

| [  100.851730] futex wrote to ns_last_pid when file position was not 0!
| This will not be supported in the future. To silence this
| warning, set kernel.sysctl_writes_strict = -1
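
One way to guarantee a zero offset is to rewind the descriptor before each write, as in the sketch below. `write_pid_at_start()` is a hypothetical helper; it assumes the descriptor is kept open and reused, so the file position left behind by a previous write has to be reset.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Write @pid as a decimal string at offset 0 of @fd.
 * Returns 0 on success, -1 on error.
 */
int write_pid_at_start(int fd, int pid)
{
	char buf[32];
	int len;

	len = snprintf(buf, sizeof(buf), "%d", pid);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	/* A previous write may have advanced the file position. */
	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;

	if (write(fd, buf, len) != len)
		return -1;

	return 0;
}
```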

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-23 16:49:40 +04:00
Pavel Emelyanov
687c389478 iov: Add page_server_iov to iov and back helpers
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-20 16:35:54 +04:00
Pavel Emelyanov
3b995f1aef iov: Add iovec2pagemap() helper
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-20 16:35:52 +04:00
Pavel Emelyanov
cd34724092 iov: Add iov_init() helper
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-20 16:35:51 +04:00
Pavel Emelyanov
bb7ac03a5b iov: Add iov_grow_page() helper
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-20 16:35:14 +04:00
Andrey Vagin
997f08eaa6 vdso: don't forget to adjust vma_area_list->nr
A proxy vdso is removed from the vma_area_list list,
so vma_area_list->nr must be decremented.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-20 16:28:03 +04:00
Pavel Emelyanov
7edf0994c9 criu: Version 1.3-rc2
Next achievement -- external bind mounts and tasks-to-cgroups
bindings. Plus many bugfixes in memory restore and mountpoints
dump, many thanks to Google guys for reports and patches!

We have quite a few things left to make workable LXC and Docker
support, hopefully the next tag will be the 1.3 one :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v1.3-rc2
2014-06-18 13:34:36 +04:00
Saied Kazemi
8870aa1e4f pie: A quick workaround for PR_SET_DUMPABLE == 2 restore error.
[ xemul: It's a temporary workaround not to lock the -rc2 release.
  Once we have some better solution, this will be rolled back. ]

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 21:44:50 +04:00
Andrey Vagin
2ad1ba72fa zdtm: check bind-mounted files in static/mountpoints
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:40:23 +04:00
Andrey Vagin
494c044384 mount: dump one file system only once (v2)
A file system can be bind-mounted a few times and some of these mounts
can be non-root. We need to find one of root mounts and dump it.

v2: don't forget to check pm->dumped and pm->parent
    don't dump a root file system, it's always external for now.

Reported-by: Saied Kazemi <saied@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:40:00 +04:00
Andrey Vagin
697211908a tmpfs: use device number instead of mnt_id in image names
One file system can be mounted a few times, so mnt_id isn't unique for it.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:39:52 +04:00
Pavel Emelyanov
061d6cfadf mnt: Handle external bind mounts according to --ext-mount option (v3)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:36:52 +04:00
Pavel Emelyanov
c7e0042946 crtools: Introduce the --ext-mount-map option (v3)
On dump one uses one or more --ext-mount-map options with A:B arguments.
A denotes a mountpoint (as seen from the target mount namespace) criu
dumps and B is the string that will be written into the image file
instead of the mountpoint's root.

On restore one uses the same --ext-mount-map option(s) with similar
A:B arguments, but this time criu treats A as the string from the image's
root field (foobar in the example above) and B as the path in criu's
mount namespace that should be bind mounted into the mountpoint.

v3:
* Added documentation
* Added RPC bits
* Changed option name into --ext-mount-map
* Use colon as key and value separator

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:36:30 +04:00
Pavel Emelyanov
c3ea0ba06f mnt: Tossing bits around in validate_mounts
Just for simpler further patching.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:36:02 +04:00
Tycho Andersen
43c96be798 Allow dumping of pstore, securityfs, fusectl, debugfs
These are mounted by default in ubuntu containers, so criu should know about
them and remount them on restore.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-11 15:09:15 +04:00
Pavel Emelyanov
72a9372aff fs: Opening FE-s after fchdir doesn't work
It uses absolute file names, so any open-s should happen _before_
we change tasks' root.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-10 17:47:32 +04:00
Pavel Emelyanov
7aa7e95f7e fs: Don't hide error from prepare_fs
If fchroot() succeeds the further failures don't get
noticed by caller.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-06-10 17:47:31 +04:00