mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 13:28:27 +00:00

9923 Commits

Author SHA1 Message Date
Abhishek Dubey
20d4920a8b Adding --pre-dump-mode option
Two modes of pre-dump algorithm:
    1) splicing memory by parasite
        --pre-dump-mode=splice (default)
    2) using process_vm_readv syscall
        --pre-dump-mode=read

Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:39:02 -08:00
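
The two modes named in the commit above can be pictured with a small option-parsing sketch. This is only an illustration in Python's argparse; it is not how CRIU parses its command line, and the program name and function name are hypothetical. Only the option name, the two values and the default come from the commit message.

```python
import argparse


def parse_pre_dump_mode(argv):
    """Return the selected pre-dump mode ("splice" or "read")."""
    # Hypothetical stand-in parser; the real tool is not written in Python.
    parser = argparse.ArgumentParser(prog="pre-dump-sketch")
    parser.add_argument(
        "--pre-dump-mode",
        choices=("splice", "read"),  # the two modes listed in the commit
        default="splice",            # splice is described as the default
    )
    return parser.parse_args(argv).pre_dump_mode
```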
Dmitry Safonov
576a99f492 restorer/inotify: Don't overflow PIE stack
PATH_MAX == 4096; PATH_MAX*8 == 32k; RESTORE_STACK_SIZE == 32k.

Fixes: a3cdf948699c6 ("inotify: cleanup auxiliary events from queue")
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Andrei Vagin <avagin@gmail.com>
Co-debugged-with: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:38:24 -08:00
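
The arithmetic in the commit above can be checked directly. The sketch assumes that PATH_MAX*8 is the size of a buffer that lived on the restorer stack; the commit message gives only the three numbers, and the helper name is hypothetical.

```python
PATH_MAX = 4096                  # value quoted in the commit message
RESTORE_STACK_SIZE = 32 * 1024   # 32k, as quoted in the commit message


def fits_on_stack(buf_size, stack_size=RESTORE_STACK_SIZE):
    """A buffer only fits if it leaves at least some room for everything else."""
    return buf_size < stack_size


buf_size = PATH_MAX * 8          # 32768 bytes, the whole stack
```

With these numbers fits_on_stack(buf_size) is False: the buffer alone is as large as the stack.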
Nicolas Viennot
578597299a Cleanup do_full_int80()
1) Instead of tampering with the nr argument, do_full_int80() returns
the value of the system call. It also avoids copying all registers back
into the syscall_args32 argument after the syscall.

2) Additionally, the registers r12-r15 were added in the list of
clobbers as kernels older than v4.4 do not preserve these.

3) Further, GCC uses a 128-byte red-zone as defined in the x86_64 ABI
optimizing away the correct position of the %rsp register in
leaf-functions. We now avoid tampering with the red-zone, fixing a
SIGSEGV when running mmap_bug_test() in debug mode (DEBUG=1).

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:38:24 -08:00
Andrei Vagin
b84f481b55 unix: print inode numbers as unsigned int
Reported-by: Mr Jenkins
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:38:24 -08:00
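
Why the signedness of the format matters can be shown with a short sketch. It assumes a 32-bit inode value with the top bit set; the value 3000000000 is made up for illustration, and ctypes is used only to mimic how C would reinterpret the same bits.

```python
import ctypes


def as_signed32(value):
    """Reinterpret the low 32 bits as a signed int, like printing with %d in C."""
    return ctypes.c_int32(value).value


def as_unsigned32(value):
    """Reinterpret the low 32 bits as an unsigned int, like printing with %u in C."""
    return ctypes.c_uint32(value).value


ino = 3000000000  # hypothetical inode number above 2**31
signed_text = "%d" % as_signed32(ino)
unsigned_text = "%d" % as_unsigned32(ino)
```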
Andrei Vagin
3f1c4a17ad pipe: print pipe_id as unsigned to generate an external pipe name
Reported-by: Mr Jenkins
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:38:24 -08:00
Pavel Tikhomirov
b47ef26eac cgroup: fixup nits
1) s/\s*$//
2) fix snprintf out of bound access

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2020-02-04 12:38:24 -08:00
Andrei Vagin
f44939317f zdtm/cgroup_yard: create a test cgroup yard from the post-start hook
Right now, it is created from the pre-dump hook, but
if the --snap option is set, the test fails:
$ python test/zdtm.py run -t zdtm/static/cgroup_yard -f h --snap --iter 3
...
Running zdtm/static/cgroup_yard.hook(--pre-dump)
Traceback (most recent call last):
  File zdtm/static/cgroup_yard.hook, line 14, in <module>
    os.mkdir(yard)
OSError: [Errno 17] File exists: 'external_yard'

Cc: Michał Cłapiński <mclapinski@google.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:38:24 -08:00
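
The failure in the traceback above is what os.mkdir() does whenever it is called a second time on the same path, which is what happens if a hook that creates the yard runs on every iteration. A minimal sketch that reproduces it in a scratch directory (the function name is hypothetical; only the directory name comes from the log):

```python
import os


def run_hook_twice(base):
    """Create the yard twice and return the errno of the second attempt (0 if none)."""
    yard = os.path.join(base, "external_yard")
    os.mkdir(yard)          # first run of the hook: succeeds
    try:
        os.mkdir(yard)      # second run: the directory is already there
    except OSError as e:
        return e.errno
    return 0
```

Creating the directory from a hook that runs once per test avoids the second call altogether.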
Andrei Vagin
db40ef5be6 test/cgroup_yard: always clean up a test cgroup yard
Right now it is cleaned up from a post-restore hook,
but zdtm.py can be executed with the norst option:
$ zdtm.py run -t zdtm/static/cgroup_yard --norst
...
OSError: [Errno 17] File exists: 'external_yard'

Cc: Michał Cłapiński <mclapinski@google.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:38:24 -08:00
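
One way to picture "always clean up" is a try/finally around whatever the test does. This is a sketch of the general pattern, not the hook code from the commit, which does not say where the cleanup was moved; the helper name and the callback are hypothetical.

```python
import os


def with_yard(base, body):
    """Create the yard, run body(yard), and remove the yard whatever happens."""
    yard = os.path.join(base, "external_yard")
    os.mkdir(yard)
    try:
        return body(yard)
    finally:
        # Runs on success and on failure, so a later run never sees a stale yard.
        os.rmdir(yard)
```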
Radostin Stoyanov
813bfbeb4f Convert pr_msg() error messages to pr_err()
Print error messages to stderr (instead of stdout).

Suggested-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-02-04 12:38:23 -08:00
Radostin Stoyanov
a9f974b495 Introduce flush_early_log_to_stderr destructor
Prior to log initialisation, CRIU preserves all (early) log messages in a
buffer. In case of error, the content of this buffer needs to be printed
out (flushed).

Suggested-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-02-04 12:37:37 -08:00
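
The idea of the commit above, sketched in Python under stated assumptions: messages written before the log is initialised are kept in memory, and if initialisation never happens they are dumped to a fallback stream. The class and method names are hypothetical; only the buffering behaviour is taken from the commit message.

```python
class EarlyLog:
    """Buffer messages until a sink is attached; flush leftovers on demand."""

    def __init__(self):
        self.pending = []   # early messages, in arrival order
        self.sink = None    # set once logging is initialised

    def write(self, msg):
        if self.sink is None:
            self.pending.append(msg)
        else:
            self.sink.write(msg)

    def init(self, sink):
        """Attach the real log and replay everything buffered so far."""
        self.sink = sink
        for msg in self.pending:
            sink.write(msg)
        self.pending = []

    def flush_early_to(self, stream):
        """Emit buffered messages if the log was never initialised."""
        if self.sink is None:
            for msg in self.pending:
                stream.write(msg)
            self.pending = []
```

A Python analogue of the destructor would be atexit.register(log.flush_early_to, sys.stderr).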
Andrei Vagin
8bdc60d50e arch/x86: fpu_state->fpu_state_ia32.xsave has to be 64-byte aligned
Before the 5.2 kernel, only fpu_state->fpu_state_64.xsave had to be
64-byte aligned. But starting with the 5.2 kernel, the same is required
for fpu_state->fpu_state_ia32.xsave.

The behavior was changed in:
c2ff9e9a3d9d ("x86/fpu: Merge the two code paths in __fpu__restore_sig()")

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:37:37 -08:00
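
What "64-byte aligned" means for an address, as a small sketch. The helper names and the sample addresses are made up for illustration; only the 64-byte requirement comes from the commit message.

```python
XSAVE_ALIGN = 64  # alignment required by the commit message


def is_aligned(addr, align=XSAVE_ALIGN):
    """True if addr is a multiple of align."""
    return addr % align == 0


def align_up(addr, align=XSAVE_ALIGN):
    """Round addr up to the next multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)
```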
Radostin Stoyanov
4f24786b36 travis: Install missing diffutils dependency
The following tests fail in Fedora rawhide because /usr/bin/diff
is missing.

 * zdtm/static/bridge(ns)
 * zdtm/static/cr_veth(uns)
 * zdtm/static/macvlan(ns)
 * zdtm/static/netns(uns)
 * zdtm/static/netns-nf(ns)
 * zdtm/static/sit(ns)

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-02-04 12:37:37 -08:00
Michał Cłapiński
cf0080505a test: implement test for new --cgroup-yard option
Signed-off-by: Michał Cłapiński <mclapinski@google.com>
2020-02-04 12:37:37 -08:00
Michał Cłapiński
2f337652ad Add new command line option: --cgroup-yard
Instead of creating cgroup yard in CRIU, now we can create it externally
and pass it to CRIU. Useful if somebody doesn't want to grant
CAP_SYS_ADMIN to CRIU.

Signed-off-by: Michał Cłapiński <mclapinski@google.com>
2020-02-04 12:37:37 -08:00
Radostin Stoyanov
ad7e82a30f scripts: Drop Fedora 28/rawhide fix
This change was introduced with c75cb2b and it is no longer necessary.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-02-04 12:37:37 -08:00
Dmitry Safonov
3e9dc1c7f5 compel/x86: Don't use pushq for a label
`pushq` sign-extends the value, which is a bummer as the label's address
may be higher than 2Gb, which means that the sign bit will be set.

As it long-jumps with ia32 selector, %r11 can be scratched.
Use %r11 register as a temporary to push the 32-bit address.

Complements: a9a760278c1a ("arch/x86: push correct eip on the stack
before lretq")
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Reported-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:37:37 -08:00
Andrei Vagin
0d8e2477e9 arch/x86: push correct eip on the stack before lretq
Right now we use pushq, but it pushes a sign-extended value, so if the
parasite code is placed higher than 2Gb, we will see something like
this:

   0xf7efd5b0:	pushq  $0x23
   0xf7efd5b2:	pushq  $0xfffffffff7efd5b9
=> 0xf7efd5b7:	lretq

Actually we want to push 0xf7efd5b9 instead of 0xfffffffff7efd5b9.

Fixes: #398

Cc: Dmitry Safonov <dima@arista.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Acked-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-04 12:37:37 -08:00
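
The sign extension described in the two commits above can be reproduced with plain integer arithmetic. This sketch models only the widening of a 32-bit immediate to 64 bits; the function name is hypothetical and the addresses are the ones from the disassembly in the commit message.

```python
def sign_extend_imm32(imm):
    """Widen a 32-bit immediate to 64 bits by copying bit 31 into the upper half."""
    imm &= 0xFFFFFFFF
    if imm & 0x80000000:            # address above 2Gb: sign bit is set
        imm |= 0xFFFFFFFF00000000
    return imm


wanted = 0xF7EFD5B9
pushed = sign_extend_imm32(wanted)  # what ends up on the stack
```

Small values such as the 0x23 selector are unaffected, since their bit 31 is clear.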
Radostin Stoyanov
8ea953f18b cr-dump: Remove redundant if-statement
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-02-04 12:37:37 -08:00
Radostin Stoyanov
3eed47223b files-reg: Drop clear_ghost_files() prototype
The function clear_ghost_files() has been removed in commit
b11eeea "restore: auto-unlink for ghost files (v2)".

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-02-04 12:37:37 -08:00
Radostin Stoyanov
08f3b57ab3 py: Manual fixlets of code formatting
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2020-02-04 12:37:37 -08:00
Pavel Emelyanov
c703e3fd84 criu: Version 3.13
Here we have some bugfixes, huuuge *.py patch for coding style
and nice set of new features like 32bit for ARM, TLS for page
server and new mode for CGroups.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
v3.13
2019-09-11 11:29:31 +03:00
Radostin Stoyanov
72402c6e7a py: Fix tabs in code comments
These were left by yapf formatter

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2019-09-07 15:59:57 +03:00
Pavel Emelyanov
34dbf67b24 pyimages: Add pb2dict.py to checked and fix warnings/errors
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2019-09-07 15:59:57 +03:00
Radostin Stoyanov
6b615ca152 test/others: Reuse setup_swrk()
Reduce code duplication by taking setup_swrk() function into a separate
module that can be reused in multiple places.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2019-09-07 15:59:57 +03:00
Radostin Stoyanov
4c1ee3e227 test/other: Resolve Py3 compatibility issues
When Python 2 is not installed we assume that /usr/bin/python refers to
version 3 of Python and the executable /usr/bin/python2 does not exist.

This commit also resolves a compatibility issue with Popen where in
Py2 file descriptors will be inherited by the child process and in
Py3 they will be closed by default.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2019-09-07 15:59:56 +03:00
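
The Popen difference mentioned above can be handled in Python 3 by naming the descriptors the child should keep. This is a sketch for POSIX systems and Python 3 only, using the standard pass_fds argument; the helper names and the tiny child program are hypothetical and unrelated to the actual test code.

```python
import os
import subprocess
import sys

# Child program: write two bytes to the fd number given in argv[1].
CHILD = "import os, sys; os.write(int(sys.argv[1]), b'ok')"


def run_child(fd, inherit):
    """Spawn a child and return its exit status; hand fd over only if inherit is set."""
    extra = {"pass_fds": (fd,)} if inherit else {}
    return subprocess.call(
        [sys.executable, "-c", CHILD, str(fd)],
        stderr=subprocess.DEVNULL,
        **extra
    )


def demo():
    """Return (exit status, bytes the child wrote into the pipe)."""
    r, w = os.pipe()
    try:
        status = run_child(w, inherit=True)
        data = os.read(r, 2) if status == 0 else b""
        return status, data
    finally:
        os.close(r)
        os.close(w)
```

Without pass_fds the descriptor is closed in the child under Python 3, so the write there is expected to fail.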
Andrei Vagin
5aa72e7237 py: Reformat everything into pep8 style
As discussed on the mailing list, current .py files formatting does not
conform to the world standard, so we should better reformat it. For this
the yapf tool is used. The command I used was

  yapf -i $(find -name *.py)

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2019-09-07 15:59:56 +03:00
Pavel Tikhomirov
5ff4fcb753 zdtm: make inotify04 require restore
After adding the test for fake inotify events cleanup on restore, we've
detected the same problem on dump/predump. criu touches files that are
watched and generates fake events:

  [root@snorch criu]# test/zdtm.py run -t zdtm/static/inotify04 --norst -k always
  === Run 1/1 ================ zdtm/static/inotify04
  ======================== Run zdtm/static/inotify04 in h ========================
  Start test
  ./inotify04 --pidfile=inotify04.pid --outfile=inotify04.out --dirname=inotify04.test
  Run criu dump
  =[log]=> dump/zdtm/static/inotify04/36/1/dump.log
  ------------------------ grep Error ------------------------
  (00.004050) fsnotify: 			openable (inode match) as home/snorch/devel/criu/test/zdtm/static/inotify04.test/inotify-testfile
  (00.004052) fsnotify: 	Dumping /home/snorch/devel/criu/test/zdtm/static/inotify04.test/inotify-testfile as path for handle
  (00.004055) fsnotify: id 0x000007 flags 0x000800
  (00.004071) 36 fdinfo 5: pos:                0 flags:             4000/0
  (00.004080) Warn  (criu/fsnotify.c:336): fsnotify: The 0x000008 inotify events will be dropped
  ------------------------ ERROR OVER ------------------------
  Send the 15 signal to  36
  Wait for zdtm/static/inotify04(36) to die for 0.100000
  ############### Test zdtm/static/inotify04 FAIL at result check ################
  Test output: ================================
  18:20:10.558:    36: Event       0x20
  18:20:10.558:    36: Event       0x10
  18:20:10.558:    36: Event       0x20
  18:20:10.558:    36: Event       0x10
  18:20:10.558:    36: Event       0x20
  18:20:10.558:    36: Event       0x10
  18:20:10.558:    36: Event       0x20
  18:20:10.558:    36: Event       0x10
  18:20:10.558:    36: Read 8 events
  18:20:10.558:    36: FAIL: inotify04.c:105: Found 8 unexpected inotify events (errno = 11 (Resource temporarily unavailable))

   <<< ================================
  ##################################### FAIL #####################################

To suppress failures in Jenkins, make the inotify04 test 'reqrst'. We
still need to clean up (or not create) these events on dump/predump.
2019-09-07 15:59:56 +03:00
Adrian Reber
bbd922ed32 travis: add podman test case
This adds the same tests currently running for docker also for podman.
In addition this also tests podman --export/--import (migration)
support.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-09-07 15:59:56 +03:00
Sebastiaan van Stijn
2a76ecc9fd README: fix broken links to github.com/xemul/criu
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-07 15:59:56 +03:00
Sebastiaan van Stijn
1356a1def3 Replace references to github.com/xemul/criu
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-07 15:59:56 +03:00
Andrei Vagin
4e84d11c1f kerndat: remove unused code
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Andrei Vagin
25460af822 kerndat: mark functions as static which are used in kerndat.c only
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Dmitry Safonov
f6ab462074 vdso: Correctly track vdso position without vvar
If vvar is absent, vdso_before_vvar is initialized to "false", which
means that the check that is supposed to track the vdso/vvar pair went
into the wrong brackets. As a result, it broke CRIU on kernels that
don't have a vvar mapping.

Simplify the code by moving the check for VVAR_BAD_SIZE outside of the
conditional for vdso_before_vvar.

Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Fixes: 0918c7667647 ("vdso/restorer: Always track vdso/vvar positions in
vdso_maps_rt")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Tested-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Andrei Vagin
5f91f920a8 test: bring the lo interface up in each network namespace
This is needed to workaround the problem with "ip route save":
(00.113153) 	Running ip route save
Error: ipv4: FIB table does not exist.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Pavel Tikhomirov
e5bdcbbd1d zdtm/inotify: add a test that no unexpected events appear after c/r
Just create two inotify watches on a testfile and do nothing except
c/r; it is expected that there are no events in the queue afterwards.

before "inotify: cleanup auxiliary events from queue":

[root@snorch criu]# ./test/zdtm.py run -t zdtm/static/inotify04
=== Run 1/1 ================ zdtm/static/inotify04
======================== Run zdtm/static/inotify04 in h ========================
 DEP       inotify04.d
 CC        inotify04.o
 LINK      inotify04
Start test
./inotify04 --pidfile=inotify04.pid --outfile=inotify04.out --dirname=inotify04.test
Run criu dump
Run criu restore
Send the 15 signal to  60
Wait for zdtm/static/inotify04(60) to die for 0.100000
=============== Test zdtm/static/inotify04 FAIL at result check ================
Test output: ================================
18:37:14.279:    60: Event       0x10
18:37:14.280:    60: Event       0x20
18:37:14.280:    60: Event       0x10
18:37:14.280:    60: Read 3 events
18:37:14.280:    60: FAIL: inotify04.c:105: Found 3 unexpected inotify events (errno = 11 (Resource temporarily unavailable))

<<< ================================

v2: make two inotifies on the same file

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

zdtm: inotify04 add another inotify on the same file
2019-09-07 15:59:56 +03:00
Pavel Tikhomirov
96992883ca inotify: cleanup auxiliary events from queue
I've mentioned the problem that after c/r each inotify receives one or
more unexpected events.

This happens because our algorithm mixes setting up an inotify watch on
the file with opening and closing it.

We mix inotify creation and watched file open/close because we need to
create the inotify watch on the file from another mntns (generally). And
we do a trick opening the file so that it can be referenced in current
mntns by /proc/<pid>/fd/<id> path.

Moreover, if we have several inotifies on the same file, then the queue
gets even more events than the single one seen in the simple case.

note: For now we don't have a way to c/r events in the queue, but we
need to at least leave the queue clean of events generated by us.

Still, it looks harder to rewrite wd creation without this proc-fd
trick than to remove the unexpected events from the queues.

So just clean up these events for each fdt-restorer process, for each of
its inotify fds _after_ the restore stage (at CR_STATE_RESTORE_SIGCHLD).
This is the closest place where, for an _alive_ process, we know that all
prepare_fds() are done by all processes. This means we need to do the
cleanup in PIE code, so we need to add sys_ppoll definitions for PIE and
divide the process into two phases: first collect and transfer fds,
second do the real cleanup.

note: We still do prepare_fds() for zombies. But zombies have no fds in
/proc/pid/fd, so we will collect none in collect_fds() and therefore
have none in prepare_fds(); thus there is no need to clean up inotifies
for zombies.

v2: adapt to multiple unexpected events
v3: do not cleanup from fdt-receivers, done from fdt-restorer
v4: do without additional fds restore stage
v5: replace sys_poll with sys_ppoll and fix minor nits

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

use ppoll always and remove poll
2019-09-07 15:59:56 +03:00
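
The cleanup step described above amounts to reading and discarding whatever is already queued on a descriptor, without blocking. The sketch below shows that pattern with Python's select.poll and a zero timeout. It is a stand-in only: the commit does this in PIE code with sys_ppoll on inotify fds, while this example uses poll on an ordinary pipe, and the function name is hypothetical.

```python
import os
import select


def drain(fd, bufsize=4096):
    """Read and discard everything currently queued on fd; return bytes dropped."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    dropped = 0
    while poller.poll(0):           # zero timeout: never wait for new data
        data = os.read(fd, bufsize)
        if not data:                # end of file, nothing more to discard
            break
        dropped += len(data)
    return dropped
```

A second call right after the first returns 0, which is the "clean queue" state the commit aims for.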
Dmitry Safonov
10a831689e restorer: Use gettimeofday() from rt-vdso for log timings
Omit calling raw syscalls and use vdso for the purpose of logging.
That will eliminate as much as one-syscall-per-PIE-message.
Getting time without switching to kernel will speed up C/R,
keeping logs as informative as they were.

Fixes: #346

I haven't enabled vdso timings for ia32 applications as it needs more
changes and complexity.. Maybe later.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Dmitry Safonov
9e5c0634ff vdso: Add compatible property to vdso_maps
We need to distinguish compatible (ia32) vdso maps from x86_64 ones,
as that dictates the ABI of the vdso code.
Based on that, the 64-bit restorer decides whether (or not) to use
gettimeofday() from vdso.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Dmitry Safonov
23960fe60e seccomp/restorer: Disable gtod from vdso in strict mode
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Dmitry Safonov
90ecb82202 restorer/parasite-vdso: Don't move vvar if failed to move vdso
Also slight refactor.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Dmitry Safonov
53c2fdc955 vdso/restorer: Always track vdso/vvar positions in vdso_maps_rt
For simplicity, make them always valid in restorer.
rt->vdso_start will be used to calculate gettimeofday() address.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:56 +03:00
Dmitry Safonov
2d521f3c93 vdso/restorer: Try best to preserve vdso during restore
vdso will be used in restorer for timings in logs - try to keep it
during restore process.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:55 +03:00
Dmitry Safonov
28949d5fb8 compel/std/uapi: Provide setter for gettimeofday()
Provide a way to set the gettimeofday() function for an infected task.
CRIU's parasite & restorer are very verbose, as more logs are better
than fewer when investigating bugs.
In all modern kernels there is a way to get the time without entering
the kernel: vdso. So, add a way to reduce the cost of logging without
making it less valuable.

[I'm not particularly fond of the std_log_set_gettimeofday() name, so
 if someone can come up with a better name - I'm up for a change]

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:55 +03:00
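
The setter described above is a pluggable time source for the logger. A minimal Python sketch of that design follows; the names are hypothetical (set_gettimeofday only echoes the std_log_set_gettimeofday() name from the commit), and time.time() stands in for the slower default path.

```python
import time

_gettimeofday = None  # optional fast time source installed by the caller


def set_gettimeofday(fn):
    """Install a callable returning seconds as a float, or None to reset."""
    global _gettimeofday
    _gettimeofday = fn


def _now():
    # Use the installed source if there is one, else fall back to the default.
    return _gettimeofday() if _gettimeofday is not None else time.time()


def format_log(msg):
    """Prefix msg with a timestamp in the "(SS.UUUUUU)" style seen in the logs."""
    return "(%09.6f) %s" % (_now(), msg)
```

The logger itself does not change when the time source is swapped, which keeps the messages as informative as before.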
Dmitry Safonov
d2d6e3f537 compel/log: Use enum as parameter for std_log_set_loglevel()
Doesn't change uapi, but makes it a bit more friendly and documents
which loglevel means what for a foreign user.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2019-09-07 15:59:55 +03:00
Radostin Stoyanov
b25d1facae pb2dict: Disable undefined name 'basestring'
The following error is falsely reported by flake8:

lib/py/images/pb2dict.py:266:24: F821 undefined name 'basestring'

This error occurs because `basestring` is not available in Python 3,
however the if condition on the line above ensures that this error
will not occur at run time.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2019-09-07 15:59:55 +03:00
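
The pattern the commit above refers to looks roughly like this. It is a sketch of the general idiom and not the code at pb2dict.py line 266; the names string_types and is_string are hypothetical.

```python
import sys

if sys.version_info[0] < 3:
    # Only evaluated on Python 2, where basestring exists; flake8 cannot
    # know that, hence the suppression comment.
    string_types = basestring  # noqa: F821
else:
    string_types = str


def is_string(value):
    """True for text values on either major Python version."""
    return isinstance(value, string_types)
```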
Radostin Stoyanov
5721e61000 scripts: Install flake8 with dnf in Fedora
In the Fedora tests we install python3-pip only to install flake8.

This is not necessary as there is a Fedora package for flake8.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2019-09-07 15:59:55 +03:00
Radostin Stoyanov
2a683849b9 scripts: Set PYTHON=python3 in Fedora Dockerfiles
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2019-09-07 15:59:55 +03:00
Radostin Stoyanov
cd87a628e1 scripts: Remove yaml/ipaddress Py2 fedora modules
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2019-09-07 15:59:55 +03:00
Pavel Tikhomirov
77efcde96d mount: fix inconsistent return and goto err alternation
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2019-09-07 15:59:55 +03:00
Adrian Reber
229a8ab06b scripts: remove python2 from Fedora Dockerfiles
More and more python2 packages are being removed from future Fedora
releases. This removes python2 packages explicitly listed in CRIU's
Dockerfiles, which all are not required for the current level of
testing.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-09-07 15:59:55 +03:00