mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 05:48:05 +00:00

10170 Commits

Author SHA1 Message Date
Mike Rapoport
808684c99e Add CONTRIBUTING.md
Move the existing contribution guidelines to a dedicated file for future
extensions.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
2020-10-20 00:18:24 -07:00
Cyrill Gorcunov
6ee4b72382 arch/x86: Fix calculation of xstate_size
The layout of an xsave frame in the standard format is predefined by the hardware.
Let's make sure the frame offsets are increasing and use the latest offset where
appropriate.

https://github.com/checkpoint-restore/criu/issues/1042

Reported-by: Ashutosh Mehra <mehra.ashutosh@ibm.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
2020-10-20 00:18:24 -07:00
Kir Kolyshkin
1d9438aefb criu swrk: fix usage, allow common options
TL;DR: this makes it possible to use -v with criu swrk, and stops showing
usage, which is useless in swrk mode.

1. Since criu swrk command is not described in usage, there is no sense
   in showing it. Instead, show a one-line hint about how to use it.

2. In case some global options (like -v) are used, argv[1] might not
   point to "swrk". Use optind to point to a correct non-option
   argument.

3. While at it, also error out in case we have extra arguments.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
cbf099400a Travis: use Vagrant to run VMs
This adds the minimal configuration to run Fedora 31 based VMs on
Travis.

This can be used to test cgroupv2 based tests, tests with vdso=off and
probably much more which requires booting a newer kernel.

As an example this builds CRIU on Fedora 31 and reconfigures it to boot
without VDSO support and runs one single test.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
d72428b7c4 Also report clone3() errors correctly
Without clone3() CRIU was able to detect a process with a wrong PID only
in the already created child process. With clone3() this error can
happen before the process is created.

In the case of EEXIST this error will now be correctly forwarded to an
RPC client.

This was detected by running test/others/libcriu on a clone3() system.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
047ecd3a15 test/others/libcriu: test version library calls
This adds the previously added libcriu version functions to the libcriu
tests.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
55f71b8667 lib/c: add criu_get_version()
Although the CRIU version is exported in macros in version.h it only
contains the CRIU version of libcriu during build time.

As it is possible that CRIU is upgraded since the last time something
was built against libcriu, this adds functions to query the actual CRIU
binary about its version.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
ZeyadYasser
e57e74a18d criu: optimize find_unix_sk_by_ino()
Fixes: #339
Replaced the linear search with a hashtable lookup.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2020-10-20 00:18:24 -07:00
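The idea of replacing a linear search with a hashtable lookup keyed by inode number can be sketched as below. The struct, the bucket count and the function names are assumptions made for illustration; they are not taken from the CRIU sources.

```c
#include <stddef.h>

#define SK_HASH_SIZE 32 /* assumed bucket count; must be a power of two */

struct sk_desc { /* hypothetical stand-in for a socket descriptor */
	unsigned int ino;
	struct sk_desc *next; /* chain of entries sharing one bucket */
};

static struct sk_desc *sk_hash[SK_HASH_SIZE];

static unsigned int sk_bucket(unsigned int ino)
{
	return ino & (SK_HASH_SIZE - 1);
}

/* Insert at the head of the bucket chain. */
static void sk_hash_add(struct sk_desc *sk)
{
	unsigned int b = sk_bucket(sk->ino);

	sk->next = sk_hash[b];
	sk_hash[b] = sk;
}

/* Walk only the one bucket that can contain the inode. */
static struct sk_desc *sk_hash_find(unsigned int ino)
{
	struct sk_desc *sk;

	for (sk = sk_hash[sk_bucket(ino)]; sk; sk = sk->next)
		if (sk->ino == ino)
			return sk;
	return NULL;
}
```

With a reasonable spread of inode numbers, a lookup touches a short chain instead of every known socket.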
Kir Kolyshkin
62c03530c9 swrk: send notification instead of using status fd
When we use swrk, we have a mechanism to send notifications over RPC.
It is cleaner and more straightforward than sending \0 to status fd.

For now, both mechanisms are supported, although status fd request
option is now deprecated, so a warning is logged in case it's used.

Guess we can remove it in a few years.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-20 00:18:24 -07:00
Kir Kolyshkin
faf6dbf33e close_service_fd: rename to status_ready
The name close_service_fd() is misleading, as it not only closes the
status_fd, but also writes to it. On a high level, though, it signals
the other side that we are ready, so rename it to status_ready.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
e34f5dd3a3 clang: Branch condition evaluates to a garbage value
criu-3.14/criu/namespaces.c:692:7: warning: Branch condition evaluates to a garbage value

criu-3.14/criu/namespaces.c:690:3: note: 'supported' declared without an initial value
              protobuf_c_boolean supported;
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
criu-3.14/criu/namespaces.c:691:8: note: Calling 'get_ns_id'
              id = get_ns_id(pid, &time_for_children_ns_desc, &supported);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
criu-3.14/criu/namespaces.c:479:9: note: Calling '__get_ns_id'
      return __get_ns_id(pid, nd, supported, NULL);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
criu-3.14/criu/namespaces.c:454:6: note: Assuming 'proc_dir' is < 0
      if (proc_dir < 0)
          ^~~~~~~~~~~~
criu-3.14/criu/namespaces.c:454:2: note: Taking true branch
      if (proc_dir < 0)
      ^
criu-3.14/criu/namespaces.c:455:3: note: Returning without writing to '*supported'
              return 0;
              ^
criu-3.14/criu/namespaces.c:479:9: note: Returning from '__get_ns_id'
      return __get_ns_id(pid, nd, supported, NULL);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
criu-3.14/criu/namespaces.c:479:2: note: Returning without writing to '*supported'
      return __get_ns_id(pid, nd, supported, NULL);
      ^
criu-3.14/criu/namespaces.c:691:8: note: Returning from 'get_ns_id'
              id = get_ns_id(pid, &time_for_children_ns_desc, &supported);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
criu-3.14/criu/namespaces.c:692:7: note: Branch condition evaluates to a garbage value
              if (!supported || !id) {
                  ^~~~~~~~~~
690|   		protobuf_c_boolean supported;
691|   		id = get_ns_id(pid, &time_for_children_ns_desc, &supported);
692|-> 		if (!supported || !id) {
693|   			pr_err("Can't make timens id\n");
694|

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
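The warning above arises when a callee returns early without writing its output parameter. One common remedy, shown here as a sketch, is to initialise the variable at its declaration; the log does not show which fix the commit actually applied, and both function names below are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical callee: on the early-return path *supported is not written. */
static unsigned int get_id_sketch(int proc_dir, bool *supported)
{
	if (proc_dir < 0)
		return 0;

	*supported = true;
	return 42; /* arbitrary non-zero id for the illustration */
}

static int make_timens_id_sketch(int proc_dir)
{
	bool supported = false; /* initialised, so the branch below is well defined */
	unsigned int id = get_id_sketch(proc_dir, &supported);

	if (!supported || !id)
		return -1;
	return 0;
}
```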
Adrian Reber
b4c51ea492 coverity: fix FORWARD_NULL in criu/proc_parse.c: 1481
8. criu-3.14/criu/proc_parse.c:1511: var_deref_model: Passing null pointer "f" to "fclose", which dereferences it.
  1509|   	exit_code = 0;
  1510|   out:
  1511|-> 	fclose(f);
  1512|   	return exit_code;
  1513|   }

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
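A FORWARD_NULL report like the one above typically comes from a shared cleanup label that is also reached when the file was never opened. A minimal sketch of a guarded cleanup path follows; the function is hypothetical and is not the code from proc_parse.c.

```c
#include <stdio.h>

/* Hypothetical example: count newlines in a file, or return -1 on error. */
static int count_newlines_sketch(const char *path)
{
	FILE *f;
	int exit_code = -1, c, n = 0;

	f = fopen(path, "r");
	if (!f)
		goto out; /* f is NULL here */

	while ((c = fgetc(f)) != EOF)
		if (c == '\n')
			n++;
	exit_code = n;
out:
	if (f) /* fclose(NULL) is undefined behaviour, so guard it */
		fclose(f);
	return exit_code;
}
```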
Adrian Reber
64347398c1 coverity: fix RESOURCE_LEAK criu/timens.c: 67
7. criu-3.14/criu/timens.c:67: leaked_storage: Variable "img" going out of scope leaks the storage it points to.
    65|   	if (id == 0 && empty_image(img)) {
    66|   		pr_warn("Clocks values have not been dumped\n");
    67|-> 		return 0;
    68|   	}

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Radostin Stoyanov
f334102520 libcriu: Add space between 'if' and parenthesis
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-10-20 00:18:24 -07:00
Radostin Stoyanov
4ac9a3c904 libcriu: Use spaces around '='
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-10-20 00:18:24 -07:00
Radostin Stoyanov
ae4fd07ca5 libcriu: Add orphan pts master
The orphan pts master option was introduced with commit [1]
to enable checkpoint/restore of containers with a pty pair
used as a console.

[1] 6afe523d97

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-10-20 00:18:24 -07:00
Kir Kolyshkin
f6d1b498dc cr-service: spell out an error
While working on runc checkpointing, I incorrectly closed status_fd
prematurely, and received an error from CRIU, but it was
non-descriptive.

Do print the error from open().

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-20 00:18:24 -07:00
Kir Kolyshkin
00a44031e2 cr-service: fix wording in debug messages
The message "Overwriting RPC settings with values from <filename>" is
misleading, giving the impression that file is being read and consumed.
It really puzzled me, since <filename> didn't exist.

What it needs to say is "Would overwrite", i.e. if a file with such name
is present, it would be used.

Also, add actual "Parsing file ..." so it will be clear which files are
being used.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
00b8257d9f tests: move cross compilation to github actions
This moves the cross compilation tests to github actions, to slightly
reduce the number of Travis tests and run them in parallel on github
actions.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Adrian Reber
8452be93cf travis: use bionic almost everywhere
A few tests were still running on xenial because at some point they were
hanging. This switches now all tests to bionic except one docker test
which still uses xenial to test with overlayfs.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Kir Kolyshkin
5bd776da38 Remove dupe of "deprecated stuff on" msg
A similar one is already printed in check_options().

Before this patch:
> $ ./criu/criu -vvvvvv --deprecated --log-file=/dev/stdout xxx
> (00.000000) Turn deprecated stuff ON
> ...
> (00.029680) DEPRECATED ON
> (00.029687) Error (criu/crtools.c:284): unknown command: xxx

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-10-20 00:18:24 -07:00
Josh Abraham
8364b09407 soccr/test: Fix error logging in libsoccr tcp-test
Signed-off-by: Joshua Abraham <sinisterpatrician@gmail.com>
2020-10-20 00:18:24 -07:00
Guoyun Sun
277b0b69fa mips: fix fail when run zdtm test pthread01.c
k_rtsigset_t is 16 bytes on the MIPS architecture, not 8 bytes, so
blk_sigset_extended is added to TaskCoreEntry and ThreadCoreEntry to dump the
extra 8 bytes in parasite-syscall.c and restore them in cr-restore.c.

Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Guoyun Sun
be13941221 mips: implement arch_shmat()
MIPS CPUs with VIPT caches also have aliasing issues, just like ARMv6.
To overcome this, the kernel introduced page coloring with 0x40000 alignment for shared mappings (SHMLBA).
    https://github.com/torvalds/linux/blob/master/arch/mips/include/asm/shmparam.h

With this change, the zdtm tests ipc.c, shm.c, shm-unaligned.c and shm-mp.c pass.

Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Andrei Vagin
d38851c9bd test/jenkins: use bash to run shell scripts
We permanently have issues like this:
./test/jenkins/criu-iter.sh: 3: source: not found

It looks like a good idea to use one shell to run our jenkins scripts.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-10-20 00:18:24 -07:00
Nicolas Viennot
40169b950e style: fix typos
Oddly, one of the tests had a typo which should be fatal.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-10-20 00:18:24 -07:00
Guoyun Sun
b5c34c74c5 mips:support docker-cross compile
Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Guoyun Sun
afe90627e2 mips:criu: Enable mips in criu
Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Guoyun Sun
d325b7b775 mips:criu/arch/mips: Add mips parts to criu
Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Guoyun Sun
158e8f8fe6 mips:proto: Add mips to protocol buffer files
Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Guoyun Sun
e7d13b368d mips:compel: Enable mips in compel/
Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Guoyun Sun
ba0d6dbac1 mips:compel/arch/mips: Add architecture support to compel tool and libraries
This patch only adds the support but does not enable it for building.

Signed-off-by: Guoyun Sun <sunguoyun@loongson.cn>
2020-10-20 00:18:24 -07:00
Adrian Reber
8be1d457d7 net: fix coverity RESOURCE_LEAK
criu-3.12/criu/net.c:2043: overwrite_var: Overwriting "img" in "img =
open_image_at(-1, CR_FD_IP6TABLES, 0UL, pid)" leaks the storage that
"img" points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-10-20 00:18:24 -07:00
Radostin Stoyanov
eb732bcf0d util: Remove deprecated print_data() routine
The print_data() function was part of the deprecated (and removed)
'show' action, and it was moved in util.c with the following commit:

	a501b4804b3c95e1d83d64dd10ed95c37f0378bb
	The 'show' action has been deprecated since 1.6, let's finally drop it.

	The print_data() routine is kept for yet another (to be deprecated too)
	feature called 'criu exec'.

The criu exec feature was removed with:

	909590a3558560655c1ce5b72215efbb325999ca
	Remove criu exec code

	It's now obsoleted by compel library.
	Maybe-TODO: Add compel tool exec action?

Therefore, now we can drop print_data() as well.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2020-10-20 00:18:24 -07:00
Pavel Emelyanov
8c538ca10d page-read: Warn about async read w/o completion cb
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2020-10-20 00:18:24 -07:00
Nicolas Viennot
27ab533cbe tests: run tests with criu-image-streamer with --stream
One can pass --stream to zdtm.py for testing criu with image streaming.
criu-image-streamer should be installed in ../criu-image-streamer
relative to the criu project directory. But any path will do, provided
that criu-image-streamer can be found in the PATH env.

Added a few tests to run on travis-ci to make sure streaming works.
We run tests that are likely to fail. However, it would be good to run
all tests with `--stream -a` once in a while.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-10-20 00:18:24 -07:00
Nicolas Viennot
7d79a58f4d img-streamer: introduction of criu-image-streamer
This adds the ability to stream images with criu-image-streamer

The workflow is the following:
1) criu-image-streamer is started, and starts listening on a UNIX
   socket.
2) CRIU is started. img_streamer_init() is invoked, which connects to the
   socket. During dump/restore operations, instead of using local disk to
   open an image file, img_streamer_open() is called to provide a UNIX pipe
   that is sent over the UNIX socket.
3) Once the operation is done, img_streamer_finish() is called, and the
   UNIX socket is disconnected.

criu-image-streamer can be found at:
https://github.com/checkpoint-restore/criu-image-streamer

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-10-20 00:18:24 -07:00
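Step 2 of the workflow above relies on passing a pipe file descriptor over a UNIX socket. The standard mechanism for that is an SCM_RIGHTS control message, sketched below. The helper names send_fd() and recv_fd() are hypothetical, and the actual wire protocol between CRIU and criu-image-streamer is not shown in this log.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send fd over a connected UNIX socket; returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
	char dummy = 'F'; /* at least one byte of regular data must be sent */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from the socket; returns the new fd, or -1 on error. */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiver gets its own descriptor referring to the same pipe, so image data can then flow through the pipe without touching the local disk.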
Nicolas Viennot
51c3f8a908 pipes: loop over splice() when dumping a pipe's data
Instead of erroring, we should loop until we get the desired number of
bytes written, like regular I/O loops.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-10-20 00:18:24 -07:00
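splice() may move fewer bytes than requested, so a caller that needs a fixed amount has to loop. A minimal sketch of such a loop follows; the function name splice_all() is hypothetical and the code is not taken from CRIU's pipes.c.

```c
#define _GNU_SOURCE /* splice() is Linux-specific */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Move exactly len bytes from fd_in to fd_out (at least one of them must
 * be a pipe). Returns 0 on success, -1 on error or premature end of input.
 */
static int splice_all(int fd_in, int fd_out, size_t len)
{
	while (len > 0) {
		ssize_t ret = splice(fd_in, NULL, fd_out, NULL, len, 0);

		if (ret < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			return -1;
		}
		if (ret == 0)
			return -1; /* input ended before len bytes arrived */

		len -= (size_t)ret; /* short transfer: go around again */
	}
	return 0;
}
```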
Radostin Stoyanov
0708cbd883 remote: Use tmp file buffer when restore ip dump
When CRIU calls the ip tool on restore, it passes the fd of remote
socket by replacing the STDIN before execvp. The stdin is used by the
ip tool to receive input. However, the ip tool calls ftell(stdin)
which fails with "Illegal seek" since UNIX sockets do not support file
positioning operations. To resolve this issue, read the received
content from the UNIX socket and store it into temporary file, then
replace STDIN with the fd of this tmp file.

 # python test/zdtm.py run -t zdtm/static/env00 --remote -f ns
 === Run 1/1 ================ zdtm/static/env00

 ========================= Run zdtm/static/env00 in ns ==========================
 Start test
 ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
 Adding image cache
 Adding image proxy
 Run criu dump
 Run criu restore
 =[log]=> dump/zdtm/static/env00/31/1/restore.log
 ------------------------ grep Error ------------------------
 RTNETLINK answers: File exists
 (00.229895)      1: do_open_remote_image RDONLY path=route-9.img snapshot_id=dump/zdtm/static/env00/31/1
 (00.230316)      1: 	Running ip route restore
 Failed to restore: ftell: Illegal seek
 (00.232757)      1: Error (criu/util.c:712): exited, status=255
 (00.232777)      1: Error (criu/net.c:1479): IP tool failed on route restore
 (00.232803)      1: Error (criu/net.c:2153): Can't create net_ns
 (00.255091) Error (criu/cr-restore.c:1177): 105 killed by signal 9: Killed
 (00.255307) Error (criu/mount.c:2960): mnt: Can't remove the directory /tmp/.criu.mntns.dTd7ak: No such file or directory
 (00.255339) Error (criu/cr-restore.c:2119): Restoring FAILED.
 ------------------------ ERROR OVER ------------------------
 ################# Test zdtm/static/env00 FAIL at CRIU restore ##################
 ##################################### FAIL #####################################

Fixes #311

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2020-10-20 00:18:24 -07:00
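The workaround described in the commit above, draining a non-seekable descriptor into a temporary file so that ftell() works on it, can be sketched like this. The function name buffer_to_tmpfile() is hypothetical, and for brevity the sketch does not retry reads interrupted by signals.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Copy everything readable from src_fd into an anonymous temporary file
 * and return a seekable fd positioned at offset 0, or -1 on error.
 */
static int buffer_to_tmpfile(int src_fd)
{
	char buf[4096];
	ssize_t n;
	FILE *tmp = tmpfile();
	int fd;

	if (!tmp)
		return -1;

	while ((n = read(src_fd, buf, sizeof(buf))) > 0) {
		if (fwrite(buf, 1, (size_t)n, tmp) != (size_t)n) {
			fclose(tmp);
			return -1;
		}
	}
	if (n < 0 || fflush(tmp) != 0) {
		fclose(tmp);
		return -1;
	}

	/* Keep a plain descriptor that survives fclose(). */
	fd = dup(fileno(tmp));
	fclose(tmp);
	if (fd < 0)
		return -1;

	if (lseek(fd, 0, SEEK_SET) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would then install the returned descriptor as stdin, for example with dup2(fd, STDIN_FILENO), before executing the external tool.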
Radostin Stoyanov
01cab14dfa util: Fix addr casting for IPv4/IPv6 in autobind
When saddr.ss_family is AF_INET6 we should cast &saddr to
(struct sockaddr_in6 *).

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-10-20 00:18:24 -07:00
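The casting rule from the commit above, choosing the concrete sockaddr type from ss_family, can be illustrated with a small helper. The function sockaddr_port() is a hypothetical example and not the autobind code itself.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return the port in host byte order, or -1 for an unsupported family. */
static int sockaddr_port(const struct sockaddr_storage *saddr)
{
	if (saddr->ss_family == AF_INET)
		return ntohs(((const struct sockaddr_in *)saddr)->sin_port);

	if (saddr->ss_family == AF_INET6)
		return ntohs(((const struct sockaddr_in6 *)saddr)->sin6_port);

	return -1;
}
```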
Adrian Reber
be2ded15ee test: fix flake8 errors
The newest version of flake8 reports errors that variable names like 'l'
should not be used, because they are hard to read.

This changes 'l' to 'line' to make flake8 happy.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-06-06 11:46:14 -07:00
Adrian Reber
d23d1fc0f9 travis: fix alpine builds
With the latest version of the alpine container image it seems that
alpine changed a few package names. This adapts the alpine container
to solve the travis failures.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-06-06 11:45:21 -07:00
Adrian Reber
f2edc1e199 Update certificates for failing tls based tests
When using zdtm.py with --tls it started to fail as the certificates
seem to have expired. The following commands have been used to re-generate
the certificates:

            # Generate CA key and certificate
            echo -ne "ca\ncert_signing_key" > temp
            certtool --generate-privkey > cakey.pem
            certtool --generate-self-signed \
                --template temp \
                --load-privkey cakey.pem \
                --outfile cacert.pem

            # Generate server key and certificate
            echo -ne "cn=$HOSTNAME\nencryption_key\nsigning_key" > temp
            certtool --generate-privkey > key.pem
            certtool --generate-certificate \
                --template temp \
                --load-privkey key.pem \
                --load-ca-certificate cacert.pem \
                --load-ca-privkey cakey.pem \
                --outfile cert.pem
            rm temp cakey.pem

Without this, tests will fail in Travis.

Signed-off-by: Adrian Reber <areber@redhat.com>
2020-06-05 11:37:50 -07:00
Pavel Emelyanov
95ead14874 criu: Version π
The long-tempting release with lots of new features on board.
We finally have time namespace support, a great improvement of
the pre-dump memory consumption, new clone3 support and many
more.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
v3.14
2020-04-29 16:31:49 +03:00
Kir Kolyshkin
5c5e7695a5 get_clean_mount: demote an error to a warning
When testing runc checkpointing, I frequently see the following error:

> Error (criu/mount.c:1107): mnt: Can't create a temporary directory: Read-only file system

This happens because the container root is a read-only mount.

The error here is not actually fatal since it is handled later
in ns_open_mountpoint() (at least since [1] is fixed), but it is shown
as error in runc integration tests.

Since it is not fatal, let's demote it to a warning to avoid confusion.

[1] https://github.com/checkpoint-restore/criu/issues/520

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-04-25 00:43:23 -07:00
Andrei Vagin
c83a0aae2c proc: parse clock symbolic names in /proc/pid/timens_offsets
Clock IDs in this file have been replaced by clock symbolic names.

Now it looks like this:
    $ cat /proc/774/timens_offsets
    monotonic      864000         0
    boottime      1728000         0

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-04-25 00:43:23 -07:00
Pavel Tikhomirov
7dc89376b8 pstree: improve error handling in read_pstree_image
First, don't free pstree_item objects, as they are allocated with shmalloc
on restore. Second, always call pstree_entry__free_unpacked on PstreeEntry.
Third, remove all breaks, replacing them with an implicit goto err, so that
it is easier to see that we are on the error path. Fourth, split out the
code for reading one pstree item into a separate function.

Sadly there is not much use in xfree-ing pi->threads because in case of
an error we still have ->threads unfreed from previous entries anyway.

But at least some cleanup can be done here.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2020-04-25 00:43:23 -07:00
Pavel Tikhomirov
42b5700b72 kerndat remove duplicate call to kerndat_nsid()
Func kerndat_nsid() is called twice.

v2: leave kerndat_nsid call near kerndat_link_nsid

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2020-04-25 00:43:23 -07:00
Nicolas Viennot
2c2fdd3334 parasite-msg: %u is not implemented for parasite code
Changed all the %u into %d.

Ideally, we should implement the %u format for parasite code.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-04-25 00:43:23 -07:00
Nicolas Viennot
ef7ef9cfa0 kerndat: remove duplicate call to kerndat_socket_netns()
kerndat_socket_netns() is called twice. We keep the latter to avoid
changing the behavior.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2020-04-25 00:43:23 -07:00