mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 22:05:36 +00:00
Commit Graph

10636 Commits

Author SHA1 Message Date
Pavel Tikhomirov
d0511319e5 github: Add templates for new issues and pull requests
This way users would be able to create more meaningful pull-requests
and issues. And we would not need to ask them to provide basic
information each time.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
3c10d3335b criu(8): document --join-ns option
The --join-ns option was introduced with commits:

https://github.com/checkpoint-restore/criu/commit/2cf17cd
https://github.com/checkpoint-restore/criu/commit/790ec46

Closes: #1054

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Pavel Tikhomirov
80ee4f8aec kdat: make uffd_open return errno from syscall separately
Previously kerndat_uffd could not differentiate -EPERM and -1 returned
from uffd_open(). That way "Failed to get uffd API" and "Incompatible
uffd API ..." errors were just ignored, which is probably not what we
want.

v2: rework with extra argument of uffd_open for errno, rename err
label in uffd_open for readability

Fixes: cfdeac4a4 ("kerndat: Handle non-root mode when checking uffd")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
a8525c07d4 ci: no longer avoid overlayfs
Now that the Ubuntu kernel is no longer broken with regards to
overlayfs, let's switch back to overlayfs instead of devicemapper and
vfs graphdrivers.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
2aa4185a6c test/others: refactor loop process
There are several problems with the loop.sh script. First, the code is
duplicated across tests in the so-called 'others' category. Second, we
need to run it with the 'setsid' utility to make sure that it runs in
a new session. Third, we have to redirect the standard file descriptors
and use the '&' operator to make it run in the background. Finally,
obtaining the PID of the 'loop.sh' process resulted in a race condition.

In this patch we replace the loop.sh script with a program that would
address all problems mentioned above. The requirements for this program
are as follows.
- It must be reusable across tests
- It must start a process that is detached from the current shell
- It must wait for the process to start and output its PID

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
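The three requirements listed in commit 2aa4185a6c can be illustrated with a short Python sketch. This is not the program added by the patch; the helper name `start_detached` is hypothetical, and the sketch relies only on the standard `subprocess` module.

```python
import subprocess


def start_detached(argv):
    """Start argv in a new session, detached from the caller's terminal.

    Hypothetical helper for illustration, not the program from the patch.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # do not share the standard descriptors
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,     # the child runs in its own session
    )
    # The PID is known as soon as Popen returns, so there is no need to
    # look it up afterwards (which is where a race could creep in).
    return proc
```

A caller would print `proc.pid` and later signal or wait for that PID; the same helper can be reused by any test that needs a background process.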
Radostin Stoyanov
2b78d95e6b test/others: drop '_exit' function
The function name '_exit' is misleading as this function doesn't
actually exit when the status of the previous command is zero.
In addition, the behaviour of this function is not really needed.

This patch removes the '_exit' function and applies the correct
behaviour to stop the test on failure.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Andrei Vagin
34410b9e75 test: add a test to check that sigtrap handlers are restored
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
b310fbd31f ksigset: fix a typo in ksigdelset
Fixes: 8063eb8fe6 ("parasite: don't block SIGTRAP")
Reported-by: zl-wang <zlwang@ca.ibm.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Pavel Tikhomirov
c1b2d194e9 mem/pidfd: fix poll retry error checking
One should never rely on errno if a libc syscall is successful. We can
either see an errno set from some previous failed syscall or even errno
set by this successful libc syscall. So let's check ret first.

Fixes: 1ccdaf47 ("criu: add pidfd based pid reuse detection for RPC
clients")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
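The rule from commit c1b2d194e9 (check the return value first, consult errno only on failure) can be sketched in Python through `ctypes`. This is an illustration of the general rule, not the CRIU code; the helper `checked` is hypothetical, and `ctypes.CDLL(None)` is assumed to resolve libc symbols, as it does on Linux.

```python
import ctypes
import errno
import os

# use_errno=True makes ctypes keep a per-thread copy of errno for us.
libc = ctypes.CDLL(None, use_errno=True)


def checked(ret):
    """Return (ret, err); errno is read only when ret signals failure."""
    if ret < 0:
        return ret, ctypes.get_errno()
    return ret, 0


# A failing call: close(-1) returns -1 and sets errno to EBADF.
failed = checked(libc.close(-1))

# A successful call: getpid() cannot fail, but errno may still hold a
# stale value from an earlier call, so it must not be consulted here.
succeeded = checked(libc.getpid())
```

Looking at errno before looking at `ret` would risk treating the successful call as a failure.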
Zeyad Yasser
1c08709cdb zdtm: add pidfd store based pid reuse test
This is just a symlink to the original transition/pid_reuse test with
the right options passed to trigger the pidfd store based pid reuse
detection code path.

Pidfd store based detection is supported only in RPC mode which
requires passing a unix socket fd to be used as pidfd store and
the kernel should support pidfd_open and pidfd_getfd syscalls
{'feature': 'pidfd_store'} for this test to work.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
ea0dc7807a zdtm: add --pidfd-store option in RPC mode
When testing pid reuse using the pidfd_store feature in RPC mode, we need
to pass CRIU a unix socket fd in the RPC option pidfd_store_sk; CRIU uses
it to store the pidfds between predump/dump iterations.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
e79131e8c3 criu: add pidfd based pid reuse detection for RPC clients
Closes: #717

This increases the reliability of pid reuse detection using pidfds,
currently through RPC migration tools like P.Haul.

A connectionless unix socket is passed to criu in RPC mode through
the RPC option pidfd_store_sk.
If this option is set, the socket is initialized in
init_pidfd_store_sk() to be used as a queue for task pidfds.
criu then sends tasks pidfds to this socket in send_pidfd_entry()
and receives them in the next pre-dump/dump iteration to build
the pidfds hashtable in init_pidfd_store_hash().
These pidfds will be used later in detect_pid_reuse().

How it should be used in migration tools like P.Haul:
	- Open a connectionless unix socket
	- Pass the socket fd in the RPC option pidfd_store_sk when
	  doing a pre-dump or dump

This will fail if the kernel does not support pidfd_open or
pidfd_getfd syscalls, so pidfd_store_sk should not be set if the
kernel does not support pidfd_open.
This could be checked with:
	CLI: criu check --feature pidfd_store
	RPC: CRIU_REQ_TYPE__FEATURE_CHECK and set pidfd_store to
	     true in the "features" field of the request

v2:
	- add reasonable polling restart limit in check_pidfd_entry_state
	  to avoid getting stuck
	- avoid leaking pidfd in send_pidfd_entry when entry is NULL,
	  otherwise pidfds are freed in free_pidfd_store

v3:
	- check that the passed pidfd store is not empty after
	  the first iteration (i.e. --prev-images-dir option set).

v4:
	- clear pidfd_hash heads
	- check entry allocation error in init_pidfd_store_hash()

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
ba882893c3 cr-check: add ability to check if pidfd_store feature is supported
pidfd_store which will be used for reliable pidfd based pid reuse
detection for RPC clients requires two recent syscalls (pidfd_open
and pidfd_getfd).

We allow checking if pidfd_store is supported using:
	1. CLI: criu check --feature pidfd_store
	2. RPC: CRIU_REQ_TYPE__FEATURE_CHECK and set pidfd_store to
	   true in the "features" field of the request

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
e3c9c3429a cr-service: add pidfd_store_sk option to rpc.proto
pidfd_store_sk option will be used later to store tasks pidfds
between predumps to detect pid reuse reliably.
pidfd_store_sk should be a fd of a connectionless unix socket.

init_pidfd_store_sk() steals the socket from the RPC client using
pidfd_getfd, checks that it is a connectionless unix socket and
checks if it is not initialized before (i.e. unnamed socket).
If not initialized the socket is first bound to an abstract name
(combination of the real pid/fd to avoid overlap), then it is
connected to itself hence allowing us to store the pidfds in the
receive queue of the socket (this is similar to how fdstore_init()
works).

v2:
	- avoid close(pidfd) overriding errno of SYS_pidfd_open in
	  init_pidfd_store_sk()
	- close pidfd_store_sk because we might have leftover from
	  previous iterations

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
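The technique described in commit e3c9c3429a, a connectionless unix socket bound to an abstract name and connected to itself so that its receive queue can hold file descriptors, can be sketched in Python. This is a minimal sketch under assumptions: it is Linux-only, needs Python 3.9 or newer for `socket.send_fds` and `socket.recv_fds`, and the abstract name format and the helper names are hypothetical rather than taken from CRIU.

```python
import os
import socket


def make_fd_queue():
    """Return a datagram unix socket that is connected to itself."""
    sk = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    # Abstract names start with a NUL byte. Combining pid and fd keeps the
    # name unique (hypothetical format, chosen for this sketch).
    name = b"\0demo-fd-queue-%d-%d" % (os.getpid(), sk.fileno())
    sk.bind(name)
    sk.connect(name)
    return sk


def push_fd(sk, fd):
    """Queue a descriptor in the socket's own receive queue."""
    socket.send_fds(sk, [b"f"], [fd])


def pop_fd(sk):
    """Take the oldest queued descriptor back out (as a new fd number)."""
    _data, fds, _flags, _addr = socket.recv_fds(sk, 1, 1)
    return fds[0]
```

Whoever holds the socket can push descriptors in one iteration and pop them in the next, which is the property the commit relies on for storing pidfds between pre-dump and dump.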
Zeyad Yasser
a9508c9864 criu: check if pidfd_getfd syscall is supported
pidfd_getfd syscall will be needed later to send pidfds between
pre-dump/dump iterations for pid reuse detection.

v2:
	- check size written/read of val_a/val_b is correct
	- return with error when val_a != val_b

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
30e8d8cadf criu: check if pidfd_open syscall is supported
pidfd_open syscall will be needed later to send pidfds between
pre-dump/dump iterations for pid reuse detection.

v2:
	- make kerndat_has_pidfd_open void since 0 is always returned
	- fix missing tabs in syscall tables

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
nithin-jaikar
5d08f975a1 kerndat: Handle non-root mode when checking uffd
When criu is run as user it fails and exits because of kerndat_uffd() returning -1.
This, in turn, happens after uffd = syscall(SYS_userfaultfd, flags); which only works
for root.

This change ignores the permission error and proceeds further, just like it's done
for e.g. pagemap checking.

Signed-off-by: Nithin Jaikar J <jaikar006@gmail.com>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
8c303d1a64 test/others/crit: add test for 'x'
This commit extends the CRIT tests to cover the 'x' command, which is
used to explore an image directory.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
e393001095 lib/cli.py: Open explore file as a binary
Fixes #1484

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Andrei Vagin
c8973d426b test/zdtm: check that a pending SIGTRAP is handled properly
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
61c7cc5a92 parasite: don't block SIGTRAP
This is the workaround for #1429.

The parasite code contains instructions that trigger SIGTRAP to stop at
certain points. In such cases, the kernel sends a forced SIGTRAP that
can't be ignored, and if it is blocked, the kernel resets its signal
handler to a default one and unblocks it. It means that if we want to
save the origin signal handle

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
ed58fb2214 test: create new tls certificates
The certificates expired again. This replaces the expired
certificates with test certificates which are valid forever:

  echo -ne "ca\ncert_signing_key\nexpiration_days = -1\n" > temp
  certtool --generate-privkey > cakey.pem
  certtool --generate-self-signed \
           --template temp \
           --load-privkey cakey.pem \
           --outfile cacert.pem
  echo -ne "cn=$HOSTNAME\nencryption_key\nsigning_key\nexpiration_days = -1\n" > temp
  certtool --generate-privkey > key.pem
  certtool --generate-certificate \
           --template temp \
           --load-privkey key.pem \
           --load-ca-certificate cacert.pem \
           --load-ca-privkey cakey.pem \
           --outfile cert.pem
  rm cakey.pem temp

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Alexander Mikhalitsyn
6beeabcd42 zdtm: add sk-unix-dgram-ghost test case
This testcase reproduces deadlock in "wait_fds_event" futex in open_fdinfos()
function (files subsystem).

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Alexander Mikhalitsyn
2609e98ee7 sk-unix: ghost: fix deadlock between peer_fle->stage and fds wake up
This patch fixes a deadlock that appears on ghost DGRAM unix sockets.
The problem is that the wake_connected_sockets() function *should* be called
strictly after fle->stage >= FLE_OPEN.

Explanation:
Consider a situation where we have a ghost unix DGRAM socket (the peer socket)
and several sockets that are connected to this peer socket.

How does the restore of that picture work?
In the files subsystem we have the open_fdinfos(pstree_item*) function that calls
open_fd() for *every* fd of a task. open_fd() calls the appropriate
file descriptor "open" handler, which may return "1", meaning "try again later".
This retcode means that some additional resources are needed to fully restore the
file descriptor. For *ghost* UNIX sockets, for instance, we need to have the peer
socket file descriptor *before* we can open and restore client sockets. Here is the
main problem: open_fdinfos() is called from separate tasks simultaneously, so when we
get the "1" retcode we wait on a futex (the wait_fds_event() function) until some
other task restores the resource and notifies us that we can retry opening the
file descriptor.

With a *ghost* UNIX socket I've managed to catch the following behaviour.
1. In one task (that holds a client socket) open_fdinfos() called open_fd(), which
called open_unixsk_standalone(). In open_unixsk_standalone() we have a check that means
"if the socket has a peer and that peer is a GHOST and that peer's fle->stage < FLE_OPEN",
return "try again". Ok. So, this task will wait in wait_fds_event().

2. The second task, which holds the *peer*, tried to open the peer socket fd. So
it also calls open_fd() -> open_unixsk_standalone() -> opening socket
-> bind_unix_sk() -> in bind_unix_sk() we have a call to wake_connected_sockets().
So, after that call we will "wake up" the task from the first point and it may proceed
with fd restoring. Yes? No. In the first point we need "peer_fle->stage >= FLE_OPEN",
but fle->stage of our peer socket only becomes FLE_OPEN in open_fd(): after we
return from open_unixsk_standalone() we proceed to setup_and_serve_out(), where the
appropriate stage change happens.
Only a very small amount of time passes between the call of wake_connected_sockets()
and the moment when we set the stage to FLE_OPEN. But it may happen that we "wake up"
tasks that hold client sockets before we have had time to change fle->stage
to FLE_OPEN. Exactly that case I've managed to reproduce.
(Really, the ossec-hids application managed to reproduce this problem first %) )

v1: file_desc_ops->on_stage_change callback was introduced,
sk-unix ghost code reworked so as to call wake_connected_sockets() strictly
after changing fle->stage to FLE_OPEN.
v2: implementation replaced with short and more practical patch by Andrei

Suggested-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2021-09-03 10:31:00 -07:00
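The ordering requirement from commit 2609e98ee7 (publish the new stage first, wake the waiters afterwards) can be modelled with threads and a condition variable. This is only an analogy under stated assumptions: CRIU restores tasks as separate processes that wait on a futex, while the sketch uses Python threads, and the class, function, and constant names are hypothetical.

```python
import threading

# Hypothetical stage values; only their ordering matters here.
FLE_INITIALIZED = 0
FLE_OPEN = 1


class PeerSocket:
    def __init__(self):
        self.stage = FLE_INITIALIZED
        self.cond = threading.Condition()


def restore_client(peer, log):
    with peer.cond:
        # Stands in for "try again later" until the peer reached FLE_OPEN.
        peer.cond.wait_for(lambda: peer.stage >= FLE_OPEN)
    log.append("client restored")


def restore_peer(peer, log):
    with peer.cond:
        peer.stage = FLE_OPEN      # 1. publish the new stage ...
        peer.cond.notify_all()     # 2. ... and only then wake the waiters
    log.append("peer restored")


def run():
    peer, log = PeerSocket(), []
    client = threading.Thread(target=restore_client, args=(peer, log))
    client.start()
    restore_peer(peer, log)
    client.join(timeout=5)
    return log
```

If the wake-up were sent before the stage change, a woken client could re-check the stage, still see a value below `FLE_OPEN`, and go back to waiting with nobody left to wake it, which is the deadlock the commit describes.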
Alexander Mikhalitsyn
655610e09a ci: remove hack for netns-nft zdtm test
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Alexander Mikhalitsyn
ddefbbff16 zdtm: add combined nftables/iptables netns-nft-ipt test
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Alexander Mikhalitsyn
4696e61edb zdtm: skip static/netns-nft test if nftables feature isn't supported
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Alexander Mikhalitsyn
d8821d9a8f net: skip iptables dump if it has nft backend and nft dump is supported
On modern Linux distributions the iptables binaries use the new nftables
API. We dump iptables rules using "iptables-save", and nftables
rules using the libnftables API. This breaks network unlock on modern systems
because technically we dump the rules (including network lock rules) twice.

There is another problem: the host can run a modern distribution, while
a Docker container uses iptables with the netfilter (legacy) API.
In this case the legacy rules would be skipped.

This patch handles all of these cases. It tries to find iptables legacy and
dumps legacy rules by using the appropriate iptables binaries, and dumps nftables
rules by using libnftables.

Fixes #1435

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2021-09-03 10:31:00 -07:00
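The decision in the title of commit d8821d9a8f can be sketched as two small Python functions. The detection method is an assumption: the commit message does not say how CRIU finds the backend, and this sketch uses the backend marker that iptables 1.8 and newer print in `iptables --version` output. Both function names are hypothetical.

```python
def iptables_backend(version_output):
    """Classify the output of `iptables --version`.

    Returns "nft", "legacy", or None when no backend marker is present
    (older iptables releases print no marker at all).
    """
    if "(nf_tables)" in version_output:
        return "nft"
    if "(legacy)" in version_output:
        return "legacy"
    return None


def should_run_iptables_save(backend, nft_dump_supported):
    """Skip iptables-save only when the nft dump already covers the rules."""
    return not (backend == "nft" and nft_dump_supported)
```

With this split, rules behind an nft-backed iptables are dumped once through the nftables path, while rules behind a legacy iptables are still saved with iptables-save.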
Adrian Reber
e26949cfed lsm: handle half initialized SELinux setups
CRIU used to check for the existence of /sys/fs/selinux to see if
SELinux is enabled on a system. We have seen systems with SELinux kind
of enabled, but where reading out the labels does not return real labels.

To work around this, this commit adds a check during LSM detection
to see if SELinux labels are in the right format. For CRIU this check means
seeing if there are at least 3 ':' in a label. If not, CRIU switches to no
LSM mode.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
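The format check described in commit e26949cfed (at least 3 ':' in a label) is small enough to show directly. This is a Python sketch of the rule as stated in the message, not the C code from the commit; the function name is hypothetical.

```python
def looks_like_selinux_label(label):
    """Return True if the label has at least three ':' separators.

    A full SELinux context has the form user:role:type:level, so a real
    label contains at least three colons.
    """
    return label.count(":") >= 3
```

A label such as `system_u:system_r:init_t:s0` passes, while a placeholder such as `kernel` does not, in which case the commit says CRIU falls back to no LSM mode.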
Radostin Stoyanov
e2c352e4f8 tools.mk: Use Python 3 by default
As of January 1st, 2020 Python 2 is no longer supported and
many distributions no longer provide packages for Python 2
dependencies.

This patch allows CRIU to use Python 3 by default when both
major versions are available on the system.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
177e4b4bad mips: remove empty gitignore
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
22142eedf0 mips: coding style fixes
CRIU follows Linux kernel coding style. This patch updates the
architecture-specific code for MIPS to use tab indentation,
add whitespace between closing parenthesis and open bracket,
and changes the mode of source files from 755 to 644.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
zl-wang
99a6a17c2f Allow systemcfg proc file to be dumped
Currently, it cannot be checkpointed, because that type of file is on the UNSUPP list.

Signed-off-by: zl-wang <zlwang@ca.ibm.com>
2021-09-03 10:31:00 -07:00
Nicolas Viennot
731cafa85c logging: pr_perror() -> pr_msg() when execvp fails in action scripts and others
When invoking an action-script, all file descriptors >= 3 are closed.
If execvp() fails, we can only log the error on stderr. pr_msg() outputs
on stderr, so we use this as opposed to pr_perror().

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2021-09-03 10:31:00 -07:00
Nicolas Viennot
24bdfa72df net: add a #define for increased compatibility with old distributions
Debian 9 doesn't have IFLA_NEW_IFINDEX defined

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2021-09-03 10:31:00 -07:00
Nicolas Viennot
29c34386b0 restore: fix error message when fork fails
`last_pid_buf` is not where the last_pid string is, but it is in `s`.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
f10425e053 criu: end pr_(err|warn|msg|info|debug) with \n
Unlike pr_perror, pr_err and other macros do not append \n
to the message being printed, so the caller needs to take care of it.

Sometimes it was not done, so let's add this manually.

To make sure it won't happen again, add a line to Makefile under the
linter target to check for such missing \n. NOTE this check is only
done for part of such cases (where the pr_* statement fits in one line
and there's no comment after), but it's better than nothing.

Add comments after pr_msg and pr_info statements where we deliberately
don't add \n, so that the above check ignores them.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
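The kind of check commit f10425e053 adds to the linter target can be approximated in Python. The regular expression below is hypothetical and is not the pattern from the Makefile; like the check described in the message, it only looks at pr_* statements that fit on one line and have no trailing comment.

```python
import re

# Hypothetical pattern: a one-line pr_* call with nothing after the ';'.
# Group 1 captures the format string without its surrounding quotes.
PR_CALL = re.compile(
    r'^\s*pr_(?:err|warn|msg|info|debug)\('
    r'"((?:[^"\\]|\\.)*)"'      # the first string literal
    r'(?:,.*)?\);\s*$'          # optional arguments, then end of statement
)


def missing_newline(line):
    """Return True if a one-line pr_* call lacks a trailing \\n in its format."""
    match = PR_CALL.match(line)
    if match is None:
        return False            # multi-line call, trailing comment, or other code
    return not match.group(1).endswith("\\n")
```

Statements that deliberately omit the newline carry a trailing comment, so the pattern does not match them and they are not reported, which mirrors the convention the commit introduces.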
Kir Kolyshkin
96b7178bab Whitespace at EOL cleanup and check
My editor (vim) auto-removes whitespace at EOL for *.c and *.h files,
and I think it makes sense to have a separate commit for this, rather
than littering other commits with such changes.

To make sure this won't pile up again, add a line to Makefile under
the linter target to check for such things (so CI will fail).

This is all whitespace except an addition to Makefile.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
7ea20e8f5a criu: make sure to use pr_perror to show errno
In cases where we need to print errno, it is better to use pr_perror.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
10c619adb9 test/zdtm: pr_err / pr_perror fixes
1. Use pr_perror where errno needs to be shown.

2. Use pr_err in cases where errno is not set
   by the previous failed call.

3. Make sure pr_err's first argument does not have \n.

4. While at it, fix some error messages.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
dca0eb5b4a test/others/bers: use pr_perror
When errno is set, it makes sense to use pr_perror.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
e326889c06 criu/mount.c: fix \n in pr_debug
Funny, but it used an incorrect slash.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
2166d47482 scripts: fix shellcheck warnings
On my system (shellcheck v0.7.1) make lint shows a few warnings about
needing to quote variables.

Fix those.

PS I am not sure why those are not shown by GHA CI; I assume a
different shellcheck version is used. Add shellcheck --version to the
appropriate Makefile target to avoid confusion.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
5f3631916a Makefile: amend lint with pr_perror/fail checks
In many cases developers forget that pr_perror and fail macros
are a bit special, in particular:

1. they already show errno;
2. they already append \n to the message.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
4cd23083be test/zdtm: don't pass errno to fail()
Macro fail() already prints the value of errno, so there's no need to
explicitly add it.

Found by git grep '^\s*\<fail\>.*errno'

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
12a2bd0edd test/zdtm: don't use %m with fail
Macro fail already appends errno and strerror(errno) to the error
message, so there's no need to do it explicitly.

Brought to you by

	for f in $(git grep -l fail test/zdtm); do
		test -f $f || continue
		echo $f
		sed -i '\|^[[:space:]]*fail(.*[ (]%m)*"|s/:*[ (]*%m)*//' $f
	done

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
b20694835d test/zdtm: don't use \n with fail()
Macro fail already appends \n to the message, so there's no need to do
it explicitly.

Brought to you by

	for f in $(git grep -l fail test/zdtm); do
		test -f $f || continue
		echo $f
		sed -i '\%^[[:space:]]*fail(.*\\n"%s/\\n"/"/' $f
	done

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
9cbcaaed39 test/zdtm: don't use errno for pr_perror
Macro pr_perror already adds errno and its string representation to the
error message, so there's no need to explicitly do it.

While at it, fix some error messages.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
865a5e9511 test/zdtm: don't use pr_perror where errno is unset
pr_perror should be used for cases where the failed operation sets
errno. For cases where errno is not set, pr_err is preferable.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00
Kir Kolyshkin
d55a65e939 criu: don't use errno for pr_error
There is no need to, as pr_error already adds strerror(errno)
to the error message.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-09-03 10:31:00 -07:00