mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 13:58:34 +00:00

10636 Commits

Author SHA1 Message Date
Zeyad Yasser
aa772bf286 zdtm: fix network lock tests when run with --norst
In test/jenkins/{crit.sh,criu-dump}, ZDTM is run with --norst,
causing tests to only go through dump without restoring.

The network locking tests are highly dependent on dump/restore hooks,
causing them to hang when run with --norst.

We just add a reqrst flag to all network lock tests.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
9838d34ded criu: use unique table names for nftables based locking
During network locking CRIU creates an nftables table to
add the needed rules. If more than one instance of CRIU runs
in parallel, those tables' names would conflict.

The solution is to append the root task pid to the nftables table
name as a suffix (e.g. inet CRIU-3231).

We also need to use `create table` instead of `add table`
because `create` returns an error if the table name
already exists, so we can detect conflicts if they happen.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
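
The naming scheme from 9838d34ded can be sketched as follows, in Python for brevity (CRIU itself implements this in C). The `CRIU-<pid>` format follows the example in the commit message; the helper names are hypothetical and not taken from the CRIU sources.

```python
def nft_table_name(root_pid):
    """Per-instance table name: the root task pid is appended as a suffix."""
    return "CRIU-{}".format(root_pid)


def nft_create_table_cmd(root_pid):
    """Build the nft command text for creating the per-instance table.

    `create table` (unlike `add table`) fails when the table already
    exists, so a name conflict is reported instead of silently ignored.
    """
    return "create table inet {}".format(nft_table_name(root_pid))
```

Two CRIU instances with different root task pids would then operate on different tables, and a clash on the same pid surfaces as an error from `create`.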
Zeyad Yasser
ca3e3c50be inventory: save network lock method to reuse in restore
When the network is locked using a specific method like iptables
or nftables there is no need to require passing the same method
during restore.

We save the lock method during dump in the inventory image and
use that in restore.

This always overrides the restore --network-lock option.

v2: store opts.network_lock_method directly to avoid dependency
    on rpc.proto's 'enum criu_network_lock_method'.
v3: fall back to iptables if image is generated with an older
    version of CRIU.
v4: remove --network-lock from netns_lock_* from restore

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
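
A rough model of the behaviour described in ca3e3c50be, assuming the inventory image is represented as a plain dictionary. The field name `network_lock_method` and both helper names are illustrative assumptions, not the actual image format.

```python
DEFAULT_LOCK_METHOD = "iptables"  # fallback for images from older CRIU versions


def dump_inventory(lock_method):
    """Record the lock method used at dump time (hypothetical field name)."""
    return {"network_lock_method": lock_method}


def restore_lock_method(inventory, cli_option=None):
    """Pick the lock method for restore.

    The value saved in the inventory always wins; `cli_option` is accepted
    only to show that it is ignored. Images without the field fall back
    to iptables.
    """
    return inventory.get("network_lock_method", DEFAULT_LOCK_METHOD)
```

The point of the design is that a user cannot accidentally unlock with a different method than the one used to lock.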
Zeyad Yasser
cd1570b15e zdtm: add ipv6 variants of net_lock_socket_* tests
v2: remove unnecessary elif and else after return in
    wait_server_addr()
v3: use IOError instead of FileNotFoundError for python2
    compatibility

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
212db1d9a6 zdtm: add nftables per-socket locking test
This is just a symlink to the original static/net_lock_socket_iptables
test with the right options passed to use nftables instead.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
826d3d7407 criu: add nftables connection locking/unlocking
This adds nftables based connection locking as an alternative
to iptables. This avoids the external dependency on
iptables-restore.

It works by creating a 'connection set', which is a set of
connection identifying tuples. Rules are added to drop packets that
match the connection tuples in the set. Locking is now reduced to
just adding the connection identifying tuple to the set.

Unlocking is as simple as deleting the CRIU table.

v2: split ip string conversion into two if conditions
v3: add better message when CRIU is built without libnftables support
v4: fix indentation in nftables_lock_connection_raw()
v5: move 'ret = -1' below err: label to avoid redundancy
v6: add better error message on lock failure
v7: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
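
The 'connection set' idea from 826d3d7407 can be modelled in a few lines. This is a toy in-memory model, not nftables itself: the tuple layout (source address, source port, destination address, destination port) and all names are assumptions made for illustration.

```python
class ConnectionLockTable:
    """Toy model of the CRIU nftables table holding a connection set."""

    def __init__(self):
        # Set of connection-identifying tuples that are currently locked.
        self.conn_set = set()

    def lock_connection(self, src_ip, src_port, dst_ip, dst_port):
        """Locking is just adding the tuple to the set."""
        self.conn_set.add((src_ip, src_port, dst_ip, dst_port))

    def drops(self, src_ip, src_port, dst_ip, dst_port):
        """Model of the drop rule: a packet is dropped if its tuple is in the set."""
        return (src_ip, src_port, dst_ip, dst_port) in self.conn_set


def unlock_all(tables, name):
    """Unlocking deletes the whole table, which removes the set and its rules."""
    tables.pop(name, None)
```

Because the drop rules only refer to the set, locking another connection does not require adding new rules.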
Zeyad Yasser
6e59b2bd77 zdtm: add iptables per-socket locking test
When criu dumps a process with the --tcp-established option, it locks
the open TCP connections so that no packets from the peer enter
the stack; otherwise the kernel would send an RST, causing the
connection to fail.

Post-start hook creates a connection with the test server and
creates a background thread that stays alive for the duration
of the test. This background thread sends data to the test
server at three stages:
- Pre-dump: Should send normally
- Pre-restore:
	If the connection is locked properly, packets will be dropped
	and TCP will just retransmit them; they will eventually be
	delivered when the process is restored and the network is unlocked.
- Post-restore: Should send normally

Data sent at the three stages is then checked at the server's side.

v2:
	- remove unused imports and constants
	- delete sync file in wait_sync_file() instead of --clean
v3:
	- add comments

Co-developed-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
c15327656a zdtm: add nftables network namespace locking test
This is just a symlink to the original static/netns_lock test with
the right options passed to use nftables instead.

v2:
	- make static/netns_lock test iptables explicitly
	- prevent netns_lock tests from running in parallel because
	  netns & sync files creation were conflicting in both tests.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
19cc0bfa65 criu: add nftables netns-wide locking/unlocking
This adds nftables based internal network locking as an
alternative to iptables. This avoids the external dependency
on iptables-restore.

v2: fix indentation & rename 'free' label to 'out'
v3: add better message when CRIU is built without libnftables support
v4:
	- move 'ret = -1' below err: label to avoid redundancy
	- fix nft ctx memory leak in case of success in
	  nftables_network_unlock()
v5: add better error message on lock failure

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
f246ca56c0 criu: rename iptables network locking/unlocking functions
This is related to the new --network-lock option: other methods for
network locking/unlocking, such as nftables, will be added as
alternatives to iptables.

This option is used in the core network locking/unlocking hooks to
decide which method should be used, making it easier to add new
methods later.
i.e.
	- network_lock_internal
	- network_unlock_internal
	- lock_connection (renamed from nf_lock_connection)
	- unlock_connection (renamed from nf_unlock_connection)
	- unlock_connection_info (renamed from nf_unlock_connection_info)

nf_* functions are renamed to iptables_* to avoid confusion with
other netfilter methods in the future like nftables.

v2: run make indent
v3: make error messages more descriptive

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
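
The dispatch that f246ca56c0 prepares for might look roughly like this. It is a sketch only: the two handler functions are stand-ins that return a string instead of touching the network, and the table-based lookup is an assumed structure rather than the way the C code is organised.

```python
def iptables_lock_connection(conn):
    """Stand-in for the iptables-based implementation."""
    return "iptables lock {}".format(conn)


def nftables_lock_connection(conn):
    """Stand-in for a later nftables-based implementation."""
    return "nftables lock {}".format(conn)


# Maps the value of the network lock option to a method-specific handler.
LOCK_CONNECTION_HANDLERS = {
    "iptables": iptables_lock_connection,
    "nftables": nftables_lock_connection,
}


def lock_connection(method, conn):
    """Core hook: choose the implementation based on the selected method."""
    try:
        handler = LOCK_CONNECTION_HANDLERS[method]
    except KeyError:
        raise ValueError("unknown network lock method: {}".format(method))
    return handler(conn)
```

Adding another method then means adding one handler and one table entry, without changing the callers of the core hook.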
Zeyad Yasser
e9d24a2ba3 cr-check: add check for nftables based network locking
Nftables based network locking/unlocking will be added later.

Nftables sets will be used to load the connection tuples that
will be locked, to be able to store those tuples we need to
check "Set Concatenations" support.

https://wiki.nftables.org/wiki-nftables/index.php/Concatenations

v2: fix 'has_nftables_concat=true' when adding CRIU table fails
v3: add better message when CRIU is built without libnftables support
v4: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
b85fad797c cr-service: add network_lock option to RPC and libcriu
v2: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
2e30db5c3d criu: add --network-lock option to allow nftables alternative
This adds an option to choose the network locking method.

CRIU currently uses iptables-restore cli for network locking/unlocking
but nftables support will be added later.

There have been reports from users that iptables-restore fails in some
way and an nftables based approach using libnftables could avoid this
external dependency.

v2: remove dependency details in man page for --network-lock.
v3: remove --network-lock from restore section in docs because it is
    automatically detected from the inventory image now.
v4: add message that --network-lock will be ignored during restore
    and value from dump will be used.
v5: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
ef7af1dd15 Run 'make indent' on criu/include/plugin.h
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
cf2b67375a workflows/lint: show changes
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
03cdbc4c02 criu/config: fix use-after-free in parse_join_ns
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
546a6dfd0e configs: fix used after free cases
We have some code written with the assumption that argv is never
destroyed, but when we handle configs, we construct an argv-like array
for each config, parse it and release it.

I am still not sure that we need to release the memory of per-config argv
arrays... The current scheme is going to be a source of use-after-free
bugs. When we add the non-privileged mode, all these bugs will be
serious security issues.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
399a53a43f lsm: do not print a warning if no LSM has been detected
Without any LSM detected CRIU would print a warning for every run:

 Warn  (criu/lsm.c:328): don't know how to suspend LSM 0

Which clutters up the CI logs.

Change the message to a debug message.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
960f26f907 files-reg: do not print a warning if a file has no build_id
On systems where there is no build_id we get a warning for each run:

 Warn  (criu/files-reg.c:1458): Couldn't find the build-id note for file with fd 15

Change the log level to debug for this message as the file just does not have
a build_id and printing a warning clutters the CI logs.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Pavel Tikhomirov
90e175d52f zdtm/pthread_timers: make sure glibc allocated SIGEV_THREAD's stack
On Virtuozzo7 jenkins we see a failure of criu-dev zdtm:

  ===================== Run zdtm/static/pthread_timers in ns =====================
  Start test
  ./pthread_timers --pidfile=pthread_timers.pid --outfile=pthread_timers.out
  Run criu dump
  =[log]=> dump/zdtm/static/pthread_timers/112/1/dump.log
  ------------------------ grep Error ------------------------
  (00.004817) netlink: Collect netlink sock 0x1cad6e21
  (00.004821) netlink: Collect netlink sock 0x1cad6e22
  (00.004831) Collecting pidns 9/112
  (00.004886) No parent images directory provided
  (00.004903) Warn  (criu/lsm.c:328): don't know how to suspend LSM 0
  ------------------------ ERROR OVER ------------------------
  Run criu restore
  4: Old maps lost: set([])
  4: New maps appeared: set([u'7fe4c54ca000-7fe4c54cb000 ---p', u'7fe4c0000000-7fe4c0021000 rw-p', u'7fe4c0021000-7fe4c4000000 ---p', u'7fe4c54cb000-7fe4c5ccb000 rw-p'])
  ############# Test zdtm/static/pthread_timers FAIL at maps compare #############

https://ci.openvz.org/job/CRIU/job/CRIU-virtuozzo/job/criu-dev/8032/consoleFull

The first thing to mention is that this is not related to criu: I can
reproduce it with "--nocr". The problem is that some mappings appear a
bit later than our pre-cr get_visible_state() call.

By debugging SIGEV_THREAD thread with gdb I can see that addresses from
this unexpectedly appearing mapping are used by glibc here as "struct
pthread *pd":

 clone()
  start_thread()
   timer_helper_thread()
    __pthread_create_2_1()

So the mapping looks like it is allocated by allocate_stack(), and this
only happens after the first timer trigger (we have glibc-2.17 on vz7):

https://github.com/bminor/glibc/blob/release/2.17/master/nptl/sysdeps/unix/sysv/linux/timer_routines.c#L92

So let's wait for at least one timer trigger so that the memory layout of
the test becomes stable and our check_visible_state zdtm check does not
report a spurious failure.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
dd0e66149f ci: fix 'crit.sh: 3: source: not found'
Jenkins test runs are failing with:

 ./test/jenkins/run_ct ./test/jenkins/crit.sh
 ./test/jenkins/crit.sh: 3: source: not found

Switch to bash which has 'source'.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
e936a0f8ad docker-test: refactor test scenario
The following error occurs when creating a checkpoint of
a container immediately after the container has been restored
from another checkpoint.

Error response from daemon: Cannot checkpoint container cr: content
sha256:12c69b7a9d25695dd5f9d37d4e858e2f7c3f9da738ccf86f8d3042f6973af1df:
already exists

In this patch we add a healthcheck to the test container and update the
test to perform a checkpoint only when the container is in a 'healthy'
state. In addition, this patch adds a scenario to test the
checkpoint/restore of multiple containers.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Andrei Vagin
78eb0dabf9 dump: suspend/resume lsm on pre-dump
Otherwise, criu pre-dump can fail with errors like this:
lib/infect.c:650: Unable to connect a transport socket: Permission denied

Reported-by: Mr Jenkins
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Christian Brauner
5dc373385d util: add run_command()
Make it easy to run commands and capture the output in a user provided
buffer.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
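
For readers unfamiliar with the pattern in 5dc373385d, here is a Python analogue of "run a command and capture the output in a user provided buffer". It does not mirror the signature of the C helper, which is not shown in this log; the parameter names and the choice of a `bytearray` as the buffer are assumptions.

```python
import subprocess
import sys


def run_command(argv, buf):
    """Run argv, append its stdout to the caller's buffer, return the exit status."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    buf.extend(proc.stdout)
    return proc.returncode


# Usage: the caller owns the buffer and inspects it after the call.
output = bytearray()
status = run_command([sys.executable, "-c", "print('ok')"], output)
```

Returning the exit status separately from the captured output lets the caller distinguish "command failed" from "command printed nothing".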
Christian Brauner
9422383b6b zdtm/apparmor_stacking: don't include optional AppArmor namespace separator
AppArmor namespaces are officially colon-separated. The double-slash
syntax is just convenience:

"The trailing : separates the namespace name from the profile name and
the optional / and // separators are provided as a convenience for those
familiar with ssh and protocol urls." (see [1])

[1]: https://gitlab.com/apparmor/apparmor/-/wikis/AppArmorNamespaces
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Christian Brauner
dc4c3cd48b apparmor: actually enable suspend for AppArmor
The original patches didn't pass down the "suspend" boolean into
write_aa_policy() and so suspend never really happened. Pass it down.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Christian Brauner
ea1c891476 lsm: handle SELinux LSM correctly
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Tycho Andersen
06b5d2fa8d tests: add a test for apparmor_stacking
v2: use a profile that doesn't have "unix" to test the suspend feature too
v3: use "/" in the profile names to make sure this works

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Tycho Andersen
8723e3f998 check: add a feature test for apparmor_stacking
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Tycho Andersen
8d992a680e lsm: support checkpoint/restore of stacked apparmor profiles
Support for apparmor namespaces and stacking is coming to Ubuntu kernels in
16.10, and should hopefully be upstreamed Soon (TM) :).

The basic idea is similar to how cgroups are done: we can restore the
apparmor namespace and profile blobs independently of the tasks, and then
at the end we can just set the task's label appropriately. This means the
code that moves tasks under a label stays the same, and the only new code
is the stuff that dumps and restores the policy blobs that are in the
namespace that were loaded by the container.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Tycho Andersen
0db135ac4f util: add rm -rf function
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Tycho Andersen
6085c37ba2 lsm: change when LSM profiles are collected
Instead of collecting profiles right at thread dump time, let's collect
them all at a single point. This way, when we add support for suspending
LSMs, we know what profiles if any we need to suspend.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
e2a45d7867 ci: extend lint run to run 'make indent'
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
70833bcf29 Run 'make indent' on header files
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
93dd984ca0 Run 'make indent' on all C files
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
1e26f170ca criu: introduce clang-format to format source code
This is another attempt to introduce a tool to format CRIU's source
code. This time it is based on clang-format.

The .clang-format file is taken from the linux kernel git tree (5.13).

I removed all comments from lines which state that it requires at least
clang-format 4 or 5. For this resulting file at least clang-format 11
is required. See scripts/fetch-clang-format.sh for all the changes
done to the Linux kernel .clang-format file.

Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Pavel Tikhomirov
cc2317ea48 zdtm: fix indentation in Makefile wait_stop target
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Adrian Reber
d62e747e91 ci: fix Fedora Rawhide
Fedora Rawhide updated to a glibc using clone3(). clone3() is, however,
not yet part of the seccomp filter. Unfortunately 'docker build' does
not allow dropping seccomp but luckily 'podman build' does.

This switches the Fedora Rawhide test to use Podman. Podman is part of
GitHub Actions and no additional packages need to be installed.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Pavel Tikhomirov
b32c8c6fe5 posix-timers: fix getoverrun error handling
If the retcode of dump_posix_timers is not zero, it is treated as an error in
compel_rpc_sync. Currently we can return the positive overrun of the last
timer (if we are lucky and it is not zero) as the retcode of
dump_posix_timers; let's fix that. Also, I don't see any point in putting
a negative value into .overrun on the error path, so fix that too.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
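
The error-handling rule from b32c8c6fe5 can be illustrated with a small model. The real code is C running inside the parasite; here `timers` is a list of dictionaries and `getoverrun` is a caller-supplied stand-in for the `timer_getoverrun` call, both of which are assumptions of this sketch.

```python
def dump_posix_timers(timers, getoverrun):
    """Store each timer's overrun count; return 0 on success, negative on error."""
    for timer in timers:
        ret = getoverrun(timer["id"])
        if ret < 0:
            # Error path: propagate the error, leave timer["overrun"] untouched.
            return ret
        timer["overrun"] = ret
    # A positive overrun of the last timer must not leak out as the retcode,
    # because the caller treats any non-zero value as an error.
    return 0
```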
Adrian Reber
01fa34f1eb ci: use pre-installed Podman
GitHub Actions already has Podman installed. No need to install it a
second time.

Signed-off-by: Adrian Reber <areber@redhat.com>
2021-09-03 10:31:00 -07:00
Pavel Tikhomirov
918901439f zdtm/pthread_timers: require ns_pid feature and add non-ns test
Resolving a real pid to a vpid for notify thread ids requires the NSpid
feature to be supported by the kernel, though in the simple non-pid-ns case
we can do without it. So add a requirement and split out the host test
without the requirement.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Pavel Tikhomirov
e1b1547c80 posix-timers: fallback notify thread id encoding for non-pidns and non-nspid
1) If all dumped processes are in host pidns we can skip pid conversion
   logic and just use real pid.

2) If we have pidns to dump we should also have kernel NSpid feature,
   else we should fail to dump notify thread id, as it's not possible to
   properly convert rpid to vpid.

While at it, let's move the code into an encode_notify_thread_id helper to
improve code readability.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
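
The two cases listed in e1b1547c80 amount to the following decision logic. This is a sketch under assumptions: the parameter names and the `rpid_to_vpid` callback are hypothetical, and the real helper works on CRIU's internal structures rather than plain integers.

```python
def encode_notify_thread_id(rpid, dumping_pidns, has_nspid, rpid_to_vpid):
    """Return the thread id to store in the image for a timer's notify thread."""
    if not dumping_pidns:
        # Case 1: everything lives in the host pidns, the real pid is usable as is.
        return rpid
    if not has_nspid:
        # Case 2: a pidns is dumped but the kernel lacks NSpid, so the real pid
        # cannot be converted to a virtual pid; fail instead of guessing.
        raise RuntimeError("cannot convert rpid to vpid without NSpid support")
    return rpid_to_vpid(rpid)
```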
Pavel Tikhomirov
91d7203b80 proc_parse: make nspid field optional
On CentOS 7 we don't have the NSpid field in /proc/[pid]/status, so for
compatibility let's skip it.

(cherry-pick one hunk from Virtuozzo commit
https://src.openvz.org/projects/OVZ/repos/criu/commits/c6d0ee567c)

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
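
Treating NSpid as optional, as in 91d7203b80, can be sketched with a small parser over the text of /proc/[pid]/status. The function name is hypothetical; the `NSpid:` line format (tab-separated pids, outermost namespace first) is the one documented for kernels that provide the field.

```python
def parse_nspid(status_text):
    """Return the list of pids from the NSpid line, or None if the line is absent."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(pid) for pid in line.split()[1:]]
    # Older kernels do not emit NSpid; the caller decides how to cope.
    return None
```

Returning `None` rather than raising keeps the parser usable on older kernels and leaves the "is this fatal?" decision to the caller.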
Kirill Tkhai
a692a0d0af kerndat: Check that "/proc/[pid]/status" file has NS{pid, ..} lines
If there is a nested pid_ns, we need to be able to get the pid in
the whole pid hierarchy. This can only be taken from the
"/proc/[pid]/status" file. Check that the kernel has support for it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

NSpid is not (yet) supported on CentOS 7, thus we need this check for
compatibility.

(cherry-picked from Virtuozzo criu commit
https://src.openvz.org/projects/OVZ/repos/criu/commits/94f4653f20)

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
64f0012e44 zdtm: add a test for SIGEV_THREAD timers
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
7eab5a7dc7 timers: save tid from a task pid namespace
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Andrei Vagin
61e1334ab0 proc_parse: get a thread ID in a thread pidns from /proc/pid/status
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-09-03 10:31:00 -07:00
Liu Chao
80079fbb0d criu: dump and restore notify_thread_id of posix timer
When sigev_notify_thread_id is not set, get_pid will return a NULL
pointer and do_timer_create will return -EINVAL in the kernel. So criu
will fail to create the posix timer:

(09.806760) pie: 41301: Error (criu/pie/restorer.c:1998): Can't restore posix timers -22
(09.806824) pie: 41301: Error (criu/pie/restorer.c:2133): Restorer fail 41301
(09.891880) Error (criu/cr-restore.c:2596): Restoring FAILED.

Signed-off-by: Liu Chao <liuchao173@huawei.com>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
6be9345fb1 criu-ns: add support for 'check' action
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
Radostin Stoyanov
868bffba4d criu-ns: add top-level conditional execution
Execute actions only if run as a script.
https://docs.python.org/3/library/__main__.html

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2021-09-03 10:31:00 -07:00
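
The idiom referred to in 868bffba4d is Python's standard `__main__` guard. A minimal sketch follows; the `main` function and the action names are illustrative and do not reproduce the criu-ns script.

```python
import sys


def main(argv):
    """Hypothetical entry point: return 0 for a known action, 1 otherwise."""
    known_actions = {"dump", "restore", "check"}
    if not argv or argv[0] not in known_actions:
        return 1
    return 0


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported.
    status = main(sys.argv[1:])
    print("exit status:", status)
```

With the guard in place, the module can be imported (for example by tests) without triggering any action as a side effect.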