mir/criu

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 13:58:34 +00:00

Author	SHA1	Message	Date
Zeyad Yasser	aa772bf286	zdtm: fix network lock tests when run with --norst In test/jenkins/{crit.sh,criu-dump}, ZDTM is run with --norst, causing tests to only go through dump without restoring. The network locking tests are highly dependent on dump/restore hooks, causing them to hang when run with --norst. We just add a reqrst flag to all network lock tests. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	9838d34ded	criu: use unique table names for nftables based locking During network locking CRIU creates an nftables table to add needed rules. If more than one instance of CRIU runs in parallel, those tables' names would conflict. The solution is to append the root task pid to the nftables table name as a postfix (e.g. inet CRIU-3231). We also need to use `create table` instead of `add table` because `create` returns an error if the table name already exists, so we can detect conflicts if they happen. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	ca3e3c50be	inventory: save network lock method to reuse in restore When the network is locked using a specific method like iptables or nftables there is no need to require passing the same method during restore. We save the lock method during dump in the inventory image and use that in restore. This always overwrites the restore --network-lock option. v2: store opts.network_lock_method directly to avoid dependency on rpc.proto's 'enum criu_network_lock_method'. v3: fall back to iptables if image is generated with an older version of CRIU. v4: remove --network-lock from netns_lock_* from restore Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	cd1570b15e	zdtm: add ipv6 variants of net_lock_socket_* tests v2: remove unnecessary elif and else after return in wait_server_addr() v3: use IOError instead of FileNotFoundError for python2 compatibility Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	212db1d9a6	zdtm: add nftables per-socket locking test This is just a symlink to the original static/net_lock_socket_iptables test with the right options passed to use nftables instead. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	826d3d7407	criu: add nftables connection locking/unlocking This adds nftables based connection locking as an alternative to iptables. This avoids the external dependency on iptables-restore. It works by creating a 'connection set', which is a set of connection identifying tuples. Rules are added to drop packets that match the connection tuples in the set. Locking is now reduced to just adding the connection identifying tuple to the set. Unlocking is just as simple as deleting the CRIU table. v2: split ip string conversion into two if conditions v3: add better message when CRIU is built without libnftables support v4: fix indentation in nftables_lock_connection_raw() v5: move 'ret = -1' below err: label to avoid redundancy v6: add better error message on lock failure v7: run make indent Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	6e59b2bd77	zdtm: add iptables per-socket locking test When criu dumps a process with the --tcp-established option it locks the open tcp connections so that no packets from the peer enter the stack, otherwise RST will be sent by the kernel causing the connection to fail. Post-start hook creates a connection with the test server and creates a background thread that stays alive for the duration of the test. This background thread sends data to the test server at three stages: - Pre-dump: Should send normally - Pre-restore: If connection is locked properly, packets will be dropped and TCP will just retry, which will eventually be sent when the process is restored and the network is unlocked. - Post-restore: Should send normally Data sent at the three stages is then checked at the server's side. v2: - remove unused imports and constants - delete sync file in wait_sync_file() instead of --clean v3: - add comments Co-developed-by: Radostin Stoyanov <rstoyanov@fedoraproject.org> Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	c15327656a	zdtm: add nftables network namespace locking test This is just a symlink to the original static/netns_lock test with the right options passed to use nftables instead. v2: - make static/netns_lock test iptables explicitly - prevent netns_lock tests from running in parallel because netns & sync files creation were conflicting in both tests. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	19cc0bfa65	criu: add nftables netns-wide locking/unlocking This adds nftables based internal network locking as an alternative to iptables. This avoids the external dependency on iptables-restore. v2: fix indentation & rename 'free' label to 'out' v3: add better message when CRIU is built without libnftables support v4: - move 'ret = -1' below err: label to avoid redundancy - fix nft ctx memory leak in case of success in nftables_network_unlock() v5: add better error message on lock failure Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	f246ca56c0	criu: rename iptables network locking/unlocking functions Related to the new --network-lock option, other methods for network locking/unlocking will be added as an alternative to iptables like nftables. This option is used in the core network locking/unlocking hooks to decide which method should be used, making it easier to add new methods later smoothly. i.e. - network_lock_internal - network_unlock_internal - lock_connection (renamed from nf_lock_connection) - unlock_connection (renamed from nf_unlock_connection) - unlock_connection_info (renamed from nf_unlock_connection_info) nf_* functions are renamed to iptables_* to avoid confusion with other netfilter methods in the future like nftables. v2: run make indent v3: make error messages more descriptive Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	e9d24a2ba3	cr-check: add check for nftables based network locking Nftables based network locking/unlocking will be added later. Nftables sets will be used to load the connection tuples that will be locked; to be able to store those tuples we need to check "Set Concatenations" support. https://wiki.nftables.org/wiki-nftables/index.php/Concatenations v2: fix 'has_nftables_concat=true' when adding CRIU table fails v3: add better message when CRIU is built without libnftables support v4: run make indent Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	b85fad797c	cr-service: add network_lock option to RPC and libcriu v2: run make indent Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	2e30db5c3d	criu: add --network-lock option to allow nftables alternative This adds the option to choose the network locking method. CRIU currently uses the iptables-restore CLI for network locking/unlocking but nftables support will be added later. There have been reports from users that iptables-restore fails in some way and an nftables based approach using libnftables could avoid this external dependency. v2: remove dependency details in man page for --network-lock. v3: remove --network-lock from restore section in docs because it is automatically detected from the inventory image now. v4: add message that --network-lock will be ignored during restore and value from dump will be used. v5: run make indent Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Andrei Vagin	ef7af1dd15	Run 'make indent' on criu/include/plugin.h Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Andrei Vagin	cf2b67375a	workflows/lint: show changes Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	03cdbc4c02	criu/config: fix use-after-free in parse_join_ns Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Andrei Vagin	546a6dfd0e	configs: fix use-after-free cases We have some code written with the assumption that argv is never destroyed, but when we handle configs, we construct an argv-like array for each config, parse it and release it. I am still not sure that we need to release memory of per-config argv arrays... The current scheme is going to be a source of use-after-free bugs. When we add the non-privileged mode, all these bugs will be serious security issues. Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	399a53a43f	lsm: do not print a warning if no LSM has been detected Without any LSM detected CRIU would print a warning for every run: Warn (criu/lsm.c:328): don't know how to suspend LSM 0 Which clutters up the CI logs. Change the message to a debug message. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	960f26f907	files-reg: do not print a warning if a file has no build_id On systems where there is no build_id we get a warning for each run Warn (criu/files-reg.c:1458): Couldn't find the build-id note for file with fd 15 Change the log level to debug for this message as the file just does not have a build_id and printing a warning clutters the CI logs. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	90e175d52f	zdtm/pthread_timers: make sure glibc allocated SIGEV_THREAD's stack On Virtuozzo7 jenkins we see a fail of criu-dev zdtm: ===================== Run zdtm/static/pthread_timers in ns ===================== Start test ./pthread_timers --pidfile=pthread_timers.pid --outfile=pthread_timers.out Run criu dump =[log]=> dump/zdtm/static/pthread_timers/112/1/dump.log ------------------------ grep Error ------------------------ (00.004817) netlink: Collect netlink sock 0x1cad6e21 (00.004821) netlink: Collect netlink sock 0x1cad6e22 (00.004831) Collecting pidns 9/112 (00.004886) No parent images directory provided (00.004903) Warn (criu/lsm.c:328): don't know how to suspend LSM 0 ------------------------ ERROR OVER ------------------------ Run criu restore 4: Old maps lost: set([]) 4: New maps appeared: set([u'7fe4c54ca000-7fe4c54cb000 ---p', u'7fe4c0000000-7fe4c0021000 rw-p', u'7fe4c0021000-7fe4c4000000 ---p', u'7fe4c54cb000-7fe4c5ccb000 rw-p']) ############# Test zdtm/static/pthread_timers FAIL at maps compare ############# https://ci.openvz.org/job/CRIU/job/CRIU-virtuozzo/job/criu-dev/8032/consoleFull First thing to mention is that this is not related to criu. I can manage to reproduce it with "--nocr", the problem is that some mapping appears a bit later when we do pre-cr get_visible_state(). By debugging the SIGEV_THREAD thread with gdb I can see that addresses from this unexpectedly appearing mapping are used by glibc here as "struct pthread *pd": clone() start_thread() timer_helper_thread() __pthread_create_2_1() So the mapping looks allocated by allocate_stack(), and it only gets done after the first timer trigger (we have glibc-2.17 on vz7): https://github.com/bminor/glibc/blob/release/2.17/master/nptl/sysdeps/unix/sysv/linux/timer_routines.c#L92 So let's wait for at least 1 timer trigger so that the memory outfit of the test becomes permanent and our check_visible_state zdtm check would not be a false negative. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	dd0e66149f	ci: fix 'crit.sh: 3: source: not found' Jenkins test runs are failing with: ./test/jenkins/run_ct ./test/jenkins/crit.sh ./test/jenkins/crit.sh: 3: source: not found Switch to bash which has 'source'. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	e936a0f8ad	docker-test: refactor test scenario The following error occurs when creating a checkpoint of a container immediately after the container has been restored from another checkpoint. Error response from daemon: Cannot checkpoint container cr: content sha256:12c69b7a9d25695dd5f9d37d4e858e2f7c3f9da738ccf86f8d3042f6973af1df: already exists In this patch we add a healthcheck to the test container and update the test to perform a checkpoint only when the container is in a 'healthy' state. In addition, this patch adds a scenario to test the checkpoint/restore of multiple containers. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Andrei Vagin	78eb0dabf9	dump: suspend/resume lsm on pre-dump Otherwise, criu pre-dump can fail with errors like this: lib/infect.c:650: Unable to connect a transport socket: Permission denied' Reported-by: Mr Jenkins Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Christian Brauner	5dc373385d	util: add run_command() Make it easy to run commands and capture the output in a user provided buffer. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Christian Brauner	9422383b6b	zdtm/apparmor_stacking: don't include optional AppArmor namespace separator AppArmor namespaces are officially colon-separated. The double-slash syntax is just convenience: "The trailing : separates the namespace name from the profile name and the optional / and // separators are provided as a convenience for those familiar with ssh and protocol urls." (see [1]) [1]: https://gitlab.com/apparmor/apparmor/-/wikis/AppArmorNamespaces Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Christian Brauner	dc4c3cd48b	apparmor: actually enable suspend for AppArmor The original patches didn't pass down the "suspend" boolean into write_aa_policy() and so suspend never really happened. Pass it down. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Christian Brauner	ea1c891476	lsm: handle SELinux LSM correctly Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Tycho Andersen	06b5d2fa8d	tests: add a test for apparmor_stacking v2: use a profile that doesn't have "unix" to test the suspend feature too v3: use "/" in the profile names to make sure this works Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Tycho Andersen	8723e3f998	check: add a feature test for apparmor_stacking Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Tycho Andersen	8d992a680e	lsm: support checkpoint/restore of stacked apparmor profiles Support for apparmor namespaces and stacking is coming to Ubuntu kernels in 16.10, and should hopefully be upstreamed Soon (TM) :). The basic idea is similar to how cgroups are done: we can restore the apparmor namespace and profile blobs independently of the tasks, and then at the end we can just set the task's label appropriately. This means the code that moves tasks under a label stays the same, and the only new code is the stuff that dumps and restores the policy blobs that are in the namespace that were loaded by the container. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Tycho Andersen	0db135ac4f	util: add rm -rf function Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Tycho Andersen	6085c37ba2	lsm: change when LSM profiles are collected Instead of collecting profiles right at thread dump time, let's collect them all at a single point. This way, when we add support for suspending LSMs, we know what profiles if any we need to suspend. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	e2a45d7867	ci: extend lint run to run 'make indent' Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	70833bcf29	Run 'make indent' on header files Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	93dd984ca0	Run 'make indent' on all C files Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	1e26f170ca	criu: introduce clang-format to format source code This is another attempt to introduce a tool to format CRIU's source code. This time it is based on clang-format. The .clang-format file is taken from the linux kernel git tree (5.13). I removed all comments from lines which state that it requires at least clang-format 4 or 5. For this resulting file at least clang-format 11 is required. See scripts/fetch-clang-format.sh for all the changes done to the Linux kernel .clang-format file. Acked-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	cc2317ea48	zdtm: fix indentation in Makefile wait_stop target Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	d62e747e91	ci: fix Fedora Rawhide Fedora Rawhide updated to a glibc using clone3(). clone3() is, however, not yet part of the seccomp filter. Unfortunately 'docker build' does not allow dropping seccomp but luckily 'podman build' does. This switches the Fedora Rawhide test to use Podman. Podman is part of GitHub Actions and no additional packages need to be installed. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	b32c8c6fe5	posix-timers: fix getoverrun error handling If retcode of dump_posix_timers is not zero it is treated as an error in compel_rpc_sync. And currently we can return positive overrun of last timer (if we are lucky and it is not zero) as retcode of function dump_posix_timers, let's fix it. Also I don't see any point in putting negative value into .overrun on error path, fix it too. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	01fa34f1eb	ci: use pre-installed Podman GitHub Actions already has Podman installed. No need to install it a second time. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	918901439f	zdtm/pthread_timers: require ns_pid feature and add non-ns test Resolving real pid to vpid for notify thread ids requires NSpid feature supported by kernel, though in simple non-pid-ns case we can deal without it, so add a requirement and split out the host test without the requirement. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	e1b1547c80	posix-timers: fallback notify thread id encoding for non-pidns and non-nspid 1) If all dumped processes are in host pidns we can skip pid conversion logic and just use real pid. 2) If we have pidns to dump we should also have kernel NSpid feature, else we should fail to dump notify thread id, as it's not possible to properly convert rpid to vpid. While on it let's put the code to encode_notify_thread_id helper to improve code readability. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	91d7203b80	proc_parse: make nspid field optional On Centos7 we don't have NSpid field in /proc/[pid]/status so for compatibility let's skip it. (cherry-pick one hunk from Virtuozzo commit https://src.openvz.org/projects/OVZ/repos/criu/commits/c6d0ee567c) Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Kirill Tkhai	a692a0d0af	kerndat: Check that "/proc/[pid]/status" file has NS{pid, ..} lines If there is a nested pid_ns, we need to be able to get the pid in the whole pid hierarchy. This may be taken from the "/proc/[pid]/status" file only. Check that the kernel has support for it. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> NSpid is not (yet) supported on Centos7 thus we need this check for compatibility. (cherry-picked from Virtuozzo criu commit https://src.openvz.org/projects/OVZ/repos/criu/commits/94f4653f20) Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Andrei Vagin	64f0012e44	zdtm: add a test for SIGEV_THREAD timers Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Andrei Vagin	7eab5a7dc7	timers: save tid from a task pid namespace Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Andrei Vagin	61e1334ab0	proc_parse: get a thread ID in a thread pidns from /proc/pid/status Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Liu Chao	80079fbb0d	criu: dump and restore notify_thread_id of posix timer When sigev_notify_thread_id is not set, get_pid will return a NULL pointer and do_timer_create will return -EINVAL in the kernel. So criu will fail to create the posix timer: (09.806760) pie: 41301: Error (criu/pie/restorer.c:1998): Can't restore posix timers -22 (09.806824) pie: 41301: Error (criu/pie/restorer.c:2133): Restorer fail 41301 (09.891880) Error (criu/cr-restore.c:2596): Restoring FAILED. Signed-off-by: Liu Chao <liuchao173@huawei.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	6be9345fb1	criu-ns: add support for 'check' action Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	868bffba4d	criu-ns: add top-level conditional execution Execute actions only if run as a script. https://docs.python.org/3/library/__main__.html Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
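
The table naming scheme described in commit 9838d34ded can be illustrated with a small sketch. It only builds nft command strings: the per-instance name is "CRIU-" plus the root task pid, `create table` is used so that an existing table is reported as an error, and unlocking deletes the table. The function names are hypothetical and chosen for illustration; CRIU itself does this in C via libnftables, and this is not its actual code.

```python
# Hypothetical helpers, not CRIU's real API: they only format nft commands.

def nft_table_name(root_pid: int) -> str:
    """Per-instance table name: 'CRIU-' with the root task pid as postfix."""
    return f"CRIU-{root_pid}"


def nft_lock_command(root_pid: int) -> str:
    """Use 'create' (not 'add') so an already existing table is an error."""
    return f"create table inet {nft_table_name(root_pid)}"


def nft_unlock_command(root_pid: int) -> str:
    """Unlocking amounts to deleting the whole per-instance table."""
    return f"delete table inet {nft_table_name(root_pid)}"


if __name__ == "__main__":
    # Two CRIU instances with different root tasks get distinct tables.
    for pid in (3231, 4077):
        print(nft_lock_command(pid))
        print(nft_unlock_command(pid))
```

Because the pid is part of the name, two parallel runs never share a table, and a collision (for example from a stale table) surfaces as a failed `create` rather than being silently merged.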

10636 Commits