mir/criu - criu - Mike's Git repositories

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 22:05:36 +00:00

Author	SHA1	Message	Date
Zeyad Yasser	d55f34ed78	test/ci: sync netns_lock test and its --post-start hook The --post-start hook creates a netns which the test should enter at the beginning of the test. The test randomly failed in CI, most likely because of a race condition. I suspect this flow is the root cause: 1. --post-start hook starts just after the test (in parallel) 2. --post-start hook calls ip netns add to create the test netns 3. ip creates the netns file 4. netns_lock test opens that file and uses it in setns 5. ip mounts the netns to the file Of course the test fails at step 4 because the netns is not yet mounted to the file. I made the test wait for SYNCFILE to be created by the --post-start hook before it tries to open the netns file and call setns. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
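The synchronization described in d55f34ed78 amounts to polling for a marker file before touching the shared resource. The following is a minimal Python sketch of that idea, not the test's actual implementation; the function name `wait_for_file` and its timeout parameters are hypothetical.

```python
import os
import time


def wait_for_file(path, timeout=10.0, interval=0.05):
    """Poll until `path` exists; return True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

For this to close the race, the hook has to create the marker file only after `ip netns add` has returned, so that the netns is already mounted when the waiter proceeds.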
Zeyad Yasser	b290df9a65	test/jenkins: fix netns_lock test multiple iterations failure netns_lock is highly dependent on the order of the hooks, and multiple iterations cause the --pre-dump hook to be called multiple times, which expectedly caused the test to fail. The server loop now accommodates multiple iterations. https://ci.kernoops.org/job/CRIU/job/CRIU-iter/job/criu-dev/431/testReport/(root)/criu/zdtm_static_netns_lock/ Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	75feb9635e	ci: fix mips64el-cross test The mips64el-cross test target started to show the following error: error: listing the stack pointer register '$29' in a clobber list is deprecated [-Werror=deprecated] This fixes it in three different places by removing '$29' from the clobber list. This is only compile tested as we have no mips hardware for testing. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Younes Manton	f3cb156604	Keep inherit-fd strings alive until task restore If inherit-fd is read from a config file its buffer will be freed after the config file is parsed but before task restore, which is when we need to use the mapping. Therefore, when adding an inherit-fd mapping to the opts list, copy the key string to a new buffer. Signed-off-by: Younes Manton <ymanton@ca.ibm.com>	2021-09-03 10:31:00 -07:00
fu.lin	d3ce492cc2	pycrit: fix the broken `crit show xxx.img` cli It breaks when the cli `crit show ipcns-shm-9.img` is executed, msg: { "magic": "IPCNS_SHM", "entries": [ { "desc": { "key": 0, "uid": 0, "gid": 0, "cuid": 0, "cgid": 0, "mode": 438, "id": 0 }, "size": 1048576, "in_pagemaps": true, "extra": Traceback (most recent call last): File "/usr/bin/crit", line 6, in <module> cli.main() File "/usr/lib/python3/dist-packages/pycriu/cli.py", line 412, in main opts["func"](opts) File "/usr/lib/python3/dist-packages/pycriu/cli.py", line 45, in decode json.dump(img, f, indent=indent) File "/usr/lib/python3.9/json/__init__.py", line 179, in dump for chunk in iterable: File "/usr/lib/python3.9/json/encoder.py", line 431, in _iterencode yield from _iterencode_dict(o, _current_indent_level) File "/usr/lib/python3.9/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/usr/lib/python3.9/json/encoder.py", line 325, in _iterencode_list yield from chunks File "/usr/lib/python3.9/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/usr/lib/python3.9/json/encoder.py", line 438, in _iterencode o = _default(o) File "/usr/lib/python3.9/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type bytes is not JSON serializable This is caused by `img['magic'][0]['extra']` which is bytes. I found other load conditions and fixed them at the same time. Signed-off-by: fu.lin <fulin10@huawei.com>	2021-09-03 10:31:00 -07:00
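The failure in d3ce492cc2 can be reproduced outside of crit, because `json.dump` rejects any `bytes` value. Below is a minimal Python sketch of one way to avoid it, assuming base64 text is an acceptable representation for the binary field. The helper name `encode_bytes_fields` and the sample `img` dictionary are hypothetical and are not taken from pycriu.

```python
import base64
import json


def encode_bytes_fields(obj):
    """Recursively replace bytes values with base64 text so json can serialize them."""
    if isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(bytes(obj)).decode("ascii")
    if isinstance(obj, dict):
        return {key: encode_bytes_fields(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [encode_bytes_fields(item) for item in obj]
    return obj


# Hypothetical image structure, loosely shaped like the output quoted in the commit.
img = {"magic": "IPCNS_SHM", "entries": [{"size": 4, "extra": b"\x00\x01\x02\x03"}]}

try:
    json.dumps(img)
    raised = False
except TypeError:
    raised = True  # "Object of type bytes is not JSON serializable"

# After conversion the same structure serializes without error.
text = json.dumps(encode_bytes_fields(img), indent=4)
```

Converting at the boundary keeps the in-memory image untouched and only changes what is handed to the JSON encoder.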
Adrian Reber	093fb0c878	Add test for new --lsm-mount-context option Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	64dd64e504	Enable changing of mount context on restore This change is motivated by checkpointing and restoring container in Pods. When restoring a container into a new Pod the SELinux label of the existing Pod needs to be used and not the SELinux label saved during checkpointing. The option --lsm-profile already enables changing of process SELinux labels on restore. If there are, however, tmpfs checkpointed they will be mounted during restore with the same context as during checkpointing. This can look like the following example: context="system_u:object_r:container_file_t:s0:c82,c137" On restore we want to change this context to match the mount label of the Pod this container is restored into. Changing of the mount label is now possible with the new option --mount-context: criu restore --mount-context "system_u:object_r:container_file_t:s0:c204,c495" This will lead to mount options being changed to context="system_u:object_r:container_file_t:s0:c204,c495" Now the restored container can access all the files in the container again. This has been tested in combination with runc and CRI-O. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	5be71273f6	Remove unnecessary whitespace Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	fc7705a13f	zdtm: add network namespace locking test When criu dumps a process in a network namespace it locks the network so that no packets from peers enter the stack, otherwise an RST will be sent by the kernel, causing the connection to fail. In netns_lock.c we try to enter the netns created by post-start hook so that criu locks the network namespace between dump and restore. A TCP server is started in post-start hook inside the test netns and runs in the background detached from its parent so that it stays alive for the duration of the test. Other hooks (pre-dump, pre-restore, post-restore) try to connect to the server. Pre-dump and post-restore hooks should be able to connect successfully. Pre-restore hook client with SOCCR_MARK should also connect successfully. Pre-restore hook client without SOCCR_MARK should not be able to connect but also should not get connection refused as all packets are dropped in the namespace so the kernel shouldn't send an RST packet as a result. Instead we check that the connect operation causes a timeout. This test would be useful when testing that the network is locked using different ways (using iptables currently and other methods later). v2: - check that packets with SOCCR_MARK are allowed to pass when the netns is locked. v3: - fix pre-restore hook skipping non SOCCR_MARK connection test due to early exit in SOCCR_MARK variant. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	0cf79a3608	test: remove exec test criu exec is deprecated for some time now and criu just exits with an error if running 'criu exec'. This removes the test for that non-working subcommand. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	1a197d4d86	criu: add unit testing for config file parser This tries to add a unit test for the configuration file parser. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	45bde968a2	test: add tests for configuration file parsing This adds a test run to ensure known (but fixed) configuration file parser errors are not crashing CRIU anymore. Based on missing test code coverage this script also tests code paths of the option handling which have not been tested until now. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	f695e6e107	config: make configuration file parser more robust Trying to see how robust the configuration parser is, I was able to crash CRIU pretty quickly. This fixes a few crashes in the existing configuration file parser. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	381d2e88fb	criu: add cleanup_free attribute Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Nicolas Viennot	031a8d7905	bfd: loop through read()/write() when the action is incomplete The callers of bread() and bwrite() assume the operation reads/writes the complete length of the passed buffer. We must loop when invoking the read()/write() APIs. Fixes #1504 Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>	2021-09-03 10:31:00 -07:00
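The short read/write handling described in 031a8d7905 follows a common pattern: keep calling the primitive until the whole buffer has been transferred or EOF is reached. The following Python sketch shows that pattern with `os.read()` and `os.write()`; it is an analogue for illustration, not the C code from the commit, and the names `read_full` and `write_full` are hypothetical.

```python
import os


def read_full(fd, size):
    """Read exactly `size` bytes unless EOF is hit first."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than requested
        if not chunk:  # empty result means EOF
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def write_full(fd, data):
    """Write all of `data`, retrying after short writes."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        written += os.write(fd, view[written:])  # may write only part of the view
    return written


# Two separate writes are collected by a single read_full() call.
r, w = os.pipe()
write_full(w, b"hello ")
write_full(w, b"world")
os.close(w)
result = read_full(r, 11)
os.close(r)
```

Callers that assume a complete transfer can then rely on the wrapper instead of checking the length after every call.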
Adrian Reber	24bc083653	ci: disable some tests on CentOS 7 Now that we are running CI on an actual CentOS 7 kernel different tests are no longer working as they require newer kernels. This commit disables a few tests only on CentOS 7. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	63ca464bcb	ci: remove old workarounds This commit removes a couple of workarounds for old kernels and distributions which we no longer use in CI. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	6ef01d3e6a	ci: switch CentOS 7 test to Cirrus CI On Cirrus CI we can run tests on the original CentOS 7 kernel. The kernel is rather old, but on GitHub Actions, with a 5.8 kernel and a containerized CentOS 7 user space, not much is working correctly anymore. With this commit CentOS 7 based tests are no longer running on GitHub Actions but on Cirrus CI. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	1fbe876242	ci: disable -x during print_env() Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	b4c7267b0e	zdtm: allow ignore taint via environment variable With this change tainted kernels can be ignored with setting ZDTM_IGNORE_TAINT=1. This is just to simplify the CI script to not require to change every call of zdtm. Setting the variable once should be enough. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	a92833818b	scripts/vagrant: Use vagrant 2.2.16 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	eda3ac2ff3	scripts/vagrant: Use Fedora 34 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Mike Frysinger	87ea13f6b7	add PKG_CONFIG default in a few more places These files use $PKG_CONFIG before they include the common files that setup a default, so set early defaults in them too. Signed-off-by: Mike Frysinger <vapier@chromium.org>	2021-09-03 10:31:00 -07:00
Valery Ivanov	6db0f95dbf	crtools: improve error handling on signal setting Signed-off-by: Valery Ivanov <ivalery111@gmail.com>	2021-09-03 10:31:00 -07:00
Mike Frysinger	2967bed64e	build: respect $PKG_CONFIG settings The build needs to respect $PKG_CONFIG env var like other standard build systems and the upstream pkg-config project itself. This allows the package builder to point it to the right tool when doing a cross-compile build. Otherwise the host pkg-config tool is used which won't have access to the packages in the cross sysroot. Signed-off-by: Mike Frysinger <vapier@chromium.org>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	81a68ad3b2	docker-test: use latest containerd release This patch improves the changes from `19be9ced9`. To use the newer version of containerd, we need to make sure that the containerd service has been restarted after install. Instead of hard-coding a version number, we can use github API to get the latest release. In addition, the tar file contains all binary files in a './bin' sub-folder. Thus, it should be extracted in '/usr'. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	638e53c950	zdtm/tun_ns: add per-test dependencies The tun_ns test was introduced with [1] and [2], however, these commits didn't add per-test dependencies required for the test. Per-test dependencies are listed in the .desc file as 'deps': [<list>] These dependencies are made available inside the test namespace and without the ip dependency, the test fails on Fedora 34 with Error: ipv4: FIB table does not exist. [1] https://github.com/checkpoint-restore/criu/commit/7e355e7 [2] https://github.com/checkpoint-restore/criu/commit/3ba0893 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Adrian Reber	9d9ec73dd7	test: skip time namespaced tests on <= 5 Although CentOS 8 comes with a 4.18 kernel, it has time namespace patches backported, but not all the required ones. This disables time namespaced tests on everything older than 5.11. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	e42083aa8b	ci: update docker test matrix Ubuntu 18.04 still has a problem with overlayfs and breaks CRIU https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857257 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Christian Brauner	ebc74668ff	cr_options: handle the case where __dest == __src in SET_CHAR_OPTS The SET_CHAR_OPTS(__dest, __src) macro is essentially: free(opts.__dest); opts.__dest = xstrdup(__src); So if __dest == __src the string that gets copied is freed. This e.g. is the case in criu/lsm.c int lsm_check_opts(void) { char *aux; if (!opts.lsm_supplied) return 0; aux = strchr(opts.lsm_profile, ':'); if (aux == NULL) { pr_err("invalid argument %s for --lsm-profile\n", opts.lsm_profile); return -1; } *aux = '\0'; aux++; if (strcmp(opts.lsm_profile, "apparmor") == 0) { if (kdat.lsm != LSMTYPE__APPARMOR) { pr_err("apparmor LSM specified but apparmor not supported by kernel\n"); return -1; } SET_CHAR_OPTS(lsm_profile, aux); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ } else if (strcmp(opts.lsm_profile, "selinux") == 0) { if (kdat.lsm != LSMTYPE__SELINUX) { pr_err("selinux LSM specified but selinux not supported by kernel\n"); return -1; } SET_CHAR_OPTS(lsm_profile, aux); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ } else if (strcmp(opts.lsm_profile, "none") == 0) { xfree(opts.lsm_profile); opts.lsm_profile = NULL; } else { pr_err("unknown lsm %s\n", opts.lsm_profile); return -1; } return 0; } Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	d0511319e5	github: Add templates for new issues and pull requests This way users would be able to create more meaningful pull-requests and issues. And we would not need to ask them to provide basic information each time. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	3c10d3335b	criu(8): document --join-ns option The --join-ns option was introduced with commits: https://github.com/checkpoint-restore/criu/commit/2cf17cd https://github.com/checkpoint-restore/criu/commit/790ec46 Closes: #1054 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	80ee4f8aec	kdat: make uffd_open return errno from syscall separately Previously kerndat_uffd could not differentiate -EPERM and -1 returned from uffd_open(). That way "Failed to get uffd API" and "Incompatible uffd API ..." errors were just ignored, which is probably not what we want. v2: rework with extra argument of uffd_open for errno, rename err label in uffd_open for readability Fixes: `cfdeac4a4` ("kerndat: Handle non-root mode when checking uffd") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Adrian Reber	a8525c07d4	ci: no longer avoid overlayfs Now that the Ubuntu kernel is no longer broken with regards to overlayfs, let's switch back to overlayfs instead of devicemapper and vfs graphdrivers. Signed-off-by: Adrian Reber <areber@redhat.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	2aa4185a6c	test/others: refactor loop process There are several problems with the loop.sh script. First, the code is duplicated across tests in the so-called 'others' category. Second, we need to run it with the 'setsid' utility to make sure that it runs in a new session. Third, we have to redirect the standard file descriptors and use the '&' operator to make it run in the background. Finally, obtaining the PID of the 'loop.sh' process resulted in a race condition. In this patch we replace the loop.sh script with a program that would address all problems mentioned above. The requirements for this program are as follows. - It must be reusable across tests - It must start a process that is detached from the current shell - It must wait for the process to start and output its PID Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	2b78d95e6b	test/others: drop '_exit' function The function name '_exit' is misleading as this function doesn't actually exit when the status of the previous command is zero. In addition, the behaviour of this function is not really needed. This patch removes the '_exit' function and applies the correct behaviour to stop the test on failure. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Andrei Vagin	34410b9e75	test: add a test to check that sigtrap handlers are restored Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Andrei Vagin	b310fbd31f	ksigset: fix a typo in ksigdelset Fixes: `8063eb8fe6` ("parasite: don't block SIGTRAP") Reported-by: zl-wang <zlwang@ca.ibm.com> Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00
Pavel Tikhomirov	c1b2d194e9	mem/pidfd: fix poll retry error checking One should never rely on errno if a libc syscall is successful. We can either see an errno set from some previous failed syscall or even an errno set by this successful libc syscall. So let's check ret first. Fixes: 1ccdaf47 ("criu: add pidfd based pid reuse detection for RPC clients") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	1c08709cdb	zdtm: add pidfd store based pid reuse test This is just a symlink to the original transition/pid_reuse test with the right options passed to trigger the pidfd store based pid reuse detection code path. Pidfd store based detection is supported only in RPC mode which requires passing a unix socket fd to be used as pidfd store and the kernel should support pidfd_open and pidfd_getfd syscalls {'feature': 'pidfd_store'} for this test to work. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	ea0dc7807a	zdtm: add --pidfd-store option in RPC mode When testing pid reuse using the pidfd_store feature in RPC mode we need to pass a unix socket fd to CRIU in the RPC option pidfd_store_sk; the socket is used to store the pidfds between predump/dump iterations. Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	e79131e8c3	criu: add pidfd based pid reuse detection for RPC clients Closes: #717 This increases the reliability of pid reuse detection using pidfds, currently through RPC migration tools like P.Haul. A connectionless unix socket is passed to criu in RPC mode through the RPC option pidfd_store_sk. If this option is set, the socket is initialized in init_pidfd_store_sk() to be used as a queue for task pidfds. criu then sends tasks pidfds to this socket in send_pidfd_entry() and receives them in the next pre-dump/dump iteration to build the pidfds hashtable in init_pidfd_store_hash(). These pidfds will be used later in detect_pid_reuse(). How it should be used in migration tools like P.Haul: - Open a connectionless unix socket - Pass the socket fd in the RPC option pidfd_store_sk when doing a pre-dump or dump This will fail if the kernel does not support pidfd_open or pidfd_getfd syscalls, so pidfd_store_sk should not be set if the kernel does not support pidfd_open. This could be checked with: CLI: criu check --feature pidfd_store RPC: CRIU_REQ_TYPE__FEATURE_CHECK and set pidfd_store to true in the "features" field of the request v2: - add reasonable polling restart limit in check_pidfd_entry_state to avoid getting stuck - avoid leaking pidfd in send_pidfd_entry when entry is NULL, otherwise pidfds are freed in free_pidfd_store v3: - check that the passed pidfd store is not empty after the first iteration (i.e. --prev-images-dir option set). v4: - clear pidfd_hash heads - check entry allocation error in init_pidfd_store_hash() Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	ba882893c3	cr-check: add ability to check if pidfd_store feature is supported pidfd_store which will be used for reliable pidfd based pid reuse detection for RPC clients requires two recent syscalls (pidfd_open and pidfd_getfd). We allow checking if pidfd_store is supported using: 1. CLI: criu check --feature pidfd_store 2. RPC: CRIU_REQ_TYPE__FEATURE_CHECK and set pidfd_store to true in the "features" field of the request Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	e3c9c3429a	cr-service: add pidfd_store_sk option to rpc.proto pidfd_store_sk option will be used later to store tasks pidfds between predumps to detect pid reuse reliably. pidfd_store_sk should be a fd of a connectionless unix socket. init_pidfd_store_sk() steals the socket from the RPC client using pidfd_getfd, checks that it is a connectionless unix socket and checks if it is not initialized before (i.e. unnamed socket). If not initialized the socket is first bound to an abstract name (combination of the real pid/fd to avoid overlap), then it is connected to itself hence allowing us to store the pidfds in the receive queue of the socket (this is similar to how fdstore_init() works). v2: - avoid close(pidfd) overriding errno of SYS_pidfd_open in init_pidfd_store_sk() - close pidfd_store_sk because we might have leftover from previous iterations Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
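The storage trick described in e3c9c3429a, a connectionless unix socket bound to an abstract name and connected to itself so that its receive queue holds file descriptors, can be sketched in Python as follows. This is an illustration of the mechanism only (Linux-specific), not CRIU's code; the function names `open_fd_store`, `push_fd` and `pop_fd` are hypothetical, and a plain pipe fd stands in for a pidfd.

```python
import array
import os
import socket


def open_fd_store(name):
    """Create a datagram unix socket that is connected to itself."""
    sk = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    addr = b"\0" + name  # leading NUL byte selects the Linux abstract namespace
    sk.bind(addr)
    sk.connect(addr)  # messages sent on sk now land in sk's own receive queue
    return sk


def push_fd(sk, fd):
    """Queue one file descriptor in the socket using SCM_RIGHTS."""
    fds = array.array("i", [fd])
    sk.sendmsg([b"f"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def pop_fd(sk):
    """Take the oldest queued file descriptor back out of the socket."""
    fds = array.array("i")
    _, ancdata, _, _ = sk.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial integer in the ancillary buffer.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    return fds[0]
```

Because the kernel keeps the queued descriptors alive, the sender can close its own copy after `push_fd()` and still get a working descriptor back from `pop_fd()` later.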
Zeyad Yasser	a9508c9864	criu: check if pidfd_getfd syscall is supported pidfd_getfd syscall will be needed later to send pidfds between pre-dump/dump iterations for pid reuse detection. v2: - check size written/read of val_a/val_b is correct - return with error when val_a != val_b Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
Zeyad Yasser	30e8d8cadf	criu: check if pidfd_open syscall is supported pidfd_open syscall will be needed later to send pidfds between pre-dump/dump iterations for pid reuse detection. v2: - make kerndat_has_pidfd_open void since 0 is always returned - fix missing tabs in syscall tables Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>	2021-09-03 10:31:00 -07:00
nithin-jaikar	5d08f975a1	kerndat: Handle non-root mode when checking uffd When criu is run as user it fails and exits because of kerndat_uffd() returning -1. This, in turn, happens after uffd = syscall(SYS_userfaultfd, flags); which only works for root. In the change it ignores the permission error and proceeds further just like it's done for e.g. pagemap checking. Signed-off-by: Nithin Jaikar J <jaikar006@gmail.com>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	8c303d1a64	test/others/crit: add test for 'x' This commit extends the CRIT tests to cover the 'x' command, which is used to explore an image directory. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Radostin Stoyanov	e393001095	lib/cli.py: Open explore file as a binary Fixes #1484 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2021-09-03 10:31:00 -07:00
Andrei Vagin	c8973d426b	test/zdtm: check that a pending SIGTRAP is handled properly Signed-off-by: Andrei Vagin <avagin@gmail.com>	2021-09-03 10:31:00 -07:00

10566 Commits