mir/criu - criu - Mike's Git repositories

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-22 01:51:51 +00:00

Author	SHA1	Message	Date
Adrian Reber	615ccf98cf	crit: do not crash on aarch64 doing 'crit x ./ rss' Running 'crit x ./ rss' on aarch64 crashes with: File "/home/criu/crit/crit/__main__.py", line 331, in explore_rss while vmas[vmi]['start'] < pme: ~~~~^^^^^ IndexError: list index out of range This adds an additional check to the while loop to avoid accessing indexes out of range. Signed-off-by: Adrian Reber <areber@redhat.com>	2024-09-19 15:23:42 -07:00
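The guard described in this commit can be sketched as follows. Only the names `vmas`, `vmi`, `pme` and the loop condition come from the traceback; the helper name and the loop body are assumptions for illustration, and the real loop in crit may do more work per iteration.

```python
def skip_vmas_before(vmas, vmi, pme):
    """Advance vmi past every VMA that starts below address pme.

    Hypothetical helper: the bounds check is evaluated first, so
    vmas[vmi] is never indexed once vmi reaches len(vmas).
    """
    while vmi < len(vmas) and vmas[vmi]['start'] < pme:
        vmi += 1
    return vmi


vmas = [{'start': 0x1000}, {'start': 0x2000}]
# An address above every VMA start used to raise IndexError; now the loop stops.
print(skip_vmas_before(vmas, 0, 0x9000))  # 2
```

Putting the length check on the left of `and` matters: Python short-circuits, so the index expression is skipped when the check fails.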
Radostin Stoyanov	21ea718f9f	plugins/amdgpu: fix printf format specifiers Errors on aarch64: In file included from amdgpu_plugin_drm.h:10, from amdgpu_plugin.c:33: amdgpu_plugin.c: In function 'amdgpu_plugin_dump_file': amdgpu_plugin_util.h:24:20: error: format '%lld' expects argument of type 'long long int', but argument 6 has type '__u64' {aka 'long unsigned int'} [-Werror=format=] 24 \| #define LOG_PREFIX "amdgpu_plugin: " \| ^~~~~~~~~~~~~~~~~ ../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX' 47 \| #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__) \| ^~~~~~~~~~ amdgpu_plugin.c:1236:9: note: in expansion of macro 'pr_info' 1236 \| pr_info("devices:%d bos:%d objects:%d priv_data:%lld\n", args.num_devices, args.num_bos, args.num_objects, \| ^~~~~~~ cc1: all warnings being treated as errors Errors on ppc64: In file included from amdgpu_plugin_drm.h:10, from amdgpu_plugin.c:33: amdgpu_plugin.c: In function 'amdgpu_plugin_dump_file': amdgpu_plugin_util.h:24:20: error: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type '__u64' {aka 'long unsigned int'} [-Werror=format=] 24 \| #define LOG_PREFIX "amdgpu_plugin: " \| ^~~~~~~~~~~~~~~~~ ../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX' 47 \| #define pr_info(fmt, ...) 
print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__) \| ^~~~~~~~~~ amdgpu_plugin.c:1236:9: note: in expansion of macro 'pr_info' 1236 \| pr_info("devices:%u bos:%u objects:%u priv_data:%llu\n", \| ^~~~~~~ cc1: all warnings being treated as errors In file included from amdgpu_plugin_util.c:38: amdgpu_plugin_util.c: In function 'print_kfd_bo_stat': amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=] 24 \| #define LOG_PREFIX "amdgpu_plugin: " \| ^~~~~~~~~~~~~~~~~ ../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX' 47 \| #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__) \| ^~~~~~~~~~ amdgpu_plugin_util.c:196:17: note: in expansion of macro 'pr_info' 196 \| pr_info("%s(), %d. KFD BO Addr: %llx \n", __func__, idx, bo->addr); \| ^~~~~~~ amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=] 24 \| #define LOG_PREFIX "amdgpu_plugin: " \| ^~~~~~~~~~~~~~~~~ ../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX' 47 \| #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__) \| ^~~~~~~~~~ amdgpu_plugin_util.c:197:17: note: in expansion of macro 'pr_info' 197 \| pr_info("%s(), %d. KFD BO Size: %llx \n", __func__, idx, bo->size); \| ^~~~~~~ amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=] 24 \| #define LOG_PREFIX "amdgpu_plugin: " \| ^~~~~~~~~~~~~~~~~ ../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX' 47 \| #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__) \| ^~~~~~~~~~ amdgpu_plugin_util.c:198:17: note: in expansion of macro 'pr_info' 198 \| pr_info("%s(), %d. 
KFD BO Offset: %llx \n", __func__, idx, bo->offset); \| ^~~~~~~ amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=] 24 \| #define LOG_PREFIX "amdgpu_plugin: " \| ^~~~~~~~~~~~~~~~~ ../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX' 47 \| #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__) \| ^~~~~~~~~~ amdgpu_plugin_util.c:199:17: note: in expansion of macro 'pr_info' 199 \| pr_info("%s(), %d. KFD BO Restored Offset: %llx \n", __func__, idx, bo->restored_offset); \| ^~~~~~~ cc1: all warnings being treated as errors Co-developed-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-19 15:23:42 -07:00
Radostin Stoyanov	3e2ed18790	plugins/amdgpu: use C99-standard types Co-developed-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-19 15:23:42 -07:00
Radostin Stoyanov	d68205e919	ci: enable cross compile testing for amdgpu-plugin Skip cross-compilation on armv7 because, among many other errors, it fails with the following: In file included from ../../include/common/lock.h:9, from ../../criu/include/files.h:9, from amdgpu_plugin.c:30: ../../include/common/asm/atomic.h:60:2: error: #error ARM architecture version (CONFIG_ARMV) not set or unsupported. 60 \| #error ARM architecture version (CONFIG_ARMV) not set or unsupported. \| ^~~~~ ../../include/common/asm/atomic.h: In function 'atomic_add_return': ../../include/common/asm/atomic.h:81:9: error: implicit declaration of function 'smp_mb' [-Werror=implicit-function-declaration] 81 \| smp_mb(); \| ^~~~~~ Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-19 15:23:42 -07:00
Radostin Stoyanov	2ee5844411	plugins/amdgpu: fix cross-compilation To enable cross-compile we need to use the CC definition from criu/scripts/nmk/scripts/tools.mk: CC := $(CROSS_COMPILE)$(HOSTCC) Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-19 15:23:42 -07:00
Andrei Vagin	9a19cf34de	scripts/ci: run tests with the mocked cuda-checkpoint tool Signed-off-by: Andrei Vagin <avagin@google.com>	2024-09-11 16:02:11 -07:00
Andrei Vagin	de31abb970	criu/plugin: don't call plugin device hooks for non-alive tasks Dead tasks don't hold any resources. Fixes: 2465 Signed-off-by: Andrei Vagin <avagin@google.com>	2024-09-11 16:02:11 -07:00
Andrei Vagin	dea6305914	test/zdtm: allow to run tests with the mocked cuda-checkpoint tool Here is an example of how to run one test: $ python test/zdtm.py run -t zdtm/static/env00 --ignore-taint --mocked-cuda-checkpoint Signed-off-by: Andrei Vagin <avagin@google.com>	2024-09-11 16:02:11 -07:00
haozi007	67fe44e981	support user-set remote mmap VMA address 1. A VMA address assigned automatically by the OS may conflict with an existing VMA in GPU live-migration scenarios; 2. so we should let the user choose the address. Signed-off-by: haozi007 <liuhao27@huawei.com>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	551cd92447	timer: fix printf specifiers for __suseconds64_t New internal glibc types __timeval64 [1] and __suseconds64_t [2] have been introduced as a solution for the Y2038 problem [3]. These 64-bit types are used across all architectures. However, this change causes the following build errors when cross-compiling on ARMv7 (armhf): criu/timer.c:49:17: error: format '%ld' expects argument of type 'long int', but argument 5 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=] 49 \| pr_info("Restored %s timer to %" PRId64 ".%ld -> %" PRId64 ".%ld\n", n, \| ^~~~~~~~~~~~~~~~~~~~~~~~ 50 \| (int64_t)val->it_value.tv_sec, val->it_value.tv_usec, \| ~~~~~~~~~~~~~~~~~~~~~ \| \| \| __suseconds64_t {aka long long int} criu/timer.c:49:17: error: format '%ld' expects argument of type 'long int', but argument 7 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=] 49 \| pr_info("Restored %s timer to %" PRId64 ".%ld -> %" PRId64 ".%ld\n", n, \| ^~~~~~~~~~~~~~~~~~~~~~~~ 50 \| (int64_t)val->it_value.tv_sec, val->it_value.tv_usec, 51 \| (int64_t)val->it_interval.tv_sec, val->it_interval.tv_usec); \| ~~~~~~~~~~~~~~~~~~~~~~~~ \| \| \| __suseconds64_t {aka long long int} ns.c:234:48: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'time_t' {aka 'long long int'} [-Werror=format=] 234 \| len = snprintf(buf, sizeof(buf), "%d %ld 0", clk_id, offset); \| ~~^ ~~~~~~ \| \| \| \| long int time_t {aka long long int} \| %lld msg.c:58:41: error: format '%ld' expects argument of type 'long int', but argument 3 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=] 58 \| off += sprintf(buf + off, ".%.3ld: ", tv.tv_usec / 1000); \| ~~~~^ ~~~~~~~~~~~~~~~~~ \| \| \| \| long int __suseconds64_t {aka long long int} \| %.3lld ../lib/zdtmtst.h:137:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 137 \| test_msg("ERR: %s:%d: " 
format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \ \| ^~~~~~~~~~~~~~ pthread_timers_h.c:72:17: note: in expansion of macro 'pr_perror' 72 \| pr_perror("wrong interval: %ld:%ld", itimerspec.it_interval.tv_sec, itimerspec.it_interval.tv_nsec); \| ^~~~~~~~~ vdso00.c:22:32: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 22 \| test_msg("%d time: %10li\n", getpid(), tv.tv_sec); \| ~~~~^ ~~~~~~~~~ \| \| \| \| long int __time64_t {aka long long int} \| %10lli vdso00.c:29:32: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 29 \| test_msg("%d time: %10li\n", getpid(), tv.tv_sec); \| ~~~~^ ~~~~~~~~~ \| \| \| \| long int __time64_t {aka long long int} \| %10lli vdso01.c:357:42: error: format '%li' expects argument of type 'long int', but argument 2 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 357 \| test_msg("gettimeofday: tv_sec %li vdso_gettimeofday: tv_sec %li\n", tv1.tv_sec, tv2.tv_sec); \| ~~^ ~~~~~~~~~~ \| \| \| \| long int __time64_t {aka long long int} \| %lli vdso01.c:357:72: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 357 \| test_msg("gettimeofday: tv_sec %li vdso_gettimeofday: tv_sec %li\n", tv1.tv_sec, tv2.tv_sec); \| ~~^ ~~~~~~~~~~ \| \| \| \| long int __time64_t {aka long long int} \| vdso01.c:328:43: error: format '%li' expects argument of type 'long int', but argument 2 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 328 \| test_msg("clock_gettime: tv_sec %li vdso_clock_gettime: tv_sec %li\n", ts1.tv_sec, ts2.tv_sec); \| ~~^ ~~~~~~~~~~ \| \| \| \| long int __time64_t {aka long long int} \| %lli vdso01.c:328:74: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 328 \| 
test_msg("clock_gettime: tv_sec %li vdso_clock_gettime: tv_sec %li\n", ts1.tv_sec, ts2.tv_sec); \| ~~^ ~~~~~~~~~~ \| \| \| \| long int __time64_t {aka long long int} \| ../lib/zdtmtst.h:144:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'time_t' {aka 'long long int'} [-Werror=format=] 144 \| test_msg("FAIL: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \ \| ^~~~~~~~~~~~~~~ mtime_mmap.c:80:17: note: in expansion of macro 'fail' 80 \| fail("mtime %ld wasn't updated on mmapped %s file", mtime_new, filename); \| ^~~~ ../lib/zdtmtst.h:144:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type '__time64_t' {aka 'long long int'} [-Werror=format=] 144 \| test_msg("FAIL: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \ \| ^~~~~~~~~~~~~~~ mtime_mmap.c:101:17: note: in expansion of macro 'fail' 101 \| fail("After migration, mtime changed to %ld", fst.st_mtime); \| ^~~~ [1] https://sourceware.org/git/?p=glibc.git;h=504c98717062cb9bcbd4b3e59e932d04331ddca5 [2] https://sourceware.org/git/?p=glibc.git;h=3fced064f23562ec24f8312ffbc14950993969e6 [3] https://en.wikipedia.org/wiki/Year_2038_problem Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	a045c874cb	ci: run tests with amdgpu and cuda plugins Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	2453ed69a2	zdtm: add option to run tests with criu plugins By default, if the "CRIU_LIBS_DIR" environment variable is not set, CRIU will load all plugins installed in `/usr/lib/criu`. This may result in running the ZDTM tests with plugins for a different version of CRIU (e.g., installed from a package). This patch updates ZDTM to always set the "CRIU_LIBS_DIR" environment variable and use a local "plugins" directory. This directory contains copies of the plugin files built from source. In addition, this patch adds the `--criu-plugin` option to the `zdtm.py run` command, allowing tests to be run with specified CRIU plugins. Example: - Run test only with AMDGPU plugin ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin amdgpu - Run test only with CUDA plugin ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin cuda - Run test with both AMDGPU and CUDA plugins ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin amdgpu cuda Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	ad66c27a11	cuda: fix launch cuda-checkpoint When the cuda-checkpoint tool is not installed, execvp() is expected to fail and return -1. In this case, we need to call exit() to terminate the child process that was created earlier with fork(). Since CRIU can be used with applications that do not use CUDA, even when the CUDA plugin is installed, this patch also updates the log messages to show debug and warning (instead of error) when the cuda-checkpoint tool is not found in $PATH. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org> Signed-off-by: Andrei Vagin <avagin@google.com>	2024-09-11 16:02:11 -07:00
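The fork/exec pattern this commit fixes can be sketched in Python. This is an illustration under assumptions, not the plugin's C code: `run_tool` and the exit code 127 are hypothetical choices, and in Python a failed `os.execvp()` raises `OSError` instead of returning -1.

```python
import os


def run_tool(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            # On success the process image is replaced and nothing below runs.
            os.execvp(argv[0], argv)
        finally:
            # Reached only if exec failed (e.g. tool not in $PATH): terminate
            # the child so it does not continue executing the parent's code.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Calling `run_tool()` with a binary that is not installed returns 127 to the caller, which can then log a warning rather than an error.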
Radostin Stoyanov	fde0b7ac69	cuda: don't leak fds to cuda-checkpoint Leaking open file descriptors to third-party tools can lead to security risks. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	4dde52a308	ci/podman: show mounts Show information about mounts available on the host filesystem. This is useful for debugging. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	9a85fb6382	ci/podman: show criu logs in case of error Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
liuchao173	8437663cc6	delete redundant include header files restorer.h has been included in line 43. Fixes: 22963d282729 ("Hide asm/restorer.h from sources") Signed-off-by: liuchao173 <liuchao173@huawei.com>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	c42b58f4fb	plugin: enable multiple plugins for the same hook CRIU provides two plugins for checkpoint/restore of GPU applications: amdgpu and cuda. Both plugins use the `RESUME_DEVICES_LATE` hook to enable restore: CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, amdgpu_plugin_resume_devices_late) CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, cuda_plugin_resume_devices_late) However, CRIU currently does not support running more than one plugin for the same hook. As a result, when both plugins are installed, the resume function for CUDA applications is not executed. To fix this, we need to make sure that both `plugin_resume_devices_late()` functions return `-ENOTSUP` when restore is not supported. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
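A minimal sketch of the dispatch idea behind this commit, assuming the hook runner moves on to the next registered callback whenever one returns `-ENOTSUP`. The function names and the `is_*` arguments are hypothetical; the real logic lives in CRIU's C plugin code.

```python
import errno


def run_hook(callbacks, *args):
    """Call each callback registered for a hook until one handles it.

    A callback returns -ENOTSUP to signal "not my device"; any other
    return value (0 or a real error) ends the iteration.
    """
    for cb in callbacks:
        ret = cb(*args)
        if ret != -errno.ENOTSUP:
            return ret
    return -errno.ENOTSUP


def amdgpu_resume(pid, is_amdgpu, is_cuda):
    # Hypothetical stand-in for amdgpu_plugin_resume_devices_late().
    return 0 if is_amdgpu else -errno.ENOTSUP


def cuda_resume(pid, is_amdgpu, is_cuda):
    # Hypothetical stand-in for cuda_plugin_resume_devices_late().
    return 0 if is_cuda else -errno.ENOTSUP


# A CUDA-only task: the AMDGPU callback declines, the CUDA one handles it.
print(run_hook([amdgpu_resume, cuda_resume], 1234, False, True))  # 0
```

If the first plugin returned 0 unconditionally, the second callback would never be reached, which matches the failure described in the commit message.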
Radostin Stoyanov	85050be66b	seize: fix pause-devices plugin hook The plugin hook "PAUSE_DEVICES" was recently introduced in the following commit. This hook was intended to execute the cuda-checkpoint tool before the process tree is frozen. However, the run_plugins() call has been placed immediately after freeze_processes(). This causes the cuda-checkpoint tool to hang indefinitely during the checkpointing of CUDA applications running in containers, eventually leading to its termination by the timeout alarm. a85f488595e0a3a6e6cc6ca7c94d4a00b1341aaf criu/plugin: Introduce new plugin hooks PAUSE_DEVICES and CHECKPOINT_DEVICES to be used during pstree collection This problem can be reproduced with the following example: sudo podman run -d --rm \ --device nvidia.com/gpu=all --security-opt=label=disable \ quay.io/radostin/cuda-counter sudo podman container checkpoint -l -e /tmp/checkpoint.tar Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Andrei Vagin	21108b40de	test/zdtm: mount a new tmpfs to the zdtm root /dev The current file system can be mounted with nodev. Fixes #2441 Signed-off-by: Andrei Vagin <avagin@google.com>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	fcbadfbdbf	plugins: set executable bit on .so files For historical reasons, some tools like rpm [1] or ldd [2,3] may expect the executable bit to be present for the correct identification of shared libraries. The executable bit on .so files is set by default by compilers (e.g., GCC). It is not strictly necessary but primarily a convention. [1] https://docs.fedoraproject.org/en-US/package-maintainers/CommonRpmlintIssues/#unstripped_binary_or_object [2] https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/ldd.bash.in;h=d6b640df;hb=HEAD#l154 [3] $ sudo ldd /usr/lib/criu/*.so /usr/lib/criu/amdgpu_plugin.so: ldd: warning: you do not have execution permission for `/usr/lib/criu/amdgpu_plugin.so' linux-vdso.so.1 (0x00007fd0a2a3e000) libdrm.so.2 => /lib64/libdrm.so.2 (0x00007fd0a29eb000) libdrm_amdgpu.so.1 => /lib64/libdrm_amdgpu.so.1 (0x00007fd0a29de000) libc.so.6 => /lib64/libc.so.6 (0x00007fd0a27fc000) /lib64/ld-linux-x86-64.so.2 (0x00007fd0a2a40000) /usr/lib/criu/cuda_plugin.so: ldd: warning: you do not have execution permission for `/usr/lib/criu/cuda_plugin.so' linux-vdso.so.1 (0x00007f1806e13000) libc.so.6 => /lib64/libc.so.6 (0x00007f1806c08000) /lib64/ld-linux-x86-64.so.2 (0x00007f1806e15000) Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	5783706d57	docs: update amdgpu-plugin man page This patch updates the dependencies section of the AMDGPU plugin man page to reflect that the plugin has been merged upstream and to fix a formatting issue. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Florian Weimer	089345f77a	Adjust to glibc __rseq_size semantic change In commit 2e456ccf0c34a056e3ccafac4a0c7effef14d918 ("Linux: Make __rseq_size useful for feature detection (bug 31965)") glibc 2.40 changed the meaning of __rseq_size slightly: it is now the size of the active/feature area (20 bytes initially), and not the size of the entire initially defined struct (32 bytes including padding). The reason for the change is that the size including padding does not allow detection of newly added features while previously unused padding is consumed. The prep_libc_rseq_info change in criu/cr-restore.c is not necessary on kernels which have full ptrace support for obtaining rseq information because the code is not used. On older kernels, it is a correctness fix because with size 20 (the new value), rseq registration would fail. The two other changes are required to make rseq unregistration work in tests. Signed-off-by: Florian Weimer <fweimer@redhat.com>	2024-09-11 16:02:11 -07:00
Bui Quang Minh	b9081ca56b	zdtm: make cgroup testcases run non-parallel cgroup testcases live in the same cgroup root of the zdtmtst and zdtmtst.defaultroot controllers and then create child subgroups for testing. This can cause problems when cgroup testcases run in parallel. For example, testcase A dumps the child subgroup of testcase B since it's in the cgroup root, but in the middle of restoring testcase A, testcase B completes and cleans up the subgroup directory. This causes an error in the restore of testcase A. This commit adds the excl flag to all cgroup testcase descriptions so that these don't run in parallel. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>	2024-09-11 16:02:11 -07:00
Andrei Vagin	4f45572fde	util: use close_range when it's supported close_range is faster than reading /proc/self/fd and closing descriptors one by one. Signed-off-by: Andrei Vagin <avagin@gmail.com>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	42b177da62	scripts/build: drop centos 7 targets The CI tests with CentOS 7 have been disabled and removed [1,2]. This patch removes the obsolete Makefile targets for these tests. [1] `24bc083653` [2] `f8466ca798` Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Andrei Vagin	1815838191	vdso: proxify the __vdso_clock_gettime64 function It was added in v5.3-rc1~211^2~4^2~10. Fixes #2390 Signed-off-by: Andrei Vagin <avagin@gmail.com>	2024-09-11 16:02:11 -07:00
Andrei Vagin	ac22aaf576	apparmor: get_suspend_policy must return NULL in error cases Before this fix, it could return MAP_FAILED which is ((void *) -1). Signed-off-by: Andrei Vagin <avagin@gmail.com>	2024-09-11 16:02:11 -07:00
Pavel Tikhomirov	71999d8883	cgroupd: unblock SIGTERM to make stop_cgroupd actually work Sometimes, due to signal mask inheritance, cgroupd can start with SIGTERM blocked. cgroupd then never acts on the SIGTERM from stop_cgroupd(), and CRIU gets stuck waiting for a cgroupd that never stops. I see this happening in lxc-checkpoint, and also saw it in OpenVZ Jenkins on the cgroup_inotify00 test. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2024-09-11 16:02:11 -07:00
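The idea of explicitly unblocking SIGTERM in a process that may have inherited a blocked signal mask can be illustrated with Python's `signal` module. This is only a sketch of the technique; `unblock_sigterm` is a hypothetical helper, and CRIU does the equivalent in C.

```python
import signal


def unblock_sigterm():
    """Remove SIGTERM from the calling thread's blocked set.

    Returns the previous mask so the caller can see what was inherited.
    """
    return signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGTERM})


# Simulate inheriting a mask with SIGTERM blocked, then undo it.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})
inherited = unblock_sigterm()
# SIG_BLOCK with an empty set changes nothing and returns the current mask.
current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
print(signal.SIGTERM in inherited, signal.SIGTERM in current)  # True False
```

Unblocking unconditionally at startup is cheap and removes the dependency on whatever mask the parent happened to have.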
Liu Hua	daed6c3535	irmap: duplicate string in irmap_scan_path_add Duplicate the string in irmap_scan_path_add; otherwise it will be freed before the next configuration input is parsed. [ avagin: handle errors of xstrdup ] Signed-off-by: Liu Hua <weldonliu@tencent.com> Signed-off-by: Andrei Vagin <avagin@gmail.com>	2024-09-11 16:02:11 -07:00
Andrei Vagin	b169e3b63d	plugins/cuda: fix crosscompilation Signed-off-by: Andrei Vagin <avagin@gmail.com>	2024-09-11 16:02:11 -07:00
Pratyush Yadav	ca971b7f8b	compel: fix build on Amazon Linux 2 due to missing PTRACE_ARCH_PRCTL Commit fc683cb01 ("compel: shstk: save CET state when CPU supports it") started using PTRACE_ARCH_PRCTL to query shadow stack status. While PTRACE_ARCH_PRCTL has existed in the kernel for a long time, it was only added to glibc in version 2.27. Amazon Linux 2 (AL2) has glibc 2.26, which does not have this definition. As a result, build on AL2 fails with the below error: compel/arch/x86/src/lib/infect.c: In function ‘get_task_xsave’: compel/arch/x86/src/lib/infect.c:276:14: error: ‘PTRACE_ARCH_PRCTL’ undeclared (first use in this function) 276 \| if (ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long)&features, ARCH_SHSTK_STATUS)) { \| ^~~~~~~~~~~~~~~~~ While the definition is present on the system via the kernel headers (in asm/ptrace-abi.h) which can be reached by including linux/ptrace.h, the comment in compel/include/uapi/ptrace.h says: We'd want to include both sys/ptrace.h and linux/ptrace.h, hoping that most definitions come from either one or another. Alas, on Alpine/musl both files declare struct ptrace_peeksiginfo_args, so there is no way they can be used together. Let's rely on libc one. Since including linux/ptrace.h is not an option, define PTRACE_ARCH_PRCTL if it doesn't already exist. An interesting point to note is that in sys/ptrace.h, PTRACE_ARCH_PRCTL is an enum value so the preprocessor doesn't know about it. PT_ARCH_PRCTL is the preprocessor symbol that matches the value of PTRACE_ARCH_PRCTL. So look for PT_ARCH_PRCTL to decide if PTRACE_ARCH_PRCTL is available or not. Another interesting point to note is that AL2 ships with GCC 7 by default, which does not support the -mshstk option, causing other build failures. Luckily, it also ships GCC 10 which does have the option. Using GCC 10 lets the build succeed. Fixes: fc683cb01 ("compel: shstk: save CET state when CPU supports it") Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>	2024-09-11 16:02:11 -07:00
Jesus Ramos	bf417dd050	criu/plugin: Add NVIDIA CUDA plugin Add support for the NVIDIA cuda-checkpoint utility. This requires an r555 or higher driver along with the cuda-checkpoint binary. Signed-off-by: Jesus Ramos <jeramos@nvidia.com>	2024-09-11 16:02:11 -07:00
Jesus Ramos	5f486d5aee	criu/plugin: Introduce new plugin hooks PAUSE_DEVICES and CHECKPOINT_DEVICES to be used during pstree collection PAUSE_DEVICES is called before a process is frozen and is used by the CUDA plugin to place the process in a state that's ready to be checkpointed and quiesce any pending work. CHECKPOINT_DEVICES is called after all processes in the tree have been frozen and PAUSE'd and performs the actual checkpointing operation for CUDA applications. Signed-off-by: Jesus Ramos <jeramos@nvidia.com>	2024-09-11 16:02:11 -07:00
Jesus Ramos	1012e542e5	criu: Restore rseq_cs state slightly earlier in the restore sequence and run the plugin finalizer later in the dump sequence Restore rseq_cs state before calling RESUME_DEVICES_LATE, as the CUDA plugin will temporarily unfreeze a thread during the plugin hook to assist with device restore. Run the plugin finalizer later in the dump sequence, since the finalizer is used by the CUDA plugin to handle some process cleanup. Signed-off-by: Jesus Ramos <jeramos@nvidia.com>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	7ac4537069	readme: update link to FAQ page The current link opens a page with the following text: The MediaWiki FAQ can be found at: https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:FAQ Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	4f15fe8c59	make: improve check for externally managed Python Move PYTHON_EXTERNALLY_MANAGED and PIP_BREAK_SYSTEM_PACKAGES into Makefile.install to avoid code duplication. In addition, add PIPFLAGS variable to enable specifying pip options during installation. This is particularly useful for packaging, where it is common for `pip install` to run in an environment with pre-installed dependencies and without internet access. In such an environment, we need to specify the following options: --no-build-isolation --no-index --no-deps Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Adrian Reber	fdf546dbd5	ci: upgrade to Fedora 40 Vagrant images (38 is EOL) Signed-off-by: Adrian Reber <areber@redhat.com>	2024-09-11 16:02:11 -07:00
Bhavik Sachdev	f171649264	test/dump-crash: check code path when dump crashes Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>	2024-09-11 16:02:11 -07:00
Bhavik Sachdev	a252a240c3	zdtm: Distinguish between fail and crash of dump Adds an exit_signal static method to criu_cli, criu_config and criu_rpc used to detect a crash. Fixes: #350 Signed-off-by: Bhavik Sachdev <b.sachdev1904@gmail.com>	2024-09-11 16:02:11 -07:00
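One way to tell a failed run from a crashed one is to look at how the child terminated. The sketch below assumes the return code follows the Python `subprocess` convention (a negative value `-N` means the child was killed by signal `N`). The `classify` helper is hypothetical and is not the `exit_signal` method added by this commit, whose signature is not shown here.

```python
import signal


def classify(returncode):
    """Map a subprocess-style return code to 'ok', 'fail' or 'crash'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # Terminated by a signal, e.g. SIGSEGV or SIGABRT.
        return "crash"
    return "fail"


print(classify(1))                # fail
print(classify(-signal.SIGSEGV))  # crash
```

A test runner can then report a crash separately instead of folding it into ordinary dump failures.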
Adrian Reber	6feb57a840	ci: remove CentOS Stream 8 test (EOL) Signed-off-by: Adrian Reber <areber@redhat.com>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	1da29f27f6	zdtm: add support for LD_PRELOAD tests This commit adds a `--preload-libfault` option to ZDTM's run command. This option runs CRIU with LD_PRELOAD to intercept libc functions such as pread(). This method makes it possible to simulate special cases, for example, when a successful call to pread() transfers fewer bytes than requested. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Andrei Vagin	e7276cf63b	pagemap-cache: handle short reads It is possible for pread() to return fewer bytes than requested. In such a case, we need to repeat the read operation with the appropriate offset. Signed-off-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
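The retry loop for short reads can be sketched with `os.pread()`. This illustrates the technique only; `pread_full` is a hypothetical helper, and the actual fix is in CRIU's C pagemap-cache code.

```python
import os
import tempfile


def pread_full(fd, length, offset):
    """Read `length` bytes at `offset`, retrying after short reads.

    Returns fewer bytes only if end of file is reached.
    """
    chunks = []
    while length > 0:
        data = os.pread(fd, length, offset)
        if not data:  # end of file
            break
        chunks.append(data)
        # Continue right after the bytes that were actually transferred.
        offset += len(data)
        length -= len(data)
    return b"".join(chunks)


with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()
    result = pread_full(f.fileno(), 4, 3)
print(result)  # b'3456'
```

Advancing both the offset and the remaining length by the number of bytes returned is what keeps the retries from re-reading or skipping data.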
Andrei Vagin	cc88b1e1ff	net: Fix TOCTOU race condition in unix_conf_op The unix_conf_op function reads the size of the sysctl entry array twice. gcc thinks that it can lead to a time-of-check to time-of-use (TOCTOU) race condition if the array size changes between the two reads. Fixes #2398 Signed-off-by: Andrei Vagin <avagin@gmail.com>	2024-09-11 16:02:11 -07:00
Alexander Mikhalitsyn	457bc6a8ff	criu: use proper format specifiers to accommodate time_t 64-bit change See also: https://wiki.debian.org/ReleaseGoals/64bit-time Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>	2024-09-11 16:02:11 -07:00
Arnav Bhatt	95f66d13db	criu: move sigact dump/restore code into sigact.c Separate sigact dump/restore code from cr-restore.c and parasite-syscall.c into sigact.c Signed-off-by: Arnav Bhatt <arnav@ghativega.in>	2024-09-11 16:02:11 -07:00
Adrian Reber	9c8a6927aa	ci: update check for SELinux The rawhide tests run in a container. Containers always have SELinux disabled from the inside. Somehow /sys/fs/selinux is now mounted. We used the existence of that directory to check whether SELinux is available. This seems to be no longer valid. Signed-off-by: Adrian Reber <areber@redhat.com> Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	b3c3422cd9	test/make: remove unused target A fault-injection test was introduced in commit [1] and later removed in commit [2]. This patch removes the obsolete Makefile target. [1] b95407e264fcf58f4f73f78abef6dac60436e7dd test: check, that parasite can rollback itself (v2) [2] 2cb4532e266d0c9f8e87839d5b5eb728a3e4d10d tests: remove zdtm.sh (v2) Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Radostin Stoyanov	30aa8dbe4d	mount: fix unbounded write Replace sprintf() with snprintf() and specify maximum length of characters to avoid potential overflow. Reported-by: GitHub CodeQL (https://codeql.github.com/) Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2024-09-11 16:02:11 -07:00
Juntong Deng	708f872a6d	sk-tcp: Add test cases for TCP_CORK and TCP_NODELAY socket options Currently there are no socket option test cases for TCP_CORK and TCP_NODELAY, this commit adds related test cases. The socket option test cases for TCP_KEEPCNT, TCP_KEEPIDLE, and TCP_KEEPINTVL already exist in socket-tcp_keepalive.c, so they are not included in this test case. Signed-off-by: Juntong Deng <juntong.deng@outlook.com>	2024-09-11 16:02:11 -07:00

11457 Commits