mir/criu - criu - Mike's Git repositories

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 13:58:34 +00:00

Author	SHA1	Message	Date
Pavel Tikhomirov	5cd7092fda	sk-unix: make add_fake_unix_queuers earier and rework find_queuer_for Before this patch, if we had a unixsk with incoming scm packets (with fds) and with the sender side fd closed, we got an error: Error (criu/sk-unix.c:1125): unix: Can't find sender for 0x1e First part of the problem is that unix_note_scm_rights() expects to see a "queuer" which would send scm packets to the unixsk, and there is none, as the sender side is closed. Second part of the problem is that we already have the "fake" queuers feature, which creates a unix socket pair and leaves the other end open for later queuing of packets. But function add_fake_unix_queuers() is called after unix_note_scm_rights(), thus there is no chance to find a queuer at the point of failure. Third part is that when we look for a queuer in find_queuer_for() we actually look for a socket for which we are a queuer and not for the socket which is a queuer for us, which is opposite to the name. For cases where both ends are alive both are queuers for each other so this was not important, but for our closed sender case it breaks. So let's reorder add_fake_unix_queuers() before unix_note_scm_rights() and make find_queuer_for() actually do what its name implies. This situation started to reproduce on Virtuozzo start/stop tests with the unixsk belonging to systemd; we suppose that this state where the sender fd side is closed happens rarely, only on systemd start/stop, so we don't see it in regular suspend/resume of long-living containers. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2023-04-15 21:17:21 -07:00
Ashutosh Mehra	28358db13b	Fix the check for mnt namespace in criu-ns criu-ns script incorrectly compares the pidns fd with mntns fd. Also reversed the condition in is_my_namespace function to align it with the function name. Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>	2023-04-15 21:17:21 -07:00
Pavel Tikhomirov	295dc85ca0	github: use git-clang-format instead of make indent This allows us to detect bad formatting only in PR changes rather than in the whole CRIU codebase. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2023-04-15 21:17:21 -07:00
Alexander Mikhalitsyn	ced4ab4b0a	zdtm: skip zdtm/static/shm-hugetlb when hugetlb is not supported Reported-by: Mr. Jenkins (ppc64le) Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2023-04-15 21:17:21 -07:00
Bui Quang Minh	c830643d86	Revert "ci: skip new hugetlb maps09/maps10 tests for pre-dump" This reverts commit `37ea8c5fcf`. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>	2023-04-15 21:17:21 -07:00
Bui Quang Minh	b26e1fdbf7	mem: Skip pre-dumping on hugetlb mappings As private hugetlb mappings are not pre-mapped, their content is restored in the restorer, which cannot use page_read->read_pages. As a result, we cannot recursively read the content of the pre-dumped image in the parent directory and instead use preadv to read the content from the last dumped image only. Therefore, restore may freeze when the content of a mapping is in a pre-dumped image in the parent directory. We need to skip pre-dumping on hugetlb mappings to resolve the issue. Suggested-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>	2023-04-15 21:17:21 -07:00
Pavel Tikhomirov	9066f87417	cr-dump: do not report success to logs if post-dump script failed It can be confusing to see an error from the post-dump action script and a non-zero return from criu while at the same time seeing "Dumping finished successfully" in the log. I believe it is logical to consider the post-dump action script a part of the "dump" process, so a failure in it means that the whole dump failed. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2023-04-15 21:17:21 -07:00
Adrian Reber	d46f40f4ff	criu: Version 3.17.1 * Fixes for pre-dump read mode * Fixes for mount-v2 * amdgpu plugin build and installation fixes * Some minor CI related fixes Signed-off-by: Adrian Reber <areber@redhat.com> v3.17.1	2022-06-23 14:53:25 -07:00
Radostin Stoyanov	46ec6749fa	ci: Fix code indent This patch contains auto-generated changes from `make indent` Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-06-22 10:20:33 -07:00
Pavel Tikhomirov	f29d51560e	zdtm: add mnt_root_ext test This test has one external mount [criumntns] /zdtm_root_ext.tmp -> [testmntns] /mnt_root_ext.test, and it specifically gives '--external mnt[MNT]:.zdtm_root_ext.tmp' option on restore without '/' to make dirname on it return static '.' path (see glibc dirname() code) and reproduce a segfault in resolve_mountpoint(). Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2022-06-22 10:20:33 -07:00
Pavel Tikhomirov	8a18faea09	util/mount-v2: fix resolve_mountpoint() to always return freeable pointer Else we have a Segmentation fault in __move_mount_set_group() on xfree(source_mp) if resolve_mountpoint() returned statically allocated path. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2022-06-22 10:20:33 -07:00
Pavel Tikhomirov	4cc8a18f3b	zdtm: test multiple ext bindmounts with no common root and same master Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2022-06-22 10:20:33 -07:00
Pavel Tikhomirov	229c5df5ce	mount-v2: workaround for multiple external bindmounts with no common root It's a problem when while restoring sharing group we need to copy sharing between two mounts with non-intersecting roots, because kernel does not allow it. We have a case https://github.com/opencontainers/runc/pull/3442, where runc adds different devtmpfs file-bindmounts to container and there is no fsroot mount in container for this devtmpfs, thus mount-v2 faces the above problem. Luckily for the case of external mounts which are in one sharing group and which have non-intersecting roots, these mounts likely only have external master with no sharing, so we can just copy sharing from external source and make it slave as a workaround. https://github.com/checkpoint-restore/criu/issues/1886 Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2022-06-22 10:20:33 -07:00
Pavel Tikhomirov	f8c9e07e4f	mount-v2: split out restore_one_sharing helper This helper restores master_id and shared_id of first mount in the sharing group. It first copies sharing from either external source or internal parent sharing group and makes master_id from shared_id. Next it creates new shared_id when needed. All other mounts except first are just copied from the first one. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2022-06-22 10:20:33 -07:00
Radostin Stoyanov	a90a1d4827	amdgpu: Set PLUGINDIR to /usr/lib/criu Building the criu packages for Ubuntu/Debian fails with: mkdir: cannot create directory '/var/lib/criu': Permission denied This patch updates PLUGINDIR with the value /usr/lib/criu Fixes: #1877 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-06-22 10:20:33 -07:00
Radostin Stoyanov	e6f292cb38	amdgpu/Makefile: Fix include path When building packages for CRIU the source directory might have a name different than 'criu'. Fixes: #1877 Reported-by: @siris Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-06-22 10:20:33 -07:00
Andrei Vagin	6507ae5331	ci: test the read mode of pre-dump Signed-off-by: Andrei Vagin <avagin@gmail.com>	2022-06-22 10:20:33 -07:00
Andrei Vagin	f43dae720a	page-xfer: refactoring analyze_iov and fill_userbuf * handle unexpected errors of process_vm_readv * adjust riovs in analyze_iov * call handle_faulty_iov only if process_vm_readv returns EFAULT. Signed-off-by: Andrei Vagin <avagin@gmail.com>	2022-06-22 10:20:33 -07:00
Andrei Vagin	efeedf3912	pre-dump: call vmsplice with SPLICE_F_GIFT In this case, vmsplice attaches pages without copying them. Signed-off-by: Andrei Vagin <avagin@gmail.com>	2022-06-22 10:20:33 -07:00
Andrei Vagin	b2bfb7745d	page-xfer: adjust a buffer to a pipe size Due to side effects of F_SETPIPE_SZ, the actual pipe size can be greater than PIPE_MAX_SIZE. Signed-off-by: Andrei Vagin <avagin@gmail.com>	2022-06-22 10:20:33 -07:00
Andrei Vagin	0df0a7dace	page-xfer: use negative values for error codes Signed-off-by: Andrei Vagin <avagin@gmail.com>	2022-06-22 10:20:33 -07:00
Andrei Vagin	51533d98ac	page-pipe: fix limiting a pipe size But actually, `5a92f100b8` probably has to be reverted as a whole. PIPE_MAX_SIZE is the hard limit to avoid PAGE_ALLOC_COSTLY_ORDER allocations in the kernel. But F_SETPIPE_SZ rounds up a requested pipe size to a power-of-2 pages. It means that when we request PIPE_MAX_SIZE that isn't a power-of-2 number, we actually request a pipe size greater than PIPE_MAX_SIZE. Fixes: `5a92f100b8` ("page-pipe: Resize up to PIPE_MAX_SIZE") Signed-off-by: Andrei Vagin <avagin@gmail.com>	2022-06-22 10:20:33 -07:00
Radostin Stoyanov	ff92731690	crit: Use same version as criu Name collision with an abandoned project named 'crit' in pypi causes pip to show crit (CRiu Image Tool) as outdated. This patch updates crit to use the same version and license as criu. Fixes #1878 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-06-22 10:20:33 -07:00
Radostin Stoyanov	f522adec4a	ci: Fix unsafe repository error Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-06-22 10:20:33 -07:00
Adrian Reber	4f8f295e57	criu: Version 3.17 Amongst a huge number of fixes all over the place this release introduces: * mount-v2 engine * support for MAP_HUGETLB mappings * support for Linux Restartable Sequences * support for SOCK_SEQPACKET unix sockets * CRIU AMD GPU plugin * setsockopt(SO_BUF_LOCK) support for tcp sockets Signed-off-by: Adrian Reber <areber@redhat.com> v3.17	2022-05-05 12:42:14 -07:00
Alexander Mikhalitsyn	991f27c841	ci: skip new hugetlb maps09/maps10 tests for pre-dump This commit has to be reverted once we fix the issue. Issue: https://github.com/checkpoint-restore/criu/issues/1868 Reported-by: Mr. Jenkins Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-05-05 12:42:14 -07:00
Alexander Mikhalitsyn	0c1f0256ff	kerndat: handle the case when hugetlb isn't supported Currently we check memfd_hugetlb by doing memfd_create("", MFD_HUGETLB). If we see EINVAL we report that it's not supported, but we can also get ENOENT error in such case in hugetlb_file_setup() while trying to find proper hugetlbfs mount. Reference: https://github.com/torvalds/linux/blob/06fb4ecfeac/fs/hugetlbfs/inode.c#L1465 Fixes: `4245e6b02f` ("check: Add a check for using memfd with hugetlb") Reported-by: Mr. Jenkins (ppc64le) Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-05-05 12:42:14 -07:00
Alexander Mikhalitsyn	17a19676cd	zdtm: handle the case when hugetlb isn't supported Fixes: `e2e02bc83e` ("zdtm: Add MAP_HUGETLB memory mapping test") Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-05-05 12:42:14 -07:00
Alexander Mikhalitsyn	c1380c077a	ci: workaround race between sit module loading and bridge test https://github.com/checkpoint-restore/criu/issues/1866 Suggested-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-05-05 12:42:14 -07:00
Alexander Mikhalitsyn	550eafc5d8	ci: print kernel modules list Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-05-05 12:42:14 -07:00
Adrian Reber	f635b61f49	test: install criu in /usr GitHub Actions comes with pre-installed criu in /usr. configure scripts looking for CRIU will pickup the pre-installed version in /usr if we do not install CI criu also in /usr. Signed-off-by: Adrian Reber <areber@redhat.com>	2022-05-05 12:42:14 -07:00
Radostin Stoyanov	2f0f128396	readme: Add badge links to workflows This commit adds a link to the workflow runs for each badge. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-04-28 17:53:52 -07:00
Andrey Zhadchenko	d14dbb8c74	sk-unix: rework bind_on_deleted() return codes The bind_on_deleted() return code is only used for setting errno for pr_perror(). This is mostly useless since a lot of syscalls already set it. All non-syscall errors already have prints in case of failure. Fix bind_on_deleted() always returning 0 and simplify error juggling to returning -1 in case of errors. Fixes: #1771 Fixes: `d0308e5ecc` ("sk-unix: make criu respect existing files while restoring ghost unix socket fd") Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Radostin Stoyanov	5b872c7183	proc_parse: Fix parsing bpf map_extra The map_extra field has been introduced in Linux Kernel release 5.16 and does not exist in older kernel versions. The current parsing implementation fails when map_extra is missing. In particular, it tries to parse the `memlock` field as `map_extra` and fails but it does not exit with an error because map_extra is marked as "optional". It then tries to parse the `map_id` field as `memlock` and fails with an error because map_id is not optional: Error (criu/proc_parse.c:2161): parse_fdinfo_pid_s: error parsing [map_type:\t2] for 19: Success' To correctly handle this, we should try to parse again the next field when parsing of `map_extra` fails, without reading the next line from the bpfmap. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-04-28 17:53:52 -07:00
Radostin Stoyanov	d40b332cef	bpf: update deprecated API bpf_create_map_xattr() has been replaced with bpf_map_create() https://github.com/libbpf/libbpf/commit/6cfb97c DECLARE_LIBBPF_OPTS has been renamed to LIBBPF_OPTS https://github.com/libbpf/libbpf/commit/ea6c242 Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	f641e0c4ba	ci: print mountinfo instead of mount cmd output mountinfo contains more info than just "mount" output Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	5c0b4fbcda	ci: criu-fault: skip inotify_irmap fault-injection on btrfs It looks like we've got broken fhandles from fdinfo for inotifies/fanotifies for btrfs. I will look into that. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	7ac85cab86	scripts/ci: fix ZDTM_OPTS variable passing We have a separate target for alpine in script/ci/Makefile which defines some extra opts for zdtm using ZDTM_OPTIONS variable. But really it doesn't work. First of all, variable should be named as ZDTM_OPTS and also we have to specify it directly in the CONTAINER_RUNTIME cmdline to make it work. I've also changed variable value just to make it consistent with docker.env value which was really used. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	ead227994b	zdtm: temporary disable rseq02 test That's strange but rseq02 test fails with: 09:06:16.222: 51: exit 555f52082120 555f52082120 09:06:16.282: 51: exit 555f52082120 555f52082120 09:06:16.340: 51: exit 555f52082120 555f52082120 09:06:16.397: 51: exit 555f52082120 555f52082120 09:06:16.503: 51: exit 0 555f52082120 09:06:16.503: 51: FAIL: rseq02.c:235: Failed to increment per-cpu counter (errno = 2 (No such file or directory)) 09:06:16.503: 51: FAIL: rseq02.c:246: (errno = 16 (Device or resource busy)) It means that the rseq_cs pointer was cleaned up by the kernel despite the NO_RESTART* flags. That's hardly reproducible; I will investigate it. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	db9ec13616	zdtm: add rseq02 transition test with NO_RESTART CS flag Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	1e0bed3d69	rseq: handle rseq/rseq_cs flags properly Userspace may configure the rseq cs abort policy by setting RSEQ_CS_FLAG_NO_RESTART_ON_* flags. In ("cr-dump: fixup thread IP when inside rseq cs") we added support for the case when a process was caught by CRIU during rseq cs execution by fixing up the IP to abort_ip. That's the common case, but there is a special flag called RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL; in this case we have to leave the process IP as it was before CRIU seized it. Unfortunately, that's not all that we need here. We also must preserve the (struct rseq)->rseq_cs field. You may ask "why do we need to preserve it by hand? CRIU dumps all process memory and restores it". That's true, but it is not so easy. The problem here is that the kernel clears this field when it realizes that the process has left the rseq cs. But during the dump/restore procedures we execute the parasite/restorer from the process context. It means that the process will leave the rseq cs in any case and (struct rseq)->rseq_cs will be cleared by the kernel. So we need to restore this field by hand at the last stage of restore, just before releasing the processes. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	13338dee5c	Revert "test: disable rseq also on Archlinux" This reverts commit `f008f74041`. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	064e9925a0	zdtm: add transition/rseq01 test for amd64 Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	2d3354e7b6	cr-dump: fixup thread IP when inside rseq cs If we caught the process when it's inside an rseq critical section we have to handle it properly. From the kernel's point of view, if the process is executing inside the rseq cs and gets a signal, rseq critical section execution will be interrupted and, after signal handler execution, we will proceed to the rseq cs abort handler instead of continuing normal rseq cs execution (if RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL isn't set). When CRIU seizes processes, that's the same thing as getting a signal from the rseq point of view. So we need to fix up the instruction pointer to the rseq cs abort handler address. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	4c7ece0bb7	compel: add helpers to get/set instruction pointer Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	441310c260	zdtm/static/rseq00: fix rseq test when linking with a fresh Glibc Fresh Glibc does rseq() register by default. We need to unregister rseq before registering our own. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	f70ddab24e	pie/restorer: unregister (g)libc rseq before memory restoration Fresh glibc does rseq registration by default during start_thread(). [ see https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=95e114a0919d844d8fe07839cb6538b7f5ee920e ] This causes process crashes during the memory restore procedure, because the memory which corresponds to the struct rseq will be unmapped and overridden in __export_restore_task. Let's perform rseq unregistration just before unmap_old_vmas(). To achieve that we first need to determine the (struct rseq) address while we are in Glibc (we do that in prep_libc_rseq_info using Glibc exported symbols). See also ("nptl: Add public rseq symbols and <sys/rseq.h>") https://sourceware.org/git?p=glibc.git;a=commit;h=c901c3e764d7c7079f006b4e21e877d5036eb4f5 ("nptl: Add <thread_pointer.h> for defining __thread_pointer") https://sourceware.org/git?p=glibc.git;a=commit;h=8dbeb0561eeb876f557ac9eef5721912ec074ea5 TODO: do the same for musl-libc if it starts to register rseq by default Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	e1799e5305	include: add thread_pointer.h from Glibc Let's take the thread_pointer() implementation from Glibc. It will be useful later because Glibc stores struct rseq in the TLS. The absolute address can be calculated as __criu_thread_pointer() + __rseq_offset. __rseq_offset is an exported symbol from Glibc itself. We need to be able to determine where struct rseq is stored to unregister it in CRIU during the restore stage. For a different libc like musl-libc we will have to handle rseq separately, depending on how struct rseq is stored. Right now that's not a problem because musl-libc has no rseq support, so we don't need to unregister it. https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=8dbeb0561eeb876f557ac9eef5721912ec074ea5 https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=cb976fba4c51ede7bf8cee5035888527c308dfbc Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	267c1fdade	ci: add Fedora Rawhide based test on Cirrus We have the ability to use nested virtualization on Cirrus, and already have the "Vagrant Fedora based test (no VDSO)" test; let's do the same for Fedora Rawhide to get a fresh kernel. Suggested-by: Adrian Reber <areber@redhat.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn	03aff7e823	Revert "ci: disable glibc rseq support" Let's see how rseq() C/R feature works This reverts commit `d99def7dcf`. Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>	2022-04-28 17:53:52 -07:00
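
The rounding problem described in commit 51533d98ac ("page-pipe: fix limiting a pipe size") can be illustrated with a little arithmetic. This is a minimal Python sketch, not CRIU or kernel code: the function names and the 819-page limit are hypothetical, and the only behaviour assumed is what the commit message states, namely that F_SETPIPE_SZ rounds a request up to a power-of-2 number of pages.

```python
def round_up_pow2_pages(requested_pages: int) -> int:
    """Model of the rounding the commit message attributes to F_SETPIPE_SZ:
    the smallest power of two that is >= requested_pages (at least 1)."""
    if requested_pages <= 1:
        return 1
    return 1 << (requested_pages - 1).bit_length()


def largest_pow2_at_most(max_pages: int) -> int:
    """Largest power-of-two page count that does not exceed max_pages.
    Requesting this value is one way to stay under a hard limit."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return 1 << (max_pages.bit_length() - 1)


# Hypothetical hard limit that is not a power of two.
HYPOTHETICAL_MAX_PAGES = 819

# Requesting the limit itself overshoots it after rounding.
overshoot = round_up_pow2_pages(HYPOTHETICAL_MAX_PAGES)

# Requesting the largest power of two below the limit does not.
safe = round_up_pow2_pages(largest_pow2_at_most(HYPOTHETICAL_MAX_PAGES))
```

With these made-up numbers, `overshoot` is 1024 pages (above the limit) while `safe` is 512 pages, which is the effect the commit message warns about when the limit is not a power of two.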

11541 Commits
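
The parsing fix described in commit 5b872c7183 ("proc_parse: Fix parsing bpf map_extra") comes down to one rule: when an optional field is missing, retry the same input line against the next expected field instead of reading a new line. The sketch below shows that rule in Python under stated assumptions. It is not the code in criu/proc_parse.c; the function name, the `SPEC` list and the sample values are hypothetical, and only the field order map_extra, memlock, map_id is taken from the commit message.

```python
def parse_fdinfo_fields(lines, spec):
    """Parse 'name:<tab>value' lines against an ordered field spec.

    spec is a list of (field_name, optional) pairs in the expected order.
    A missing optional field is skipped WITHOUT consuming the current line,
    so that line is tried again against the next field in the spec.
    """
    parsed = {}
    pending = iter(lines)
    line = next(pending, None)
    for name, optional in spec:
        key = line.split(":", 1)[0] if line is not None else None
        if key == name:
            parsed[name] = line.split(":", 1)[1].strip()
            line = next(pending, None)  # consume only after a successful match
        elif not optional:
            raise ValueError(f"expected field {name!r}, got {line!r}")
        # Optional and absent: keep the current line for the next field.
    return parsed


# Hypothetical spec: map_extra exists only on newer kernels.
SPEC = [("map_extra", True), ("memlock", False), ("map_id", False)]

# Older kernel: no map_extra line at all.
old_kernel = parse_fdinfo_fields(["memlock:\t4096", "map_id:\t7"], SPEC)

# Newer kernel: map_extra is present and parsed like any other field.
new_kernel = parse_fdinfo_fields(
    ["map_extra:\t0x0", "memlock:\t4096", "map_id:\t7"], SPEC
)
```

Both inputs parse without error; only a missing required field such as `memlock` raises `ValueError`.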