mir/criu - criu - Mike's Git repositories

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 05:48:05 +00:00

Author	SHA1	Message	Date
Cyrill Gorcunov	fa0587ed81	restore: Use min_t helper for type casting On arm \| CC crtools.o \| In file included from arch/arm/include/asm/bitops.h:4:0, \| from arch/arm/include/asm/types.h:9, \| from include/proc_parse.h:5, \| from include/ptrace.h:8, \| from cr-restore.c:27: \| cr-restore.c: In function 'restore_priv_vma_content': \| include/compiler.h:60:17: error: comparison of distinct pointer types lacks a cast [-Werror] \| (void) (&_min1 == &_min2); \ \| Reported-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-11-12 14:57:05 +03:00
Cyrill Gorcunov	c63a42ac2f	restore: Use bitmap_set helper Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-11-12 11:15:13 +03:00
Pavel Emelyanov	03b217c0a6	restore: Restore as many pages at once as possible When the VMA being restored is not COW-ed we read pages from images one-by-one which results in suboptimal pages.img access. Fix this by reading as many pages from the image at once as possible within the active pagemap and VMA. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-11-12 11:14:44 +03:00
Pavel Emelyanov	780d699401	page-read: Teach page-read to read multiple pages at once This is preparatory patch, the problem to solve is described in the next one. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-11-12 11:14:43 +03:00
Tycho Andersen	934c312554	rst: unmap restore memory after seccomp restore In order to restore seccomp filters, we need to have access to dynamically allocated memory from the restorer blob, so we should unmap this memory afterwards. In order to do this, we need to suspend seccomp earlier, right after we attach to the tasks instead of just before we do the unmap of the restorer blob itself. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-11-11 15:57:26 +03:00
Pavel Emelyanov	6f0681c1b1	Revert "rst: Re-use opened fd when restoring private mappings" This reverts commit 73cb87f9182bf46fceacde1e9023d8d5cdf99de6. Two reasons: individual VMA-s may require different open flags and ghost and link-remap files should be properly unlinked at the end of open_path(). Need some more intelligent solution to this.	2015-11-10 17:20:55 +03:00
Pavel Emelyanov	73cb87f918	rst: Re-use opened fd when restoring private mappings On restore we do a sequence of open+mmap+close steps. On real apps there exists chains of private file mappings for the same file with different pgoffs and/or flags/prots. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-11-10 15:59:28 +03:00
Pavel Emelyanov	68baf8e77d	criu: Fault injection core This patch(set) is inspired by a similar one from Andrey Vagin sent some time earlier. The major idea is to artificially fail criu dump or restore at specific places and let zdtm tests check whether failed dump or restore resulted in anything bad. This particular patch introduces the ability to tell criu "fail at X point". Each point is specified with an integer constant and with the next patches there will appear places over the code checking for specific fail code being set and failing. Two points are introduced -- early on dump, right after loading the parasite and right after creation of the root task. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-19 12:42:29 +03:00
Kir Kolyshkin	4456273738	restore_root_task(): fix calling try_clean_remaps 1. As pointed out by Coverity (CID 114629), mnt_ns_fd is closed, but then the function calls try_clean_remaps(mnt_ns_fd) which tries to close the file descriptor which is already closed. To address this, let's use safe_close() which sets closed fd to -1. As it also checks its argument, there's no need for explicit check so let's remove "if" check before close(). 2. As Pavel pointed out, "calling the whole try_clean_remaps() is not required once we've passed the cleanup_mnt_ns() point". This could be addressed by introducing yet another label, but it's cleaner to just use a flag variable. Note that since the second issue is being addressed, the first one goes away, but let's keep the fix for it anyway, it might help in the future. Signed-off-by: Kir Kolyshkin <kir@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-19 12:33:58 +03:00
Andrew Vagin	14da0f780e	restore: wait while processes are dying If criu restore failed, criu should wait for all processes because they hold files, namespaces and other stuff that caller might want to have released (in our case it was ploop device). Here we do this only for cases when processes are restored in a pid namespace. We'd like to do the same for non-ns case, but there's no simple way to wait for a bunch of unconnected processes. Another good side effect is that "Restoring FAILED." will be printed at the end of the log (now after we kill init tasks still have time to do smth and write log messages). Cc: Nikita Spiridonov <nspiridonov@odin.com> Reported-by: Nikita Spiridonov <nspiridonov@odin.com> Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-14 15:49:25 +03:00
Matthew Krafczyk	29c08d8672	Add pre-dump and pre-restore action scripts This allows the user to perform actions before dumping or restoration occurs. Signed-off-by: Matthew Krafczyk <krafczyk.matthew@gmail.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-09 18:23:41 +03:00
Andrew Vagin	d9b1b9ff37	restore: fix checking error code of sys_sigaction sys_sigaction() returns an error code Reported-by: Kir Kolyshkin <kir@openvz.org> Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-10-08 13:19:33 +03:00
Cyrill Gorcunov	0516001f91	restore: Report error if write into last pid failed Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-30 12:32:03 +03:00
Pavel Emelyanov	efa7dcf7c2	ghost: Remove ghost files if restore fails Issue #18. When restore fails ghost files remain there. And to remove them we have to know their list, paths to original files (to construct the ghost name) and the namespace ghost lives in. For the latter we keep the restore task namespace at hands till the final stage and setns into it to kill ghosts. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 22:00:37 +03:00
Pavel Emelyanov	b0e23c3d4f	files: Collect ghosts and regfiles early Info about ghosts presence and paths will be needed to remove the ghosts themselves and thus are needed in criu. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 22:00:35 +03:00
Pavel Emelyanov	7ca6cc1eb2	mnt: Clean roots yard from criu process So here it is. If root task dies on restore the roots yard dir remains unrmdired :( Since we already know its name, we can remove one from criu. By the time we get to this place the sub mount namespace(s) are already dead and yard dir is empty. But umounting should be done by tasks after successful restore, so keep depopulation there. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:57:35 +03:00
Pavel Emelyanov	3e7c92ed02	mnt: Renames around roots yard Same thing as in previous patch -- we have too many generic clean_ and fini_ prefixes over the code. And we need more (see next patch), so let's specify what exactly we clean or fini. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:57:21 +03:00
Pavel Emelyanov	e3f5ba3c37	ns: Prepare namespaces before tasks There's already two things we do in criu namespaces before forking the init task (start unsd and keep netnsfd for back reference). Next patches will introduce the 3rd action for mount namespaces, so have a special pre-call for all this stuff. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-09-28 21:56:26 +03:00
Christopher Covington	f9ae6d9dd4	Replace remaining hard-coded TASK_SIZE use If we want one CRIU binary to work across all AArch64 kernel configurations, a single task size value cannot be hard coded. While trivial applications successfully checkpoint and restore on AArch64 kernels with CONFIG_ARM64_64K_PAGES=y without this patch, replacing the remaining use of the hard-coded value seems like the best way to guard against failures that more complex process trees and future uses may expose. Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:14:19 +03:00
Christopher Covington	1438f013a2	Pass task_size to vma_area_is_private() If we want one CRIU binary to work across all AArch64 kernel configurations, a single task size value cannot be hard coded. Since vma_area_is_private() is used by both restorer blob code and non restorer blob code, which must use different variables for recording the task size, make task_size a function argument and modify the call sites accordingly. This fixes the following error on AArch64 kernels with CONFIG_ARM64_64K_PAGES=y. pie: Error (pie/restorer.c:929): Can't restore 0x3ffb7e70000 mapping w> pie: ith 0xfffffffffffffff7 Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:14:18 +03:00
Christopher Covington	7451fc7d23	restorer: Replace most hard-coded TASK_SIZE use If we want one CRIU binary to work across all AArch64 kernel configurations, a single task size value cannot be hard coded. This fixes the following error on AArch64 kernels with CONFIG_ARM64_64K_PAGES=y. pie: Error (pie/restorer.c:772): Unable to unmap (-): -1211695104 Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:14:17 +03:00
Andrew Vagin	f13ec96e58	restore: fix race in calculation of a number of zombies Currently each task subtracts number of zombies from task_entries->nr_threads without locks, so if two tasks will do this operation concurrently, the result may be unpredictable. https://github.com/xemul/criu/issues/13 Cc: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-08-03 17:12:10 +03:00
Christopher Covington	69d008d567	Use run-time page_size() for mremap The old and new address parameters passed to the mremap system call must be page size aligned. On AArch64, the page size can only be correctly determined at run time. This fixes the following errors for CRIU on AArch64 kernels with CONFIG_ARM64_64K_PAGES=y. call mremap(0x3ffb7d50000, 8192, 8192, MAYMOVE \| FIXED, 0x2a000) Error (rst-malloc.c:201): Can't mremap rst mem: Invalid argument call mremap(0x3ffb7d90000, 8192, 8192, MAYMOVE \| FIXED, 0x32000) Error (rst-malloc.c:201): Can't mremap rst mem: Invalid argument Signed-off-by: Christopher Covington <cov@codeaurora.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-28 13:38:30 +03:00
Christopher Covington	b61224bffe	Use run-time page_size() in pie_size() This fixes the following error for CRIU on AArch64 kernels with CONFIG_ARM64_64K_PAGES=y. Error (cr-restore.c:2828): Can't mmap section for restore code This occurred because the address being requested (0x16000 in one case) was not page aligned. Also change the capitalization of the pie_size() macro to make it clear that the value is not necessarily a build-time constant. Signed-off-by: Christopher Covington <cov@codeaurora.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-28 13:38:20 +03:00
Tycho Andersen	5f729636b4	rst: don't hang when SIGCHLD is coalesced When a TASK_HELPER would exit just before a zombie, sometimes the signal would get coalesced, and we would miss the zombie exit, causing us to block forever waiting for the zombie to complete. Let's use an entirely different strategy for waiting on zombies: explicitly wait on them with waitid, and use WNOWAIT to prevent their data from actually being reaped. v2: don't decrement nr_{tasks,threads} in the loop Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Acked-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-23 15:17:55 +03:00
Tycho Andersen	7b20f42f78	rst: move lsm memory allocations before rst_mem_lock() 8ffbe754bd9 moved the rst_mem_lock() call, but didn't move the corresponding LSM allocations, so we do that here. One unfortunate thing is that we have to split this into two steps: first we have to read the creds to figure out exactly how much memory to allocate for the lsm string. Since prepare_creds() wants to write directly to the task_restore_args struct and that can't be allocated until after we lock the restore memory, we break it up into two steps. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-16 14:26:44 +03:00
Andrey Vagin	445dbd9d09	log: don't forget LF for pr_err() Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-16 14:24:13 +03:00
Pavel Emelyanov	f231a11908	rst: Remove actually unused pid arg from restorer_get_vma_hint Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 14:03:00 +03:00
Pavel Emelyanov	6fe296a26e	rst: Remove actually unused pid arg from restore_one_zombie Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 14:02:53 +03:00
Pavel Emelyanov	dc149e884d	rst: Remove actually unused pid arg from prepare_mappings Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 14:02:45 +03:00
Pavel Emelyanov	73e3925bcd	pstree_item: Keep has_seccomp field on rst_info tail Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 14:01:48 +03:00
Pavel Emelyanov	8ffbe754bd	rst: Lock rst memory allocations earlier After we got the total remapable rst memory size, we no longer can allocate from it, otherwise the bootstrap area will not have enough size. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 14:00:27 +03:00
Pavel Emelyanov	d9a9d4c9b3	rst: Fix timerfd rst memory management It's similar to previous patch with tcp mem -- no need to realloc big arrays and then memcpy data between them. It's enough just to walk timerfd objects at the very end. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 13:59:39 +03:00
Pavel Emelyanov	73e303c8e2	rst: Fix rst_tcp_sock memory management In the current scheme we grow an array with realloc()-s then memcpy() the result into rst_mem. I propose to get rid of realloc-s (we already have objects for the data we need to keep) and memcpy-s (and put objects directly into rst_mem at the end). Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 13:59:21 +03:00
Pavel Emelyanov	7166e3c984	rst: Fix helpers memory allocation Calling rst_mem_alloc() in a loop with increasing size causes n^2 memory growth :) since _alloc is not _realloc. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-14 13:59:08 +03:00
Tycho Andersen	209693d49b	don't assume the kernel has CONFIG_SECCOMP linux/seccomp.h may not be available, and the seccomp mode might not be listed in /proc/pid/status, so let's not assume those two things are present. v2: add a seccomp.h with all the constants we use from linux/seccomp.h v3: don't do a compile time check for PTRACE_O_SUSPEND_SECCOMP, just let ptrace return EINVAL for it; also add a checkskip to skip the seccomp_strict test if PTRACE_O_SUSPEND_SECCOMP or linux/seccomp.h aren't present. v4: use criu check --feature instead of checkskip to check whether the kernel supports seccomp_suspend Reported-by: Mr. Jenkins Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Acked-by: Andrew Vagin <avagin@odin.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-07-13 14:50:35 +03:00
Tycho Andersen	e0b24e21d3	creds: fail to dump when creds in thread group don't match Since we don't support dumping per-thread creds, let's at least fail to dump if the creds don't match. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-24 17:38:35 +03:00
Tycho Andersen	0d8aec0c3a	seccomp: add initial support for SECCOMP_MODE_STRICT Unfortunately, SECCOMP_MODE_FILTER is not currently exposed to userspace, so we can't checkpoint that. In any case, this is what we need to do for SECCOMP_MODE_STRICT, so let's do it. This patch works by first disabling seccomp for any processes who are going to have seccomp filters restored, then restoring the process (including the seccomp filters), and finally resuming the seccomp filters before detaching from the process. v2 changes: * update for kernel patch v2 * use protobuf enum for seccomp type * don't parse /proc/pid/status twice v3 changes: * get rid of extra CR_STAGE_SECCOMP_SUSPEND stage * only suspend seccomp in finalize_restore(), just before the unmap * restore the (same) seccomp state in threads too; also add a note about how this is slightly wrong, and that we should at least check for a mismatch Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-24 17:38:32 +03:00
Andrey Vagin	2a0c8db72b	proc: mount proc with minimal permissions Eric wants to restrict permissions for proc mounts in a non-root userns according with proc mounts in the root userns. Author: Eric W. Biederman <ebiederm@xmission.com> Date: Fri May 8 23:49:47 2015 -0500 mnt: Modify fs_fully_visible to deal with locked ro nodev and atime Ignore an existing mount if the locked readonly, nodev or atime attributes are less permissive than the desired attributes of the new mount. ... Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-19 12:20:15 +03:00
Tycho Andersen	081a5b9e77	pie: use the /proc fd for last pid Instead of keeping around multiple fds that point to various places in /proc, let's just use /proc and openat() things relative to it. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-16 12:17:37 +03:00
Tycho Andersen	7083fc370d	lsm: restore lsm bits per tid instead of per pid This is a little tricky, since the threads are forked in the restorer blob, we can't open their attr/current files to pass into the restorer blob. So, we pass in an fd for /proc that the restorer blob can use to access the attr/current files once they exist. N.B. this is still incorrect in that it restores the same credentials for all threads in the group; however, it matches the behavior of the current creds restore code, which also restores the same creds for all threads in the group. v2: use simple_sprintf() instead of pie_strcat() Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-16 12:17:36 +03:00
Cyrill Gorcunov	1998fbfa87	pie: relocs -- Fix compilation on ARM Otherwise getting \| parasite-syscall.c: In function ‘parasite_infect_seized’: \| parasite-syscall.c:1222:5: error: ‘elf_relocs’ undeclared (first use in this function) Simply wrap the @elf_relocs_apply with macros. Reported-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-16 11:40:20 +03:00
Cyrill Gorcunov	ea0fd2aa08	pie: piegen -- Make different names for parasite and restorer relocs Otherwise it's confusing since. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-15 21:15:57 +03:00
Cyrill Gorcunov	46270d11fe	pie: Use PIE_SIZE helper Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-08 23:53:37 +03:00
Cyrill Gorcunov	f03a4672ce	pie: piegen -- Slightly rework the building procedure - Move relocs application into a separate file which get compiled as a regular C file in criu (pie/pie-relocs.[ch]) - Move types used by piegen into pie/piegen/uapi/types.h Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-08 23:53:27 +03:00
Cyrill Gorcunov	ac187856c4	pie: x86 -- Do a real call for applying relocations At moment both parasite and restorer do not have any relocs because we support x86-64 only, but this will be changed soon so do a call and apply relocations. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-08 23:53:26 +03:00
Cyrill Gorcunov	c84fa8c506	pie: x86 -- Adjust size of parasite and restorer code In case of @gotpcrel relocations we need additional space to carry pointers. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-08 23:53:24 +03:00
Pavel Emelyanov	7a9813346b	rst: Sanitize standard arrays remapping On restore we have several arrays of objects that get remapped into pie area and their number is also passed. Clean and shorten the remapping code a bit and bring their naming to a common format. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-08 23:39:27 +03:00
Tycho Andersen	6979838793	ensure SIGCHLD isn't inherited as blocked Use SIG_SETMASK instead of SIG_BLOCKMASK here in case the parent had SIGCHLD blocked. In this case if one of the criu threads has a problem, since the SIGCHLD is blocked, the restore simply hangs. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Acked-by: Andrew Vagin <avagin@virtuozzo.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2015-06-08 23:35:48 +03:00
Pavel Emelyanov	b08f3fae5b	vdso: Reduce the amount of in-code ifdef-s Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Reviewed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>	2015-06-08 23:34:33 +03:00
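
Commit 5f729636b4 above describes waiting on zombies explicitly with waitid and WNOWAIT so that their exit data is not reaped. The following is a minimal sketch of that general technique only, not the code from the commit: the helper names peek_child_exit() and demo_peek_then_reap() are hypothetical and were invented for this illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not CRIU code): block until the child `pid` has
 * exited and return its exit code, but leave it as a zombie so that a
 * later wait*() call can still reap it. Returns -1 on error or if the
 * child did not terminate via a normal exit.
 */
static int peek_child_exit(pid_t pid)
{
	siginfo_t info;

	info.si_pid = 0;
	/* WNOWAIT keeps the child in a waitable state after this call. */
	if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
		return -1;
	if (info.si_pid != pid || info.si_code != CLD_EXITED)
		return -1;
	return info.si_status;
}

/*
 * Hypothetical demo: fork a child that exits with code 7, observe the
 * exit without reaping, then reap it for real. Returns 0 on success.
 */
static int demo_peek_then_reap(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(7);
	if (peek_child_exit(pid) != 7)
		return -1;
	/* The zombie is still there, so an ordinary waitpid() succeeds. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 7) ? 0 : -1;
}
```

Because the wait targets a specific pid, it does not rely on one SIGCHLD being delivered per exiting child, which is the failure mode the commit message describes when signals are coalesced.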

729 Commits