mir/criu - criu - Mike's Git repositories

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-31 06:15:24 +00:00

Author	SHA1	Message	Date
Ruslan Kuprieiev	aac9fd5bad	dump: allocate task cores in collect_task() instead of parasite_infect_seized() We need it to be able to dump signals into cores before calling parasite_infect_seized(). Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-08-19 13:09:46 +04:00
Sophie Blee-Goldman	e606c2141e	Dump capabilities from the parasite Needed for future user namespace support. Capabilities will have to be dumped from the parasite, ie from inside the namespace since there is no obvious way to 'translate' capabilities from the global namespace (unlike with uids and gids, where the id mappings can be used for translation). [ additional explanation from Andrew Vagin: "capabilities" are not translated between namespaces. They can exist only in one userns, where a process lives. If a process is created in a new userns, it gets a full set of capabilities in this userns, and loses all caps in a parent userns. So if capabilities are not shown in /proc/pid/stat, we have no way to get it except of using parasite code. ] Signed-off-by: Sophie Blee-Goldman <ableegoldman@google.com> Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-08-15 23:10:44 +04:00
Andrey Vagin	e4e22a00f7	mount: save remapped links on tmpfs (v2) For that mnt namespaces should be dumped after files. v2: rework enumeration of namespaces in dump_mnt_namespaces() Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-08-05 16:35:41 +04:00
Tycho Andersen	51876eea5d	Attempt to restore cgroups During the dump phase, /proc/cgroups is parsed to find co-mounted cgroups. Then, for each task /proc/self/cgroup is parsed for the cgroups that it is a member of, and that cgroup is traversed to find any child cgroups which may also need restoring. Any cgroups not currently mounted will be temporarily mounted and traversed. All of this information is persisted along with the original cg_sets, which indicate which cgroups a task is a member of. On restore, an initial phase creates all the cgroups which were saved. Tasks are then restored into these cgroups via cg_sets as usual. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-07-10 17:00:28 +04:00
Pavel Emelyanov	5e9c57a13d	criu: Dump and restore pdeath_sig value The implementation is pretty straightforward. When dumping per-thread misc data with parasite, collect one, then write in thread_core_info. On restore wait for creds restore and put the value back (some creds changes drop it to zero). Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>	2014-07-01 16:16:04 +04:00
Filipe Brandenburger	64dc66c29f	dump: do not fail dump when robust_lists are disabled Robust lists may be disabled, for example if the "futex_cmpxchg_enabled" variable in the kernel is unset. Detect that case by checking that both "get_robust_list" and "set_robust_list" syscalls return ENOSYS and do not make criu dump fail in that case, but simply assume an empty list, which is consistent with the syscalls not being available. Tested: Successfully ran the zdtm test suite on a kernel where the "get_robust_list" and "set_robust_list" syscalls are disabled. Signed-off-by: Filipe Brandenburger <filbranden@google.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-06-25 19:57:32 +04:00
Cyrill Gorcunov	0bb002ce69	vdso: dump -- Don't dump contents of vvar zone vvar zone is mapped by a kernel and must never be dumped into image, the data present there is valid on running kernel only. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-06-24 22:48:41 +04:00
Pavel Emelyanov	1ba9d2cae9	cg: Dump cgroups tasks live in Each task points to a single ID of cgroup-set it lives in. This is done so to save some space in the image, as tasks likely live in the same set of cgroups. Other than this we keep track of what cgroup set we dump the subtree from. If it happens, that root task lives in the same cgroup set as criu does, we don't allow for any other sub-cgroups and make restore (next patch) much simpler and faster. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-05-27 23:48:06 +04:00
Pavel Emelyanov	8b8eb53a0a	cg: Skeleton for cgroup code Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-05-27 23:48:06 +04:00
Cyrill Gorcunov	89faae1e9b	vdso: dump -- Drop duplicated include Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Alexander Kartashov <alekskartashov@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-05-27 23:40:00 +04:00
Filipe Brandenburger	d5bb7e9748	dump: preserve the dumpable flag on criu dump/restore Preserve the dumpable flag, which affects whether a core dump will be generated, but also affects the ownership of the virtual files under /proc/$pid after restoring a process. Tested: Restored a process with a criu including this patch and looked at /proc/$pid to confirm that the virtual files were no longer all owned by root:root. zdtm tests pass except for cow01 which seems to be broken. (see https://bugzilla.openvz.org/show_bug.cgi?id=2967 for details.) This patch fixes https://bugzilla.openvz.org/show_bug.cgi?id=2968 Signed-off-by: Filipe Brandenburger <filbranden@google.com> Change-Id: I8c386508448a84368a86666f2d7500b252a78bbf Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-05-14 01:02:37 +04:00
Andrey Vagin	2f4be997b6	mount: use per-namespace mntinfo_tree (v2) This patch removes the global mntinfo_tree and collect_mount_info where it was constructed. The mntinfo list is filled from dump_mnt_ns, rst_collect_local_mntns, collect_mnt_namespaces and read_mnt_ns_img. A mountinfo entry contains a reference on a proper ns_id entry, so we can use mnt_id to look up a proper mount namespace. v2: remove trash after rebasing. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:40:19 +04:00
Andrey Vagin	b6d3314c54	check: collect mounts of the current mntns They are used for collecting unix sockets Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:40:04 +04:00
Andrey Vagin	e827a695f3	mount: separate collect_mnt_ns from dump_mnt_ns We are going to support nested mntns, so the global mntinfo_tree variable is useless and information about tree should be connected to a proper namespace. But when we don't dump mntns, we need to collect mounts for the current mntns. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:41 +04:00
Andrey Vagin	de4326a382	mount: return descriptor from mntns_collect_root We are going to support nested mount namespaces, so files can be opened from more than one namespace and a root must be collected for each file. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:32 +04:00
Andrey Vagin	22d384536d	files-ids: generate id-s according to mnt_id, st->st_dev and st->st_ino One device can be mounted a few times, so files are identical only if they have the same mnt_id. Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:28 +04:00
Andrey Vagin	87b1f5408c	files: save mnt_id on fd_param Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:39:18 +04:00
Andrey Vagin	0721626902	namespaces: dump mount namespaces before tasks (v2) because we want to check, that all files are reachable. For that we need to collect all mounts from all namespaces. v2: dump mntns separately Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:38:47 +04:00
Andrey Vagin	d2012883ab	criu: rename current_ns_mask to root_ns_mask (v2) Now we support sub-mntns, so root_ns_mask sounds more correct than current_ns_mask. v2: typo fix Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-21 22:38:33 +04:00
Pavel Emelyanov	1d438db66d	rlimits: Move entries from top-core into task-core This appeared after latest 1.2, so it's still possible to do this move. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-17 12:01:08 +04:00
Pavel Emelyanov	b54e340945	core: Move posix timers on core entry This as well gives us minus one image per-task and allocates more space on core task entry. One thing to note -- the amount of posix timers is not easily accessible at the core entry allocation time, so the respective array is allocated on demand. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-17 12:00:54 +04:00
Pavel Emelyanov	dfd5a62f38	core: Move itimers on core This allows to have one image less per-task, which in turn reduces live migration time a little bit. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-17 12:00:52 +04:00
Christopher Covington	c1cd6b5e5f	Allow dumps of stopped multithreaded processes CRIU can handle stopped multithreaded processes when all threads are stopped. Refine the check to allow this case. Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-11 15:16:18 +04:00
Andrey Vagin	4e1d81deb6	cr-dump: allocate dfds near the place where it's used Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-08 22:55:57 +04:00
Jamie Liu	efe594f8f4	criu: fix filemap open permissions An mmaped file is opened O_RDONLY or O_RDWR depending on the permissions on the first vma dump_task_mm() encounters mapping that file. This causes two problems: 1. If a file has multiple MAP_SHARED mappings, some of which are read-only and some of which are read-write, and the first encountered mapping happens to be read-only, the file will be opened O_RDONLY during restore, and mmap(PROT_WRITE) will fail with EACCES, causing the restore to fail. 2. If a file is opened read-write and mapped read-only, it will be opened O_RDONLY during restore, so restore will succeed, but mprotect(PROT_WRITE) on the read-only mapping after restore will fail. To fix both of these, record open flags per-vma based on the presence of VM_MAYWRITE in smaps. Signed-off-by: Jamie Liu <jamieliu@google.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-04-04 20:35:48 +04:00
Cyrill Gorcunov	5f433a6e81	pstree: Define RLIM_NLIMITS On PI machine we've got \| CC protobuf.o \| pstree.c: In function ‘core_entry_alloc’: \| pstree.c:36:10: error: ‘RLIM_NLIMITS’ undeclared (first use in this function) due to old kernel headers. Note I've dropped off BUG_ON here to localize all things in pstree code, no need to sprinkle constants. Reported-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-03-14 18:57:16 +04:00
Pavel Emelyanov	57825f6500	rlims: Unscrew up core->rlimits[i] assignment The array element is RlimitEntry properly initialized, no need in additional memcpy-s and size-checks. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-03-14 15:46:42 +04:00
Cyrill Gorcunov	79a88ae0dd	rlimit: Stop writing old rlimits image entries We're using new image format, but old image file is still generated. This will be addressed in next patch. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-03-14 15:44:46 +04:00
Cyrill Gorcunov	fd82384866	rlimit: Dump task rlimits into Core entry Note the restore remains as is for a while, it'll be addressed later. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-03-14 15:44:34 +04:00
Cyrill Gorcunov	ac03ca5599	auxv: Restore backward compatibility In commit `459828b6` I suddenly broke backward compatibility of auxv vector on 32bit machines. Bring it back. Reported-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-11 09:18:07 +04:00
Cyrill Gorcunov	459828b6be	dump: Read aux vector in one pass No need to read it in cycle. Reported-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-10 14:26:29 +04:00
Pavel Emelyanov	8a827ba403	files: Make fd_id_generate_special return ID into pointer Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-05 16:17:49 +04:00
Pavel Emelyanov	8b611770aa	files: Pass stat information into fd_id_generate_special Acked-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-05 16:17:18 +04:00
Pavel Emelyanov	d29238db83	dump: Missing \n in auxv dumping message Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-04 18:58:00 +04:00
Pavel Emelyanov	54f4f889a5	mm: Move VmaEntries from separate image into Mm one When writing VMAs we perform too many small writes into vma-.img files. This can be easily fixed by moving the vma-s into mm-s, all the more so they cannot be split from each other. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-04 11:44:05 +04:00
Pavel Emelyanov	eb1ae0a025	vma: Turn embedded VmaEntry on vma_area into pointer On restore we will read all VmaEntries in one big MmEntry object, so to avoid copying them all into vma_areas, make them be pointable. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-04 11:44:01 +04:00
Pavel Emelyanov	6da13c687f	dump: Merge mm and vmas dumping functions The plan is to merge vma images into mm ones (see further patching), so prepare the dumping code for that. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-04 11:43:59 +04:00
Pavel Emelyanov	608db864a3	vmas: Don't call stat on vm file twice When parsing mappings in proc, we fstat vm file, later, when dumping it, we stat it again to fill fd_parms. The 2nd stat is not required, we can keep the stat in vma_area. This removed 35% of all stat calls on dump of basic container. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-02-03 00:18:32 +04:00
Pavel Emelyanov	740eb9c101	proc-parse: Don't open and stat every single map_files link Quite a lot of VMAs in tasks map the same file with different perms. In that case we may skip opening all these files, but "borrow" one from the previous VMA parsed. There's little sense in seeking more than just previous VMA, as same files are rarely (can be though) mapped in different locations. After this on a basic Centos6 container the number of opens and stats in this function drops from ~1500 to ~500. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-31 20:31:06 +04:00
Pavel Emelyanov	c4b5a9d5b0	vma: Don't close socket's inode as fd The vm_socket_id is union with vm_file_fd and calling close on it is wrong. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-31 03:14:46 +04:00
Pavel Emelyanov	cc918897b0	irmap: Introduce irmap on-disk cache When dumping fsnotifies we may go to irmap to get inode->path mapping. The irmap engine scans FS (in hinted locations) to get one and it is slow even though we scan only part of the FS. Since the above scanning is done while tasks are frozen the freeze time goes up :( Improve the situation by generating irmap cache in working dir at pre-dump when tasks get unfrozen. The on-disk irmap cache is PB file, it sits in -W directory and can be loaded on dump/pre-dump start in memory. When resolving the inode->path mapping irmap may meet these entries, revalidate them and potentially save time. After pre-dump the (re-)collected irmap data is written back to irmap cache image. Typically entries written back are the same read in on cache load. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-30 16:20:16 +04:00
Pavel Emelyanov	751856c8b8	files: Pre-dump file descriptors We will generate some info about file-descriptors at that stage. For now these pre-dumped ones would be fsnotifies, so the pre-dump of a single fd is written as simple as possible, but enough for that type of FDs pre-dump. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-30 16:20:15 +04:00
Pavel Emelyanov	84ebc64b1f	pre-dump: Collect mount info, root and nsmask Well, we want to pre-dump files (fsnotifies), for that we will need mountinfo-s and root, and for the latter -- the current ns mask. The problem with current ns mask is that its generation is incorporated into ns IDs generation and dumping. And since the ids dumping is not performed on pre-dump, let's just provide a helper for ns-mask generation. Strictly speaking, the whole ns-mask idea is not great, but it's to be fixed later. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-30 16:20:15 +04:00
Pavel Emelyanov	040fe7712f	pre-dump: Enforce track-mem and leave-running in cr_pre_dump_tasks Service will call the pre-dump routine, so this is factoring out enforcing options for CLI and RPC. Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-30 15:58:39 +04:00
Cyrill Gorcunov	41a1b2da32	dump: collect_mappings -- Fix message text @vma_area_list::longest is in pages not bytes. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-29 15:42:34 +04:00
Andrey Vagin	3b7cc1e45e	pre-dump: handle errors from page_xfer_dump_pages() Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-24 15:47:45 +04:00
Andrey Vagin	e970fccb69	dump: call cpu_init() in pre-dump It's required for dumping fpu state [root@avagin-fc19-cr crtools]# bash test/zdtm.sh -d -P -i 2 static/fpu00 Execute zdtm/live/static/fpu00 ./fpu00 --pidfile=fpu00.pid --outfile=fpu00.out Dump 8411 Dump 8411 Check results 8411 01:29:30.550: 8411: FAIL: fpu00.c:78: 0.806165 != -nan (errno = 11 (Resource temporarily unavailable)) Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-20 14:13:17 +04:00
Pavel Emelyanov	2ec003eb85	predump: Spot the VDSO page/vma in victim task While doing pre-dump we don't do proper VDSO fixups, thus at this stage we may fail the should_dump_page() checks -- it will treat VDSO pages as 'regular' and may skip dumping some of them. This is not bad as is, but the subsequent dump will properly spot VDSO area and will try to dump _all_ pages from it. And if checks for soft-dirty will report that some pages are clean, dump will try to locate those in parent images and would fail. Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-14 22:54:17 +04:00
Pavel Emelyanov	91011328fa	criu: Several formatting fixes Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2014-01-14 09:33:19 +04:00
Andrew Vagin	a2eaa5cf44	ptrace: the task state is restored automatically It's a feature of PTRACE_SEIZE. So we need to do something, only if we want to change the state. [xemul: If task _was_ in stopped state before dump and we want them to stay alive after dump, the existing code queues one more STOP to it. This affects subsequent dump, as we seize a stopped task with STOP in queue. One more item in TODO list -- support stopped tasks with STOP in queue :) ] Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>	2013-12-20 20:36:12 +04:00
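
Commit efe594f8f4 in the list above describes recording open flags per VMA based on the presence of VM_MAYWRITE in smaps. The idea might be sketched roughly as follows. This is an illustration only, not CRIU's code: the function name `open_flags_for_mapping` is hypothetical, and restricting the O_RDWR case to shared mappings (the `ms` flag) is an assumption of this sketch. The two-letter codes `mw` (may write) and `ms` (may share) are the ones the kernel prints on the `VmFlags:` line of /proc/<pid>/smaps.

```python
import os


def open_flags_for_mapping(vmflags_line: str) -> int:
    """Choose open(2) flags for a file-backed mapping from its smaps VmFlags line.

    Hypothetical helper for illustration; it is not taken from CRIU.
    """
    prefix = "VmFlags:"
    if not vmflags_line.startswith(prefix):
        raise ValueError("not a VmFlags line")
    flags = vmflags_line[len(prefix):].split()
    # Assumption of this sketch: only a shared mapping ("ms") that may become
    # writable ("mw") needs the file reopened read-write on restore.
    if "ms" in flags and "mw" in flags:
        return os.O_RDWR
    return os.O_RDONLY
```

With this rule, a shared mapping that is currently read-only but carries `mw` still gets O_RDWR, so a later mprotect(PROT_WRITE) on the restored mapping can succeed, which is the second problem the commit message names.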

545 Commits