mir/criu - criu - Mike's Git repositories

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 22:05:36 +00:00

Author	SHA1	Message	Date
Andrei Vagin	09b4d01e8f	criu: print criu and kernel versions from log_init() log_init() opens a log file. We want to have criu and kernel versions in each log file, so it looks reasonable to print them from log_init(). Without this patch, versions are not printed, if criu is called in the swrk mode. Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Acked-by: Adrian Reber <areber@redhat.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 21:16:44 +03:00
Andrei Vagin	621a80fccf	criu: initialize logging for libraries from log_set_loglevel() We have three code paths, where we call log_set_loglevel() and in all these places, we need to initialize libraries logging engines, so it is better to do this from one function. For example, currently we forgot to initialize soccr logging for criu swrk. Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Acked-by: Adrian Reber <areber@redhat.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 21:09:47 +03:00
Andrei Vagin	a83620dc0b	zdtm: handle --tcp-established in the rpc mode Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 21:08:50 +03:00
Andrei Vagin	d47a7938c4	travis: don't fail a build due to the GCOV job It fails too often due to installing gcc-7. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:56:26 +03:00
Andrei Vagin	1d5438310b	test/other: add a test to check the --shell-job option This test creates a pty pair, creates a test process and sets a slave pty as control terminal for it. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:37:28 +03:00
Vitaly Ostrosablin	e19bd64876	zdtm: Add a test to check if we can C/R ghost files with no parent dirs. This is a test that triggers a bug with ghost files that was resolved in the patch "Don't fail if ghost file has no parent dirs". Signed-off-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:36:47 +03:00
Mike Rapoport	d1d506d15b	lazy-pages: kill POLL_TIMEOUT In the current model we do not start the background page transfer until POLL_TIMEOUT has elapsed since the last uffd or socket event. If the restored process accesses memory once every (POLL_TIMEOUT - epsilon), filling its memory can take ages. This patch changes the model in the following way: * poll for the events indefinitely until the restore is complete * the restore completion event causes reset of the poll timeout to zero and * starts the background transfers * after each transfer we return to check if there are any uffd events to handle Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:02 +03:00
Mike Rapoport	cc4bb5fd06	lazy-pages: add ability to limit background transfer size Currently, once we get to transfer pages in the "background", we try to fetch the entire IOV at once. For large IOVs this may impact #PF latency for the #PF events that occur during the transfer. Let's add a simple heuristic for controlling the size of the background transfers. Initially, the transfer will be limited to some default value. Every time we transfer a chunk we increase the transfer size until it reaches a pre-defined maximal size. A page fault event resets the background transfer size to its initial value. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:02 +03:00
Mike Rapoport	882703ddaf	lazy-pages: make complete_forks more robust The complete_forks function presumes that it always has work to do because we assume that a fork event is the only case when we drop out of epoll_run_rfds with a positive return value. Teach complete_forks to bail out when there are no pending forks to process, to allow exiting epoll_run_rfds for different reasons. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:02 +03:00
Mike Rapoport	9b04535f79	lazy-pages: simplify background transfer logic First check if there are pages we need to transfer and only afterwards check if there are outstanding requests. Also, instead of checking 'bool remaining' to see if there is more work to do, we can simply check if all the lpi's have already been serviced. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:02 +03:00
Mike Rapoport	69b9ab7a58	lazy-pages: rename handle_remaining_pages to xfer_pages The intention is to use this function for transferring all the pages that didn't cause a #PF. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:01 +03:00
Mike Rapoport	bc00f635f1	lazy-pages: rename first_pending_iov to pick_next_range The function picks the next page range to transfer anyway; it just does so in a very simple FIFO manner. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:01 +03:00
Mike Rapoport	0b29bbd369	lazy-pages: rework requests queueing We already have a queue for the requested memory ranges which contains 'lp_req' objects. These objects hold the same information as the lazy_iov: start address of the range, end address and the address that the range had at the dump time. Rather than keep this information twice and use double bookkeeping, we can extract the requested range from lpi->iovs and move it to lpi->reqs. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:01 +03:00
Mike Rapoport	dd85d62e60	lazy-pages: rename iov->base to iov->start Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:01 +03:00
Mike Rapoport	9f3c1cdd4a	lazy-pages: lazy_iov: use end instead of len Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:01 +03:00
Mike Rapoport	770c31e585	lazy-pages: split_iov: always create the new iov above the one being split Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:01 +03:00
Mike Rapoport	76440cd4c9	lazy-pages: explicitly set process exited condition Instead of relying on the length of various lists, add a boolean variable to lazy_pages_info to make it clear when the process has exited. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-04-06 02:17:01 +03:00
Kirill Tkhai	0d11a50b95	zdtm: Actually add tun_ns test Previous patch missed "git add", so the symlink and .desc file were not sent... Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-28 20:38:34 +03:00
Andrey Vagin	251dad530b	zdtm: check an exit code of a straced restore Currently zdtm doesn't detect when restore failed, if it is executed with strace. With this patch, fake-restore.sh creates a test file, and zdtm is able to distinguish when restore failed. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-28 00:31:53 +03:00
Andrei Vagin	509fac32dd	zdtm.py: fix a logic about determing a test flavor in a error case The get() method requires a key and now we are using an index. That will never work correctly as it is now. Acked-by: Adrian Reber <adrian@lisas.de> Reported-by: Adrian Reber <adrian@lisas.de> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 01:04:56 +03:00
Andrey Vagin	388dceac4a	unix: split dump_external_sockets() for readability Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:57 +03:00
Andrey Vagin	ebc4bf2872	unix: fix an error code in bind_unix_sk() Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:52 +03:00
Andrey Vagin	90a3475461	unit: don't check ui->ue->name.len twice in bind_unix_sk() Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:45 +03:00
Andrey Vagin	94fcc4445b	unix: split bind_unix_sk() for readability Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:41 +03:00
Andrey Vagin	dcf688b0d6	unix: restore sockets on correct mount points Currently we restore all sockets in the root mount namespace, because we were not able to get any information about a mount point where a socket is bound. It is obviously incorrect in some cases. In 4.10 kernel, we added the SIOCUNIXFILE ioctl for unix sockets. This ioctl opens a file to which a socket is bound and returns a file descriptor. This new ioctl allows us to get mnt_id by reading fdinfo, and mnt_id is enough to find a proper mount point and a mount namespace. The logic of this patch is straightforward. On dump, we save mnt_id for sockets; on restore, we find a mount namespace by mnt_id and restore this socket in its mount namespace. Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:37 +03:00
Andrey Vagin	f7a0a7c3b7	unix: resolve a socket file when a socket descriptor is available unix_process_name() is called when sockets are being collected, but at this moment we don't have socket descriptors. A socket descriptor is required to get mnt_id, which will allow us to resolve a socket path in its mount namespace. Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:33 +03:00
Andrey Vagin	5a35774f81	kerndat: check the SIOCUNIXFILE ioctl for unix sockets This ioctl opens a file to which a socket is bound and returns a file descriptor. This file descriptor can be used to get mnt_id and a file path. Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:25 +03:00
Andrey Vagin	ba2ddd09c4	unix: handle sockets with USK_CALLBACK as external sockets The USK_CALLBACK flag means that a socket is external and will be restored by a plugin. open_unixsk_standalone should not be called for these sockets. $ make -C test/others/unix-callback/ run ... (00.109338) 7471: sk unix: Opening standalone socket (id 0xd ino 0 peer 0x63b) (00.109376) 7471: Error (criu/sk-unix.c:1128): sk unix: BUG at criu/sk-unix.c:1128 Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:21 +03:00
Andrey Vagin	9a429e3de8	zdtm: check unix sockets in two mount namespaces Unix file sockets have to be restored in proper mount namespaces. Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-23 00:51:06 +03:00
Cyrill Gorcunov	9fda2acb8f	unix: Fix nil dereference in find_queuer_for When walking over unix sockets make sure the queuer is present before accessing it. https://jira.sw.ru/browse/PSBM-82796 Reported-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com> Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-23 00:48:41 +03:00
Kir Kolyshkin	d8d9a67d5d	scripts/build/binfmt_misc: fix for bash There was a "; done" leftover here, somehow ignored by dash but not bash. Remove it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-03-23 00:46:11 +03:00
Kir Kolyshkin	1c67bbf1a7	scripts/build/Dockerfile.rawhide: rm It is not used, probably was committed by mistake. Fixes: `2d093a1702` ("travis: add a job to test on the fedora rawhide") Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-03-23 00:46:11 +03:00
Kir Kolyshkin	b5fe770f8d	CI: fix Fedora rawhide Fix Fedora rawhide CI failure caused by coreutils-single and our way of running under QEMU. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-03-23 00:46:11 +03:00
Kir Kolyshkin	b21fda47fd	scripts/build/Dockerfiles: nitpicks 1. Sort lists of packages to be installed, unify indentation. 2. Merge "ccache -s" and "ccache -z". Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-03-23 00:46:11 +03:00
Kir Kolyshkin	7953b4037e	Fix zdtm with Ubuntu Bionic/arm/clang In Ubuntu Bionic for armhf, clang is compiled for armv8l rather than armv7l (as it was and still is for gcc) and so it uses armv8 by default. This breaks compilation of tests using smp_mb(): > error: instruction requires: data-barriers The fix is to add "-march=armv7-a" to CFLAGS which we already do, except not for the tests. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-03-23 00:46:11 +03:00
Adrian Reber	bed49b39ea	Print CRIU and kernel version also in RPC mode The newly introduced output of the CRIU and kernel version does not happen when running CRIU under RPC. This moves the print_versions() function to util.c and calls it from cr-service.c Signed-off-by: Adrian Reber <areber@redhat.com>	2018-03-16 08:41:17 +03:00
Kirill Tkhai	c66ce3095f	inotify: Use fast way of obtaining desired watch descriptor number This patch makes restore_one_inotify() request a specific watch descriptor number instead of iterating in a (possibly) long-running loop, if the system supports it. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-16 08:41:17 +03:00
Kirill Tkhai	1cfb96956b	kdat: Add check for inotify() INOTIFY_IOC_SETNEXTWD cmd This is a new ioctl, which allows requesting the next descriptor allocated by inotify_add_watch(). https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git/commit/?h=for_next&id=e1603b6effe177210701d3d7132d1b68e7bd2c93 The patch checks that this cmd is supported by the kernel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-16 08:41:17 +03:00
Andrew Vagin	d5fe28da67	zdtm: Add tun_ns test tun test in nested net ns wrapper. Signed-off-by: Andrew Vagin <avagin@virtuozzo.com> ktkhai: Makefile hunks Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-16 08:41:17 +03:00
Andrew Vagin	80eaf6c90a	net: Dump tun device net id in img This adds a new tunfile_entry::ns_id field and populates it in the standard socket way. Restore uses this ns_id to choose the correct namespace. Note, we could completely skip set_netns() on restore in case of !has_ns_id, but using top_net_ns invents some definite behaviour. Signed-off-by: Andrew Vagin <avagin@virtuozzo.com> ktkhai: comment written/code movings Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-16 08:41:17 +03:00
Kirill Tkhai	2e5d5a399d	tun: Check that net ns of tun device is dumped Similar to the socket logic, abort the dump if tun is not related to any net ns seen before. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-16 08:41:17 +03:00
Kirill Tkhai	b85f402672	tun: Check tun has ioctl() cmd SIOCGSKNS Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-16 08:41:16 +03:00
Kirill Tkhai	67e164bd32	net: Extrack ioctl() call from kerndat_socket_netns() Refactoring, no functional change. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-16 08:41:16 +03:00
Andrei Vagin	44dadb9a56	[v2] criu: add -fprofile-update=atomic for builds with gcov Sometimes we see errors like this: criu/cr-restore.gcda:Merge mismatch for function 106 It probably means that this gcda file was corrupted. According to the gcc man page, the -fprofile-update=atomic should fix this problem. v2: this option appeared in gcc7, so we need to install it. Reported-by: Mr Travis CI Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2018-03-10 01:15:23 +03:00
Kirill Tkhai	cecae02ede	service_fd: Place lazy pages sk to fdstore LAZY_PAGES_SK_OF is needed only once for every process, and it's not frequently used, so we can place it in fdstore. https://travis-ci.org/tkhai/criu/builds/343405755 Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2018-03-02 22:48:28 +03:00
Pavel Tikhomirov	6a0f4cd787	usernsd: print additional debuging due to silent fails of userns_call We have this failure from prep_usernsd_transport, here is the log (with only the first hunk applied): ========================= Run zdtm/static/env00 in ns ========================== Start test ./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST Run criu dump Run criu restore Fail RESTORE failed: Error(0): Unknown =[log]=> dump/zdtm/static/env00/85/1/restore.log ------------------------ grep Error ------------------------ (00.022991) PID: real 106 virt 1 (00.140293) Running setup-namespaces scripts (00.140393) 1: cg: Cgroups 1 inherited from parent (00.140413) 1: uns: calling usernsd_recv_transport (0, 0) (00.140460) 1: Error (criu/cr-restore.c:1870): Failed to prepare usernsd transport (00.140500) uns: daemon calls 0x48dd30 (106, 1, 0) (00.140538) Error (criu/cr-restore.c:2329): Failed to switch restore stage to CR_STATE_PREPARE_NAMESPACES (00.170506) Error (criu/mount.c:3175): mnt: Can't remove the directory /tmp/.criu.mntns.W2YHjG: No such file or directory (00.170530) uns: calling exit_usernsd (-1, 1) (00.170589) uns: daemon calls 0x48dd00 (104, -1, 1) (00.170600) uns: `- daemon exits w/ 0 (00.175813) uns: daemon stopped (00.175827) Error (criu/cr-restore.c:2539): Restoring FAILED. ------------------------ ERROR OVER ------------------------ https://travis-ci.org/Snorch/criu/jobs/344237216 Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2018-03-02 22:48:28 +03:00
Pavel Tikhomirov	c3abb6999d	restore: print error messages on all error-paths in restore_root_task On restore we once had: Run criu restore Fail RESTORE failed: Error(0): Unknown =[log]=> dump/zdtm/static/env00/85/1/restore.log ------------------------ grep Error ------------------------ (00.122659) Running setup-namespaces scripts (00.122748) 1: cg: Cgroups 1 inherited from parent (00.122765) 1: uns: calling usernsd_recv_transport (0, 0) (00.123140) uns: daemon calls 0x48d8b0 (106, 1, 0) (00.145095) Error (criu/mount.c:3175): mnt: Can't remove the directory /tmp/.criu.mntns.SIOGLA: No such file or directory (00.145124) uns: calling exit_usernsd (-1, 1) (00.145174) uns: daemon calls 0x48d880 (104, -1, 1) (00.145181) uns: `- daemon exits w/ 0 (00.149591) uns: daemon stopped (00.149607) Error (criu/cr-restore.c:2498): Restoring FAILED. ------------------------ ERROR OVER ------------------------ Only "Restoring FAILED." error, so no clue to what's really going on. Printing errors for these "silent" errorpaths can help resolve issues in future: restore_root_task -> fork_with_pid -> open_core -> pb_read_one -> do_pb_read_one -> xmalloc -> prepare_userns -> write_id_map -> get_service_fd -> restore_wait_inprogress_tasks ->__restore_wait_inprogress_tasks -> futex_get -> restore_switch_stage -> restore_wait_other_tasks -> __restore_wait_inprogress_tasks -> futex_get -> move_veth_to_bridge -> external_for_each_type -> move_to_bridge -> external_val -> fault_injected -> __fault_injected -> depopulate_roots_yard -> try_clean_remaps -> clean_one_remap -> rst_get_mnt_root -> lookup_mnt_id -> __lookup_mnt_id Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2018-03-02 22:48:28 +03:00
Pavel Tikhomirov	ffd415a5b5	memory: don't use parent memdump if detected possible pid reuse We have a problem: when a pid is reused between consecutive dumps, we can't tell whether the pagemap and pages from the parent dump's images are still valid for restoring this pid. That can even lead to wrong memory being restored for this pid, see the test in the last patch. So this is an attempt to separate processes with a (likely) invalid previous memory dump from processes with a 100% valid previous dump. For that we use the value of /proc/<pid>/stat's start_time and also the timestamp of each (pre)dump. If the start time is strictly less than the timestamp, that means that the pagemap for this pid from the previous dump is valid: it was made for exactly the same process. Creation time is in centiseconds by default, so if the predump is really fast (<1csec) we can have false negative decisions for some processes, but in the case of long-running processes we are fine. https://jira.sw.ru/browse/PSBM-67502 v2: remove __maybe_unused for get_parent_stats; fix get_parent_stats to have static typing; print warning only if unsure; check has_dump_uptime v3: read parent stats from image only once; reuse stat from previous parse_pid_stat call on dump v4: move code to function; use unsigned long long for ticks; put proc_pid_stat on mem_dump_ctl; print warning on all pid-reuse cases v5: free parent's stats entry properly, pass it in arguments to (pre_)dump_one_task v6: free parent's stats in error path too v7: zero init parent_se v8: improve error message Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2018-02-28 22:57:30 +03:00
Pavel Tikhomirov	4a43486e24	stats: add a helper to get stats of parent pre-dump It will be used in the next patch. https://jira.sw.ru/browse/PSBM-67502 note: actually we need only one value from the stats entry, but I still prefer a general helper as we still need to read and allocate memory for the whole structure v2: fix get_parent_stats to have static typing v3: simplify get_parent_stats to return a StatsEntry pointer instead of doing it through arguments v8: replace errors with warnings; we should watch for them only if we have a corresponding error in detect_pid_reuse, else they are fine Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2018-02-28 22:57:30 +03:00
Pavel Tikhomirov	fbba4d249a	stats: save uptime to know when dump had happened We want to use a simple fact: if we have a live process in a pstree we want to dump, and the start time of that process is less than the pre-dump's timestamp, then this exact process existed (100% sure) at the time of that pre-dump and the process' memory was dumped in images. Thus we save uptime while in the frozen state, else this won't work. https://jira.sw.ru/browse/PSBM-67502 Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>	2018-02-28 22:57:30 +03:00
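
The pid-reuse check described in ffd415a5b5 and fbba4d249a can be sketched roughly as follows. This is an illustrative Python sketch, not CRIU's C implementation: the function names `parse_start_time` and `parent_dump_is_valid` are hypothetical, and both timestamps are assumed to be already expressed in the same unit (clock ticks since boot). Only the "strictly less than" rule and the use of the start time from /proc/<pid>/stat come from the commit messages.

```python
from typing import Optional


def parse_start_time(stat_line: str) -> int:
    """Extract the process start time (field 22) from a /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may itself contain spaces or ')',
    # so split on the last ')' before tokenising the remaining fields.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # fields[0] is field 3 (state), so field 22 (starttime) is index 19.
    return int(fields[19])


def parent_dump_is_valid(start_time: int, parent_dump_time: Optional[int]) -> bool:
    """Return True only if the process certainly existed at the parent (pre-)dump.

    Hypothetical helper: both values are assumed to be in the same unit.
    A missing parent timestamp is treated as "cannot trust the parent dump".
    """
    if parent_dump_time is None:
        return False
    # Strictly less: an equal value could mean the pid was reused within
    # the same tick, so the parent's pagemap is not trusted in that case.
    return start_time < parent_dump_time
```

Treating the equal case as invalid mirrors the trade-off noted in the commit message: a very fast pre-dump may cause a valid parent dump to be ignored, but a reused pid is not restored from another process' memory.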

9632 Commits
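
The background transfer size heuristic described in cc4bb5fd06 (start from a default limit, grow after every transferred chunk up to a maximum, reset on a page fault) can be illustrated with a small sketch. This is Python rather than CRIU's C, the class and parameter names are hypothetical, and the doubling growth step and the default values are assumptions: the commit message only says the size is increased.

```python
class BackgroundXferLimit:
    """Hypothetical sketch of a growing limit for background page transfers."""

    def __init__(self, initial_pages: int = 64, max_pages: int = 4096) -> None:
        # Both defaults are illustrative values, not taken from CRIU.
        self.initial_pages = initial_pages
        self.max_pages = max_pages
        self.limit = initial_pages

    def next_chunk(self, remaining_pages: int) -> int:
        """Return how many pages to transfer now, then grow the limit."""
        chunk = min(self.limit, remaining_pages)
        # Assumed growth rule: double the limit, capped at the maximum.
        self.limit = min(self.limit * 2, self.max_pages)
        return chunk

    def on_page_fault(self) -> None:
        """A page fault resets the limit so later #PF requests wait less."""
        self.limit = self.initial_pages
```

Keeping chunks small right after a fault bounds how long a newly faulting address waits behind an in-flight background transfer, while the growth lets an idle process be filled in with few round trips.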