mirror of https://github.com/checkpoint-restore/criu synced 2025-08-25 19:38:38 +00:00

9635 Commits

Author SHA1 Message Date
Andy Tucker
3ff6f66785 files: Fail dump if dump_one_file() fails
When dumping a process with a large number of open files,
dump_task_files_seized() processes the fds in batches. If
dump_one_file() results in an error, processing of the current batch is
stopped but the next batch (if any) will still be fetched and the error
value is overwritten. The result is a corrupt dump image (the fdinfo
file is missing a bunch of fds) which results in restore failure.

Also close all received fds after an error (previously the skipped ones
were left open).

Signed-off-by: Andy Tucker <agtucker@google.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-19 23:26:34 +03:00
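
The batching behaviour described in 3ff6f66785 can be illustrated with a small, self-contained sketch. This is not the CRIU code: dump_in_batches(), demo_dump() and demo_release() are hypothetical names, and the callbacks merely stand in for dump_one_file() and close(). The sketch only shows the two properties the commit message asks for: the first error stops further batches from being processed, and every fd of the current batch is still released.

```c
#include <stddef.h>

/* Hypothetical callback types standing in for dump_one_file() and close(). */
typedef int (*dump_fn)(int fd);
typedef void (*release_fn)(int fd);

/* Counters used by the demo callbacks below (illustration only). */
int demo_dumped;
int demo_closed;

/* Demo "dump": fails for negative descriptors. */
int demo_dump(int fd)
{
	demo_dumped++;
	return fd < 0 ? -1 : 0;
}

/* Demo "close": only counts how many descriptors were released. */
void demo_release(int fd)
{
	(void)fd;
	demo_closed++;
}

/*
 * Process nr descriptors in batches of batch_size.
 * After the first error no further descriptor is dumped and no further
 * batch is started, but all descriptors of the current batch are released.
 * Returns 0 on success or the first error value.
 */
int dump_in_batches(const int *fds, size_t nr, size_t batch_size,
		    dump_fn dump, release_fn release)
{
	int ret = 0;
	size_t off, i;

	if (batch_size == 0)
		return -1;

	for (off = 0; off < nr && ret == 0; off += batch_size) {
		size_t n = nr - off < batch_size ? nr - off : batch_size;

		for (i = 0; i < n; i++) {
			if (ret == 0)
				ret = dump(fds[off + i]);
			/* Release even the skipped descriptors. */
			release(fds[off + i]);
		}
	}

	return ret;
}
```

Checking ret in the outer loop condition is what keeps a later batch from overwriting the error value.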
Mike Rapoport
4459590d5f travis: temporarily disable lazy-pages testing
lazy-pages is currently broken, so to avoid false positives in travis
because of that, temporarily disable lazy-pages testing

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2018-04-11 15:37:36 +03:00
Vitaly Ostrosablin
ca179dd26a zdtm: Fix unlink_multiple_largefiles compilation on ppc64
The compiler wants the arguments explicitly cast to (long long) for the
%llx specifier.

Signed-off-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 21:18:14 +03:00
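
A minimal sketch of the kind of cast ca179dd26a refers to, assuming a 64-bit value that is printed with %llx. The helper name format_offset() is hypothetical and not taken from the test. On some architectures a 64-bit type is a plain (unsigned) long rather than (unsigned) long long, so passing it to %llx without a cast triggers a format warning. The sketch uses the unsigned variant of the cast, which is the type %llx formally expects.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value as hexadecimal.
 * uint64_t may be "unsigned long" on some targets, so cast it explicitly
 * to the type that the %llx conversion expects.
 */
int format_offset(char *buf, size_t size, uint64_t off)
{
	return snprintf(buf, size, "%llx", (unsigned long long)off);
}
```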
Andrei Vagin
09b4d01e8f criu: print criu and kernel versions from log_init()
log_init() opens a log file. We want to have criu and kernel versions in
each log file, so it looks reasonable to print them from log_init().

Without this patch, the versions are not printed if criu is called in
swrk mode.

Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 21:16:44 +03:00
Andrei Vagin
621a80fccf criu: initialize logging for libraries from log_set_loglevel()
We have three code paths where we call log_set_loglevel(), and in all of
them we need to initialize the libraries' logging engines, so it is
better to do this from one function. For example, we currently forget to
initialize soccr logging for criu swrk.

Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 21:09:47 +03:00
Andrei Vagin
a83620dc0b zdtm: handle --tcp-established in the rpc mode
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 21:08:50 +03:00
Andrei Vagin
d47a7938c4 travis: don't fail a build due to the GCOV job
It fails too often due to installing gcc-7.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:56:26 +03:00
Andrei Vagin
1d5438310b test/other: add a test to check the --shell-job option
This test creates a pty pair, creates a test process and sets the slave
pty as its controlling terminal.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:37:28 +03:00
Vitaly Ostrosablin
e19bd64876 zdtm: Add a test to check if we can C/R ghost files with no parent dirs.
This is a test that triggers a bug with ghost files that was resolved in
the patch "Don't fail if ghost file has no parent dirs".

Signed-off-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:36:47 +03:00
Mike Rapoport
d1d506d15b lazy-pages: kill POLL_TIMEOUT
In the current model we don't start the background page transfer until
POLL_TIMEOUT time has elapsed since the last uffd or socket event. If the
restored process accesses memory once in (POLL_TIMEOUT - epsilon), the
filling of its memory can take ages.

This patch changes the model in the following way:
* poll for the events indefinitely until the restore is complete
* the restore completion event resets the poll timeout to zero and
  starts the background transfers
* after each transfer we return to check if there are any uffd events to
  handle

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:02 +03:00
Mike Rapoport
cc4bb5fd06 lazy-pages: add ability to limit background transfer size
Currently, once we get to transfer pages in the "background", we try to
fetch the entire IOV at once. For large IOVs this may impact #PF latency
for the #PF events that occur during the transfer.

Let's add a simple heuristic for controlling size of the background
transfers. Initially, the transfer will be limited to some default value.
Every time we transfer a chunk we increase the transfer size until it
reaches a pre-defined maximal size. A page fault event resets the
background transfer size to its initial value.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:02 +03:00
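
The heuristic from cc4bb5fd06 can be sketched as follows. The commit message does not say how the size grows or which limits are used, so the doubling step and the XFER_DEFAULT_LEN / XFER_MAX_LEN values below are assumptions, and all names are hypothetical rather than taken from the CRIU sources.

```c
#include <stddef.h>

#define XFER_DEFAULT_LEN (64UL << 10)	/* assumed initial size: 64 KiB */
#define XFER_MAX_LEN	 (4UL << 20)	/* assumed upper bound: 4 MiB */

/* Hypothetical per-process state for the background transfer. */
struct xfer_state {
	size_t len;	/* size limit for the next background chunk */
};

void xfer_init(struct xfer_state *s)
{
	s->len = XFER_DEFAULT_LEN;
}

/*
 * Return the size of the next background chunk, bounded by the bytes
 * remaining, and grow the limit (here: double it) up to XFER_MAX_LEN.
 */
size_t xfer_next_chunk(struct xfer_state *s, size_t remaining)
{
	size_t chunk = remaining < s->len ? remaining : s->len;

	if (s->len < XFER_MAX_LEN) {
		s->len *= 2;
		if (s->len > XFER_MAX_LEN)
			s->len = XFER_MAX_LEN;
	}

	return chunk;
}

/* A page fault resets the limit so that the next chunk is small again. */
void xfer_on_page_fault(struct xfer_state *s)
{
	s->len = XFER_DEFAULT_LEN;
}
```

Small chunks right after a page fault keep the daemon responsive while the process is actively touching memory; larger chunks are used only once it has been quiet for a while.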
Mike Rapoport
882703ddaf lazy-pages: make complete_forks more robust
The complete_forks function presumes that it always has work to do
because we assume that a fork event is the only case when we drop out of
epoll_run_rfds with a positive return value.

Teach complete_forks to bail out when there are no pending forks to process,
to allow exiting epoll_run_rfds for different reasons.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:02 +03:00
Mike Rapoport
9b04535f79 lazy-pages: simplify background transfer logic
First check if there are pages we need to transfer and only afterwards
check if there are outstanding requests. Also, instead of checking 'bool
remaining' to see if there is more work to do, we can simply check if all
the lpi's have already been serviced.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:02 +03:00
Mike Rapoport
69b9ab7a58 lazy-pages: rename handle_remaining_pages to xfer_pages
The intention is to use this function for transferring all the pages that
didn't cause a #PF.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:01 +03:00
Mike Rapoport
bc00f635f1 lazy-pages: rename first_pending_iov to pick_next_range
The function picks the next page range to transfer anyway; it just does so
in a very simple FIFO manner.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:01 +03:00
Mike Rapoport
0b29bbd369 lazy-pages: rework requests queueing
We already have a queue for the requested memory ranges which contains
'lp_req' objects. These objects hold the same information as the lazy_iov:
start address of the range, end address and the address that the range had
at the dump time.

Rather than keep this information twice and use double bookkeeping, we can
extract the requested range from lpi->iovs and move it to lpi->reqs.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:01 +03:00
Mike Rapoport
dd85d62e60 lazy-pages: rename iov->*base to iov->*start
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:01 +03:00
Mike Rapoport
9f3c1cdd4a lazy-pages: lazy_iov: use end instead of len
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:01 +03:00
Mike Rapoport
770c31e585 lazy-pages: split_iov: always create the new iov above the one being split
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:01 +03:00
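
The three lazy_iov commits above (start/end naming, end instead of len, and creating the new iov above the one being split) can be illustrated with a small sketch. The struct and function below are hypothetical and simplified: the real code keeps iovs on lists and allocates the new element, whereas here the caller provides the storage. The fields follow the description in 0b29bbd369: start address, end address, and the address the range had at dump time.

```c
/* Hypothetical, simplified stand-in for a lazy iov: [start, end). */
struct range {
	unsigned long start;	 /* first address of the range */
	unsigned long end;	 /* first address past the range */
	unsigned long img_start; /* address the range had at dump time */
};

/*
 * Split r at addr. The original keeps the lower part [start, addr) and
 * the new piece always covers the upper part [addr, end).
 * Returns 0 on success, -1 if addr is not strictly inside the range.
 */
int split_range(struct range *r, unsigned long addr, struct range *upper)
{
	if (addr <= r->start || addr >= r->end)
		return -1;

	upper->start = addr;
	upper->end = r->end;
	upper->img_start = r->img_start + (addr - r->start);

	r->end = addr;

	return 0;
}
```

Storing end rather than len means the lower part needs a single assignment when it is truncated.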
Mike Rapoport
76440cd4c9 lazy-pages: explicitly set process exited condition
Instead of relying on the length of various lists, add a boolean variable to
lazy_pages_info to make it clear when the process has exited.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-04-06 02:17:01 +03:00
Kirill Tkhai
0d11a50b95 zdtm: Actually add tun_ns test
The previous patch missed "git add", so the symlink and .desc
file were not sent...

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-28 20:38:34 +03:00
Andrey Vagin
251dad530b zdtm: check an exit code of a straced restore
Currently zdtm doesn't detect when restore failed, if it is executed
with strace. With this patch, fake-restore.sh creates a test file, and
zdtm is able to distinguish when restore failed.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-28 00:31:53 +03:00
Andrei Vagin
509fac32dd zdtm.py: fix a logic about determing a test flavor in a error case
The get() method requires a key and now we are using an index. That
will never work correctly as it is now.

Acked-by: Adrian Reber <adrian@lisas.de>
Reported-by: Adrian Reber <adrian@lisas.de>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 01:04:56 +03:00
Andrey Vagin
388dceac4a unix: split dump_external_sockets() for readability
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:57 +03:00
Andrey Vagin
ebc4bf2872 unix: fix an error code in bind_unix_sk()
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:52 +03:00
Andrey Vagin
90a3475461 unit: don't check ui->ue->name.len twice in bind_unix_sk()
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:45 +03:00
Andrey Vagin
94fcc4445b unix: split bind_unix_sk() for readability
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:41 +03:00
Andrey Vagin
dcf688b0d6 unix: restore sockets on correct mount points
Currently we restore all sockets in the root mount namespace, because we
were not able to get any information about a mount point where a socket
is bound. It is obviously incorrect in some cases.

In the 4.10 kernel, we added the SIOCUNIXFILE ioctl for unix sockets. This
ioctl opens a file to which a socket is bound and returns a file
descriptor.

This new ioctl allows us to get mnt_id by reading fdinfo, and mnt_id
is enough to find a proper mount point and a mount namespace.

The logic of this patch is straightforward. On dump, we save mnt_id for
sockets; on restore, we find a mount namespace by mnt_id and restore the
socket in its mount namespace.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:37 +03:00
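
The "get mnt_id by reading fdinfo" step from dcf688b0d6 can be sketched as a small parser. It assumes the text of /proc/<pid>/fdinfo/<fd> has already been read into a buffer and that it contains a "mnt_id:" line. Obtaining the descriptor with the SIOCUNIXFILE ioctl and reading the file are not shown. parse_fdinfo_mnt_id() is a hypothetical helper, not a CRIU function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find the "mnt_id:" line in fdinfo text and store its value in *mnt_id.
 * Returns 0 on success and -1 if no such line is present.
 */
int parse_fdinfo_mnt_id(const char *text, int *mnt_id)
{
	const char *line = text;

	while (line && *line) {
		/* The space in the format also matches the tab after ':'. */
		if (sscanf(line, "mnt_id: %d", mnt_id) == 1)
			return 0;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}
```

For the input "pos:\t0\nflags:\t02000002\nmnt_id:\t23\n" the helper stores 23, which is the value the dump side would save for the socket.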
Andrey Vagin
f7a0a7c3b7 unix: resolve a socket file when a socket descriptor is available
unix_process_name() is called when sockets are being collected,
but at this moment we don't have socket descriptors.

A socket descriptor is required to get mnt_id, which allows resolving
a socket path in its mount namespace.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:33 +03:00
Andrey Vagin
5a35774f81 kerndat: check the SIOCUNIXFILE ioctl for unix sockets
This ioctl opens a file to which a socket is bound and
returns a file descriptor. This file descriptor can be used to get
mnt_id and a file path.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:25 +03:00
Andrey Vagin
ba2ddd09c4 unix: handle sockets with USK_CALLBACK as external sockets
The USK_CALLBACK flag means that a socket is external and will be
restored by a plugin. open_unixsk_standalone should not be called for
these sockets.

$ make -C test/others/unix-callback/ run
...
(00.109338)   7471: sk unix: Opening standalone socket (id 0xd ino 0 peer 0x63b)
(00.109376)   7471: Error (criu/sk-unix.c:1128): sk unix: BUG at criu/sk-unix.c:1128

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:21 +03:00
Andrey Vagin
9a429e3de8 zdtm: check unix sockets in two mount namespaces
Unix file sockets have to be restored in proper mount namespaces.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-23 00:51:06 +03:00
Cyrill Gorcunov
9fda2acb8f unix: Fix nil dereference in find_queuer_for
When walking over unix sockets make sure the
queuer is present before accessing it.

https://jira.sw.ru/browse/PSBM-82796

Reported-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-23 00:48:41 +03:00
Kir Kolyshkin
d8d9a67d5d scripts/build/binfmt_misc: fix for bash
There was a "; done" leftover here, somehow ignored by dash
but not bash. Remove it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-23 00:46:11 +03:00
Kir Kolyshkin
1c67bbf1a7 scripts/build/Dockerfile.rawhide: rm
It is not used, probably was committed by mistake.

Fixes: 2d093a170227 ("travis: add a job to test on the fedora rawhide")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-23 00:46:11 +03:00
Kir Kolyshkin
b5fe770f8d CI: fix Fedora rawhide
Fix Fedora rawhide CI failure caused by coreutils-single and our
way of running under QEMU.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-23 00:46:11 +03:00
Kir Kolyshkin
b21fda47fd scripts/build/Dockerfiles: nitpicks
1. Sort lists of packages to be installed, unify indentation.

2. Merge "ccache -s" and "ccache -z".

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-23 00:46:11 +03:00
Kir Kolyshkin
7953b4037e Fix zdtm with Ubuntu Bionic/arm/clang
In Ubuntu Bionic for armhf, clang is compiled for armv8l rather than
armv7l (as it was and still is for gcc) and so it uses armv8 by default.

This breaks compilation of tests using smp_mb():

> error: instruction requires: data-barriers

The fix is to add "-march=armv7-a" to CFLAGS which we already do,
except not for the tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-23 00:46:11 +03:00
Adrian Reber
bed49b39ea Print CRIU and kernel version also in RPC mode
The newly introduced output of the CRIU and kernel versions does not
happen when running CRIU under RPC. This moves the print_versions()
function to util.c and calls it from cr-service.c.

Signed-off-by: Adrian Reber <areber@redhat.com>
2018-03-16 08:41:17 +03:00
Kirill Tkhai
c66ce3095f inotify: Use fast way of obtaining desired watch descriptor number
This patch makes restore_one_inotify() request a specific
watch descriptor number instead of iterating in a (possibly)
long-running loop, if the system supports it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-16 08:41:17 +03:00
Kirill Tkhai
1cfb96956b kdat: Add check for inotify() INOTIFY_IOC_SETNEXTWD cmd
This is a new ioctl which allows requesting the next descriptor
allocated by inotify_add_watch().

https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git/commit/?h=for_next&id=e1603b6effe177210701d3d7132d1b68e7bd2c93

The patch checks that this cmd is supported by the kernel.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-16 08:41:17 +03:00
Andrew Vagin
d5fe28da67 zdtm: Add tun_ns test
tun test in nested net ns wrapper.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
ktkhai: Makefile hunks
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-16 08:41:17 +03:00
Andrew Vagin
80eaf6c90a net: Dump tun device net id in img
This adds a new tunfile_entry::ns_id field and populates
it in the standard socket way. Restore uses this ns_id
to choose the correct namespace. Note, we could completely
skip set_netns() on restore in case of !has_ns_id, but
using top_net_ns gives some definite behaviour.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
ktkhai: comment written/code movings
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-16 08:41:17 +03:00
Kirill Tkhai
2e5d5a399d tun: Check that net ns of tun device is dumped
Similar to the socket logic, abort the dump
if the tun is not related to any net ns seen
before.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-16 08:41:17 +03:00
Kirill Tkhai
b85f402672 tun: Check tun has ioctl() cmd SIOCGSKNS
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-16 08:41:16 +03:00
Kirill Tkhai
67e164bd32 net: Extrack ioctl() call from kerndat_socket_netns()
Refactoring, no functional change.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-16 08:41:16 +03:00
Andrei Vagin
44dadb9a56 [v2] criu: add -fprofile-update=atomic for builds with gcov
Sometimes we see errors like this:
criu/cr-restore.gcda:Merge mismatch for function 106

It probably means that this gcda file was corrupted. According to the
gcc man page, -fprofile-update=atomic should fix this problem.

v2: this option appeared in gcc7, so we need to install it.

Reported-by: Mr Travis CI
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-03-10 01:15:23 +03:00
Kirill Tkhai
cecae02ede service_fd: Place lazy pages sk to fdstore
LAZY_PAGES_SK_OF is needed only once for every process,
and it's not frequently used, so we can place it
to fdstore.

https://travis-ci.org/tkhai/criu/builds/343405755

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2018-03-02 22:48:28 +03:00
Pavel Tikhomirov
6a0f4cd787 usernsd: print additional debuging due to silent fails of userns_call
We see this failure from prep_usernsd_transport; here is the log (with only
the first hunk applied):

========================= Run zdtm/static/env00 in ns ==========================
Start test
./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
Run criu dump
Run criu restore
Fail RESTORE failed: Error(0): Unknown
=[log]=> dump/zdtm/static/env00/85/1/restore.log
------------------------ grep Error ------------------------
(00.022991) PID: real 106 virt 1
(00.140293) Running setup-namespaces scripts
(00.140393)      1: cg: Cgroups 1 inherited from parent
(00.140413)      1: uns: calling usernsd_recv_transport (0, 0)
(00.140460)      1: Error (criu/cr-restore.c:1870): Failed to prepare usernsd transport
(00.140500) uns: daemon calls 0x48dd30 (106, 1, 0)
(00.140538) Error (criu/cr-restore.c:2329): Failed to switch restore stage to CR_STATE_PREPARE_NAMESPACES
(00.170506) Error (criu/mount.c:3175): mnt: Can't remove the directory /tmp/.criu.mntns.W2YHjG: No such file or directory
(00.170530) uns: calling exit_usernsd (-1, 1)
(00.170589) uns: daemon calls 0x48dd00 (104, -1, 1)
(00.170600) uns: `- daemon exits w/ 0
(00.175813) uns: daemon stopped
(00.175827) Error (criu/cr-restore.c:2539): Restoring FAILED.
------------------------ ERROR OVER ------------------------

https://travis-ci.org/Snorch/criu/jobs/344237216

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2018-03-02 22:48:28 +03:00
Pavel Tikhomirov
c3abb6999d restore: print error messages on all error-paths in restore_root_task
On restore we once had:

Run criu restore
Fail RESTORE failed: Error(0): Unknown
=[log]=> dump/zdtm/static/env00/85/1/restore.log
------------------------ grep Error ------------------------
(00.122659) Running setup-namespaces scripts
(00.122748)      1: cg: Cgroups 1 inherited from parent
(00.122765)      1: uns: calling usernsd_recv_transport (0, 0)
(00.123140) uns: daemon calls 0x48d8b0 (106, 1, 0)
(00.145095) Error (criu/mount.c:3175): mnt: Can't remove the directory
	/tmp/.criu.mntns.SIOGLA: No such file or directory
(00.145124) uns: calling exit_usernsd (-1, 1)
(00.145174) uns: daemon calls 0x48d880 (104, -1, 1)
(00.145181) uns: `- daemon exits w/ 0
(00.149591) uns: daemon stopped
(00.149607) Error (criu/cr-restore.c:2498): Restoring FAILED.
------------------------ ERROR OVER ------------------------

Only "Restoring FAILED." error, so no clue to what's really going on.

Printing errors for these "silent" errorpaths can help resolve issues in
future:

restore_root_task
	-> fork_with_pid -> open_core -> pb_read_one -> do_pb_read_one
		-> xmalloc
	-> prepare_userns -> write_id_map -> get_service_fd
	-> restore_wait_inprogress_tasks
		->__restore_wait_inprogress_tasks -> futex_get
	-> restore_switch_stage -> restore_wait_other_tasks
		-> __restore_wait_inprogress_tasks -> futex_get
	-> move_veth_to_bridge -> external_for_each_type -> move_to_bridge
		-> external_val
	-> fault_injected -> __fault_injected
	-> depopulate_roots_yard -> try_clean_remaps -> clean_one_remap
		-> rst_get_mnt_root -> lookup_mnt_id -> __lookup_mnt_id

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2018-03-02 22:48:28 +03:00