When dumping a process with a large number of open files,
dump_task_files_seized() processes the fds in batches. If
dump_one_file() results in an error, processing of the current batch is
stopped but the next batch (if any) will still be fetched and the error
value is overwritten. The result is a corrupt dump image (the fdinfo
file is missing a bunch of fds) which results in restore failure.
Also close all received fds after an error (previously the skipped ones
were left open).
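A minimal sketch of the intended control flow, not the actual CRIU code:
the batch size, the callback types and the fake_* helpers are all
hypothetical stand-ins. The first error stops further batches from being
fetched, the error value is preserved, and every fd of the batch already
received is still closed.

```c
#include <stddef.h>

#define BATCH_SIZE 4 /* hypothetical batch size, for illustration only */

typedef int (*dump_fn)(int fd);
typedef void (*close_fn)(int fd);

/* Dump fds batch by batch; keep the first error and stop fetching batches. */
int dump_fds_in_batches(const int *fds, size_t nr, dump_fn dump_one,
			close_fn close_one)
{
	int ret = 0;
	size_t off, i;

	for (off = 0; ret == 0 && off < nr; off += BATCH_SIZE) {
		size_t n = nr - off < BATCH_SIZE ? nr - off : BATCH_SIZE;

		for (i = 0; i < n; i++) {
			/* After an error, skip dumping but still close. */
			if (ret == 0)
				ret = dump_one(fds[off + i]);
			close_one(fds[off + i]);
		}
	}

	return ret;
}

/* Hypothetical stubs used only to exercise the sketch. */
int nr_dumped, nr_closed;

int fake_dump(int fd)
{
	nr_dumped++;
	return fd == 5 ? -1 : 0; /* pretend that dumping fd 5 fails */
}

void fake_close(int fd)
{
	(void)fd;
	nr_closed++;
}
```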
Signed-off-by: Andy Tucker <agtucker@google.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
lazy-pages is currently broken. To avoid false positives in Travis
because of that, temporarily disable lazy-pages testing.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
log_init() opens a log file. We want to have criu and kernel versions in
each log file, so it looks reasonable to print them from log_init().
Without this patch, versions are not printed if criu is called in
swrk mode.
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
We have three code paths where we call log_set_loglevel(), and in all
of these places we need to initialize the libraries' logging engines, so
it is better to do this from one function. For example, currently we
forget to initialize soccr logging for criu swrk.
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This test creates a pty pair, creates a test process and sets the slave
pty as its controlling terminal.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This is a test that triggers a bug with ghost files, which was resolved
in the patch "Don't fail if ghost file has no parent dirs".
Signed-off-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
In the current model we do not start the background page transfer until
POLL_TIMEOUT has elapsed since the last uffd or socket event. If the
restored process accesses memory once every (POLL_TIMEOUT - epsilon),
filling its memory can take ages.
This patch changes the model in the following way:
* poll for the events indefinitely until the restore is complete
* the restore completion event resets the poll timeout to zero and
  starts the background transfers
* after each transfer we return to check if there are any uffd events to
  handle
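The model above can be sketched roughly as below. This is only an
illustration under stated assumptions: struct lazy_daemon and both helper
functions are hypothetical names, not the daemon's real data structures.

```c
#include <stdbool.h>

#define POLL_FOREVER (-1) /* epoll-style "block indefinitely" timeout */

/* Hypothetical daemon state, for illustration only. */
struct lazy_daemon {
	bool restore_finished; /* set by the restore completion event */
	int pages_left;	       /* pages not yet transferred */
};

/* Block indefinitely until restore completes, then never block. */
int next_poll_timeout(const struct lazy_daemon *d)
{
	return d->restore_finished ? 0 : POLL_FOREVER;
}

/*
 * One background transfer is done only when the poll returned no
 * events; afterwards the caller polls again, so pending uffd events
 * are handled before the next chunk is sent.
 */
bool want_background_transfer(const struct lazy_daemon *d, int nr_events)
{
	return d->restore_finished && nr_events == 0 && d->pages_left > 0;
}
```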
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Currently, once we get to transfer pages in the "background", we try to
fetch the entire IOV at once. For large IOVs this may impact #PF latency
for #PF events that occur during the transfer.
Let's add a simple heuristic for controlling the size of the background
transfers. Initially, the transfer will be limited to some default value.
Every time we transfer a chunk we increase the transfer size until it
reaches a pre-defined maximal size. A page fault event resets the
background transfer size to its initial value.
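A rough sketch of such a heuristic follows. The constants, the struct and
the function names are hypothetical, and doubling the size after each
chunk is an assumption; the text above only says that the size increases
up to a maximum.

```c
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration. */
#define XFER_LEN_INITIAL (64UL << 10) /* 64 KiB */
#define XFER_LEN_MAX	 (4UL << 20)  /* 4 MiB */

struct xfer_state {
	size_t len; /* size of the next background transfer */
};

/* Called at start-up and on every page fault event. */
void xfer_reset(struct xfer_state *s)
{
	s->len = XFER_LEN_INITIAL;
}

/* Return the size of the next chunk and grow the limit for the one after. */
size_t xfer_next_len(struct xfer_state *s, size_t remaining)
{
	size_t len = s->len < remaining ? s->len : remaining;

	if (s->len < XFER_LEN_MAX) {
		s->len *= 2; /* assumed growth policy */
		if (s->len > XFER_LEN_MAX)
			s->len = XFER_LEN_MAX;
	}

	return len;
}
```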
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The complete_forks function presumes that it always has work to do
because we assume that a fork event is the only case when we drop out of
epoll_run_rfds with a positive return value.
Teach complete_forks to bail out when there are no pending forks to
process, to allow exiting epoll_run_rfds for different reasons.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
First check if there are pages we need to transfer and only afterwards
check if there are outstanding requests. Also, instead of checking 'bool
remaining' to see if there is more work to do, we can simply check if all
the lpi's have already been serviced.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The intention is to use this function for transferring all the pages that
didn't cause a #PF.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The function picks the next page range to transfer anyway; it just does
so in a very simple FIFO manner.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
We already have a queue for the requested memory ranges which contains
'lp_req' objects. These objects hold the same information as the lazy_iov:
start address of the range, end address and the address that the range had
at the dump time.
Rather than keep this information twice and use double bookkeeping, we can
extract the requested range from lpi->iovs and move it to lpi->reqs.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Instead of relying on the length of various lists, add a boolean variable
to lazy_pages_info to make it clear when the process has exited.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Currently zdtm doesn't detect a failed restore if it is executed
with strace. With this patch, fake-restore.sh creates a test file, and
zdtm is able to tell when the restore has failed.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The get() method requires a key and now we are using an index. That
will never work correctly as it is now.
Acked-by: Adrian Reber <adrian@lisas.de>
Reported-by: Adrian Reber <adrian@lisas.de>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Currently we restore all sockets in the root mount namespace, because we
were not able to get any information about a mount point where a socket
is bound. It is obviously incorrect in some cases.
In the 4.10 kernel, we added the SIOCUNIXFILE ioctl for unix sockets. This
ioctl opens a file to which a socket is bound and returns a file
descriptor.
This new ioctl allows us to get mnt_id by reading fdinfo, and mnt_id
is enough to find a proper mount point and a mount namespace.
The logic of this patch is straightforward. On dump, we save mnt_id for
sockets, on restore we find a mount namespace by mnt_id and restore this
socket in its mount namespace.
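As an illustration of the dump-side lookup, the sketch below reads mnt_id
from fdinfo. The helper names are hypothetical and this is not the code
from the patch; the fd passed to fd_mnt_id() is assumed to be the one
returned by the SIOCUNIXFILE ioctl on the socket.

```c
#include <stdio.h>
#include <string.h>

/* Return the value of the "mnt_id:" line of fdinfo text, or -1. */
int parse_mnt_id(const char *fdinfo)
{
	const char *line = fdinfo;

	while (line && *line) {
		int id;

		/* The blank in the format also matches the tab in fdinfo. */
		if (sscanf(line, "mnt_id: %d", &id) == 1)
			return id;

		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}

/*
 * Look up mnt_id for an open fd via /proc/self/fdinfo/<fd>. For a unix
 * socket, the fd is assumed to come from ioctl(sk, SIOCUNIXFILE).
 */
int fd_mnt_id(int fd)
{
	char path[64], buf[1024];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
	f = fopen(path, "r");
	if (!f)
		return -1;

	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	return parse_mnt_id(buf);
}
```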
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
unix_process_name() is called when sockets are being collected,
but at this moment we don't have socket descriptors.
A socket descriptor is required to get mnt_id, which allows resolving
a socket path in its mount namespace.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This ioctl opens a file to which a socket is bound and
returns a file descriptor. This file descriptor can be used to get
mnt_id and a file path.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The USK_CALLBACK flag means that a socket is external and will be
restored by a plugin. open_unixsk_standalone should not be called for
these sockets.
$ make -C test/others/unix-callback/ run
...
(00.109338) 7471: sk unix: Opening standalone socket (id 0xd ino 0 peer 0x63b)
(00.109376) 7471: Error (criu/sk-unix.c:1128): sk unix: BUG at criu/sk-unix.c:1128
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Unix file sockets have to be restored in proper mount namespaces.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
When walking over unix sockets make sure the
queuer is present before accessing it.
https://jira.sw.ru/browse/PSBM-82796
Reported-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
It is not used, probably was committed by mistake.
Fixes: 2d093a170227 ("travis: add a job to test on the fedora rawhide")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In Ubuntu Bionic for armhf, clang is compiled for armv8l rather than
armv7l (as it was and still is for gcc) and so it uses armv8 by default.
This breaks compilation of tests using smp_mb():
> error: instruction requires: data-barriers
The fix is to add "-march=armv7-a" to CFLAGS which we already do,
except not for the tests.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The newly introduced output of the CRIU and kernel version does not
happen when running CRIU under RPC. This moves the print_versions()
function to util.c and calls it from cr-service.c.
Signed-off-by: Adrian Reber <areber@redhat.com>
This patch makes restore_one_inotify() request a specific
watch descriptor number, if the system supports it, instead of
iterating in a (possibly) long-running loop.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
tun test in nested net ns wrapper.
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
ktkhai: Makefile hunks
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
This adds a new tunfile_entry::ns_id field and populates
it in the standard socket way. Restore uses this ns_id
to choose the correct namespace. Note, we could completely
skip set_netns() on restore in case of !has_ns_id, but
using top_net_ns gives some definite behaviour.
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
ktkhai: comment written/code movings
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Sometimes we see errors like this:
criu/cr-restore.gcda:Merge mismatch for function 106
It probably means that this gcda file was corrupted. According to the
gcc man page, the -fprofile-update=atomic option should fix this problem.
v2: this option appeared in gcc 7, so we need to install it.
Reported-by: Mr Travis CI
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
LAZY_PAGES_SK_OF is needed only once for every process,
and it's not frequently used, so we can place it
in fdstore.
https://travis-ci.org/tkhai/criu/builds/343405755
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>