Otherwise the logs show an unbalanced number of
"Reverted working dir" messages:
| (00.018604) 36: unix: Connected 0x11ceff -> 0x11cf00 (0x11cf00) flags 0
| (00.018644) 36: unix: Reverted working dir
| (00.018652) 36: unix: Connected 0x11cefd -> 0x11cefe (0x11cefe) flags 0
| (00.018665) 36: unix: Reverted working dir
| (00.018688) 36: unix: Reverted working dir
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
To unify the style of fetching pointers from the memory slab.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
For readability's sake. Also use standard uint8_t types.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
On ppc64/aarch64, Linux can be configured to use large pages, so
PAGE_SIZE isn't a build-time constant anymore. Define it through
_SC_PAGESIZE.
There are different sizes for a page on ppc64:
: #if defined(CONFIG_PPC_256K_PAGES)
: #define PAGE_SHIFT 18
: #elif defined(CONFIG_PPC_64K_PAGES)
: #define PAGE_SHIFT 16
: #elif defined(CONFIG_PPC_16K_PAGES)
: #define PAGE_SHIFT 14
: #else
: #define PAGE_SHIFT 12
: #endif
And on aarch64 there are default sizes, and it is possible to set a
custom PAGE_SHIFT:
: config ARM64_PAGE_SHIFT
: int
: default 16 if ARM64_64K_PAGES
: default 14 if ARM64_16K_PAGES
: default 12
The downside: each time we need PAGE_SIZE, we make a libc
function call on aarch64/ppc64.
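A minimal sketch of the run-time lookup through _SC_PAGESIZE described
above. The helper name runtime_page_size() is hypothetical and is not
the definition used in the tree; only sysconf() and _SC_PAGESIZE are
standard POSIX.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask libc for the page size at run time instead
 * of relying on a build-time PAGE_SHIFT.
 */
static unsigned long runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	/* sysconf() returns -1 on error; a valid page size is positive. */
	assert(sz > 0);
	return (unsigned long)sz;
}
```

Every call goes through sysconf(), which is the cost mentioned above.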
Fixes: #415
Tested-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
PAGE_SIZE will be a variable value on platforms where it can be
different due to large pages.
And it looks like (c) there is no reason for BUF_SIZE == PAGE_SIZE,
so let's keep it as it was rather than complicating it with dynamic
allocation for the buffer.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The value is the same, but since PAGE_SIZE can differ on the same
platform, it is no longer a static value.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The kernel prints the personality value like this:
static int proc_pid_personality(/* .. */)
{
	int err = lock_trace(task);
	if (!err) {
		seq_printf(m, "%08x\n", task->personality);
		unlock_trace(task);
	}
	return err;
}
So, we don't need a whole page to read the value.
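To make the size argument concrete: "%08x\n" yields 8 hex digits and a
newline, so 10 bytes including the terminating NUL are enough for a
32-bit value. The sketch below only illustrates that arithmetic; the
names PERSONALITY_BUF_LEN, format_personality() and parse_personality()
are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* 8 hex digits + '\n' + terminating NUL, matching "%08x\n". */
#define PERSONALITY_BUF_LEN 10

/* Hypothetical helper: render a value the way the kernel snippet does. */
static int format_personality(char buf[PERSONALITY_BUF_LEN], unsigned int value)
{
	int n = snprintf(buf, PERSONALITY_BUF_LEN, "%08x\n", value);

	/* A 32-bit value always takes 9 characters before the NUL. */
	assert(n == PERSONALITY_BUF_LEN - 1);
	return n;
}

/* Hypothetical helper: parse the text back; returns 0 on success. */
static int parse_personality(const char *buf, unsigned int *value)
{
	return sscanf(buf, "%x", value) == 1 ? 0 : -1;
}
```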
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Architectures like aarch64/ppc64 need the page size propagated into
the PIEs. For the parasite, the page size is set during seizing, and
for the restorer, during early initialization.
Afterwards we can use PAGE_SIZE in PIEs like we did before.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The macro is used only in aio_estimate_nr_reqs():
unsigned int k_max_reqs = NR_IOEVENTS_IN_NPAGES(size/PAGE_SIZE);
The compiler may evaluate this as (((PAGE_SIZE*size)/PAGE_SIZE) - ...).
That works as long as PAGE_SIZE is long.
The patch set converts PAGE_SIZE to use sysconf(), giving it an
(unsigned), non-long type and making the aio macro overflow.
I do not see any value in making PAGE_SIZE (unsigned long) typed.
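A small sketch of the overflow itself, assuming the multiply-then-divide
order shown above. It does not reproduce the real macro: the function
names are hypothetical, and fixed-width types stand in for
(unsigned) and (long) so the result does not depend on the platform.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit arithmetic: the product wraps before the division happens. */
static uint32_t mul_then_div_u32(uint32_t page_size, uint32_t size)
{
	uint32_t product;

	assert(page_size != 0);
	product = page_size * size; /* wraps modulo 2^32 */
	return product / page_size;
}

/* 64-bit arithmetic: the same expression keeps the full product. */
static uint64_t mul_then_div_u64(uint64_t page_size, uint32_t size)
{
	assert(page_size != 0);
	return (page_size * size) / page_size;
}
```

With a 4096-byte page and a size of 2 MiB, the 32-bit product is 2^33,
which wraps to 0, while the 64-bit variant returns the original size.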
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It's actually the number of bytes spliced, not pages.
And I bet (unsigned long) suits the purpose more than (int).
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It's unused since commit fd3f33f5d2 ("headers: image.h -- Drop unused
entries"), so let's remove it completely.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Otherwise, test dependencies are not considered during the build.
Add an error in Makefile.inc so this won't happen again.
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This reverts commit dcafa78b96.
I've found that we already include deps in Makefile.inc;
I was too fast on the first attempt and overlooked this.
(The include just doesn't work like it should yet.)
The original patch may simply be dropped before preparing master's merge.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
CID 190174 (#1 of 1): Argument cannot be negative (NEGATIVE_RETURNS)
6. negative_returns: fd is passed to a parameter that cannot be negative.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
>>> CID 190177: Integer handling issues (NEGATIVE_RETURNS)
>>> rpc_sk is passed to a parameter that cannot be negative.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
CID 84654 (#1 of 1): Resource leak (RESOURCE_LEAK)
6. leaked_handle: Handle variable fd going out of scope leaks the handle.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The purpose of the num_children field in 'struct lazy_pages_info' was to
prevent closing the page-read while there are still active processes that
share it. It did work for the case when handling of the child processes
finished before the parent process. However, if the parent lpi is closed
first, we've got a dangling pointer at lpi->parent.
The obvious solution is to use simple reference counting.
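A minimal sketch of such reference counting, assuming a child takes a
reference on its parent so the parent outlives it. The struct and
function names (lpi_sketch, lpi_new, lpi_get, lpi_put) and the
live_objects counter are hypothetical and are not the fields or helpers
of the real 'struct lazy_pages_info'.

```c
#include <assert.h>
#include <stdlib.h>

struct lpi_sketch {
	struct lpi_sketch *parent;
	int ref_cnt;
};

/* Only for illustration: how many objects are currently allocated. */
static int live_objects;

static struct lpi_sketch *lpi_get(struct lpi_sketch *lpi)
{
	assert(lpi->ref_cnt > 0);
	lpi->ref_cnt++;
	return lpi;
}

/* Drop one reference; free the object and release its parent at zero. */
static void lpi_put(struct lpi_sketch *lpi)
{
	while (lpi && --lpi->ref_cnt == 0) {
		struct lpi_sketch *parent = lpi->parent;

		free(lpi);
		live_objects--;
		lpi = parent;
	}
}

/* A new object starts with one reference and pins its parent. */
static struct lpi_sketch *lpi_new(struct lpi_sketch *parent)
{
	struct lpi_sketch *lpi = calloc(1, sizeof(*lpi));

	if (!lpi)
		return NULL;
	lpi->ref_cnt = 1;
	lpi->parent = parent ? lpi_get(parent) : NULL;
	live_objects++;
	return lpi;
}
```

In this sketch, closing the parent first only drops its own reference;
the memory stays valid until the last child is put as well.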
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It is possible that the notification about restore finish arrives at the
same time as a fork event. In that case we return to epoll_run_rfds without
resetting the poll_timeout, and then we keep polling for events
indefinitely. To avoid this, we reset the poll_timeout to 0 as soon as we
know that restore is finished.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The only use for the userfault file descriptor after the process has exited
is for debug logs. Using a negative value instead of 0 makes logs more
readable.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Although they are the same, xfree() looks more consistent with the other code.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
There are a lot of lines that are longer than 79 characters:
(04.331172)pie: 1: Error (criu/pie/restorer.c:460): seccomp: Unexpected tid ->
(04.331172)pie: 1: 1 != 1
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
kerndat_try_load_cache() fills kdat from /run/criu.kdat,
so kdat will contain garbage if criu.kdat isn't compatible with the
current version of criu.
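One way to avoid leaving stale bytes behind is to clear the structure
whenever the cached data is rejected. The sketch below illustrates that
idea only. The struct layout, the magic value and the function name are
hypothetical and do not reflect the real kdat format or the actual fix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KDAT_SKETCH_MAGIC 0x4b444154u /* hypothetical value */

struct kdat_sketch {
	uint32_t magic;
	uint32_t has_feature;
};

/*
 * Hypothetical loader: returns 0 if the cache was accepted, 1 if it was
 * rejected. On rejection *kdat is zeroed so no cached bytes survive.
 */
static int load_cache_sketch(struct kdat_sketch *kdat, const void *raw, size_t len)
{
	assert(kdat != NULL);

	if (len == sizeof(*kdat)) {
		memcpy(kdat, raw, sizeof(*kdat));
		if (kdat->magic == KDAT_SKETCH_MAGIC)
			return 0;
	}

	/* Incompatible cache: fall back to a clean state. */
	memset(kdat, 0, sizeof(*kdat));
	return 1;
}
```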
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This is a test that triggers a bug with ghost files, which was resolved in
the patch "Don't fail if ghost file has no parent dirs".
Signed-off-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This test creates a pty pair, creates a test process and sets the slave
pty as its controlling terminal.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Here is a common mistake:
int funcX()
{
	int ret;

	ret = funcA();
	if (ret < 0)
		goto err;
	if (smth)
		goto err; // return 0 !!!!
err:
	return ret;
}
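For comparison, a sketch of the same shape with one possible fix: start
from an error value and set success only at the very end. The message
does not say which fix was applied, so this is an assumption; func_a(),
func_x_buggy(), func_x_fixed() and func_x_check() are hypothetical
stand-ins.

```c
#include <assert.h>

/* Stand-in for funcA(): fails on request. */
static int func_a(int fail)
{
	return fail ? -1 : 0;
}

/* The pattern shown above: the second goto leaves ret at 0. */
static int func_x_buggy(int a_fails, int smth)
{
	int ret;

	ret = func_a(a_fails);
	if (ret < 0)
		goto err;
	if (smth)
		goto err; /* ret is still 0 here */
err:
	return ret;
}

/* One possible fix: assume failure until every step has passed. */
static int func_x_fixed(int a_fails, int smth)
{
	int ret = -1;

	if (func_a(a_fails) < 0)
		goto err;
	if (smth)
		goto err;
	ret = 0;
err:
	return ret;
}

/* Usage: only the fixed variant reports the smth case as an error. */
static void func_x_check(void)
{
	assert(func_x_buggy(0, 1) == 0);
	assert(func_x_fixed(0, 1) == -1);
	assert(func_x_fixed(0, 0) == 0);
	assert(func_x_fixed(1, 0) == -1);
}
```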
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
There is no reason to keep this cleanup code in the generic
restore_one_task(), so let's move it for better readability.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
When dumping a process with a large number of open files,
dump_task_files_seized() processes the fds in batches. If
dump_one_file() results in an error, processing of the current batch is
stopped but the next batch (if any) will still be fetched and the error
value is overwritten. The result is a corrupt dump image (the fdinfo
file is missing a bunch of fds) which results in restore failure.
Also close all received fds after an error (previously the skipped ones
were left open).
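The intended behaviour can be sketched as follows: stop fetching batches
once an error is seen, keep the first error value, and still close every
fd of the batch that was already received. This is not the code of
dump_task_files_seized(); the batch size, the callbacks and all names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SKETCH 4 /* hypothetical batch size */

/* Illustration-only counters and callbacks. */
static int dumped_count;
static int closed_count;

static int fake_dump_one(int fd)
{
	dumped_count++;
	return fd == 5 ? -1 : 0; /* pretend fd 5 cannot be dumped */
}

static void fake_close_one(int fd)
{
	(void)fd;
	closed_count++;
}

static int dump_in_batches(const int *fds, size_t nr,
			   int (*dump_one)(int), void (*close_one)(int))
{
	int ret = 0;
	size_t off, i;

	assert(fds != NULL || nr == 0);

	/* "ret == 0" in the condition: no new batch after an error. */
	for (off = 0; ret == 0 && off < nr; off += BATCH_SKETCH) {
		size_t n = nr - off < BATCH_SKETCH ? nr - off : BATCH_SKETCH;

		for (i = 0; i < n; i++) {
			/* Skip dumping after an error, but keep the error. */
			if (ret == 0)
				ret = dump_one(fds[off + i]);
			/* Every received fd is closed, dumped or not. */
			close_one(fds[off + i]);
		}
	}
	return ret;
}
```

With ten fds and a failure on the fifth, two batches are received and
closed, and the third batch is never fetched.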
Signed-off-by: Andy Tucker <agtucker@google.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
log_init() opens a log file. We want to have criu and kernel versions in
each log file, so it looks reasonable to print them from log_init().
Without this patch, the versions are not printed if criu is called in
swrk mode.
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
We have three code paths where we call log_set_loglevel(), and in all
of them we need to initialize the libraries' logging engines, so it is
better to do this from one function. For example, we currently forget to
initialize soccr logging for criu swrk.
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
iptables creates /run/xtables.lock file and
we want to have it per-test.
(00.332159) 1: Running iptables-restore -w for iptables-restore -w
Fatal: can't open lock file /run/xtables.lock: Permission denied
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Starting with iptables 1.6.2, we have to use the --wait option,
but it doesn't work properly with userns, because in this case,
we don't have enough rights to open /run/xtables.lock.
(00.174703) 1: Running iptables-restore -w for iptables-restore -w
Fatal: can't open lock file /run/xtables.lock: Permission denied
(00.192058) 1: Error (criu/util.c:842): exited, status=4
(00.192080) 1: Error (criu/net.c:1738): iptables-restore -w failed
(00.192088) 1: Error (criu/net.c:2389): Can't create net_ns
(00.192131) 1: Error (criu/util.c:1567): Can't wait or bad status: errno=0, status=65280
This patch works around this problem by mounting tmpfs into /run.
Net namespaces are restored in a separate process, so we can create a
new mount namespace and create new mounts.
https://github.com/checkpoint-restore/criu/issues/469
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
iptables 1.6.2 fails without /run because it tries to create
the /run/xtables.lock file.
Test output: ================================
Fatal: can't open lock file /run/xtables.lock: No such file or directory
23:29:06.098: 4: ERR: netns-nf.c:21: Can't set input rule (errno = 2 (No such file or directory))
23:29:06.098: 3: ERR: test.c:315: Test exited unexpectedly with code 255
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>