As a preparation for __must_check on compel_syscall(), check it on
close() too - maybe not as useful as with other syscalls, but why not.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Also, don't use the magic -2 => return errno on failure.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
From man ptrace:
> On error, all requests return -1, and errno is set appropriately.
> Since the value returned by a successful PTRACE_PEEK* request may be
> -1, the caller must clear errno before the call, and then check
> it afterward to determine whether or not an error occurred.
FWIW: if ptrace_peek_area() is called with (errno != 0) it may
false-fail if the data is (-1).
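A minimal sketch of the pattern the man page asks for, assuming a
hypothetical helper peek_word() (this is not the actual body of
ptrace_peek_area()): errno is cleared before the call, and -1 is treated
as an error only if errno was set by it.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not CRIU code: read one word from a traced task.
 * Returns 0 and stores the word in *out on success, -1 on failure with
 * errno set by ptrace().
 */
static int peek_word(pid_t pid, void *addr, long *out)
{
	long word;

	/* A stale errno from an earlier call must not be mistaken for ours. */
	errno = 0;
	word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
	if (word == -1 && errno != 0)
		return -1;

	/* The data itself may legitimately be -1. */
	*out = word;
	return 0;
}
```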
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Now that Travis also supports ppc64le and s390x we can remove all qemu
based docker emulation from our test setup. This now runs ppc64le and
s390x tests on real hardware (LXD containers).
Signed-off-by: Adrian Reber <areber@redhat.com>
This switches all arm related tests (32bit and 64bit) to the aarch64
systems Travis provides. For arm32 we are running in an armv7hf container
on aarch64 with 'setarch linux32'.
The main change is that docker on Travis aarch64 cannot use
'--privileged' as Travis is using unprivileged LXD containers to set up
the testing environment.
Signed-off-by: Adrian Reber <areber@redhat.com>
For CRIU's compile only tests for armv7hf on Travis we are using
'setarch linux32' which returns armv8l on Travis aarch64.
This adds a path in the Makefile to treat armv8l just as armv7hf during
compile. This enables us to run armv7hf compile tests on Travis aarch64
hardware. Much faster. Maybe not entirely correct, but probably good
enough for compile testing in an armv7hf container.
Signed-off-by: Adrian Reber <areber@redhat.com>
Travis uses unprivileged containers for aarch64 in LXD. Docker with
'--privileged' fails in such a situation. This changes the travis setup
to only start docker with '--privileged' if running on x86_64.
Signed-off-by: Adrian Reber <areber@redhat.com>
In my previous commit I copied a line with a return into the main script
body. bash can only return from functions. This changes return to exit.
Signed-off-by: Adrian Reber <areber@redhat.com>
Add mnt_subtree_next DFS-next search to remove recursion.
v5: add this patch, remove recursion from sorting helpers
v6: rip out beautiful yet unused step-part of dfs-next algorithm
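A rough sketch of how a DFS-next step can replace recursion. The struct
node layout and the name subtree_next() are assumptions made for
illustration only; CRIU's mount_info and mnt_subtree_next differ.

```c
#include <stddef.h>

/* Hypothetical tree node; CRIU's struct mount_info has a different layout. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
	int id;
};

/*
 * Return the node that follows 'cur' in a pre-order DFS of the subtree
 * rooted at 'root', or NULL when the traversal is finished.
 */
static struct node *subtree_next(struct node *cur, struct node *root)
{
	/* Descend first. */
	if (cur->first_child)
		return cur->first_child;

	/* Otherwise climb until a node with an unvisited sibling is found. */
	while (cur != root) {
		if (cur->next_sibling)
			return cur->next_sibling;
		cur = cur->parent;
	}
	return NULL;
}
```

A caller would then walk a subtree with a plain loop such as
for (n = root; n; n = subtree_next(n, root)), never leaving the subtree
of root.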
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Build each mntns mount tree alone just after reading mounts for it from
image. This additional step before merging everything to a single mount
tree allows us to have pointers to each mntns root mount at hand, also
it allows us to remove extra complication from mnt_build_tree.
Teach collect_mnt_from_image to return a tail pointer, so we can merge
lists together later after building each tree.
Add separate merge_mount_trees helper to create joint mount tree for all
mntns'es and simplify mnt_build_ids_tree.
I don't see any place where we use mntinfo_tree on restore, so save the
real root of mntns mounts tree in it, instead of root_yard_mp; we will
need it in the next patches for checking restore of these trees.
v2: prepend children to the root_yard in merge_mount_trees so that the
order in merged tree persists
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Images for mount and net namespaces are empty if the ns does not belong
to us, thus we don't need to collect them on restore.
By adding these checks we will eliminate suspicious messages in logs
about lack of images:
./test/zdtm.py run -k always -f h -t zdtm/static/env00
env00/54/2/restore.log:(00.000332) No mountpoints-5.img image
env00/54/2/restore.log:(00.000342) No netns-2.img image
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The path:
  restore_root_task
    prepare_namespace_before_tasks
      mntns_maybe_create_roots
is always called before the path below:
  restore_root_task
    fork_with_pid
      restore_task_with_children
        prepare_namespace
          prepare_mnt_ns
            populate_mnt_ns
So (!!mnt_roots) == (root_ns_mask & CLONE_NEWNS) in populate_mnt_ns, but
in prepare_mnt_ns we've already checked that it is true, so there is no
need for this check - remove it.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
It seems pretty stable and hasn't added many false positives during the
last months, while it can reveal some issues in compatible C/R code.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Update test to support both iptables and nft to create conntrack rules.
Signed-off-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
With the newly introduced aarch64 at Travis it is possible for the CRIU
test-cases to switch to aarch64.
Travis uses unprivileged LXD containers on aarch64 which blocks many of
the kernel interfaces CRIU needs. So for now this only tests building
CRIU natively on aarch64 instead of using the Docker+QEMU combination.
All tests based on Docker are not working on aarch64 as there currently
seems to be a problem with Docker on aarch64. Maybe because of the
nesting of Docker in LXD.
Signed-off-by: Adrian Reber <areber@redhat.com>
Signal masks propagate through execve, so we need to clear them before
invoking the action scripts as they may want to handle SIGCHLD or SIGSEGV.
Signed-off-by: Nicolas Viennot <nicolas.viennot@twosigma.com>
I don't see many issues with early-log, so we probably don't
need the warning when it was used. Note that since
commit 74731d9 ("zdtm: make grep_errors also grep warnings")
zdtm.py also greps warnings (and I believe that was
an improvement), which prints some bothersome lines:
> =[log]=> dump/zdtm/static/inotify00/38/1/dump.log
> ------------------------ grep Error ------------------------
> (00.000000) Will allow link remaps on FS
> (00.000034) Warn (criu/log.c:203): The early log isn't empty
> ------------------------ ERROR OVER ------------------------
Instead of decreasing loglevel of the message, improve it by
reporting a real issue.
Cc: Adrian Reber <adrian@lisas.de>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
PATH is pointing to an incorrect location for the `criu` executable,
causing libcriu tests to fail when running in travis.
Also add statements to display log file contents on failure
to help in debugging.
Signed-off-by: Ashutosh Mehra <asmehra1@in.ibm.com>
libcriu tests are currently broken. This patch fixes a couple of
issues to allow building and running the libcriu tests.
1. lib/c/criu.h got updated to include version.h which is present
at "criu/include", but the command to compile libcriu tests is not
specifying "criu/include" in the path to be searched for header
files. This resulted in compilation error.
This can be fixed by adding "-I ../../../../../criu/criu/include"
however it causes more problems as "criu/include/fcntl.h" would now
hide system defined fcntl.h
Solution is to use "-iquote ../../../../../criu/criu/include"
which applies only to the quote form of include directive.
2. Secondly, libcriu.so major version got updated to 2 but
libcriu/run.sh still assumes version 1. Instead of just updating the
version in libcriu/run.sh to 2, this patch updates the libcriu/Makefile
to use "CRIU_SO_VERSION_MAJOR" so that future changes to major version
of libcriu won't cause same problem again.
Signed-off-by: Ashutosh Mehra <asmehra1@in.ibm.com>
RPC messages have a fairly small size and using space on the stack
might be a better option. This change follows the pattern used with
do_pb_read_one() and pb_write_one().
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
The support for per-pid images with locks has been dropped with
commit d040219 ("locks: Drop support for per-pid images with locks")
and CR_FD_FILE_LOCKS_PID is not used.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
When performing pre-dump we continuously increase the page-pipe size to
fit the maximum number of memory pages in the pipe's buffer. However, we
never actually set the pipe's buffer size to max. By doing so, we can
reduce the number of pipes necessary for pre-dump and improve the
performance as shown in the example below.
For example, let's consider the following process:
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>

	int main(void)
	{
		int i = 0;
		/* 1 GiB of zeroed memory for pre-dump to transfer */
		void *cache = calloc(1, 1024 * 1024 * 1024);

		while (1) {
			printf("%d\n", i++);
			sleep(1);
		}
	}
stats-dump before this change:
frozen_time: 123538
memdump_time: 95344
memwrite_time: 11980078
pages_scanned: 262721
pages_written: 262169
page_pipes: 513
page_pipe_bufs: 519
stats-dump after this change:
frozen_time: 83287
memdump_time: 54587
memwrite_time: 12547466
pages_scanned: 262721
pages_written: 262169
page_pipes: 257
page_pipe_bufs: 263
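For illustration, growing a pipe to the system limit could look roughly
like the sketch below. grow_pipe_to_max() is a hypothetical helper, and
reading the limit from /proc/sys/fs/pipe-max-size is an assumption of
this sketch, not necessarily what the patch does.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not CRIU code: try to grow a pipe to the limit
 * from /proc/sys/fs/pipe-max-size. Returns the resulting pipe capacity
 * in bytes, or -1 if it cannot be queried.
 */
static long grow_pipe_to_max(int pipe_fd)
{
	unsigned int max_size = 0;
	FILE *f = fopen("/proc/sys/fs/pipe-max-size", "r");

	if (f) {
		if (fscanf(f, "%u", &max_size) != 1)
			max_size = 0;
		fclose(f);
	}

	/* Best effort: on failure the pipe simply keeps its current size. */
	if (max_size)
		(void)fcntl(pipe_fd, F_SETPIPE_SZ, (int)max_size);

	return fcntl(pipe_fd, F_GETPIPE_SZ);
}
```

A larger buffer lets each pipe hold more pages, which is consistent with
the lower page_pipes count reported above.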
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
The lock status string may be empty. This can happen when the owner of
the lock is invisible from our PID namespace. This unfortunate behavior
is fixed in kernels v4.19 and up (see commit 1cf8e5de40)
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Conflict register for file "sk-opts.proto": READ is already defined in
file "rpc.proto". Please fix the conflict by adding package name on the
proto file, or use different name for the duplication. Note: enum
values appear as siblings of the enum type instead of children of it.
https://github.com/checkpoint-restore/criu/issues/815
Signed-off-by: Andrei Vagin <avagin@gmail.com>
lib/c/criu.c:343:30: error: implicit conversion from enumeration type
'enum criu_pre_dump_mode' to different enumeration type 'CriuPreDumpMode'
(aka 'enum _CriuPreDumpMode') [-Werror,-Wenum-conversion]
opts->rpc->pre_dump_mode = mode;
~ ^~~~
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Pre-dump using the process_vm_readv syscall.
During frozen state, only iovecs will be
generated and draining of memory happens
after the task is unfrozen. Pre-dumping of
shared memory remains unmodified.
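A minimal sketch of a single process_vm_readv() call, assuming a
hypothetical wrapper read_remote(); the real pre-dump code batches many
iovecs and is not reproduced here.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h> /* getpid() for callers that read their own memory */

/*
 * Hypothetical wrapper, not CRIU code: copy 'len' bytes located at
 * 'remote_addr' in process 'pid' into 'buf'. Returns the number of bytes
 * copied, or -1 with errno set.
 */
static ssize_t read_remote(pid_t pid, void *remote_addr, void *buf, size_t len)
{
	struct iovec local = { .iov_base = buf, .iov_len = len };
	struct iovec remote = { .iov_base = remote_addr, .iov_len = len };

	/* The target does not need to be stopped or run any injected code. */
	return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}
```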
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Add the cnt_sub function (complement of cnt_add).
cnt_sub is used to decrement the stats counter
by the skipped page count during "read"
mode pre-dump.
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
"read" mode pre-dump may fail even after
adding PROT_READ flag. Adding PROT_READ
works when dumping statically. See added
comment for details.
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Skip iov-generation for regions not having
PROT_READ, since process_vm_readv syscall
can't process them during "read" pre-dump.
Handle random order of "read" & "splice"
pre-dumps.
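The filtering step can be sketched as below. struct region and
build_readable_iovs() are hypothetical names used only for this
illustration; CRIU tracks VMAs with its own structures.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/uio.h>

/* Hypothetical region descriptor, not a CRIU structure. */
struct region {
	void *start;
	size_t len;
	int prot; /* PROT_* bits */
};

/*
 * Fill 'iov' with one entry per readable region and return the number of
 * entries written. Regions without PROT_READ are skipped, since
 * process_vm_readv() can't process them.
 */
static size_t build_readable_iovs(const struct region *r, size_t nr,
				  struct iovec *iov, size_t max_iov)
{
	size_t i, used = 0;

	for (i = 0; i < nr && used < max_iov; i++) {
		if (!(r[i].prot & PROT_READ))
			continue;
		iov[used].iov_base = r[i].start;
		iov[used].iov_len = r[i].len;
		used++;
	}
	return used;
}
```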
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Two modes of pre-dump algorithm:
1) splicing memory by parasite
--pre-dump-mode=splice (default)
2) using process_vm_readv syscall
--pre-dump-mode=read
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
1) Instead of tampering with the nr argument, do_full_int80() returns
the value of the system call. It also avoids copying all registers back
into the syscall_args32 argument after the syscall.
2) Additionally, the registers r12-r15 were added in the list of
clobbers as kernels older than v4.4 do not preserve these.
3) Further, GCC uses a 128-byte red-zone as defined in the x86_64 ABI
optimizing away the correct position of the %rsp register in
leaf-functions. We now avoid tampering with the red-zone, fixing a
SIGSEGV when running mmap_bug_test() in debug mode (DEBUG=1).
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Right now, it is created from the pre-dump hook, but
if the --snap option is set, the test fails:
$ python test/zdtm.py run -t zdtm/static/cgroup_yard -f h --snap --iter 3
...
Running zdtm/static/cgroup_yard.hook(--pre-dump)
Traceback (most recent call last):
File zdtm/static/cgroup_yard.hook, line 14, in <module>
os.mkdir(yard)
OSError: [Errno 17] File exists: 'external_yard'
Cc: Michał Cłapiński <mclapinski@google.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Right now it is cleaned up from a post-restore hook,
but zdtm.py can be executed with the norst option:
$ zdtm.py run -t zdtm/static/cgroup_yard --norst
...
OSError: [Errno 17] File exists: 'external_yard'
Cc: Michał Cłapiński <mclapinski@google.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Prior to log initialisation, CRIU preserves all (early) log messages in
a buffer. In case of error the content of this buffer needs to be
printed out (flushed).
Suggested-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Before the 5.2 kernel, only fpu_state->fpu_state_64.xsave had to be
64-byte aligned. But starting with the 5.2 kernel, the same is required
for fpu_state->fpu_state_ia32.xsave.
The behavior was changed in:
c2ff9e9a3d9d ("x86/fpu: Merge the two code paths in __fpu__restore_sig()")
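One way to get such alignment for both areas is a type-level attribute,
as in the sketch below. The struct names and sizes are hypothetical
stand-ins, not CRIU's or the kernel's layout, and the attribute syntax
is GCC/Clang specific.

```c
#include <stdint.h>

/* Hypothetical stand-in for an XSAVE area; the size is illustrative. */
struct xsave_area {
	uint8_t data[832];
} __attribute__((aligned(64)));

/* Both members end up 64-byte aligned despite the odd-sized fields. */
struct fpu_frames {
	uint32_t header;
	struct xsave_area native;
	uint32_t pad;
	struct xsave_area ia32;
};

/* Return 1 if 'p' is aligned to a 64-byte boundary, 0 otherwise. */
static int is_aligned_64(const void *p)
{
	return ((uintptr_t)p & 63) == 0;
}
```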
Signed-off-by: Andrei Vagin <avagin@gmail.com>