So, this time we have quite a lot of new features for a monthly
release cadence, including --leave-stopped on restore, TMEM for
PPC and shmem changes tracking.
Also bugfixes, of course, and a few more deprecations.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
TL;DR: this allows checking whether printf argument types are valid.
Apparently, gcc is not able to check whether the printf arguments
are in sync with the format string if the string is not a literal.
This can be seen by compiling the code with -Wformat-nonliteral:
CC criu/netfilter.o
criu/netfilter.c: In function ‘nf_connection_switch_raw’:
criu/netfilter.c:80:4: error: format not a string literal, argument
types not checked [-Werror=format-nonliteral]
dip, (int)dst_port, sip, (int)src_port);
Unfortunately, we can't just add -Wformat-nonliteral to CFLAGS, as there
is at least one other place in the code that uses a non-literal string
as the format string for a printf-like function. In this very case, though,
there is no need to use a non-literal, so change it to a define.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Apparently when travis checks patches, it compiles code with
-Wformat-security (most probably because the distro/gcc it uses
has it on by default), but on my system (Fedora 24/gcc 6.1.1)
this flag is not on. As a result, code compiles fine for me
but travis reports an error.
Add -Wformat-security to default CFLAGS. It helps to catch
problems like using printf(str) instead of printf("%s", str).
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
It turned out that calling log_first_error() is possible w/o
calling log_keep_first_err(), so don't bug_on() on it, just
return NULL.
Reported-by: Adrian Reber <adrian@lisas.de>
Signed-off-by: Pavel Emelyanov <xemul@virtouzzo.com>
Reviewed-by: Adrian Reber <adrian@lisas.de>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Dmitry:
Thanks for the patch, it looks like it was part of commit 1c249d08870b
("x86: add 32-bit sigframe for rt_sigreturn") from criu-dev.
When I prepared the patch set, I tested the patches separately from
the set on x86, but had no opportunity to test them separately on ppc.
On x86, before the compat patches, it didn't matter when restore_gpregs()
was called, so I didn't catch that it does matter on ppc.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
1. Fix uninitialized use of pr in cr_dedup_one_pagemap and get_page:
https://github.com/xemul/criu/issues/178
2. In ud_open, close pr if an error is returned from find_vmas->
collect_uffd_pages, as we free lpi with lpi->pr still open; so we need to
check in lpi_fini that uffd is > 0 before closing it
v2: rebase to new criu-dev
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
An atexit hook is executed in forked processes too, but
clean_tests_root() has to be called only once.
v2: fix flake8 warnings
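A minimal sketch of one way to keep such a hook from running in forked
children: remember the pid that registered it and compare at exit. The
helper name register_cleanup_once and its owner_pid parameter are
hypothetical and only serve the illustration; this is not the zdtm.py code.

```python
import atexit
import os


def register_cleanup_once(cleanup, owner_pid=None):
    """Register `cleanup` so that only the owning process runs it at exit."""
    # Hypothetical helper: remember which process registered the hook.
    owner = os.getpid() if owner_pid is None else owner_pid

    def hook():
        # Forked children inherit the hook, but their pid differs from the owner's.
        if os.getpid() != owner:
            return False
        cleanup()
        return True

    atexit.register(hook)
    return hook
```

The hook is returned so that a caller can invoke or unregister it explicitly.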
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Make sure travis/jenkins complain if someone sends a patch
that makes the criu --help output violate the standard terminal
width requirement.
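A sketch of what such a check might look like, assuming the help text has
already been captured into a string. The function name overlong_lines and
the default of 80 columns are assumptions made for the illustration.

```python
def overlong_lines(help_text, width=80):
    """Return (line_number, line) pairs for lines wider than `width` columns."""
    bad = []
    for number, line in enumerate(help_text.splitlines(), start=1):
        # Expand tabs so the length is closer to what a terminal would show.
        if len(line.expandtabs()) > width:
            bad.append((number, line))
    return bad
```

A CI script could feed it the captured output of criu --help and fail the
build when the returned list is not empty.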
Cc: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Remove that weird special case from check_add_feature() function, making
it a separate pr_check_features(), which prints a SEP-separated list of
feature names, obeying the max WIDTH, and prepending each line with
OFFSET. That way, we have a decent --help output:
--feature FEAT only check a particular feature, one of:
mnt_id, mem_dirty_track, aio_remap, timerfd, tun,
userns, fdinfo_lock, seccomp_suspend,
seccomp_filters, loginuid, cgroupns, autofs
Alternatively, we could just drop the functionality of showing all the
individual features to check.
[v2: use %s in pr_msg to fix a -Wformat-security warning]
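The wrapping described above can be sketched as follows. This is a Python
illustration of the idea, not the C implementation; the function name
format_features is hypothetical, and sep, width and offset stand in for SEP,
WIDTH and OFFSET.

```python
def format_features(names, sep=", ", width=80, offset="    "):
    """Lay out `names` as sep-separated lines, each prefixed with `offset`."""
    lines = []
    current = offset
    for index, name in enumerate(names):
        # Every name except the last one is followed by the separator.
        piece = name + (sep if index < len(names) - 1 else "")
        # Start a new line when adding the piece would exceed the width.
        if current != offset and len((current + piece).rstrip()) > width:
            lines.append(current.rstrip())
            current = offset
        current += piece
    if current != offset:
        lines.append(current.rstrip())
    return lines
```

A name that is longer than the width by itself still gets a line of its own,
so the limit is only guaranteed when every single name fits.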
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
* Fix English and rephrase
* Mark with curly braces that PID and NS_FILE are exclusive options
* Fix 80 columns violations
* Remove usage examples
* Remove the "experimental feature" warning
* Add an empty line before "Check options" header
As for removals, I believe --help output is not the proper place for
examples or notices.
Was:
-J|--join-ns NS:PID|NS_FILE[,EXTRA_OPTS]
Join exist namespace and restore process in it.
Namespace can be specified in pid or file path format.
--join-ns net:12345 or --join-ns net:/foo/bar.
Extra_opts is optional, for now only user namespace support:
--join-ns user:PID,UID,GID to specify uid and gid.
Please NOTE: join-ns with user-namespace is not fully tested.
It may be dangerous to use this feature
Check options:
Now:
-J|--join-ns NS:{PID|NS_FILE}[,OPTIONS]
Join existing namespace and restore process in it.
Namespace can be specified as either pid or file path.
OPTIONS can be used to specify parameters for userns:
user:PID,UID,GID
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
There is no need to have it here.
Also, remove curly braces around {net} to avoid confusion.
Was:
--empty-ns {net}
Create a namespace, but don't restore its properies
(assuming it will be restored by action scripts)
Now:
--empty-ns net Create a namespace, but don't restore its properies
(assuming it will be restored by action scripts)
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
* Simplify phrases, removing duplicate words -- saving 1 line
* Drop <>, use UPPERCASE for variable parts as in other places
* Obey 80 columns
Was:
--inherit-fd fd[<num>]:<existing>
Inherit file descriptors. This allows to treat file desc
riptor
<num> as being already opened via <existing> one and ins
tead of
trying to open we inherit it:
tty[rdev:dev]
pipe[inode]
socket[inode]
file[mnt_id:inode]
Now:
--inherit-fd fd[NUM]:RES
Inherit file descriptors, treating fd NUM as being
already opened via an existing RES, which can be:
tty[rdev:dev]
pipe[inode]
socket[inode]
file[mnt_id:inode]
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
In general, we do not end the [last] sentence of an option description
with a period. In a few cases, we do that -- let's fix it.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
* fix a typo (descrition -> description)
* add a comma before "but"
* remove a period at the end of the sentence
Was:
--cgroup-props-file FILE
same as --cgroup-props but taking descrition
from the path specified.
Now:
--cgroup-props-file FILE
same as --cgroup-props, but taking description
from the path specified
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
* fix a typo (usig)
* slightly rephrased
* remove a period at the end
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Hopefully without losing any meaning, but now it fits in 80 cols
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Such a lengthy description is not quite suitable for --help output.
Also, it violates 80 columns and looks ugly as a result.
Fix both issues.
Was:
--skip-in-flight this option skips in-flight TCP connections.
if TCP connections are found which are not yet completely
established, criu will ignore these connections in favor
of erroring out.
Now:
--skip-in-flight skip (ignore) in-flight TCP connections
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
* add missing space between option and the argument
* mark argument as optional
* fix English
* obey 80 columns width
Was:
-x|--ext-unix-skinode,.. allow external unix connections (optionally can be assign socket's inode that allows one-sided dump)
Now:
-x|--ext-unix-sk [inode,...]
allow external unix connections (optional arguments
are socketpair inode(s) that allow one-sided dump)
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
It is somewhat hard to fully describe --cpu-cap in --help output,
but let's at least say that:
- option is used to either write or check capabilities;
- the argument is a comma-separated list;
- empty argument means "all".
Also, while saying it, contain ourselves within 80 columns of output.
The last item requires more work of course. I'm not sure about others,
but I often work in terminals which are 80 columns wide, and non-wrapped
output looks pretty ugly. I mean, we surely can be better than 'adp'.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
We need to allow read/write access for these directories to execute
tests in user namespaces. zdtm.py does this too, but it is racy if
we run a few tests in parallel.
------------------------ grep Error ------------------------
(00.748406) 5: Error (criu/files-reg.c:1487): File zdtm/static has bad mode 040777 (expect 040775)
(00.752027) 1: Error (criu/cr-restore.c:1132): 5 exited, status=1
(00.790562) Error (criu/cr-restore.c:1135): 88 killed by signal 9: Killed
(00.790623) Error (criu/cr-restore.c:2019): Restoring FAILED.
------------------------ ERROR OVER ------------------------
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Power 8 introduces transactional memory (TM) operations (see
Power ISA 3.0 for details).
Supporting transactional memory operations during checkpoint and
restart requires the extended ptrace API provided by kernel 4.8.
When checkpointing a thread while a transactional memory operation is
in progress, the TM checkpointed state is saved through the new
ptrace API. If these new APIs are not available, the checkpoint is
aborted and an explicit error is reported.
At restart time, the TM state is pushed onto the stack frame to be
reloaded by the kernel when reading the stack frame.
Only a suspended TM operation can be checkpointed, since an active one
is aborted once a system call is made. A suspended operation will
be aborted as well, and the checkpointed thread is expected to handle
the TM failure as usual (retrying is a good option).
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Since the transactional memory state will contain VSX, VMX and FP
registers, extract the common code that copies data to the protobuf
buffers into separate functions.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The new constant NVSXREG defines the number of doublewords that need
to be saved to capture the remaining part of the VSX registers.
The major part of the VSX registers is already saved when saving the FPU
and Altivec registers.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
When dealing with the number of Altivec registers (VR), we should use
the NVRREG constant defined in the system file
/usr/include/powerpc64le-linux-gnu/sys/ucontext.h.
However, this constant takes into account the extra quadword containing
vrsave in the split vectors, so we must subtract 1 to get the exact number
of VR registers.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
It gives us more information why a test hasn't completed in time.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Fixes:
cow01.c: In function 'parent_check':
../lib/zdtmtst.h:120:11: error: format '%lx' expects argument of type 'long unsigned int', but argument 7 has type 'uint64_t {aka long long unsigned int}' [-Werror=format=]
test_msg("FAIL: %s:%d: " format " (errno = %d (%s))\n", \
^
cow01.c:287:5: note: in expansion of macro 'fail'
fail("%s[%#x]: %p is not COW-ed (pagemap of "
^~~~
In file included from inotify_system.c:14:0:
inotify_system.c: In function 'read_set':
../lib/zdtmtst.h:117:11: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'unsigned int' [-Werror=format=]
test_msg("ERR: %s:%d: " format " (errno = %d (%s))\n", \
^
inotify_system.c:299:3: note: in expansion of macro 'pr_perror'
pr_perror("read(%d, buf, %lu) Failed, errno=%d",
^~~~~~~~~
deleted_dev.c:53:36: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type '__dev_t {aka long long unsigned int}' [-Werror=format=]
test_msg("mode %x want %x, dev %lx want %lx\n",
^
deleted_dev.c:53:45: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'dev_t {aka long long unsigned int}' [-Werror=format=]
test_msg("mode %x want %x, dev %lx want %lx\n",
^
../lib/zdtmtst.h:117:11: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'off64_t {aka long long int}' [-Werror=format=]
test_msg("ERR: %s:%d: " format " (errno = %d (%s))\n", \
^
Nothing really interesting, just printing with the right format specifiers.
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Fixes:
maps03.c: In function 'main':
maps03.c:15:32: error: result of '10l << 30' requires 35 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
mem = (void *)mmap(NULL, (10L << 30), PROT_READ | PROT_WRITE,
^~
maps03.c:22:9: error: result of '4l << 30' requires 34 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
mem[4L << 30] = 1;
^~
maps03.c:23:9: error: result of '8l << 30' requires 35 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
mem[8L << 30] = 2;
^~
maps03.c:30:13: error: result of '4l << 30' requires 34 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
if (mem[4L << 30] != 1 || mem[8L << 30] != 2) {
^~
maps03.c:30:35: error: result of '8l << 30' requires 35 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
if (mem[4L << 30] != 1 || mem[8L << 30] != 2) {
^~
The process's virtual address space is smaller than 4Gb on such archs -
omit this test for them.
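The bit counts in the diagnostics above can be reproduced with a little
arithmetic: a non-negative value needs its magnitude bits plus one sign bit
to fit in a signed type. The helper names below are hypothetical and the
sketch only models the counting, not the compiler.

```python
def signed_bits_needed(value):
    """Bits needed to hold a non-negative `value` in a signed integer type."""
    if value < 0:
        raise ValueError("only non-negative values are handled in this sketch")
    # bit_length() counts the magnitude bits; one more is needed for the sign.
    return value.bit_length() + 1


def fits_signed(value, type_bits):
    """True if `value` is representable in a signed type of `type_bits` bits."""
    return signed_bits_needed(value) <= type_bits
```

With these helpers, 10 << 30 and 8 << 30 need 35 bits and 4 << 30 needs 34,
so none of them fits a 32-bit long, while all of them fit a 64-bit one.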
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The swrk mode is used as the RPC server, which, in turn, is easy to drive
with the nice lib/py/criu.py thingie from Ruslan.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Now we have a single place inside this class that is really about calling
criu as a CLI tool, so pull it out in preparation for
RPC support.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
In the criu_cli class there's a whole bunch of useful code which
is not CLI-specific, so drop the _cli suffix.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
When migrating a process, it might not have slave tty peers at
all, so instead of exiting early, wait until a peer is actually used
and only then fail.
Reported-by: Manuel Rodríguez Pascual <manuel.rodriguez.pascual@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
These files are used by zdtm.sh. zdtm.py uses *.desc files.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Without this patch, any error in the check_fs_type function is treated as
permission to proceed with the bind-mount.
This patch splits discovering the mount point fs type from comparing it to
the autofs type, thus allowing us to check for discovery errors.
Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
It turned out that anon shmem can have pages with non zero content
and with both PME_PRESENT and PME_SWAP bits unset in all its vmas
in the whole ps tree.
Such a case is reproduced in issue #209:
1. Dump ps tree with anon shmem filled using datagen.
2. Restore ps tree. anon shmem content is restored
in open_shmem(). fd is created for it and it is
unmapped from restorer process.
3. anon shmem vma is mapped in restore_mapping() of pie restorer context.
anon shmem content is already initialized to non zero content
but restored process doesn't touch its newly mapped vma.
4. Run CRIU dump again. All the pages of anon shmem vmas have
PME_PRESENT and PME_SWAP bits unset and we don't put
vma pages to dump.
So if we filter anon shmem pages using PME_PRESENT and PME_SWAP bits
the same way as we do it for anon private mem then we have a bug.
PME_PRESENT and PME_SWAP bits work for anon private mem because
at least one process would restore content of private anon vma
in its own address space thus PME bits will be set and pages
will be dumped.
We can't just stop using PME_PRESENT and PME_SWAP bits and dump all
non soft dirty and non zero pfn pages. In this case each 1Gb of
mapped and not used anon shmem vma will go to dump. This is too bad.
To fix the bug, this patch uses mincore bits to make the final
decision on whether to dump a page or not. mincore bits show page
usage status better because mincore performs deeper checking of
internal in-kernel state, whereas PME bits are filled in based only on
the process page table.
Using mincore has a drawback. It doesn't work when page is in swap.
But it's ok for now because mincore was used before we started using
PME bits. Also mincore doesn't break page changes tracking
functionality for anon shmem that we have now.
This bug can be fixed in another way. For example we can make anon shmem
restoration work similar to anon private mem restoration.
But this fix looks much harder to implement.
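The difference between the two filters can be modelled roughly as below.
This is an assumed simplification for illustration only: the function names
are hypothetical, in_core stands for the mincore result for the page, and
the exact way the patch combines PME and mincore bits may differ.

```python
# Bits of a /proc/<pid>/pagemap entry: bit 63 is "present", bit 62 is "swapped".
PME_PRESENT = 1 << 63
PME_SWAP = 1 << 62


def dump_private_page(pme):
    """Anon private memory: PME bits alone decide whether the page is dumped."""
    return bool(pme & (PME_PRESENT | PME_SWAP))


def dump_shmem_page(pme, in_core):
    """Anon shmem: mincore residency also counts (assumed combination)."""
    # A page restored via open_shmem() but never touched through this mapping
    # has no PME bits set, yet mincore reports it as resident.
    return dump_private_page(pme) or in_core
```

In this model the untouched-but-filled page from step 4 is dumped, while a
mapped and never used region (no PME bits, not in core) is still skipped.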
Signed-off-by: Eugene Batalov <eabatalov89@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
When running criu in swrk mode the client typically wants to know
the reason for the failure. Right now criu reports back NOTHING but the
fact that dump/restore/etc. failed. We've tried to address this by
introducing the cr-errno engine, but it doesn't seem to be informative
enough and is hard to maintain -- adding new errno-s is boring :(
I propose to report back the first message with ERROR level upon
failure as __typically__ the very first error message indicates
that proceeding is impossible and criu rolls back (generating more
error messages, so it's crucial to know the very first one).
If we ever meet the situation that the first pr_err/pr_perror doesn't
cause criu to exit, this printing should be fixed to be pr_warn.
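The proposal can be sketched as a tiny logger that remembers only the first
ERROR-level message. This is a Python illustration of the idea, not the criu
code; the class FirstErrorLog and its method names are hypothetical.

```python
class FirstErrorLog:
    """Keep the first ERROR-level message so it can be reported to a client."""

    def __init__(self):
        self._keep = False
        self._first = None

    def keep_first_err(self):
        # Opt in to remembering the first error (e.g. when serving RPC requests).
        self._keep = True

    def pr_err(self, message):
        # Later errors are usually rollback noise, so only the first is stored.
        if self._keep and self._first is None:
            self._first = message

    def pr_warn(self, message):
        # Warnings never count as the reason of a failure.
        pass

    def first_error(self):
        # Returns None when nothing was kept, including when keeping is off.
        return self._first
```

Returning None instead of failing when keeping was never enabled lets callers
ask for the first error unconditionally.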
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>