When a test is run in a pseudo-container (--parallel execution), the new
namespace's init is the Python script itself. Thus all dying tests get
reparented to it and sit as zombies forever.
Create a pseudo-init for such containers that reaps all the children.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
pr_perror() is special, it adds \n at the end so there is
no need to supply one.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In places where we have errno value set, such as after calling ptrace(),
it makes sense to use pr_perror as it appends the errno string. This
also fixes missing '\n' at the end (as pr_perror() adds it).
In places where we keep using pr_err(), don't forget to have '\n'.
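The difference between the two helpers can be sketched as follows. These are stand-ins written for illustration, not CRIU's actual pr_err()/pr_perror() definitions; the names sketch_err() and sketch_perror() are hypothetical, and they format into a buffer only so that the result is easy to inspect.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Like pr_err(): emits the message exactly as given, so the
 * caller has to supply the trailing '\n'. */
int sketch_err(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	return n;
}

/* Like pr_perror(): appends ": <errno string>\n" on its own,
 * so the caller must not add a '\n'. */
int sketch_perror(char *buf, size_t size, const char *fmt, ...)
{
	int saved_errno = errno;	/* formatting may clobber errno */
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= size)
		return n;	/* error or truncated, nothing to append */

	return n + snprintf(buf + n, size - n, ": %s\n",
			    strerror(saved_errno));
}
```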
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Don't forget '\n' when using pr_err()
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
As ptrace() sets errno, it makes sense to use pr_perror().
This also fixes the bug of missing '\n'.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we don't use userns, __userns_sysctl_op is called
in the context of the current process. The mount namespace is restored
last, so when we restore the other namespaces, we still see /proc from the
host pid namespace. In this case we can't use a virtual pid to access
/proc/pid.
Let's open /proc/self/ns and use this descriptor to switch namespaces.
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Fixes: f79f4546cfc0 ("sysctl: move sysctl calls to usernsd")
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There are still systems which do not support dirty memory tracking.
This offers an interface to query the dirty memory tracking availability
via the new feature check RPC.
This is in preparation for a p.haul change which will use this RPC
interface to automatically detect if pre-dumps should be executed
or not.
This change introduces an additional optional field (features) in the
criu_request and criu_response messages; the field is a
'criu_features' message.
Right now only the check for the memory tracking feature is supported
in the message 'criu_features':
optional bool mem_track = 1;
v2: Instead of checking for memory tracking only, provide a generic
interface to check for arbitrary features.
v3: Do not use bitfields for feature encoding but protobuf optional
message parameters.
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This allows the user to perform actions before dumping or restoration
occurs.
Signed-off-by: Matthew Krafczyk <krafczyk.matthew@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Chris Lamb <lamby@debian.org> reported in Debian [1] that criu does not
build reproducibly while working on the "reproducible builds" effort
[0].
[0] https://wiki.debian.org/ReproducibleBuilds
[1] https://bugs.debian.org/801211
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
sys_sigaction() returns an error code
Reported-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Namespace roots might be slaves of other
namespace roots, so we should not treat them as
"always ready" for mounting but rely on the general
logic in can_mount_now, which tests slave relations.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we restore sockets with relative names, we change
the current working directory to the one provided by
the socket image data. This affects the current
criu state, because the rest of the code doesn't know
about such tricks and may rely on the working directory
staying consistent.
So remember the current working directory and restore it
once the socket cwd operations are complete.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
377763e5 is incorrect since we can't always chop off the last element in
the buffer:
Execute static/cgroup00
./cgroup00 --pidfile=cgroup00.pid --outfile=cgroup00.out --dirname=cgroup00.test
Dump 12819
(00.003514) Error (files-reg.c:624): Can't create link remap for /dev/nul. Use link-remap option.
(00.003523) Error (cr-dump.c:1257): Dump files (pid: 12819) failed with -1
(00.004042) Error (cr-dump.c:1619): Dumping FAILED.
WARNING: cgroup00 returned 1 and left running for debug needs
Test: zdtm/live/static/cgroup00, Result: FAIL
==================================== ERROR ====================================
Test: zdtm/live/static/cgroup00, Namespace:
================================= ERROR OVER =================================
Hopefully the >= will appease coverity (instead of just a ==).
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Such a case is actually a bug, but since we can resolve
the situation without a real BUG_ON call, let's handle
it gently.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 152112 (#1 of 1): Missing break in switch (MISSING_BREAK)
unterminated_case: The case for value 4 is not terminated by a 'break' statement.
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Naturally, checking strstr()+1 for NULL is useless.
Reported by Coverity, CID 51594.
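For illustration, a sketch of the correct check. The function name after_separator() is hypothetical and not taken from the code being fixed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the text following the first occurrence of sep in s,
 * or NULL if sep does not occur. The NULL check has to be done
 * on strstr()'s result itself: once an offset has been added,
 * comparing the sum against NULL no longer detects the failure.
 */
const char *after_separator(const char *s, const char *sep)
{
	const char *p = strstr(s, sep);

	if (p == NULL)
		return NULL;
	return p + strlen(sep);
}
```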
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's better to
1. Use strlcpy() instead of strncpy() as otherwise we might end up
with a string that is not NUL-terminated, which opens a portal to hell.
There are a few places reported by Coverity for this, such as:
- in criu_connect(), Coverity CID 51591;
- in proc_pid_parse(), Coverity CID 51590;
- in move_veth_to_bridge(), Coverity CID 51593;
- etc.
2. Use strlcpy() instead of strcpy() to avoid buffer overruns.
Some of these are also reported by Coverity, for example
the one in dump_filemap(), Coverity CID 51630.
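The guarantee that matters here can be shown with a local stand-in that has strlcpy() semantics. The name sketch_strlcpy() is hypothetical; it is not the implementation CRIU uses.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most size - 1 bytes of src into dst and always
 * NUL-terminate when size > 0. Returns strlen(src), so a return
 * value >= size tells the caller the copy was truncated.
 */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Unlike strncpy(), the destination is a valid C string even when the source does not fit, and unlike strcpy(), nothing is written past the end of dst.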
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This must be boolean not logical NOT.
Reported by Coverity, CID 114612.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is obviously a copy-paste typo.
Reported by Coverity, CID 114615.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case buf size is not adequate, report and return an error.
This is modelled after commit 1ead3d7.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is a classic off-by-one error. If sizeof(buf) is 512,
the last element is buf[511], not buf[512].
Note that if read() returns 0, we return 0 but buf stays
uninitialized.
Reported by Coverity, CID 114623.
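A sketch of a read that leaves room for the terminator. The helper name read_string() is hypothetical, and terminating the buffer when read() returns 0 is a choice made for this sketch, not something the patch is stated to do.

```c
#include <unistd.h>

/*
 * Read up to size - 1 bytes from fd into buf and NUL-terminate
 * the result. Returns read()'s result, or -1 if size is 0.
 *
 * Asking for size - 1 bytes keeps buf[ret] inside the array.
 * Terminating on every successful read means buf holds an empty
 * string, not uninitialized memory, when read() returns 0.
 */
ssize_t read_string(int fd, char *buf, size_t size)
{
	ssize_t ret;

	if (size == 0)
		return -1;

	ret = read(fd, buf, size - 1);
	if (ret >= 0)
		buf[ret] = '\0';
	return ret;
}
```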
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is a classic off-by-one error. If sizeof(buf) is 512,
the last element is buf[511], not buf[512].
Reported by Coverity, CID 114624, 114622 etc.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commit 871da9a11147 ("pie: Give VDSO symbol table local scope")
moved the definition of vdso_symbols from a global to a local variable
in vdso_fill_symtable(). This makes sense since this variable is only
used in this function. However, this raises a build issue on PowerPC,
where an undefined memcpy symbol is detected during the first
relocation phase of the parasite code:
parasite_blob: Error (pie/piegen/elf.c:258): Unexpected undefined
symbol:memcpy
This memcpy symbol is pulled in by compiler-generated code which
tries to optimize the stack initialization when entering
vdso_fill_symtable(). The optimization copies the
initialized data to the stack using memcpy. But when building the
parasite code, the C library is not linked and there is no memcpy
symbol. However, there is builtin_memcpy(), which does the same.
Ideally, builtin_memcpy should be named memcpy() to replace the C
library one, and it should only be built for the parasite/restorer
code. But the way CRIU is built, the same vdso-util.o file is used
twice: for criu, which is linked with the C library, and for the
parasite/restorer code. Thus renaming builtin_memcpy to memcpy would
make criu rely on builtin_memcpy even when the C library is in the picture,
which is not the best option (assuming the C library memory operations are
more efficient).
Beyond the memcpy symbol issue, this shows that the same objects are used
both in CRIU and in the parasite/restorer code. This should not be the
case, since the parasite/restorer is built in PIE form and criu's objects
are not. The shared code should be built twice: once in PIE form for the
parasite/restorer code, and once *normally* for the criu binary.
Addressing the build issue implies more work than expected.
For the moment, this patch defines a memcpy service when building
the parasite code to fix the build issue on ppc64.
Once the build issue is addressed, builtin_memcpy should be renamed
memcpy and only be used for parasite/restorer code, and this
definition removed.
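For readers unfamiliar with the problem, a copy routine for code built without the C library can be as small as the sketch below. This is an illustration only: the name sketch_memcpy() is hypothetical, and it is not the memcpy service this patch adds.

```c
#include <stddef.h>

/*
 * Byte-wise copy with memcpy() semantics that needs nothing from
 * the C library. In a real freestanding object the symbol would
 * have to be called memcpy for compiler-generated calls to
 * resolve to it; a different name is used here so the sketch
 * can coexist with libc.
 */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}
```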
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A little bit more stuff added :) With these changes I can run
zdtm.py run --all -x cgroup -x maps04 -x different_creds -x rtc
To run the cgroup tests we need to add .hook calls; for maps04 I don't have
enough RAM and disk in my VM (will fix); for different_creds we need to
support the crfail test option (dump _must_ fail); for rtc -- plugins.
So changes since v2:
1. Added exclusion (-x option)
2. Bugfix in parallel run
3. Fixed NS root permissions
4. Fixed checks for maps before and after dump
5. Fixed thread_bomb launch
6. Print test output
7. Support .checkskip scripts
8. Support features
9. Fixed test list
Andrey, thoughts?
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
When we initialize a sub-mount namespace, we need to use absolute paths.
For example, we change cwd in prep_unix_sk_cwd().
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There is no way to redefine the install paths to
custom ones (say, plain /usr). Fix it by
using conditional inits.
Reported-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In commit c2271198, Laurent Dufour kindly reunified the VDSO code
that had become duplicated between architectures. Unfortunately
this introduced a regression in AArch64 where apparently due to
the scope of vdso_symbols array of pointers to characters changing
from local to global, load-time relocations became necessary.
The following thread on the GCC mailing list discusses why
load-time relocations can be necessary when pointers are used,
although it doesn't mention the potential for locally scoped
arrays to be handled differently:
https://gcc.gnu.org/ml/gcc/2004-05/msg01016.html
Because the alternatives, such as porting piegen to AArch64, are
far more involved, simply revert the change in scope.
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In the recent VDSO code reunification, some types were changed but
a pair of necessary corresponding changes was omitted. Fix that so
the AArch64 build succeeds without type-related
warnings-turned-errors. Also move the definition to the
AArch64-specific header since it's not currently being used by any
other architectures.
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Tasks in a user namespace cannot write themselves into cgroup tasks files:
(00.845068) 1: Error (cgroup.c:901): cg: Can't move into blkio/lxc/unpriv/tasks (-1/-1): Permission denied
So, let's write them into the tasks via usernsd.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When in a userns, tasks can't write to certain sysctl files:
(00.009653) 1: Error (sysctl.c:142): Can't open sysctl kernel/hostname: Permission denied
See inline comments for details on affected namespaces.
Mostly for my own education in what is required to port something to be
userns restorable, I ported the sysctl stuff. A potential concern for this
patch is that copying structures with pointers around is kind of gory. I
did it ad-hoc here, but it may be worth inventing some mechanisms to make
it easier, although I'm not sure what exactly that would look like
(potentially re-using some of the protobuf bits; I'll investigate this more
if it looks helpful when doing the cgroup user namespaces port?).
Another issue is that there is not a great way to return non-fd stuff in
memory right now from userns_call; one of the little hacks in this code
would be "simplified" if we invented a way to do this.
v2: coalesce the individual struct sysctl_req requests into one big
sysctl_userns_req that is in a contiguous region of memory so that we
can pass it via userns_call. Hopefully nobody finds my little ascii
diagram too offensive :)
v3: use the fork/setns trick to change the sysctl values in the right ns for
IPC/UTS nses; see inline comment for details
v4: only use sysctl_userns_req when actually doing a userns_call.
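The coalescing idea from v2 can be illustrated with a much simpler layout than the real one. Everything below is hypothetical: struct sketch_req is not CRIU's struct sysctl_req, and the packed format is not that of sysctl_userns_req.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request type with pointers into arbitrary memory. */
struct sketch_req {
	const char *name;
	const char *value;
};

/*
 * Pack nr requests into one malloc()ed region laid out as
 *   [count][name\0][value\0][name\0][value\0]...
 * so that a single pointer-free buffer can be handed to another
 * process. Returns the buffer and stores its size in *out_len,
 * or returns NULL on allocation failure.
 */
void *pack_reqs(const struct sketch_req *reqs, size_t nr, size_t *out_len)
{
	size_t len = sizeof(size_t), i;
	char *buf, *pos;

	for (i = 0; i < nr; i++)
		len += strlen(reqs[i].name) + 1 + strlen(reqs[i].value) + 1;

	buf = malloc(len);
	if (!buf)
		return NULL;

	memcpy(buf, &nr, sizeof(nr));	/* header: number of requests */
	pos = buf + sizeof(nr);
	for (i = 0; i < nr; i++) {
		size_t n = strlen(reqs[i].name) + 1;

		memcpy(pos, reqs[i].name, n);
		pos += n;
		n = strlen(reqs[i].value) + 1;
		memcpy(pos, reqs[i].value, n);
		pos += n;
	}

	*out_len = len;
	return buf;
}
```

The receiver walks the buffer string by string, so no pointer inside it needs to be valid in the receiving address space.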
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CRIU should not affect process states when it can't dump them.
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2: fix one more place
Reported-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
"ip route dump" dumps only ipv4 routes.
Reported-by: Ross Boucher <boucher@gmail.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To build the piegen tool with a compiler/linker other than
gcc/ld, simply run make as
HOSTCC="host-compiler" HOSTLD="host-ld" make
where host-compiler and host-ld are the programs to use.
https://github.com/xemul/criu/issues/63
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This entry is only required if we have it, i.e. at the restore stage
in the tree we _built_. All other cases, in particular local tree
collection on restore, do not need it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@odin.com>
The rfi->path doesn't contain the leading /, neither does the ghost->rpath,
so when attaching it to root don't forget to include one there.
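A sketch of joining a root with such a slash-less path. The helper name join_root_path() is hypothetical and the actual fix may be a one-line format string change.

```c
#include <stdio.h>

/*
 * Build "<root>/<rel>" in buf, where rel has no leading slash,
 * so the separator must be supplied explicitly.
 * Returns 0 on success, -1 if buf is too small.
 */
int join_root_path(char *buf, size_t size, const char *root, const char *rel)
{
	int n = snprintf(buf, size, "%s/%s", root, rel);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```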
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@odin.com>