Since we don't support dumping per-thread creds, let's at least fail to
dump if the creds don't match.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Note that we don't add the test into the list of tests to run, because it will
fail without the associated kernel patch.
v2: spin lock until seccomp strict is set on the child
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Unfortunately, SECCOMP_MODE_FILTER is not currently exposed to userspace,
so we can't checkpoint that. In any case, this is what we need to do for
SECCOMP_MODE_STRICT, so let's do it.
This patch works by first disabling seccomp for any processes that are going
to have seccomp filters restored, then restoring the process (including the
seccomp filters), and finally resuming the seccomp filters before detaching
from the process.
v2 changes:
* update for kernel patch v2
* use protobuf enum for seccomp type
* don't parse /proc/pid/status twice
v3 changes:
* get rid of extra CR_STAGE_SECCOMP_SUSPEND stage
* only suspend seccomp in finalize_restore(), just before the unmap
* restore the (same) seccomp state in threads too; also add a note about
how this is slightly wrong, and that we should at least check for a
mismatch
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For testing purposes we need a way to disable the
piegen utility. So let's add a PIEGEN make option,
so that one can run "PIEGEN=no make" to build criu
without piegen at all.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Eric wants to restrict permissions for proc mounts in a non-root userns
in accordance with proc mounts in the root userns.
Author: Eric W. Biederman <ebiederm@xmission.com>
Date: Fri May 8 23:49:47 2015 -0500
mnt: Modify fs_fully_visible to deal with locked ro nodev and atime
Ignore an existing mount if the locked readonly, nodev or atime
attributes are less permissive than the desired attributes
of the new mount.
...
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Reasoning: some systems have /sys/fs/cgroup stuff mounted as read-only
and we have to either remount it rw or create our own set. The former
doesn't look sane as this rw remounting is also done by systemd, so
let's go back to manual cgyard construction.
This reverts commit 860df95f85.
Conflicts:
cgroup.c
include/cr_options.h
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Instead of keeping around multiple fds that point to various places in
/proc, let's just use /proc and openat() things relative to it.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is a little tricky: since the threads are forked in the restorer blob, we
can't open their attr/current files to pass into the restorer blob. So, we pass
in an fd for /proc that the restorer blob can use to access the attr/current
files once they exist.
N.B. this is still incorrect in that it restores the same credentials for all
threads in the group; however, it matches the behavior of the current creds
restore code, which also restores the same creds for all threads in the group.
v2: use simple_sprintf() instead of pie_strcat()
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We'll use this in the next patch for printing paths to LSM files in /proc.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise the build fails with:
| parasite-syscall.c: In function ‘parasite_infect_seized’:
| parasite-syscall.c:1222:5: error: ‘elf_relocs’ undeclared (first use in this function)
Simply wrap the @elf_relocs_apply with macros.
Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
While playing with checkpoint/restore of a container I found
that we can't reuse existing controllers if they were pre-created.
For example, currently in PCS7 we bind-mount the cgroups which belong
to a container in the form of
/sys/fs/cgroup/<controller>/<container> ==> /sys/fs/cgroup/<controller>
CRIU dumps such a configuration fine, but on restore
it recreates the controllers from scratch, whereas we would
like to bind-mount them and ask CRIU to restore the subcgroups
and their parameters.
So I extended the --manage-cgroups option to take a <mode> argument.
See the docs for a detailed description.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we always create a temporary directory where we restore
cgroups, but this won't work if mounting cgroups is forbidden
from inside a container for some reason (as in the OpenVZ kernel).
So one can pass the --cgroup-yard option to specify an existing
directory where the cgroups live. By default we assume it
is /sys/fs/cgroup.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For example, some linkers generate @__export_parasite_args
as a symbol which won't be relocated. Handle such a case properly.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The error I got was:
CC pie/piegen/elf-x86-64.o
In file included from pie/piegen/elf-x86-32.c:16:0:
pie/piegen/elf.c: In function ‘handle_elf_x86_32’:
pie/piegen/elf.c:476:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘Elf32_Word’ [-Werror=format=]
pr_debug("Copying section '%s'\n" \
^
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On PPC64, the hard-coded definition of TFD_IOC_SET_TICKS doesn't match the
kernel one.
We should use the _IOW-based one to be more flexible here.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We'll need this for use in the restorer blob for restoring LSMs. It looks like
arm already has openat, so I think it's just x86 and ppc that need it. In any
case, please double check this, as I've only tested it on x86.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If the netns image is absent, the NetnsEntry entry will not be initialized.
Currently restore from old images crashes:
Core was generated by `criu swrk 3'.
Program terminated with signal SIGSEGV, Segmentation fault.
$0 0x0000000000427d80 in netns_entry.free_unpacked ()
(gdb) bt
$0 0x0000000000427d80 in netns_entry.free_unpacked ()
$1 0x0000000000436d07 in prepare_net_ns ()
$2 0x0000000000457c78 in prepare_namespace ()
$3 0x0000000000432917 in restore_task_with_children ()
$4 0x00007fc86acfccfd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
v2: remove debugging code
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The checkpoint and restore of the PowerPC floating point registers is
buggy.
The issue is that the signal frame context is defined to store double values
while the protocol buffer handles unsigned 64-bit integer values. A
silent cast done by the compiler was modifying the restored value behind
our back.
This fix changes the type used when manipulating the FP registers value to
be consistent between checkpoint and restart.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise the root yard can be propagated into the host mount namespace
and remain there, and criu will fail because it will not be able to
remove the root yard.
It occurs if we give a shared mount as root to "criu restore" and
criu converts it into a slave mount.
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Introduce a new -o argument to piegen to specify the generated file name.
Send the debug stream to stdout and redirect it to /dev/null in the makefile
if V=1 is not specified.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When building the blob in the generated header file, we can
shrink the output blob and only copy the sections with the SHF_ALLOC
bit set; the other ones are not needed at runtime.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This cleans up the assembly code, removing the no longer needed trick with
register 2 (TOC pointer). As a consequence, __export_restore_task_trampoline()
and __export_unmap_trampoline() are no longer needed.
Thus, the changes introduced by commit de9df91002 ("Per architecture restorer
trampolines") in cr-restore.c are no longer used, but they do not impact
runtime code anyway.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
- Move relocs application into a separate file which gets
compiled as a regular C file in criu (pie/pie-relocs.[ch])
- Move types used by piegen into pie/piegen/uapi/types.h
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
At the moment neither parasite nor restorer has
any relocs because we support x86-64 only, but
this will change soon, so call the helper and apply
relocations.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We should use the provided @nr_relocs instead of ARRAY_SIZE here.
I didn't spot it earlier simply because at the moment on x86-64
there are no relocs at all.
Also, when we apply relocations they are to be computed from
the virtual base of the parasite address, not from the local memory
map address, so add a @vbase parameter. And fix a typo in the
addend in gotpcrel.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case of @gotpcrel relocations we need additional
space to carry pointers.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
After this patch one can run ARCH="ia32" make to build a
32-bit version of CRIU on a 64-bit host. Note that only the
build procedure is tuned up; CRIU itself is not
yet ready to do a checkpoint/restore cycle -- a lot
of additional code is needed, and here we rather put
stubs simply to make the build procedure run.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There is no rip addressing in 32-bit code, but PIE code
requires GOT tables and friends, which we had better avoid
for performance's sake. So let's use pc relocations; that
should do the trick.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To support x86-32 mode we will need our own syscall table.
Here it is. Note that CRIU itself doesn't support such a
mode yet.
Meanwhile put the syscall table here so that if someone
adds a new syscall, the 32-bit variant gets updated
as well.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since at the moment we're running only x86-64 tasks, not 32-bit ones,
and our code is not carrying any big statically defined structures,
we can use relocatable files directly with all relocations applied
during the build.
This is going to change soon, once we start supporting 32-bit tasks.
IOW even currently we need to (which is not yet done, but it's safe)
- check for gotpcrel relocations
- apply relocations with the generated elf_apply_relocs helper
Currently the overall scheme looks this way
- our object files are linked together into the parasite.built-in.bin.o file
- then the pie/piegen/piegen tool is called, which parses this file and generates
a C source code file with the bytestream and all information needed to relocate
this bytestream to a new place (see the elf_apply_relocs helper)
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Here we simply build the piegen tool, which is going to be used
to generate parasite code that is safe to relocate. The tool
takes an object file as an argument, parses it and
generates a C file with relocations encoded in a form
suitable for fast application.
Currently only x86-32 and x86-64 are supported.
v2 (by ldufour@):
- Filter PIEGEN
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On restore we have several arrays of objects that get remapped
into the pie area, and their number is also passed. Clean and shorten
the remapping code a bit and bring their naming to a common format.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The criu(8) man-page is generated using asciidoc. The problem with
asciidoc is that, due to its dependencies, it is not available on
all distributions or it is undesired to install all asciidoc
dependencies. The install target was unconditionally installing and thus
building the man-page even if not explicitly specified with 'make docs'.
With the new 'install-criu' target everything besides the man-page is
installed and the target 'install-man' is only called by the target
'install'.
Signed-off-by: Adrian Reber <areber@redhat.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Looks-ok-to: Cyril Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On restore we will use the peer's name to connect() the
socket back, so if there's no name dump should be aborted.
This situation happens when we create a socketpair(), fork
and dump only one task with one pair end.
Reported-by: Artem Kuzmitskiy <artem.kuzmitskiy@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>