mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 22:05:36 +00:00
Commit Graph

5675 Commits

Author SHA1 Message Date
Ruslan Kuprieiev
18034bb642 libcriu: allow user to specify service fd, v2
Currently, libcriu is connecting to CRIU service
by itself, just asking user for a path to socket.
But in some cases users need to provide an fd instead of a
path. For example, sometimes a task has no access to the
criu socket because of strict security measures, but
is able to inherit fd from a parent that has access
to criu socket.

v2, use union for addr and fd

Signed-off-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 23:09:43 +03:00
Ruslan Kuprieiev
5551bbb301 test: libcriu: include protobuf dir
We only need a rpc.pb-c.h header.

Signed-off-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 14:30:02 +03:00
Ruslan Kuprieiev
9e0ff7af4f libcriu: use criu_opts structure to keep all the options
criu_opts contains rpc options and notify callback,
so we can keep all options in just one structure.
This will allow us to easily extend libcriu functionality
and yet keep all options in one place.

We're also not hiding rpc opts structure anymore, so
it is pretty clear where power-user should put his own
CriuOpts instance if he would like to do that.

Signed-off-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 14:29:47 +03:00
Tycho Andersen
7b20f42f78 rst: move lsm memory allocations before rst_mem_lock()
8ffbe754bd moved the rst_mem_lock() call, but didn't move the
corresponding LSM allocations, so we do that here.

One unfortunate thing is that we have to split this into two steps: first
we have to read the creds to figure out exactly how much memory to
allocate for the lsm string. Since prepare_creds() wants to write directly
to the task_restore_args struct and that can't be allocated until after we
lock the restore memory, we break it up into two steps.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 14:26:44 +03:00
Tycho Andersen
3122529fc8 unix: wait for listen() as well as bind()
We need to wait for listen() as well as bind() for internal unix sockets, or we
can race like this:

(00.135950)      1: Opening standalone socket (id 0xb ino 0x9422f peer 0)
(00.135974)    353: Error (sk-unix.c:701): Can't connect 0x947c4 socket: Connection refused
(00.136390)      1: Error (cr-restore.c:1228): 353 exited, status=1
(00.136407)      1:  Putting 0x9422f into listen state

(where 0x9422f is the peer for 0x947c4)

This race was pretty rare for me, but I've run 1000 tests and it didn't
happen so hopefully this patch fixes it :)

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 14:25:39 +03:00
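The "Connection refused" in the log above can be reproduced outside CRIU. A minimal sketch, assuming Linux AF_UNIX stream sockets; the function name and the socket path are hypothetical and none of this is CRIU code. It connects to a peer that is bound but not yet listening, then again after listen().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Try one connect() to addr; return 0 on success or the errno on failure. */
static int try_connect(const struct sockaddr_un *addr)
{
	int err = 0;
	int cli = socket(AF_UNIX, SOCK_STREAM, 0);

	if (cli < 0)
		return errno;
	if (connect(cli, (const struct sockaddr *)addr, sizeof(*addr)) < 0)
		err = errno;
	close(cli);
	return err;
}

/*
 * Bind a stream socket at `path` and connect to it twice: once before
 * listen() and once after. The errno of each attempt (0 on success) is
 * stored in *err_before and *err_after.
 * Returns 0 if the experiment ran, -1 if the setup failed.
 */
int connect_before_and_after_listen(const char *path, int *err_before,
				    int *err_after)
{
	struct sockaddr_un addr;
	int srv, ret = -1;

	assert(path != NULL && strlen(path) < sizeof(addr.sun_path));
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
	unlink(path);

	srv = socket(AF_UNIX, SOCK_STREAM, 0);
	if (srv < 0)
		return -1;
	if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	/* The peer exists on the filesystem but is not listening yet. */
	*err_before = try_connect(&addr);

	if (listen(srv, 1) < 0)
		goto out;

	/* Now the connection is queued in the backlog and connect() returns. */
	*err_after = try_connect(&addr);
	ret = 0;
out:
	close(srv);
	unlink(path);
	return ret;
}
```

On Linux the first attempt is expected to fail with ECONNREFUSED and the second to succeed, which is why waiting for bind() alone is not enough.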
Andrey Vagin
445dbd9d09 log: don't forget LF for pr_err()
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 14:24:13 +03:00
Cyrill Gorcunov
0162665c8c cgroup: rpc -- Fix @manage_cgroups option parsing
In commit c7d646afb3 we introduced cgroup restore
modes, but when the option is passed via RPC it is simply
either true or false, which erroneously maps to
CG_MODE_PROPS or CG_MODE_IGNORE modes.

Let's map @true to CG_MODE_SOFT to preserve backward
compatibility and enhance this option in future via
a separate option.

Reported-by: Ross Boucher <rboucher@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 14:23:48 +03:00
Andrey Vagin
de70936ec7 util: don't ignore standard descriptors
It's a rudiment. close_old_fds() closes all extra descriptors.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:36:18 +03:00
Andrey Vagin
686467832f test/pipes: execute two test cases
* reopen a pipe descriptor via /proc/self/fd/X
* give another end of a pipe to "criu restore"

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:36:11 +03:00
Andrey Vagin
bea8903d38 test/pipes: check that file status flags are restored
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:36:05 +03:00
Andrey Vagin
d6f39d70ea pipe: add ability to restore both ends of inhereted pipes
When we restore an inherited pipe, we have only one end and we
don't know whether it's the write or the read one. So we need to call
reopen_pipe each time.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:35:59 +03:00
Andrey Vagin
98c8e44f74 pipe: don't create a tranport descriptor for inhereted pipes
Otherwise we can get an error like this:
1: \t\tCreate transport fd /crtools-fd-1-5
...
1: Found id pipe:[122747] (fd 8) in inherit fd list
1: File pipe:[122747] will be restored from fd 9 duped from inherit

1: Error (util.c:131): fd 5 already in use (called at files.c:872)
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:35:48 +03:00
Andrey Vagin
2045ef1884 tcp: disable the repair mode only for sockets of a current process (v3)
rst_tcp_repair_sockets is filled when all sockets are collected,
but the repair mode should be disabled only for sockets which
are restored in a current process.

https://bugzilla.openvz.org/show_bug.cgi?id=3281

v2: add a comment
v3: typo fix

Fixes: 73e303c8e2 ('rst: Fix rst_tcp_sock memory management')
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:35:21 +03:00
Andrey Vagin
9c96c55b7a zdtm: don't include linux/seccomp.h
linux/seccomp.h isn't used in the different_creds test.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:34:43 +03:00
Tycho Andersen
8d16fe6da9 build: get rid of vestigial Makefile.config test
We don't use this any more (and the test was deleted in a previous patch),
so let's get rid of this too.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-15 17:34:14 +03:00
Pavel Emelyanov
96e0d14283 Revert "travis: install libseccomp-dev"
This reverts commit 063c5b8946.
2015-07-14 18:28:05 +03:00
Andrey Vagin
1776821876 check: try to call clone with CLONE_NEWPID and CLONE_PARENT
This combination was forbidden in 3.12
commit 40a0d32d1eaffe6aac7324ca92604b6b3977eb0e :
"fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks"

and then it was permitted again in 3.13:
commit 1f7f4dde5c945f41a7abc2285be43d918029ecc5
fork:  Allow CLONE_PARENT after setns(CLONE_NEWPID)

Cc: Adrian Reber <adrian@lisas.de>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:08:31 +03:00
Tycho Andersen
e880dbd9b8 test: add test for failing to dump different creds
v2: use the test list instead of the file for telling zdtm.sh the test will
    fail

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:06:09 +03:00
Tycho Andersen
6e218166b4 test: add support for expecting dump failures
We'll use this in the next patch when testing the creds comparison for
threads.

v2: use an explicit list in zdtm.sh instead of a file in the test dir

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:06:07 +03:00
Pavel Emelyanov
f231a11908 rst: Remove actually unused pid arg from restorer_get_vma_hint
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:03:00 +03:00
Pavel Emelyanov
6fe296a26e rst: Remove actually unused pid arg from restore_one_zombie
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:02:53 +03:00
Pavel Emelyanov
dc149e884d rst: Remove actually unused pid arg from prepare_mappings
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:02:45 +03:00
Pavel Emelyanov
73e3925bcd pstree_item: Keep has_seccomp field on rst_info tail
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:01:48 +03:00
Pavel Emelyanov
24f0f49871 pstree_item: Keep creds field on dump_info tail
And rename it for easier grepping

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:01:27 +03:00
Andrey Vagin
063c5b8946 travis: install libseccomp-dev
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:00:57 +03:00
Pavel Emelyanov
8ffbe754bd rst: Lock rst memory allocations earlier
After we get the total remappable rst memory size, we can no longer
allocate from it, otherwise the bootstrap area will not
be large enough.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 14:00:27 +03:00
Pavel Emelyanov
d9a9d4c9b3 rst: Fix timerfd rst memory management
It's similar to previous patch with tcp mem -- no need to
realloc big arrays and then memcpy data between them. It's
enough just to walk timerfd objects at the very end.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 13:59:39 +03:00
Pavel Emelyanov
73e303c8e2 rst: Fix rst_tcp_sock memory management
In the current scheme we grow an array with realloc()-s, then
memcpy() the result into rst_mem. I propose to get rid
of realloc-s (we already have objects for the data we
need to keep) and memcpy-s (and put objects directly
into rst_mem at the end).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 13:59:21 +03:00
Pavel Emelyanov
7166e3c984 rst: Fix helpers memory allocation
Calling rst_mem_alloc() in a loop with increasing size causes
n^2 memory growth :) since _alloc is not _realloc.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-14 13:59:08 +03:00
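The quadratic growth described in 7166e3c984 can be checked with simple arithmetic. A minimal sketch, assuming a hypothetical bump arena that only counts bytes handed out and never frees; `struct arena` and both functions are illustrative stand-ins, not rst_mem_alloc() itself. Allocating 1, 2, ..., n elements in turn consumes n(n+1)/2 elements' worth of memory, while counting first and allocating once consumes n.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump arena: tracks bytes handed out, never gives any back. */
struct arena {
	size_t used;
};

static void arena_alloc(struct arena *a, size_t size)
{
	a->used += size;
}

/*
 * Anti-pattern: treat alloc like realloc and ask for the whole array
 * again each time it grows by one element.
 */
size_t bytes_used_growing(size_t n, size_t elem_size)
{
	struct arena a = { 0 };
	size_t i;

	assert(elem_size > 0);
	for (i = 1; i <= n; i++)
		arena_alloc(&a, i * elem_size);
	return a.used; /* elem_size * n * (n + 1) / 2 */
}

/* One way to avoid it: count the elements first, then allocate once. */
size_t bytes_used_single(size_t n, size_t elem_size)
{
	struct arena a = { 0 };

	assert(elem_size > 0);
	arena_alloc(&a, n * elem_size);
	return a.used; /* elem_size * n */
}
```

For 100 elements of 8 bytes the first variant uses 40400 bytes against 800 for the second. Whether the actual fix allocates once or takes another route is not stated in the commit message.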
Saied Kazemi
49dc94ad73 Need bigger log buffer to avoid message truncation
The help message of CRIU has grown in size and is truncated because the
size of the private buffer in log.c is too small.  This patch increases
the size of the buffer.

[ The "bad" message is the --help output one ]

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 15:19:14 +03:00
Andrey Vagin
1fff98f76e cr-dump: wait killed processes
The kill syscall queues a signal, but doesn't wait for it to be
handled.

We need to wait for processes if we kill them. The user doesn't
expect to find processes after dump in this case.

PTRACE_DETACH returns errors for dead tasks, so we don't need to do it
in these cases.

Cc: Nikita Spiridonov <nspiridonov@odin.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:57:41 +03:00
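The point of 1fff98f76e, that kill() only queues the signal, can be illustrated with a plain parent and child. A minimal sketch, assuming the target is a direct child of the caller (CRIU itself deals with ptrace-attached tasks, which this does not model); both function names are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that sleeps until it is signalled. Returns its pid or -1. */
pid_t spawn_sleeper(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();
	}
	return pid;
}

/*
 * Send SIGKILL and block until the child has actually been reaped.
 * Returns the terminating signal number, or -1 on error.
 */
int kill_and_wait(pid_t pid)
{
	int status;

	assert(pid > 0);
	/* kill() returns as soon as the signal is queued ... */
	if (kill(pid, SIGKILL) < 0)
		return -1;
	/* ... so only waitpid() guarantees the process is gone. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Without the waitpid() call, a caller that checks for the process right after kill() may still find it.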
Ruslan Kuprieiev
679aaa56ca libcriu: add ability to use local options structure
Having ability to have your own options structure is quite nice
and allows much more flexible use of libcriu in cases when you
want to have a bunch of instances of options structures.

This patch also allows users to use raw CriuOpts structure
modified in any suitable way, whether by libcriu's criu_local_set
methods or by using protobuf-c directly.

It is also worth noting, that backward-compatibility in API and ABI
is preserved.

Signed-off-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:55:00 +03:00
Cyrill Gorcunov
337ba4f3a6 pie: piegen -- Fix memory leak
| CID 96750 (#1 of 1): Resource leak (RESOURCE_LEAK)
 | 163. leaked_storage: Variable sec_hdrs going out of scope leaks the storage it points to.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:53:45 +03:00
Laurent Dufour
b857ca947c test/zdtm: Adding ppc64 support
Adding ppc64le specific parts to run test on this architecture.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:52:26 +03:00
Laurent Dufour
5a34ae1891 test/zdtm: Fix test_msg massive stack usage
In test_msg() a buffer is allocated on the stack to build the output message.
This buffer's size was defined using the PAGE_SIZE constant defined in
zdtmtst.h file.

On some systems like ppc64, the page size is large (64K), leading to massive
stack allocation, which may be too large in case of alternate stack like
the one used in the sigaltstack test.

This fix defines a 2048-character buffer for test_msg and exposes a
constant to allocate stack accordingly in the sigaltstack test.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:52:25 +03:00
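The idea in 5a34ae1891 is to bound the stack buffer by a fixed constant instead of the page size. A minimal sketch of that pattern, assuming a hypothetical `msg_to_fd()` helper and a hypothetical constant name `MSG_BUF_SIZE`; only the 2048 figure comes from the commit message, and this is not the zdtm test_msg() code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Fixed size, independent of the page size (64K on some ppc64 systems). */
#define MSG_BUF_SIZE 2048

/*
 * Format a message into a fixed-size stack buffer and write it to fd.
 * Output longer than the buffer is truncated.
 * Returns the number of bytes written, or -1 on error.
 */
int msg_to_fd(int fd, const char *fmt, ...)
{
	char buf[MSG_BUF_SIZE];
	va_list ap;
	int len;

	assert(fmt != NULL);
	va_start(ap, fmt);
	len = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	if (len < 0)
		return -1;
	if ((size_t)len >= sizeof(buf))
		len = sizeof(buf) - 1; /* vsnprintf truncated the output */

	return (int)write(fd, buf, (size_t)len);
}
```

A caller running on an alternate signal stack can then size that stack from the same constant rather than from the page size.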
Laurent Dufour
6f119a22a6 test/zdtm: Fix pagesize issue in PACKET_RX/TX_RING
Calls to setsockopt(PACKET_RX_RING/PACKET_TX_RING) are dependent on the
system's page size.
Using the sysconf() page size makes these tests work on ppc64 where the page
size is 64K.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:52:25 +03:00
Laurent Dufour
cf4496beba test/zdtm: Do not use hard coded PAGE_SIZE value
Since the page size may be different from an architecture/a system to
another it should not be hard coded to 4096.

As a consequence, several tests are failing on ppc64 due to a wrong page
size value.

This fix relies on sysconf to get the current page size.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:52:23 +03:00
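Querying the page size at run time, as cf4496beba describes, looks roughly like this. A minimal sketch using the POSIX sysconf(_SC_PAGESIZE) call; the two helper names are hypothetical and not taken from the zdtm sources.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Ask the system for its page size instead of hard-coding 4096. */
long runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	assert(sz > 0);
	return sz;
}

/* Round a length up to a whole number of pages. */
size_t round_up_to_page(size_t len)
{
	size_t ps = (size_t)runtime_page_size();

	return (len + ps - 1) / ps * ps;
}
```

On a 4K system round_up_to_page(1) yields 4096, while on a 64K ppc64 system it yields 65536, so buffers and ring sizes derived from it follow the machine.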
Tycho Andersen
209693d49b don't assume the kernel has CONFIG_SECCOMP
linux/seccomp.h may not be available, and the seccomp mode might not be
listed in /proc/pid/status, so let's not assume those two things are
present.

v2: add a seccomp.h with all the constants we use from linux/seccomp.h
v3: don't do a compile time check for PTRACE_O_SUSPEND_SECCOMP, just let
    ptrace return EINVAL for it; also add a checkskip to skip the
    seccomp_strict test if PTRACE_O_SUSPEND_SECCOMP or linux/seccomp.h
    aren't present.
v4: use criu check --feature instead of checkskip to check whether the
    kernel supports seccomp_suspend

Reported-by: Mr. Jenkins
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:50:35 +03:00
Tycho Andersen
1fa30840da add a criu check test for PTRACE_O_SUSPEND_SECCOMP
v2: actually set ret = -1 on failure
v3: add a --feature option for suspend_seccomp (and make this patch 1,
    since the tests depend on it now)

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-13 14:50:28 +03:00
Laurent Dufour
fa3c94d52f aio: Don't consider aio_nr_req as a fd
When freeing the vma entries, don't call close on vm_file_fd when dealing
with a VMA AIO entry since the vm_file_fd is then filled with aio_nr_req as
part of the union.

I hit this issue when running the test aio00 on ppc64. Here the value of
the VMA aio aio_nr_req field was matching the value of the service file
descriptor IMG_FD_OFF. This leads to an obscure checkpoint error.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:59:04 +03:00
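The bug class fixed in fa3c94d52f is a union member being read as the wrong type. A minimal sketch, assuming a hypothetical, heavily simplified record `struct vma_sketch` with an explicit flag; the real CRIU structure and its field layout are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a VMA record whose union is overloaded. */
struct vma_sketch {
	bool is_aio;
	union {
		int file_fd;         /* valid only when !is_aio */
		unsigned int nr_req; /* valid only when is_aio */
	} u;
};

/*
 * Close the backing descriptor only if the union really holds one.
 * Returns true if close() was called.
 */
bool vma_sketch_release(struct vma_sketch *v)
{
	assert(v != NULL);
	/* For an AIO entry the same bytes hold a request count, not an fd. */
	if (v->is_aio || v->u.file_fd < 0)
		return false;
	close(v->u.file_fd);
	v->u.file_fd = -1;
	return true;
}
```

Without the is_aio check, a request count that happens to equal an open descriptor number would close that unrelated descriptor, which matches the failure described above.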
Laurent Dufour
6fc15485d0 ppc64: Fix broken SYS V shared memory support
The initial support of the SYS V shared memory on ppc64 is broken. The call
to shmat done in the restore blob has no chance to work correctly.

This patch fixes the sys_shmat call.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:57:38 +03:00
Ruslan Kuprieiev
7197aff703 pycriu: images: pb2dict: preserve fields order with "pretty" option
Using collections.OrderedDict allows us to keep fields in the
same order as they appear in corresponding proto files, which
helps to improve readability. In non-pretty mode we still use
regular dict.

Signed-off-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:56:31 +03:00
Tycho Andersen
e0b24e21d3 creds: fail to dump when creds in thread group don't match
Since we don't support dumping per-thread creds, let's at least fail to
dump if the creds don't match.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:38:35 +03:00
Tycho Andersen
f8502fc3d1 ptrace: fix typo in comment
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:38:34 +03:00
Tycho Andersen
c03df1ba2d add a test for SECCOMP_MODE_STRICT
Note that we don't add the test into the list of tests to run, because it will
fail without the associated kernel patch.

v2: spin lock until seccomp strict is set on the child

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:38:33 +03:00
Tycho Andersen
0d8aec0c3a seccomp: add initial support for SECCOMP_MODE_STRICT
Unfortunately, SECCOMP_MODE_FILTER is not currently exposed to userspace,
so we can't checkpoint that. In any case, this is what we need to do for
SECCOMP_MODE_STRICT, so let's do it.

This patch works by first disabling seccomp for any processes who are going
to have seccomp filters restored, then restoring the process (including the
seccomp filters), and finally resuming the seccomp filters before detaching
from the process.

v2 changes:

* update for kernel patch v2
* use protobuf enum for seccomp type
* don't parse /proc/pid/status twice

v3 changes:

* get rid of extra CR_STAGE_SECCOMP_SUSPEND stage
* only suspend seccomp in finalize_restore(), just before the unmap
* restore the (same) seccomp state in threads too; also add a note about
  how this is slightly wrong, and that we should at least check for a
  mismatch

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-24 17:38:32 +03:00
Cyrill Gorcunov
bf4243e303 make: Be able to force turning off piegen
For testing purposes we need to be able to disable use of the
piegen utility. So let's add a PIEGEN make option
so one can run "PIEGEN=no make" to build criu
without piegen at all.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-19 12:22:54 +03:00
Andrey Vagin
2a0c8db72b proc: mount proc with minimal permissions
Eric wants to restrict permissions for proc mounts in a non-root userns
in accordance with proc mounts in the root userns.

Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri May 8 23:49:47 2015 -0500

    mnt: Modify fs_fully_visible to deal with locked ro nodev and atime

    Ignore an existing mount if the locked readonly, nodev or atime
    attributes are less permissive than the desired attributes
    of the new mount.
...

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-19 12:20:15 +03:00
Pavel Emelyanov
350a7a982a Revert "cgroups: Add ability to reuse existing cgroup yard directory"
Reasoning: some systems have /sys/fs/cgroup stuff mounted as read-only
and we have to either remount it rw or create our own set. The former
doesn't look sane as this rw remounting is also done by systemd, so
let's return back to manual cgyard construction.

This reverts commit 860df95f85.

Conflicts:
	cgroup.c
	include/cr_options.h

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-16 19:15:20 +03:00
Tycho Andersen
081a5b9e77 pie: use the /proc fd for last pid
Instead of keeping around multiple fds that point to various places in
/proc, let's just use /proc and openat() things relative to it.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-16 12:17:37 +03:00