Make it possible to skip the network lock, so that use cases which break
connections anyway can work without iptables/nftables being present.
Signed-off-by: Michał Mirosław <emmir@google.com>
This commit enables checkpointing and restoring of applications as
non-root.
The first goal was to enable checkpoint and restore of the env00 and
pthread00 test cases.
This uses the information from opts.unprivileged and opts.cap_eff to
skip certain code paths which do not work as non-root.
Co-authored-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
A change in a file's r/w/x permissions between checkpoint and restore
does not necessarily imply that something is wrong. For example,
if a process opens a file having perms rw- for reading and
we change the perms to r--, the process can be restored and
will function as expected.
Therefore, this patch adds an option
--skip-file-rwx-check
to disable this check on restore. File validation is unaffected
and should still function as expected with respect to the content
of files.
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
We plan to switch to the Mounts-v2 engine for restoring mounts by default;
this option allows switching back to the old engine. This patch only adds
the option, there is no engine behind it yet.
Cherry-picked from Virtuozzo criu:
https://src.openvz.org/projects/OVZ/repos/criu/commits/503f9ad2c
Changes: allow --mntns-compat-mode option only on restore and only if
MOVE_MOUNT_SET_GROUP is supported (this also requires change in
unittest/mock.c), change id in rpc criu_opts.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This commit adds feature check support to libcriu. It already exists in
the CLI and RPC and this just extends it to libcriu.
This commit provides one function to do all possible feature checks in
one call. The parameter to the feature check function is a structure and
the user can enable which features should be checked.
Using a structure makes the function extensible without the need to
break the API/ABI in the future.
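
A minimal sketch of the struct-plus-size pattern described above. All
names here (`example_features`, `example_feature_check()`, the field
names and the stand-in `supported` table) are hypothetical and are not
taken from libcriu; the sketch only illustrates why passing a structure
together with its size keeps the call extensible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature set; a later version may append fields at the end. */
struct example_features {
	bool mem_track;
	bool lazy_pages;
	bool pidfd_store;
};

/* Stand-in for asking the service: pretend only mem_track is supported. */
static const struct example_features supported = { .mem_track = true };

/*
 * In:  each field set to true requests a check of that feature.
 * Out: a field stays true only if it was requested and is supported.
 * `size` is sizeof() of the struct the caller was compiled against, so a
 * caller built against an older, shorter struct keeps working.
 */
int example_feature_check(struct example_features *features, size_t size)
{
	struct example_features local;

	assert(features != NULL);
	if (size > sizeof(local))
		return -1; /* caller is newer than this library */

	/* Fields the caller does not know about default to "not requested". */
	memset(&local, 0, sizeof(local));
	memcpy(&local, features, size);

	local.mem_track = local.mem_track && supported.mem_track;
	local.lazy_pages = local.lazy_pages && supported.lazy_pages;
	local.pidfd_store = local.pidfd_store && supported.pidfd_store;

	/* Copy back only as many bytes as the caller's struct holds. */
	memcpy(features, &local, size);
	return 0;
}
```

A caller would set the fields it cares about, call
`example_feature_check(&f, sizeof(f))` and read the fields back.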
Signed-off-by: Adrian Reber <areber@redhat.com>
In contrast to the CLI it is not possible to do a single pre-dump via
RPC and thus libcriu. In cr-service.c pre-dump always goes into a
pre-dump loop followed by a final dump. runc already works around this
to only do a single pre-dump by killing the CRIU process waiting for the
message for the final dump.
When implementing pre-dump in crun via libcriu, it is not as easy to
work around CRIU's pre-dump loop expectations as it is with runc, which
talks to CRIU directly via RPC.
We know that LXC/LXD also does single pre-dumps using the CLI and runc
also only does single pre-dumps by misusing the pre-dump loop interface.
With this commit it is possible to trigger a single pre-dump via RPC and
libcriu without misusing the interface provided via cr-service.c. So
this commit basically brings CRIU in line with the existing use cases.
The existing pre-dump loop still sounds like a very good idea, but so
far most tools have decided to implement the pre-dump loop themselves.
With this change we can implement pre-dump in crun to match what is
currently implemented in runc.
Signed-off-by: Adrian Reber <areber@redhat.com>
In runc we use the join-ns RPC API to enable checkpoint/restore of
containers with shared namespaces. Shared namespaces are often used
when containers run inside a Kubernetes Pod.
In crun we use libcriu to interface with CRIU, however it currently
doesn't provide an API for join-ns. This patch adds the necessary
libcriu API to enable checkpoint/restore of containers with shared
namespaces with crun.
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
Fixes: #1560
The latest protobuf-c compiler breaks CRIU because they removed
leading underscores from structs in 1.4.0.
This replaces those definitions with the standard generated structs.
v2: remove struct _VmaEntry, struct _CredsEntry and struct _CoreEntry
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
The pidfd_store_sk option will be used later to store tasks' pidfds
between pre-dumps to detect pid reuse reliably.
pidfd_store_sk should be an fd of a connectionless unix socket.
init_pidfd_store_sk() steals the socket from the RPC client using
pidfd_getfd, checks that it is a connectionless unix socket and
checks whether it has been initialized before (an uninitialized socket
is unnamed).
If it is not initialized, the socket is first bound to an abstract name
(a combination of the real pid/fd to avoid overlap), then it is
connected to itself, which allows us to store the pidfds in the
receive queue of the socket (this is similar to how fdstore_init()
works).
v2:
- avoid close(pidfd) overriding errno of SYS_pidfd_open in
init_pidfd_store_sk()
- close pidfd_store_sk because we might have a leftover from
previous iterations
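
The self-connected socket described above can be sketched roughly as
follows. This is not the code from init_pidfd_store_sk(): the function
names and the abstract name format are hypothetical, the socket is
created locally instead of being taken over with pidfd_getfd, and a plain
byte is queued where the real code would queue pidfds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Create a connectionless unix socket, bind it to an abstract name and
 * connect it to itself. Returns the fd, or -1 on error.
 */
int example_make_self_connected_sk(void)
{
	struct sockaddr_un addr;
	socklen_t len;
	int sk, n;

	sk = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (sk < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* sun_path[0] stays '\0': on Linux this marks an abstract address. */
	n = snprintf(addr.sun_path + 1, sizeof(addr.sun_path) - 1,
		     "example-store-%d-%d", (int)getpid(), sk);
	assert(n > 0 && (size_t)n < sizeof(addr.sun_path) - 1);
	len = offsetof(struct sockaddr_un, sun_path) + 1 + n;

	if (bind(sk, (struct sockaddr *)&addr, len) < 0 ||
	    connect(sk, (struct sockaddr *)&addr, len) < 0) {
		close(sk);
		return -1;
	}
	return sk;
}

/*
 * Queue one byte on the socket and read it back from its own receive
 * queue. Returns the byte, or -1 on error.
 */
int example_roundtrip_byte(int sk, unsigned char value)
{
	unsigned char out = 0;

	if (send(sk, &value, 1, 0) != 1)
		return -1;
	if (recv(sk, &out, 1, 0) != 1)
		return -1;
	return out;
}
```

Because the socket is connected to its own address, anything sent on it
waits in its receive queue until it is read again from the same fd.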
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
When using libcriu with the notify callback functionality, CRIU transmits
an FD during 'orphan-pts-master' back to the libcriu user. This message
is sent via sendmsg() to transmit the FD and not via write() like all
other protobuf messages.
libcriu was using recv(); to be able to receive the FD this needs to
be changed to recvmsg(). If an FD is attached to the message (currently
only for 'orphan-pts-master'), this FD is stored in a variable which can
be retrieved with the function criu_get_orphan_pts_master_fd().
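
To illustrate why recv() is not enough here, the following is a generic
sketch of passing an FD as SCM_RIGHTS ancillary data and picking it up
with recvmsg(). The helper names are hypothetical and this is not the
libcriu implementation; it only shows the mechanism the change relies on.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough for one fd, aligned for struct cmsghdr. */
union example_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one payload byte with `fd` attached. Returns 0 or -1. */
int example_send_with_fd(int sk, int fd)
{
	char data = 'x';
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union example_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sk, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Receive one message. *fd_out is set to the attached fd, or to -1 if
 * the message carried none. Returns 0 or -1.
 */
int example_recv_with_fd(int sk, int *fd_out)
{
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union example_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd_out != NULL);
	*fd_out = -1;
	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	/* A plain recv() would return the payload but drop the fd. */
	if (recvmsg(sk, &msg, 0) < 0)
		return -1;

	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS) {
			memcpy(fd_out, CMSG_DATA(cmsg), sizeof(int));
			break;
		}
	}
	return 0;
}
```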
Signed-off-by: Adrian Reber <areber@redhat.com>
Although the CRIU version is exported as macros in version.h, these only
contain the CRIU version that libcriu was built against.
As it is possible that CRIU has been upgraded since the last time
something was built against libcriu, this adds functions to query the
actual CRIU binary about its version.
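
A small sketch of how a caller might compare a version reported at run
time against a minimum it needs. The single-integer encoding used here
(major * 10000 + minor * 100 + sublevel) and both helper names are
assumptions made for illustration; this message does not specify the
format the new functions return.

```c
#include <assert.h>

/* Assumed encoding: major * 10000 + minor * 100 + sublevel. */
int example_encode_version(int major, int minor, int sublevel)
{
	assert(major >= 0);
	assert(minor >= 0 && minor < 100);
	assert(sublevel >= 0 && sublevel < 100);
	return major * 10000 + minor * 100 + sublevel;
}

/*
 * Returns 1 if the version reported by the running binary is at least
 * `minimum`, 0 otherwise. `runtime_version` stands in for the value a
 * run-time query would return; a build-time macro cannot provide it.
 */
int example_version_at_least(int runtime_version, int minimum)
{
	return runtime_version >= minimum;
}
```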
Signed-off-by: Adrian Reber <areber@redhat.com>
The orphan pts master option was introduced with commit [1]
to enable checkpoint/restore of containers with a pty pair
used as a console.
[1] 6afe523d97
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Two modes of pre-dump algorithm:
1) splicing memory by parasite
--pre-dump-mode=splice (default)
2) using process_vm_readv syscall
--pre-dump-mode=read
Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Instead of creating the cgroup yard in CRIU, we can now create it
externally and pass it to CRIU. This is useful if somebody doesn't want
to grant CAP_SYS_ADMIN to CRIU.
Signed-off-by: Michał Cłapiński <mclapinski@google.com>
Including the version information of CRIU in criu.h is required by
projects that use libcriu to preserve backward compatibility.
Closes #738
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
This commit removes the functions criu_(local_)set_service_comm().
These functions are not necessary, because once
set_service_address(), set_service_fd() or
set_service_binary() has been called, it is already clear which
service comm type should be used.
Furthermore, this commit reduces the potential for misuse,
e.g. set_service_comm() being set to socket while a binary was given
via set_service_binary().
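
The idea can be sketched as follows: each setter records the comm type
itself, so the type and the stored value cannot disagree. The struct,
enum and function names are hypothetical stand-ins, not the libcriu
definitions, and the strings are stored without copying to keep the
sketch short.

```c
#include <assert.h>
#include <stddef.h>

enum example_comm { EX_COMM_SOCKET, EX_COMM_FD, EX_COMM_BINARY };

struct example_service_opts {
	enum example_comm comm;
	const char *address;
	int fd;
	const char *binary;
};

/* Each setter implies the comm type; there is no separate set_comm(). */
void example_set_service_address(struct example_service_opts *o,
				 const char *path)
{
	assert(o != NULL && path != NULL);
	o->address = path;
	o->comm = EX_COMM_SOCKET;
}

void example_set_service_fd(struct example_service_opts *o, int fd)
{
	assert(o != NULL && fd >= 0);
	o->fd = fd;
	o->comm = EX_COMM_FD;
}

void example_set_service_binary(struct example_service_opts *o,
				const char *path)
{
	assert(o != NULL && path != NULL);
	o->binary = path;
	o->comm = EX_COMM_BINARY;
}
```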
Signed-off-by: Martin Wührer <martin.wuehrer@artech.at>
This commit checks after each strdup() call whether the call succeeded.
If not, the function that calls strdup() returns an error.
This requires that the return value of several functions be
changed from void to int.
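
A minimal sketch of what such a setter looks like after the change. The
struct and function names are hypothetical, and returning -ENOMEM is an
assumption about the error value, not something this message states.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct example_opts {
	char *log_file;
};

/*
 * Formerly this kind of setter would return void and ignore a failed
 * strdup(). Now the failure is reported to the caller.
 */
int example_set_log_file(struct example_opts *o, const char *log_file)
{
	char *copy;

	assert(o != NULL && log_file != NULL);

	copy = strdup(log_file);
	if (copy == NULL)
		return -ENOMEM;

	/* Replace the old value only after the copy succeeded. */
	free(o->log_file);
	o->log_file = copy;
	return 0;
}
```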
Signed-off-by: Martin Wührer <martin.wuehrer@artech.at>
As most of the `criu_(local_)*` functions already call `strdup()`,
it is possible to change the function signatures to take `const char *`.
As the struct `criu_opts` already contains a `const char *
service_binary`, the member `service_address` is also changed to
`const char *`.
Additionally, the function `criu_local_set_freeze_cgroup()` now
calls `strdup()`.
Signed-off-by: Martin Wührer <martin.wuehrer@artech.at>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
According to https://criu.org/API_compliance, the C library
doesn't support the page server option.
This patch adds the functions
`criu_(local_)set_page_server_address_port()`
that allow specifying the IP address and TCP port on which the page
server is listening.
This patch affects only the c-lib, as criu-rpc already supports the
pageserver settings.
Signed-off-by: Martin Wührer <martin.wuehrer@artech.at>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
If `criu_local_init_opts()` is applied to the same opts object
several times, not all of the allocated memory gets freed.
Therefore, the functions `criu_(local_)free_opts()` are introduced.
These functions ensure that opts get freed properly.
Furthermore, `criu_(local_)free_opts()` becomes part of the C API
and can therefore be called by external projects too.
Additionally, with this commit `criu_local_init_opts()` now uses
`criu_local_free_opts()` to free the opts parameter if it was already
initialized before.
This commit also contains a fix in `send_req_and_recv_resp_sk()` for
a memory leak that occurred if criu notifications were received.
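
The init/free relationship described above can be sketched like this.
The names and the single string member are hypothetical; the sketch
assumes the caller's pointer is NULL before the first init, which is how
"already initialized" is detected here.

```c
#include <assert.h>
#include <stdlib.h>

struct example_opts {
	char *log_file; /* owned by the opts object */
};

/* Free the opts object and everything it owns; safe on a NULL object. */
void example_free_opts(struct example_opts **o)
{
	assert(o != NULL);
	if (*o == NULL)
		return;
	free((*o)->log_file);
	free(*o);
	*o = NULL;
}

/*
 * (Re-)initialize opts. If *o was initialized before, release the old
 * object first so that repeated calls do not leak. Returns 0 or -1.
 */
int example_init_opts(struct example_opts **o)
{
	struct example_opts *fresh;

	assert(o != NULL);
	example_free_opts(o);

	fresh = calloc(1, sizeof(*fresh));
	if (fresh == NULL)
		return -1;
	*o = fresh;
	return 0;
}
```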
Signed-off-by: Martin Wührer <martin.wuehrer@artech.at>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
service_binary is either set to a const char * (CR_DEFAULT_SERVICE_BIN)
or to a user-provided char *, but there is no reason to require a char *.
Users of this function will most likely provide a const char *,
which will generate a warning.
Thus, we add the const qualifier to better represent the usage of
service_binary, and avoid such warnings.
Signed-off-by: Ronny Chevalier <ronny.chevalier@hp.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It is already present in RPC, so let's add it to libcriu too.
Signed-off-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
It is already present in CLI and RPC, so libcriu should reflect it too.
travis-ci: success for lib: add inherit_fd
Signed-off-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The C standard specifies that the first enum element is 0 and that each
subsequent one is the previous element plus 1 (C90, "3.5.2.2 Enumeration
specifiers").
Therefore, there is no need to explicitly specify element values.
The explicit initializers were added in the first commit introducing
this enum (commit 46e8aee).
While at it, let's also add a comma after the last element, for any
future patch adding more elements to look better.
No functional change.
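
For illustration, an enum written in the style described above. The
enumerator names are hypothetical placeholders; the lookup function
relies on the implicit values 0, 1, 2 as array indices.

```c
#include <assert.h>

/* No explicit values: the first is 0, each next one is previous + 1. */
enum example_comm {
	EX_COMM_SK,
	EX_COMM_FD,
	EX_COMM_BIN, /* trailing comma keeps future additions to one line */
};

/* Map an enumerator to a printable name via its implicit value. */
const char *example_comm_name(enum example_comm c)
{
	static const char *const names[] = { "socket", "fd", "binary" };

	assert((unsigned int)c <= EX_COMM_BIN);
	return names[c];
}
```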
Cc: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
This adds the previously introduced option to skip in-flight connections
to the RPC interface. The skip in-flight connections option is also
described in criu.txt.
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Needed for container migration, where arguments are
set via p.haul as an RPC request.
Signed-off-by: Nikita Spiridonov <nspiridonov@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
For handling --cgroup-props, --cgroup-props-file and
--cgroup-dump-controller from RPC interface.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
This reverts commit a98014f306be4b4fefdf01af31e1efa5d83e5e4f.
As per Saied Kazemi, dump actually works without seccomp support
from the kernel on non-seccomped tasks. The only problem was with
criu check, but this will be addressed separately.
Reverting the commit so as not to burden the API with (yet) unneeded stuff.
Conflicts:
lib/c/criu.h
Sometimes we may want to use CRIU on older kernels which don't support
dumping seccomp state where we don't actually care about the seccomp state.
Of course this is unsafe, but it does allow for c/r of things using
seccomp on these older kernels in some cases. When the task is in
SECCOMP_MODE_STRICT or SECCOMP_MODE_FILTER with filters that block the
syscalls criu's parasite code needs, the dump will still fail.
Note that we disable seccomp by simply feigning that we are in mode 0. This
is a little hacky, but avoids distributing ifs throughout the code and
keeps them in this one place.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
CC: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The CRIU library and the CRIT Python data are moved into
lib/c and lib/py, respectively.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>