The tar command was failing with the following message:
$ tar cf criu.tar ../../../criu
tar: Removing leading `../../../' from member names
tar: ../../../criu/scripts/ci/criu.tar: archive cannot contain itself; not dumped
In addition, the /vagrant no-longer exist in the new Fedora images.
bash: line 1: cd: /vagrant: No such file or directory
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Installing this package currently fails with the following message:
Package qemu is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Package 'qemu' has no installation candidate
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
A kernel change (commit 12f147ddd6de, "do_change_type(): refuse to
operate on unmounted/not ours mounts") modified how mount propagation
properties can be changed. Previously, these properties could be changed
from any mount namespace. Now, they can only be modified from the
specific mount namespace where the target mount is actually mounted
This commit addresses this new restriction by ensuring that CRIU enters the
correct mount namespace before attempting to restore mount propagation
properties (MS_SLAVE or MS_SHARED) for a mount.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
See the previous commit for rationale and architecture-specific details.
[ avagin: tweak code comment ]
Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
After the CRIU process saves the parasite code for the target thread in
the shared mmap, it is necessary to call __clear_cache before the target
thread executes the code.
Without this step, the target thread may not see the correct code to
execute, which can result in a SIGILL signal.
For the specific arm64 case. this is important so that the newly copied
code is flushed from d-cache to RAM, so that the target thread sees the
new code.
The change is based on commit 6be10a2 by @fu.lin and on input received
from @adrianreber.
[ avagin: tweak code comment ]
Signed-off-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
In general, we use "$(E)" instead of "$(Q) echo", but we also have
a msg-gen macro which can be used here.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 68f92b551 removed images/google/protobuf directory, so it is
re-created each time during the build process.
This resulted in a weird behavior change. Previously, one could do
something like this:
git clone $CRURL criu
(cd criu && sudo make install-criu)
rm -rf criu
This worked fine, including running rm -rf as a non-root user, since no
new directories were created under criu -- all directories were still
owned by the original user.
Since commit 68f92b551 the same sequence fails:
rm: cannot remove '/home/runner/criu/images/google/protobuf/descriptor.pb-c.c': Permission denied
rm: cannot remove '/home/runner/criu/images/google/protobuf/descriptor.pb-c.d': Permission denied
rm: cannot remove '/home/runner/criu/images/google/protobuf/descriptor.pb-c.h': Permission denied
A workaround is to keep empty images/google/protobuf directory,
which is what this commit does.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 68f92b551 used `$$(Q)` instead of `$(Q)` in the Makefile target,
which resulted in the following error:
$(Q) echo "Generating descriptor.pb-c.c"
/bin/sh: 1: Q: not found
Generating descriptor.pb-c.c
$(Q) protoc --proto_path=/usr/include --proto_path=images/ --c_out=images/ /usr/include/google/protobuf/descriptor.proto
/bin/sh: 1: Q: not found
as well as:
$(Q) rm -rf images/google
/bin/sh: line 1: Q: command not found
Fix it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Currently the build scripts create the following symlink:
criu-4.1/images/google/protobuf/descriptor.proto -> /usr/include/google/protobuf/descriptor.proto
This symlink points to a system-wide absolute-path target. Also,
this symlink ends up in the release tarball. The tarball may later be
downloaded and unpacked by e.g. OS distributions. If unpacking is
done using Python 3.14+, it will fail.
This happens because Python 3.14 will switch the default behavior of
extractall() from "fully trusting the content of archive" to
"disallow common attack vectors while extracting the archive".
With this new behavior, extractall() raises an exception when at
least one file in the archive extracts or points to outside of the
extraction directory (these are called path traversal attacks and
zip slip attacks).
Reported-by: Dmitrii Kuvaiskii <dimakuv@amazon.de>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The test creates a file bindmount in criu mntns and binds it into test
mntns, this external file bindmount is autodetected and restored via
"--external mnt[]" criu option.
Note: In previous patch we fix the problem on this code path where file
bindmount restore fails as there is excess "/" in source path.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
E.g. I have a /etc/hosts in workspace mounted from the host, and get the following message.
(00.141008) 1: mnt-v2: Create plain mountpoint /tmp/.criu.mntns.K1biY1/mnt-0000000938 for 938
(00.141546) 1: mnt-v2: Mounting unsupported @938 (0)
(00.141887) 1: mnt-v2: Bind /tmp/agent/1-d8c746c6fda3a8b2/workspace/etc/hosts/ to /tmp/.criu.mntns.K1biY1/mnt-0000000938
(00.142179) 1: Error (criu/mount-v2.c:319): mnt-v2: Failed to open_tree /tmp/agent/1-d8c746c6fda3a8b2/workspace/etc/hosts/: Not a directory
(00.143774) Error (criu/cr-restore.c:2320): Restoring FAILED.
Signed-off-by: Chuan Qiu <qiuc12@gmail.com>
Add ZDTM static tests for IP4/ICMP and IP6/ICMP
socket feature.
Signed-off-by: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
Signed-off-by: Andrei Vagin <avagin@google.com>
Currently there is no option to checkpoint/restore programs that use
ICMP sockets, such as `ping`. This patch adds support for the same.
Fixes#2557
Signed-off-by: समीर सिंह Sameer Singh <lumarzeli30@gmail.com>
net/unix/max_dgram_qlen can't be tuned from non-root userns before:
v5.17-rc1~170^2~215 ("net: Enable max_dgram_qlen unix sysctl to be
configurable by non-init user namespaces")
Signed-off-by: Andrei Vagin <avagin@google.com>
We dump sysctls from criu user namespace, but restore from restored user
namespace. So group id values should be mapped to the restored user
namespace gid space to restore correctly.
Signed-off-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
We have ability to skip sysctl if there is no value, but we still give
n requests to sysctl_op, that is not correct and probably can segfault
on nullptr access. Fix it by adding ri to count non skipped requests.
To be on the safe side, let's add a check that ri == n on read, as we
should not do any skips there.
While on it lets fix bad error message prefix: s/unix/ipv4/.
Remove excess has_iarg set, and add sarg reset to NULL for the case
sysctl_op skipped it.
Signed-off-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Having CTL_FLAGS_IPC_EACCES_SKIP == (CTL_FLAGS_OPTIONAL |
CTL_FLAGS_READ_EIO_SKIP) is probably not what we want. So let's make it
a real distinct flag.
Fixes: 840735aa0 ("ipc_sysctl: Prioritize restoring IPC variables using non usernsd approach")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The `criu cpuinfo check` command calls cpu_validate_cpuinfo(), which
attempts to open the cpuinfo.img file using `open_image()`. If the
image file is not found, `open_image()` returns an "empty image"
object. As a result, `cpu_validate_cpuinfo()` tries to read from it
and fails with the following error:
(00.002473) Error (criu/protobuf.c:72): Unexpected EOF on (empty-image)
This patch adds a check for an empty image and appropriate error message.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The cpuinfo command requires a "dump" or "check" subcommand. Thus, we
replace `CR_CPUINFO` with `CR_CPUINFO_DUMP` and `CR_CPUINFO_CHECK`.
This allows us to remove unnecessary subcommand check in
`image_dir_mode()` and perform all parsing in `parse_criu_mode()`.
With this change the check for validating the cpuinfo subcommand is
now done only once with `CR_CPUINFO_DUMP` or `CR_CPUINFO_CHECK` enum.
Signed-off-by: Liana Koleva <43767763+lianakoleva@users.noreply.github.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
CRIU currently requires a number of dependencies in order to build from
source. The package names vary across distributions and package
managers. A Nix flake allows developers to spin up a dev environment
with `nix develop`, eliminating the hassle of manual dependency
management. It also prevents polluting the global package set on the
machine.
Signed-off-by: Prajwal S N <prajwalnadig21@gmail.com>
In this test we want to ensure that contents of droppable mappings
and mappings with MADV_WIPEONFORK is properly restored in
parent/child processes.
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Support MAP_DROPPABLE [1] by detecting it from /proc/<pid>/smaps
and restoring it as a normal private mapping flag on vma with only
difference that instead of MAP_PRIVATE we should use MAP_DROPPABLE.
[1] 9651fcedf7
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Support VM_WIPEONFORK [1] by detecting it from /proc/<pid>/smaps
and setting a corresponding MADV_WIPEONFORK flag on vma.
[1] d2cd9ede6e
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
The opts['action'] contains actor function and not the action name, so
we should compare it with a function.
While on it let's also add a comment about --criu-bin option if CRIU
binary is missing.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
By default zdtm expects that criu is built from source first and only
then you can run zdtm tests against it. But what if you really want to
run tests against a criu version installed on the system? Yes there is
already a nice option for zdtm to change the criu binary it uses
"--criu-bin", but it would still end up using the pycriu module from
source and you would still have to build everything beforehand.
Let's add an option to change the path where zdtm searches for pycriu
module "--pycriu-search-path". This way we can run zdtm tests on the
criu installed on the system directly without building criu from source,
e.g. on Fedora it works like:
test/zdtm.py run --criu-bin /usr/sbin/criu \
--pycriu-search-path /usr/lib/python3.13/site-packages \
-t zdtm/static/env00
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This patch implements the entire logic to enable the offloading of
buffer object content restoration.
The goal of this patch is to offload the buffer object content
restoration to the main CRIU process so that this restoration can occur
in parallel with other restoration logic (mainly the restoration of
memory state in the restore blob, which is time-consuming) to speed up
the restore phase. The restoration of buffer object content usually
takes a significant amount of time for GPU applications, so
parallelizing it with other operations can reduce the overall restore
time.
It has three parts: the first replaces the restoration of buffer objects
in the target process by sending a parallel restore command to the main
CRIU process; the second implements the POST_FORKING hook in the amdgpu
plugin to enable buffer object content restoration in the main CRIU
process; the third stops the parallel thread in the RESUME_DEVICES_LATE
hook.
This optimization only focuses on the single-process situation (common
case). In other scenarios, it will turn to the original method. This is
achieved with the new `parallel_disabled` flag.
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
Currently the restore of buffer object comsumes a significant amount of
time. However, this part has no logical dependencies with other restore
operations. This patch introduce some structures and some helper
functions for the target process to offload this task to the main CRIU
process.
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
When enabling parallel restore, the target process and the main CRIU
process need an IPC interface to communicate and transfer restore
commands. This patch adds a Unix domain TCP socket and stores this
socket in `fdstore`.
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
Currently, parallel restore only focuses on the single-process
situation. Therefore, it needs an interface to know if there is only one
process to restore. This patch adds a `has_children` function in
`pstree.h` and replaces some existing implementations with this
function.
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
Currently, when CRIU calls `cr_plugin_init`, `fdstore` is not
initialized. However, during the plugin restore procedure, there may be
some common file operations used in multiple hooks. This patch moves
`cr_plugin_init` after `fdstore_init`, allowing `cr_plugin_init` to use
`fdstore` to place these file operations.
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
Currently, in the target process, device-related restore operations and
other restore operations almost run sequentially. When the target
process executes the corresponding CRIU hook functions, it can't perform
other restore operations. However, for GPU applications, some device
restore operations have no logical dependencies on other common restore
operations and can be parallelized with other operations to speed up the
process.
Instead of launching a thread in child processes for parallelization,
this patch chooses to add a new hook, `POST_FORKING`, in the main CRIU
process to handle these restore operations. This is because the
restoration of memory state in the restore blob is one of the most
time-consuming parts of all restore logic. The main CRIU process can
easily parallelize these operations, whereas parallelizing in threads
within child processes is challenging.
- POST_FORKING
*POST_FORKING: Hook to enable the main CRIU process to perform some
restore operations of plugins.
Signed-off-by: Yanning Yang <yangyanning@sjtu.edu.cn>
Building CRIU on Ubuntu 20.04 fails with the following error:
criu/sk-inet.c: In function 'can_dump_ipproto':
criu/sk-inet.c:131:16: error: 'IPPROTO_MPTCP' undeclared (first use in this function); did you mean 'IPPROTO_MTP'?
131 | if (proto == IPPROTO_MPTCP)
| ^~~~~~~~~~~~~
| IPPROTO_MTP
Add definition for MPTCP to fix this error.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The container checkpointing procedure in Kubernetes freezes running
containers to create a consistent snapshot of both the runtime state
and the rootfs of the container. However, when checkpointing a GPU
container, the container must be unfrozen before invoking the
cuda-checkpoint tool.
This is achieved in prepare_freezer_for_interrupt_only_mode(), which
needs to be called before the PAUSE_DEVICES hook. The patch introducing
this functionality fixes this problem for containers with multiple
processes. However, if the container has a single process,
prepare_freezer_for_interrupt_only_mode() must be invoked immediately
before the PAUSE_DEVICES hook.
Fixes: #2514
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
In 0a7c5fd1bd8d1e49e273b51ff39af473d6c68cbc we swapped the BSD
implementation of strlcat and strlcpy in favor of our own replacement.
The checks and the predefined macros are not needed anymore.
Signed-off-by: Lorenzo Fontana <fontanalorenz@gmail.com>
In some cases, they might not work in virtual machines if the hypervisor
doesn't virtualize them. For example, they don't work in AMD SEV virtual
machines if the Debug Virtualization extension isn't supported or isn't
enabled in SEV_FEATURES.
Fixes#2658
Signed-off-by: Andrei Vagin <avagin@gmail.com>
With Go version 1.24, ListenConfig now uses MPTCP by default [1].
Checkpoint/restore for this protocol is not currently supported
and adding support requires kernel changes that are not trivial
to implement. As a result, checkpointing of many containers that
run Go programs is likely to fail with the following error [2]:
(00.026522) Error (criu/sk-inet.c:130): inet: Unsupported proto 262 for socket 2f9bc5
This patch adds a message with suggested workaround for this problem.
[1] https://go.dev/doc/go1.24#netpkgnet
[2] https://github.com/checkpoint-restore/criu/issues/2655
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
It makes root mount readonly and checks that it is still readonly after
migration.
Make zdtm/static writable for logs via "bind" desc option.
v2: explain why we don't have explicit rw/ro flag check
v3: use new zdtm "bind" desc option
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Add {'bind': 'path/to/bindmount'} zdtm descriptor option, so that in
test mount namespace a directory bindmount can be created before running
the test.
This is useful to leave test directory writable (e.g. for logs) while
the test makes root mount readonly. note: We create this bindmount early
so that all test files are opened on it initially and not on the below
mount. Will be used in mnt_ro_root test.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Mount flags belong to mount and mount namespace of the Container, so we
should preserve them, as Container user will not expect mounts switching
between ro and rw over c/r.
Fixes: #2632
v5: fix both mount-v1 and mount-v2
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Building CRIU package on Debian 11 aarch64 fails with
criu/arch/aarch64/crtools.c: In function 'save_pac_keys':
criu/arch/aarch64/crtools.c:32:31: error: storage size of 'paca' isn't known
struct user_pac_address_keys paca;
^~~~
criu/arch/aarch64/crtools.c:33:31: error: storage size of 'pacg' isn't known
struct user_pac_generic_keys pacg;
^~~~
criu/arch/aarch64/crtools.c:47:15: error: 'HWCAP_PACA' undeclared (first use in this function); did you mean 'HWCAP_FCMA'?
if (hwcaps & HWCAP_PACA) {
^~~~~~~~~~
HWCAP_FCMA
criu/arch/aarch64/crtools.c:47:15: note: each undeclared identifier is reported only once for each function it appears in
criu/arch/aarch64/crtools.c:53:44: error: 'NT_ARM_PACA_KEYS' undeclared (first use in this function); did you mean 'NT_ARM_SVE'?
if ((ret = ptrace(PTRACE_GETREGSET, pid, NT_ARM_PACA_KEYS, &iov))) {
^~~~~~~~~~~~~~~~
NT_ARM_SVE
criu/arch/aarch64/crtools.c:73:39: error: 'NT_ARM_PAC_ENABLED_KEYS' undeclared (first use in this function)
ret = ptrace(PTRACE_GETREGSET, pid, NT_ARM_PAC_ENABLED_KEYS, &iov);
^~~~~~~~~~~~~~~~~~~~~~~
criu/arch/aarch64/crtools.c:82:15: error: 'HWCAP_PACG' undeclared (first use in this function); did you mean 'HWCAP_AES'?
if (hwcaps & HWCAP_PACG) {
^~~~~~~~~~
HWCAP_AES
criu/arch/aarch64/crtools.c:88:44: error: 'NT_ARM_PACG_KEYS' undeclared (first use in this function); did you mean 'NT_ARM_SVE'?
if ((ret = ptrace(PTRACE_GETREGSET, pid, NT_ARM_PACG_KEYS, &iov))) {
^~~~~~~~~~~~~~~~
NT_ARM_SVE
criu/arch/aarch64/crtools.c:33:31: error: unused variable 'pacg' [-Werror=unused-variable]
struct user_pac_generic_keys pacg;
^~~~
criu/arch/aarch64/crtools.c:32:31: error: unused variable 'paca' [-Werror=unused-variable]
struct user_pac_address_keys paca;
^~~~
criu/arch/aarch64/crtools.c: In function 'arch_ptrace_restore':
criu/arch/aarch64/crtools.c:227:31: error: storage size of 'upaca' isn't known
struct user_pac_address_keys upaca;
^~~~~
criu/arch/aarch64/crtools.c:228:31: error: storage size of 'upacg' isn't known
struct user_pac_generic_keys upacg;
^~~~~
criu/arch/aarch64/crtools.c:241:18: error: 'HWCAP_PACA' undeclared (first use in this function); did you mean 'HWCAP_FCMA'?
if (!(hwcaps & HWCAP_PACA)) {
^~~~~~~~~~
HWCAP_FCMA
criu/arch/aarch64/crtools.c:255:44: error: 'NT_ARM_PACA_KEYS' undeclared (first use in this function); did you mean 'NT_ARM_SVE'?
if ((ret = ptrace(PTRACE_SETREGSET, pid, NT_ARM_PACA_KEYS, &iov))) {
^~~~~~~~~~~~~~~~
NT_ARM_SVE
criu/arch/aarch64/crtools.c:261:44: error: 'NT_ARM_PAC_ENABLED_KEYS' undeclared (first use in this function)
if ((ret = ptrace(PTRACE_SETREGSET, pid, NT_ARM_PAC_ENABLED_KEYS, &iov))) {
^~~~~~~~~~~~~~~~~~~~~~~
criu/arch/aarch64/crtools.c:268:18: error: 'HWCAP_PACG' undeclared (first use in this function); did you mean 'HWCAP_AES'?
if (!(hwcaps & HWCAP_PACG)) {
^~~~~~~~~~
HWCAP_AES
criu/arch/aarch64/crtools.c:275:44: error: 'NT_ARM_PACG_KEYS' undeclared (first use in this function); did you mean 'NT_ARM_SVE'?
if ((ret = ptrace(PTRACE_SETREGSET, pid, NT_ARM_PACG_KEYS, &iov))) {
^~~~~~~~~~~~~~~~
NT_ARM_SVE
criu/arch/aarch64/crtools.c:233:6: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
int ret;
^~~
criu/arch/aarch64/crtools.c:228:31: error: unused variable 'upacg' [-Werror=unused-variable]
struct user_pac_generic_keys upacg;
^~~~~
criu/arch/aarch64/crtools.c:227:31: error: unused variable 'upaca' [-Werror=unused-variable]
struct user_pac_address_keys upaca;
^~~~~
This patch adds the missing constants and structs if undefined.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>