mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 05:48:05 +00:00

11234 Commits

Author SHA1 Message Date
Radostin Stoyanov
42a5b640f6 ci: disable CentOS 7 test in Cirrus CI
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-07-07 15:45:05 -07:00
Abhishek Guleri
3a932e9115 readme: refactor asciinema link for video playback
Instead of opening the image directly, the commit refactors the
asciinema image embedded link to redirect users to the corresponding
video.

Signed-off-by: Abhishek Guleri <abhishekguleri24@gmail.com>
2023-07-07 15:42:48 -07:00
Michał Mirosław
4c2b71c372 zdtm: Update netns purpose comment in zdtm_ct.
With the parasite socket clash now guaranteed not to happen,
the comment becomes obsolete. netns is still needed though, so
update the comment to point at the requirement.

Change-Id: I3cfb253cd5c53b91b955fcb001530b4aee5129f4
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-07-05 08:53:32 -07:00
Michał Mirosław
ba11426de5 util: Make CRIU run_id machine-level unique.
Instead of relying on chance of CLOCK_MONOTONIC reading being unique,
use pid namespace ID that combined with the process ID will make it
unique on the machine level.

If pidns is not enabled on a kernel we'll get ENOENT, but then CRIU's
pid will already be unique. If there is some other error, log it but
continue, as the socket clash (if it happens) will result in a failed
run anyway.

Fixes: 45e048d77a6a (2022-03-31 "criu: generate unique socket names")
Fixes: 408a7d82d644 (2022-02-12 "util: add an unique ID of the current criu run")
Change-Id: I111c006e1b5b1db8932232684c976a84f4256e49
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-07-05 08:53:32 -07:00
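
The run_id scheme described in ba11426de5 can be sketched in Python for illustration. This is not CRIU's code (CRIU is written in C): the helper name `machine_unique_run_id` and the way the two values are packed into one integer are assumptions. Only the idea of pairing the pid namespace ID with the process ID, and the handling of ENOENT versus other errors, comes from the commit message.

```python
import os


def machine_unique_run_id():
    """Pair the pid namespace ID with our pid (hypothetical helper)."""
    pid = os.getpid()
    try:
        # stat() follows the link; the inode number identifies the pid namespace.
        pidns_id = os.stat("/proc/self/ns/pid").st_ino
    except FileNotFoundError:
        # No pidns support (ENOENT): the pid alone is already unique.
        pidns_id = 0
    except OSError as err:
        # Any other error: report it and carry on with the pid only.
        print("warning: cannot stat /proc/self/ns/pid: %s" % err)
        pidns_id = 0
    # Packing layout is an assumption: namespace ID in the high bits.
    return (pidns_id << 32) | pid


print(hex(machine_unique_run_id()))
```

Two processes with the same pid in different pid namespaces get different values, which a CLOCK_MONOTONIC reading cannot guarantee.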
Michał Mirosław
a5939b006c kerndat: Don't fail on NETLINK/nsid support missing.
If neither netns nor connections are being dumped, nsid support is not used.
Don't fail the run here: if the support turns out to be needed, the dumping
process will fail later.

Change-Id: I39a086756f6d520c73bb6b21eaf6d9fb49a18879
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-07-05 08:47:18 -07:00
Michał Mirosław
f0d1b89f56 kerndat: unexport kerndat_nsid()
kerndat_nsid() is not used outside kerndat.c. Make it static.

Change-Id: I52e518ecb7c627cc1866e373411b2be3f71a2c9d
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-07-05 08:47:18 -07:00
Michał Mirosław
fb8ca647f2 util: Downgrade ignored errors to warnings.
If the error is ignored it is not important enough - make it a warning
instead.

From: Mian Luo <mianl@google.com>
Change-Id: If2641c3d4e0a4d57fdf04e4570c49be55f526535
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-07-05 08:47:18 -07:00
Michał Mirosław
6cfe7aa114 rpc: Support setting images_dir by path.
Google's RPC client process is in a different pidns and has more privileges --
CRIU can't open its /proc/<pid>/fd/<fd>.  For images_dir_fd to be useful here
it would need to refer to a passed or CRIU's fd.

From: Michał Cłapiński <mclapinski@google.com>
Change-Id: Icbfb5af6844b21939a15f6fbb5b02264c12341b1
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-30 16:06:59 -07:00
Michał Mirosław
8ba6efb005 rpc: Support gathering external file list after freezing process tree.
The new 'query-ext-files' action for `criu dump` is sent after
freezing the process tree. This allows deferring the gathering of
the external file list until the process tree is in a stable
state, and avoids races with the process creating and deleting
files.

Change-Id: Iae32149dc3992dea086f513ada52cf6863beaa1f
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-30 16:06:59 -07:00
Radostin Stoyanov
cde9bcf63d docker/podman: test c/r with action-script
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-06-30 16:05:20 -07:00
Radostin Stoyanov
b02d53a8f4 action-scripts: allow shell scripts in rpc mode
Container runtimes commonly use CRIU with RPC. However, this prevents
the use of action-scripts set in a CRIU configuration file due to the
explicit scripts mode introduced with the following commit:

  ac78f13bdfaee260dd4234f054bf4c5d2a373783
  actions: Introduce explicit scripts mode

This patch enables container checkpoint/restore with action-scripts
specified via configuration file.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-06-30 16:05:20 -07:00
Michał Mirosław
0588c3b21a build: libnfnetlink: Remove nla_get_s32().
nla_get_s32() was added to libnl 3.2.7 in 2015. Remove CRIU's definition
as it breaks build when statically linking the binary.

From: Uros Prestor <urosp@google.com>
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-27 21:45:46 -07:00
Michał Mirosław
eece28d29c build: Guard against libbsd's version of __aligned.
When trying to build CRIU with libbsd enabled, the compilation fails due
to a duplicate definition of the __aligned macro. Other such definitions
are already wrapped with #ifndef; make the __aligned definition consistent
and make it easier in the future to use the libbsd features if needed.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-27 21:45:46 -07:00
Michał Mirosław
85f53bdecd build: Use make-provided AR for building libzdtmtst.
Use the make-provided $(AR) for the libzdtmtst build as well.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-27 21:45:46 -07:00
Michał Mirosław
722a90ccd9 build: Fix LIBS vs LDFLAGS order.
$LDFLAGS can contain `-Ldir` options that are required by the `-llib` options
in $LIBS. Reverse the order so that the `-L` options take effect.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-27 21:45:46 -07:00
Michał Mirosław
2d76e4b31a build: Debug system feature tests.
`make` without `-s` option will normally show the commands executed. In
the case of detecting build environment features current makefile will
cause detected features to be seen as 'echo #define' commands, but not
detected ones will be silent. Change it so that all tried features can
be seen (outside of make's silent mode) regardless of detection result.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-27 21:45:46 -07:00
Michał Mirosław
aa6b633912 build: Remove HAS_MEMFD test
The test for HAS_MEMFD is empty and not used. Remove it.

Fixes: 5ee1ac1f28e6 ("criu: remove FEATURE_TEST_MEMFD")
Change-Id: I43b8f0cfd50ce9bdf93dafb647377318df1deae8
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-27 21:45:46 -07:00
Michał Mirosław
1e90fc8f4b restore: remove unused secbits field.
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-26 09:01:54 -07:00
Michał Mirosław
f5cd44f797 kerndat: Make socket feature probing work on IPv6-only host.
Try IPv6 if IPv4 sockets are not supported.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-21 17:58:59 -07:00
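
The fallback described in f5cd44f797 can be sketched in Python. The helper name `open_probe_socket` is hypothetical and the real change lives in CRIU's C code in kerndat; this only illustrates trying IPv4 first and IPv6 second when creating a socket for feature probing.

```python
import socket


def open_probe_socket(sock_type=socket.SOCK_STREAM):
    """Return a socket for probing, preferring IPv4 and falling back to IPv6."""
    last_error = None
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            return socket.socket(family, sock_type)
        except OSError as err:
            # For example EAFNOSUPPORT on a host without this address family.
            last_error = err
    raise last_error


probe = open_probe_socket()
print(probe.family)
probe.close()
```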
Michał Mirosław
12290f4583 pipes: Plug pipe fd leak in "Unable to set a pipe size" error case.
From: Piotr Figiel <figiel@google.com>
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-21 17:51:34 -07:00
Michał Mirosław
e6427c5600 sockets: Increase the size of sockets hashmap to 16K.
During dump, CRIU stores the structs representing sockets in a statically sized
hashmap of size 32. We have some (admittedly crazy) tasks that use tens of
thousands of sockets, and seem to spend most of the dump time iterating over
the linked lists of the map.

16K is chosen arbitrarily, so that it reduces the lengths of the chains to a few
elements on average, while not introducing significant memory overhead.

From: Radosław Burny <rburny@google.com>
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-21 17:47:52 -07:00
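
The effect of the bucket count in e6427c5600 can be illustrated with a small Python model of a chained hashmap. The modulo placement and the 50,000 sequential identifiers are assumptions made for the example, not CRIU's actual hash function or a measured workload.

```python
from collections import Counter


def longest_chain(keys, n_buckets):
    """Length of the longest bucket chain when keys are placed by modulo."""
    counts = Counter(key % n_buckets for key in keys)
    return max(counts.values())


socket_ids = range(50_000)  # hypothetical socket identifiers
for n_buckets in (32, 16 * 1024):
    average = len(socket_ids) / n_buckets
    print(n_buckets, round(average, 1), longest_chain(socket_ids, n_buckets))
```

With 32 buckets every lookup walks a chain of roughly 1,560 entries in this model, while with 16K buckets the chains hold about three entries each.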
Radostin Stoyanov
f018893d26 test/thp_disable: fix lint
The fail() macro provides a new line character at the end of the
message. This patch fixes the following lint check that currently
fails in CI:

$ git --no-pager grep -E '^\s*\<(pr_perror|fail)\>.*\\n"'
test/zdtm/static/thp_disable.c:		fail("prctl(GET_THP_DISABLE) returned unexpected value: %d != 1\n", ret);
test/zdtm/static/thp_disable.c:		fail("Flags changed %lx -> %lx\n", orig_flags, new_flags);
test/zdtm/static/thp_disable.c:		fail("Madvs changed %lx -> %lx\n", orig_madv, new_madv);
test/zdtm/static/thp_disable.c:		fail("post-migration prctl(GET_THP_DISABLE) returned unexpected value: %d != 1\n", ret);
test/zdtm/static/thp_disable.c:		fail("Flags changed %lx -> %lx\n", orig_flags, new_flags);
test/zdtm/static/thp_disable.c:		fail("Madvs changed %lx -> %lx\n", orig_madv, new_madv);

Fixes: #2193

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-06-19 11:41:33 -07:00
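
The lint check quoted in f018893d26 can be approximated in Python as follows. The regular expression is a translation of the grep pattern from the commit message (`\<` and `\>` become `\b`); the function name `has_redundant_newline` is hypothetical.

```python
import re

# Matches pr_perror(...) or fail(...) calls whose format string ends in \n.
TRAILING_NEWLINE = re.compile(r'^\s*\b(pr_perror|fail)\b.*\\n"')


def has_redundant_newline(line):
    """True if the source line would be flagged by the lint check."""
    return TRAILING_NEWLINE.search(line) is not None


flagged = '\t\tfail("Flags changed %lx -> %lx\\n", orig_flags, new_flags);'
fixed = '\t\tfail("Flags changed %lx -> %lx", orig_flags, new_flags);'
print(has_redundant_newline(flagged), has_redundant_newline(fixed))
```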
Michał Mirosław
a2c4dd2265 Allow skipping iptables/nftables invocation.
Make it possible to skip the network lock, so that use cases that break
connections anyway can work without iptables/nftables being present.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-19 11:22:57 -07:00
Michał Mirosław
9301aba877 zdtm: sock_opts00: Improve error messages.
Make it clear that the option numbers are indexes not the option
identifiers ("names"). Also show the value change that prompted test
failure.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 15:03:51 -07:00
Michał Mirosław
e595787cf7 zdtm: Implement test sharding.
Allow splitting the test suite into predictable sets to parallelize runs
on multiple machines or VMs.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 15:03:05 -07:00
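
One way to obtain predictable shards, as e595787cf7 describes, is to sort the test names and deal them out round-robin. This is a minimal sketch under that assumption; the function `shard` and its parameters are hypothetical and may not match how zdtm selects tests.

```python
def shard(tests, index, count):
    """Return the tests belonging to shard `index` out of `count` shards."""
    if not 0 <= index < count:
        raise ValueError("shard index must be in [0, count)")
    # Sorting makes the split independent of discovery order.
    return sorted(tests)[index::count]


tests = ["pipes00", "sock_opts00", "thp_disable", "env00", "maps00"]
for i in range(2):
    print(i, shard(tests, i, 2))
```

Every test lands in exactly one shard, and the same inputs always give the same split, so separate machines can each run one index without coordinating.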
Michał Mirosław
05f2319f1e zdtm: Allow --keep-going for single test.
We don't want the test framework to change its behaviour depending on
whether we run a single test or multiple tests in a run. When we shard the
test suite, some shards can end up with a single test to run, which would
unexpectedly change the test output format.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 14:59:33 -07:00
Michał Mirosław
0bd5abe4ed zdtm: Add timeouts for test commands.
Extend ability to limit time taken to all CRIU invocations.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 14:54:10 -07:00
Michał Mirosław
e55e168e90 zdtm: Allow overriding /tmp.
Use $TMPDIR for tests_root as the host's /tmp might not have enough
features or space.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 14:44:23 -07:00
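
The $TMPDIR override in e55e168e90 amounts to something like the following Python sketch. The helper name `make_tests_root` and the directory prefix are hypothetical; only the use of $TMPDIR in place of a hard-coded /tmp comes from the commit message.

```python
import os
import tempfile


def make_tests_root(prefix="criu-root-"):
    """Create the tests root under $TMPDIR, falling back to /tmp."""
    base = os.environ.get("TMPDIR", "/tmp")
    return tempfile.mkdtemp(prefix=prefix, dir=base)


root = make_tests_root()
print(root)
os.rmdir(root)
```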
Michał Cłapiński
7f7b553af3 Allow passing --log_to_stderr via RPC.
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 09:39:30 -07:00
Michał Cłapiński
161a5ff8d4 Allow passing --display_stats via RPC.
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 09:37:36 -07:00
Michał Mirosław
3d4d943253 Allow passing --leave_stopped by RPC.
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-17 09:34:48 -07:00
Haorong Lu
9e2e56006b compel/test: Return 0 in case of error in fdspy
This commit revises the error handling in the fdspy test. Previously,
a failure case could have been incorrectly reported as successful because
of a specific check `pass != 0`, leading to potential false positives
when `check_pipe_ends()` returned `-1` due to a read/write pipe error.

To improve this, we've adjusted the error handling to return `0` in case
of any error. As such, the final success condition remains unchanged. This
approach will help accurately differentiate between successful and failed
cases, ensuring the output "All OK" is printed for success, and "Something
went WRONG" for any failure.

Fixes: 5364ca3 ("compel/test: Fix warn_unused_result")

Signed-off-by: Haorong Lu <ancientmodern4@gmail.com>
2023-06-17 09:21:57 -07:00
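
The false positive described in 9e2e56006b can be reproduced with a small Python model. The functions below are hypothetical stand-ins for `check_pipe_ends()` and the final `pass != 0` test in the C source; they only show why returning `-1` on error slips past a non-zero check while returning `0` does not.

```python
def all_ok(results):
    """Final success condition: every check must return non-zero."""
    return all(result != 0 for result in results)


def check_old(io_error, data_matches):
    """Old behaviour: -1 on I/O error, 1 on match, 0 on mismatch."""
    if io_error:
        return -1
    return 1 if data_matches else 0


def check_new(io_error, data_matches):
    """New behaviour: any error is reported as 0, same as a mismatch."""
    if io_error:
        return 0
    return 1 if data_matches else 0


print(all_ok([check_old(True, False)]))  # error counted as success
print(all_ok([check_new(True, False)]))  # error counted as failure
```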
Michał Mirosław
c6ee1ba05e Fill FPU init state if it's not provided by kernel.
Apparently Skylake uses init-optimization when saving FPU state, and ptrace()
returns XSTATE_BV[0] = 0 meaning FPU was not used by a task (in init state).
Since CRIU restore uses sigreturn to restore registers, FPU state is always
restored. Fill the state with default values on dump to make restore happy.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-15 17:42:02 -07:00
Michał Mirosław
c75c017e4c zdtm: thp_disable: Verify MADV_NOHUGEPAGE before migration
Add a sanity check for THP_DISABLE. This discovered a broken commit
in Google's kernel tree.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-14 17:38:37 -07:00
Michał Mirosław
6006cb6eaf zdtm: thp_disable: Verify prctl(THP_DISABLE) migration
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-14 17:38:37 -07:00
Michał Mirosław
d3a33ca1ef zdtm: thp_disable: Output a single failure message
While at it, don't carry over stale errno to the fail() message.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-14 17:38:37 -07:00
Michał Mirosław
7ca6856be4 Log if prctl(SET_THP_DISABLE) doesn't work as expected.
If prctl(SET_THP_DISABLE) is not used due to bad semantics, log it
for easier debugging.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-14 17:38:37 -07:00
Michał Mirosław
9e6454f50b Restore THP_DISABLE prctl.
The original commit added saving THP_DISABLED flag value, but missed
restoring it. There is restoring code, but used only when --lazy_pages
mode is enabled. Restore the prctl flag always. While at it, rename the
`has_thp_enabled` -> `!thp_disabled` for consistency.

Fixes: bbbd597b4124 (2017-06-28 "mem: add dump state of THP_DISABLED prctl")
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-14 17:38:37 -07:00
Michał Mirosław
11288c968d Fix mount(cgroup2) for older kernels.
Linux 4.15 doesn't like empty string for cgroup2 mount options.
Pass NULL then to satisfy the kernel check. Log the options for
easier debugging.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-14 17:32:24 -07:00
Michał Mirosław
af0e413e03 Fix dumping hugetlb-based memfd on kernels < 4.16.
4.15-based kernels don't allow F_*SEAL for memfds created with MFD_HUGETLB.
Since seals are not possible in this case, fake F_GETSEALS result as if it
was queried for a non-sealing-enabled memfd.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-06-14 17:32:24 -07:00
Valeriy Vdovin
4d137b81a0 cgroup/restore: split prepare_task_cgroup code into two separate functions
This does cgroup namespace creation separately from joining task
cgroups. This makes the code more logical, because creating cgroup
namespace also involves joining cgroups but these cgroups can be
different to task's cgroups as they are cgroup namespace roots
(cgns_prefix), and mixing all of them together may lead to
misunderstanding.

Another positive thing is that we consolidate !item->parent checks in
one place in restore_task_with_children.

Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-06-11 23:32:59 -07:00
Radostin Stoyanov
104a82856f action-scripts: Add pre-stream hook
This hook allows starting the image streamer process from an action script.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-06-11 23:32:59 -07:00
Thomas Gleixner
358f09cf48 timers: improve and fix posix timer id sequence checks
This is a patch proposed by Thomas here:
https://lore.kernel.org/all/87ilczc7d9.ffs@tglx/

It removes (created id > desired id) "sanity" check and adds proper
checking that ids start at zero and increment by one each time when we
create/delete a posix timer.

The first purpose is to fix infinite looping in create_posix_timers on
old pre-3.11 kernels.

The second purpose is to allow the kernel interface for creating posix
timers with a desired id to change from iterating with a predictable next
id to setting the next id directly, and at the same time to remove the
predictable next id, so that criu with this patch would not get into an
infinite loop in create_posix_timers if this happens.

Thanks a lot to Thomas!

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-06-11 23:32:59 -07:00
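
The sequence check described in 358f09cf48 can be modelled in Python. `FakeTimerKernel` is a made-up stand-in for a kernel that hands out sequential timer ids, and `create_timers_with_ids` is a hypothetical helper, not CRIU's create_posix_timers. The sketch shows the idea of verifying that each created id is exactly the next expected one and failing otherwise, so the loop cannot spin forever.

```python
class FakeTimerKernel:
    """Stand-in for a kernel that hands out sequential posix timer ids."""

    def __init__(self):
        self.next_id = 0
        self.live = set()

    def timer_create(self):
        timer_id = self.next_id
        self.next_id += 1
        self.live.add(timer_id)
        return timer_id

    def timer_delete(self, timer_id):
        self.live.remove(timer_id)


def create_timers_with_ids(kernel, desired_ids):
    """Create timers with the given distinct, non-negative ids."""
    expected = 0
    for desired in sorted(set(desired_ids)):
        while True:
            got = kernel.timer_create()
            if got != expected:
                # Ids must start at zero and grow by one per creation.
                raise RuntimeError("unexpected timer id %d, expected %d" % (got, expected))
            expected += 1
            if got == desired:
                break
            # Not the id we want yet: drop it and try the next one.
            kernel.timer_delete(got)
    return kernel.live


print(sorted(create_timers_with_ids(FakeTimerKernel(), [1, 4])))
```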
Dhanuka Warusadura
9130fefa4d criu-ns: Update shebang line to python
CentOS 7 CI environment uses Python 2. To execute criu-ns
script in CentOS 7 changing the current shebang line to
python is required.

This reverts the changes made in a15a63fce0ad4d1a9119771577fa7ef562bbfd6b

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-06-11 23:32:59 -07:00
Dhanuka Warusadura
f0e9358590 criu-ns: Install Python pathlib module in CentOS 7
These changes fix the `ImportError: No module named pathlib`
error when executing criu-ns tests located at criu/test/others/criu-ns

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-06-11 23:32:59 -07:00
Dhanuka Warusadura
8094df8ddb criu-ns: Add tests for criu-ns script
These changes add test implementations for criu-ns script.

Fixes: #1909

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-06-11 23:32:59 -07:00
Dhanuka Warusadura
38db5e1f2f criu-ns: Add support for older Python version in CI
These changes remove and update the changes introduced in
7177938e60b81752a44a8116b3e7e399c24c4fcb in favor of the
Python version in CI.

os.waitstatus_to_exitcode() function appeared in Python 3.9

Related to: #1909

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-06-11 23:32:59 -07:00
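
Since `os.waitstatus_to_exitcode()` only exists from Python 3.9 onwards, code that must also run on older interpreters needs an equivalent built from the long-standing `os.WIF*` helpers. The following is a sketch of such a fallback, not the code from 38db5e1f2f; the function name `exit_code_from_wait_status` is hypothetical.

```python
import os


def exit_code_from_wait_status(status):
    """Convert a wait status to an exit code on old and new Pythons."""
    if hasattr(os, "waitstatus_to_exitcode"):
        return os.waitstatus_to_exitcode(status)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # Mirror the 3.9+ convention: negative signal number.
        return -os.WTERMSIG(status)
    raise ValueError("invalid wait status: %r" % status)


pid = os.fork()
if pid == 0:
    os._exit(3)
_, status = os.waitpid(pid, 0)
print(exit_code_from_wait_status(status))
```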
Dhanuka Warusadura
f308272062 criu-ns: Add --criu-binary argument to run_criu()
--criu-binary argument provides a way to supply the CRIU binary
location to run_criu().

Related to: #1909

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-06-11 23:32:58 -07:00
Adrian Reber
f57bda46a3 lib/c: add empty_ns interfaces to libcriu
crun wants to set empty_ns and this interface is missing from the
library. This adds it to libcriu.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-06-11 23:32:58 -07:00
Radostin Stoyanov
806ee35015 docs: rename amdgpu_plugin.txt to criu-amdgpu-plugin.txt
By default, the file name 'amdgpu_plugin.txt' is used also as the name
for the corresponding man page (`man amdgpu_plugin`). However, when
this man page is installed system-wide it would be more appropriate
to have a prefix 'criu-' (e.g., `man criu-amdgpu-plugin`).

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-06-11 23:32:58 -07:00