2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 21:38:16 +00:00

11541 Commits

Author SHA1 Message Date
Michał Mirosław
21ce76263b Restore THP_DISABLE prctl.
The original commit added saving THP_DISABLED flag value, but missed
restoring it. There is restoring code, but used only when --lazy_pages
mode is enabled. Restore the prctl flag always. While at it, rename the
`has_thp_enabled` -> `!thp_disabled` for consistency.

Fixes: bbbd597b4124 (2017-06-28 "mem: add dump state of THP_DISABLED prctl")
Signed-off-by: Michał Mirosław <emmir@google.com>
2023-10-22 13:29:25 -07:00
Michał Mirosław
9943dcde17 Fix mount(cgroup2) for older kernels.
Linux 4.15 doesn't like empty string for cgroup2 mount options.
Pass NULL then to satisfy the kernel check. Log the options for
easier debugging.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-10-22 13:29:25 -07:00
Michał Mirosław
0218b1e8f2 Fix dumping hugetlb-based memfd on kernels < 4.16.
4.15-based kernels don't allow F_*SEAL for memfds created with MFD_HUGETLB.
Since seals are not possible in this case, fake F_GETSEALS result as if it
was queried for a non-sealing-enabled memfd.

Signed-off-by: Michał Mirosław <emmir@google.com>
2023-10-22 13:29:25 -07:00
Valeriy Vdovin
a638043a7f cgroup/restore: split prepare_task_cgroup code into two separate functions
This does cgroup namespace creation separately from joining task
cgroups. This makes the code more logical, because creating cgroup
namespace also involves joining cgroups but these cgroups can be
different to task's cgroups as they are cgroup namespace roots
(cgns_prefix), and mixing all of them together may lead to
misunderstanding.

Another positive thing is that we consolidate !item->parent checks in
one place in restore_task_with_children.

Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-10-22 13:29:25 -07:00
Radostin Stoyanov
2ac15e3ad0 action-scripts: Add pre-stream hook
This hook allows to start image streamer process from an action script.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-10-22 13:29:25 -07:00
Pavel Tikhomirov
c6ac396aa3 timers: improve and fix posix timer id sequence checks
This is a patch proposed by Thomas here:
https://lore.kernel.org/all/87ilczc7d9.ffs@tglx/

It removes (created id > desired id) "sanity" check and adds proper
checking that ids start at zero and increment by one each time when we
create/delete a posix timer.

First purpose of it is to fix infinite looping in create_posix_timers on
old pre 3.11 kernels.

Second purpose is to allow kernel interface of creating posix timers
with desired id change from iterating with predictable next id to just
setting next id directly. And at the same time removing predictable next
id so that criu with this patch would not get to infinite loop in
create_posix_timers if this happens.

Thanks a lot to Thomas!

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-10-22 13:29:25 -07:00
Dhanuka Warusadura
fc08fa9077 criu-ns: Update shebang line to python
CentOS 7 CI environment uses Python 2. To execute criu-ns
script in CentOS 7 changing the current shebang line to
python is required.

This reverse the changes made in a15a63fce0ad4d1a9119771577fa7ef562bbfd6b

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-10-22 13:29:25 -07:00
Dhanuka Warusadura
9cd09f5860 criu-ns: Install Python pathlib module in CentOS 7
These changes fix the `ImportError: No module named pathlib`
error when executing criu-ns tests located at criu/test/others/criu-ns

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-10-22 13:29:25 -07:00
Dhanuka Warusadura
9c9e8ea3f2 criu-ns: Add tests for criu-ns script
These changes add test implementations for criu-ns script.

Fixes: #1909

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-10-22 13:29:25 -07:00
Dhanuka Warusadura
e4b6fb2d1f criu-ns: Add support for older Python version in CI
These changes remove and update the changes introduced in
7177938e60b81752a44a8116b3e7e399c24c4fcb in favor of the
Python version in CI.

os.waitstatus_to_exitcode() function appeared in Python 3.9

Related to: #1909

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-10-22 13:29:25 -07:00
Dhanuka Warusadura
733f165512 criu-ns: Add --criu-binary argument to run_criu()
--criu-binary argument provides a way to supply the CRIU binary
location to run_criu().

Related to: #1909

Signed-off-by: Dhanuka Warusadura <csx@tuta.io>
2023-10-22 13:29:25 -07:00
Adrian Reber
36709536e5 lib/c: add empty_ns interfaces to libcriu
crun wants to set empty_ns and this interface is missing from the
library. This adds it to libcriu.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-10-22 13:29:25 -07:00
Radostin Stoyanov
b665dce3c7 docs: rename amdgpu_plugin.txt to criu-amdgpu-plugin.txt
By default, the file name 'amdgpu_plugin.txt' is used also as the name
for the corresponding man page (`man amdgpu_plugin`). However, when
this man page is installed system-wide it would be more appropriate
to have a prefix 'criu-' (e.g., `man criu-amdgpu-plugin`).

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-10-22 13:29:25 -07:00
Pavel Tikhomirov
cc607f8103 criu-ns: make --pidfile option show pid in caller pidns
Using the fact that we know criu_pid and criu is a parent of restored
process we can create pidfile with pid on caller pidns level.

We need to move mount namespace creation to child so that criu-ns can
see caller pidns proc.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-10-22 13:29:25 -07:00
Adrian Reber
50e17a1cf3 scripts: make newer versions of shellcheck happy
Signed-off-by: Adrian Reber <areber@redhat.com>
2023-10-22 13:29:25 -07:00
Adrian Reber
df7b897a22 ci: fix new codespell errors
Signed-off-by: Adrian Reber <areber@redhat.com>
2023-10-22 13:29:25 -07:00
Adrian Reber
727d796505 compel: support XSAVE on newer Intel CPUs
Newer Intel CPUs (Sapphire Rapids) have a much larger xsave area than
before. Looking at older CPUs I see 2440 bytes.

    # cpuid -1 -l 0xd -s 0
    ...
        bytes required by XSAVE/XRSTOR area     = 0x00000988 (2440)

On newer CPUs (Sapphire Rapids) it grows to 11008 bytes.

    # cpuid -1 -l 0xd -s 0
    ...
        bytes required by XSAVE/XRSTOR area     = 0x00002b00 (11008)

This increase the xsave area from one page to four pages.

Without this patch the fpu03 test fails, with this patch it works again.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-10-22 13:29:25 -07:00
hdzhoujie
fa6af25e75 dump: increase fcntl call failure judgment
The pipe_size type is unsigned int, when the fcntl call fails and
return -1, it will cause a negative rollover problem.

Signed-off-by: zhoujie <zhoujie133@huawei.com>
2023-10-22 13:29:25 -07:00
Suraj Shirvankar
1c0f8787b2 zdtm: Add tests for ip tos restore
Signed-off-by: Suraj Shirvankar <surajshirvankar@gmail.com>
2023-10-22 13:29:25 -07:00
Suraj Shirvankar
04cdbd6106 sk-inet: Add IP TOS socket option
The TOS(type of service) field in the ip header allows you specify the
priority of the socket data.

Signed-off-by: Suraj Shirvankar <surajshirvankar@gmail.com>
2023-10-22 13:29:25 -07:00
Andrei Vagin
4c1a2ac41b criu: Version 3.18 (Silver Sandpiper)
The highlight feature of this release is the ability to use CRIU for
non-root users. Adrian Reber implemented the kernel part and created the
initial version of CRIU changes. Then Younes Manton joined the effort
and pushed it to the finish line.

The full change log is here: https://criu.org/Download/criu/3.18

Signed-off-by: Andrei Vagin <avagin@gmail.com>
v3.18
2023-04-22 20:47:09 -07:00
Pavel Tikhomirov
b689bcc354 cr-check: remove excess kerndat_has_nspid from check_ns_pid
We do kerndat_has_nspid in kerndat_init already and save result to
kerndat cache, we don't need to recheck it each time.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Michal Clapinski
9b3496043d log: fix timestamp logging when tv_sec>=100
Previously when tv_sec>=100, the line would look like this:
(269.189615 Error [...]
Now the last char is overwritten with ')'.

Signed-off-by: Michal Clapinski <mclapinski@google.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
94ac9ee3cc proc_parse: fix while condition in parse_pid_status
In parse_pid_status there are 13 places where we do done++, so when
"done" is 13 it means that we have matched each of those 13 places and
we are ready to stop. In next lines we are not going to find anything.

So the right condition for the while loop is (done < 13).

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
hdzhoujie
45e4a6b274 netlink: fix netlink fd flags dump/restore failed
During the restore process, netlink fd uses the flags in the
NetlinkSkEntry structure to restore the file state, but during
the dump process, the flags values is not saved to the structure.

Signed-off-by: zhoujie <zhoujie133@huawei.com>
Signed-off-by: hejingxian <hejingxian@huawei.com>
2023-04-15 21:17:21 -07:00
Michal Clapinski
6c728df1dc zdtm: modify rseq01 to include a thread
Testing only the thread group leader is not enough and can hide bugs.

Signed-off-by: Michal Clapinski <mclapinski@google.com>
2023-04-15 21:17:21 -07:00
Michal Clapinski
f8da250bb3 cr-dump: properly apply rseq fixup for all threads
Previously fixup was done before threads' registers were dumped so it
didn't actually work. This commit splits rseq fixup into thread leader
fixup and other threads fixup and applies them after the entities are
seized.

Signed-off-by: Michal Clapinski <mclapinski@google.com>
2023-04-15 21:17:21 -07:00
Michal Clapinski
78c4e2c0f7 cr-dump: move rseq functions before dump_task_thread
Signed-off-by: Michal Clapinski <mclapinski@google.com>
2023-04-15 21:17:21 -07:00
Michal Clapinski
85e46c44d6 dump: extend parasite_thread_ctl lifetime to dump_task_thread
Signed-off-by: Michal Clapinski <mclapinski@google.com>
2023-04-15 21:17:21 -07:00
Michal Clapinski
9683097f27 zdtm: don't ignore rseq_cs mismatch in rseq01 test
Kernel shouldn't clean up rseq_cs inside a critical section.
If rseq_cs has been cleaned up, it means there is a bug in migration.

Signed-off-by: Michal Clapinski <mclapinski@google.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
0c52399322 ci: cancel preceding workflows run
This patch adds concurrency groups to the CI workflows to automatically
cancel any in-progress workflows when a pull request has been updated.
A `concurrency` group allows to ensure that a single job or workflow
will run at a time. For example, when a pull request is updated with
a force-push, the GiHub CI workflows currently in-progress will be
automatically cancelled, and the CI would run only with the updated
commits.

https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
5c8cdceec2 sk-unix: rework unix_resolve_name
- use exit_code instead of returning ret
- replace -errno return with -1
- move fallback to if (!kdat.sk_unix_file)
- fix readlinkat error checking (ret < 0 && ret >= PATH_MAX) by using
  read_fd_link helper

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
de39bd2bd1 sk-unix: simplify error handling in unix_resolve_name_old
As we now don't have any calls to free in this function we can replace
all lables with explicit returns.

While on it: Replace useless -errno and 1 returns with -1 as from the
very first implementation of unix_resolve_name (it changed name to _old
later) in [1] any non-zero return was treated as error.

6d785e6cd ("unix: resolve a socket file when a socket descriptor is
available") [1]

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
d93409cf11 sk-unix: remove bogus xfree from unix_resolve_name_old
It is strange to free a pointer which is already in unix_sk_desc, either
on error path or on skip as we leave freed pointer in desc and it can
probably be used after free later and lead to some corruption. So I
would prefer not to free it as we don't have full controll over it here.

Fixes: 6d785e6cd ("unix: resolve a socket file when a socket descriptor is available")

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Yuriy Vasiliev
ccc790d540 zdtm/lib: fix cwd path freeing
Fix cwd freeing on error path in get_cwd_check_perm and
on non-error-path in unix_fill_sock_name.

v2: use cleanup_free attribute in unix_fill_sock_name

Signed-off-by: Yuriy Vasiliev <yuriy.vasiliev@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Cyrill Gorcunov
8e6fa9c3b9 net: Add net log prefix
For better logging.

Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
7f0f07599a crit: fix compatibility with Python 3.12
Python 3.12 includes a few breaking changes, such as the removal of the
distutils module [1] and the deprecation of `setup.py install` in
favour of pip install [2]. This patch updates the installation script
for crit to reflect these changes by replacing the use of
`setup.py install` with `pip install` and `distutils` with
`setuptools`. In addition, a minimal pyproject.toml file has
been added as it is required by the new version of pip [3].

It is worth noting that with this change we are switching from the egg
packaging format to wheel [4] and add pip as a build dependency.

[1] https://www.python.org/downloads/release/python-3120a2/
[2] https://github.com/pypa/setuptools/pull/2824
[3] https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/
[4] https://packaging.python.org/en/latest/discussions/wheel-vs-egg/

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
bd0f209c2b pstree: improve id intersection detection in prepare_pstree_for_shell_job
First, let's move lookup_create_item-s to the end so that on pgid
replacement we don't have false positive pstree_pid_by_virt check
founding item created by sid replacement. (note: we need those
lookup_create_item-s for the sake of free pid selection mechanism)

Second, let's add checks for sid/pgid in images intersecting with
current_sid/pgid, as this would also bring problems on restore.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
66fd45d51d sk-unix: add some missed error printing
In Virtuozzo tests we have seen uninformative errors:

(26.575039) 151187 fdinfo 6: pos: 0 flags: 2/0
(26.575076) sockets: Searching for socket 0x346d1 family 1
(666.230281 ----------------------------------------
(666.230586 Error (criu/cr-dump.c:1850): Dump files (pid: 151187) failed
with -1

So let's add some error messages to this stack.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
a0158e6927 zdtm: add MNTNS_ZDTM macro to fix initialization
With this macro we can easily declare struct mntns_zdtm variables with
all lists properly initiallized. Let's use it in mount_complex_sharing
as without it we can have segfault on error path when accessing
uninitialized list pointers.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
12423abdb5 mount: allow bindmounts for external fuse mounts
Currently we only allow external fuse mount itself, let's allow
bindmount for it too. Other mount code is ready for this change and will
be able to bindmount it from corresponding external mount.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
65407616e0 ci/archlinux: initialize machine ID
When installing packages within Archlinux container, pacman fails with
the following errors:

(3/7) Creating temporary files...
/usr/lib/tmpfiles.d/journal-nocow.conf:26: Failed to replace specifiers in '/var/log/journal/%m': No such file or directory
/usr/lib/tmpfiles.d/systemd.conf:23: Failed to replace specifiers in '/run/log/journal/%m': No such file or directory
/usr/lib/tmpfiles.d/systemd.conf:25: Failed to replace specifiers in '/run/log/journal/%m': No such file or directory
/usr/lib/tmpfiles.d/systemd.conf:26: Failed to replace specifiers in '/run/log/journal/%m/*.journal*': No such file or directory
/usr/lib/tmpfiles.d/systemd.conf:29: Failed to replace specifiers in '/var/log/journal/%m': No such file or directory
/usr/lib/tmpfiles.d/systemd.conf:30: Failed to replace specifiers in '/var/log/journal/%m/system.journal': No such file or directory
/usr/lib/tmpfiles.d/systemd.conf:32: Failed to replace specifiers in '/var/log/journal/%m': No such file or directory
/usr/lib/tmpfiles.d/systemd.conf:33: Failed to replace specifiers in '/var/log/journal/%m/system.journal': No such file or directory

To solve this problem we need to initialize the machine ID.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
KKrypt
34e2b02219 Optimized shell code with <'s (instead of cat + |)
This patch optimizes shell code as reading a single file as input using a 'cat' command to a program.

It is considered to be a Useless Use of Cat (UUOC).

It's more efficient to simply use redirection.

However, in some cases, even using the redirection operator '<' seems unnecessary.

Signed-off-by: KKrypt <sankalpacharya1211@gmail.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
7cae16e971 mount: do collect_mntinfo of external mount namespace with no for_dump
When we collect external mount namespace we don't want to dump mounts in
it, so lets remove this flag. This way we can e.g. use for_dump in
->parse() callbacks to separate in-container mounts from others.

This only affects rare case of `--ext-mount-map auto` but to be
absolutely correct let's fix it anyway.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
69befdde18 cgroup-v2: make new field cg_set optional
The new field cg_set is currently marked as required which causes backward
compatibility problem when using newer CRIU version to restore dumped image
from older version. This commit makes this field optional and reworks the
logic to fallback to use cg_set from task_core when it is not in
thread_core.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
529f298913 cgroup-v2: make new field is_threaded optional
The new field is_threaded is currently marked as required which causes
backward compatibility problem when using newer CRIU version to restore
dumped image from older version. This commit makes this field optional and
reworks the logic the skip fixing up threaded cgroup controllers if there
is no information in dumped image.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Alexander Mikhalitsyn
6e681afb69 net: fail restore if nftables isn't supported but image is present
Fixes: e1c4871 ("net: add nftables c/r")
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
156c8da33c make: disable '-Wdangling-pointer' warning with gcc 12
The patch is similar to what has been done in linux kernel, as this
warning effectively prevents us from adding list elements to local list
head. See https://github.com/torvalds/linux/commit/49beadbd47c2

Else we have:

    CC       criu/mount.o
  In file included from criu/include/cr_options.h:7,
                   from criu/mount.c:13:
  In function '__list_add',
      inlined from 'list_add' at include/common/list.h:41:2,
      inlined from 'mnt_tree_for_each' at criu/mount.c:1977:2:
  include/common/list.h:35:19: error: storing the address of local variable 'postpone' in
    '((struct list_head *)((char *)start + 8))[24].prev' [-Werror=dangling-pointer=]
     35 |         new->prev = prev;
        |         ~~~~~~~~~~^~~~~~
  criu/mount.c: In function 'mnt_tree_for_each':
  criu/mount.c:1972:19: note: 'postpone' declared here
   1972 |         LIST_HEAD(postpone);
        |                   ^~~~~~~~

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Dmitry Safonov
4930c98020 x86/xsave: Set only used XFEATURE_* in xstate_bv
Setting all supported by CPU features in xstate_bv may bring it into
dirty-upper-state as documented in specs, resulting in lower
performance. Let's not do this and set only those have been used by
dumpee.

P.S.
Off course it has to be a one-liner!

Fixes: #1171

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
3f8e3220ba CONTRIBUTING.md: document make lint / indent
This patch documents how do we use `make lint` and `make indent` and
adds a note about their integration with CI.

Co-authored-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00