2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 05:48:05 +00:00

11541 Commits

Author SHA1 Message Date
Pavel Tikhomirov
abfe0b5d24 clang-format: add for_each_bit macros to ForEachMacros
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
c8b4fb9ba5 autofs: fix a frankenstein auto-created by clang-format
Fixes: 93dd984ca ("Run 'make indent' on all C files")

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Cyrill Gorcunov
aab709b602 log: Write more details in write_pidfile
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Michal Clapinski
7c6eb0b85c asm: fix for_each_bit macro
find_next_bit operates on a bit instead of byte positions/sizes.

Signed-off-by: Michal Clapinski <mclapinski@google.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
bb3f7bef66 crtools: fix help message alignment for --network-lock
Fixes: 2e30db5c3 ("criu: add --network-lock option to allow nftables alternative")

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
a302b36940 zdtm: fix 'zdtm.py list' command
The command ./zdtm.py list currently fails with

    if opts['rootless']:
       ~~~~^^^^^^^^^^^^
    KeyError: 'rootless'

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Andrei Vagin
21f5be91a9 cgroups: ignore EOPNOTSUPP on setting memory.kmem.limit_in_byte
memory.kmem.limit_in_bytes has been deprecated. Look at e7c4184164f7
("memcg, kmem: further deprecate kmem.limit_in_bytes") for more details.

Signed-off-by: Andrei Vagin <avagin@google.com>
2023-04-15 21:17:21 -07:00
Andrei Vagin
9686693aa6 test/javaTests: update org.testng:testng (Maven)
TestNG is vulnerable to Path Traversal

Fixes https://github.com/checkpoint-restore/criu/security/dependabot/1.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Andrei Vagin
5c60d35be4 sockets: tiny style fix
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-04-15 21:17:21 -07:00
Younes Manton
5a19c34322 non-root: Don't dump socket option SO_MARK if 0
Restoring SO_MARK requires root or CAP_NET_ADMIN. If the value
is 0 we will avoid dumping it so that we don't need to do a
privileged call on restore.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
2180e03b90 non-root: Rework socket bufs for unprivileged mode
SO_SNDBUFFORCE/SO_RCVBUFFORCE require root or CAP_NET_ADMIN.
We can use SO_SNDBUF/SO_RCVBUF in some cases and avoid
needing elevated privileges.

This patch renames sk_setbufs() to sk_setbufs_ns() and
makes sk_setbufs() a general helper that sets socket
send and receive buffer sizes. The helper tries to use
SO_SNDBUFFORCE/SO_RCVBUFFORCE first and falls back to
SO_SNDBUF/SO_RCVBUF if we're in unprivileged mode.

The existing sk_setbufs_ns() which takes a pid parameter
and is intended to be called via userns_call() is rewritten
to call sk_setbufs().

Existing code that sets buffer sizes via setsockopt() is
modified to call sk_setbufs() instead.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Shubham Verma
e5ccfbb240 Fix typo in comment
Signed-off-by: Shubham Verma <shubhamv.sv@gmail.com>
2023-04-15 21:17:21 -07:00
Liang-Chun Chen
bdbccc315a zdtm: add two tests for highly sparse ghost file
ghost_multi_hole00 and ghost_multi_hole01 are tests which create a ghost file
with a lot of holes, there are 4K data and 4K hole inside every 8K length.

The only difference between them is ghost-fiemap option, 01 is a
test for the fiemap dumping algorithm, and we want to test the
behavior of EXTENT_MAX_COUNT part, so the file size should be 8M, thus there
will be 1024 chunks in the ghost file.

In some file system, such as xfs, we somehow can not easily create highly sparse
file as in ext4 or btrfs, therefore we need `fallocate` to forcibly create holes.

Signed-off-by: Liang-Chun Chen <featherclc@gmail.com>
2023-04-15 21:17:21 -07:00
Liang-Chun Chen
f3fdce81a6 files-reg.c: fiemap algorithm for ghost file
In order to reduce the frequency of using system call, based on
https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/tree/misc/create_inode.c#n519,
I created a new algorithm of dumping chunk via fiemap.(copy_file_to_chunks_fiemap)

Also, I added another BOOL_OPT for users to determine which algorithm they
want to use. Moreover, for those filesystem not supporting fiemap, criu
will fall back to the original algorithm(SEEK_HOLE/SEEK_DATA).

v2: don't call copy_chunk_from_file on outstanding extent; rearange
headers to workaround "redeclaration of ‘enum fsconfig_command’" problem

Signed-off-by: Liang-Chun Chen <featherclc@gmail.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
14b9ec195f ci: fix make indent
This patch fixes applies the changes required by clang-format v15.0.5
for `make indent`.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
d0c64b7b34 ci/alpine: remove symlink for /usr/bin/python
The python3 package in Alpine has recently been updated to install
symbolic link for /usr/bin/python.

https://git.alpinelinux.org/aports/commit/main/python3?id=d91da210b1614eb75517d59b7f348fee01699f35

This causes the following error in CI:

  Step 10/11 : RUN ln -s /usr/bin/python3 /usr/bin/python
   ---> Running in a5a94be9dc93
  ln: failed to create symbolic link '/usr/bin/python': File exists
  The command '/bin/sh -c ln -s /usr/bin/python3 /usr/bin/python' returned a non-zero code: 1

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
095b3e84b7 ci/lint: install ShellCheck with dnf
The way ShellCheck is installed was changed in commit c056f99
(ci/gha/lint: install a recent shellcheck) to use the latest version
v0.8.0 and remove some of the "shellcheck disable=..." annotations.
Since then, Fedora 37 has been released and the ShellCheck package
has been updated to v0.8.0.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Drew Wock
c48b5290dc Fix warnings from -Wstrict-prototypes in clang 16.0.0
While building on a machine that has a HOL clang compiler,
I ran into warnings regarding the changed line.  It appears
this warning is on by default because of anticipated changes
to the C standard.

Signed-off-by: Drew Wock <ajwock@gmail.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
fa2c585c22 amdgpu: define __nmk_dir if missing
This patch adds a missing definition for `__nmk_dir` in the Makefile
for the amdgpu plugin. This definition is required, for example, when
building the `test_topology_remap` target:

	make -C plugins/amdgpu/ test_topology_remap

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Mathias Gibbens
c7211f52db Remove execute bit from source file
Signed-off-by: Mathias Gibbens <mathias@calenhad.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
f5e0f641a8 cgroup: Remove redundant code that handles zombie tasks
Zombie tasks are dumped in dump_zombies() so it is redundant to handle them
in dump_one_task().

Deprecate cg_set in task_core_entry as this field must be per thread now.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
c1ae880eb4 kerndat: Mark memfd_create(MFD_HUGETLB) unavailable when ENOSYS is returned
Some users on Raspberry Pi report that the kerndat checking for
memfd_create(MFD_HUGETLB) support returns ENOSYS even when memfd_create
syscall is available. We currently treat this error as unexpected and
return error. This commit marks the memfd_create(MFD_HUGETLB) as
unavailable when ENOSYS is returned.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
153614cb1d ci: move cgroup unmounting to run-ci-tests.sh
A previous commit added a cgroup cpuset unmounting to
scripts/ci/Makefile. We are sometimes running in a container without the
necessary privileges to unmount certain cgroups.

This commit moves the cgroup unmounting to a place in run-ci-tests.sh
which already requires privileged access and does not break unprivileged
build-only CI runs.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
516ebc4f58 ci: Do not fail if latest epel repository definition is already installed
Signed-off-by: Adrian Reber <areber@redhat.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
2ebce92333 ci: Make cpuset move to cgroup-v2 hierarchy
As cgroupv2_00, cgroupv2_01 need cpuset in cgroup-v2 hierarchy to check CRIU
handle cgroup-v2 properly, umount cpuset in cgroup-v1 to make it move to
cgroup-v2.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
07d538cefc zdtm: Check threads are restored into correct threaded controllers
This test creates a process with 2 threads in different threaded controllers and
check if CRIU restores these threads' cgroup controllers properly.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
20ea8a0647 cgroup-v2: Restore threads in a process into correct threaded controllers
As threads in a process may be in different threaded controllers, we need to
move thoses threads to the correct controllers.

Because the threads of a process are restored in later stage in restorer.c, we
need to create a cgroupd service to help to move those threads into correct
controllers when they are restored. We cannot use usernsd as the code in
restorer does not know the address of outside function to pass to userns_call.
However, this cgroupd service still reuses a lot of code from usernsd.

The main logic is that restored threads receive the cg_set number they belong to
before restorer stage in case their cg_set are different from main thread. When
these threads are restored, they send the cg_set number and their thread ids
through unix socket to cgroupd. cgroupd receives the cg_set number and thread
ids and moves those threads into correct controllers. Thread ids are sent
through SCM_CREDENTIALS of unix socket so they are translated into correct
thread ids in the receiving end.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
17d1d8810e cgroup-v2: Dump cgroup controllers of every threads in a process
Currently, we assume all threads in process are in the same cgroup controllers.
However, with threaded controllers, threads in a process may be in different
controllers. So we need to dump cgroup controllers of every threads in process
and fixup the procfs cgroup parsing to parse from self/task/<tid>/cgroup.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
ad3936e81e zdtm: Add test to check global properties of cgroup-v2 are preserved
Check that CRIU can checkpoint/restore global properties in cgroup-v2 properly.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
d7e8746598 zdtm: Add write_value/read_value helpers into zdtm library
Add write_value/read_value helpers to write/read buffer to/from files into zdmt
library.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
167cfd366e cgroup-v2: Checkpoint and restore some global properties
This commit supports checkpoint/restore some new global properties in cgroup-v2

	cgroup.subtree_control
	cgroup.max.descendants
	cgroup.max.depth
	cgroup.freeze
	cgroup.type

Only cgroup.subtree_control, cgroup.type need some more code to handle.
cgroup.subtree_control value needs to be set with "+", "-" prefix and
cgroup.type can only be written with value "threaded" if we want to make this
controller threaded. cgroup.type is a special property because this property
must be restored before any processes can move into this controller.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
8a336ab226 Switch aarch64 builds to Cirrus CI
It seems like drone.io no longer provides free aarch64/armhf CI runs.

This switches the aarch64 CI runs to Cirrus CI. armhf CI runs have been
dropped for now as they are not directly supported.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-04-15 21:17:21 -07:00
Bui Quang Minh
840735aa08 ipc_sysctl: Prioritize restoring IPC variables using non usernsd approach
Since commit https://github.com/torvalds/linux/commit/5563cabdde, user with
enough capability can open IPC sysctl files and write to them. Therefore, we
don't need to use usernsd process in the outside user namespace to help with
that anymore. Furthermore, some later commits:
https://github.com/torvalds/linux/commit/1f5c135ee5,
https://github.com/torvalds/linux/commit/0889f44e28 bind the IPC namespace to
the opened file descriptor of IPC sysctl at the open() time, the changed value
does not depend on the IPC namespace of write() time anymore. This breaks the
current usernsd approach.

So, we prioritize opening/writing IPC sysctl files in the context of restored
process directly without usernsd help. This approach succeeds in the newer
kernel since the restored process has enough capabilities at this restore stage.
With older kernel, the open() fails and we fallback to the usernsd approach.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
3db8d1a6c6 cgroup: add a comment to restore_cgroup_prop about path argument requirements
In Virtuozzo we've faced out-of-bound access when calling this function
on short path string, which corrupted other memory and lead to
segmentation fault. So it may be useful to have this comment in code to
avoid such a missuse of this function in future.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
1cba559da4 non-root: add non-root test case to cirrus runs
Run env00 and pthread00 test as non-root as initial proof of concept.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
6743d608cf non-root: extend zdtm.py to be able to run tests as non-root
These are the minimal changes to make zdtm.py successfully run the
env00 and pthread test case as non-root using the '--rootless' zdtm option.

Co-authored-by: Younes Manton <ymanton@ca.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
251939992a Documentation: add details about --unprivileged
This adds the non-root section and information about the parameter
--unprivileged to the man page.

Co-authored-by: Anna Singleton <annabeths111@gmail.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Anna Singleton <annabeths111@gmail.com>
2023-04-15 21:17:21 -07:00
Younes Manton
47b07d0110 non-root: Introduce unprivileged mode to kerndat
This patch modifies how kerndat is handled in unprivileged mode.

Initialization and functionality that can only be done as root is
made separate from common code. The kerndat file's location is
defined as $XDG_RUNTIME_DIR/criu.kdat in unprivileged mode. Since
we expect that directory to be on tmpfs we maintain the same behavior
as the root-mode kerndat which lives in /run.

Co-authored-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
6a30c7d1ed non-root: enable non-root checkpoint/restore
This commit enables checkpointing and restoring of applications as
non-root.

First goal was to enable checkpoint and restore of the env00 and
pthread00 test case.

This uses the information from opts.unprivileged and opts.cap_eff to
skip certain code paths which do not work as non-root.

Co-authored-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
ce01f70d94 non-root: add functions to work with capabilities
This adds the function check_caps() which checks if CRIU is running
with at least CAP_CHECKPOINT_RESTORE. That is the minimum capability
CRIU needs to do a minimal checkpoint and restore from it.

In addition helper functions are added to easily query for other
capability for enhanced checkpoint/restore support.

Co-authored-by: Younes Manton <ymanton@ca.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Adrian Reber
4b4bf0421b non-root: add infrastructure to run as non-root
The idea behind the rootless CRIU code is, that CRIU reads out its
effective capabilities and stores that in the global opts structure.

Different parts of CRIU can then, based on the existing capabilities,
automatically enable or disable certain code paths.

Currently at least CAP_CHECKPOINT_RESTORE is required. CRIU will not
start without this capability.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
4c7f91afff ci: enable EPEL for CentOS 7
python2-future, python2-junit_xml, python-flake8 and libbsd-devel are
now provided from EPEL.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Younes Manton
a39d416568 compel: Fix ppc64le parasite stack layout
The ppc64le ABI allows functions to store data in caller frames.
When initializing the stack pointer prior to executing parasite code
we need to pre-allocating the minimum sized stack frame before
jumping to the parasite code.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
17ec539132 compel: Add test to check parasite stack setup
Some ABIs allow functions to store data in caller frame, which
means that we have to allocate an initial stack frame before
executing code on the parasite stack.

This test saves the contents of writable memory that follows the stack
after the victim has been infected but before we start using the
parasite stack. It later checks that the saved data matches the
current contents of the two memory areas. This is done while the
victim is halted so we expect a match unless executing parasite code
caused memory corruption. The test doesn't detect cases where we
corrupted memory by writing the same value.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
556ab0deaf compel: Fix infect test to not override failures
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>

return zero on chk success

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

Co-authored-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Younes Manton
461fa72715 compel: Add APIs to facilitate testing
Starting the daemon is the first time we run code in the victim
using the parasite stack.

It's useful for testing to be able to infect the victim without starting
the daemon so that we can inspect the victim's state, set up stack
guards, and so on before stack-related corruption can happen.

Add compel_infect_no_daemon() to infect the victim but not start the
daemon and compel_start_daemon() to start the daemon after the victim
is infected.

Add compel_get_stack() to get the victim's main and thread parasite
stacks.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Liu Hua
debc9c16cc seize: do not overwrite exit code from failpath
Signed-off-by: Liu Hua <weldonliu@tencent.com>
2023-04-15 21:17:21 -07:00
Kir Kolyshkin
16f1c147c8 test/others/crit/test.sh: use bash array
In fact an array (aptly named array) is already used in run_test2,
so let's just make it an array right from the start.

While at it, remove ls invocation.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00
Kir Kolyshkin
0a872ccf16 scripts/protobuf-gen.sh: fix (not ignore) shellcheck warnings
This basically replaces

	for x in $(sed ...); do

with

	sed ... | while IFS= read -r x; do

The only caveat is, sed program was amended to remove empty lines
(there was one right above the PB_AUTOGEN_STOP).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00
Kir Kolyshkin
75b859f23f scripts/ci: rm shellcheck disable annotations
Those are no longer needed with shellcheck 0.8.0.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00