We see that libbsd redefines __has_include to be always true, which
breaks such checks for rseq. The idea behind this patch is remove the
use of libbsd functions and always export our replacement functions.
Using __strlcat and __strlcpy everywhere in existing code:
git grep --files-with-matches "strlcat" | xargs sed -i 's/strlcat/__strlcat/g'
git grep --files-with-matches "strlcpy" | xargs sed -i 's/strlcpy/__strlcpy/g'
Fixes: #2036
Suggested-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
As our pr_* functions are complex and can call different system calls
inside before actual printing (e.g. gettimeofday for timestamps) actual
errno at the time of printing may be changed.
Let's just use %s + strerror(errno) instead of %m with pr_* functions to
be explicit that errno to string transformation happens before calling
anything else.
Note: tcp_repair_off is called from pie with no pr_perror defined due to
CR_NOGLIBC set and if I use errno variable there I get "Unexpected
undefined symbol: `__errno_location'. External symbol in PIE?", so it
seems there is no way to print errno there, so let's just skip it.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
As our pr_* functions are complex and can call different system calls
inside before actual printing (e.g. gettimeofday for timestamps) actual
errno at the time of printing may be changed.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This kernel feature contained some bugs initially. Those logs are useful in identifing what the
underlaying issue is and which kernel patch to backport.
Signed-off-by: Michal Clapinski <mclapinski@google.com>
This way we can check that mount tree topology (including sharing
groups) is the same before and after c/r.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Now we can compare mount tree and sharing group tree topology before and
after c/r with mntns_compare() helper.
Algorithm here is:
1) build mount tree based on mnt_id and parent_mnt_id from mountinfo
2) sort mount tree children based on path comparison
3) at the same time set topology_id for mounts by DFS order and order
mounts in list accordingly
4) build shared groups tree based on sharing_id and master_id
5) at the same time set topology_id for sharings as smallest topology_id
of its mounts, also sharings are put in their list in order of
their topology_id
6) walk sorted mounts lists for both namespaces simultaneously each
pair of moutns should have matching ids and parent ids
7) walk sorted sharings lists for both namespaces simultaneously each
pair of sharings should have matching ids and parent ids
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
For mount testing it is nice to be able to parse mountinfo from zdtm
test itself, for instance to be able to compare mountinfo topology
before and after c/r, or for anything else. So let's add a helper
mntns_parse_mountinfo() which parses current mount namespace mountinfo.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Need it to use linux lists in zdtm.
Also copy container_of from comiler.h to zdtmtst.h like we already do
for e.g. __stack_aligned__ macro.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Previousely "make indent" checked all files in criu source directory for
codding style flaws. We have several problems with it:
- clang-format default format sometimes changes in new versions of the
package and we need to reformat all our code base each time it happens
- on different systems we may have different versions of clang-format
and on latest criu-dev "make indent" may be still unhappy on your system
- when we want to update clang-format rules ourselves we need to update
all our code base each time
- sometimes clang-format rules are not fitting all our cases, (e.g.: an
option IndentGotoLabels works nice for simple C code, but is a no go for
assembler and C macros) and putting "clang-format off" everywhere is a
mess
- sometimes we intentionally want to break clang-format rules (e.g.:
we want to put function arguments on a new line separating them
"logically" not "mechanically" following 120-char rule like clang-format
does).
This adds a BASE option for "make indent" where all commits in range
BASE..HEAD would be checked with git-clang-format for codding style
flaws. For instance when developing on top of criu-dev, one can use
"make BASE=origin/criu-dev indent" to check all their commits for
compliance with the clang-format rules. Default base is HEAD~1 to make
last commit checked when "make indent" is called. The closest thing to
the old behaviour would then be "make indent BASE=init", note that only
commited files would be checked.
Extra options to git-clang-format may be passed through OPTS variable.
Also reuse "make indent" in github lint workflow.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The command ./zdtm.py list currently fails with
if opts['rootless']:
~~~~^^^^^^^^^^^^
KeyError: 'rootless'
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
memory.kmem.limit_in_bytes has been deprecated. Look at e7c4184164f7
("memcg, kmem: further deprecate kmem.limit_in_bytes") for more details.
Signed-off-by: Andrei Vagin <avagin@google.com>
Restoring SO_MARK requires root or CAP_NET_ADMIN. If the value
is 0 we will avoid dumping it so that we don't need to do a
privileged call on restore.
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
SO_SNDBUFFORCE/SO_RCVBUFFORCE require root or CAP_NET_ADMIN.
We can use SO_SNDBUF/SO_RCVBUF in some cases and avoid
needing elevated privileges.
This patch renames sk_setbufs() to sk_setbufs_ns() and
makes sk_setbufs() a general helper that sets socket
send and receive buffer sizes. The helper tries to use
SO_SNDBUFFORCE/SO_RCVBUFFORCE first and falls back to
SO_SNDBUF/SO_RCVBUF if we're in unprivileged mode.
The existing sk_setbufs_ns() which takes a pid parameter
and is intended to be called via userns_call() is rewritten
to call sk_setbufs().
Existing code that sets buffer sizes via setsockopt() is
modified to call sk_setbufs() instead.
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
ghost_multi_hole00 and ghost_multi_hole01 are tests which create a ghost file
with a lot of holes, there are 4K data and 4K hole inside every 8K length.
The only difference between them is ghost-fiemap option, 01 is a
test for the fiemap dumping algorithm, and we want to test the
behavior of EXTENT_MAX_COUNT part, so the file size should be 8M, thus there
will be 1024 chunks in the ghost file.
In some file system, such as xfs, we somehow can not easily create highly sparse
file as in ext4 or btrfs, therefore we need `fallocate` to forcibly create holes.
Signed-off-by: Liang-Chun Chen <featherclc@gmail.com>
In order to reduce the frequency of using system call, based on
https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/tree/misc/create_inode.c#n519,
I created a new algorithm of dumping chunk via fiemap.(copy_file_to_chunks_fiemap)
Also, I added another BOOL_OPT for users to determine which algorithm they
want to use. Moreover, for those filesystem not supporting fiemap, criu
will fall back to the original algorithm(SEEK_HOLE/SEEK_DATA).
v2: don't call copy_chunk_from_file on outstanding extent; rearange
headers to workaround "redeclaration of ‘enum fsconfig_command’" problem
Signed-off-by: Liang-Chun Chen <featherclc@gmail.com>
This patch fixes applies the changes required by clang-format v15.0.5
for `make indent`.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The python3 package in Alpine has recently been updated to install
symbolic link for /usr/bin/python.
https://git.alpinelinux.org/aports/commit/main/python3?id=d91da210b1614eb75517d59b7f348fee01699f35
This causes the following error in CI:
Step 10/11 : RUN ln -s /usr/bin/python3 /usr/bin/python
---> Running in a5a94be9dc93
ln: failed to create symbolic link '/usr/bin/python': File exists
The command '/bin/sh -c ln -s /usr/bin/python3 /usr/bin/python' returned a non-zero code: 1
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The way ShellCheck is installed was changed in commit c056f99
(ci/gha/lint: install a recent shellcheck) to use the latest version
v0.8.0 and remove some of the "shellcheck disable=..." annotations.
Since then, Fedora 37 has been released and the ShellCheck package
has been updated to v0.8.0.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
While building on a machine that has a HOL clang compiler,
I ran into warnings regarding the changed line. It appears
this warning is on by default because of anticipated changes
to the C standard.
Signed-off-by: Drew Wock <ajwock@gmail.com>
This patch adds a missing definition for `__nmk_dir` in the Makefile
for the amdgpu plugin. This definition is required, for example, when
building the `test_topology_remap` target:
make -C plugins/amdgpu/ test_topology_remap
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Zombie tasks are dumped in dump_zombies() so it is redundant to handle them
in dump_one_task().
Deprecate cg_set in task_core_entry as this field must be per thread now.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Some users on Raspberry Pi report that the kerndat checking for
memfd_create(MFD_HUGETLB) support returns ENOSYS even when memfd_create
syscall is available. We currently treat this error as unexpected and
return error. This commit marks the memfd_create(MFD_HUGETLB) as
unavailable when ENOSYS is returned.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
A previous commit added a cgroup cpuset unmounting to
scripts/ci/Makefile. We are sometimes running in a container without the
necessary privileges to unmount certain cgroups.
This commit moves the cgroup unmounting to a place in run-ci-tests.sh
which already requires privileged access and does not break unprivileged
build-only CI runs.
Signed-off-by: Adrian Reber <areber@redhat.com>
As cgroupv2_00, cgroupv2_01 need cpuset in cgroup-v2 hierarchy to check CRIU
handle cgroup-v2 properly, umount cpuset in cgroup-v1 to make it move to
cgroup-v2.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
This test creates a process with 2 threads in different threaded controllers and
check if CRIU restores these threads' cgroup controllers properly.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
As threads in a process may be in different threaded controllers, we need to
move thoses threads to the correct controllers.
Because the threads of a process are restored in later stage in restorer.c, we
need to create a cgroupd service to help to move those threads into correct
controllers when they are restored. We cannot use usernsd as the code in
restorer does not know the address of outside function to pass to userns_call.
However, this cgroupd service still reuses a lot of code from usernsd.
The main logic is that restored threads receive the cg_set number they belong to
before restorer stage in case their cg_set are different from main thread. When
these threads are restored, they send the cg_set number and their thread ids
through unix socket to cgroupd. cgroupd receives the cg_set number and thread
ids and moves those threads into correct controllers. Thread ids are sent
through SCM_CREDENTIALS of unix socket so they are translated into correct
thread ids in the receiving end.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Currently, we assume all threads in process are in the same cgroup controllers.
However, with threaded controllers, threads in a process may be in different
controllers. So we need to dump cgroup controllers of every threads in process
and fixup the procfs cgroup parsing to parse from self/task/<tid>/cgroup.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
This commit supports checkpoint/restore some new global properties in cgroup-v2
cgroup.subtree_control
cgroup.max.descendants
cgroup.max.depth
cgroup.freeze
cgroup.type
Only cgroup.subtree_control, cgroup.type need some more code to handle.
cgroup.subtree_control value needs to be set with "+", "-" prefix and
cgroup.type can only be written with value "threaded" if we want to make this
controller threaded. cgroup.type is a special property because this property
must be restored before any processes can move into this controller.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
It seems like drone.io no longer provides free aarch64/armhf CI runs.
This switches the aarch64 CI runs to Cirrus CI. armhf CI runs have been
dropped for now as they are not directly supported.
Signed-off-by: Adrian Reber <areber@redhat.com>
Since commit https://github.com/torvalds/linux/commit/5563cabdde, user with
enough capability can open IPC sysctl files and write to them. Therefore, we
don't need to use usernsd process in the outside user namespace to help with
that anymore. Furthermore, some later commits:
https://github.com/torvalds/linux/commit/1f5c135ee5,
https://github.com/torvalds/linux/commit/0889f44e28 bind the IPC namespace to
the opened file descriptor of IPC sysctl at the open() time, the changed value
does not depend on the IPC namespace of write() time anymore. This breaks the
current usernsd approach.
So, we prioritize opening/writing IPC sysctl files in the context of restored
process directly without usernsd help. This approach succeeds in the newer
kernel since the restored process has enough capabilities at this restore stage.
With older kernel, the open() fails and we fallback to the usernsd approach.
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
In Virtuozzo we've faced out-of-bound access when calling this function
on short path string, which corrupted other memory and lead to
segmentation fault. So it may be useful to have this comment in code to
avoid such a missuse of this function in future.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
These are the minimal changes to make zdtm.py successfully run the
env00 and pthread test case as non-root using the '--rootless' zdtm option.
Co-authored-by: Younes Manton <ymanton@ca.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
This adds the non-root section and information about the parameter
--unprivileged to the man page.
Co-authored-by: Anna Singleton <annabeths111@gmail.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Anna Singleton <annabeths111@gmail.com>