mirror of https://github.com/checkpoint-restore/criu synced 2025-08-31 14:25:49 +00:00
561 Commits

Author SHA1 Message Date
Kir Kolyshkin
75b859f23f scripts/ci: rm shellcheck disable annotations
Those are no longer needed with shellcheck 0.8.0.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00
Kir Kolyshkin
aeb6961f3d scripts/ci/run-ci-tests: use bash arrays
This is the preferred way of fixing the SC2086 shellcheck warning.

Note that since ZDTM_OPTS is passed as a string (via make or docker),
we are converting it to an array using read -a.

Remove all "shellcheck disable=SC2086" annotations.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00
Kir Kolyshkin
b1fb9f2f0b Fix, not ignore, shellcheck SC1091 warnings
This is easy to fix (but we have to specify -x).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00
Kir Kolyshkin
9d2948b239 scripts/ci/asan.sh: fix, not ignore, shellcheck warning
We can use globstar bash feature instead of find in this case.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00
Kir Kolyshkin
968eec0d59 scripts/ci/apt-install: fix (not ignore) shellcheck warning
It is ok to quote $@, as it expands to "$1" "$2" ...

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2023-04-15 21:17:21 -07:00
Pavel Tikhomirov
9e91e62a7c criu-ns: capture controlling tty
When restoring in a new pidns, we specifically call setsid() from
criu-ns init so that the sids of restored tasks are non-zero in this pidns
and CRIU does not run into problems with zero sids on the next dump, see [1].

But after this, CRIU tries to inherit and set up a tty for the restored
process, and it fails to make its process group the foreground group of
its tty via TIOCSPGRP, because the tty is already the controlling tty of
another session (the one we had before setsid).

So to make it restore we need to reset tty to be a controlling tty of
criu-ns init via TIOCSCTTY before calling criu.

Otherwise, when restoring for the first time via criu-ns (from criu-ns dump), we get:

Error (criu/tty.c:689): tty: Failed to set group 40816 on 0: Inappropriate ioctl for device

https://github.com/checkpoint-restore/criu/issues/232 [1]

v2: add why and what comment in code, set controlling tty only for
--shell-job and fail if stdin is not a tty.

Fixes: #1893
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2023-04-15 21:17:21 -07:00
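
The step described in the commit above (making the tty the controlling tty of the current session via TIOCSCTTY, and failing if stdin is not a tty) can be sketched roughly as follows. This is a minimal illustration, not the code from criu-ns: the function name `set_controlling_tty` and the choice of passing `1` as the ioctl argument are assumptions.

```python
import errno
import fcntl
import os
import termios


def set_controlling_tty(fd):
    """Make the terminal behind `fd` the controlling tty of this session.

    Hypothetical helper for illustration; the caller is expected to be a
    session leader (for example, right after os.setsid()).
    """
    if not os.isatty(fd):
        # Mirrors the "fail if stdin is not a tty" rule from the commit.
        raise OSError(errno.ENOTTY, "fd %d is not a tty" % fd)
    # Assumption: argument 1 asks the kernel to take the terminal over even
    # if it is already the controlling tty of another session, which
    # requires sufficient privileges.
    fcntl.ioctl(fd, termios.TIOCSCTTY, 1)
```

A caller would typically invoke `set_controlling_tty(0)` after `os.setsid()` and before launching criu, and only when restoring a shell job.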
Younes Manton
7bc24688d6 ci: Clean up and improve Java testing
This patch changes top-level OpenJ9 filename and data references to Java
to make them generic and launches tests against both HotSpot and OpenJ9
JVMs.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
0178f2f990 ci: Add Dockerfile for openj9 on Ubuntu
Semeru builds (which use OpenJ9 instead of HotSpot) are the successors
of AdoptOpenJDK's OpenJ9 builds.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Younes Manton
39b3de60b6 ci: Rename openj9 Dockerfiles to hotspot
We used to pull AdoptOpenJDK's OpenJ9 builds but switched to
Eclipse Temurin, which uses the HotSpot VM instead of OpenJ9.
Rename the corresponding Dockerfiles to hotspot.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
2642b657da docker-test: handle race condition error
There is a race condition in docker/containerd that causes docker to
occasionally fail when starting a container from a checkpoint immediately
after the checkpoint has been created.

This problem is unrelated to criu and has been reported in
https://github.com/moby/moby/issues/42900

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
f9bc0a750a docker-test: use containerd installed from package
In commits [1, 2] the version of containerd installed by default in the
GitHub CI virtual environment was replaced with the latest release from
GitHub as a workaround to a bug in containerd.  This bug was fixed
some time ago and the current default version of containerd (1.6.6) does
not require this workaround. However, with the latest release, the
containerd binaries uploaded on GitHub have been built for Ubuntu 22.04
[3]. Our tests are still running on Ubuntu 20.04 and this results in the
following error:

/usr/bin/containerd: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/bin/containerd)
/usr/bin/containerd: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/bin/containerd)

[1] https://github.com/checkpoint-restore/criu/commit/046cad8
[2] https://github.com/checkpoint-restore/criu/commit/81a68ad
[3] https://github.com/containerd/containerd/commit/6b2dc9a37

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
750acec25f Revert "ci: Switch to non overlaysfs tests"
This reverts commit 8bb05e3bf3.

The following bug has been fixed:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1967924

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Radostin Stoyanov
e8a6765d1e criu: fix conflicting headers
There are several changes in glibc 2.36 that make sys/mount.h header
incompatible with kernel headers:

https://sourceware.org/glibc/wiki/Release/2.36#Usage_of_.3Clinux.2Fmount.h.3E_and_.3Csys.2Fmount.h.3E

This patch removes conflicting includes for `<linux/mount.h>` and
updates the content of `criu/include/linux/mount.h` to match
`/usr/include/sys/mount.h`. In addition, the inline definitions of sys_*()
functions have been moved from "linux/mount.h" to "syscall.h" to
avoid conflicts with `uapi/compel/plugins/std/syscall.h` and
`<unistd.h>`. The include for `<linux/aio_abi.h>` has been replaced
with local include to avoid conflicts with `<sys/mount.h>`.

Fixes: #1949

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2023-04-15 21:17:21 -07:00
Alexander Mikhalitsyn
e30d18f435 rseq: fix headers conflict on Mariner GNU/Linux
1. For some reason, the Mariner distribution headers
do not correctly define the __GLIBC_HAVE_KERNEL_RSEQ
compile-time constant. It remains undefined,
but in fact the header files provide the corresponding
rseq type declarations, which leads to a conflict.

2. Another issue is that they use uint*_t types
instead of __u* types as in the original rseq.h.

This leads to compile time issues like this:
format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type 'uint64_t' {aka 'long unsigned int'}

and we can't even replace %llx with %PRIx64 because it will break
compilation on other distros (like Fedora) with an analogous error:

error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘__u64’ {aka ‘long long unsigned int’}

Let's use our own copy of struct rseq, identical to the kernel one;
it's safe because this structure is part of the Linux kernel ABI.

Fixes #1934

Reported-by: Nikola Bojanic
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2023-04-15 21:17:21 -07:00
Younes Manton
ad58553d90 Add --skip-file-rwx-check opt test
Add a simple test using tail to check that processes can't be restored
by default when the r/w/x mode of an open file changes, unless
--skip-file-rwx-check is used.

Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
2023-04-15 21:17:21 -07:00
Ashutosh Mehra
28358db13b Fix the check for mnt namespace in criu-ns
criu-ns script incorrectly compares the pidns fd with mntns fd.
Also reversed the condition in is_my_namespace function to align it
with the function name.

Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
2023-04-15 21:17:21 -07:00
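
The check fixed by the commit above can be illustrated with a short sketch. It assumes a Linux `/proc` filesystem and assumes that `is_my_namespace` takes an open namespace file descriptor plus a namespace name; the actual signature in criu-ns may differ.

```python
import os


def is_my_namespace(fd, ns):
    """Return True if `fd` refers to the calling process's `ns` namespace.

    Sketch only: two namespace handles denote the same namespace when
    their device and inode numbers match.
    """
    mine = os.stat("/proc/self/ns/" + ns)
    theirs = os.fstat(fd)
    return (mine.st_dev, mine.st_ino) == (theirs.st_dev, theirs.st_ino)


mnt_fd = os.open("/proc/self/ns/mnt", os.O_RDONLY)
pid_fd = os.open("/proc/self/ns/pid", os.O_RDONLY)
same_mnt = is_my_namespace(mnt_fd, "mnt")    # mntns fd checked against mntns
mixed_up = is_my_namespace(pid_fd, "mnt")    # pidns fd checked against mntns
os.close(mnt_fd)
os.close(pid_fd)
print(same_mnt, mixed_up)
```

The second call reproduces the kind of mix-up the commit describes: a pid namespace descriptor compared against the mount namespace can never match, so the result says nothing about the mount namespace.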
Andrei Vagin
6507ae5331 ci: test the read mode of pre-dump
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2022-06-22 10:20:33 -07:00
Radostin Stoyanov
ff92731690 crit: Use same version as criu
Name collision with an abandoned project named 'crit' in pypi causes pip
to show crit (CRiu Image Tool) as outdated.  This patch updates crit to
use the same version and license as criu.

Fixes #1878

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-06-22 10:20:33 -07:00
Alexander Mikhalitsyn
c1380c077a ci: workaround race between sit module loading and bridge test
https://github.com/checkpoint-restore/criu/issues/1866

Suggested-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-05-05 12:42:14 -07:00
Alexander Mikhalitsyn
550eafc5d8 ci: print kernel modules list
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-05-05 12:42:14 -07:00
Adrian Reber
f635b61f49 test: install criu in /usr
GitHub Actions comes with criu pre-installed in /usr. Configure scripts
looking for CRIU will pick up the pre-installed version in /usr if we do
not also install the CI criu in /usr.

Signed-off-by: Adrian Reber <areber@redhat.com>
2022-05-05 12:42:14 -07:00
Alexander Mikhalitsyn
f641e0c4ba ci: print mountinfo instead of mount cmd output
mountinfo contains more info than just "mount" output

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn
7ac85cab86 scripts/ci: fix ZDTM_OPTS variable passing
We have a separate target for alpine in scripts/ci/Makefile
which defines some extra opts for zdtm using the ZDTM_OPTIONS
variable. But it doesn't actually work. First of all, the variable
should be named ZDTM_OPTS, and we also have to specify
it directly in the CONTAINER_RUNTIME cmdline to make it work.

I've also changed the variable value to make it consistent
with the docker.env value, which is the one that was really used.

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn
13338dee5c Revert "test: disable rseq also on Archlinux"
This reverts commit f008f74041.

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn
267c1fdade ci: add Fedora Rawhide based test on Cirrus
We have the ability to use nested virtualization on
Cirrus, and already have the "Vagrant Fedora based test (no VDSO)"
test; let's do the same for Fedora Rawhide to get a fresh kernel.

Suggested-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Alexander Mikhalitsyn
03aff7e823 Revert "ci: disable glibc rseq support"
Let's see how rseq() C/R feature works

This reverts commit d99def7dcf.

Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Kir Kolyshkin
0194ed392f Fix some codespell warnings
Brought to you by

	codespell -w

(using codespell v2.1.0).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-04-28 17:53:52 -07:00
Adrian Reber
8bb05e3bf3 ci: Switch to non overlaysfs tests
Switch to non overlaysfs tests for Podman and Docker.
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1967924

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
Andrei Vagin
791651f1b6 criu-ns: add a helper to hold a pid namespace
The init process can exit if it doesn't have any child processes, and its
pidns is destroyed in this case. CRIU dump runs in the target pid
namespace and kills the dumped processes at the end. We need to create a
holder process to be sure that the pid namespace will not be destroyed
before criu exits.

Fixes: #1775

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2022-04-28 17:53:52 -07:00
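
The holder-process idea from the commit above can be sketched as a child that does nothing except stay alive until it is released. This sketch does not create a pid namespace and is not the criu-ns implementation; the names `spawn_holder` and `release_holder` and the pipe-based release mechanism are assumptions made for illustration.

```python
import os


def spawn_holder():
    """Fork a child that blocks until the returned write end is closed.

    Hypothetical helper: while this child is alive, a pid namespace it
    lives in still has a member process.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(w)       # keep only the read end in the child
        os.read(r, 1)     # returns at EOF, once the parent closes `w`
        os._exit(0)
    os.close(r)
    return pid, w


def release_holder(pid, w):
    """Let the holder exit and return its exit code."""
    os.close(w)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

In a criu-ns-like flow, the holder would be started inside the target pid namespace before running criu and released only after criu has exited.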
Andrei Vagin
805559c1de scripts/ci: mount test cgroups once
zdtm.py mounts two named controllers for tests. In CI, we run zdtm.py a few
times, so we can mount (create) these controllers once to avoid any unwanted
effects.

Signed-off-by: Andrei Vagin <avagin@google.com>
2022-04-28 17:53:52 -07:00
Pavel Tikhomirov
3c0e99ccfa ci: make others/mnt_ext_dev also run for old mount engine
Now that we have switched to mount-v2 by default, we need to run
explicitly with the --mntns-compat-mode option to check the old mount engine.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Pavel Tikhomirov
3db949d821 ci: run tests for old mount engine
Now that we have switched to mount-v2 by default, we need to run
explicitly with the --mntns-compat-mode option to check the old mount engine.

Note that if the move_mount_set_group feature is not supported, then a
regular run will just fall back to the old mount engine and we don't
need a separate run with --mntns-compat-mode.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Pavel Tikhomirov
cef8366f52 kerndat: check whether the openat2 syscall is supported
Will use openat2 + RESOLVE_NO_XDEV to detect mountpoints.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2022-04-28 17:53:52 -07:00
Rajneesh Bhardwaj
99a2380fc0 criu/plugin: Dockerfile for amdgpu_plugin
This sets up the pytorch environment for BERT Transformers and also sets
up CRIU along with all its dependencies including amdgpu plugin for
supporting CR with AMDGPUs.

Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
a8dd7d2909 ci: run criu-config tests
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
1c54c45fc5 zdtm: drop redundant config_inotify_irmap test
The config_inotify_irmap test duplicates inotify_irmap with a slight
change to add the --force-irmap and --irmap-scan-path options in
a configuration file.

The --criu-config option of ZDTM provides a more general solution
for testing CRIU options provided in configuration files.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
Adrian Reber
6b842635bd test: disable rseq also on Archlinux
Seems like Archlinux also uses rseq now and that breaks CRIU.
Also disable rseq on Archlinux.

Signed-off-by: Adrian Reber <areber@redhat.com>
2022-04-28 17:53:52 -07:00
Bui Quang Minh
56df8aeeb5 ci: skip MAP_HUGETLB tests in stream test
Currently, hugetlb mappings are not premapped, so in the restore content phase we
skip reading these pages, enqueue the iovec for later reading in the restorer, and
eventually close the page read. However, image-streamer expects the whole image
to be read, and the image is not re-opened or sent twice. These MAP_HUGETLB test
cases therefore result in an EPIPE error. Temporarily disable them for now.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
7177938e60 criu-ns: use os.waitstatus_to_exitcode()
os.WEXITSTATUS() returns the process exit status and it should be used
only if WIFEXITED() is true, i.e., the process terminated normally.

os.waitstatus_to_exitcode() does the same as os.WEXITSTATUS() but it
also handles the case when the process has been terminated by a signal.

Suggested-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
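
The difference described in the commit above can be shown with a small, self-contained sketch. The helper name `child_exit_code` is hypothetical and is not taken from criu-ns; `os.waitstatus_to_exitcode()` requires Python 3.9 or later.

```python
import os
import signal


def child_exit_code(action):
    """Fork, run `action` in the child, and return the child's exit code.

    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        try:
            action()
        finally:
            os._exit(0)  # reached only if `action` returns or raises
    _, status = os.waitpid(pid, 0)
    # Unlike os.WEXITSTATUS(), this is also meaningful when the child was
    # terminated by a signal: the result is then the negated signal number.
    return os.waitstatus_to_exitcode(status)


exited = child_exit_code(lambda: os._exit(3))
killed = child_exit_code(lambda: os.kill(os.getpid(), signal.SIGKILL))
print(exited, killed)
```

Here `exited` carries the plain exit status, while `killed` is negative, which lets a wrapper script tell a signal death apart from a normal exit.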
Radostin Stoyanov
bb1b1681ab criu-ns: fix exit code o for criu dump
Fixes: #1739

Reported-by: @PavloMykhailyshyn
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
89267dbcc8 ci: install libbsd dependency
The libbsd dependency is used to enable support for `setproctitle()`
and `strlcpy()`.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
Adrian Reber
7f4265dc0b ci: update to latest Vagrant and Fedora images
Signed-off-by: Adrian Reber <areber@redhat.com>
2022-04-28 17:53:52 -07:00
Nicolas Viennot
6f9d62eb38 ci: test criu-image-streamer with all tests
All the bugs that were in the way got fixed. We can enable all tests.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2022-04-28 17:53:52 -07:00
Andrei Vagin
8775cf3a50 ci: reenable the lazy-thp test in the lazy-remote mode
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
119a798856 ci: disable glibc rseq support
This patch sets the glibc.pthread.rseq tunable [1] to disable rseq
support in glibc as a temporary solution for the problem described in
[2]. This would allow us to run CI tests until CRIU has rseq support.

This commit also disables the rpc tests as they fail even
when GLIBC_TUNABLES is set.

[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=e3e589829d16af9f7e73c7b70f74f3c5d5003e45
[2] https://github.com/checkpoint-restore/criu/issues/1696

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
9fd000c58d ci: use unstable release for cross-compile
We added cross-compile tests with the Debian testing release to be able to
replicate the error reported in #1653. However, installing build
dependencies in this release currently fails with the following error:

libc6-dev:armhf : Breaks: libc6-dev-armhf-cross (< 2.33~) but 2.32-1cross4 is to be installed

This is not something we can fix, therefore using the Debian unstable
release (instead of testing) could be a more reliable option for our CI.
This would still replicate the problem reported in #1653.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
d79d73e3a0 ci: install procps in Alpine
The default version of ps in the Alpine image is very limited.
It is based on the one from busybox and doesn't support options
such as '-p'.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
Bui Quang Minh
3eba68089e ci: Enable disabled unix socket related tests
The broken unix socket tests have been fixed in the pull request

https://github.com/checkpoint-restore/criu/pull/1680

so we re-enable these tests.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2022-04-28 17:53:52 -07:00
Adrian Reber
a52185ffe3 ci: disable broken tests until fixed
Broken tests are being tracked at

 * https://github.com/checkpoint-restore/criu/issues/1669
 * https://github.com/checkpoint-restore/criu/issues/1635

This also enables previously disabled BPF related tests:

 * https://github.com/checkpoint-restore/criu/issues/1354

Signed-off-by: Adrian Reber <areber@redhat.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
d514bacb40 ci: Run cross compile with debian testing
Debian testing has a newer compiler version, and running
cross-compilation tests there would allow us to catch any compilation
errors early.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00