Just set all possible values 0-3 and check if it persists.
Reviewed-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
When exceptions are raised during testing, the image streamer process
should be terminated as opposed to being left hanging.
This could leave the whole test suite hanging as it waits
for all child processes to exit.
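A minimal sketch of the intended behaviour, assuming the streamer is
started as a child process from Python. The helper name
run_with_streamer and the sleeping stand-in command are hypothetical and
are not taken from the test suite:

```python
import subprocess
import sys


def run_with_streamer(streamer_cmd, test_body):
    """Run test_body while a helper process is alive; always reap the helper."""
    streamer = subprocess.Popen(streamer_cmd)
    try:
        return test_body()
    finally:
        # Terminate the helper even when test_body raised, so the parent
        # is not left waiting for a child that never exits.
        if streamer.poll() is None:
            streamer.terminate()
        streamer.wait()


# Hypothetical stand-in for the streamer: a child that would otherwise
# outlive a failing test.
STAND_IN_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]
```

The try/finally keeps the cleanup in one place, so both the success path
and the exception path reap the child.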
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
ShellCheck reports the following problems:
SC2086: Double quote to prevent globbing and word splitting.
SC2035: Use ./*glob* or -- *glob* so names with dashes won't become options.
SC1091: Not following: ../env.sh was not specified as input (see shellcheck -x).
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
Previous commit added support for python3 in criu-coredump. For convenience,
add two files (coredump-python2 and coredump-python3) that start
criu-coredump with respective python version. Edit env.sh accordingly.
Signed-off-by: Andrey Vyazovtsev <viazovtsev.av@phystech.edu>
run_test was trying to read criu logs on build failure
instead of runtime error.
This patch also removes the unnecessary subfolder with name "i"
and resolves some of issues reported by shellcheck.
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
This test case aims to verify that CRIU correctly
restores a process in IPC, UTS and Time namespaces
with criu_join_ns_add() libcriu API.
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
In test/jenkins/{crit.sh,criu-dump}, ZDTM is run with --norst,
causing tests to only go through dump without restoring.
The network locking tests are highly dependent on dump/restore hooks,
causing them to hang when run with --norst.
We just add a reqrst flag to all network lock tests.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
When the network is locked using a specific method like iptables
or nftables there is no need to require passing the same method
during restore.
We save the lock method during dump in the inventory image and
use that in restore.
This always overwrites the restore --network-lock option.
v2: store opts.network_lock_method directly to avoid dependency
on rpc.proto's 'enum criu_network_lock_method'.
v3: fall back to iptables if image is generated with an older
version of CRIU.
v4: remove --network-lock from netns_lock_* from restore
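A rough sketch of the restore-side selection described above, assuming
the inventory image is available as a plain dict. The key name
network_lock_method and the helper restore_lock_method are hypothetical
and only illustrate the precedence rules:

```python
DEFAULT_LOCK_METHOD = "iptables"


def restore_lock_method(inventory, cli_value=None):
    """Pick the network lock method to use on restore.

    cli_value stands for a --network-lock option given on restore. It is
    deliberately ignored: the value recorded at dump time always wins.
    """
    # Images written by an older version carry no such field, so fall
    # back to iptables in that case.
    return inventory.get("network_lock_method", DEFAULT_LOCK_METHOD)
```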
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
v2: remove unnecessary elif and else after return in
wait_server_addr()
v3: use IOError instead of FileNotFoundError for python2
compatibility
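A minimal sketch of both points, assuming the hook polls a file for the
server address. Only the name wait_server_addr() comes from this patch;
the body, parameters and file format below are assumptions:

```python
import time


def wait_server_addr(path, attempts=50, delay=0.1):
    """Poll until the file at `path` exists and return its contents."""
    for _ in range(attempts):
        try:
            with open(path) as f:
                # Early return: no elif/else branch is needed after it.
                return f.read().strip()
        except IOError:
            # Python 2 has no FileNotFoundError. On Python 3, IOError is
            # an alias of OSError, which FileNotFoundError subclasses, so
            # this clause works on both.
            time.sleep(delay)
    return None
```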
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This is just a symlink to the original static/net_lock_socket_iptables
test with the right options passed to use nftables instead.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
When criu dumps a process with --tcp-established opt it locks
the open tcp connections so that no packets from the peer enter
the stack, otherwise RST will be sent by the kernel causing the
connection to fail.
Post-start hook creates a connection with the test server and
creates a background thread that stays alive for the duration
of the test. This background thread sends data to the test
server at three stages:
- Pre-dump: Should send normally
- Pre-restore:
If the connection is locked properly, packets will be dropped
and TCP will just retry; the data will eventually be sent when
the process is restored and the network is unlocked.
- Post-restore: Should send normally
Data sent at the three stages is then checked at the server's side.
v2:
- remove unused imports and constants
- delete sync file in wait_sync_file() instead of --clean
v3:
- add comments
Co-developed-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This is just a symlink to the original static/netns_lock test with
the right options passed to use nftables instead.
v2:
- make static/netns_lock test iptables explicitly
- prevent netns_lock tests from running in parallel because
netns & sync files creation were conflicting in both tests.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
On Virtuozzo7 jenkins we see a fail of criu-dev zdtm:
===================== Run zdtm/static/pthread_timers in ns =====================
Start test
./pthread_timers --pidfile=pthread_timers.pid --outfile=pthread_timers.out
Run criu dump
=[log]=> dump/zdtm/static/pthread_timers/112/1/dump.log
------------------------ grep Error ------------------------
(00.004817) netlink: Collect netlink sock 0x1cad6e21
(00.004821) netlink: Collect netlink sock 0x1cad6e22
(00.004831) Collecting pidns 9/112
(00.004886) No parent images directory provided
(00.004903) Warn (criu/lsm.c:328): don't know how to suspend LSM 0
------------------------ ERROR OVER ------------------------
Run criu restore
4: Old maps lost: set([])
4: New maps appeared: set([u'7fe4c54ca000-7fe4c54cb000 ---p', u'7fe4c0000000-7fe4c0021000 rw-p', u'7fe4c0021000-7fe4c4000000 ---p', u'7fe4c54cb000-7fe4c5ccb000 rw-p'])
############# Test zdtm/static/pthread_timers FAIL at maps compare #############
https://ci.openvz.org/job/CRIU/job/CRIU-virtuozzo/job/criu-dev/8032/consoleFull
First thing to mention is that this is not related to criu: I can
reproduce it with "--nocr". The problem is that some mapping appears a
bit later than the point where we do pre-cr get_visible_state().
By debugging SIGEV_THREAD thread with gdb I can see that addresses from
this unexpectedly appearing mapping are used by glibc here as "struct
pthread *pd":
clone()
start_thread()
timer_helper_thread()
__pthread_create_2_1()
So the mapping looks like it was allocated by allocate_stack(), and this
only happens after the first timer trigger (we have glibc-2.17 on vz7):
https://github.com/bminor/glibc/blob/release/2.17/master/nptl/sysdeps/unix/sysv/linux/timer_routines.c#L92
So let's wait for at least one timer trigger so that the memory layout of
the test becomes permanent and our check_visible_state zdtm check does
not report a false negative.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Jenkins test runs are failing with:
./test/jenkins/run_ct ./test/jenkins/crit.sh
./test/jenkins/crit.sh: 3: source: not found
Switch to bash which has 'source'.
Signed-off-by: Adrian Reber <areber@redhat.com>
AppArmor namespaces are officially colon-separated. The double-slash
syntax is just convenience:
"The trailing : separates the namespace name from the profile name and
the optional / and // separators are provided as a convenience for those
familiar with ssh and protocol urls." (see [1])
[1]: https://gitlab.com/apparmor/apparmor/-/wikis/AppArmorNamespaces
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
v2: use a profile that doesn't have "unix" to test the suspend feature too
v3: use "/" in the profile names to make sure this works
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Resolving real pid to vpid for notify thread ids requires NSpid feature
supported by kernel, though in simple non-pid-ns case we can deal
without it, so add a requirement and split out the host test without the
requirement.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Prioritize /lib/* because iptables fails to search /usr/lib64/*
first on archlinux.
This change of 'deps' order prioritizes the default library location.
This affects:
- zdtm/static/netns-nf
- zdtm/static/netns-nft-ipt
- zdtm/static/socket-tcp-closed-last-ack
- zdtm/static/socket-tcp-reseted
- zdtm/static/socket-tcp-syn-sent
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This is useful to investigate problems on pre-dump iterations. After
this patch test output with "--pre=2 --sbs" would have new useful stop
points.
While on it let's remove confusion in sbs stop point naming. "Pause at
pre-dump" actually has nothing to do with pre-dump, so let's use
"before " instead of "at pre-" and, similarly, "after " instead of
"at post-".
Result would look like:
========================== Run zdtm/static/env00 in h ==========================
Start test
./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
Pause before pre-dump 0. Press Enter to continue.
Run criu pre-dump
Pause before pre-dump 1. Press Enter to continue.
Run criu pre-dump
Pause before dump. Press Enter to continue.
Run criu dump
Pause before restore. Press Enter to continue.
Run criu restore
Pause after restore. Press Enter to continue.
v2: improve sbs step naming; rename "iter" to more meaningful
"pre-dump"/"snap".
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The --post-start hook creates a netns which the test should enter
at the beginning of the test.
The test randomly failed in CI tests, most likely because of
a race condition.
I suspect this flow is the root cause:
1. --post-start hook starts just after the test (in parallel)
2. --post-start hook calls ip netns add to create the test netns
3. ip creates the netns file
4. netns_lock test opens that file and uses it in setns
5. ip mounts the netns to the file
The test then fails at step 4 because the netns is not yet mounted
to the file.
I made the test wait for SYNCFILE to be created by the --post-start
hook before it tries to open the netns file and call setns.
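The wait can be sketched as a simple poll with a deadline. This is an
illustration in Python rather than the test's actual code; the helper
name wait_for_file and its timeout values are assumptions:

```python
import os
import time


def wait_for_file(path, timeout=10.0, poll=0.05):
    """Block until `path` exists; return True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll)
    # One last check in case the file appeared during the final sleep.
    return os.path.exists(path)
```

Because the hook creates the sync file only after ip has finished, seeing
the file implies that the netns is already mounted.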
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
When criu dumps a process in a network namespace it locks
the network so that no packets from the peer enter the stack,
otherwise RST will be sent by the kernel causing the connection
to fail.
In netns_lock.c we try to enter the netns created by post-start
hook so that criu locks the network namespace between dump and
restore.
A TCP server is started in post-start hook inside the test netns
and runs in the background detached from its parent so that
it stays alive for the duration of the test.
Other hooks (pre-dump, pre-restore, post-restore) try to
connect to the server.
Pre-dump and post-restore hooks should be able to connect
successfully.
Pre-restore hook client with SOCCR_MARK should also connect
successfully.
Pre-restore hook client without SOCCR_MARK should not be able
to connect but also should not get connection refused as all
packets are dropped in the namespace so the kernel shouldn't
send an RST packet as a result. Instead we check that the
connect operation causes a timeout.
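The distinction between a refused and a timed-out connect can be
sketched as follows. This is an illustration only: the function name
classify_connect and the one-second timeout are assumptions, and the
SOCCR_MARK variant (which would additionally set a socket mark) is not
covered:

```python
import errno
import socket


def classify_connect(host, port, timeout=1.0):
    """Return 'connected', 'refused' or 'timeout' for a TCP connect attempt."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "connected"
    except socket.timeout:
        # No SYN-ACK and no RST arrived: packets were silently dropped.
        return "timeout"
    except OSError as e:
        if e.errno == errno.ECONNREFUSED:
            # An RST came back, which a locked netns should not produce.
            return "refused"
        raise
    finally:
        sock.close()
```

In terms of this sketch, the pre-restore client without SOCCR_MARK
expects "timeout", while the other clients expect "connected".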
This test would be useful when testing that the network is
locked using different ways (using iptables currently and
other methods later).
v2:
- check that packets with SOCCR_MARK are allowed to
pass when the netns is locked.
v3:
- fix pre-restore hook skipping non SOCCR_MARK
connection test due to early exit in SOCCR_MARK
variant.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
criu exec has been deprecated for some time now and criu just exits with
an error when running 'criu exec'. This removes the test for that
non-working subcommand.
Signed-off-by: Adrian Reber <areber@redhat.com>
This adds a test run to ensure known (but fixed) configuration file
parser errors are not crashing CRIU anymore.
Based on missing test code coverage this script also tests code paths of
the option handling which have not been tested until now.
Signed-off-by: Adrian Reber <areber@redhat.com>
Now that we are running CI on an actual CentOS 7 kernel, several
tests no longer work as they require newer kernels.
This commit disables a few tests only on CentOS 7.
Signed-off-by: Adrian Reber <areber@redhat.com>
With this change tainted kernels can be ignored by setting
ZDTM_IGNORE_TAINT=1. This simplifies the CI script, which no longer
needs to change every call of zdtm. Setting the variable once should
be enough.
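A minimal sketch of such a check, assuming the environment variable is
combined with an existing command-line switch. The function name
ignore_taint and its parameters are hypothetical:

```python
import os


def ignore_taint(cli_flag=False, environ=None):
    """Return True if a tainted kernel should be ignored."""
    env = os.environ if environ is None else environ
    # Either the per-call switch or the exported variable is sufficient.
    return bool(cli_flag) or env.get("ZDTM_IGNORE_TAINT") == "1"
```

With this shape a CI script only needs to export ZDTM_IGNORE_TAINT=1 once
before running zdtm.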
Signed-off-by: Adrian Reber <areber@redhat.com>
These files use $PKG_CONFIG before they include the common files that
setup a default, so set early defaults in them too.
Signed-off-by: Mike Frysinger <vapier@chromium.org>
The build needs to respect $PKG_CONFIG env var like other standard
build systems and the upstream pkg-config project itself. This
allows the package builder to point it to the right tool when doing
a cross-compile build. Otherwise the host pkg-config tool is used
which won't have access to the packages in the cross sysroot.
Signed-off-by: Mike Frysinger <vapier@chromium.org>
The tun_ns test was introduced with [1] and [2], however, these commits
didn't add per-test dependencies required for the test.
Per-test dependencies are listed in the .desc file as 'deps': [<list>]
These dependencies are made available inside the test namespace and without
the ip dependency, the test fails on Fedora 34 with
Error: ipv4: FIB table does not exist.
[1] https://github.com/checkpoint-restore/criu/commit/7e355e7
[2] https://github.com/checkpoint-restore/criu/commit/3ba0893
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Although CentOS 8 comes with a 4.18 kernel, it has time namespace patches
backported, but not all the required ones. This disables time namespace
tests on everything older than 5.11.
Signed-off-by: Adrian Reber <areber@redhat.com>
There are several problems with the loop.sh script. First, the code is
duplicated across tests in the so-called 'others' category. Second, we
need to run it with the 'setsid' utility to make sure that it runs in
a new session. Third, we have to redirect the standard file descriptors
and use the '&' operator to make it run in the background. Finally,
obtaining the PID of the 'loop.sh' process resulted in a race condition.
In this patch we replace the loop.sh script with a program that would
address all problems mentioned above. The requirements for this program
are as follows.
- It must be reusable across tests
- It must start a process that is detached from the current shell
- It must wait for the process to start and output its PID
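The three requirements can be illustrated with a short Python sketch.
This is not the program added by this patch; the helper name
start_detached is hypothetical:

```python
import subprocess
import sys


def start_detached(cmd):
    """Start cmd in a new session with stdio detached and return its PID."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # the child calls setsid() before exec
    )
    # Popen returns only after the child has been created, so the PID is
    # known without parsing shell output or racing with '&'.
    return proc.pid


if __name__ == "__main__":
    # Print the PID so that a calling script can capture it.
    print(start_detached(sys.argv[1:]))
```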
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The function name '_exit' is misleading as this function doesn't
actually exit when the status of the previous command is zero.
In addition, the behaviour of this function is not really needed.
This patch removes the '_exit' function and applies the correct
behaviour to stop the test on failure.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This is just a symlink to the original transition/pid_reuse test with
the right options passed to trigger the pidfd store based pid reuse
detection code path.
Pidfd store based detection is supported only in RPC mode which
requires passing a unix socket fd to be used as pidfd store and
the kernel should support pidfd_open and pidfd_getfd syscalls
{'feature': 'pidfd_store'} for this test to work.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
When testing pid reuse using the pidfd_store feature in RPC mode we need
to pass a unix socket fd to CRIU in the RPC option pidfd_store_sk; it is
used to store the pidfds between predump/dump iterations.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This commit extends the CRIT tests to cover the 'x' command, which is
used to explore an image directory.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This testcase reproduces a deadlock on the "wait_fds_event" futex in the
open_fdinfos() function (files subsystem).
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>