In test/jenkins/{crit.sh,criu-dump}, ZDTM is run with --norst,
causing tests to only go through dump without restoring.
The network locking tests are highly dependent on dump/restore hooks,
causing them to hang when run with --norst.
We just add a reqrst flag to all network lock tests.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
During network locking CRIU creates an nftables table to
add needed rules. If more than one instance of CRIU runs
in parallel, those tables' names would conflict.
The solution is to append the root task pid to the nftables table
name as a suffix (e.g. inet CRIU-3231).
We also need to use `create table` instead of `add table`
because `create` returns an error if the table name
already exists, so we can detect conflicts if they happen.
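As a rough illustration of the naming scheme (a Python sketch, not
CRIU's C code; both helper names are hypothetical):

```python
def criu_table_name(root_pid: int) -> str:
    """Per-instance table name: "CRIU-" followed by the root task pid."""
    return f"CRIU-{root_pid}"


def create_table_cmd(root_pid: int) -> str:
    """Build the nft command that creates the per-instance table.

    `create` fails if the table already exists, while `add` would
    succeed silently, so a name collision surfaces as an error.
    """
    return f"create table inet {criu_table_name(root_pid)}"
```

For root task pid 3231 this yields the `inet CRIU-3231` table from the
example above.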
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
When the network is locked using a specific method like iptables
or nftables there is no need to require passing the same method
during restore.
We save the lock method during dump in the inventory image and
use that in restore.
This always overwrites the restore --network-lock option.
v2: store opts.network_lock_method directly to avoid dependency
on rpc.proto's 'enum criu_network_lock_method'.
v3: fall back to iptables if image is generated with an older
version of CRIU.
v4: remove --network-lock from netns_lock_* from restore
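The selection logic on restore can be sketched roughly as follows.
This is a Python model only: the inventory is represented as a plain
dict and the key name "network_lock_method" is an assumption made for
illustration, not the actual image field.

```python
from typing import Optional

# Images written by older CRIU versions carry no lock method,
# and those versions could only lock with iptables.
DEFAULT_LOCK_METHOD = "iptables"


def restore_lock_method(inventory: dict, cli_value: Optional[str] = None) -> str:
    """Pick the lock method to use on restore.

    The value saved at dump time always wins; cli_value (what the user
    passed as --network-lock on restore) is deliberately ignored.
    """
    return inventory.get("network_lock_method") or DEFAULT_LOCK_METHOD
```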
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
v2: remove unnecessary elif and else after return in
wait_server_addr()
v3: use IOError instead of FileNotFoundError for python2
compatibility
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This is just a symlink to the original static/net_lock_socket_iptables
test with the right options passed to use nftables instead.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This adds nftables based connection locking as an alternative
to iptables. This avoids the external dependency on
iptables-restore.
It works by creating a 'connection set', which is a set of
connection identifying tuples. Rules are added to drop packets that
match the connection tuples in the set. Locking is now reduced to
just adding the connection identifying tuple to the set.
Unlocking is just as simple as deleting the CRIU table.
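The idea can be modelled in a few lines of Python. This is only a
conceptual sketch of the set-based approach, not the libnftables code;
the class and field names are hypothetical.

```python
from typing import NamedTuple, Set


class ConnTuple(NamedTuple):
    """Tuple identifying one TCP connection."""
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


class ConnectionSet:
    """Stands in for the nftables set plus its drop rule."""

    def __init__(self) -> None:
        self._locked: Set[ConnTuple] = set()

    def lock(self, conn: ConnTuple) -> None:
        # Locking a connection is just adding its tuple to the set.
        self._locked.add(conn)

    def should_drop(self, packet: ConnTuple) -> bool:
        # The rule drops any packet whose tuple is in the set.
        return packet in self._locked

    def unlock_all(self) -> None:
        # Models deleting the whole table: set and rules go together.
        self._locked.clear()
```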
v2: split ip string conversion into two if conditions
v3: add better message when CRIU is built without libnftables support
v4: fix indentation in nftables_lock_connection_raw()
v5: move 'ret = -1' below err: label to avoid redundancy
v6: add better error message on lock failure
v7: run make indent
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
When criu dumps a process with the --tcp-established option, it locks
the open TCP connections so that no packets from the peer enter
the stack; otherwise an RST would be sent by the kernel, causing the
connection to fail.
Post-start hook creates a connection with the test server and
creates a background thread that stays alive for the duration
of the test. This background thread sends data to the test
server at three stages:
- Pre-dump: Should send normally
- Pre-restore:
If the connection is locked properly, packets will be dropped
and TCP will just retransmit them; they will eventually be delivered
when the process is restored and the network is unlocked.
- Post-restore: Should send normally
Data sent at the three stages is then checked at the server's side.
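The server-side check can be pictured like this. It is a Python sketch
only: the newline-separated stage markers are hypothetical and are not
the payload the actual test sends.

```python
from typing import List

# Hypothetical markers, one per stage, sent as separate lines.
STAGES = ("pre-dump", "pre-restore", "post-restore")


def missing_stages(received: bytes) -> List[str]:
    """Return the stages whose marker never reached the server."""
    seen = set(received.decode().splitlines())
    return [stage for stage in STAGES if stage not in seen]
```

An empty result means the data from all three stages arrived,
including the pre-restore data that TCP had to retry.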
v2:
- remove unused imports and constants
- delete sync file in wait_sync_file() instead of --clean
v3:
- add comments
Co-developed-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This is just a symlink to the original static/netns_lock test with
the right options passed to use nftables instead.
v2:
- make static/netns_lock test iptables explicitly
- prevent netns_lock tests from running in parallel because
netns & sync files creation were conflicting in both tests.
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This adds nftables based internal network locking as an
alternative to iptables. This avoids the external dependency
on iptables-restore.
v2: fix indentation & rename 'free' label to 'out'
v3: add better message when CRIU is built without libnftables support
v4:
- move 'ret = -1' below err: label to avoid redundancy
- fix nft ctx memory leak in case of success in
nftables_network_unlock()
v5: add better error message on lock failure
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
With the new --network-lock option, other network locking/unlocking
methods, such as nftables, will be added as alternatives to
iptables.
This option is used in the core network locking/unlocking hooks to
decide which method should be used, making it easier to add new
methods later.
i.e.
- network_lock_internal
- network_unlock_internal
- lock_connection (renamed from nf_lock_connection)
- unlock_connection (renamed from nf_unlock_connection)
- unlock_connection_info (renamed from nf_unlock_connection_info)
nf_* functions are renamed to iptables_* to avoid confusion with
other netfilter methods in the future like nftables.
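The dispatch in these hooks can be sketched as follows. This is a
Python model, not the C code: the handler bodies are placeholders, and
the nftables entry only shows where a later method would plug in.

```python
def iptables_lock_connection(conn: str) -> str:
    # Placeholder for the iptables-based implementation.
    return f"iptables: lock {conn}"


def nftables_lock_connection(conn: str) -> str:
    # Placeholder for a method added later.
    return f"nftables: lock {conn}"


# One table per hook; adding a method means adding one entry.
LOCK_CONNECTION = {
    "iptables": iptables_lock_connection,
    "nftables": nftables_lock_connection,
}


def lock_connection(method: str, conn: str) -> str:
    """Generic hook: pick the implementation from the chosen method."""
    try:
        handler = LOCK_CONNECTION[method]
    except KeyError:
        raise ValueError(f"unknown network lock method: {method!r}") from None
    return handler(conn)
```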
v2: run make indent
v3: make error messages more descriptive
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
Nftables based network locking/unlocking will be added later.
Nftables sets will be used to load the connection tuples that
will be locked. To be able to store those tuples we need to
check for "Set Concatenations" support.
https://wiki.nftables.org/wiki-nftables/index.php/Concatenations
v2: fix 'has_nftables_concat=true' when adding CRIU table fails
v3: add better message when CRIU is built without libnftables support
v4: run make indent
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This adds the option to choose the network locking method.
CRIU currently uses the iptables-restore CLI for network locking/unlocking
but nftables support will be added later.
There have been reports from users that iptables-restore fails in some
way and an nftables based approach using libnftables could avoid this
external dependency.
v2: remove dependency details in man page for --network-lock.
v3: remove --network-lock from restore section in docs because it is
automatically detected from the inventory image now.
v4: add message that --network-lock will be ignored during restore
and value from dump will be used.
v5: run make indent
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
We have some code written with the assumption that argv is never
destroyed, but when we handle configs, we construct an argv-like array
for each config, parse it and release it.
I am still not sure that we need to release memory of per-config argv
arrays... The current scheme is going to be a source of use-after-free
bugs. When we add the non-privileged mode, all these bugs will be
serious security issues.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Without any LSM detected CRIU would print a warning for every run:
Warn (criu/lsm.c:328): don't know how to suspend LSM 0
Which clutters up the CI logs.
Change the message to a debug message.
Signed-off-by: Adrian Reber <areber@redhat.com>
On systems where there is no build_id we get a warning for each run
Warn (criu/files-reg.c:1458): Couldn't find the build-id note for file with fd 15
Change the log level to debug for this message as the file just does not have
a build_id and printing a warning clutters the CI logs.
Signed-off-by: Adrian Reber <areber@redhat.com>
On Virtuozzo7 jenkins we see a fail of criu-dev zdtm:
===================== Run zdtm/static/pthread_timers in ns =====================
Start test
./pthread_timers --pidfile=pthread_timers.pid --outfile=pthread_timers.out
Run criu dump
=[log]=> dump/zdtm/static/pthread_timers/112/1/dump.log
------------------------ grep Error ------------------------
(00.004817) netlink: Collect netlink sock 0x1cad6e21
(00.004821) netlink: Collect netlink sock 0x1cad6e22
(00.004831) Collecting pidns 9/112
(00.004886) No parent images directory provided
(00.004903) Warn (criu/lsm.c:328): don't know how to suspend LSM 0
------------------------ ERROR OVER ------------------------
Run criu restore
4: Old maps lost: set([])
4: New maps appeared: set([u'7fe4c54ca000-7fe4c54cb000 ---p', u'7fe4c0000000-7fe4c0021000 rw-p', u'7fe4c0021000-7fe4c4000000 ---p', u'7fe4c54cb000-7fe4c5ccb000 rw-p'])
############# Test zdtm/static/pthread_timers FAIL at maps compare #############
https://ci.openvz.org/job/CRIU/job/CRIU-virtuozzo/job/criu-dev/8032/consoleFull
The first thing to mention is that this is not related to criu: I can
reproduce it with "--nocr". The problem is that some mapping appears a
bit later, after we do the pre-cr get_visible_state().
By debugging SIGEV_THREAD thread with gdb I can see that addresses from
this unexpectedly appearing mapping are used by glibc here as "struct
pthread *pd":
clone()
start_thread()
timer_helper_thread()
__pthread_create_2_1()
So the mapping looks like it is allocated by allocate_stack(), and this
only happens after the first timer trigger (we have glibc-2.17 on vz7):
https://github.com/bminor/glibc/blob/release/2.17/master/nptl/sysdeps/unix/sysv/linux/timer_routines.c#L92
So let's wait for at least one timer trigger so that the memory layout of
the test becomes permanent and our check_visible_state zdtm check does not
report a false negative.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Jenkins test runs are failing with:
./test/jenkins/run_ct ./test/jenkins/crit.sh
./test/jenkins/crit.sh: 3: source: not found
Switch to bash which has 'source'.
Signed-off-by: Adrian Reber <areber@redhat.com>
The following error occurs when creating a checkpoint of
a container immediately after the container has been restored
from another checkpoint.
Error response from daemon: Cannot checkpoint container cr: content
sha256:12c69b7a9d25695dd5f9d37d4e858e2f7c3f9da738ccf86f8d3042f6973af1df:
already exists
In this patch we add a healthcheck to the test container and update the
test to perform a checkpoint only when the container is in a 'healthy'
state. In addition, this patch adds a scenario to test the
checkpoint/restore of multiple containers.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Otherwise, criu pre-dump can fail with errors like this:
lib/infect.c:650: Unable to connect a transport socket: Permission denied
Reported-by: Mr Jenkins
Signed-off-by: Andrei Vagin <avagin@gmail.com>
AppArmor namespaces are officially colon-separated. The double-slash
syntax is just convenience:
"The trailing : separates the namespace name from the profile name and
the optional / and // separators are provided as a convenience for those
familiar with ssh and protocol urls." (see [1])
[1]: https://gitlab.com/apparmor/apparmor/-/wikis/AppArmorNamespaces
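A minimal Python sketch of splitting such a label, following the quote
above. The function name is hypothetical, and the sketch deliberately
ignores stacked labels and nested namespaces.

```python
from typing import Optional, Tuple


def split_label(label: str) -> Tuple[Optional[str], str]:
    """Split an AppArmor label into (namespace, profile)."""
    if not label.startswith(":"):
        return None, label  # plain profile name, no namespace part
    # The colon after the namespace name is the real separator.
    ns, sep, profile = label[1:].partition(":")
    if not sep:
        raise ValueError(f"unterminated namespace in {label!r}")
    # A "//" after that colon is only a convenience and carries no meaning.
    if profile.startswith("//"):
        profile = profile[2:]
    return ns, profile
```

Both `:ns1:profile` and `:ns1://profile` come out as the same pair.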
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
The original patches didn't pass down the "suspend" boolean into
write_aa_policy() and so suspend never really happened. Pass it down.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
v2: use a profile that doesn't have "unix" to test the suspend feature too
v3: use "/" in the profile names to make sure this works
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Support for apparmor namespaces and stacking is coming to Ubuntu kernels in
16.10, and should hopefully be upstreamed Soon (TM) :).
The basic idea is similar to how cgroups are done: we can restore the
apparmor namespace and profile blobs independently of the tasks, and then
at the end we can just set the task's label appropriately. This means the
code that moves tasks under a label stays the same, and the only new code
is the stuff that dumps and restores the policy blobs that are in the
namespace that were loaded by the container.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Instead of collecting profiles right at thread dump time, let's collect
them all at a single point. This way, when we add support for suspending
LSMs, we know what profiles if any we need to suspend.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
This is another attempt to introduce a tool to format CRIU's source
code. This time it is based on clang-format.
The .clang-format file is taken from the linux kernel git tree (5.13).
I removed all comments from lines which state that it requires at least
clang-format 4 or 5. For this resulting file at least clang-format 11
is required. See scripts/fetch-clang-format.sh for all the changes
done to the Linux kernel .clang-format file.
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Fedora Rawhide updated to a glibc using clone3(). clone3() is, however,
not yet part of the seccomp filter. Unfortunately 'docker build' does
not allow dropping seccomp but luckily 'podman build' does.
This switches the Fedora Rawhide test to use Podman. Podman is part of
GitHub Actions and no additional packages need to be installed.
Signed-off-by: Adrian Reber <areber@redhat.com>
If the retcode of dump_posix_timers is not zero, it is treated as an error
in compel_rpc_sync. Currently we can return the positive overrun of the
last timer (if we are lucky and it is not zero) as the retcode of
dump_posix_timers; let's fix it. Also, I don't see any point in putting a
negative value into .overrun on the error path, so fix that too.
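The intended behaviour, modelled in Python rather than in the parasite
C code (the function signature here is an assumption for illustration;
a negative input stands for a failed overrun query):

```python
from typing import List, Sequence, Tuple


def dump_posix_timers(overruns: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (retcode, dumped overruns).

    retcode is 0 on success no matter how large the last overrun is,
    and -1 on failure; a failed query is never stored as an overrun.
    """
    dumped: List[int] = []
    for overrun in overruns:
        if overrun < 0:
            return -1, dumped  # error path: do not record the bad value
        dumped.append(overrun)
    return 0, dumped  # success: never leak the last overrun as retcode
```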
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Resolving the real pid to a vpid for notify thread ids requires the NSpid
feature to be supported by the kernel, though in the simple non-pid-ns case
we can do without it, so add a requirement and split out the host test
without the requirement.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
1) If all dumped processes are in host pidns we can skip pid conversion
logic and just use real pid.
2) If we have pidns to dump we should also have kernel NSpid feature,
else we should fail to dump notify thread id, as it's not possible to
properly convert rpid to vpid.
While at it, let's move the code into an encode_notify_thread_id helper to
improve code readability.
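The two cases can be sketched in Python as below. This is a model of
the decision logic only: the parameter names and the rpid_to_vpid
callback are hypothetical, not the helper's real signature.

```python
from typing import Callable, Optional


def encode_notify_thread_id(
    rpid: int,
    has_pidns: bool,
    has_nspid: bool,
    rpid_to_vpid: Callable[[int], Optional[int]],
) -> int:
    """Return the thread id to store for a timer's notify thread."""
    if not has_pidns:
        # 1) Everything lives in the host pidns: the real pid is the vpid.
        return rpid
    if not has_nspid:
        # 2) A pidns is dumped but the kernel cannot tell us the vpid.
        raise RuntimeError("NSpid is required to convert rpid to vpid")
    vpid = rpid_to_vpid(rpid)
    if vpid is None:
        raise LookupError(f"no vpid found for rpid {rpid}")
    return vpid
```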
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
On Centos7 we don't have NSpid field in /proc/[pid]/status so for
compatibility let's skip it.
(cherry-pick one hunk from Virtuozzo commit
https://src.openvz.org/projects/OVZ/repos/criu/commits/c6d0ee567c)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
If there is a nested pid_ns, we need to be able to get the pid in
the whole pid hierarchy. This can be taken only from the
"/proc/[pid]/status" file. Check that the kernel has support for it.
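For illustration, reading the field looks roughly like this in Python
(a sketch with a hypothetical function name; the actual check is done
in C):

```python
from typing import List, Optional


def parse_nspid(status_text: str) -> Optional[List[int]]:
    """Extract the NSpid list from the text of /proc/[pid]/status.

    The field holds one pid per nested pid namespace, outermost first.
    Returns None when the kernel does not provide the field at all.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return None
```

A None result is the "no kernel support" case the check has to detect.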
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
NSpid is not (yet) supported on Centos7 thus we need this check for
compatibility.
(cherry-picked from Virtuozzo criu commit
https://src.openvz.org/projects/OVZ/repos/criu/commits/94f4653f20)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
When sigev_notify_thread_id is not set, get_pid will return a NULL
pointer and do_timer_create will return -EINVAL in the kernel. So criu
will fail to create the posix timer:
(09.806760) pie: 41301: Error (criu/pie/restorer.c:1998): Can't restore posix timers -22
(09.806824) pie: 41301: Error (criu/pie/restorer.c:2133): Restorer fail 41301
(09.891880) Error (criu/cr-restore.c:2596): Restoring FAILED.
Signed-off-by: Liu Chao <liuchao173@huawei.com>