mirror of https://github.com/checkpoint-restore/criu synced 2025-09-05 00:35:23 +00:00

11124 Commits

Author SHA1 Message Date
Radostin Stoyanov
06306c8b1e coredump: drop exec permission
The shebang line in this file was removed in a previous commit and the
file should be non-executable.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
1b368238b5 coredump: drop unused variable
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
a92a7887a6 python: replace equality with identity test
PEP8 recommends for comparisons to singletons like None to always be
done with 'is' or 'is not', never the equality operators.

https://python.org/dev/peps/pep-0008/#programming-recommendations

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
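The difference between the equality and identity tests described in the commit above can be seen in a minimal sketch. The class AlwaysEqual is hypothetical and exists only for illustration; it is not part of criu-coredump.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# Equality delegates to __eq__, so this object compares equal to None.
equal_to_none = (obj == None)  # noqa: E711 (deliberately not PEP8-clean)

# Identity asks whether both names refer to the very same object.
is_none = obj is None
```

Because a class can override __eq__, only the identity test reliably tells whether a value really is None.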
Radostin Stoyanov
c71a81a6bd coredump: convert indentation to spaces
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
bf8a3c9f62 coredump: sort imports
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
a0b738cb8f coredump: remove unused import
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
AndreyVV-100
1c866dbb51 Add new files for running criu-coredump via python 2 or 3
Previous commit added support for python3 in criu-coredump. For convenience,
add two files (coredump-python2 and coredump-python3) that start
criu-coredump with respective python version. Edit env.sh accordingly.

Signed-off-by: Andrey Vyazovtsev <viazovtsev.av@phystech.edu>
2022-04-28 17:53:52 -07:00
Andrey Vyazovtsev
3180d35fa4 Add support for python3 in criu-coredump
Resolve the following python3 portability issues:

1) Python 3 needs explicit relative import path.

2) Coredumps are binary data, not unicode strings. Use byte strings
(b"" instead of "") and open files in binary format.

3) Some functions (for example: filter) return a list in python 2,
but an iterator in python 3. Port code to a common subset of python 2
and python 3 using itertools.

4) Division operator / changed meaning in Python 3. Use explicit
integer division (//) where appropriate.

Signed-off-by: Andrey Vyazovtsev <viazovtsev.av@phystech.edu>
2022-04-28 17:53:52 -07:00
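Points 2 to 4 of the commit above can be illustrated with a small sketch that behaves the same on Python 2 and Python 3. The function names are hypothetical and not taken from criu-coredump, and the sketch wraps filter() in list() instead of using itertools as the commit does. Point 1 (explicit relative imports such as "from . import module") needs a package layout and is not shown.

```python
# 2) Core dumps are binary data: compare against byte strings.
ELF_MAGIC = b"\x7fELF"


def has_elf_magic(data):
    """Return True if the byte string starts with the ELF magic."""
    return data[:4] == ELF_MAGIC


# 3) filter() returns a list on Python 2 but an iterator on Python 3.
#    Wrapping it in list() gives the same type on both.
def nonzero_sizes(sizes):
    return list(filter(lambda size: size != 0, sizes))


# 4) "/" is true division on Python 3; "//" is integer division on both.
def pages_needed(nbytes, page_size=4096):
    """Round a byte count up to whole pages using integer division."""
    return (nbytes + page_size - 1) // page_size
```

Files holding such data would be opened with mode "rb" or "wb" so that reads and writes deal in bytes rather than text.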
Bui Quang Minh
f24360658f criu(8): Add more detailed description about --tcp-close dump option
The expected behavior of the --tcp-close option when dumping is to close
all established TCP connections, including connections that were once
established but are now closed. This adds an explicit description of
that behavior.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2022-04-28 17:53:52 -07:00
Bui Quang Minh
abf6b15c14 zdtm: Dumping/restoring with --tcp-close on TCP_CLOSE socket
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2022-04-28 17:53:52 -07:00
Bui Quang Minh
7959730551 tcp: Skip restoring TCP state when dumping with --tcp-close
Since commit e42f5e0 ("tcp: allow to specify --tcp-close on dump"),
the --tcp-close option can be used when checkpointing. This option skips
checkpointing the state of established sockets (including sockets that
were once established but are now closed). However, when restoring, we
still try to restore the closed socket's state. As a result, a
non-existent protobuf image is opened.

This commit skips TCP_CLOSE sockets when restoring established TCP
connections and removes the redundant check for TCP_LISTEN sockets, as
a TCP_LISTEN socket cannot reach this function.

Suggested-by: Andrei Vagin <avagin@gmail.com>
Suggested-by: Radostin Stoyanov <radostin@redhat.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
2022-04-28 17:53:52 -07:00
Rajneesh Bhardwaj
74d1233b59 criu/files: Don't cache fd ids for device files
The restore operation fails when we checkpoint/restore multiple
independent processes that have device files, because criu caches
the ids for device files with the same (mnt_id, inode) pair. This
change ensures that even when a cached id is found for a device, a
unique subid is generated and returned, which is used for dumping.

Suggested-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
2022-04-28 17:53:52 -07:00
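The caching behavior described in the commit above can be modelled roughly as follows. This is only a sketch under assumptions: the class FdIdCache, its method names and the use of 0 as the "no subid" value are hypothetical and do not mirror CRIU's actual C data structures.

```python
import itertools


class FdIdCache:
    """Toy model of an id cache keyed by (mnt_id, inode)."""

    def __init__(self):
        self._ids = {}
        self._next_id = itertools.count(1)
        self._next_subid = itertools.count(1)

    def lookup(self, mnt_id, inode, is_device=False):
        """Return (id, subid) for a file.

        Regular files that share (mnt_id, inode) share the cached id.
        Device files also get a fresh subid on every lookup, so two
        independent processes never end up with the same pair.
        """
        key = (mnt_id, inode)
        if key not in self._ids:
            self._ids[key] = next(self._next_id)
        fd_id = self._ids[key]
        if is_device:
            return fd_id, next(self._next_subid)
        return fd_id, 0
```

In this model two lookups of the same device file return the same id but different subids, which is the property the commit message asks for.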
Rajneesh Bhardwaj
7b6239b6dd criu/plugin: Implement dummy amdgpu plugin hooks
This is just a placeholder dummy plugin and will be replaced by a proper
plugin that implements support for AMD GPU devices. This just
facilitates the initial pull request and CI build test trigger for early
code review of CRIU specific changes. Future PRs will bring in more
support for amdgpu_plugin to enable CRIU with AMD ROCm.

Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
2022-04-28 17:53:52 -07:00
Rajneesh Bhardwaj
17e2a8c709 criu: Introduce new device file plugin hooks
Currently CRIU cannot handle checkpoint/restore operations when a device
file is involved in a process. CRIU allows flexible extensions via
special plugins, but for certain complex devices such as a GPU the
existing hooks are not sufficient. This introduces a few new hooks
that will be used to support checkpoint/restore operations with AMD GPU
devices and potentially other similar devices too.

 - HANDLE_DEVICE_VMA
 - UPDATE_VMA_MAP
 - RESUME_DEVICES_LATE

 *HANDLE_DEVICE_VMA:
	Hook to detect a suitable plugin to handle device file VMA with
	PF | IO mappings.

 *UPDATE_VMA_MAP:
	Hook to handle VMAs during a device file restore.

	When restoring VMAs for the device files, criu runs sys_mmap in
	the pie restore context but the offsets and file path within a
	device file may change during restore operation so it needs to be
	adjusted properly.

 *RESUME_DEVICES_LATE:
	Hook to do some special handling in late restore phase.

	During the criu restore phase, when a device is getting restored
	with the help of a plugin, some device specific operations might
	need to be delayed until criu finalizes the VMA placements in the
	address space of the target process. But by the time criu
	finalizes this, it's too late, since the pie phase is over and
	control is back in the criu master process. This hook allows an
	external trigger to each resuming task to check whether it has a
	device specific operation pending, such as issuing an ioctl call.
	Since this is called from the criu master process context, supply
	the pid of the target process and give each registered plugin a
	chance to run its device specific operation if the target pid is
	valid.

A future patch will add consumers for these plugin hooks to support AMD
GPUs.

Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
2022-04-28 17:53:52 -07:00
Radostin Stoyanov
dd46e79196 criu(8): add --external net option
Support for external net namespaces has been introduced with
commit c2b21fbf (criu: add support for external net namespaces).

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2022-04-28 17:53:52 -07:00
Andrei Vagin
be239109a8 github: update the stale version
https://github.com/actions/stale/issues/708
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2022-04-06 10:57:20 -07:00
Adrian Reber
4a1731891e criu: Version 3.16.1
* Switch criu-ns from unversioned 'python' to 'python3'
  for easier distribution packaging
* Add '--join-ns' interface to libcriu to allow joining
  namespaces via libcriu like CLI and RPC already allow

Signed-off-by: Adrian Reber <areber@redhat.com>
v3.16.1
2021-10-13 22:44:30 -07:00
Radostin Stoyanov
62b3779574 Makefile: add shellcheck test/others/libcriu/*.sh
Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-10-12 12:58:43 -07:00
Radostin Stoyanov
59d0dfba96 test/libcriu: print logs on fail
run_test was trying to read criu logs on build failure
instead of runtime error.

This patch also removes the unnecessary subfolder with name "i"
and resolves some of the issues reported by shellcheck.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-10-12 12:58:43 -07:00
Radostin Stoyanov
53bf82bcfc test/libcriu: add test case for join-ns
This test case aims to verify that CRIU correctly
restores a process in IPC, UTS and Time namespaces
with criu_join_ns_add() libcriu API.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-10-12 12:58:43 -07:00
Radostin Stoyanov
a8c5efe4c1 libcriu: define log level constants
Replace magic numbers used to set log level in libcriu
with constants.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-10-12 12:58:43 -07:00
Radostin Stoyanov
5ec2a6aaad libcriu: add join_ns API
In runc we use the join-ns RPC API to enable checkpoint/restore of
containers with shared namespaces. Shared namespaces are often used
when containers run inside a Kubernetes Pod.

In crun we use libcriu to interface with CRIU, however it currently
doesn't provide an API for join-ns. This patch adds the necessary
libcriu API to enable checkpoint/restore of containers with shared
namespaces with crun.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-10-12 12:58:43 -07:00
Radostin Stoyanov
f2cdb062aa Makefile: install criu-ns only with python3
Python 2 has been deprecated since January 1, 2020 and linux distributions
already support Python 3. Thus, to simplify maintenance and packaging
we could support criu-ns as Python 3 only.

v2: Add a message for criu-ns installation

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-10-12 12:58:43 -07:00
Radostin Stoyanov
a15a63fce0 criu-ns: change python shebang to python3
PEP 394 recommends changing python shebangs to python3 when Python 3.x
is supported. This is similar to `crit-python3`.

https://www.python.org/dev/peps/pep-0394/

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-10-12 12:58:43 -07:00
Adrian Reber
000ea82669 criu: Version 3.16
Amongst a huge number of fixes all over the place, and the move away from
Travis, this release introduces:

* better support for restoring containers into existing pods
* pidfd based pid reuse detection for RPC clients
* allow restoring of precreated veth devices
* license change for all files in the images/ directory to MIT
* criu-ns helper script
* use clang-format for automatic code indentation
* support checkpoint/restore of stacked apparmor profiles
* [GSoC] Add nftables based network locking/unlocking (Zeyad Yasser)

Signed-off-by: Adrian Reber <areber@redhat.com>
v3.16
2021-09-22 08:22:46 -07:00
Radostin Stoyanov
8567a09524 ci: Update openj9 container images
The AdoptOpenJDK docker images have been deprecated in favor of
eclipse-temurin.

https://github.com/AdoptOpenJDK/openjdk-docker/pull/604

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-09-22 08:21:06 -07:00
Nicolas Viennot
0b2a7223bb mount: fix double-dump file system bug
In the function dump_one_fs(), there's a comment that says "mnt_bind is
a cycled list, so list_for_each can't be used here." That means that the
list head of the list is also a node of the list.

The subsequent list_for_each_entry() marks all the mount info nodes as
dumped, except it skips the list head, which is also a mount info.
This is the bug we fix.

This bug made CRIU dump a file system twice.
See https://github.com/checkpoint-restore/criu-image-streamer/issues/8

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
2021-09-17 10:42:21 -07:00
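The cyclic-list pitfall described in the commit above can be reproduced in a few lines. This is only an illustrative model: the MountInfo class and the helper names are hypothetical, and the real code uses the kernel-style list macros in C.

```python
class MountInfo:
    """Toy mount info node that is also a member of a circular list."""

    def __init__(self, name):
        self.name = name
        self.dumped = False
        self.next = self  # a single node forms a cycle with itself


def link_cycle(nodes):
    """Link the nodes into one circular list."""
    for i, node in enumerate(nodes):
        node.next = nodes[(i + 1) % len(nodes)]


def mark_others_dumped(head):
    """Buggy variant: walks the cycle but never visits the head itself."""
    node = head.next
    while node is not head:
        node.dumped = True
        node = node.next


def mark_all_dumped(head):
    """Fixed variant: the head is a mount info too, so mark it as well."""
    head.dumped = True
    mark_others_dumped(head)
```

With three linked nodes, mark_others_dumped(a) leaves a.dumped False, so a later pass would dump that file system again; mark_all_dumped(a) marks every node.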
Radostin Stoyanov
bea9580e3c gitignore: add build directory
The build directory is generated when running make install.

build/ was initially added to gitignore with commit 967797a (Add build
directory to gitignore) and it was accidentally removed.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-09-17 10:42:21 -07:00
Radostin Stoyanov
4db8ef15ce podman-test: use crun from git repository
This patch allows testing the integration of libcriu
with the upstream crun.

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-09-17 10:42:21 -07:00
Radostin Stoyanov
6a15dbdefa lib: install images/rpc.pb-c.h
Since commit 1c25914 compiling crun with libcriu also requires
/usr/include/criu/rpc.pb-c.h

Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
2021-09-17 10:42:21 -07:00
Pavel Tikhomirov
c6b5e7d923 sk-unix: fix prep_unix_sk_cwd root and cwd restoring
If we save root and cwd after we have already switched to the mntns, we
save a different root and cwd from what we had before prep_unix_sk_cwd:
we just save the default root/cwd of the new mntns. Let's fix it by
saving them in the proper order.

Also, while at it, let's fix the ns_fd leak on the switch_ns_by_fd error path.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-17 10:42:21 -07:00
Pavel Tikhomirov
f0e968ffed binfmt_misc: restore current work directory after restoring mnt ns
Otherwise criu would change cwd during binfmt_misc leftover mount cleanup.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-17 10:42:21 -07:00
Pavel Tikhomirov
776f3cff7c autofs: restore current work directory after restoring mnt ns
Otherwise criu would change cwd while dumping autofs mounts.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2021-09-17 10:42:21 -07:00
Liu Hua
45409c35d6 mount: use switch_mnt_ns/restore_mnt_ns helpers to simplify code
Signed-off-by: Liu Hua <weldonliu@tencent.com>
2021-09-17 10:42:21 -07:00
Liu Hua
f79d15c44d binfmt_misc: restore current work directory after restoring mnt ns
Restore the current working directory after restoring the mnt ns;
otherwise "stats-dump" will be written to /

Signed-off-by: Liu Hua <weldonliu@tencent.com>
2021-09-17 10:42:21 -07:00
Liu Hua
eea63587e6 namespaces: add helpers to switch/restore mnt ns
When restoring the mnt ns, we should also restore the current working directory.

Signed-off-by: Liu Hua <weldonliu@tencent.com>
2021-09-17 10:42:21 -07:00
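The save-then-restore pattern behind these helpers can be sketched in Python. This is an analogy only: preserve_cwd and demo are hypothetical names, a plain os.chdir() stands in for the mount namespace switch, and the real helpers are written in C. The sketch assumes a Unix-like system where os.fchdir() is available.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def preserve_cwd():
    """Remember the current working directory and return to it on exit."""
    saved = os.open(".", os.O_RDONLY)  # keep an fd to the cwd *before* switching
    try:
        yield
    finally:
        os.fchdir(saved)  # go back even if the body changed directory
        os.close(saved)   # do not leak the descriptor on any path


def demo():
    """Change directory inside the guard and report cwd before and after."""
    before = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        with preserve_cwd():
            os.chdir(tmp)  # stand-in for work done in another namespace
        after = os.getcwd()
    return before, after
```

Saving the descriptor before the switch matters: taken afterwards, it would point at the new location rather than the original one.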
liuchao173
41f4489682 remove tls parameter description if without GnuTLS support
Signed-off-by: SuperSix173 <liuchao173@huawei.com>
2021-09-17 10:42:21 -07:00
Zeyad Yasser
d879220996 kerndat: create separate netns for has_nftables_concat check
There was a race window in kerndat_has_nftables_concat(void) when
two or more instances of CRIU are running in parallel.
One instance would create the table normally, and another would
fail when trying to create a table with the same name, wrongly
set kdat.has_nftables_concat = false, and save it to the kerndat
cache.

A separate netns is created to avoid table collisions.

v2: use call_in_child_process helper instead of fork

Co-developed-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
aa772bf286 zdtm: fix network lock tests when run with --norst
In test/jenkins/{crit.sh,criu-dump}, ZDTM is run with --norst,
causing tests to only go through dump without restoring.

The network locking tests are highly dependent on dump/restore hooks,
causing them to hang when run with --norst.

We just add a reqrst flag to all network lock tests.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
9838d34ded criu: use unique table names for nftables based locking
During network locking CRIU creates an nftables table to
add the needed rules. If more than one instance of CRIU runs
in parallel, those tables' names would conflict.

The solution is to append the root task pid to the nftables table
name as a postfix (e.g. inet CRIU-3231).

We also need to use `create table` instead of `add table`,
because `create` returns an error if the table name
already exists, so we can detect conflicts if they happen.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
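The naming scheme and the add/create distinction from the commit above can be modelled as follows. FakeRuleset is a hypothetical in-memory stand-in used only for illustration; it is not the libnftables API, and the helper names are assumptions.

```python
def table_name(root_pid):
    """Build a per-instance table name by appending the root task pid."""
    return "CRIU-%d" % root_pid


def create_table_cmd(root_pid):
    """Build the nft command text for creating the per-instance table."""
    return "create table inet %s" % table_name(root_pid)


class FakeRuleset:
    """In-memory stand-in for an nftables ruleset."""

    def __init__(self):
        self.tables = set()

    def add_table(self, name):
        # Like `add table`: succeeds silently if the table already exists.
        self.tables.add(name)

    def create_table(self, name):
        # Like `create table`: reports an error if the table already exists.
        if name in self.tables:
            raise FileExistsError(name)
        self.tables.add(name)
```

With add_table a name collision would go unnoticed, while create_table surfaces it as an error that the caller can handle.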
Zeyad Yasser
ca3e3c50be inventory: save network lock method to reuse in restore
When the network is locked using a specific method like iptables
or nftables there is no need to require passing the same method
during restore.

We save the lock method during dump in the inventory image and
use that in restore.

This always overrides the restore --network-lock option.

v2: store opts.network_lock_method directly to avoid dependency
    on rpc.proto's 'enum criu_network_lock_method'.
v3: fall back to iptables if image is generated with an older
    version of CRIU.
v4: remove --network-lock from netns_lock_* from restore

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
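The dump/restore behavior described above (the stored method wins, older images fall back to iptables) can be sketched like this. The dictionary key and function names are hypothetical; the real inventory image is a protobuf message, not a Python dict.

```python
DEFAULT_LOCK_METHOD = "iptables"


def dump_inventory(lock_method):
    """Record the lock method used at dump time in a toy inventory."""
    return {"network_lock_method": lock_method}


def lock_method_for_restore(inventory, cli_option=None):
    """Pick the lock method to use on restore.

    The value saved at dump time always takes precedence over the
    command line option. Images written by an older version have no
    such field, so the function falls back to iptables.
    """
    return inventory.get("network_lock_method", DEFAULT_LOCK_METHOD)
```

For example, restoring an inventory dumped with "nftables" yields "nftables" even if "iptables" is passed as cli_option.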
Zeyad Yasser
cd1570b15e zdtm: add ipv6 variants of net_lock_socket_* tests
v2: remove unnecessary elif and else after return in
    wait_server_addr()
v3: use IOError instead of FileNotFoundError for python2
    compatibility

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
212db1d9a6 zdtm: add nftables per-socket locking test
This is just a symlink to the original static/net_lock_socket_iptables
test with the right options passed to use nftables instead.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
826d3d7407 criu: add nftables connection locking/unlocking
This adds nftables based connection locking as an alternative
to iptables. This avoids the external dependency on
iptables-restore.

It works by creating a 'connection set', which is a set of
connection identifying tuples. Rules are added to drop packets that
match the connection tuples in the set. Locking is now reduced to
just adding the connection identifying tuple to the set.

Unlocking is just as simple as deleting the CRIU table.

v2: split ip string conversion into two if conditions
v3: add better message when CRIU is built without libnftables support
v4: fix indentation in nftables_lock_connection_raw()
v5: move 'ret = -1' below err: label to avoid redundancy
v6: add better error message on lock failure
v7: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
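The "connection set" idea from the commit above can be modelled in memory. This is a toy sketch: NftConnectionLock and its methods are hypothetical names, and real locking goes through nftables rules rather than Python sets.

```python
class NftConnectionLock:
    """Toy model of set-based connection locking (not the libnftables API)."""

    def __init__(self):
        # Maps table name -> set of (src_ip, src_port, dst_ip, dst_port).
        self.tables = {}

    def lock_connection(self, table, src_ip, src_port, dst_ip, dst_port):
        """Locking reduces to adding the identifying tuple to the set."""
        conn = (src_ip, src_port, dst_ip, dst_port)
        self.tables.setdefault(table, set()).add(conn)

    def is_dropped(self, table, src_ip, src_port, dst_ip, dst_port):
        """A packet is dropped if its connection tuple is in the set."""
        conn = (src_ip, src_port, dst_ip, dst_port)
        return conn in self.tables.get(table, set())

    def unlock(self, table):
        """Unlocking deletes the whole table, and the set along with it."""
        self.tables.pop(table, None)
```

One drop rule matching the set covers every locked connection, so each additional lock is a set insertion rather than a new rule.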
Zeyad Yasser
6e59b2bd77 zdtm: add iptables per-socket locking test
When criu dumps a process with the --tcp-established option, it locks
the open TCP connections so that no packets from the peer enter
the stack; otherwise an RST would be sent by the kernel, causing the
connection to fail.

Post-start hook creates a connection with the test server and
creates a background thread that stays alive for the duration
of the test. This background thread sends data to the test
server at three stages:
- Pre-dump: Should send normally
- Pre-restore:
	If the connection is locked properly, packets will be dropped
	and TCP will just retry, so the data will eventually be sent
	when the process is restored and the network is unlocked.
- Post-restore: Should send normally

Data sent at the three stages is then checked at the server's side.

v2:
	- remove unused imports and constants
	- delete sync file in wait_sync_file() instead of --clean
v3:
	- add comments

Co-developed-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
c15327656a zdtm: add nftables network namespace locking test
This is just a symlink to the original static/netns_lock test with
the right options passed to use nftables instead.

v2:
	- make static/netns_lock test iptables explicitly
	- prevent netns_lock tests from running in parallel because
	  netns & sync files creation were conflicting in both tests.

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
19cc0bfa65 criu: add nftables netns-wide locking/unlocking
This adds nftables based internal network locking as an
alternative to iptables. This avoids the external dependency
on iptables-restore.

v2: fix indentation & rename 'free' label to 'out'
v3: add better message when CRIU is built without libnftables support
v4:
	- move 'ret = -1' below err: label to avoid redundancy
	- fix nft ctx memory leak in case of success in
	  nftables_network_unlock()
v5: add better error message on lock failure

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
f246ca56c0 criu: rename iptables network locking/unlocking functions
Related to the new --network-lock option: other methods for network
locking/unlocking, such as nftables, will be added as alternatives to
iptables.

This option is used in the core network locking/unlocking hooks to
decide which method should be used, making it easier to add new
methods later.
The core hooks are:
	- network_lock_internal
	- network_unlock_internal
	- lock_connection (renamed from nf_lock_connection)
	- unlock_connection (renamed from nf_unlock_connection)
	- unlock_connection_info (renamed from nf_unlock_connection_info)

nf_* functions are renamed to iptables_* to avoid confusion with
other netfilter methods in the future like nftables.

v2: run make indent
v3: make error messages more descriptive

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
e9d24a2ba3 cr-check: add check for nftables based network locking
Nftables based network locking/unlocking will be added later.

Nftables sets will be used to load the connection tuples that
will be locked. To be able to store those tuples, we need to
check for "Set Concatenations" support.

https://wiki.nftables.org/wiki-nftables/index.php/Concatenations

v2: fix 'has_nftables_concat=true' when adding CRIU table fails
v3: add better message when CRIU is built without libnftables support
v4: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00
Zeyad Yasser
b85fad797c cr-service: add network_lock option to RPC and libcriu
v2: run make indent

Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
2021-09-03 10:31:00 -07:00