criu-3.12/criu/net.c:2043: overwrite_var: Overwriting "img" in "img =
open_image_at(-1, CR_FD_IP6TABLES, 0UL, pid)" leaks the storage that
"img" points to.
Signed-off-by: Adrian Reber <areber@redhat.com>
The print_data() function was part of the deprecated (and since removed)
'show' action, and it was moved into util.c by the following commit:
a501b4804b
The 'show' action has been deprecated since 1.6, let's finally drop it.
The print_data() routine is kept for yet another (to be deprecated too)
feature called 'criu exec'.
The criu exec feature was removed with:
909590a355
Remove criu exec code
It's now obsoleted by compel library.
Maybe-TODO: Add compel tool exec action?
Therefore, now we can drop print_data() as well.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
One can pass --stream to zdtm.py for testing criu with image streaming.
criu-image-streamer should be installed in ../criu-image-streamer
relative to the criu project directory. But any path will do, provided
that criu-image-streamer can be found in PATH.
Added a few tests to run on travis-ci to make sure streaming works.
We run the tests that are likely to fail. However, it would be good to
run all tests with `--stream -a` once in a while.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
This adds the ability to stream images with criu-image-streamer.
The workflow is the following:
1) criu-image-streamer is started, and starts listening on a UNIX
socket.
2) CRIU is started. img_streamer_init() is invoked, which connects to the
socket. During dump/restore operations, instead of using local disk to
open an image file, img_streamer_open() is called to provide a UNIX pipe
that is sent over the UNIX socket.
3) Once the operation is done, img_streamer_finish() is called, and the
UNIX socket is disconnected.
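The fd hand-off in step 2 relies on passing a file descriptor over a
UNIX socket with SCM_RIGHTS. A minimal sketch of that mechanism follows.
send_fd() and recv_fd() are hypothetical names, not the img_streamer_*
functions, and the sketch does not show which side creates the pipe or
any of the protocol used by criu-image-streamer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send 'fd' over the connected UNIX socket 'sock'. Returns 0 or -1. */
static int send_fd(int sock, int fd)
{
	char byte = 0; /* one byte of ordinary data carries the ancillary message */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(sock >= 0 && fd >= 0);
	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from 'sock'. Returns the new fd or -1. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiver gets its own descriptor that refers to the same pipe end, so
the sender may close its copy after sendmsg() returns.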
criu-image-streamer can be found at:
https://github.com/checkpoint-restore/criu-image-streamer
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Instead of erroring, we should loop until we get the desired number of
bytes written, like regular I/O loops.
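A minimal sketch of such a loop, assuming a blocking file descriptor.
write_all() is a hypothetical name used for illustration only, not the
function touched by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write exactly 'len' bytes, retrying on short writes and EINTR. Returns 0 or -1. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf != NULL || len == 0);
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted before anything was written */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```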
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
When CRIU calls the ip tool on restore, it passes the fd of the remote
socket by replacing STDIN before execvp. The ip tool reads its input
from stdin. However, the ip tool calls ftell(stdin),
which fails with "Illegal seek" since UNIX sockets do not support file
positioning operations. To resolve this issue, read the received
content from the UNIX socket, store it in a temporary file, then
replace STDIN with the fd of this temporary file.
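A minimal sketch of the workaround, not the code added by this patch.
spool_to_tmpfile() and the /tmp template are assumptions made for
illustration, and EINTR handling is left out for brevity.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Drain 'sock_fd' until EOF into an unlinked temporary file and return a
 * seekable fd positioned at offset 0, or -1 on error.
 */
static int spool_to_tmpfile(int sock_fd)
{
	char tmpl[] = "/tmp/spool-XXXXXX"; /* hypothetical location */
	char buf[4096];
	ssize_t n;
	int fd;

	assert(sock_fd >= 0);
	fd = mkstemp(tmpl);
	if (fd < 0)
		return -1;
	unlink(tmpl); /* the file stays alive until fd is closed */

	while ((n = read(sock_fd, buf, sizeof(buf))) > 0) {
		ssize_t off = 0;

		while (off < n) {
			ssize_t w = write(fd, buf + off, (size_t)(n - off));

			if (w < 0) {
				close(fd);
				return -1;
			}
			off += w;
		}
	}
	if (n < 0 || lseek(fd, 0, SEEK_SET) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The returned fd can then be installed with dup2(fd, STDIN_FILENO) before
execvp, so that ftell(stdin) in the child works on a regular file.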
# python test/zdtm.py run -t zdtm/static/env00 --remote -f ns
=== Run 1/1 ================ zdtm/static/env00
========================= Run zdtm/static/env00 in ns ==========================
Start test
./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
Adding image cache
Adding image proxy
Run criu dump
Run criu restore
=[log]=> dump/zdtm/static/env00/31/1/restore.log
------------------------ grep Error ------------------------
RTNETLINK answers: File exists
(00.229895) 1: do_open_remote_image RDONLY path=route-9.img snapshot_id=dump/zdtm/static/env00/31/1
(00.230316) 1: Running ip route restore
Failed to restore: ftell: Illegal seek
(00.232757) 1: Error (criu/util.c:712): exited, status=255
(00.232777) 1: Error (criu/net.c:1479): IP tool failed on route restore
(00.232803) 1: Error (criu/net.c:2153): Can't create net_ns
(00.255091) Error (criu/cr-restore.c:1177): 105 killed by signal 9: Killed
(00.255307) Error (criu/mount.c:2960): mnt: Can't remove the directory /tmp/.criu.mntns.dTd7ak: No such file or directory
(00.255339) Error (criu/cr-restore.c:2119): Restoring FAILED.
------------------------ ERROR OVER ------------------------
################# Test zdtm/static/env00 FAIL at CRIU restore ##################
##################################### FAIL #####################################
Fixes #311
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
When saddr.ss_family is AF_INET6 we should cast &saddr to
(struct sockaddr_in6 *).
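A minimal sketch of family-dependent casting. saddr_to_str() is a
hypothetical helper, not the function changed by this patch.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Format the address stored in 'saddr' into 'buf'. Returns buf or NULL. */
static const char *saddr_to_str(const struct sockaddr_storage *saddr,
				char *buf, socklen_t len)
{
	assert(saddr != NULL && buf != NULL);
	switch (saddr->ss_family) {
	case AF_INET: {
		const struct sockaddr_in *a4 = (const struct sockaddr_in *)saddr;

		return inet_ntop(AF_INET, &a4->sin_addr, buf, len);
	}
	case AF_INET6: {
		/* IPv6 addresses live in a different structure layout */
		const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)saddr;

		return inet_ntop(AF_INET6, &a6->sin6_addr, buf, len);
	}
	default:
		return NULL;
	}
}
```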
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
The newest version of flake8 reports an error for variable names like
'l', because they are hard to read.
This changes 'l' to 'line' to make flake8 happy.
Signed-off-by: Adrian Reber <areber@redhat.com>
With the latest version of the alpine container image it seems that
alpine changed a few package names. This adapts the alpine container
to solve the travis failures.
Signed-off-by: Adrian Reber <areber@redhat.com>
When using zdtm.py with --tls, tests started to fail because the
certificates seem to have expired. The following commands were used to
regenerate the certificates:
# Generate CA key and certificate
echo -ne "ca\ncert_signing_key" > temp
certtool --generate-privkey > cakey.pem
certtool --generate-self-signed \
--template temp \
--load-privkey cakey.pem \
--outfile cacert.pem
# Generate server key and certificate
echo -ne "cn=$HOSTNAME\nencryption_key\nsigning_key" > temp
certtool --generate-privkey > key.pem
certtool --generate-certificate \
--template temp \
--load-privkey key.pem \
--load-ca-certificate cacert.pem \
--load-ca-privkey cakey.pem \
--outfile cert.pem
rm temp cakey.pem
Without this, tests will fail in Travis.
Signed-off-by: Adrian Reber <areber@redhat.com>
The long-awaited release with lots of new features on board.
We finally have time namespace support, a great improvement in
pre-dump memory consumption, new clone3 support and much
more.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
When testing runc checkpointing, I frequently see the following error:
> Error (criu/mount.c:1107): mnt: Can't create a temporary directory: Read-only file system
This happens because the container root is a read-only mount.
The error here is not actually fatal since it is handled later
in ns_open_mountpoint() (at least since [1] is fixed), but it is shown
as error in runc integration tests.
Since it is not fatal, let's demote it to a warning to avoid confusion.
[1] https://github.com/checkpoint-restore/criu/issues/520
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Clock IDs in this file have been replaced by symbolic clock names.
Now it looks like this:
$ cat /proc/774/timens_offsets
monotonic 864000 0
boottime 1728000 0
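A minimal sketch of parsing one line in the format shown above. The
struct and function names are hypothetical, and only the two clock names
from the example are accepted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct clock_offset {
	char name[16];
	long long sec;
	long nsec;
};

/* Parse "<clock-name> <seconds> <nanoseconds>". Returns 0 or -1. */
static int parse_timens_line(const char *line, struct clock_offset *out)
{
	assert(line != NULL && out != NULL);
	if (sscanf(line, "%15s %lld %ld", out->name, &out->sec, &out->nsec) != 3)
		return -1;
	/* numeric clock IDs are rejected, only symbolic names are valid here */
	if (strcmp(out->name, "monotonic") && strcmp(out->name, "boottime"))
		return -1;
	return 0;
}
```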
Signed-off-by: Andrei Vagin <avagin@gmail.com>
First, don't free pstree_item objects, as they are allocated with shmalloc
on restore. Second, always call pstree_entry__free_unpacked() on the
PstreeEntry. Third, replace every break with a goto err, so that it is
easier to understand that we are on the error path. Fourth, split the
code for reading one pstree item out into a separate function.
Sadly, there is not much use in xfree-ing pi->threads, because in case of
an error we still have ->threads unfreed from previous entries anyway.
But at least some cleanup can be done here.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Changed all the %u into %d.
Ideally, we should implement the %u format for parasite code.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
kerndat_socket_netns() is called twice. We keep the latter to avoid
changing the behavior.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
We should follow the Linux kernel coding style:
... the closing brace is empty on a line of its own, except in the cases
where it is followed by a continuation of the same statement, ie ... an
else in an if-statement ...
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#placing-braces-and-spaces
Automatically fixed with:
:!git grep --files-with-matches "^\s*else[^{]*{" | xargs
:argadd <files>
:argdo :%s/}\s*\n\s*\(else[^{]*{\)/} \1/g | update
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
In real-life cases the pipe_ino parameter can be larger than INT_MAX,
but autofs_parse() parses it with atoi(), which returns a
4-byte signed int. This is a bug.
Example of mount info from a real case:
(00.508286) type autofs source /etc/auto.misc mnt_id 2824 s_dev 0x4b9 / @
./misc flags 0x300000 options fd=5,pipe_ino=3480845226,pgrp=95929,timeout=300,
minproto=5,maxproto=5,indirect
3480845226 > 2147483647 (maximum value of a 32-bit signed int), so the value does not fit
This causes an error:
(03.195915) Error (criu/pipes.c:529): The packetized mode for pipes is not supported yet
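A minimal sketch of parsing such an option without truncation, assuming
a 64-bit unsigned result is acceptable. parse_opt_ull() is a hypothetical
helper, not the code in autofs_parse().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "key=<digits>" in a comma-separated option string and store the
 * value in *val. Returns 0 on success, -1 if missing or malformed.
 */
static int parse_opt_ull(const char *opts, const char *key, unsigned long long *val)
{
	size_t klen = strlen(key);
	const char *p = opts;

	assert(val != NULL);
	while (p && *p) {
		if (!strncmp(p, key, klen) && p[klen] == '=') {
			const char *num = p + klen + 1;
			char *end;

			if (!isdigit((unsigned char)*num))
				return -1; /* rejects signs and empty values */
			errno = 0;
			*val = strtoull(num, &end, 10);
			if (errno || (*end != ',' && *end != '\0'))
				return -1;
			return 0;
		}
		p = strchr(p, ','); /* advance to the next option */
		if (p)
			p++;
	}
	return -1;
}
```

With the options from the log above, the parsed pipe_ino keeps its full
value instead of being squeezed into a signed int.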
Signed-off-by: Alexander Mikhalitsyn (Virtuozzo) <alexander@mihalicyn.com>
This commit introduces an optimization when rsti(t)->vma_io is empty.
This optimization allows streaming a non-seekable image as CR_FD_PAGES
is not reopened.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
When opening an image fails with an ENOENT error, the image is
still valid. Later on, do_pb_read_one() can fail and will invoke
image_name(). The image fd is then EMPTY_IMG_FD (-404), so read_fd_link() fails.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
After restoring processes, we have to be sure that monotonic and
boottime clocks will not go backward. For this, we can restore processes
in a new time namespace and set proper offsets for the clocks.
In this patch, criu dumps clock values even when processes are running
in the host time namespace, and on restore criu creates a new time
namespace, sets the dumped clock values and restores processes.
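The arithmetic behind "proper offsets" can be sketched as the dumped
clock value minus the current value of the same clock on the restoring
host. This is an illustration under that assumption. The helper names
are hypothetical, and how the offsets are applied to the new namespace
is not shown.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Return a - b, normalized so that 0 <= tv_nsec < NSEC_PER_SEC. */
static struct timespec timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += NSEC_PER_SEC;
	}
	return r;
}

/* Offset that makes clock 'clk' continue from 'dumped'. Returns 0 or -1. */
static int clock_offset(clockid_t clk, struct timespec dumped, struct timespec *off)
{
	struct timespec now;

	assert(off != NULL);
	if (clock_gettime(clk, &now))
		return -1;
	*off = timespec_sub(dumped, now);
	return 0;
}
```

With such an offset, the restored clock starts at the dumped value, so
it cannot appear to go backward for the restored processes.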
Signed-off-by: Andrei Vagin <avagin@gmail.com>
This test checks that monotonic and boottime don't jump after C/R.
In ns and uns flavors, the test is started in a separate time namespace
with big offsets, so if criu restores a time namespace incorrectly,
the test will detect a big delta in clock values before and after C/R.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
The time namespace allows for per-namespace offsets to the system
monotonic and boot-time clocks.
C/R of time namespaces is very straightforward. On dump, criu enters the
target time namespace and dumps the current clock values; then, on restore,
criu creates a new namespace and restores the clock values.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Hope I have enough experience in the project to be nominated. I want to
help with review and will try to do my best in it.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The struct memfd_inode has a union for dump and restore parts.
The only common parts are the list_head node, and the inode id.
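A rough sketch of that layout. Only the list node, the inode id and the
dump/restore split come from the description above. Every member inside
the union, and the helper function, is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel-style list node. */
struct list_head {
	struct list_head *next, *prev;
};

struct memfd_inode_sketch {
	struct list_head list; /* common: links the inodes together */
	unsigned int id;       /* common: inode id */
	union {
		struct {       /* dump-only state (hypothetical members) */
			unsigned int dev;
			unsigned long ino;
		} dump;
		struct {       /* restore-only state (hypothetical members) */
			int fd;
			int opened;
		} restore;
	} u;
};

/* Initialize the common part and the dump view of the union. */
static void memfd_inode_init_dump(struct memfd_inode_sketch *m, unsigned int id,
				  unsigned int dev, unsigned long ino)
{
	assert(m != NULL);
	m->list.next = m->list.prev = &m->list;
	m->id = id;
	m->u.dump.dev = dev;
	m->u.dump.ino = ino;
}
```

Since dump and restore never run in the same process, the two views can
share storage and only one of them is meaningful at a time.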
Suggested-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Per-object image is acceptable if we expect to have 1-3 objects
per-container. If we expect to have more objects, it is better to save
them all into one image. There are a number of reasons for this:
* We need fewer system calls to read all objects from one image.
* It is faster to save or move one image.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
After running make install, a build directory is generated but not ignored
by .gitignore. This commit adds the build directory to .gitignore.
Signed-off-by: Byeonggon Lee <gonny952@gmail.com>
Fix n_xid_map leaks on error path and remove useless exit_code.
Fixes: 6e1726f8 ("userns: set uid and gid before entering into userns")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The helper function removes code duplication from tests that want to
initialize unix socket address to an absolute file path, derived from
current working directory of the test + relative filename of a resulting
socket. Because the former code used cwd = get_current_dir_name() as
part of absolute filename generation, the resulting filepath could later
cause failure of the bind syscall due to unchecked permissions and
introduce confusing permission errors.
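A minimal sketch of the address-building part, assuming the directory
has already been obtained and checked by the caller. fill_unix_addr() is
a hypothetical name and signature, not the helper added by this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Fill 'addr' with the absolute path "<dir>/<rel_name>".
 * Returns 0 on success, -1 if the path does not fit in sun_path.
 */
static int fill_unix_addr(struct sockaddr_un *addr, const char *dir,
			  const char *rel_name)
{
	int n;

	assert(addr != NULL && dir != NULL && rel_name != NULL);
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	n = snprintf(addr->sun_path, sizeof(addr->sun_path), "%s/%s", dir, rel_name);
	if (n < 0 || (size_t)n >= sizeof(addr->sun_path))
		return -1; /* truncated: sun_path is a small fixed-size array */
	return 0;
}
```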
Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
Any filesystem syscall that needs to navigate to an inode by its
absolute path performs successive lookup operations for each part of the
path. Each lookup operation includes an access rights check.
Usually, but not always, zdtm test processes fall under the 'other' access
category. Also, directories usually don't have the 'x' bit set for 'other'.
When the 'x' bit is not set and the user ID and group ID of a process
place it in 'other', tests will not succeed in performing these
syscalls, which make up most of the filesystem API that takes a
const char *path argument (open, openat, mkdir, bind, etc).
The observable behavior is that zdtm tests fail at file
creation ops on one system and pass on another. The cause is not
immediately clear to the developer from the failed test's logs alone.
Investigating it is also not quick, due to the
complex structure of the zdtm runtime, where nested clones with
NAMESPACE flags take place alongside bind-mounts.
As an additional note: 'get_current_dir_name' is documented as failing
with EACCES when some part of the path lacks read or search permission.
But in fact that is not always so. Practice shows that test processes can
get a false success on this operation, only to fail on a later call to
something like mkdir/mknod/bind with the given path in its arguments.
'get_cwd_check_perm' is a wrapper around 'get_current_dir_name'. It also
checks for permissions on the given filepath and logs the error. This
directs the developer towards the right investigation path or even
eliminates the need for investigation completely.
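A minimal sketch of such a wrapper, assuming an access(X_OK) check on
the returned path and logging to stderr. The name, signature and exact
checks here are illustrative and may differ from the helper in the zdtm
library.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Store the malloc()ed current directory in *result after checking that
 * the caller may search it. Returns 0 on success, -1 on error.
 */
static int get_cwd_check_perm_sketch(char **result)
{
	char *cwd;

	assert(result != NULL);
	*result = NULL;

	cwd = get_current_dir_name(); /* glibc extension, caller frees */
	if (!cwd) {
		fprintf(stderr, "can't get current directory: %s\n", strerror(errno));
		return -1;
	}
	/* access() resolves every path component, so a missing 'x' bit shows up here */
	if (access(cwd, X_OK)) {
		fprintf(stderr, "no 'x' permission on '%s' for uid %d gid %d: %s\n",
			cwd, (int)getuid(), (int)getgid(), strerror(errno));
		free(cwd);
		return -1;
	}
	*result = cwd;
	return 0;
}
```

Note that access() checks with the real UID and GID, which is usually
what a test process runs with.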
Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
Here is a fast path for when two consecutive VMAs share the same file.
But one of these VMAs can map the file with MAP_SHARED while the other
maps it with MAP_PRIVATE, and we need to take this into account.