mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 05:48:05 +00:00

9272 Commits

Author SHA1 Message Date
Andrei Vagin
c767b11c54 test: check that corked udp sockets are not dumped
The kernel doesn't have an interface to get the send queue of udp
sockets, so currently we can't dump them and criu dump has to fail in
such cases.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-15 09:14:03 +03:00
Andrei Vagin
776782fe20 sk-inet: detect corked sockets by getting a proper sock opt
Now we block all sockets with non-zero idiag_wqueue, but a non-zero value
doesn't mean that the CORK option is enabled for a socket. A packet can
still be in the network stack, and it is accounted for in idiag_wqueue.

https://github.com/checkpoint-restore/criu/issues/409

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-15 09:14:03 +03:00
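
A minimal sketch of the check described in 776782fe20, assuming a Linux UDP socket: instead of inferring corking from the write queue length, ask the socket for its UDP_CORK option directly. The helper name udp_socket_is_corked() is hypothetical and is not taken from the CRIU sources.

```c
#include <netinet/in.h>
#include <netinet/udp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 1 if UDP_CORK is enabled on sk,
 * 0 if it is not, and -1 if getsockopt() fails.
 */
int udp_socket_is_corked(int sk)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(sk, IPPROTO_UDP, UDP_CORK, &val, &len) < 0)
		return -1;

	return val != 0;
}
```

A dump path could refuse to continue when such a helper returns 1, which matches the behaviour the test in c767b11c54 expects.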
Andrei Vagin
9ca8953baa test/docker: check a container with a read-only file system
Now it's probably one valid use case, because there is no way to commit
a container when a container is being checkpointed.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-07 19:56:24 +03:00
Andrei Vagin
514510ba87 zdtm/tempfs_subns: sync children with the main process
All static tests have to stop any activity before C/R.

./tempfs_subns --pidfile=tempfs_subns.pid --outfile=tempfs_subns.out --dirname=tempfs_subns.test
Run criu dump
Unable to kill 128: [Errno 3] No such process
Run criu restore
7: Old mounts lost: []
7: New mounts appeared: [('/rootfs/criu/test', '/'), ('/', '/proc'), ('/', '/dev/pts')]
:

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-07 19:55:23 +03:00
Pavel Tikhomirov
35f7f5e360 pr_err: add \n where we miss them
Except for several false positives done by:
find -type f -name "*.c" -not -path "./test/*" -exec sed -i
's/\(\<pr_err.*[^\][^n]\)\("[,)]\)/\1\\n\2/g' {} \;

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-07 19:52:13 +03:00
Pierre-Olivier Mercier
acde064232 compel: add missing header required by musl
This fixes a compilation issue regarding undeclared NULL and memcpy when using musl
on ARM.

Signed-off-by: Pierre-Olivier Mercier <nemunaire@nemunai.re>
2017-12-06 21:44:24 -08:00
Andrei Vagin
0c1b1d0fe7 pipe: dump all data from a pipe
Currently we use an additional pipe to steal data from a pipe, but we
don't check that we steal all data. And the additional pipe can have a
smaller size.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-05 04:49:50 +03:00
Andrei Vagin
b921ad2908 test: check a pipe with a custom size
CRIU doesn't correctly handle pipes whose size is bigger than the
default one.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-04 23:59:06 +03:00
Pavel Tikhomirov
4d3ae51725 remap: don't free rpath and don't shfree_last gf
a) As we shmalloced rpath, it cannot be xfreed. b) As we shmalloc
variables in collect_remap_ghost() far away from open_remap_ghost()
where we want to free them, there is no guarantee that our shmalloc was
last and we can't use shfree_last().

fixes commit 0c675a5e9d40 ("files: remove link_remaps when everything
has been restored")

When create_ghost() fails for some reason, this produces a segfault for me.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-12-02 04:08:47 +03:00
Mike Rapoport
8c545089ed zdtm: use {read,write}_data in fifo tests
Reading and writing large buffers may result in short read/write. In cases
where we expect the entire buffer to be transferred, use {read,write}_data
rather than plain read/write syscalls.

Reported-by: Mr Jenkins
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 21:59:03 +03:00
Mike Rapoport
a76ff847d6 zdtm: lib: add {read,write}_data helpers
Some tests expect that all the data will be handled in a single invocation
of the read/write system call. But it is possible to have a short read/write
on a loaded system, and this is not an error.
Add helper functions that will reliably read/write the entire buffer.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 21:58:59 +03:00
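
The idea behind the {read,write}_data helpers in a76ff847d6 can be sketched as a loop that retries until the whole buffer is transferred. This is an illustration only: the names read_all() and write_all() are hypothetical, and the real zdtm helpers may differ in signature and error handling.

```c
#include <errno.h>
#include <unistd.h>

/* Read exactly size bytes; returns 0 on success, -1 on error or early EOF. */
int read_all(int fd, void *buf, size_t size)
{
	char *p = buf;
	size_t done = 0;

	while (done < size) {
		ssize_t ret = read(fd, p + done, size - done);

		if (ret < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			return -1;
		}
		if (ret == 0)
			return -1; /* EOF before the buffer was filled */
		done += ret;
	}
	return 0;
}

/* Write exactly size bytes; returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t size)
{
	const char *p = buf;
	size_t done = 0;

	while (done < size) {
		ssize_t ret = write(fd, p + done, size - done);

		if (ret < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			return -1;
		}
		done += ret;
	}
	return 0;
}
```

A short read or write simply advances the offset and the loop continues, so the caller never has to treat a partial transfer as a failure.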
Andrei Vagin
39d65fba82 compel: use a correct name format for vma files in /proc/pid/map_files/
Currently we use the "map_files/%p-%p" format, but actually it should
be "map_files/%lx-%lx".

The kernel could handle both formats, but recently Alexey Dobriyan fixed
the kernel and it now accepts only the second format.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
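
To make the format difference in 39d65fba82 concrete, here is a small sketch that builds a map_files entry name with "%lx-%lx". The helper map_files_path() and its parameters are hypothetical. With glibc, "%p" prints a "0x" prefix, which is why the "%p-%p" form yields a different name.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format the /proc/<pid>/map_files/ entry for the
 * mapping [start, end) as bare hexadecimal numbers without a "0x" prefix.
 * Returns the snprintf() result.
 */
int map_files_path(char *buf, size_t size, int pid,
		   unsigned long start, unsigned long end)
{
	return snprintf(buf, size, "/proc/%d/map_files/%lx-%lx",
			pid, start, end);
}
```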
Mike Rapoport
7491d9191c lazy-pages: do not allow background fetch before restore is finished
If we start background memory fetch before restore is completely finished,
we may try to write to memory areas which were not yet remapped to their
proper place and are not registered with userfaultfd.
Add synchronization between restore and lazy-pages so that lazy-pages
will only handle #PFs until all the tasks are restored.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Mike Rapoport
b3a754d706 page-server: implement epoll->hangup_event
The remote page read has nothing to do if the page-server on the source has
closed the connection. Just report an error and abort.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Mike Rapoport
c6cb9d882a util: epoll: add processing of EPOLL{RD}HUP
Currently when we poll a file descriptor, we only process EPOLLIN events
and if a connection is closed the receiving side has no means to deal with
it.
Add a callback for EPOLL{RD}HUP events and the default implementation for
these events that removes the file descriptor from epoll and closes it.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Mike Rapoport
584123e99b util: epoll: rename revent to read event
A bit more readable and will be easy to distinguish from upcoming
hevent^Whangup_event.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Mike Rapoport
307c55640c util: epoll: move comment about timeout decrease to uffd.c
The generic epoll_wait wrapper should not make any assumptions about timeout.
It's up to the lazy-pages daemon to make (future) policy decisions.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
de467742c6 netfilter: use ipv4 iptables rules to block IPv4-mapped IPv6 addresses
If an ipv6 socket has an IPv4-mapped address, it is used to handle an ipv4
connection, so we have to use ipv4 iptables rules to block this
connection.

Reported-by: Mr Jenkins
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
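
As a sketch of the decision made in de467742c6, the standard IN6_IS_ADDR_V4MAPPED() macro can tell whether an IPv6 address is of the form ::ffff:a.b.c.d, in which case the embedded IPv4 address is what an ipv4 rule would need. The helper extract_v4_mapped() is a hypothetical name, not CRIU code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: if addr is an IPv4-mapped IPv6 address, copy the
 * embedded IPv4 address (the last four bytes) into *v4 and return 1.
 * Otherwise return 0 and leave *v4 untouched.
 */
int extract_v4_mapped(const struct in6_addr *addr, struct in_addr *v4)
{
	if (!IN6_IS_ADDR_V4MAPPED(addr))
		return 0;

	memcpy(v4, &addr->s6_addr[12], sizeof(*v4));
	return 1;
}
```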
Adrian Reber
89cdc6bcb4 crtools: also print the current kernel version
In addition to writing the CRIU version to the log file this adds the
current kernel version to the log file:

(00.000008) Version: 3.5 (gitid v3.5-511-ga8cc6cf)
(00.000303) Running on node01 Linux 3.10.0-513.el7.x86_64 #1 SMP Tue Feb 29 06:78:90 EST 2017 x86_64

v2:
 - small changes as suggested by Dmitry (thanks)

Signed-off-by: Adrian Reber <areber@redhat.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
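
The "Running on ..." line shown in 89cdc6bcb4 contains the fields that uname(2) reports. A minimal sketch of producing such a line, assuming a hypothetical helper format_host_info() (the actual CRIU code may obtain and print these fields differently):

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: write "Running on <node> <sysname> <release>
 * <version> <machine>" into buf. Returns the snprintf() result,
 * or -1 if uname() fails.
 */
int format_host_info(char *buf, size_t size)
{
	struct utsname u;

	if (uname(&u) < 0)
		return -1;

	return snprintf(buf, size, "Running on %s %s %s %s %s",
			u.nodename, u.sysname, u.release,
			u.version, u.machine);
}
```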
Pavel Begunkov
6b7c744f7b locks: skip 'lease correction' for non-regular files
Leases can be set only on regular files. Thus, as optimization we can
skip attempts to find associated leases in 'correct_file_leases_type'
for other fd types.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Jacob Wen
5efcb6028d phaul: use relative path for parent link
Absolute paths for parent links may not work on restore,
e.g. when restoring on a different server (during migration).

See https://github.com/checkpoint-restore/criu/blob/criu-2.x-stable/criu/image.c#L432

Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Pavel Begunkov
369b56068b zdtm: Test inherited file leases
-- check children's errors in file_leases03
-- test c/r of a lease transferred to a child process

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Pavel Begunkov
394875cdbc locks: Remove duplicated locks
CRIU creates a distinct lock record for each file descriptor on the same
OFD. The patch removes these duplicates. To do so, it adds a new field into
struct file_lock, which stores the pid of the fd on which the lock was found.
'owner pid' is not actually helpful, because the original fd, on which
the lock has been set, can be already closed.

Also it purges crutches doing the same stuff but only for file leases.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Mike Rapoport
6db46e554f lazy-pages: drop_iovs: mark iov as not queued
If we receive only part of the IOV from the page-server we recalculate the
IOV so it will point to the area we still have to fetch. During the split,
the IOV covering the remaining area may remain marked as 'queued' and we'll
never retry fetching it.
Marking the IOV as not queued will ensure its pages will be requested
again.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Mike Rapoport
5f87346f27 page-xfer: remote-pages: allow receiving partial data
Since commit e609267f681062b4370e528a50f635222e0c2330 ("page-pipe: allow to
share pipes between page pipe buffers") the assumption that we will receive
the exact amount of pages we've requested with PS_IOV_GET does not always
hold.
In the case we serve pages data from the images using 'page-server
--lazy-page' the IOVs seen by the pagemap may cross page-pipe buffer
boundaries and read_page_pipe will clamp the pages in the response to those
boundaries.
Adjust page_server_read so it will not try to receive more pages than
page-server is going to send.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Mike Rapoport
1c94e98bf1 debug_show_page_pipe: add PPB's pipe offset
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
59770d4f8b phaul: run the phaul test in a docker container
The golang from Ubuntu Trusty is too old.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
ff599cd966 travis: run phaul tests
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
eb2736dd56 phaul/Makefile: add a target to run tests
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
ddb09e6e18 phaul: check an exit code of a page-server
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
7f482d20e8 phaul/test: exit with a non-zero code in error cases
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
08a0555ea5 phaul: print a message from error objects
It can help to understand an error.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
98888ec773 phaul: add a script to run tests
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:15 +03:00
Andrei Vagin
cc1c41a03c phaul/test: add github.com/golang/protobuf in vendor/
In this case, we can compile tests without cloning third party libraries.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
3637c414ca phaul: add phaul/src/stats/stats.pb.go
This is required for "go get", because it can't execute any commands.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
767d6e1ace lib: add lib/go/src/rpc/rpc.pb.go
It is required for "go get", because it can't execute any commands.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
e5f1a37925 phaul: use full paths for modules
It is a general practice in golang and "go get" works in this case.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
d1ba8b8831 zdtm/maps006: modify test so that file and anon vma-s are mixed
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
62629e4728 page-pipe: allow to share pipes between page pipe buffers
Now criu creates a new pipe buffer if the previous one has a different set
of flags. In this case, the pipe is not full and we can use it for the
next page pipe buffer.

We need 88 pipes to pre-dump the zdtm/static/fork test without this
patch, and we need only 17 pipes with this patch.

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
7758a4e7f3 page-pipe: move code to resize a pipe in a separate function
v2: and move it up, because it is going to be used in ppb_alloc()

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
0cb37add6a parasite: remove restriction to a number of iovec-s
vmsplice can't splice more than UIO_MAXIOV, but we can
call it a few times from a parasite.

v2: s/nr/nr_segs/

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
e11805ec79 restore: don't call free_mappings for an uninitialized list
vma_area_list@entry=0x818) at criu/cr-dump.c:107
107             list_for_each_entry_safe(vma_area, p, &vma_area_list->h, list)

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:51 +03:00
Andrei Vagin
a1e880aff6 net: handle a case when --empty net is set only for criu dump
The original idea was to set --empty net for criu dump and criu restore,
but before cde33dcb0639 ("empty-ns: Don't C/R iptables too (v2)"),
criu restore worked without --empty net and we didn't notice that
docker doesn't set this option on restore.

After a small brainstorm, we decided that it is better to remove
this requirement. Docker has to set this option, but with these changes,
the docker issue will be less urgent.

https://github.com/checkpoint-restore/criu/issues/393
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
a04f8ae965 travis: check docker checkpoint
Install the latest version of Docker, start a container and C/R it a few times.

v2: call make to install criu
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
4986cdf225 zdtm: Add file lease tests
Test cases:
0. Basic non-breaking read/write leases.
1. Multiple read leases and OFDs with no lease for the same file.
2. Breaking leases.
3. Multiple fds (dup + inherited) for single lease (mutual OFD).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
1fb5a051f7 locks: Add leases c/r for kernels v4.0 and older
Information about locks in /proc/<pid>/fdinfo has been present only since
kernel v4.1. This patch adds logic to *note_file_lock* to match leases
and OFDs.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
b6dd25e939 locks: Add c/r of breaking leases (kernel>=v4.1)
Restore of breaking leases is executed in 2 steps:
1. restore the lease in a state it was before break
2. break it by opening associated file.

The patch fixes type of broken leases to 'target lease type',
because procfs always returns 'READ' in this case.

Also, it adds 'updated' field in lock structure. It's used to remove all
duplicated records for single lease from the image, which wasn't
corrected by 'correct_lease_type'.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
37028d86d7 locks: Add c/r of non broken leases (kernel>=v4.1)
Leases in breaking state are not supported. In that case criu will
report an error during dumping. Also, lock info must be present in
/proc/<pid>/fdinfo (available since kernel 4.1).

Before taking out a new lease, it modifies the process fsuid to match the
file uid (see fcntl F_SETLEASE).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
641d5ff987 alpine: cast addresses to struct sockaddr *
Otherwise we get errors like this:

/usr/include/sys/socket.h:315:5: note: expected 'const struct sockaddr *' but argument is of type 'struct sockaddr_un *'
 int bind (int, const struct sockaddr *, socklen_t);

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
adff973881 restore: don't write pidfile if check_only is set
If check_only is set, criu kills all processes instead of resuming them.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00