mirror of https://github.com/checkpoint-restore/criu synced 2025-08-26 11:57:52 +00:00

9635 Commits

Author SHA1 Message Date
Andrei Vagin
d1ba8b8831 zdtm/maps006: modify test so that file and anon vma-s are mixed
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
62629e4728 page-pipe: allow to share pipes between page pipe buffers
Currently criu creates a new pipe buffer if the previous one has a
different set of flags. In that case the previous pipe is not full, so
we can reuse it for the next page buffer.

We need 88 pipes to pre-dump the zdtm/static/fork test without this
patch, and we need only 17 pipes with this patch.

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
7758a4e7f3 page-pipe: move code to resize a pipe in a separate function
v2: and move it up, because it is going to be used in ppb_alloc()

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
0cb37add6a parasite: remove restriction to a number of iovec-s
vmsplice can't splice more than UIO_MAXIOV, but we can
call it a few times from a parasite.

v2: s/nr/nr_segs/

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:36:14 +03:00
Andrei Vagin
e11805ec79 restore: don't call free_mappings for an uninitialized list
vma_area_list@entry=0x818) at criu/cr-dump.c:107
107             list_for_each_entry_safe(vma_area, p, &vma_area_list->h, list)

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:51 +03:00
Andrei Vagin
a1e880aff6 net: handle a case when --empty net is set only for criu dump
The original idea was to set --empty net for criu dump and criu restore,
but before cde33dcb0639 ("empty-ns: Don't C/R iptables too (v2)"),
criu restore worked without --empty net and we didn't notice that
docker doesn't set this option on restore.

After a small brainstorm, we decided that it is better to remove
this requirement. Docker has to set this option, but with these changes,
the docker issue will be less urgent.

https://github.com/checkpoint-restore/criu/issues/393
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
a04f8ae965 travis: check docker checkpoint
Install the latest version of Docker, start a container and C/R it a few times.

v2: call make to install criu
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
4986cdf225 zdtm: Add file lease tests
Test cases:
0. Basic non-breaking read/write leases.
1. Multiple read leases and OFDs with no lease for the same file.
2. Breaking leases.
3. Multiple fds (dup + inherited) for single lease (mutual OFD).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
1fb5a051f7 locks: Add leases c/r for kernels v4.0 and older
Information about locks in /proc/<pid>/fdinfo is present only since
kernel v4.1. This patch adds logic to *note_file_lock* to match leases
and OFDs.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
b6dd25e939 locks: Add c/r of breaking leases (kernel>=v4.1)
Restore of breaking leases is executed in 2 steps:
1. restore the lease in the state it was in before the break
2. break it by opening the associated file.

The patch sets the type of broken leases to the 'target lease type',
because procfs always returns 'READ' in this case.

Also, it adds an 'updated' field to the lock structure. It is used to
remove from the image all duplicated records for a single lease that
weren't corrected by 'correct_lease_type'.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Begunkov
37028d86d7 locks: Add c/r of non broken leases (kernel>=v4.1)
Leases in breaking state are not supported. In that case criu will
report an error during dumping. Also, lock info must be present in
/proc/<pid>/fdinfo (since kernel 4.1).

Before taking out a new lease, criu changes the process fsuid to match
the file uid (see fcntl F_SETLEASE).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
641d5ff987 alpine: cast addresses to struct sockaddr *
Otherwise we get errors like this:

/usr/include/sys/socket.h:315:5: note: expected 'const struct sockaddr *' but argument is of type 'struct sockaddr_un *'
 int bind (int, const struct sockaddr *, socklen_t);

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
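
The cast that 641d5ff987 refers to looks roughly like this in a standalone program. bind_unix_dgram() is a hypothetical helper written for illustration, not a function from the CRIU tree; the point is only the explicit (struct sockaddr *) cast in the bind() call, which keeps stricter libc headers quiet.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an AF_UNIX datagram socket bound to path.
 * Returns the socket fd, or -1 on error.
 */
static int bind_unix_dgram(const char *path)
{
	struct sockaddr_un addr;
	int sk;

	assert(path != NULL);

	/* Leave room for the terminating NUL in sun_path. */
	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	sk = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (sk < 0)
		return -1;

	/* bind() is declared with 'const struct sockaddr *', so cast. */
	if (bind(sk, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(sk);
		return -1;
	}

	return sk;
}
```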
Andrei Vagin
adff973881 restore: don't write pidfile if check_only is set
If check_only is set, criu kills all processes instead of resuming them.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
7b8de6bce6 zdtm: don't mix images from dump with and without check_only
The idea of the check-only option is that criu dump and criu
restore are executed with this option to check whether c/r is
possible for a set of processes. This has to work faster than
without the check-only option.

Currently we run criu restore --check-only for images which have
been generated by criu dump without --check-only, which is obviously wrong.

Cc: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
9c8d9f1f08 zdtm: don't overwrite logs if the check-only option is set
If the check-only option is set, dump and restore are executed twice,
and we need to set separate logs for both cases.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
f951a3cc92 zdtm: restore ns_last_pid before executing restore in a second time
Otherwise a criu process can get a pid of one of restored processes.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Andrei Vagin
14ab677ef2 restore: wait restored tasks in the check-only case
If the restore was executed with the check-only option, then
after restoring all resources the tasks wait for their children
and exit with code 0.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Mike Rapoport
71b5fe5f5b Revert "travis: temporarily disable ppc64 until repo is fixed"
This reverts commit 26680a8b6901f312163ed19263c2a580ec927f68.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Mike Rapoport
b0f8e4d11a check/service: use cached kdat values for features check
The kerndat_init() is now called before the jump to action handler. This
allows us to directly use kdat without calling to the corresponding
kerndat_*() methods.

✓ travis-ci: success for lazy-pages: update checks for availability of userfaultfd (rev3)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Emelyanov
a3b515f6b7 files,remote: Support chunked ghost files
Those may not support sendfile(), so use read()/write() instead.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-11-30 01:34:50 +03:00
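
A read/write fallback of the kind a3b515f6b7 describes could look like the sketch below. copy_fd_rw() is a hypothetical name and the 4096-byte buffer is an arbitrary choice; the actual chunking used for ghost files in the patch is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy everything from in_fd to out_fd with plain
 * read()/write(), for descriptors that do not support sendfile().
 * Returns the number of bytes copied, or -1 on error.
 */
static ssize_t copy_fd_rw(int in_fd, int out_fd)
{
	char buf[4096];
	ssize_t total = 0;

	assert(in_fd != out_fd);

	for (;;) {
		ssize_t off = 0;
		ssize_t rd = read(in_fd, buf, sizeof(buf));

		if (rd < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (rd == 0)
			break; /* end of input */

		/* write() may be short, so loop until the chunk is out. */
		while (off < rd) {
			ssize_t wr = write(out_fd, buf + off, rd - off);

			if (wr < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += wr;
		}
		total += rd;
	}

	return total;
}
```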
Pavel Emelyanov
cb4f5d1954 travis: temporarily disable ppc64 until repo is fixed
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-11-30 01:34:50 +03:00
Pavel Tikhomirov
ab1eaad3c3 zdtm/session04: do cleanup on success and wait children in it
https://github.com/xemul/criu/issues/372
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-11-30 01:31:19 +03:00
Cyrill Gorcunov
8d16e0d0e2 crit: Show CLONE_ flags in ns image
For better readability

 | {
 |     "magic": "NS",
 |     "entries": [
 |         {
 |             "id": 10,
 |             "ns_cflag": "CLONE_NEWPID"
 |         },
 |         {
 |             "id": 8,
 |             "ns_cflag": "CLONE_NEWNET"
 |         }
 |     ]
 | }

[xemul: Removed non-ns flags from map]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-11-30 01:31:19 +03:00
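
The flag-to-name mapping that 8d16e0d0e2 adds to crit can be illustrated with a small table. crit itself is written in Python; the sketch below uses C only to stay consistent with the other examples in this log, and ns_flag_names, ns_cflag_name() and ns_cflag_value() are hypothetical names. Following the "[xemul: Removed non-ns flags from map]" note, only namespace flags are listed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <string.h>

struct ns_flag_name {
	unsigned int flag;
	const char *name;
};

/* Namespace-related clone flags only. */
static const struct ns_flag_name ns_flag_names[] = {
	{ CLONE_NEWNS,   "CLONE_NEWNS"   },
	{ CLONE_NEWUTS,  "CLONE_NEWUTS"  },
	{ CLONE_NEWIPC,  "CLONE_NEWIPC"  },
	{ CLONE_NEWUSER, "CLONE_NEWUSER" },
	{ CLONE_NEWPID,  "CLONE_NEWPID"  },
	{ CLONE_NEWNET,  "CLONE_NEWNET"  },
};

#define NR_NS_FLAGS (sizeof(ns_flag_names) / sizeof(ns_flag_names[0]))

/* Decode: numeric flag to symbolic name, NULL if it is not an ns flag. */
static const char *ns_cflag_name(unsigned int flag)
{
	size_t i;

	for (i = 0; i < NR_NS_FLAGS; i++)
		if (ns_flag_names[i].flag == flag)
			return ns_flag_names[i].name;
	return NULL;
}

/* Encode: symbolic name back to the numeric flag, 0 if unknown. */
static unsigned int ns_cflag_value(const char *name)
{
	size_t i;

	assert(name != NULL);

	for (i = 0; i < NR_NS_FLAGS; i++)
		if (strcmp(ns_flag_names[i].name, name) == 0)
			return ns_flag_names[i].flag;
	return 0;
}
```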
Pavel Tikhomirov
3341ad532b pstree: allow shelljob to inherit sid from criu process again
In commit 8ce156970cb1 ("pstree: rework init reparent handling for pid
namespaces") I changed the session leader lookup in the sid inheritance
check to walk up until the session leader, instead of accepting any
process of the same session as before. So we need to allow the external
sid to be inherited explicitly for shell jobs.

With this fix I managed to c/r your example fine.

Note: for shell jobs with nested pidnses we need also "[PATCH 04/10]
pstree: add prepare_pstree_leaders to create sid/pgid helpers in
advance" which is in crml now, to prevent creating session helpers
for processes which want to inherit sid from criu process. For non
nestedns case we are fine.

Reported-by: Connor Zanin <zanin.c@husky.neu.edu>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-11-30 01:31:19 +03:00
Kirill Tkhai
cd738b8b01 ns: Make pid_ns helpers as children of criu main process
If a task, holding userns_sync_lock unexpectedly exits,
criu will hang on error path in restore_root_task(),
because it can't use usernsd to destroy them.

Let's remove the intermediary: we'll create pid_ns helpers
as children of criu main task, and criu main task will
be able to use simple kill() to stop them.

v2: Make code more compact, add a comment.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:19 +03:00
Kirill Tkhai
5af3ca59eb utils: Add sys_clone_unified()
Clean up the fork() definition and make a generic function
for all archs. It may be useful when you want to add
more clone flags to fork(), or if you want to pass more
than one argument to the child function (glibc's clone
allows only one).

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:19 +03:00
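
For context on the glibc limitation mentioned in 5af3ca59eb: clone() passes a single void * to the child function, so the usual workaround is to pack several values into a struct. The sketch below shows only that workaround, not sys_clone_unified() itself; struct child_args, child_fn() and clone_with_two_args() are hypothetical names, and the 64 KiB stack size is an arbitrary choice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)

/* Several logical arguments packed behind glibc's single pointer. */
struct child_args {
	int a;
	int b;
};

static int child_fn(void *p)
{
	const struct child_args *args = p;

	/* The return value becomes the child's exit status. */
	return args->a + args->b;
}

/*
 * Hypothetical helper: run child_fn() in a clone()d child with extra
 * clone flags and return its exit status, or -1 on error.
 */
static int clone_with_two_args(int a, int b, int extra_flags)
{
	struct child_args args = { .a = a, .b = b };
	char *stack;
	int status;
	pid_t pid;

	assert(a + b >= 0 && a + b <= 255); /* must fit an exit status */

	stack = malloc(CHILD_STACK_SIZE);
	if (!stack)
		return -1;

	/* The stack grows down on common architectures: pass its top. */
	pid = clone(child_fn, stack + CHILD_STACK_SIZE,
		    extra_flags | SIGCHLD, &args);
	if (pid < 0 || waitpid(pid, &status, 0) != pid) {
		free(stack);
		return -1;
	}

	free(stack);
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```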
Kirill Tkhai
ec1de7aeb1 pid_ns: Extract functionality of exit of pid_ns helper in function
This functionality will be moved to the criu task in the next patches;
this patch is a preparation.

v2: Rename the function and move pr_err() to it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:19 +03:00
Cyrill Gorcunov
67639620cb images: Reserve tty numbers in task_core_entry
We will need them to handle tty inheritance.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:19 +03:00
Dmitry Safonov
0eb9cbdbf0 cr-check: Make compat_cr warning arch-independent
I think we should warn the user when we can't C/R compat
applications. That also applies to archs other than x86.
Let's correct the message so that it suits non-x86 as well.

Reported-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:04 +03:00
Andrei Vagin
c6b930b3ae criu: don't abort criu in lookup_create_item()
Currently lookup_create_item() calls BUG_ON(), if it meets a thread.
We don't expect to meet a thread there, but if images contain incorrect
data, we can be in this situation in open_remap_dead_process().

(gdb) bt

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:04 +03:00
Andrei Vagin
e84443bb2b dump: set pid->state for threads
It is checked in dead_pid_conflict, otherwise criu may segfault:

Program terminated with signal 11, Segmentation fault.
1073				if (item->pid->real == item->threads[i].real ||
(gdb) p item
$1 = (struct pstree_item *) 0x0
(gdb) bt

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:04 +03:00
Kirill Tkhai
7c4fda0f80 ns: Implement setns_from_fdstore() for repeating code
Introduce a helper and use it instead of repeating code.
Use the caller's file and line in the error message
so that the caller does not need an additional print.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:04 +03:00
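
The "file and line of caller" technique from 7c4fda0f80 is commonly done with a macro wrapper around the real function. The sketch below is an assumption about the general shape, not the CRIU helper: setns_report() and setns_report_at() are hypothetical names, and the fd is passed in directly instead of being fetched from the fdstore.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: call setns() and, on failure, report the error
 * with the location of the caller rather than of this function.
 */
static int setns_report_at(int fd, int nstype, const char *file, int line)
{
	assert(file != NULL);

	if (setns(fd, nstype) < 0) {
		fprintf(stderr, "%s:%d: setns(%d, %#x) failed: %s\n",
			file, line, fd, (unsigned int)nstype,
			strerror(errno));
		return -1;
	}

	return 0;
}

/* The macro captures __FILE__ and __LINE__ at the call site. */
#define setns_report(fd, nstype) \
	setns_report_at(fd, nstype, __FILE__, __LINE__)
```

A caller then only needs to check the return value, since the error message already points at the calling line.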
Dmitry Safonov
f5c11168d5 fault-inj: Silently dying helper's child
The restorer blob may die silently for any reason:
- Segmentation fault
- OOM killer
- User-sent SIGKILL
- Child CRIU restorer didn't abort futex on error path (and exited)

We should terminate the restoring process and avoid locking
ourselves up waiting for a dead restoree.

Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:31:03 +03:00
Dmitry Safonov
ff1c5ca693 pstree: Add nspid() helper
To get pid in ns of current.

Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:30:47 +03:00
Pavel Emelyanov
9ecab6c53f page-xfer: Normalize remote/local parent xfer checks
We have two places to check for parent via page server -- as
a part of _OPEN req and explicit req. Make the latter code
be in-sync with the opening one.

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:30:46 +03:00
Pavel Emelyanov
3e151f025d page-read: Warn about async read w/o completion cb
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:45 +03:00
Pavel Emelyanov
aae3cd6356 page-read: Don't check for cache/proxy in local case
The opts.remote is always false in this code.

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:45 +03:00
Pavel Emelyanov
c1d90f46a2 page-read: Don't try to dedup from img cache/proxy
It's simply impossible (yet), so emit a warning.

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Omri Kramer
4996c2b352 Merge img-remote and img-remote-proto
There is no real need to have both.

Signed-off-by: Omri Kramer <omri.kramer@gmail.com>
Signed-off-by: Lior Fisch <fischlior@gmail.com>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
4f23a1fd90 files: Fix crossing unused and service fds of shared fd tables
service_fd_id is id of a specific task, while other tasks
in shared fd table group may have bigger id numbers.
In this case given unused fd intersects with service fds
of such tasks. This leads to undefined behaviour. Fix that.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
8cb7b43a49 restore: Fix waited task pid
The pid must be taken relative to the parent pid namespace.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
2c6327e30b restore: Do not iterate over parent's files to find leftovers
This patch speeds up creation of child processes by skipping
iteration over open files in most cases. We don't need it
now, as the previous patches ensure that parent files do not leak:

mnt namespace fds are stored in fdstore, pid proc files
are closed directly.

So now we can skip closing old files in most cases,
except some CLONE_FILES cases: we need it only if the parent
has CLONE_FILES in its flags (and for root_item).

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
2ccacac0f2 restore: Use vpid in log_init_by_pid() instead of getpid()
When task is in pid namespace, getpid() can't be used
to identify it. So, use vpid instead of that.

Also, move log_init_by_pid() above pid check.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
0d52c7e9e2 forking: Always close pid proc before child creation
The child does not know about the parent's pid proc fd,
so it can't close it by fd. The next patch will make
close_old_files() optional, relying on the fact that
there are no leftover fds. So, close pid proc directly.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
0382311056 mnt_ns: Use fdstore to keep mount namespaces
This allows us to decrease the number of file descriptors
which are passed to children and which need to be
closed in close_old_files().

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
a8ab6c569e mnt: Move ns_fd assignment down in prepare_mnt_ns()
No functional changes.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Kirill Tkhai
bfbd7bbacb utils: Introduce SWAP() helper to exchange two variables
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
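
bfbd7bbacb has no description, so for readers unfamiliar with the idiom, a swap macro of this kind is typically written as below. This is an assumed form, not necessarily the definition in the CRIU tree; it relies on the __typeof__ extension available in GCC and Clang, and order_pair() is a hypothetical usage example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Exchange two lvalues of the same type. The do/while(0) wrapper makes
 * the macro behave like a single statement after 'if' or 'else'.
 */
#define SWAP(x, y)                            \
	do {                                  \
		__typeof__(x) swap_tmp = (x); \
		(x) = (y);                    \
		(y) = swap_tmp;               \
	} while (0)

/* Hypothetical usage: make sure *lo <= *hi. */
static void order_pair(int *lo, int *hi)
{
	assert(lo != NULL && hi != NULL);

	if (*lo > *hi)
		SWAP(*lo, *hi);
}
```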
Kirill Tkhai
f41a087891 mnt_ns: Move open_proc() up in prepare_mnt_ns()
Both branches need this, so move it up.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Veronika Kabatova
18c22b77c5 Modify and add test for configuration file functionality
Create a test to verify the configuration parsing feature. The
test reuses the existing inotify_irmap test.

Because default configuration files were added, a --no-default-config
option is added to zdtm.py so that the test suite does not break on
systems where these files are present.

Signed-off-by: Veronika Kabatova <vkabatov@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00
Veronika Kabatova
85e8216080 Add documentation for configuration files
Signed-off-by: Veronika Kabatova <vkabatov@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:29:44 +03:00