mirror of https://github.com/checkpoint-restore/criu synced 2025-08-27 12:28:14 +00:00

1696 Commits

Author SHA1 Message Date
Cyrill Gorcunov
fd07bc7791 cpu: Add 'ins' mode to --cpu-cap option
In this mode we test if target cpu has all features present
in image file but do not require bit to bit match: target cpu
may be a new one with more features present.
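
The check described above amounts to a subset test on feature bits. A minimal sketch, assuming features are stored as an array of 32-bit words (the function name and this layout are hypothetical, not taken from the patch):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when every feature bit set in the
 * image is also set on the target cpu. Extra bits on the target are
 * allowed, so a newer cpu with more features still passes.
 */
bool cpu_features_subset(const uint32_t *image, const uint32_t *target,
			 size_t nwords)
{
	for (size_t i = 0; i < nwords; i++) {
		/* A bit present in the image but absent on the target fails. */
		if (image[i] & ~target[i])
			return false;
	}
	return true;
}
```

A bit-to-bit match would instead compare the two arrays for equality; the subset form is what makes a newer target acceptable.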

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-26 18:15:46 +03:00
Pavel Emelyanov
2694a74a00 aio: Restore AIO contexts
Restoring AIO is quite simple. Once all VMAs are put in
their places we can call io_setup() to let kernel create
the context back and then move the ring into proper place.

Another thing we should "restore" is the context ID. But
the thing is, upon ring creation the kernel reports the ring
start address as this ID. And there's a patch in the -next
tree that changes the ID when we remap the ring. That
said after AIO context creation and ring remap we need
to check that the new ID is seen by the kernel.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-26 18:13:40 +03:00
Pavel Emelyanov
08c204820f aio: Dump AIO rings
When AIO context is set up kernel does two things:

1. creates an in-kernel aioctx object
2. maps a ring into process memory

The 2nd thing gives us all the needed information
about how the AIO was set up. So, in order to dump
one we need to pick the ring in memory and get all
the information we need from it.

One thing to note -- we cannot dump tasks if there
are any AIO requests pending. So we also need to
go to parasite and check the ring to be empty.
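
As a rough illustration of the emptiness check, the sketch below assumes the ring starts with a header shaped like the kernel's struct aio_ring (as found in fs/aio.c) and treats head == tail as "empty". The struct and function names here are hypothetical, and the real parasite code may check more than this:

```c
#include <stdint.h>

#define AIO_RING_MAGIC 0xa10a10a1u

/* Header at the start of the mapped ring (assumed layout, see above). */
struct aio_ring_hdr {
	uint32_t id;
	uint32_t nr;			/* number of event slots */
	uint32_t head;			/* consumer index */
	uint32_t tail;			/* producer index */
	uint32_t magic;
	uint32_t compat_features;
	uint32_t incompat_features;
	uint32_t header_length;
};

/* Returns 0 if the ring looks valid and holds no events, -1 otherwise. */
int aio_ring_check_empty(const struct aio_ring_hdr *ring)
{
	if (ring->magic != AIO_RING_MAGIC)
		return -1;	/* not an AIO ring we understand */
	if (ring->head != ring->tail)
		return -1;	/* events are sitting in the ring */
	return 0;
}
```

The nr field is the kind of setup information the message refers to: it records how many event slots the ring was created with.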

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-26 18:13:36 +03:00
Pavel Emelyanov
80cf042695 x86: Add io syscalls
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-26 18:13:33 +03:00
Pavel Emelyanov
6a6cdb8d4a proc: Drop always true last argument of parse_smaps()
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-12-22 13:52:03 +03:00
Ruslan Kuprieiev
b30940eee2 cr_errno: move cr_err helpers into cr_errno.h
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-22 13:50:45 +03:00
Ruslan Kuprieiev
1ace257022 tty: add vt support, v2
/dev/ttyN are the virtual terminals which are provided
by the system with major 4 and minor 1..63.
You can run some program on ttyN by pressing alt+ctrl+FN
and running it manually or by using open (openvt nowadays).

This patch also allows us to run all our tests from a vt.

v2, style fix + using linux/vt.h for constants

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-22 13:48:31 +03:00
Ruslan Kuprieiev
8eaf0142ab cr-service: set cr_errno to EBADRQC if set_opts_from_req fails
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-19 18:59:28 +03:00
Ruslan Kuprieiev
e76749b790 cr-restore: set cr_error to EEXIST if such pid already exists, v3
This is a very common error when using criu.

The problem here is that we need to somehow transfer cr_errno
from one process to another. I suggest using pipe to give
one end to children and read cr_errno on other after restore
is finished.

v2, Pavel suggested putting errno into shared task_entries.
v3. and he also suggested using cmpxchg
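
One plausible reading of the cmpxchg suggestion is "the first error recorded wins". A minimal sketch using C11 atomics, where struct shared_state is a hypothetical stand-in for the shared task_entries area (in real use it would live in memory shared between the processes, e.g. a MAP_SHARED mapping):

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the area shared between restored tasks. */
struct shared_state {
	_Atomic int cr_errno;
};

/*
 * Store err only if no error has been recorded yet (value is still 0).
 * Returns the value that ends up stored, so callers can tell whether
 * their error or an earlier one was kept.
 */
int set_cr_errno_once(struct shared_state *s, int err)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&s->cr_errno, &expected, err))
		return err;
	/* On failure, expected now holds the previously stored error. */
	return expected;
}
```

Compared with the pipe approach from v1, this needs no extra descriptors and no read after restore beyond loading the shared value.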

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-19 18:59:17 +03:00
Ruslan Kuprieiev
b09a88b5f9 util: set cr_errno to ESRCH if no PID dir in proc
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-19 18:59:14 +03:00
Ruslan Kuprieiev
ef283e505c cr-errno: initial commit
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-19 18:58:46 +03:00
Saied Kazemi
0412152fc5 Add inherit fd support
There are cases where a process's file descriptor cannot be restored
from the checkpoint images.  For example, a pipe file descriptor with
one end in the checkpointed process and the other end in a separate
process (that was not part of the checkpointed process tree) cannot be
restored because after checkpoint the pipe will be broken.

There are also cases where the user wants to use a new file during
restore instead of the original file at checkpoint time.  For example,
the user wants to change the log file of a process from /path/to/oldlog
to /path/to/newlog.

In these cases, criu's caller should set up a new file descriptor to be
inherited by the restored process and specify the file descriptor with the
--inherit-fd command line option.  The argument of --inherit-fd has the
format fd[%d]:%s, where %d tells criu which of its own file descriptors
to use for restoring the file identified by %s.
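
To make the fd[%d]:%s format concrete, here is a small parsing sketch. The function name is hypothetical and this is not claimed to be how criu parses the option; it only shows the shape of the argument:

```c
#include <stdio.h>

/*
 * Hypothetical parser for an argument of the form "fd[N]:KEY".
 * On success stores N in *fd, points *key at the text after the
 * colon and returns 0; returns -1 on malformed input.
 */
int parse_inherit_fd(const char *arg, int *fd, const char **key)
{
	int consumed = 0;

	/* %n records how many characters matched up to and including ':' */
	if (sscanf(arg, "fd[%d]:%n", fd, &consumed) != 1 || consumed == 0)
		return -1;
	if (*fd < 0 || arg[consumed] == '\0')
		return -1;	/* negative descriptor or empty key */

	*key = arg + consumed;
	return 0;
}
```

With this sketch, "fd[3]:KEY" yields descriptor 3 and the string "KEY", while "fd[3]" (no colon) is rejected.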

As a debugging aid, if the argument has the format debug[%d]:%s, it tells
criu to write out the string after colon to the file descriptor %d.  This
can be used, for example, as an easy way to leave a "restore marker"
in the output stream of the process.

It's important to note that inherit fd support breaks applications
that depend on the state of the file descriptor being inherited.  So,
consider inherit fd only for specific use cases that you know for sure
won't break the application.

For examples please visit http://criu.org/Category:HOWTO.

v2: Added a check in send_fd_to_self() to avoid closing an inherit fd.
    Also, as an extra measure of caution, added checks in the inherit fd
    look up functions to make sure that the inherit fd hasn't been reused.
    The patch also includes minor cosmetic changes.

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-10 12:48:30 +03:00
Andrey Vagin
4bca68ba49 tcp: don't split packets for restoring a send queue
The kernel can do it better. The problem exists only for recv queues.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-08 15:46:44 +03:00
Andrey Vagin
71a0b5dc31 mem: check existence of parent images before dumping pages (v2)
When we are doing pre-dump, we splice pages in pipes and only then open
images and dump pages. But when we are splicing pages, we need to know
about existence of parent images. This patch adds a new call to determine
the existence of parent images.

In addition, this patch fixes the following issue:
CID 83244 (#1 of 1): Uninitialized pointer read (UNINIT)
14. uninit_use: Using uninitialized value xfer.parent.

v2: initialize the unused field of struct page_server_iov, because it is
sent over the network.

CID 83451 (#1 of 1): Uninitialized scalar variable (UNINIT)
2. uninit_use_in_call: Using uninitialized value pi. Field pi.nr_pages
is uninitialized when calling write.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-29 19:32:40 +03:00
Pavel Emelyanov
69bffe26d3 kerndat: Make fs-virtualized check report yes/no
Right now it returns the whole struct stat which is excessive.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:15:09 +04:00
Pavel Emelyanov
19a76494a9 kerndat: Collect all global variables on one struct
Not to spoil the global namespace and unify the kerndat
data names.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:14:53 +04:00
Pavel Emelyanov
f33908a897 ns: Rename "created" futex and comment what it is
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:11:58 +04:00
Pavel Emelyanov
ee2e8e5bb9 parasite: Cleanup args size fetching
Right now we push all the auxiliary arguments to parasite_infect_seized
while 2 of them are only required to calculate the size of args area.

Let's better keep track of required args size and get rid of excessive
arguments to parasite_infect_seized().

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-11 20:11:34 +04:00
Pavel Emelyanov
1cad9b1049 util: Fix the ispathsub corner case
ispathsub("/foo", "/") reports false. This is a corner case,
as 2nd argument is not expected to end with /. Fix this and
add comment about ispathsub() arguments assumptions.
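
The corner case can be illustrated with a stand-alone sketch. The function below is a hypothetical re-implementation written for this note, not the code from the patch; it treats a second argument that ends with '/' (such as the root) as already ending on a component boundary:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if path equals dir or lies below it.
 * dir is normally given without a trailing '/', the root "/" being
 * the exception that needs special care.
 */
bool path_is_under(const char *path, const char *dir)
{
	size_t len = strlen(dir);

	if (len == 0 || strncmp(path, dir, len) != 0)
		return false;
	/* dir ends with '/': the prefix match already ends on a boundary. */
	if (dir[len - 1] == '/')
		return true;
	/* Otherwise the next character must end the path or a component. */
	return path[len] == '\0' || path[len] == '/';
}
```

Without the trailing-slash branch, path_is_under("/foo", "/") would inspect 'f' after the prefix and report false, which is the behaviour the commit describes.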

Reported-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-09 23:26:56 +04:00
Pavel Emelyanov
32f58742ca mnt: Introduce and use issubpath helper
When we validate the mount tree not to have overmounts we need to
check one path to be the sub-path of another. Here's a helper for
this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-11-07 17:39:23 +04:00
Andrey Vagin
cb2f9223a0 dump: dump user namespaces (v2)
For that we need to save per-namespace mappings of user and group IDs.

And all id-s for tasks and files are saved from the target user
namespace.

v2: move code into collect_namespaces()
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:16:16 +04:00
Andrey Vagin
30711b109d userns: save uid-s from a target userns (v2)
We are going to support user namespaces and uid-s will be converted
according to userns mappings.

v2: convert id-s for sockets too
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:15:45 +04:00
Andrey Vagin
71a9cd0634 proc: delete parse_pid_stat_small() (v2)
It's unused now.

v2: remove the proc_pid_stat_small struct too.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:15:37 +04:00
Andrey Vagin
05943959a5 proc: parse state and ppid from /proc/pid/status (v2)
v2: don't leak FILE

CID 73423 (#1 of 1): Resource leak (RESOURCE_LEAK)
15. leaked_storage: Variable f going out of scope leaks the storage it points to.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:15:03 +04:00
Andrey Vagin
2cb6f2b68b dump: remove useless arguments from seize_task()
We get sig and pgid from a parasite, because we need to get
them from a target pid namespace.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:14:54 +04:00
Andrey Vagin
77905aae19 dump: get tasks ids from parasite
We have two reasons for that:
* parsing of /proc/pid/status is slow
* parasite returns ids from a target userns

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:14:32 +04:00
Andrey Vagin
b0217d4e41 criu: add constants about user namespaces
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:00:32 +04:00
Pavel Emelyanov
2e91a9c814 bfd: Don't flush read-only images
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-05 15:38:17 +04:00
Pavel Emelyanov
1ef5ca8235 bfd: Check images got flushed at the end
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-05 15:37:39 +04:00
Andrey Vagin
102cbe8a09 namespaces: take into account USERNS id
and return an error if a process lives in another userns,
because criu doesn't support it.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-30 16:00:33 +04:00
Pavel Emelyanov
b1a8e41dd0 mnt: Don't validate mounts on pre-dump
This is for two reasons. First, validation can meet external mount
and will call plugins, which is not correct on pre-dump and actually
crashes on uninitialized plugins lists. Second, even if on pre-dump
mount tree is not "supported" this can be a temporary situation (yes,
yes, unlikely, but still).

On the other hand, it's better to fail earlier, but that's another
story.

Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-10-30 15:15:30 +04:00
Cyrill Gorcunov
48d81eb48a kerndat: Transform kerndat_get_devpts_stat into general form
We will need devtmpfs as well so make it general.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-30 15:10:31 +04:00
Cyrill Gorcunov
ec50bd8c91 tty: Add support of /dev/console
/dev/console is the system console, which is provided
by the system with major 5 and minor 1. It's usually
configured on system startup with the console= option,
and the underlying driver is responsible for delivering messages
to the console user.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-27 21:35:39 +04:00
Andrey Vagin
bc9b4bcc3f parasite: stop a parasite daemon before dumping threads
The parasite daemon sets up a SIGCHLD handler, but for dumping threads we
use parasite-trap. While doing this the sigchild handler notices the
CHLD arriving on the thread trap, emits an error

(00.020292) Error (parasite-syscall.c:387): si_code=4 si_pid=3485 si_status=5

but wait() reports -1 (task is not dead, just trapped) and handler just exits.

Let's stop a parasite daemon before dumping threads.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-27 21:32:45 +04:00
Cyrill Gorcunov
5676383729 scripts: Add ACT_MAX limit and make @action_names being const
@action_names is rather a const array, so make
sure we never access some data outside of it
defining its size.
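
The pattern being described is a const name table sized by the enum's terminating value, so that lookups can be bounds-checked. A minimal sketch; the two action entries and the lookup helper are placeholders chosen for illustration, not the list from the patch:

```c
#include <stddef.h>

/* Placeholder actions; ACT_MAX must stay last so it equals the count. */
enum script_action {
	ACT_PRE_DUMP,
	ACT_POST_DUMP,
	ACT_MAX
};

/* Sized by ACT_MAX, so the table cannot silently outgrow the enum. */
static const char * const action_names[ACT_MAX] = {
	[ACT_PRE_DUMP]	= "pre-dump",
	[ACT_POST_DUMP]	= "post-dump",
};

/* Hypothetical lookup: returns NULL instead of reading past the array. */
const char *action_name(unsigned int act)
{
	return act < ACT_MAX ? action_names[act] : NULL;
}
```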

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-27 21:30:37 +04:00
Cyrill Gorcunov
8c40f43018 prctl: Add new interface constants
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-27 21:25:25 +04:00
Pavel Emelyanov
fb54345e08 parasite: Don't keep code_orig on parasite_ctl
We need this only once -- while calling the mmap from remote
context -- so it's enough to have on-stack variable.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-10-23 20:30:34 +04:00
Andrey Vagin
2c65748f74 shmem: rework getting file descriptors for shared memory regions (v2)
/proc/PID/map_files are protected by the global CAP_SYS_ADMIN, so we
need to avoid using them to support user namespaces.

We are going to use memfd_create() to get the first file descriptor and
then all other processes will be able to open it via /proc/PID/fd/X.

This patch reworks slave processes to not use map_files.

v2: add more comments
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 19:57:12 +04:00
Andrey Vagin
de71c48079 syscall: add memfd_create() (v3)
v2: Follow the kerndat style that "features" are described
just by global boolean variables.

v3: give NULL as a name to get EFAULT if memfd_create is supported
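
The NULL-name trick in v3 can be sketched as a small probe: on a kernel that implements the syscall, a NULL name is expected to fail with EFAULT, whereas a kernel without it fails with ENOSYS. The function name and return convention below are hypothetical, and the raw syscall is used so the sketch does not depend on a libc wrapper:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical probe: returns 1 if memfd_create appears to be
 * supported, 0 if it is not, and -1 if the result is inconclusive.
 */
int probe_memfd_create(void)
{
#ifdef SYS_memfd_create
	long ret = syscall(SYS_memfd_create, NULL, 0);

	if (ret >= 0) {
		/* Unexpected success: still proves support, so clean up. */
		close((int)ret);
		return 1;
	}
	if (errno == EFAULT)
		return 1;	/* syscall exists and rejected the NULL name */
	if (errno == ENOSYS)
		return 0;	/* kernel does not implement it */
	return -1;		/* e.g. blocked by a sandbox policy */
#else
	return 0;		/* headers too old to know the syscall number */
#endif
}
```

The benefit of probing this way is that no file descriptor is created on the success path, so there is normally nothing to clean up.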
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 19:57:10 +04:00
Cyrill Gorcunov
8644d2ba83 files-reg: Add try_collect_special_file
The idea is to be able to look up a special id
which might not be present, in which case we should
not yield an error.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 17:51:49 +04:00
Cyrill Gorcunov
a944a78ce9 files-reg: Export do_open_reg_noseek_flags
We will need it for tty restore.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 17:51:33 +04:00
Cyrill Gorcunov
81c598cc05 files: Add file_desc_init helper
To use it in tty code even when file
descriptor is not added into files chain.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 17:51:32 +04:00
Cyrill Gorcunov
d6e231ae09 tty: parasite -- Don't call for TIOCGPKT/TIOCGPTLCK on non-ptys
We will have to support more tty types in future so
make calls depending on type of ttys.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 17:51:27 +04:00
Cyrill Gorcunov
bcc1f4eb72 tty: Introduce tty types
Instead of calling case() with majors all over the place, let's
introduce own enum for tty types and use it instead.

Because we're using not @major numbers now but taking @minors
into account as well, this brings more strict check of which
kind of terminals we can dump now, so it should potentially
fix the cases when we're trying to c/r terminals which we don't
understand yet (in particular /dev/console [5:1]).
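
A sketch of what classifying by both major and minor can look like. The enum and function names are hypothetical, and the only device numbers used are the ones quoted in this log (virtual terminals at 4:1..63, /dev/console at 5:1); the set of types in the patch itself may differ:

```c
/* Hypothetical tty kinds, limited to devices mentioned in this log. */
enum tty_kind {
	TTY_KIND_UNKNOWN,
	TTY_KIND_VT,		/* /dev/ttyN, major 4, minor 1..63 */
	TTY_KIND_CONSOLE,	/* /dev/console, major 5, minor 1 */
};

/* Map a device number pair to a kind; anything else is unknown. */
enum tty_kind classify_tty(unsigned int maj, unsigned int min)
{
	switch (maj) {
	case 4:
		if (min >= 1 && min <= 63)
			return TTY_KIND_VT;
		break;
	case 5:
		if (min == 1)
			return TTY_KIND_CONSOLE;
		break;
	}
	return TTY_KIND_UNKNOWN;
}
```

Checking the minor as well is what lets the caller refuse a device such as 5:1 explicitly rather than mistaking it for another terminal that shares the same major.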

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-23 17:51:26 +04:00
Pavel Emelyanov
198c93656c pstree: Add helper for adding helpers to pstree
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-14 18:02:36 +04:00
Pavel Emelyanov
16971e47cd ns: Introduce ns walking helper
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-14 18:01:27 +04:00
Tycho Andersen
2b5d06817f dump: pre-load kernel modules
See the comment below for an explanation of what is going on. We will
ultimately need to handle dumping the netlink data, but I think it is good to
prevent injecting events into the stream during a dump. So we pre-load the
modules, even though it isn't very pretty.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-14 14:21:05 +04:00
Pavel Tikhomirov
ffe3d5cfda add int(CTL_32)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-08 19:23:24 +04:00
Tycho Andersen
de055b7992 cg: use one path style throughout cg restore code
This commit is in preparation for the (hopefully last :) restore special cpuset
patch.

Previously, we installed the cgroup service fd after calling
prepare_cgroup_dirs, which meant that we had to carry around the temporary
directory name in order to put things in the right place. The
restore_cgroup_prop function uses the cg service fd instead of carrying around
the full path. This means that we can't use restore_cgroup_prop, without first
sanitizing the path. Instead, we install the service fd before calling
prepare_cgroup_dirs, and all the code just references that instead of carrying
around the temporary path.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-07 12:56:52 +04:00
Cyrill Gorcunov
73b9a2ebe3 cpuinfo: Add "cpuinfo [dump|check]" commands, v2
On Wed, Oct 01, 2014 at 05:51:09PM +0400, Pavel Emelyanov wrote:
> > Yes, what you've been expecting?
>
> if (!strcmp(argv[optind]))
> 	return cpu_cap_check()
>
> or smth like this.

updated. So if it becomes confusing -- feel free to merge [1;9] and
ping me to resend the rest, or pick up from attachments.

>From 6af96ff63ac82f9566c3cba9c116dc67698c9797 Mon Sep 17 00:00:00 2001
From: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Tue, 30 Sep 2014 18:33:40 +0400
Subject: [PATCH] cpuinfo: Add "cpuinfo [dump|check]" commands

They allow to validate cpuinfo information
without running complete dump/restore actions.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:26:58 +04:00