In this mode we test if target cpu has all features present
in image file but do not require bit to bit match: target cpu
may be a new one with more features present.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Restoring AIO is quite simple. Once all VMAs are put in
their places we can call io_setup() to let kernel create
the context back and then move the ring into proper place.
Another thing we should "restore" is the context ID. But
the thing is, upon ring creation kernel repots the ring
start address as this ID. And there's a patch in the -next
tree that changes the ID when we remap the ring. That
said after AIO context creation and ring remap we need
to check that the new ID is seen by the kernel.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When AIO context is set up kernel does two things:
1. creates an in-kernel aioctx object
2. maps a ring into process memory
The 2nd thing gives us all the needed information
about how the AIO was set up. So, in order to dump
one we need to pick the ring in memory and get all
the information we need from it.
One thing to note -- we cannot dump tasks if there
are any AIO requests pending. So we also need to
go to parasite and check the ring to be empty.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
/dev/ttyN are the virtual terminals which are provided
by the system with major 4 and minor 1..63.
You can run some program on ttyN by pressing alt+ctrl+FN
and running it manualy or by using open(openvt nowadays).
This patch also allows us to run all our tests from a vt.
v2, style fix + using linux/vt.h for constants
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is a very common error when using criu.
The problem here is that we need to somehow transfer cr_errno
from one process to another. I suggest using pipe to give
one end to children and read cr_errno on other after restore
is finished.
v2, Pavel suggested putting errno into shared task_entries.
v3. and he also suggested using cmpxchg
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There are cases where a process's file descriptor cannot be restored
from the checkpoint images. For example, a pipe file descriptor with
one end in the checkpointed process and the other end in a separate
process (that was not part of the checkpointed process tree) cannot be
restored because after checkpoint the pipe will be broken.
There are also cases where the user wants to use a new file during
restore instead of the original file at checkpoint time. For example,
the user wants to change the log file of a process from /path/to/oldlog
to /path/to/newlog.
In these cases, criu's caller should set up a new file descriptor to be
inherited by the restored process and specify the file descriptor with the
--inherit-fd command line option. The argument of --inherit-fd has the
format fd[%d]:%s, where %d tells criu which of its own file descriptors
to use for restoring the file identified by %s.
As a debugging aid, if the argument has the format debug[%d]:%s, it tells
criu to write out the string after colon to the file descriptor %d. This
can be used, for example, as an easy way to leave a "restore marker"
in the output stream of the process.
It's important to note that inherit fd support breaks applications
that depend on the state of the file descriptor being inherited. So,
consider inherit fd only for specific use cases that you know for sure
won't break the application.
For examples please visit http://criu.org/Category:HOWTO.
v2: Added a check in send_fd_to_self() to avoid closing an inherit fd.
Also, as an extra measure of caution, added checks in the inherit fd
look up functions to make sure that the inherit fd hasn't been reused.
The patch also includes minor cosmetic changes.
Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The kernel can do it better. The problem exists only for recv queues.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we are doing pre-dump, we splice pages in pipes and only then open
images and dump pages. But when we are splicing pages, we need to know
about existence of parent images. This patch adds a new call to determin
existence of parent images.
In addition this patch fixes a following issue:
CID 83244 (#1 of 1): Uninitialized pointer read (UNINIT)
14. uninit_use: Using uninitialized value xfer.parent.
v2: initialize unused field of struct page_server_iov, because it sends
in network.
CID 83451 (#1 of 1): Uninitialized scalar variable (UNINIT)
2. uninit_use_in_call: Using uninitialized value pi. Field pi.nr_pages
is uninitialized when calling write.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now we push all the auxiliary arguments to parasite_infect_seized
while 2 of them are only required to calculate the size of args area.
Let's better keep track of required args size and get rid of excessive
arguments to parasite_infect_seized().
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
ispathsub("/foo", "/") reports false. This is a corner case,
as 2nd argument is not expected to end with /. Fix this and
add comment about ispathsub() arguments assumptions.
Reported-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we validate the mount tree not to have overmounts we need to
check one path to be the sub-path of another. Here's a helper for
this.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
For that we need to save per-namespace mappings of user and group IDs.
And all id-s for tasks and files are saved from the target user
namespace.
v2: move code into collect_namespaces()
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to support user namespaces and uid-s will be converted
accoding with userns mappings.
v2: conver id-s for sockets too
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2: don't leak FILE
CID 73423 (#1 of 1): Resource leak (RESOURCE_LEAK)
15. leaked_storage: Variable f going out of scope leaks the storage it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We get sig and pgid from a parasite, because we need to get
them from a target pid namespace.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have two reason for that:
* parsing of /proc/pid/status is slow
* parasite returns ids from a target userns
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
and return an error, if a proccess live in another userns,
because criu doesn't support it.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is for two reasons. First, validation can meet external mount
and will call plugins, which is not correct on pre-dump and actually
crashes on uninitilized plugins lists. Second, even if on pre-dump
mount tree is not "supported" this can be a temporary situation (yes,
yes, unlikely, but still).
On the other hand, it's better to fail earlier, but that's another
story.
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
We will need devtmpfs as well so make it general.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
/dev/console is a system console which provided
by the system with major 5 and minor 1. It's usually
configured on system startup with console= option
and underlied driver is resposible to deliver messages
to the console user.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The parasite daemon set up SIGCHLD handler, but for dumping threads we
use parasite-trap. While doing this the sigchild handler notices the
CHLD arriving on the thread trap, emits an error
(00.020292) Error (parasite-syscall.c:387): si_code=4 si_pid=3485 si_status=5
but wait() reports -1 (task is not dead, just trapped) and handler just exits.
Let's stop a parasite daemon before dumping threads.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
@action_names is rather a const array, so make
sure we never access some data outside of it
defining its size.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We need this only once -- while calling the mmap from remote
context -- so it's enough to have on-stack variable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
/proc/PID/map_files are protected by the global CAP_SYS_ADMIN, so we
need to avoid using them to support user namespaces.
We are going to use memfd_create() to get the first file descriptor and
then all others processes will able to open it via /proc/PID/fd/X.
This patch reworks slave processes to not use map_files.
v2: add more comments
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2: Follow the kerndat style that "features" are described
just by global boolean variables.
v3: give NULL as a name to get EFAULT if memfd_create is supported
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The idea is to be able to lookup for special id
which might be not present and we should not
yield the error.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need it for tty restore.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To use it in tty code even when file
descriptor is not added into files chain.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will have to support more tty types in future so
make calls depending on type of ttys.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Instead of calling case() with majors all over the places lets
introduce own enum for tty types and use it instead.
Because we're using not @major numbers now but taking @minors
into account as well, this brings more strict check of which
kind of terminals we can dump now thus it's potentially should
fix the cases when we're trying to c/r terminals which we don't
understand yet (in particular /dev/console [5:1]).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
See the comment below for an explanation of what is going on. We will
ultimately need to handle dumping the netlink data, but I think it is good to
prevent injecting events into the stream during a dump. So we pre-load the
modules, even though it isn't very pretty.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This commit is in preparation for the (hopefully last :) restore special cpuset
patch.
Previously, we installed the cgroup service fd after calling
prepare_cgroup_dirs, which meant that we had to carry around the temporary
directory name in order to put things in the right place. The
restore_cgroup_prop function uses the cg service fd instead of carrying around
the full path. This means that we can't sue restore_cgroup_prop, without first
sanitizing the path. Instead, we install the service fd before calling
prepare_cgroup_dirs, and all the code just references that instead of carrying
around the temporary path.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On Wed, Oct 01, 2014 at 05:51:09PM +0400, Pavel Emelyanov wrote:
> > Yes, what you've been expecting?
>
> if (!strcmp(argv[optind]))
> return cpu_cap_check()
>
> or smth like this.
updated. So if it become confusing -- feel free to merge [1;9] and
ping me to resend the rest, or pick up from attachements.
>From 6af96ff63ac82f9566c3cba9c116dc67698c9797 Mon Sep 17 00:00:00 2001
From: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Tue, 30 Sep 2014 18:33:40 +0400
Subject: [PATCH] cpuinfo: Add "cpuinfo [dump|check]" commands
They allow to validate cpuinfo information
without running complete dump/restore actions.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>