PTRACE_PEEKSIGINFO is used to receive pending signals;
then all signals are sent back and saved in an image.
v2: rework according to the new kernel interface
v3: rework according to the newest kernel interface
v4: fix error handling
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The page server is a process that receives pages over
the network and puts them into pagemap- + pages- images. Right
now it simply gets the data and puts it into the
image files. When we have dirty set tracking in the kernel, the
page server will have to collect "page changes" and properly
integrate them into images.
Running crtools with the page server looks like this:
dst_node# crtools page-server --port <port> -D dump/ ...
src_node# crtools dump -t <pid> --page-server --address <dst_node> --port <port> -D dump/ ...
After this, images from dst_node/dump/ and src_node/dump/ should
be put into one place, and tasks can be restored from it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since we now drain pages out of the parasite, we can invent any format for
page dumps. Let it be ... a protobuf one! :)
Another thing to keep in mind is that we're about to use splices and
implement iterative migration, so it's better to have the actual pages
page-aligned in the image.
And -- backward compatibility. That said, the new format is:
1. pagemap-... file, which contains a header (currently with an ID of
the image with pages, see below) and an array of <nr_pages:vaddr>
pairs. The first value means "how many pages to take from the
file with pages (see below)" and the second -- where in the task
address space to put them. Simple.
2. pages-... file, which contains only pages one by one (thus aligned
as we want).
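The lookup implied by this layout can be sketched as below. This is an
illustration only: the struct pagemap_pair, the helper pages_file_offset()
and the 4096-byte page size are assumptions made for the example, not the
actual protobuf definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size, for illustration only */

/* Hypothetical in-memory form of one <nr_pages:vaddr> pair. */
struct pagemap_pair {
	uint32_t nr_pages; /* how many pages to take from the pages file */
	uint64_t vaddr;    /* where in the task address space they go */
};

/*
 * Return the byte offset inside the pages file of the page backing 'addr'.
 * Pages are stored one by one, so the offset is the number of pages in all
 * preceding pairs plus the page index inside the matching pair, times the
 * page size. Returns -1 if no pair covers 'addr'.
 */
static int64_t pages_file_offset(const struct pagemap_pair *map, size_t n,
				 uint64_t addr)
{
	uint64_t pages_before = 0;

	assert(map != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t start = map[i].vaddr;
		uint64_t end = start + map[i].nr_pages * SKETCH_PAGE_SIZE;

		if (addr >= start && addr < end) {
			uint64_t idx = (addr - start) / SKETCH_PAGE_SIZE;
			return (int64_t)((pages_before + idx) * SKETCH_PAGE_SIZE);
		}
		pages_before += map[i].nr_pages;
	}
	return -1;
}
```

Since every offset is a multiple of the page size, the data in the pages
file stays page-aligned, which is the property the splice-based plans rely on.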
This patch breaks backward compatibility (old images with pages will
be restored and then crash). Need to do it before the v0.5 release.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I will have to push some sort of map of pages to dump into the parasite.
For this, I need an estimate of how much memory I'd need for
that in the parasite args. These two values will help with it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now when we collect the list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Just make use of previous patch. The creds dumping args are tuned to
fit one page (minimal static args size).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some of our headers (such as syscall.h) clash
with system header names. So we need
to be sure that the headers we include as
| #include "something.h"
are searched for in a known place. In particular, on
some machines this has already caused problems.
This btw revealed a problem in cr-dump.c -- we
had #include <parasite.h> there. Fix it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The FPU data is quite CPU-specific,
thus move it to asm/fpu.h.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch can be committed instead of:
[PATCH 1/6] cr-dump: move parasite_drain_fds_seized out of dump_task_files
[PATCH 2/6] cr-dump: fix dumping file locks in a mount namespace
readlink is not required here and a file can be unavailable,
if a process is in another mnt namespace
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's no longer required to use this option -- two currently
supported cases (tasks on host and tasks in containers) can
be detected automatically. Keep this option for future use.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's required to know whether the root task lives in
namespaces very very early (e.g. -- to lock the network
properly). Thus we have to collect task IDs right at
the time we collect the tasks themselves.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Recent kernels allow getting namespace IDs by reading proc-ns links.
Use this to generate IDs for tasks' namespaces (I do generate them, since
the IDs provided by the kernel look ugly :( ).
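A rough sketch of the idea follows. It assumes the link target has the
usual "name:[number]" form (e.g. what readlink() returns for
/proc/<pid>/ns/net); the helper names parse_ns_link() and generate_ns_id()
and the fixed-size table are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse a proc-ns link target such as "net:[4026531956]" and extract the
 * kernel-provided ID. Returns 0 on success, -1 if the target does not
 * belong to the namespace 'ns_name' or is malformed.
 */
static int parse_ns_link(const char *target, const char *ns_name,
			 unsigned long *kid)
{
	size_t len = strlen(ns_name);

	if (strncmp(target, ns_name, len) != 0)
		return -1;
	if (sscanf(target + len, ":[%lu]", kid) != 1)
		return -1;
	return 0;
}

#define MAX_NS_IDS 64 /* arbitrary table size for the sketch */

static unsigned long known_kids[MAX_NS_IDS];
static unsigned int nr_known;

/*
 * Map a kernel ID to a small generated ID (1, 2, 3, ...). The same kernel
 * ID always yields the same generated ID. Returns 0 if the table is full.
 */
static unsigned int generate_ns_id(unsigned long kid)
{
	for (unsigned int i = 0; i < nr_known; i++)
		if (known_kids[i] == kid)
			return i + 1;

	if (nr_known == MAX_NS_IDS)
		return 0;

	known_kids[nr_known++] = kid;
	return nr_known;
}
```

Two tasks whose links carry the same kernel ID get the same generated ID,
which is enough to tell whether they live in the same namespace.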
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These functions are designed to convert a native pointer
to the uint64_t used to store a virtual address in protobuf messages,
and vice versa, in a machine-independent way.
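A minimal sketch of such a pair of helpers is shown below. The names
ptr_to_u64() and u64_to_ptr() are hypothetical and chosen for the example;
the point is only the cast through uintptr_t, which keeps the conversion
well-defined on both 32-bit and 64-bit machines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widen a native pointer to the 64-bit value stored in the image. */
static inline uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Narrow a stored 64-bit value back to a native pointer. */
static inline void *u64_to_ptr(uint64_t v)
{
	/* On a 32-bit machine the stored value must fit into a pointer. */
	assert(v <= UINTPTR_MAX);
	return (void *)(uintptr_t)v;
}
```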
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dump the necessary file lock entries to the image. Only flock and
POSIX file locks are supported right now.
Changelog since the initial version:
We now get file lock info from the global list, so dump_task_file_locks
can be much simpler.
Originally-signed-off-by: Zheng Gu <cengku.gu@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We collect all file locks into a global list, so we can use them easily
in dump_one_task. As an optimization, we only collect file locks held by
tasks in the pstree.
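The collection step might look roughly like this. The sketch assumes the
lock information comes as text lines in the /proc/locks format (this
message does not say where the list is filled from); the struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one entry of the global file lock list. */
struct file_lock_sketch {
	char type[16];   /* "FLOCK" or "POSIX" */
	char mode[16];   /* "ADVISORY" or "MANDATORY" */
	char access[16]; /* "READ" or "WRITE" */
	int owner_pid;
	unsigned int maj, min; /* device numbers, printed in hex */
	unsigned long ino;
	long long start;
	char end[32];    /* end offset, or the literal "EOF" */
};

/* Parse one line in the assumed /proc/locks format; 0 on success. */
static int parse_lock_line(const char *line, struct file_lock_sketch *fl)
{
	long long num;
	int n = sscanf(line, "%lld: %15s %15s %15s %d %x:%x:%lu %lld %31s",
		       &num, fl->type, fl->mode, fl->access, &fl->owner_pid,
		       &fl->maj, &fl->min, &fl->ino, &fl->start, fl->end);

	return n == 10 ? 0 : -1;
}

/* Only flock and POSIX locks are handled. */
static int lock_type_supported(const struct file_lock_sketch *fl)
{
	return !strcmp(fl->type, "FLOCK") || !strcmp(fl->type, "POSIX");
}

/* The optimization: keep a lock only if its owner is in the pstree. */
static int lock_held_by_tree(const struct file_lock_sketch *fl,
			     const int *tree_pids, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (tree_pids[i] == fl->owner_pid)
			return 1;
	return 0;
}
```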
Thanks to the ptrace-seize mechanism, we can avoid the blocked file lock
issue, which makes the work simpler.
Right now, the check handles only one situation:
-- Dumping tasks that hold file locks without the -l option.
This covers most cases, but more work is needed to make
it fully robust in the future.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For fanotify dumping we need to find the mount points the paths
lie on; moreover, we save the mount point device number in the
image, so collect mount point information at the dump stage.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will be handling both inotify and fanotify
objects here, so to reduce confusion rename
the files to fsnotify.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is the merge and a slight rework (no TI_SP macro) of Alexander's patches
about the subj.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently crtools supports the case where a child shares an fd table
with its parent.
There are only two interesting things here.
* Service descriptors should be cloned for each process
that shares one fd table.
* One task should restore files and the other tasks should sleep during
this time.
v2: * allocate fdt_lock from shared memory
* don't wait a child, if it doesn't share fdtable
v3: * don't move ids on the pstree image
v4: * save ids in a separate image
* save fdinfo per id instead of pid
v5: fix alignment of service_fd_id
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for determining which resources are shared.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It is read together with pstree items to check what kind of
resources should be shared. The core is too big to read at
this point.
v2: fix check_core
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently fdinfo is dumped for each task, so CR_FD_FDINFO is in cr_fdset.
Several tasks can share one fd table; then the set of descriptors will be
dumped once and the image name will contain files_id instead of pid.
In this case CR_FD_FDINFO will go away from cr_fdset.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dump the limits with the "new" prlimit syscall, which works on an arbitrary pid.
Restore is done in the restorer _after_ mappings mixup and _before_
caps drop, so that any max value can be set.
RLIM_INFINITY is handled explicitly to help future 64<->32
bit migration.
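The explicit RLIM_INFINITY handling can be sketched like this. The
image-side marker IMG_RLIM_INFINITY and the helper names are assumptions
made for the example; only prlimit() itself is the real glibc call. The
idea is that "infinity" is stored as a fixed 64-bit marker rather than as
whatever bit pattern RLIM_INFINITY has on the dumping machine.

```c
#define _GNU_SOURCE /* for prlimit() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/resource.h>

/* Hypothetical marker used in the image for "no limit". */
#define IMG_RLIM_INFINITY (~0ULL)

/* Native limit value -> value stored in the image. */
static uint64_t encode_rlim(rlim_t v)
{
	return v == RLIM_INFINITY ? IMG_RLIM_INFINITY : (uint64_t)v;
}

/* Value stored in the image -> native limit value. */
static rlim_t decode_rlim(uint64_t v)
{
	return v == IMG_RLIM_INFINITY ? RLIM_INFINITY : (rlim_t)v;
}

/* Read one limit of an arbitrary pid (0 means the calling process). */
static int dump_rlimit(pid_t pid, int res, uint64_t *cur, uint64_t *max)
{
	struct rlimit lim;

	if (prlimit(pid, res, NULL, &lim))
		return -1;

	*cur = encode_rlim(lim.rlim_cur);
	*max = encode_rlim(lim.rlim_max);
	return 0;
}
```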
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The number of arguments used to carry data is already too
big. Just fill the required core fields inside.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
An auxv entry has the size of a machine pointer, but a 64-bit integer is reserved
in the MmEntry protobuf message to store it. Moreover, the number of auxv entries
varies from one architecture to another. So the following is proposed
to alleviate the issue.
* Introduce the type auxv_t representing a machine-pointer sized integer.
* Extract the size of the auxv array from the MmEntry message instead of using
the value of the macro AT_VECTOR_SIZE.
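The two points above can be sketched as follows. The typedef via
uintptr_t and the helper restore_auxv() are assumptions for the example;
the actual definition of auxv_t and the MmEntry accessors are not shown
in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition: an integer as wide as a machine pointer. */
typedef uintptr_t auxv_t;

/*
 * Copy 'n_saved' 64-bit values taken from the image into a native auxv
 * array. The element count comes from the image, not from a compile-time
 * constant. Returns -1 if the destination is too small or a stored value
 * does not fit into a native auxv_t.
 */
static int restore_auxv(const uint64_t *saved, size_t n_saved,
			auxv_t *out, size_t out_len)
{
	if (n_saved > out_len)
		return -1;

	for (size_t i = 0; i < n_saved; i++) {
		if (saved[i] > UINTPTR_MAX)
			return -1;
		out[i] = (auxv_t)saved[i];
	}
	return 0;
}
```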
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The following files go into the directory arch/x86/include/asm unmodified:
- include/atomic.h,
- include/linkage.h,
- include/memcpy_64.h,
- include/types.h,
- include/bitops.h,
- pie/parasite-head-x86-64.S,
- include/processor-flags.h,
- include/syscall-x86-64.def.
* Changed include directives in the source files that include the headers
listed above.
* Modified build scripts to reflect the source moves.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The dumping of FPU state is done with the help of the ptrace
facility. There are two cases we need to handle,
depending on which features are available on the host machine:
1) The dump via ptrace(PTRACE_GETFPREGS ...)
In this case the kernel uses the fxsave approach
internally and hands us back the data
encoded in i387_fxsave_struct format.
2) The dump via ptrace(PTRACE_GETREGSET ...)
In this case the kernel uses the xsave approach
internally and hands us back the data
encoded in xsave_struct format.
In any case we decode the data and save it in protobuf format.
This is why the core.proto file has been extended with new
entries.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
And don't forget to undef them once they are not needed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Actually it was never used, just drop it.
Because of backward compatibility we
can't just zap it in the proto file.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The CPU we're running on must at least support the fxsave feature.
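One way such a check could look is sketched below, using the compiler's
<cpuid.h> helper. The function name cpu_has_fxsr() is hypothetical and
this is not necessarily how the check is done in the patch; the only
hard fact used is that CPUID leaf 1 reports FXSR in EDX bit 24.

```c
#include <assert.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Return 1 if the CPU reports fxsave/fxrstor support, 0 otherwise. */
static int cpu_has_fxsr(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the requested leaf is unsupported. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 0;

	return (edx >> 24) & 1; /* CPUID.1:EDX bit 24 = FXSR */
#else
	return 0; /* not an x86 machine, nothing to check */
#endif
}
```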
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>