We have three arrays for thread related data: item->threads,
parasite_ctl->thread and tid_state in parasite.
With this patch a thread will have the same index in all arrays.
The zero index is used for a thread leader.
In this case we don't need to search thread_state in parasite.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When parasite daemon mode will be implemented we get deprived of ability
to fetch registers at the late moment of dumping as we were, thus just
bind CoreEntry to pstree item and allocate CoreEntry'ies for every
thread found, once process tree is in seized state.
Then immediately fill CoreEntry'ies with registers. We use prctl
opcode for that but fetch a complete set of registers including
FPU state, and convert them into protobuf format.
Zombie tasks remains untouched, we allocate CoreEntry for them
right at moment of dumping becuase we don't need registers there
to be written on disk.
This way get_task_regs no longer need parasite_ctl argument
and it's zapped.
Still parasite_ctl has own copy of general registers set but
this is because we need them to be in cpu native format unlike
ones kept in CoreEntry.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if we have created vdso proxy the rt-vdso should
not be dumped because it will be re-created on next restore
anyway. Thus with help of parasite service routine find
the rt-vdso and tear it off from VMAs list.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
During criu startup we need to fill symbol table of own
run-time vdso provided by the kernel. We will need this
data for vdso proxy.
Because this functions are not used in restorer code,
we move them out of PIE (since PIE code must remain
as small as possible).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When a kernel didn't show vma flags, we set MAP_GROWSDOWN for stack
vmas, but it's not reliable. E.g. thread stacks are mapped without
MAP_GROWSDOWN.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
struct fd_parms is reused for two calls, so don't
forget to initialize it before dump_one_reg_file
call.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
With this action criu will seize tasks, grab all its memory into
page-pipes, rest dirty tracker and will then release them. Writing
the memory from page-pipes would occur after tasks are unfreezed
and thus the frozen time should become reasonably small.
When pre-dump is in action, the dirty tracking is forcedly turned
off as well as tasks are resumed afterwards, not killed, by default.
This is a prerequisite for iterative migration.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
One of such things we use right now is the device for anon shmem
mappings backing. In the furure this can be extended to check for
various kernel features.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Information about mount points is used for dumping fanotify.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Use a PRI* format specifier to convert an integer of known size
to a string.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 996204 (#1 of 1): Resource leak (RESOURCE_LEAK)
11. leaked_storage: Variable "ch" going out of scope leaks the storage it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 996205 (#1-2 of 2): Resource leak (RESOURCE_LEAK)
14. leaked_storage: Variable "core" going out of scope leaks the storage it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If any task has a sysvipc mapping we should make sure, that the
ipc namespace is dumped as well. Otherwise after restore the task
will die.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
VMA-s may be protected against read, so rights for such VMA-s should be
changed for dumping and protected back after dumping.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently dump silently terminates and restore emits some
meaning-less messages in either case. Make these important
messages more informative.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we've failed to seize tasks we should report this error to the caller.
Reported-by: Kevin Wilson <wkevils@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
PTRACE_PEEKSIGINFO is used for received pending signals,
then all signal are sent back and saved in a image.
v2: rework according with the new kernel interface
v3: rework according with the newrest kernel interface
v4: fix error handling
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The page server is a process, that is about to get pages over
the network and put them into pagemap- + pages- images. Right
now what it does is simply get the data and puts it into the
image files. When we have dirty set tracking in the kernel the
page server will have to collect "page changes" and properly
integrate them into images.
Running crtools with page server is like this:
dst_node# crtools page-server --port <port> -D dump/ ...
src_node# crtools dump -t <pid> --page-server --address <dst_node> --port <port> -D dump/ ...
After this images from dst_node/dump/ and src_node/dump/ should
be put into one place and tasks can be restored out of it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since now we drain pages out of parasite, we can invent any format for
page dumps. Let is be ... prorobuf one! :)
Another thing to keep in mind, is that we're about to use splices and
implement iterative migration, so it's better to have actual pages be
page-aligned in the image.
And -- backward compatibility. That said the new format is:
1. pagemap-... file which contains a header (currently with a ID of
the image with pages, see below) and an array of <nr_pages:vaddr>
pairs. The first value means "how many pages to take from the
file with pages (see below)" and the second -- where in the task
address space to put them. Simple.
2. pages-... file which containes only pages one by one (thus aligned
as we want).
This patch breaks backward compatibility (old images with pages wil
be restored and then crash). Need to do it before v0.5 release.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I will have to push some sort of map of pages to dump into parasite.
For this, I need to have estimation of how much memory I'd need for
than in parasite args. These two values will help with it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Just make use of previous patch. The creds dumping args are tuned to
fit one page (minimal static args size).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some of ours headers (such as syscall.h) are
clashes with system headers names. So we need
to be sure that the headers we include as
| #include "something.h"
being searched in known place. In particular on
some machines it it already produced problems.
This btw revealved a problem in cr-dump.c -- we've
had #include <parasite.h> there. Fix it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The FPU data is quite CPU-type oriented,
thus move it to asm/fpu.h.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch can be committed instead of:
[PATCH 1/6] cr-dump: move parasite_drain_fds_seized out of dump_task_files
[PATCH 2/6] cr-dump: fix dumping file locks in a mount namespace
readlink is not required here and a file can be unavailable,
if a process is in another mnt namespace
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's no longer required to use this option -- two currently
supported cases (tasks on host and tasks in containers) can
be detected automatically. Keep this option for future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's required to know whether the root task lives in
namespaces very very early (e.g. -- to lock the network
properly). Thus we have to collect task IDs right at
the time we collect the tasks themselves.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The recent kernels allow to get namespaces IDs by reading proc-ns links.
Use this to generate IDs for tasks' namespaces (I do generate them, since
IDs provided by kernel look ugly :( ).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>