Before this patch sigframes were constructed in restorer. We are going
to construct sigframes for parasites. Both parasite and restorer should
be as thing as posible.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
struct thread_restore_args contains many pointers on different objects,
only a few of them are really required.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Instead of opencoded mark injection provide
a helper for easier grepability.
[xemul: Go ahead and remove the INIT_VDSO_MARK at all]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if we have created vdso proxy the rt-vdso should
not be dumped because it will be re-created on next restore
anyway. Thus with help of parasite service routine find
the rt-vdso and tear it off from VMAs list.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if we need vdso proxy there is a need
to recognize it somehow on further checkpoint
action. But such vdso won't be recognized by
the kernel and [vdso] mark won't appear in
procfs output. Thus we put own mark on it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need it in parasite code to detect run time vdso area.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Runtime vdso need to be kept in some safe place when all
self-vmas are unmapped. So we reserve space for it in restorer
blob area and then remap it into. It's quite important to do
a remap here rather than data copy because otherwise pfn
of vdso disappear and in future we won't be able to detect
vdso are on dumping stage.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
During criu startup we need to fill symbol table of own
run-time vdso provided by the kernel. We will need this
data for vdso proxy.
Because this functions are not used in restorer code,
we move them out of PIE (since PIE code must remain
as small as possible).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's quite minimal at moment and provides only two helpers
- vdso_redirect_calls, to patch vdso area redirectling
calls to some new place.
- vdso_fill_symtable, to parse vma area as vdso library
and fill symbols table with offsets and names.
Because these routines will be needed in both regular criu
code and restorer code -- we compile it in pie format.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There will be a couple of more builtin helpers needed
in pie code soon. Thus to unify approach do
- rename asm/memcpy_64.h to asm/string.h
- introduce include/asm-generic/string.h file
where all helpers are implemented if optimized
variant is not yet provided
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This constants are system wide, so move them to mem.h
header for reuse sake.
[ xemul: It was kerndat.h in the patch ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When a kernel didn't show vma flags, we set MAP_GROWSDOWN for stack
vmas, but it's not reliable. E.g. thread stacks are mapped without
MAP_GROWSDOWN.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
page-server creates a listen socket and only then goes into the
background, so we can be sure, that page-server is ready for work after
detaching.
v2: call daemon() in a proper place and reuse the option -d
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Based on work done by Cyrill Corcunov (many thanks for that).
In this commit we implement c/r for files which have opened
/proc/$pid/ns/$ids entries.
The idea is rather simple one
Checkpoint
==========
- Check if the file name is the one of known to be ns ref
- If match then write protobuf entry
Restore
=======
- Read all ns entries from the image
- When criu tries to open one we lookup over process
tree to figure out which PID should be used in path
and then just open it
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This will be needed for fast parsing of procfs ns references.
[ xemul: Add user_ns_desc here ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To be able to dump procfs namespace entries we will need to
analyze the link name first and then figure out if the file
referred indeed a procfs entry or it's plain regular file.
This means we read link early, and to escape double reading
of same link, once we read it we remember it in fd_parms
structure which we pass to dump_one_reg_file.
Still the dump_one_reg_file is not solely used from inside
of dump_one_file, but from a number of other places (fifo,
special files) where fd_parms is filled only partially
and fill_fd_params is not even called. Thus dump_one_reg_file
must be ready for such scenario and read the link name by own
if the caller has not provided one.
To achieve all this we do
- extend fd_parms structure to have a reference on a new
fd_link structure, thus if caller already read the link
name it might assign the reference and call for dump_one_reg_file
- tune up dump_one_reg_file to fill own copy of fd_link
structure if the caller has not provied one
[ xemul: Added const to fill_fdlink arg ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
struct fd_parms is reused for two calls, so don't
forget to initialize it before dump_one_reg_file
call.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The soft-dirty API has changed slightly -- now the bit in
question _is_ in pagemap file (not pagemap2) but to see it
we have to reset soft-dirty for anyone first.
Teach the kerndat soft-dirty checker this fact. The actual
pagemap reading code already knows select pagemap/pagemap2
file itself.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
pages_scanned -- the amount of pages criu looked at for decision
pages_skipped_parent -- the amount of pages that were skipped, due to
they are present in parent image
pages_writted -- the amount of pages criu transfered into image
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I'll need to modify this header so before anything
else lets beautify it
- drop struct pstree_item declaration it's already in pstree.h
- move struct cr_options to top
- align members of struct ns_desc
- move externs to top
- add argument name to try_show_namespaces
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
With this action criu will seize tasks, grab all its memory into
page-pipes, rest dirty tracker and will then release them. Writing
the memory from page-pipes would occur after tasks are unfreezed
and thus the frozen time should become reasonably small.
When pre-dump is in action, the dirty tracking is forcedly turned
off as well as tasks are resumed afterwards, not killed, by default.
This is a prerequisite for iterative migration.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dumping memory is draining it from parasite, for pre-dump
this time would be reasonably small. _Writing_ the memory
would occur _after_ tasks unseize and resume.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For pre-dump we'll need to cure the target task(s), but keep the
parasite_ctl (for its local args mapping) for memory image write.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are not documented, thus OK for now. Two options --
* one to specify where the parent images are
* one to reset dirty memory tracking
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is required for (soon to be) pre-dump command -- this command
will have to keep parasite args with pagemap (iovecs) some time after
parasite is executed. Since we call mprotect cmd _after_ pages dump
we can thus spoil these iovecs. To address this vmas to mprotect and
iovecs to dump are located in one parasite args area one after another
without intersections.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We'll have one more "image" file generated by dump and (surprisingly)
restore commands -- the stats one. It will contain in a single pb
object all the statistics collected by dump/restore.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
pie/restorer.c: In function ‘restore_thread_common’:
pie/restorer.c:262:39: error: incompatible types when assigning to type
‘k_rtsigset_t’ from type ‘u64’
RT_SIGFRAME_UC(sigframe).uc_sigmask = args->blk_sigset;
^
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2: handle errors from setXids and securebits manipulations
handle errors of restoring creds after finishing CR_STATE_RESTORE_CREDS,
because a sigchild handler is already restored in this moment.
Only the current process is killed in a error case.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For security reason processes can be resumed only when all
credentials are restored. Otherwise someone can attach to a
process, which are not restored credentials yet and execute
some code.
https://bugzilla.openvz.org/show_bug.cgi?id=2561
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Split it into two parts -- writing pagemap and writing
pages. This is required for page-server part, sinc it
cannot provide one big pipe with all the pages at once.
Thus it needs to feed pages in portions.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>