Based on work done by Cyrill Corcunov (many thanks for that).
In this commit we implement c/r for files which have opened
/proc/$pid/ns/$ids entries.
The idea is rather simple one
Checkpoint
==========
- Check if the file name is the one of known to be ns ref
- If match then write protobuf entry
Restore
=======
- Read all ns entries from the image
- When criu tries to open one we lookup over process
tree to figure out which PID should be used in path
and then just open it
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
RT_SIGFRAME_UC(sigframe).uc_sigmask.sig = args->blk_sigset;
blk_sigset is u64, but uc_sigmask.sig has type ulong [2], so
only a part of mask is restore here.
This patch reworks restoring of blocking masks symmetrically to dumping
these masks.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Tested-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2: handle errors from setXids and securebits manipulations
handle errors of restoring creds after finishing CR_STATE_RESTORE_CREDS,
because a sigchild handler is already restored in this moment.
Only the current process is killed in a error case.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For security reason processes can be resumed only when all
credentials are restored. Otherwise someone can attach to a
process, which are not restored credentials yet and execute
some code.
https://bugzilla.openvz.org/show_bug.cgi?id=2561
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is less useful than fixing typos in output messages, but anyway.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now we have 2 forms of storing pages -- legacy pages.img and
new pagemap + pages image. We'll have one more (ovz) and the
pagemap + pages will be stacked (snapshot restore). Thus it's
handy to have this as an page-reader object.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Information about mount points is used for dumping fanotify.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 996203 (#1 of 1): Resource leak (RESOURCE_LEAK)
15. leaked_storage: Variable "vma" going out of scope leaks the storage it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
cr-restore.c:1795:2: warning: Value stored to 'restore_task_vma_len' is
never read
restore_task_vma_len = 0;
^ ~
cr-restore.c:1796:2: warning: Value stored to 'restore_thread_vma_len'
is never read
restore_thread_vma_len = 0;
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
So when we fail print error thus a user would know
where exactly it failed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently dump silently terminates and restore emits some
meaning-less messages in either case. Make these important
messages more informative.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some exit()'s are called with exit(-1), some
are with exit(1). Use exit(1) everywhere for
consistency.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
After reworkring the way pagemap is stored the backward compatibility
was not preserved for patches simplicity. Time to return it back.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Each zombie sends SIGCHLD to parent. crtools restores all pending
signals, so all other signals should be collected.
Here is a problems, that signals SIGCHLD can be merged, but crtools
should be sure, that all signals are collected.
For that a zombie locks a global zombie_lock, which is released by
parent.
This operation should be done between CR_STATE_RESTORE and
CR_STATE_RESTORE_SIGCHLD.
Here is one more CR_STATE_RESTORE_ZOMBIES, whic is used for waiting all
zombies.
v2: clean up
v3: rework synchronization
v4: rework without additional CR_STATE-s
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Read siginfo-s from images and send them to itself by sigqueueinfo.
Thread signals cannot be restored in restore_thread_common, because
it blocks SIGCHLD, which used for error detecting.
v2: Don't remap task_args and thread_args
v3: fix error handling
v4: cosmetic clean up
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if there image corruption and page entry addres
is invalid -- exit out gracefully instead of BUG_ON hammer.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
"Can't fixup VMA's fd" is more understandable than plain
"Can't fixup fd".
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Use decode_pointer() to convert a virtual address into a native pointer.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Actually rt_sigset_t and k_rtsigset_t are the same
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise ppage_bitmap and page_bitmap will be updated for wrong VMA
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since now we drain pages out of parasite, we can invent any format for
page dumps. Let is be ... prorobuf one! :)
Another thing to keep in mind, is that we're about to use splices and
implement iterative migration, so it's better to have actual pages be
page-aligned in the image.
And -- backward compatibility. That said the new format is:
1. pagemap-... file which contains a header (currently with a ID of
the image with pages, see below) and an array of <nr_pages:vaddr>
pairs. The first value means "how many pages to take from the
file with pages (see below)" and the second -- where in the task
address space to put them. Simple.
2. pages-... file which containes only pages one by one (thus aligned
as we want).
This patch breaks backward compatibility (old images with pages wil
be restored and then crash). Need to do it before v0.5 release.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I will have to push some sort of map of pages to dump into parasite.
For this, I need to have estimation of how much memory I'd need for
than in parasite args. These two values will help with it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The FPU data is quite CPU-type oriented,
thus move it to asm/fpu.h.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The section 5.2.1.2 of the AAPCS says that the stack must be 8-byte aligned
and this rule is broken when the thread restore_task_with_children()
is forked by the function fork_with_pid() since the variable ca
and its field stack are likely to be 4-byte aligned.
This patch forces 8-byte alingment of the field cr_clone_arg::stack.
This made the following tests pass on ARM:
* static/shm,
* static/ipc_namespace.
Particulary the unaligned stack results in incorrect passing
of the 64-bit argument to the function snprintf() in the function
sysctl_write_u64().
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we've read all pstree-items and their ids we
can get the desired clone-flags early and avoid all
these dances with flag calculations in fork_with_pid
and company.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's no longer required to use this option -- two currently
supported cases (tasks on host and tasks in containers) can
be detected automatically. Keep this option for future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These functions are designated to convert a native pointer
to uint64_t used to store a virtual address in protobuf messages
and vice versa in a machine-independent way.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The patch reverts the commit 58064d9b723bd5a5e5910ed752fb3b19cc962fa8.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
According to the file lock information from the image, we recall
flock or fcntl with proper parameters, so we can rehold the file
locks as we were dumped.
We only support flock and posix file locks so far.
Changelog since the initial version:
a. Use prepare_file_locks instead of restore function directly.
b. Fix some bugs.
Originally-signed-off-by: Zheng Gu <cengku.gu@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch replaces the format specifier %ld with PRIx64
in the following places:
* the format string argument of the functions scanf() and printf(),
* in the macros GEN_SYSCTL_*_FUNC.
We need explicit specification of the integer size there.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This makes it possible to use TASK_SIZE instead of TASK_SIZE_MAX.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Restore of netns uses the file descriptor on the root netns. This
descriptor is opened before forking.
We can add one more service fd, but I think this approach is better.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will be handling both inotify and fanotify
objects here thus to make less confusion rename
the files to fsnotify.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>