We are going to replace pid with id in the names of image files. The id
is unique for each namespace, so it's more convenient if image files are
opened per namespace.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is preparation for replacing pid with id in image names.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently the values of all constants must be contiguous,
because cr_fdset_open is used for opening images for all namespaces.
The next patches will rework this code so that image files are opened
per namespace; then all these ugly assignments of one constant to
another will be removed.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we catch processes at the exit point from sigreturn.
If a task must be restored in the stopped state, we can send SIGSTOP
before detaching from it.
v2: add more descriptive comment about skipping SIGSTOP in ptrace.c
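
A minimal sketch of the idea, not the code from this patch. The helper
names are hypothetical, and the tracee is stopped with raise(SIGSTOP)
instead of being caught on the exit from sigreturn, only to keep the
example short:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Leave an already ptrace-stopped tracee in the stopped state after detach. */
int detach_and_stop(pid_t pid)
{
	/* The tracee sits in a ptrace stop, so SIGSTOP just stays pending. */
	if (kill(pid, SIGSTOP))
		return -1;
	/* Detach without injecting a signal; the pending SIGSTOP then stops it. */
	if (ptrace(PTRACE_DETACH, pid, NULL, NULL))
		return -1;
	return 0;
}

/* Returns 1 if the child is stopped by SIGSTOP after detach, 0 if not, -1 on error. */
int detach_and_stop_demo(void)
{
	int status, ret = -1;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL))
			_exit(1);
		raise(SIGSTOP);		/* enter a ptrace stop for the parent */
		for (;;)
			pause();
	}

	if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status) &&
	    detach_and_stop(pid) == 0 &&
	    waitpid(pid, &status, WUNTRACED) == pid)
		ret = WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP;

	kill(pid, SIGKILL);		/* clean up the demo child */
	waitpid(pid, &status, 0);
	return ret;
}
```

The order matters: the signal is queued while the task is still traced,
so it cannot run any code between the detach and the stop.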
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we take the pid and core from it, and we are going to take the
state as well.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The kernel reports leaving a syscall before it starts delivering
signals. If in doubt, please look at arch/x86/kernel/entry_64.S:
	int_ret_from_sys_call
	syscall_trace_leave
	do_notify_resume
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Everything in sk-unix.c is ready for seq-packet sockets.
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If criu and the dumpee live in the same mount namespace, there's no
need to get the ns' root from the init task. We can get it from criu and
(!) avoid the root == "/" check, which is required for the namespace case.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We cannot fail at that late stage, as everything is restored
and running. In the worst case (unmap fails) the restored task would
have one extra mapping.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We can restore a task's pgid that is not equal to its pid
only when the respective group leader is alive. To make
restore reliable we wait for all group leaders to be restored,
using a separate restore stage.
It would be better to optimize this: each task could hold a pointer
to its group leader and wait only for that task to become a leader.
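
A minimal sketch of why the ordering matters, not the restorer code.
The function name is hypothetical, and a pipe stands in for the restore
stage (the barrier mechanism here is an assumption made for the demo):

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the member joined the leader's group, 0 if not, -1 on error. */
int restore_pgid_demo(void)
{
	int ready[2], result[2], ret = -1;
	pid_t leader, member, pgid = -1;
	char c;

	if (pipe(ready) || pipe(result))
		return -1;

	leader = fork();
	if (leader < 0)
		return -1;
	if (leader == 0) {
		close(ready[0]);
		close(result[0]);
		close(result[1]);
		/* Stage 1: become a group leader, then report readiness. */
		if (setpgid(0, 0) || write(ready[1], "1", 1) != 1)
			_exit(1);
		for (;;)
			pause();
	}
	close(ready[1]);

	member = fork();
	if (member == 0) {
		/* Stage 2: join the group only once its leader exists. */
		if (read(ready[0], &c, 1) != 1 || setpgid(0, leader))
			_exit(1);
		pgid = getpgrp();
		if (write(result[1], &pgid, sizeof(pgid)) != (ssize_t)sizeof(pgid))
			_exit(1);
		_exit(0);
	}
	close(result[1]);

	if (member > 0) {
		if (read(result[0], &pgid, sizeof(pgid)) == (ssize_t)sizeof(pgid))
			ret = (pgid == leader);
		waitpid(member, NULL, 0);
	}

	kill(leader, SIGKILL);		/* clean up the demo leader */
	waitpid(leader, NULL, 0);
	close(ready[0]);
	close(result[0]);
	return ret;
}
```

Without the wait in stage 2, setpgid(0, leader) could run before the
group exists and fail.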
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If a task opens a file and then calls chroot, CRIU should still
be able to dump and restore the now-invisible file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
First of all, this should be done strictly after we've stopped accessing
files by their paths, even absolute. This place is right before going
into restorer.
And the second thing is that we want to re-use the open_fd_by_id engine,
since it handles various tricky cases of open-file-by-path. And since
there's no such thing as fchroot(int fd), we emulate it using the
/proc/self/fd/ links.
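
A minimal sketch of the emulation, assuming the new root directory has
already been opened and fd refers to it. The helper names are
hypothetical and are not the ones used in the patch:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Build the /proc/self/fd/<fd> link path; returns 0 on success, -1 if buf is too small. */
int proc_self_fd_path(int fd, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/proc/self/fd/%d", fd);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Emulate the non-existent fchroot(fd): chroot() through the /proc link. */
int fchroot_emul(int fd)
{
	char path[64];

	if (proc_self_fd_path(fd, path, sizeof(path)))
		return -1;
	/* Needs CAP_SYS_CHROOT and a mounted /proc; the kernel follows the link. */
	return chroot(path);
}
```

The link is resolved to the directory the descriptor points to, so this
works even when that directory is no longer reachable by its path.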
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We restore chroot before doing this, so if we need to open one
at that point, we may have no access to the /proc/... paths.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The register R12 has a special meaning when syscalls are hooked
with ptrace() on ARM, which results in corruption of the dumpee's
context when the injected blob is unmapped. Note that this patch
doesn't solve the problem entirely, since the compiler may clobber
the register before issuing the call to sys_munmap(); however,
we assume that a sufficiently decent compiler doesn't.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Tested-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I don't know of any case where accept() fails once and then goes back
to working normally.
Cc: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This vma looks like VSYSCALL on x86. We don't need to dump and restore it.
Currently this vma is dumped and restored as a private vma, but it is not
remapped to the correct place:
Restore
--- dump/pipe00/6392/1/dump.maps 2013-09-23 12:49:19.436816192 +0000
+++ dump/pipe00/6392/1/restore.maps 2013-09-23 12:49:20.276766356 +0000
@@ -6,5 +6,6 @@ e05000-e26000
4009d000-4009f000
400a0000-400aa000
400b8000-401e7000
+b6d6f000-b6d70000
be838000-be859000
ffff0000-ffff1000
ERROR: Sets of mappings differ:
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch detects the race where a signal handler could be executed
during restore.
More details are in: 5d18eca restorer: Block signals early
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some VMA-s can be merged on restore. For example, if a process maps
VMA1, then VMA2, and then VMA3 between the previous ones:
|VMA1|VMA3|VMA2|
VMA3 will be merged only with VMA1, but all three VMA-s will be
merged on restore, because they are mmapped in address order: VMA1,
VMA3, VMA2.
Because of this, we have a small script for merging contiguous VMA-s.
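
A minimal sketch of such a merge, written in C for illustration only.
The struct and function names are hypothetical, and only address ranges
are compared, which is an assumption about what the script needs:

```c
#include <assert.h>
#include <stddef.h>

struct vma_range {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
};

/*
 * Merge adjacent ranges of an array sorted by start address, in place.
 * Returns the number of ranges left after merging.
 */
size_t merge_contiguous(struct vma_range *v, size_t n)
{
	size_t i, out = 0;

	if (n == 0)
		return 0;
	assert(v != NULL);

	for (i = 1; i < n; i++) {
		if (v[i].start == v[out].end)
			v[out].end = v[i].end;	/* touches the previous one: extend it */
		else
			v[++out] = v[i];	/* gap: start a new merged range */
	}
	return out + 1;
}
```

Applying the same merge to the maps taken before dump and after restore
makes them comparable no matter how the kernel split the areas.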
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
All process VMA-s are in the "premapped area". All restorer stuff is in
the "bootstrap area", so we have two areas.
So we don't need to unmap extra VMA-s one by one. We can call munmap
three times: for the region before the first area, for the hole between
the areas, and for the region after the second area.
The old scheme didn't work, because the list of VMA-s can change
after it has been collected. This can be due to memory allocations by
libc or due to stack growth.
v2: improve readability at the expense of beauty
v3: print return code of munmap in error messages
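
A minimal sketch of the three-call scheme, not the restorer code. The
struct names, function names and the task_size parameter are
hypothetical:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

struct area {
	unsigned long start;
	unsigned long len;
};

/*
 * Compute the three regions around two non-overlapping areas inside
 * [0, task_size): before the lower area, between the areas, and after
 * the upper one. Returns 0 on success, -1 if the layout is invalid.
 */
int regions_to_unmap(struct area a, struct area b, unsigned long task_size,
		     struct area out[3])
{
	struct area lo = a.start <= b.start ? a : b;
	struct area hi = a.start <= b.start ? b : a;

	if (lo.start + lo.len > hi.start || hi.start + hi.len > task_size)
		return -1;

	out[0].start = 0;
	out[0].len = lo.start;
	out[1].start = lo.start + lo.len;
	out[1].len = hi.start - out[1].start;
	out[2].start = hi.start + hi.len;
	out[2].len = task_size - out[2].start;
	return 0;
}

/*
 * Unmap the three regions. Only safe when the running code, stack and
 * data all live inside the two areas, as they do in the restorer.
 */
int unmap_regions(const struct area r[3])
{
	int i;

	for (i = 0; i < 3; i++)
		if (r[i].len && munmap((void *)r[i].start, r[i].len))
			return -1;
	return 0;
}
```

Since whole ranges are unmapped, it does not matter which VMA-s appeared
or grew in them after the list was collected.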
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch adds a new parasite command, which unmaps the parasite blob.
This command never returns and the criu process traps the target process
on the exit from the munmap syscall.
v2: rename the function for unmapping a parasite blob so that it does
not clash with criu's functions.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch adds a function for removing the restorer blob. This function
never returns and the process must be trapped on the exit from the
munmap syscall.
v2: * release the parasite_ctl structure and use the new interface of
parasite_prep_ctl
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A task is stopped here for unmapping the restorer blob and restoring its
state. The method is the same as for the parasite. CRIU attaches to
processes via ptrace and starts tracing all syscalls.
v2: don't use a software breakpoint
v3: stop all threads on the exit from sigreturn
v4: attach to each thread
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
parse_thread allocated a buffer for threads and then initialized the
real pid for each thread.
Now we want to use it on restore, and at that point we already have
a buffer with initialized virt pid-s, so we need to initialize the real
pid-s only.
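
A minimal sketch of filling only the real pid-s of an existing buffer.
The struct and function names are hypothetical, and matching threads to
entries by the order in /proc/<pid>/task is an assumption of the sketch:

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

struct thread_pid {
	pid_t real;	/* pid as seen by criu */
	pid_t virt;	/* pid as seen inside the restored pid namespace */
};

/*
 * Fill .real for nr pre-allocated entries from /proc/<pid>/task and
 * leave .virt untouched. Returns nr on success, -1 on error or if the
 * number of threads does not match.
 */
int fill_real_pids(pid_t pid, struct thread_pid *t, int nr)
{
	char path[64];
	struct dirent *de;
	DIR *dir;
	int n = 0;

	snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
	dir = opendir(path);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		if (de->d_name[0] == '.')
			continue;	/* skip "." and ".." */
		if (n == nr) {
			n = -1;		/* more threads than entries */
			break;
		}
		t[n++].real = atoi(de->d_name);
	}
	closedir(dir);

	return n == nr ? n : -1;
}
```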
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used in cr-restore.c for stopping threads on the exit from
sigreturn.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The munmap syscall must be executed from the process's memory. The code
could be injected into memory and then removed, but we can avoid all
these actions if the code is in the blob and the process is trapped
on the exit from the munmap syscall.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>