v2: handle errors from setXids and securebits manipulations
Handle errors of restoring creds after finishing CR_STATE_RESTORE_CREDS,
because the SIGCHLD handler is already restored at this moment.
Only the current process is killed in case of an error.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For security reasons, processes can be resumed only when all
credentials are restored. Otherwise someone could attach to a
process whose credentials are not restored yet and execute
some code.
https://bugzilla.openvz.org/show_bug.cgi?id=2561
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is less useful than fixing typos in output messages, but anyway.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now we have 2 forms of storing pages -- legacy pages.img and
new pagemap + pages image. We'll have one more (ovz) and the
pagemap + pages will be stacked (snapshot restore). Thus it's
handy to have this as a page-reader object.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Information about mount points is used for dumping fanotify.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 996203 (#1 of 1): Resource leak (RESOURCE_LEAK)
15. leaked_storage: Variable "vma" going out of scope leaks the storage it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
cr-restore.c:1795:2: warning: Value stored to 'restore_task_vma_len' is
never read
restore_task_vma_len = 0;
^ ~
cr-restore.c:1796:2: warning: Value stored to 'restore_thread_vma_len'
is never read
restore_thread_vma_len = 0;
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
So when we fail, print an error so that the user knows
exactly where it failed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently dump silently terminates and restore emits some
meaningless messages in either case. Make these important
messages more informative.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some exit() calls use exit(-1), others
use exit(1). Use exit(1) everywhere for
consistency.
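
For illustration: the exit status seen by a parent is limited to the low 8 bits, so exit(-1) is observed as 255, not -1. A minimal sketch, where the helper name child_exit_status is made up for this example and is not part of crtools:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with `code` and return the status the
 * parent observes. Returns -1 if fork/waitpid fails or the child
 * did not exit normally.
 */
int child_exit_status(int code)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: only the low 8 bits reach the parent */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With this helper, a child exiting with 1 is reported as 1, while a child exiting with -1 is reported as 255.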
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
After reworking the way pagemap is stored, backward compatibility
was not preserved for the sake of patch simplicity. Time to bring it back.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Each zombie sends SIGCHLD to its parent. crtools restores all pending
signals, so all other signals should be collected.
The problem is that SIGCHLD signals can be merged, but crtools
must be sure that all of them are collected.
For that, a zombie takes a global zombie_lock, which is released by
the parent.
This operation should be done between CR_STATE_RESTORE and
CR_STATE_RESTORE_SIGCHLD.
There is one more state, CR_STATE_RESTORE_ZOMBIES, which is used for
waiting for all zombies.
v2: clean up
v3: rework synchronization
v4: rework without additional CR_STATE-s
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Read siginfo-s from the images and send them to the task itself via sigqueueinfo.
Thread signals cannot be restored in restore_thread_common, because
it blocks SIGCHLD, which is used for error detection.
v2: Don't remap task_args and thread_args
v3: fix error handling
v4: cosmetic clean up
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If the image is corrupted and a page entry address
is invalid, exit gracefully instead of using the BUG_ON hammer.
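
A minimal sketch of this kind of check, assuming a simplified VMA description. The struct vma_range type and the check_page_addr helper are hypothetical names for this example, not the ones used in crtools:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified VMA descriptor covering [start, end). */
struct vma_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return 0 if the whole page at `vaddr` lies inside one of the `nr`
 * ranges. Otherwise print an error and return -1, so that the caller
 * can unwind instead of crashing on a BUG_ON-style assertion.
 */
int check_page_addr(const struct vma_range *vmas, size_t nr,
		    uint64_t vaddr, uint64_t page_size)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (vaddr >= vmas[i].start && vaddr < vmas[i].end &&
		    vmas[i].end - vaddr >= page_size)
			return 0;

	fprintf(stderr, "Error: page entry address %#llx is outside of any VMA\n",
		(unsigned long long)vaddr);
	return -1;
}
```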
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
"Can't fixup VMA's fd" is more understandable than plain
"Can't fixup fd".
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Use decode_pointer() to convert a virtual address into a native pointer.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Actually rt_sigset_t and k_rtsigset_t are the same
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise ppage_bitmap and page_bitmap will be updated for the wrong VMA
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since we now drain pages out of the parasite, we can invent any format for
page dumps. Let it be ... a protobuf one! :)
Another thing to keep in mind, is that we're about to use splices and
implement iterative migration, so it's better to have actual pages be
page-aligned in the image.
And -- backward compatibility. That said the new format is:
1. pagemap-... file which contains a header (currently with an ID of
   the image with pages, see below) and an array of <nr_pages:vaddr>
   pairs. The first value means "how many pages to take from the
   file with pages (see below)" and the second -- where in the task
   address space to put them. Simple.
2. pages-... file which contains only pages one by one (thus aligned
   as we want).
This patch breaks backward compatibility (old images with pages will
be restored and then crash). Need to do it before v0.5 release.
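
To make the layout concrete, here is a sketch of how a reader could locate a page in the pages file from the <nr_pages:vaddr> pairs. The struct layout, the 4096-byte page size and the function name are assumptions for illustration. The real entries are protobuf messages, and the header is ignored here:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL	/* assumed page size */

/* One <nr_pages:vaddr> pair from the pagemap file (illustrative layout). */
struct pagemap_pair {
	uint64_t vaddr;
	uint32_t nr_pages;
};

/*
 * Find where the page holding `addr` sits in the pages file.
 * Pages are stored back to back in pagemap order, so the offset is
 * (pages in all preceding pairs + page index inside the matching
 * pair) * page size. Returns -1 if `addr` is not covered by any pair.
 */
int64_t pages_file_offset(const struct pagemap_pair *map, size_t nr, uint64_t addr)
{
	uint64_t skipped = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		uint64_t len = (uint64_t)map[i].nr_pages * PAGE_SIZE_BYTES;

		if (addr >= map[i].vaddr && addr - map[i].vaddr < len) {
			uint64_t idx = (addr - map[i].vaddr) / PAGE_SIZE_BYTES;

			return (int64_t)((skipped + idx) * PAGE_SIZE_BYTES);
		}
		skipped += map[i].nr_pages;
	}
	return -1;
}
```

Because every offset is a multiple of the page size, the data in the pages file stays page-aligned, which is what the splice-based plans above rely on.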
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I will have to push some sort of map of pages to dump into the parasite.
For this, I need an estimate of how much memory I'd need for
that in the parasite args. These two values will help with it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.
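
A minimal sketch of such an object: a list head bundled with the element count, so that callers do not have to track the count separately. All names here are hypothetical and the real structure may hold different fields:

```c
#include <stddef.h>

/* Hypothetical list node describing one VMA. */
struct vma_item {
	unsigned long start, end;
	struct vma_item *next;
};

/* The list object: the head, a tail pointer for O(1) append, and the count. */
struct vma_list_sketch {
	struct vma_item *head;
	struct vma_item **tail;
	unsigned int nr;
};

void vma_list_init(struct vma_list_sketch *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->nr = 0;
}

/* Append `item` and keep the element count in sync. */
void vma_list_add(struct vma_list_sketch *l, struct vma_item *item)
{
	item->next = NULL;
	*l->tail = item;
	l->tail = &item->next;
	l->nr++;
}
```

Further per-list data can later be added as new fields without changing the callers.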
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The FPU data is quite CPU-type oriented,
thus move it to asm/fpu.h.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The section 5.2.1.2 of the AAPCS says that the stack must be 8-byte aligned
and this rule is broken when the thread restore_task_with_children()
is forked by the function fork_with_pid() since the variable ca
and its field stack are likely to be 4-byte aligned.
This patch forces 8-byte alignment of the field cr_clone_arg::stack.
This made the following tests pass on ARM:
* static/shm,
* static/ipc_namespace.
In particular, the unaligned stack results in incorrect passing
of the 64-bit argument to the function snprintf() in the function
sysctl_write_u64().
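
One way to force such alignment is the GCC/Clang aligned attribute on the field. The sketch below uses a stand-in structure; only the field name stack comes from the text, and whether the patch uses this exact attribute is an assumption:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for cr_clone_arg. */
struct clone_arg_sketch {
	char pad;	/* a preceding field that would leave `stack` misaligned */
	char stack[128] __attribute__((aligned(8)));	/* forced to an 8-byte boundary */
};

/*
 * The stack grows down, so the child starts at the end of the buffer.
 * The buffer size is a multiple of 8, so the top is 8-byte aligned too.
 */
static inline void *clone_stack_top(struct clone_arg_sketch *ca)
{
	return ca->stack + sizeof(ca->stack);
}
```

The attribute also raises the alignment of the whole structure to 8, so a local variable of this type is placed accordingly.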
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we've read all pstree-items and their ids we
can get the desired clone-flags early and avoid all
these dances with flag calculations in fork_with_pid
and company.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's no longer required to use this option -- two currently
supported cases (tasks on host and tasks in containers) can
be detected automatically. Keep this option for the future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These functions are designated to convert a native pointer
to uint64_t used to store a virtual address in protobuf messages
and vice versa in a machine-independent way.
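
A sketch of what such a pair of helpers could look like. decode_pointer() is named elsewhere in this log, while encode_pointer() is a name assumed here for its counterpart, and the bodies are an illustration rather than the actual implementation:

```c
#include <stddef.h>
#include <stdint.h>

/* Native pointer -> 64-bit value suitable for a protobuf uint64 field. */
static inline uint64_t encode_pointer(void *p)
{
	/* Going through uintptr_t avoids size-mismatch warnings on 32-bit targets. */
	return (uint64_t)(uintptr_t)p;
}

/* 64-bit value from an image -> native pointer. */
static inline void *decode_pointer(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```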
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The patch reverts the commit 58064d9b723bd5a5e5910ed752fb3b19cc962fa8.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
According to the file lock information from the image, we call
flock or fcntl again with the proper parameters, so that the file
locks are held again as they were at dump time.
We only support flock and posix file locks so far.
Changelog since the initial version:
a. Use prepare_file_locks instead of restore function directly.
b. Fix some bugs.
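
For illustration, re-taking one lock could look roughly like this. The enum and the function name are hypothetical, and the parameters stand in for whatever the image actually records:

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

enum lock_kind { LOCK_KIND_FLOCK, LOCK_KIND_POSIX };	/* hypothetical tags */

/*
 * Re-take a lock on `fd`.
 * For flock locks `type` is LOCK_SH or LOCK_EX. For POSIX locks it is
 * F_RDLCK or F_WRLCK and [start, start + len) is the locked byte range
 * (len == 0 means "up to the end of the file").
 * Returns 0 on success, -1 on error.
 */
int restore_one_lock(int fd, enum lock_kind kind, int type, off_t start, off_t len)
{
	struct flock fl = {
		.l_type = type,
		.l_whence = SEEK_SET,
		.l_start = start,
		.l_len = len,
	};

	if (kind == LOCK_KIND_FLOCK)
		return flock(fd, type | LOCK_NB);	/* do not block during restore */

	return fcntl(fd, F_SETLK, &fl);
}
```

Both calls are non-blocking, so a conflicting lock shows up as an error rather than a hung restore.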
Originally-signed-off-by: Zheng Gu <cengku.gu@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch replaces the format specifier %ld with PRIx64
in the following places:
* the format string argument of the functions scanf() and printf(),
* in the macros GEN_SYSCTL_*_FUNC.
We need explicit specification of the integer size there.
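
A short sketch of the pattern, with helper names made up for this example. Note that <inttypes.h> provides SCNx64 as the scanf-side counterpart of PRIx64; which of the two the patch uses for scanf() is not verified here:

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Print a 64-bit value in hex. PRIx64 expands to the correct length
 * modifier on both 32-bit and 64-bit targets, unlike a bare %ld/%lx.
 */
int format_u64_hex(char *buf, size_t size, uint64_t val)
{
	return snprintf(buf, size, "%" PRIx64, val);
}

/* Parse a hex string back into a 64-bit value. Returns 0 on success. */
int parse_u64_hex(const char *str, uint64_t *val)
{
	return sscanf(str, "%" SCNx64, val) == 1 ? 0 : -1;
}
```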
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This makes it possible to use TASK_SIZE instead of TASK_SIZE_MAX.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Restore of netns uses the file descriptor on the root netns. This
descriptor is opened before forking.
We can add one more service fd, but I think this approach is better.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will be handling both inotify and fanotify
objects here, so to reduce confusion rename
the files to fsnotify.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is the merge and a slight rework (no TI_SP macro) of Alexander's patches
about the subj.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch is intended to reduce the usage of the macro P() since
integer size mismatches sometimes may be fixed by this type generalization.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently crtools supports the case when a child shares an fd table
with its parent.
There are only two interesting things here.
* Service descriptors should be cloned for each process
  that shares one fd table.
* One task should restore files and the other tasks should sleep in the
  meantime.
v2: * allocate fdt_lock from shared memory
* don't wait a child, if it doesn't share fdtable
v3: * don't move ids on the pstree image
v4: * save ids in a separate image
* save fdinfo per id instead of pid
v5: fix alignment of service_fd_id
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A few processes can share one fd table. Each process has its own set of
service file descriptors and knows nothing about the service fds
of other processes. So if two processes share one fd table,
close_old_fds will close the service descriptors of the other process.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It is read together with pstree items for checking what kind of
resources should be shared. Core is too big for reading it in
this place.
v2: fix check_core
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dump the limits with the "new" prlimit syscall that works on an arbitrary pid.
Restore is done in restorer _after_ mappings mixup and _before_
caps drop to make it set any max value.
The RLIM_INFINITY is handled explicitly to help future 64<->32
bits migration.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The routine sigreturn_prep_xsave_frame() is renamed to sigreturn_prep_fpu_frame().
* Moved the routines sigreturn_prep_fpu_frame(), show_rt_xsave_frame(), and
valid_xsave_frame() to the file crtools.c.
* Introduced the structure fpu_state_t to pass the FPU state to the restorer
in a machine-independent way.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>