mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 13:28:27 +00:00

462 Commits

Author SHA1 Message Date
Andrey Vagin
caf875454f signal: fix logic about SIGMAX (v2)
A value of signo is in [1, SIGMAX].
Currently signals are enumerated from 1 to SIGMAX, but SIGMAX
is not included. This patch fixes this mistake.

v2: * save backward compatibility
    * set a correct value of SIGMAX = 64. It can not be in a
    separate patch, because a format is changed again.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-04 19:26:54 +04:00
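
The off-by-one described in caf875454f can be illustrated with a minimal sketch. The helper and callback names below are hypothetical and not taken from the CRIU sources; only the value SIGMAX = 64 and the range [1, SIGMAX] come from the commit message.

```c
/* Assumption: SIGMAX is 64, as stated in the commit message. */
#define SIGMAX 64

/*
 * Hypothetical helper: call fn(signo, arg) for every valid signal number.
 * The upper bound is inclusive because signo lies in [1, SIGMAX].
 * Returns the number of signals visited, or -1 if fn reports an error.
 */
static int for_each_signo(int (*fn)(int signo, void *arg), void *arg)
{
	int signo, n = 0;

	for (signo = 1; signo <= SIGMAX; signo++) {
		if (fn(signo, arg) < 0)
			return -1;
		n++;
	}
	return n;
}

/* Example callback: remember the last signal number seen. */
static int record_last(int signo, void *arg)
{
	*(int *)arg = signo;
	return 0;
}
```

With `signo < SIGMAX` as the loop condition the walk would stop at 63 and signal 64 would be skipped, which is the kind of mistake the message describes.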
Pavel Emelyanov
360c50d429 rst: Relax nr_in_progress set in stage switching
When resetting nr_in_progress for the next stage there is no need
to wake anyone up. Nobody waits for it yet :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-04 17:40:19 +03:00
Pavel Emelyanov
7eaad99ff4 rst: Helper for switching restore stages
Switching to a new stage is a 4-step procedure which
deserves its own helper. Besides, now the information
about how many tasks participate in each stage is
collected in one place.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-04 17:37:13 +03:00
Pavel Emelyanov
e0d0dc821d rst: Rename task_entries->nr to ->nr_threads
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-04 17:22:45 +03:00
Pavel Emelyanov
2482c5a24b rst: Move initial nr_in_progress initialization
It's better to init it closer to the rest of rst orchestration.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-04 17:12:37 +03:00
Pavel Emelyanov
70af6cdd62 rst: Helper for restore stage barrier
When finishing a stage we have to report this (decrement the
number of tasks in the stage) and wait for the stage switch. Write
a helper that does both.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-04 16:59:41 +03:00
Pavel Emelyanov
03d0758df3 Revert "net: Introduce netdev index to name resolver"
This reverts commit ef3771d566dacb8ee9fe71b744d56f08674fe3db.
With the new API for getting SO_BINDTODEVICE it's not required.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-28 20:27:16 +03:00
Pavel Emelyanov
9df1786aea rst: Move and rewrite comment about how restorer blob is prepared
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-27 22:16:00 +03:00
Pavel Emelyanov
14915d01fd rst: Remove unneeded core file 2nd opening
Presumably it was lost while reworking core entry restore.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-27 22:03:36 +03:00
Pavel Emelyanov
827c633b26 rst: Write pidfile in separate fn
Just for better readability.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-27 21:59:58 +03:00
Andrey Vagin
13a7498c2a crtools: add EOL to error messages
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-23 16:43:33 +04:00
Andrey Vagin
68d5cb63ae restore: add statistics about restored pages
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:34 +04:00
Andrey Vagin
5d712b7430 cr-restore: remove unshared pages from inherited private mappings (v2)
A parent process can change a few pages after forking a child and
all these pages should not be available to the child.

Each vma has a bitmap of existent pages. Parent's and child's bitmaps
can be compared and all pages which are not present in a child bitmap
are dropped.

v2: don't check page_bitmap on NULL

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:33 +04:00
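
The bitmap comparison described in 5d712b7430 can be sketched as follows. This is an illustration under assumptions, not the CRIU code: the bitmap layout (one bit per page, packed into unsigned longs) and the names `test_bit` and `pages_to_drop` are hypothetical, and what "dropping" a page means is left to the caller.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if bit nr is set in the bitmap, 0 otherwise. */
static int test_bit(const unsigned long *map, size_t nr)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/*
 * Hypothetical helper: collect the indices of pages that exist in the
 * parent's bitmap but not in the child's. These are the pages the child
 * inherited through fork but must not keep.
 * out must have room for nr_pages entries; returns how many were written.
 */
static size_t pages_to_drop(const unsigned long *parent,
			    const unsigned long *child,
			    size_t nr_pages, size_t *out)
{
	size_t i, n = 0;

	for (i = 0; i < nr_pages; i++)
		if (test_bit(parent, i) && !test_bit(child, i))
			out[n++] = i;
	return n;
}
```

For a parent bitmap with pages 0, 2 and 3 set and a child bitmap with pages 0 and 2 set, only page 3 is reported.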
Andrey Vagin
52247b12cf restorer: don't need to restore pages content in restorer
All memory content is restored before entering restorer.c.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:31 +04:00
Andrey Vagin
949af72bd5 restore: restore content of private mappings before forking children
It's required for restoring copy-on-write regions.

Similar code will be removed from restorer.c.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:30 +04:00
Andrey Vagin
d02844c8fb restore: use a new scheme for restoring of file private mappings
With this patch vma->shmid contains a file id before mapping a region,
then it contains a temporary address.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:28 +04:00
Andrey Vagin
71f3f7e67b restorer: remap private vmas to correct places (v3)
All private vmas are placed in a premmapped region and
they are sorted by start addresses, so they should be shifted apart.

There is one more problem with overlapping temporary and target regions:
mremap cannot remap such cases directly, so in such cases a vma is
remapped away and then remapped to its target place.

v2: fix according to Pavel's comments
v3: add a huge comment with pictures

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:27 +04:00
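
The overlap case described in 71f3f7e67b can be sketched as a small decision function. mremap(2) with MREMAP_FIXED fails with EINVAL when the old and new ranges overlap, so an overlapping move needs an intermediate hop. The function names below are hypothetical and the sketch only decides how many remaps are needed; it does not call mremap itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if [a, a + len) and [b, b + len) share at least one byte. */
static int ranges_overlap(uintptr_t a, uintptr_t b, size_t len)
{
	return a < b + len && b < a + len;
}

/*
 * Hypothetical planner: how many mremap calls are needed to move a vma
 * of the given length from its temporary address to its target address.
 *   0 - already in place
 *   1 - ranges are disjoint, a direct remap works
 *   2 - ranges overlap, remap away first, then onto the target
 */
static int remap_steps(uintptr_t tmp, uintptr_t target, size_t len)
{
	if (tmp == target)
		return 0;
	return ranges_overlap(tmp, target, len) ? 2 : 1;
}
```

Moving a 0x2000-byte vma from 0x1000 to 0x2000 overlaps and takes two steps, while moving it to 0x3000 can be done directly.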
Andrey Vagin
f3e322a10f restore: don't unmap premmapped private vma-s (v2)
Private vma-s are mapped before forking children, then they are
remapped to correct places in restorer.c.

In restorer all unneeded vma-s are unmapped. VMA-s from premmapped
regions should not be unmapped.

v2: replace guard pages with arithmetic in restorer

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:25 +04:00
Andrey Vagin
1a9d87de10 restore: map private vma-s before forking children (v3)
In this case private vma-s will be inherited by children,
which allows restoring copy-on-write regions.

This code compares child and parent vma lists. If it finds
two vma-s with the same start and end addresses, it decides
that the child inherits this vma from the parent.

This code calculates the size of all private vma-s, then allocates
a memory region for all vma-s and maps them one by one. If a vma is
inherited it will be remapped to an allocated place.

As a result all vma-s will be placed in a contiguous memory region
and sorted by start addresses. This logic will be used for remapping
vma-s to correct addresses.

v2: fix according to Pavel's comments ( clean up and simplify )
v3: simplify code and check that VMA-s are sorted

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:23 +04:00
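
The inheritance check and size calculation described in 1a9d87de10 can be sketched as below. The `struct vma` here is a simplified, hypothetical stand-in with only start and end addresses, and both helper names are invented for illustration; the real CRIU structures carry much more state.

```c
#include <stddef.h>

/* Hypothetical, simplified vma: just the [start, end) address range. */
struct vma {
	unsigned long start;
	unsigned long end;
};

/*
 * A child vma is treated as inherited if the parent has a vma with
 * exactly the same start and end addresses.
 */
static int vma_inherited(const struct vma *parent, size_t nr_parent,
			 const struct vma *child)
{
	size_t i;

	for (i = 0; i < nr_parent; i++)
		if (parent[i].start == child->start &&
		    parent[i].end == child->end)
			return 1;
	return 0;
}

/* Total length of all vma-s: the size of the region to allocate. */
static unsigned long vmas_total_len(const struct vma *v, size_t nr)
{
	unsigned long len = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		len += v[i].end - v[i].start;
	return len;
}
```

A child vma that matches the parent only in its start address is not considered inherited, because both boundaries have to agree.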
Andrey Vagin
9f9d0e05b1 restore: collect vma-s before creating children (v3)
Private vma-s should be inherited by children for
restoring copy-on-write regions.

v2: free parent's vma-s in this patch.
v3: split patch into two parts

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:39:09 +04:00
Andrey Vagin
3265078877 restore: Make target vmas list global
Makes further patching simpler

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:39:08 +04:00
Andrey Vagin
11ed3531a1 restore: release all previous entries from the vma list
Those will be inherited from parent. Before this patch this list was
always empty, but it will change soon.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:39:06 +04:00
Andrey Vagin
c430e2ee6b restore: don't worry if a vma image file is absent
read_vmas will be called for zombies

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:39:05 +04:00
Andrey Vagin
d5bc93e68b restore: don't add unneeded vma with zero start and end addresses
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:39:03 +04:00
Andrey Vagin
ec583c7408 restore: split read_and_open_vmas into parts read_vmas and open_vmas (v2)
read_vmas will be called before forking children to restore
copy-on-write memory.

v2: don't open an image one more time

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:39:02 +04:00
Pavel Emelyanov
f86bbe6a9c restore: Introduce a macro to get restorer symbol address
This makes code more readable, saves one ptr on stack and
lets us jump into restorer code using tags.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-13 20:15:13 +03:00
Cyrill Gorcunov
2ee5a42f3e restore: Add restoration of the blocked threads signals from the image
To unify the code for both thread leader and regular threads
we move blocked signals for thread leader into threads argument
area and use restore_thread_common() helper.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-12 17:43:01 +04:00
Cyrill Gorcunov
9a5b427470 restore: check_core -- Add missing test for thread_info in non-zombie task
Otherwise we might get nil dereference in sigreturn restore.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-12 12:35:03 +04:00
Cyrill Gorcunov
475aa87225 restore: check_core -- Move core->ids check under separate if() statement
We will need to extend non-zombie tests.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-12 12:34:56 +04:00
Pavel Emelyanov
d4735a22fa packet: Support mmap-ing of packet sockets
Three parts.

Proc: open of map_files' link doesn't work on sockets. We fstatat
it and check that it's a socket (it will be packet), then save
the socket inode on vma_area.

Dump: we resolve socket inode to socket id and save it on vma.
We use id, not inode, since on restore we'll have to mmap some
opened file, not just abstract socket with inode.

Restore: when reading vma-s we just need to find out on what fd
the respective packet socket is opened (i.e. -- no map-and-close
sockets supported by now) and dup() it to let restorer mmap it
back.

All this makes it possible to c/r the tcpdump tool!

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-02 16:00:18 +03:00
Pavel Emelyanov
a385c6fa8d rst: Print more debug when pre-opening vma-s
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-02 15:59:20 +03:00
Huang Qiang
f33df79e1e cr-restore: fix to print correct length of bootstrap
The length of bootstrap in the print is old and wrong, we need to fix
it and unify the length variable.

Signed-off-by: Huang Qiang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-01 14:44:14 +04:00
Pavel Emelyanov
2b9d87fe6b rst: Fix creds vs threads restoration
Writing to last_pid sysctl is CAP_SYS_ADMIN protected. Thus restoring
creds before it won't work in all the cases.

Fix this by making all threads restore creds themselves, and the
thread group leader -- after all of them.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrey Vagin <avagin@parallels.com>
2012-10-30 10:04:37 +03:00
Cyrill Gorcunov
b1f1154c8a auxv: Use real size of the auxv vector
The size of vector depends on the kernel config
so use the real size of a vector dumped. Otherwise
we might fail on restore.

Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-29 19:54:12 +04:00
Pavel Emelyanov
f8142ba352 rst: Make thread_restore_args be part of task_restore_args
The former is actually the parameters of thread group leader, so
it's natural to have them on-task.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-29 17:00:18 +03:00
Huang Qiang
eb9b1ab240 cr-restore: remove the duplicate round_up for restore_thread_vma_len
After some historical changes, the second page alignment of
restore_thread_vma_len is redundant. So remove it.

Signed-off-by: Huang Qiang <h.huangqiang@huawei.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-29 13:38:53 +04:00
Huang Qiang
223dce83c2 fix many unclosed files opened by open_image_ro
Many image files opened by open_image_ro weren't closed before return, fix
them all in this patch.

Signed-off-by: Huang Qiang <h.huangqiang@huawei.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-24 16:51:50 +04:00
Pavel Emelyanov
ef3771d566 net: Introduce netdev index to name resolver
It will be required to support sockets bound to devices.

When restoring w/o net namespaces -- collect existing devices.
When restoring with them -- collect what is received from image.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-19 17:35:46 +04:00
Cyrill Gorcunov
9c579cfd02 sfd_type: Add SELF_STDIN_OFF service fd and call helpers where needed
We will need it for slave ttys migration. They serve for one purpose --
to clone self stdio descriptor and use it with tty layer, which will
be addressed in further patches.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-18 15:51:56 +04:00
Pavel Emelyanov
aa731ee1d7 core: Support task scheduler policies and priorities
No magic here, just fetch info using getpriority and sched_getxxx calls.
Good news is that the mentioned syscalls take pid as argument and do work
with it, i.e. -- no need in parasite help here.

Restore is split into prep -- copy sched bits from image on restorer
args -- and the restore itself. It's done to avoid restoring tasks info
with IDLE priority ;) To make restorer not-fail sched bits are validated
for sanity on prep stage.

Minimal sanity test is also there.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-17 00:23:25 +04:00
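
The dump side described in aa731ee1d7 relies on calls that accept a pid, so no code has to run inside the target task. A minimal sketch using the standard Linux/POSIX calls follows; the `struct sched_info` and the function name are hypothetical, and the sketch does not reproduce CRIU's image format or its validation rules.

```c
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>
#include <sys/types.h>

/* Hypothetical container for the scheduler bits of one task. */
struct sched_info {
	int policy;	/* SCHED_OTHER, SCHED_FIFO, ... */
	int prio;	/* static priority from sched_getparam() */
	int nice;	/* nice value from getpriority() */
};

/*
 * Fetch scheduler policy, priority and nice value of the task with the
 * given pid (0 means the calling task). Returns 0 on success, -1 on error.
 */
static int fetch_sched_info(pid_t pid, struct sched_info *si)
{
	struct sched_param sp;
	int ret;

	ret = sched_getscheduler(pid);
	if (ret < 0)
		return -1;
	si->policy = ret;

	if (sched_getparam(pid, &sp) < 0)
		return -1;
	si->prio = sp.sched_priority;

	/* getpriority() may legitimately return -1, so check errno. */
	errno = 0;
	ret = getpriority(PRIO_PROCESS, pid);
	if (ret == -1 && errno)
		return -1;
	si->nice = ret;

	return 0;
}
```

Calling `fetch_sched_info(0, &si)` reads the values of the calling task; passing another pid reads that task's values, subject to the usual permission checks.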
Cyrill Gorcunov
1686669410 tty: Make tty_setup_slavery to return error
If no task is found that would restore the
controlling terminal, exit with an error instead of
continuing with just an error message.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-15 20:02:29 +04:00
Pavel Emelyanov
f429de662e creds: Support supplementary groups
Dumping them is performed via parasite, since calling the getgroups
is the only way of getting the complete list. Currently the nr of
groups to dump is limited explicitly with the size of shared memory
between crtools and parasite. This is MUCH more than we have seen
on real apps so far.

Restoring is done early, before restorer blob, not to carry the undefined
array of groups in there. This is OK, since groups do not affect us at
that point and are not affected by subsequent creds restore.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-11 17:07:02 +04:00
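
The getgroups call mentioned in f429de662e only reports the groups of the calling task, which is why the dump goes through the parasite. A minimal sketch of collecting the complete list is shown below. It runs on the calling task and uses malloc as a stand-in for the fixed-size shared memory area the message describes; the function name is hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: collect the supplementary groups of the calling
 * task. On success returns the number of groups and stores a malloc()ed
 * array in *out (NULL if there are no groups). Returns -1 on error.
 */
static int collect_groups(gid_t **out)
{
	gid_t *groups;
	int nr;

	*out = NULL;

	/* With size 0, getgroups() only reports how many groups exist. */
	nr = getgroups(0, NULL);
	if (nr < 0)
		return -1;
	if (nr == 0)
		return 0;

	groups = malloc(nr * sizeof(*groups));
	if (!groups)
		return -1;

	nr = getgroups(nr, groups);
	if (nr < 0) {
		free(groups);
		return -1;
	}

	*out = groups;
	return nr;
}
```

The caller owns the returned array and releases it with `free()`.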
Cyrill Gorcunov
062f468817 pstree: Define symbolic name for init process
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-09 19:57:15 +04:00
Cyrill Gorcunov
17a1548a5b pstree: Rename @list member to @sibling
To be close to the kernel naming.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-08 18:59:26 +04:00
Andrey Vagin
c4148d7907 cr-restore: exit if someone can not be restored
Forgot to handle an error path in one place.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-25 15:54:51 +04:00
Cyrill Gorcunov
997b295d67 files-reg: Use global mutex to serialize ghost file creation
Otherwise there is a race between files with same names:

link(name -> ghost)                link(name->ghost)
open(name)
unlink(name)
                                   open(name) -> ENOENT

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:12:58 +04:00
Andrey Vagin
5ec8a1c313 cr-restore: unlock connections at the last moment
Restore must not fail after unlocking connections.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:06:14 +04:00
Andrey Vagin
772d6853d2 crtools: collect inet sockets to crtools
Earlier we moved prepare_shared() to a root task,
because several preparation actions should be executed
in a target namespace set (e.g.: ghost files).

TCP sockets are a subset of inet sockets,
they should be unlocked before resume. It's convenient to do this
from crtools.
An image can't be read more than one time, because we want to
send it via network.

For these two reasons prepare_shared is split in two parts,
one for crtools and one for a root task.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:06:06 +04:00
Andrey Vagin
c27ff2baac tcp: unset TCP_REPAIR at the last moment after unlocking network (v2)
TCP_REPAIR should be dropped when the network is unlocked.
The network should be unlocked at the last moment, because
after this moment restore must not fail; otherwise the state of
a tcp connection can change and the state of one side in our image
will be invalid.

v2: use xremalloc instead of mmap and remmap

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:02:57 +04:00
Pavel Emelyanov
990f80dd0f tty: Sanitize slavery and ctl tty setups
We need to do two non-trivial things with ttys -- interconnect
slaves to masters (or to each other) and setup ctl-tty restoring
task.

Now this is done in steps that depend on each other:

1. collect ttys
2. interconnect slaves and mark ctl-tty tasks
3. collect fake fds for tty-ctl tasks
4. setup orphaned slaves

We can relax this logic in two ways:

1. don't split marking ctl-tty tasks and then creating fds for them
   do it in one step at the end
2. don't interconnect slaves with masters and orphaned slaves in
   two steps -- do it in one place after fds are collected

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-14 18:12:59 +04:00