mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 13:28:27 +00:00

462 Commits

Author SHA1 Message Date
Andrey Vagin
1264a7a9c6 restore: fail restore if pgid or sid are not restored
Don't fail if a root non-init task has another sid, because
it's inherited from parent and can't be restored and
it's expected behaviour, when a subtree is dumped.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:15:03 +04:00
Andrey Vagin
8c4017d933 restore: print message about sid only if it's restored
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:14:14 +04:00
Andrey Vagin
9b8a206729 restore: restore sid of task which isn't leaders and isn't a child of init (v4)
This is a sign that a parent changed its sid after forking a child.
We need to know the sid a process was born with, because more than one
process in a chain of processes might change its SID.

v2: fix names of variables
v3: prevent rewriting of born_sid
v4: Abort the restorer with an error message if a born_sid can't be determined.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:13:58 +04:00
Andrey Vagin
2c412fa6ac restore: restore sids of tasks, which have been reparented to init (v3)
* Create helpers for processes which have been reparented to init.
* Insert helpers in a process tree.
* Helpers will exit after constructing a process tree.

v2: fix variables names and check errors
v3: add comments in code

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:13:26 +04:00
Andrey Vagin
eb9a6f2015 restore: add interface for creating helper tasks (v3)
They will be used for restoring sid. For example, if a session
group leader is absent, a helper process is created with this id
and it will die after restoring all other tasks.

Before this patch, restore failed if any task exited.
Now we should skip helpers that exited successfully. It's a bit tricky:
all children are collected in sigchld_handler, but there is a point
where we want to wait for all helpers. waitpid is used for that, and ECHILD
is ignored, because it means that a helper exited and has already been
waited for in sigchld_handler.
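
A minimal sketch of the ECHILD handling described above. The function name
wait_helper is hypothetical and this is not the code from the patch; it only
illustrates treating ECHILD as "already reaped by the SIGCHLD handler".

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not the CRIU implementation): wait for one helper
 * task. ECHILD is treated as success, on the assumption that the task has
 * already been collected elsewhere, e.g. in a SIGCHLD handler.
 * Returns 0 on success, -1 on failure.
 */
static int wait_helper(pid_t pid)
{
	int status;

	for (;;) {
		if (waitpid(pid, &status, 0) == pid)
			break;
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		if (errno == ECHILD)
			return 0;	/* already waited for elsewhere */
		return -1;
	}

	/* A helper counts as successful only if it exited with code 0. */
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling wait_helper() a second time for the same pid returns 0 as well,
because the second waitpid() fails with ECHILD.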

v2: check that me isn't NULL in the sig handler
v3: move code about waiting helpers in a separate function

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:12:26 +04:00
Andrey Vagin
acacc6049e restore: calculate a maximum value of PID-s
It will be used for allocating PIDs for helper tasks

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:11:41 +04:00
Andrey Vagin
8075dc60a5 pstree: rename fields in struct pid
s/pid/virt/
s/real_pid/real/

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:09:44 +04:00
Andrey Vagin
9b3b059bdc crtools: add "pid" to the --namespaces cmdline option arguments (v3)
to require dumping the pid namespace. Dump and restore will fail if
the tree doesn't contain an init process.

A pid namespace will be created implicitly if there is an init process in the tree.

v2: fix comments from Pavel
v3: Restore of pidns should be approved by user

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-22 12:08:57 +04:00
Cyrill Gorcunov
582954685b Escape using unsafe sprintf helper
Unless it's very critical for speed, we should
not use the unsafe sprintf helper: we're a root-granted
program and must be as safe as possible.
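
As an illustration of the safer pattern, here is a small sketch that uses
snprintf() and reports truncation. The function name format_fd_path and the
path layout are assumptions made for the example, not taken from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical example: build "/proc/<pid>/fd/<fd>" into buf without ever
 * writing past size bytes. Returns 0 on success, -1 if the result did not
 * fit (buf is still NUL-terminated in that case, provided size > 0).
 */
static int format_fd_path(char *buf, size_t size, int pid, int fd)
{
	int n = snprintf(buf, size, "/proc/%d/fd/%d", pid, fd);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```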

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-20 15:04:51 +04:00
Andrey Vagin
b8e12c4829 restore: mount proc for a new pid namespace
Create a tmp directory and mount proc from a target pid ns.
This proc will show pid-s from the target pid ns.

crtools uses map_files for restoring shared mappings.

The tmp directory is removed after restore.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-19 22:17:32 +04:00
Andrey Vagin
4ebc875995 util: add ability to change a proc mount point (v2)
v2:  rework this by using openat() and service fds for proc root.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-19 22:17:16 +04:00
Andrey Vagin
c8e6be95e4 restore: create pid namespace
A pid namespace is created if a pid of the first task is 1.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-19 22:16:19 +04:00
Andrey Vagin
4b48849c32 ctrools: prepare to dump pid namespace
Add struct pid and use it everywhere. This struct contains
two fields: pid and real_pid.

real_pid is a pid outside of the target pid namespace.
pid is in the target pid namespace
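
The two fields could look roughly like this. The field types and the
is_init() helper are assumptions for illustration; only the field names come
from the description above.

```c
#include <sys/types.h>

/* Sketch of the structure described above; pid_t is an assumed type. */
struct pid {
	pid_t real_pid;	/* pid as seen from outside the target pid namespace */
	pid_t pid;	/* pid inside the target pid namespace */
};

/* Hypothetical helper: is this task init in the target pid namespace? */
static int is_init(const struct pid *p)
{
	return p->pid == 1;
}
```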

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-06-19 22:13:52 +04:00
Andrey Vagin
dfdf54d96b pstree: allocate restore data as a tail of pstree_item (v2)
v2: Synchronize the argument type of __alloc_pstree_item and the
values you put into it. I.e. int-int or bool-bool, not bool-int.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-31 15:41:05 +04:00
Andrey Vagin
f7d0263c69 restore: add fast path to find a parent pstree item
Taking into account the way the dump saves pstrees in the image.

If pstree.img isn't edited, a slow path should not be executed at all.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-31 15:41:03 +04:00
Andrey Vagin
cf63c1d9e8 crtools: link pstree_item-s in a tree (v3)
because they describe a process TREE.

It's useful when we dump tasks from another pid namespace,
because the real pid is obtained from the parasite. In the previous version
we needed to update the pid in two places: one in a pstree_item and
one in a children array.

A process tree will be necessary to restore sid and pgid,
because we need to add fake tasks to the tree, for example if
a session leader is absent.

v2: fix rollback actions
v3: fix comments from Pavel Emelyanov
    * add macros for_each_pstree_item
    * fix a few bugs

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-31 15:41:02 +04:00
Andrey Vagin
066ec066a0 crtools: remove unused variables (v3)
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-18 19:01:21 +04:00
Andrey Vagin
a411c8f21c restore: don't use an uninitialized variable
The global variable me isn't initialized, when we tried to use it.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-15 18:25:36 +04:00
Cyrill Gorcunov
62ee701c89 Use /proc/pid/smaps for VMA parsing v2
This allows us to detect nonlinear mappings.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-11 17:41:05 +04:00
Andrey Vagin
f0decea728 log: remove pid from messages
Now log messages may be split by pid

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-04 16:24:20 +04:00
Andrey Vagin
fc7bedc50a crtools: make to be able to split messages by pid
If the option --log-pid is set, each process will have its own log file.
Otherwise PID is added to each log message.

A message can't be bigger than one page minus some bytes for pid.
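
A sketch of how such a limit can arise: the pid prefix and the message share
one page-sized buffer, so the message gets whatever is left after the prefix.
The buffer size, prefix format and function name are assumptions, not the
format crtools actually uses.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define LOG_BUF_SIZE 4096	/* assumed page size, for illustration only */

/*
 * Hypothetical formatter: writes "(<pid>) <message>" into buf, which must
 * hold LOG_BUF_SIZE bytes. The message is truncated to the space left
 * after the pid prefix. Returns the length of the resulting string.
 */
static size_t format_log(char *buf, int pid, const char *fmt, ...)
{
	va_list ap;
	int off, n;

	off = snprintf(buf, LOG_BUF_SIZE, "(%05d) ", pid);
	if (off < 0 || off >= LOG_BUF_SIZE)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf + off, LOG_BUF_SIZE - off, fmt, ap);
	va_end(ap);

	if (n < 0) {
		buf[off] = '\0';
		return (size_t)off;
	}
	if (n >= LOG_BUF_SIZE - off)
		n = LOG_BUF_SIZE - off - 1;	/* message was truncated */
	return (size_t)off + (size_t)n;
}
```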

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-04 16:22:04 +04:00
Cyrill Gorcunov
bff52ba952 inotify: Add checkpoint/restore v2
v2:
 - open_mount is cleaned up
 - byte-stream hex conversion remains untouched since
   strtol is flipping numbers to LE manner

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-04 14:00:45 +04:00
Cyrill Gorcunov
b38777dff4 eventpoll: Add checkpoint/restore v2
v2:
 - Move everything into eventpoll.[ch]
 - Use rst_file_params

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-04 14:00:05 +04:00
Cyrill Gorcunov
889795da5d eventfd: Add checkpoint/restore support v2
v2:
 - Pass initial counter value to eventfd call
   (can't pass flags here since they are obtained
    with fcntl and must be restored same way or
    restore will fail)
 - Use rst_file_params for flags and owner restore
 - Use eventfd.[ch] instead of eventfs.[ch]
 - Move show funcs to eventfd.c
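
The first v2 item can be sketched as follows: the counter goes into the
eventfd() call, while the status flags are applied afterwards with fcntl().
The function name restore_eventfd is hypothetical and rst_file_params is not
modelled here.

```c
#include <fcntl.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical restore helper: create an eventfd with the dumped counter
 * value, then set the file status flags the same way they were read
 * (via fcntl), instead of passing flags to eventfd().
 */
static int restore_eventfd(unsigned int counter, int status_flags)
{
	int fd = eventfd(counter, 0);

	if (fd < 0)
		return -1;
	if (fcntl(fd, F_SETFL, status_flags) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```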

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-04 13:59:51 +04:00
Pavel Emelyanov
bc75b0f253 shmem: Move shmem restoring code to shmem.c
No changes, just code move (and rename of shmems variable).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-03 18:01:05 +04:00
Pavel Emelyanov
fa8a928e9f pipes: Move dumping code to pipes.c
Just a code move, no real changes.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-03 17:36:00 +04:00
Pavel Emelyanov
675b4698ce tcp: Put connection locking management over the code
When dump finishes with an error we should unlock all previously
locked connections.

When restoring we should collect connections and unlock them
all at the end.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-28 17:59:21 +04:00
Cyrill Gorcunov
4a2a290137 make: Generate offsets from linked files only
Instead of generating offsets from early compiled
object files (one day the offsets obtained from
there might be changed during linkage stage) better
to get them from a final stage where all object
files involved are linked into complete binary blob.

It happened that at an early stage we were indeed using
only a single file per parasite and restorer, but at present
there are a couple of files involved (and there will be more in
the future), so we need a safe approach.

Also note that the exported symbols are prefixed with
"__export_". This is the easier approach for now. Putting
such symbols into a separate section requires far
more effort to handle.

The main reason of having two files (Elf object
and binary blob) is to get 1:1 mapping between
symbols definition and their position in binary
target.

The addresses of the exported symbols are obtained
from the object file and used as offsets in the binary
target.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-18 18:22:15 +04:00
Pavel Emelyanov
e2f745b920 files: Simplify fd-s restore
Don't re-read fdinfo image 4 times on restore, just use those collected
on me pstree_entry instance.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-18 16:27:46 +04:00
Pavel Emelyanov
a72d858652 files: Collect fdinfo-s on per-pstree_item list
Later we'll be able to restore them without re-reading the fdinfo file again.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-18 16:24:08 +04:00
Pavel Emelyanov
9829dbfdeb rst: Add abstract rst_info on pstree_item
The plan is to put collected resources on this to avoid seeking the image.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-18 15:42:26 +04:00
Cyrill Gorcunov
b39a929282 syscalls: Don't hide sigsetsize inside syscall itself
This makes the syscall transition to asm code harder;
pass this constant from the callers instead.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-18 12:26:14 +04:00
Konstantin Khlebnikov
0ff0943eb4 Drop <sys/user.h> to fix PAGE_SIZE declaration
On some systems PAGE_SIZE is declared as sysconf(_SC_PAGESIZE) in <sys/user.h>.
This is a non-constant expression, so it cannot be used in type declarations.

This breaks compilation with a very non-obvious error message:

  CC       parasite-syscall.o
In file included from parasite-syscall.c:30:0:
./include/parasite.h:90:8: error: variably modified ‘fds’ at file scope

crtools doesn't use anything from <sys/user.h>, so we can drop its usage.
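
A small sketch of the difference. SKETCH_PAGE_SIZE and struct fd_table are
made-up names and 4096 is an assumed value; the point is only that an array
bound at file scope needs a compile-time constant.

```c
#include <unistd.h>

/* Compile-time constant: fine as an array bound in a type declaration. */
#define SKETCH_PAGE_SIZE 4096	/* assumed value, for illustration only */

struct fd_table {
	int fds[SKETCH_PAGE_SIZE / sizeof(int)];
};

/*
 * If the macro expanded to sysconf(_SC_PAGESIZE) instead, the struct above
 * would be rejected with "variably modified 'fds' at file scope". The
 * runtime value is still available where a constant is not required.
 */
static long runtime_page_size(void)
{
	return sysconf(_SC_PAGESIZE);
}
```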

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-16 14:38:49 +04:00
Kinsbursky Stanislav
35eedb5f1f output: add "0x" to hex prints using sed
Command below was executed several times:

sed 's/\(pr_.*[^%,x,X]\)\(\%[0-9,l,L]*x\)/\10x\2/g' -i *.c

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-16 12:33:35 +04:00
Cyrill Gorcunov
7657c34ab7 restore: Temporary comment out exit's on sid/pgid mismatch
Until we have kernel support.
[ xemul: MySQL uses runaway pgid and sid and we cannot restore it
  gracefully with the existing API :( But MySQL seems not to care about
  pgid and sid change after restore, so ignore this for a while ]

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-13 19:06:35 +04:00
Pavel Emelyanov
a1ccfb9297 files: Support dumping/restoring of completely unlinked files
A completely unlinked file is one whose n_link count is zero.
Such files only allow us to read their contents and carry them with us.

In order to dump this thing I introduce the "path remap" technology.
For reg file a remapping entry is dumped which describes, that at
restore stage before opening a regfile->path this path should be
linked to some other name and then (after open) unlinked.

For completely unlinked files the remap path would be a path to
a "ghost" file, i.e. a file which is created only at the time of
restore and which is removed completely at the end of it.

Partially unlinked files (i.e. those having n_link != 0, but a
path by which we see them in someone's fd is not accessible) should
be handled in another way.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-13 17:54:36 +04:00
Pavel Emelyanov
81c72ea640 restore: Remove unused struct shmem_id and variables
Most likely they left from Andrey's shmem rework.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-12 15:46:57 +04:00
Pavel Emelyanov
814bd5d321 xids: Dump and restore tasks' pgid and sid
This is a prerequisite for terminals handling and just a good
practice to save and restore everything we can :)

Not all combinations are supported. All the problems we still
have come from the inability to attach to a group/session whose
ID is not the PID of any task.

This can be worked around by fork()-ing this pid temporarily,
but we'd rather think in the direction of modifying the kernel
to give us a direct syscall for this (oh my...)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-11 22:11:41 +04:00
Pavel Emelyanov
6f67bb8fc3 xids: Save pgid and sid on pstree_Item and pstree_entry
I store them on _entry since sids can only be inherited or
set to current's pid. Thus the best we can do is restore sids
at fork time, and so we save them in the image we use to fork.

Maybe when we submit patches that will give us ability to set
arbitrary pgid and sid we'll change this, but this is in the
future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-11 22:10:09 +04:00
Pavel Emelyanov
b17c49aa99 rst: Wait till everyone completes forking on restore
New stage CR_STATE_FORKING. This is required to restore pgids
properly -- we need to make sure a task with pid whose pgid we
are about to enter is alive. And this task is not necessarily
our parent, thus wait for everyone to appear.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-11 22:06:36 +04:00
Pavel Emelyanov
b984eeff9c mm: Move exe file id on mm_entry
This is mm_struct entity, so save one there. Also gets rid
of special FDINFO-s.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 15:52:00 +04:00
Pavel Emelyanov
fe70efad29 mm: Split mm parts from task core image
The mm_xxx bits are per-mm_struct, not per-task_struct in kernel.
Thus, when we support CLONE_VM we'd better have these bits in a
separate image file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 14:51:37 +04:00
Pavel Emelyanov
e5e57e832b fs: Move info about cwd into separate file
Why? Because one day we'll support various CLONE_ flags and
for fdtable and fs info we'd like to have separate images (since
these objects are separate in kernel).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 13:41:05 +04:00
Andrey Vagin
363812c9b9 restore: optimize restorer_get_vma_hint
It's an O(n) algorithm.

Now we iterate both lists simultaneously to find a hole.
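
A sketch of the two-list scan, under simplifying assumptions: sorted arrays
of non-overlapping [start, end) ranges stand in for the vma lists, and the
names struct range and find_hole are hypothetical. It is not the code of
restorer_get_vma_hint, only an illustration of an O(n) walk over both lists.

```c
#include <stddef.h>

/* Simplified stand-in for a vma: the half-open range [start, end). */
struct range {
	unsigned long start, end;
};

/*
 * Find the lowest address >= from where len bytes intersect no range in
 * either sorted array. Each index only moves forward, so the scan is
 * linear in ns + nt. Returns 0 and stores the address in *hint, or -1 on
 * address overflow.
 */
static int find_hole(const struct range *s, size_t ns,
		     const struct range *t, size_t nt,
		     unsigned long from, unsigned long len,
		     unsigned long *hint)
{
	unsigned long prev_end = from;
	size_t i = 0, j = 0;

	for (;;) {
		if (prev_end + len < prev_end)
			return -1;	/* wrapped around the address space */

		/* Skip ranges that lie entirely below the candidate. */
		while (i < ns && s[i].end <= prev_end)
			i++;
		while (j < nt && t[j].end <= prev_end)
			j++;

		/* The candidate must fit below the next range in BOTH lists. */
		if (i < ns && prev_end + len > s[i].start) {
			prev_end = s[i++].end;
			continue;
		}
		if (j < nt && prev_end + len > t[j].start) {
			prev_end = t[j++].end;
			continue;
		}

		*hint = prev_end;
		return 0;
	}
}
```

With source ranges shifted against target ranges, as in the diagram from
the discussion, the candidate is only accepted once it clears both lists.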

[xemul: Discussion making the patch more understandable:

Cyrill:

	If s_vma is the last one on self_vma_list you could break immediately, no?

	And the snippet I somehow miss is -- how the situation handled when

		      hole
		    a      b
	source |----|      |-----|
	target   |----|      |-----|
		      c      d

	the hole fits the requested size but the hole is shifted
	in target, so that you've

	prev_vma_end = a

	and then you find that a - d > vma_len and return a
	as start address for new mapping while finally it
	might intersect with address c.

	Or I miss something obvious?

Andrey:

	Look at "continue" one more time.
	prev_vma_end is returned only if both conditions are true

	if (prev_vma_end + vma_len > s_vma->vma.start) {
	....
	if (prev_vma_end + vma_len > t_vma->vma.start) {
	...

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Looks-good-to: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 10:31:00 +04:00
Andrey Vagin
fa11d76cab restore: check that a restorer vma doesn't intersect with target vma-s
[ xemul: The fix effectively is -- stop scanning the 2nd vma list
         once we see, that the hint's end hits the next vma ]

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 10:27:44 +04:00
Pavel Emelyanov
b386751697 sockets: Rework unix sockets onto fdinfo scheme
This is a big change, yes. Dump unix sockets in the same manner
as all the other files are done now. A few notes however.

1. We explicitly drop names for connected stream sockets. This is
   done to avoid conflicts with names -- accepted sockets share their
   names with the listening parent. This can be done later by binding
   a socket to a name, then renaming it to some temporary unique one
   and at the very end renaming it back to the original.

2. Interconnected sockets are restored via socketpair() call. This is
   correct, but names are dropped. Need to bind() sockets after this
   (yes, this can be done), but for this we need to implement the trick
   with renames described before.

3. FD for socket queues is constantly re-opened not to resolve fd
   conflicts. Need to use service fds engine for this later.

4. Some code cleanup is still required, yes (will follow shortly).
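
For note 2, a minimal sketch of restoring an interconnected pair with
socketpair(). The function name is hypothetical, SOCK_STREAM is an assumed
type, and the names (bind) and queued data are not handled here.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper: create two connected, unnamed unix sockets.
 * sk[0] and sk[1] are the two ends. Returns 0 on success, -1 on failure.
 */
static int restore_connected_pair(int sk[2])
{
	return socketpair(AF_UNIX, SOCK_STREAM, 0, sk);
}
```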

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-06 19:27:08 +04:00
Andrey Vagin
4f2bd37704 pipes: add functions for collecting pipes
pipe_entry is encapsulated in pipe_info.
All pipe_info-s are linked in the list pipes.
All pipe_info-s with the same pipe_id are linked into pipe_list,
which is a circular list without a defined head.
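
A sketch of a circular list without a defined head: every member points to
the next one, and any member can serve as the starting point of a walk. The
struct layout and function names are simplified assumptions; the patch
itself may well use generic list helpers instead.

```c
#include <stddef.h>

/* Simplified stand-in for pipe_info; only the ring linkage is shown. */
struct pipe_info {
	unsigned int pipe_id;
	struct pipe_info *next;	/* next pipe_info with the same pipe_id */
};

/* A single element forms a ring of one. */
static void ring_init(struct pipe_info *pi, unsigned int pipe_id)
{
	pi->pipe_id = pipe_id;
	pi->next = pi;
}

/* Insert pi right after an existing ring member. */
static void ring_add(struct pipe_info *member, struct pipe_info *pi)
{
	pi->next = member->next;
	member->next = pi;
}

/* Walk the ring starting from any member and count its elements. */
static size_t ring_size(const struct pipe_info *start)
{
	size_t n = 1;

	for (const struct pipe_info *p = start->next; p != start; p = p->next)
		n++;
	return n;
}
```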

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 21:22:16 +04:00
Andrey Vagin
382ebc3063 pipe: remove old code for restoring pipes
[ xemul: I don't know how to make this with incremental changes either,
         and just go with it :( ]

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 21:17:11 +04:00
Pavel Emelyanov
c40e3201f2 rst: Open thread core images without -nocheck
These are image files that weren't yet opened, thus should be
opened with proper checks.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 15:42:48 +04:00
Pavel Emelyanov
7570fbbf6a rst: Read pstree image once
Collect pstree_item-s on restore in a big list. This lets
us not lseek this file on restore and simplifies the code
a little.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 15:34:31 +04:00