mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 05:18:00 +00:00

199 Commits

Author SHA1 Message Date
Pavel Emelyanov
a1ccfb9297 files: Support dumping/restoring of completely unlinked files
A completely unlinked file is one whose n_link count is zero.
All we can do with such a file is read its contents and carry them with us.

In order to dump such a file I introduce the "path remap" technology.
For a reg file a remapping entry is dumped, which says that at the
restore stage, before opening regfile->path, this path should be
linked to some other name and then (after the open) unlinked.

For completely unlinked files the remap path would be a path to
a "ghost" file, i.e. a file which is created only at the time of
restore and which is removed completely at the end of it.

Partially unlinked files (i.e. those having n_link != 0, but a
path by which we see them in someone's fd is not accessible) should
be handled in another way.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-13 17:54:36 +04:00
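The link/open/unlink sequence described in this commit can be sketched as follows. This is a minimal illustration, not code from the CRIU tree: the helper name `open_via_remap` and its parameters are hypothetical, and it assumes the ghost file and the original path live on the same filesystem so that `link()` works.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the CRIU sources.
 * ghost_path: a file that exists only for the duration of restore.
 * path:       the name the task originally had the file open under.
 * Returns an open fd whose original name is gone again, or -1 on error.
 */
int open_via_remap(const char *ghost_path, const char *path, int flags)
{
	int fd;

	assert(ghost_path && path);

	/* Give the ghost file's inode the original name for a moment. */
	if (link(ghost_path, path) < 0)
		return -1;

	fd = open(path, flags);

	/* Drop the temporary name whether or not open() succeeded. */
	unlink(path);

	return fd;
}
```

The caller would still have to remove `ghost_path` itself at the end of restore; once that happens the fd refers to a file with no names left, as it did at dump time.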
Cyrill Gorcunov
a8840ba721 fowners: Add checkpoint/restore for sockets
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-12 12:31:33 +04:00
Cyrill Gorcunov
ff3471f726 fowners: Prepare ground for dump and restore
Just 'show' is implemented and stubs are added to the image
(regular files and pipes).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-12 12:28:15 +04:00
Pavel Emelyanov
6f67bb8fc3 xids: Save pgid and sid on pstree_Item and pstree_entry
I store them on _entry since sids can only be inherited or
set to current's pid. Thus the best we can do is restore sids
at fork time, thus save them in the image we use to fork.

Maybe when we submit patches that will give us ability to set
arbitrary pgid and sid we'll change this, but this is in the
future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-11 22:10:09 +04:00
Pavel Emelyanov
05e3c4d2c9 fd: Handle close-on-exec bits
This bit is not per-file, but per-fd, thus put it on the fdinfo_entry.
Drain these bits from parasite together with the fds themselves, save
into image and restore with fcntl F_SETFD cmd where applicable.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-10 18:36:59 +04:00
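Restoring the per-fd close-on-exec bit with `fcntl` can be sketched like this. The helper name `restore_cloexec` is hypothetical; only the `F_GETFD`/`F_SETFD` commands and the `FD_CLOEXEC` flag are standard.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: apply a saved close-on-exec bit to one descriptor.
 * The bit lives in the fd table slot, not in the open file, so it has to
 * be restored for every fd individually.
 */
int restore_cloexec(int fd, int cloexec)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;

	if (cloexec)
		flags |= FD_CLOEXEC;
	else
		flags &= ~FD_CLOEXEC;

	return fcntl(fd, F_SETFD, flags);
}
```

Because `dup()` yields a descriptor with the bit cleared even when the source fd has it set, two fds sharing one file can legitimately differ here, which is why the bit belongs on the fdinfo_entry.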
Pavel Emelyanov
1d6578bbd5 kcmp: Dump task's objects shared with CLONE_ flags
Just dump their IDs and check they are not shared. For future.
IO and SEMUNDO are not there since tasks may have NO such objects
and currently we cannot detect whether they have them equal or
both don't have them.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 18:02:00 +04:00
Pavel Emelyanov
43367e2545 fdinfo: Rename fdinfo_entry addr to fd
Now we store only real fdtable entries in this file, so it's
time to name the field properly and change type to u32.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 16:18:33 +04:00
Pavel Emelyanov
b984eeff9c mm: Move exe file id on mm_entry
This is mm_struct entity, so save one there. Also gets rid
of special FDINFO-s.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 15:52:00 +04:00
Pavel Emelyanov
fe70efad29 mm: Split mm parts from task core image
The mm_xxx bits are per-mm_struct, not per-task_struct in kernel.
Thus, when we support CLONE_VM we'd better have these bits in a
separate image file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 14:51:37 +04:00
Pavel Emelyanov
de66a5d04b fs: Reserve place for task's root dumping
Do not restore it yet -- the logic we're about to apply to
resolve tasks' paths relative to dumper/restorer is not yet
clear to me and it should better be hidden into a couple of
calls (dump_one_reg_file/open_fe_fd). But since we can't
chroot to fd we're about to expose the logic outside of the
open_fe_fd, which is not desirable ATM.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 13:52:42 +04:00
Pavel Emelyanov
e5e57e832b fs: Move info about cwd into separate file
Why? Because one day we'll support various CLONE_ flags and
for fdtable and fs info we'd like to have separate images (since
these objects are separate in kernel).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 13:41:05 +04:00
Pavel Emelyanov
69b3ebd002 vma: Remove FDINFO_MAP fd type
The regfile's ID of a VMA is stored in its shmid field. And the
file itself is dumped into regfiles.img image with 'special'-ly
generated ID (i.e. -- just allocate a new unique one).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 12:57:38 +04:00
Andrey Vagin
2d4244d033 show: use read_img to read pid-s of children and threads
Because the number of entries is known.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-09 10:28:18 +04:00
Pavel Emelyanov
b386751697 sockets: Rework unix sockets onto fdinfo scheme
This is a big change, yes. Dump unix sockets in the same manner
as all the other files are done now. A few notes however.

1. We explicitly drop names for connected stream sockets. This is
   done to avoid conflicts with names -- accepted sockets share their
   names with the listening parent. This can be done later by binding
   a socket to a name, then renaming it to some temporary unique one
   and at the very very end renaming it back to the original.

2. Interconnected sockets are restored via socketpair() call. This is
   correct, but names are dropped. Need to bind() sockets after this
   (yes, this can be done), but for this we need to implement the trick
   with renames described before.

3. FD for socket queues is constantly re-opened not to resolve fd
   conflicts. Need to use service fds engine for this later.

4. Some code cleanup is still required, yes (will follow shortly).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-06 19:27:08 +04:00
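Note 2 of this commit (interconnected sockets restored via socketpair(), with names dropped) can be illustrated with a small sketch. The helper names are hypothetical, and the unnamed-socket check relies on the Linux behaviour documented in unix(7), where `getsockname()` on an unnamed socket reports a length of `sizeof(sa_family_t)`.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>

/* Hypothetical helper: recreate two interconnected stream sockets. */
int restore_connected_pair(int sv[2])
{
	return socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
}

/*
 * Returns 1 if the socket has no name bound, 0 if it has one, -1 on error.
 * Assumes Linux semantics for unnamed AF_UNIX sockets.
 */
int sock_is_unnamed(int sk)
{
	struct sockaddr_un addr;
	socklen_t len = sizeof(addr);

	if (getsockname(sk, (struct sockaddr *)&addr, &len) < 0)
		return -1;

	return len == sizeof(sa_family_t);
}
```

Both ends come back connected but unnamed, which is the limitation the commit message points out: a later bind() plus the rename trick would be needed to give them their original names.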
Pavel Emelyanov
699caabdb9 show: Add pipes string to fdinfo output
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 21:31:13 +04:00
Andrey Vagin
96be8be2d1 pipe: save all pipe data in a separate file
A pipe buffer has 16 slots. A slot is page, offset and size.
When we use splice and data is not aligned, splice connects
a page from file cache and sets the offset. For this reason we lose
a part of the buffer.

If the data size is more than 15 pages, the data will be aligned in the image.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 21:23:57 +04:00
Andrey Vagin
bdb3932be5 pipe: all pipes are saved in one file (v2)
Information about pipe's file structs saved in one global file and
fdinfo_entry is saved for each descriptor

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 21:17:24 +04:00
Pavel Emelyanov
2a33c4d5dc mem: Remove zero page from the end of mem image files
This was required when pages were stored in elf files for
exec. Now we can stop reading it on eof.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 14:07:31 +04:00
Pavel Emelyanov
9b2617353b inet: Rework inet sk dumping on new fdinfo scheme
Now every inetsk fd dump results in a new entry in the fdinfo.img file. Sockets themselves are
dumped into inetsk.img global image file. On restore the generic fdinfo redistribution algo
is used and inet sockets are opened only when required.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-27 12:42:59 +04:00
Pavel Emelyanov
c58abfd03d show: Introduce ->show callback for fdset
Each fdset item now has a callback which shows the contents of a magic-described
image file. Per-task and global show code is reworked to walk the respective fdsets
and call ->show on each file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-27 12:01:14 +04:00
Pavel Emelyanov
82b7c07ca9 show: Fix 'all' mode showing
After we removed the pid from pstree image file the -t or -p option for show
command no longer makes sense. Make 'show' mode rely on -D option to find out
where to find the root (i.e. pstree.img) file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-27 11:04:23 +04:00
Pavel Emelyanov
3858ee4950 fdset: Introduce two fdsets -- task and ns
Write two helpers for opening an fdset for task and one for ns.

This probably can be done with some "generic" macro(s), but this
time it's simpler not to produce more code of that type.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:57:00 +04:00
Pavel Emelyanov
bcf9ee3d1c fdset: Helper for getting fd out of a set
This patch does

s/$fdset->fds[$nr]/fdset_fd($fdset, $nr)/

over the code.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:59 +04:00
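The substitution in this commit replaces direct array indexing with an accessor. A minimal sketch of what such an accessor might look like is below; the struct layout and the `CR_FD_MAX_SKETCH` constant are assumptions made for illustration, only the `fds[]` member and the `fdset_fd()` name come from the commit message.

```c
/* Assumed size, for illustration only. */
#define CR_FD_MAX_SKETCH 8

/* Assumed layout: the commit only tells us there is an fds[] array. */
struct cr_fdset {
	int fds[CR_FD_MAX_SKETCH];
};

/* Accessor that callers use instead of touching fdset->fds[nr] directly. */
static inline int fdset_fd(const struct cr_fdset *fdset, int nr)
{
	return fdset->fds[nr];
}
```

Routing every access through one function means the storage behind the set can later change (for example to an offset-based array) without touching the callers again.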
Pavel Emelyanov
49cfa97954 show: Don't allocate fdset to show thread
Just use the open_image_ro for this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:56 +04:00
Pavel Emelyanov
f1429be087 dump: Don't include sk-queues fd in task set
This fd is global, so make it such. It will stop being just a global
variable soon.

Plus, remove the pid arg from format.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:44 +04:00
Pavel Emelyanov
12147a0cbd show: Fix and clean cr_show_all
Don't allocate fdset to show sk queues and don't fail on pstree
fd opening :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:42 +04:00
Stanislav Kinsbursky
b82edc9fdf crtools: dump pstree to image file without pid number
Pid number is redundant - this file is one for the whole tree.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 15:46:07 +04:00
Pavel Emelyanov
6b79601ccb files: Split regfiles info into separate file
From now on the fdinfo image only contains plain fdinfo_entry-es.
The type == FDINFO_REG files are described by regfiles.img entries
and are matched by the ID in both.

At dump stage each new ID generated results in a new entry in the
regfiles.img. At restore stage open_fe_fd should open a regfile by
the fdinfo's ID. Now this is done in suboptimal way, need to improve.

Show shows both images separately.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 21:15:16 +04:00
Pavel Emelyanov
500468d4e7 files: Split fdinfo in two parts
Make fdinfo_entry carry only the minimal info describing a file
descriptor -- the fd value itself, the fd type (regular file, exe
link, cwd, filemap and it will be pipes, sockets, inotifies, etc.)
and the describing file ID.

The mentioned ID will identify the type-d object, e.g. for regfiles
this ID is already generated with file-ids.c code.

The other part of this structure describes a regfile (i.e. a file
opened with open syscall). I put this new entry at the end of the
fdinfo_entry just to make the patching simpler. Soon this entry
will be dumped into its own file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 21:03:26 +04:00
Pavel Emelyanov
159d3bdfd5 fdinfo: Sanitize types in fdinfo_entry
The namelen is u16, to cover the PATH_MAX u8 is not enough.
The pos is u64, since file offset is that long indeed.
The id is u32 as per previous patch.

Fix printf-s respectively.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 21:00:35 +04:00
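The type choices in this commit can be checked at compile time. The sketch below is an assumption-laden illustration: the struct name, field names and field order are hypothetical, and only the three widths (u16 length, u64 position, u32 id) come from the commit message.

```c
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback, the usual Linux value */
#endif

/* Hypothetical subset of the entry; not the real on-disk layout. */
struct fdinfo_entry_sketch {
	uint32_t id;  /* file ID */
	uint64_t pos; /* file offset */
	uint16_t len; /* path length */
};

_Static_assert(PATH_MAX > UINT8_MAX, "a u8 length cannot cover PATH_MAX");
_Static_assert(PATH_MAX <= UINT16_MAX, "a u16 length covers PATH_MAX");

/* Formats one entry with conversion specifiers that match the field widths. */
int format_entry(char *buf, size_t size, const struct fdinfo_entry_sketch *e)
{
	return snprintf(buf, size,
			"id: %#" PRIx32 " pos: %#" PRIx64 " len: %" PRIu16,
			e->id, e->pos, e->len);
}
```

Using the `PRIx32`/`PRIx64` macros is one way to keep printf formats in step with fixed-width fields; the original code may have used plain `%x`/`%lx` instead.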
Pavel Emelyanov
c92c9e234e file-ids: Enlighten ID generation and storage
The unique id is 32 bit and consists only of the subid value. This
is _really_ enough. The genid part is just a hint for the tree-search
algorithm to avoid unneeded sys_kcmp calls.

Plus, generate IDs for special files. This will make it easier to
move the regfiles into separate files (see the respective patch
for details).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 20:58:19 +04:00
Pavel Emelyanov
57c4e73625 show: Don't print path len
It's actually useless for user, this field is for crtools only to
find out when one fdinfo entry ends and the new one starts.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-24 15:41:39 +04:00
Pavel Emelyanov
18cf145cc5 show: Print fdinfo types as names, not numbers
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-24 15:38:18 +04:00
Pavel Emelyanov
a3ed9db82a show: Fix timers names
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-22 21:17:57 +04:00
Pavel Emelyanov
97a1d8bb1c mm: Dump vmas into separate image file
The core image now contains only core per-task stuff.
The new file resurrects Tula magic number removed earlier.

Acked-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 18:17:12 +04:00
Andrey Vagin
e869c16df5 mm: rework of dumping shared memory
vma_entry contains shmid and all shared memory is dumped in its own files.
The most interesting thing is restore.
A mapping is restored by the process with the smallest pid. The mapping
is created before executing the restorer.
We map a full mapping and restore its content, then we open a file from
/proc/pid/map_files and store a descriptor in vma_info. The mapping is
unmapped. Now we can map any region of this mapping in the restorer.

We use this trick, because a target process may have this mapping in
several places and the restorer has no function to open proc files.

v2: fix error handling
xemul: Fixed static-s and args for cr_dump_shmem

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 11:03:55 +04:00
Cyrill Gorcunov
8a8ce9b342 show: Don't forget to open sockets queue descriptor before accessing it
Otherwise I'm getting errors like

CR_FD_SK_QUEUES
----------------
Error (./include/util.h:102): Can't read img file: Bad file descriptor
----------------

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-02 11:58:55 +04:00
Cyrill Gorcunov
827cabb480 show: Use pr_msg for showing contents on console
Due to code sharing, especially in IPC area,
the unbinding is done via helper macros and
sysctl engine tuning (new CTL_SHOW action
added).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-02 11:12:59 +04:00
Kinsbursky Stanislav
e518c44c7c show: UNIX sockets queue support
Based on xemul@ patches.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-29 17:42:30 +04:00
Cyrill Gorcunov
2acc741a3a files: Use sys_kcmp to find file descriptor duplicates v4
We switch generic-object-id concept with sys_kcmp approach,
which implies changes of image format a bit (and since it's
early time for project overall, we're allowed to).

In short -- previously every file descriptor had an ID
generated by a kernel and exported via procfs. If the
appropriate file descriptors were the same objects in
kernel memory -- the IDs matched bit for bit. This allowed
us to figure out which files were actually identical
and should be restored in a special way.

Once sys_kcmp system call was merged into the kernel,
we've got a new opportunity -- to use this syscall instead.
The syscall basically compares kernel objects and returns
ordered results suitable for objects sorting in a userspace.

For us it means -- we treat every file descriptor as a combination
of 'genid' and 'subid'. While 'genid' serves for fast comparison
between fds, the 'subid' is kind of a second key, which guarantees
uniqueness of genid+subid tuple over all file descriptors found
in a process (or group of processes).

To be able to find and dump file descriptors in a single pass we
collect every fd into a global rbtree, where (!) each node might
become a root for a subtree as well.

The main tree carries only non-equal genids. If we find a genid which
is already in the tree, we need to find out whether it's indeed
a duplicate or not. For this we use the sys_kcmp syscall and if we
find that file descriptors are different -- we simply put new
fd into a subtree.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:47 +04:00
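The two-level lookup described in this commit can be sketched as follows. This is a simplified illustration under several assumptions: it uses plain unbalanced binary trees rather than an rbtree, the comparator `obj_cmp_fn` is a stand-in for sys_kcmp (returning negative, zero or positive instead of kcmp's own return codes), and `demo_cmp`/`demo_obj` only simulate kernel object identity. None of the names are taken from the CRIU sources.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for sys_kcmp: 0 means "same kernel object", else an ordering. */
typedef int (*obj_cmp_fn)(int fd_a, int fd_b);

struct fd_node {
	uint32_t genid, subid;
	int fd;
	struct fd_node *left, *right;         /* main tree, ordered by genid */
	struct fd_node *sub_left, *sub_right; /* same genid, ordered by cmp  */
};

static uint32_t next_subid = 1;

/*
 * Find or add the file behind fd. Returns the node that represents it;
 * *is_new is set to 1 if the node was just created, 0 for a duplicate.
 */
struct fd_node *fd_collect(struct fd_node **root, uint32_t genid, int fd,
			   obj_cmp_fn cmp, int *is_new)
{
	struct fd_node **link = root;

	/* Fast path: walk the main tree comparing genids only. */
	while (*link && (*link)->genid != genid)
		link = genid < (*link)->genid ? &(*link)->left : &(*link)->right;

	/* genid collision: ask the comparator, descend into the subtree. */
	while (*link) {
		int r = cmp(fd, (*link)->fd);

		if (r == 0) {
			*is_new = 0;
			return *link;
		}
		link = r < 0 ? &(*link)->sub_left : &(*link)->sub_right;
	}

	*link = calloc(1, sizeof(**link));
	if (*link) {
		(*link)->genid = genid;
		(*link)->fd = fd;
		(*link)->subid = next_subid++;
	}
	*is_new = (*link != NULL);
	return *link;
}

/* Demo comparator: demo_obj[fd] simulates the identity of the file. */
int demo_obj[16];

int demo_cmp(int fd_a, int fd_b)
{
	return (demo_obj[fd_a] > demo_obj[fd_b]) -
	       (demo_obj[fd_a] < demo_obj[fd_b]);
}
```

The expensive comparison runs only when two descriptors already share a genid, which is the point of keeping genid as a cheap first-level key.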
Pavel Emelyanov
7f96ec68e2 show: Show blocked sigset
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-20 17:10:16 +04:00
Kinsbursky Stanislav
4101487f87 IPC: show semaphores set
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-15 13:33:46 +04:00
Kinsbursky Stanislav
24c4381644 IPC: show message queue dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-14 20:21:30 +04:00
Pavel Emelyanov
227f177194 cr: Split dumped pages locations
This actually does two things:

1. The parasite code writes to pages _or_ to pages_shared file itself based
   on a hint given from the main program. This avoids shared pages copying
   in finalize_core.

2. The private pages are moved out of the core file into a separate one. This
   avoids private pages copying in finalize_core.

The goal of this patch is a) to avoid pages copying at all (we still have
one on restore, but fixing this requires Andrey's work on shared memory
dumping) and b) make big blobs with pages be stored in separate files (I
have plans on its format rework and unification).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-12 11:45:29 +04:00
Kinsbursky Stanislav
516099d885 IPC: show shared memory dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-09 13:21:46 +04:00
Kinsbursky Stanislav
a0ec1002b2 crtools: cleanup fdset initialization
v2: wrappers names become less obfuscating

This patch:
1) Updates function cr_fdset_open() to be suitable for handling fdset creation
for dump and show stages.
2) Replaces cr_fdset_open() by new wrapper function cr_fdset_dump().
3) Replaces prep_cr_fdset_for_restore() by new wrapper function cr_fdset_show().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 21:43:12 +04:00
Kinsbursky Stanislav
530f9d9030 IPC: collect and dump tunables sequentially
This patch removes collect stage and dumps tunables object right after
collect.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 16:31:41 +04:00
Stanislav Kinsbursky
5d40ea2ff1 IPC: show dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
Kir Kolyshkin
e70f8d2376 cr-show.c: fix printf format warnings
cr-show.c: In function ‘show_pipes’:
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 2 has type ‘u32’
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 3 has type ‘u32’
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 4 has type ‘u32’
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 5 has type ‘u32’

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:57:16 +04:00
Kir Kolyshkin
0b237ae9f2 pr_perror(): print error at the end of line
It is a standard convention to print the error message (i.e. strerror(errno))
at the end of the line, like this:

        Cannot remove file: Permission denied

So pr_perror is fixed to follow this convention (using GNU extension
%m helps a lot here). Unfortunately, due to this we have to make
pr_perror() print a new line character, too, so we had to strip it
from all the pr_perror() invocations.

That (appending a newline) also makes pr_perror() a black sheep
in the herd of pr_* helpers, but what can we do? Worst case scenario
is an extra newline after an error message, not too harmful.

An alternative approach (stripping the newline from the passed format
string and re-adding it) was discussed thoroughly, and it was decided
that such a hack looks a bit too dirty.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:49:15 +04:00