2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 13:28:27 +00:00

83 Commits

Author SHA1 Message Date
Andrey Vagin
bdb3932be5 pipe: all pipes are saved in one file (v2)
Information about pipe's file structs saved in one global file and
fdinfo_entry is saved for each descriptor

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 21:17:24 +04:00
Pavel Emelyanov
2a33c4d5dc mem: Remove zero page from the end of mem image files
This was required when pages were stored in elf files for
exec. Now we can stop reading it on eof.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-05 14:07:31 +04:00
Pavel Emelyanov
9b2617353b inet: Rework inet sk dumping on new fdinfo scheme
Now every inetsk fd dump results in a new entry in the fdinfo.img file. Sockets itself are
dumped into inetsk.img global image file. On restore the generic fdinfo redistribution algo
is used and inet sockets are opened only when required.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-27 12:42:59 +04:00
Pavel Emelyanov
c58abfd03d show: Introduce ->show callback for fdset
Each fdset item now has the callback which will show a contents of a magic-described
image file. Per-task and global show code is reworked to walk the respective fdsets
and calling ->show on each file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-27 12:01:14 +04:00
Pavel Emelyanov
82b7c07ca9 show: Fix 'all' mode showing
After we removed the pid from pstree image file the -t or -p option for show
command no longer makes sense. Make 'show' mode rely on -D option to find out
where to find the root (i.e. pstree.img) file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-27 11:04:23 +04:00
Pavel Emelyanov
3858ee4950 fdset: Introduce two fdsets -- task and ns
Write two helpers for opening an fdset for task and one for ns.

This probably can be done with some "generic" macro(s), but this
time it's simpler not to produce more code of that type.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:57:00 +04:00
Pavel Emelyanov
bcf9ee3d1c fdset: Helper for getting fd out of a set
This patch does

s/$fdset->fds[$nr]/fdset_fd($fdset, $nr)/

over the code.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:59 +04:00
Pavel Emelyanov
49cfa97954 show: Don't allocate fdset to show thread
Just use the open_image_ro for this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:56 +04:00
Pavel Emelyanov
f1429be087 dump: Don't include sk-queues fd in task set
This fd is global, so make it such. It will stop being just a global
variable soon.

Plus, remove the pid arg from format.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:44 +04:00
Pavel Emelyanov
12147a0cbd show: Fix and clean cr_show_all
Don't allocate fdset to show sk queues and don't fail on pstree
fd opening :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:42 +04:00
Stanislav Kinsbursky
b82edc9fdf crtools: dump pstree to image file without pid number
Pid number is redundant - this file is one for the whole tree.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 15:46:07 +04:00
Pavel Emelyanov
6b79601ccb files: Split regfiles info into separate file
Since now on the fdinfo image only contains plain fdinfo_entry-es.
The tpye == FDINFO_REG files are described by regfiles.img entries
and are matched by te ID in both.

At dump stage each new ID generated results in a new entry in the
regfiles.img. At restore stage open_fe_fd should open a regfile by
the fdinfo's ID. Now this is done in suboptimal way, need to improve.

Show shows both images separately.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 21:15:16 +04:00
Pavel Emelyanov
500468d4e7 files: Split fdinfo in two parts
Make fdinfo_entry carry only the minimal info describing a file
descriptor -- the fd value itself, the fd type (regular file, exe
link, cwd, filemap and it will be pipes, sockets, inotifies, etc.)
and the describing file ID.

The mentioned ID will identify the type-d object, e.g. for regfiles
this ID is already generated with file-ids.c code.

The other part of this structure describes a regfile (i.e. a file
opened with open syscall). I put this new entry at the end of the
fdinfo_entry just to make the patching simpler. Soon this entry
will be dumped into its own file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 21:03:26 +04:00
Pavel Emelyanov
159d3bdfd5 fdinfo: Sanitize types in fdinfo_entry
The namelen is u16, to cover the PATH_MAX u8 is not enough.
The pos is u64, since file offset is that long indeed.
The id is u32 as per previous patch.

Fix printf-s respectively.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 21:00:35 +04:00
Pavel Emelyanov
c92c9e234e file-ids: Enlighten ID generation and storage
The unique id is 32 bit and consists only of the subid value. This
is _really_ enough. The genid part is just a hint for the tree-search
algirythm to avoid unneeded sys_kcmp calls.

Plus, generate IDs for special files. This will make it easier to
move the regfiles into into separate files (see the respective patch
for details).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 20:58:19 +04:00
Pavel Emelyanov
57c4e73625 show: Dont't print path len
It's actually useless for user, this field is for crtools only to
find out when one fdinfo entry ends and the new one starts.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-24 15:41:39 +04:00
Pavel Emelyanov
18cf145cc5 show: Print fdinfo types as names, not numbers
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-24 15:38:18 +04:00
Pavel Emelyanov
a3ed9db82a show: Fix timers names
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-22 21:17:57 +04:00
Pavel Emelyanov
97a1d8bb1c mm: Dump vmas into separate image file
The core image now contains only core per-task stuff.
The new file resurrects Tula magic number removed earlier.

Acked-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 18:17:12 +04:00
Andrey Vagin
e869c16df5 mm: rework of dumping shared memory
vma_entry contains shmid and all shared memory are dumped in own files.
The most interesting thing is restore.
A maping is restored by process with the smallest pid. The mamping
is created before executing restorer.
We map a full mapping and restore it's conten, then we open a file from
/proc/pid/map_files and store a descriptor in vma_info. The mapping is
unmaped. Now we can map any region of this mapping in the restorer.

We use this trick, because a target process may have this mapping in
some places and the restorer has not function to open proc files.

v2: fix error hangling
xemul: Fixed static-s and args for cr_dump_shmem

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 11:03:55 +04:00
Cyrill Gorcunov
8a8ce9b342 show: Don't forget to open sockets queue descriptor before accessing it
Otherwise I'm getting errors like

CR_FD_SK_QUEUES
----------------
Error (./include/util.h:102): Can't read img file: Bad file descriptor
----------------

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-02 11:58:55 +04:00
Cyrill Gorcunov
827cabb480 show: Use pr_msg for showing contents on console
Due to code sharing, especially in IPC area,
the unbinding is done via helper macros and
sysclt engine tuning (new CTL_SHOW action
added).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-02 11:12:59 +04:00
Kinsbursky Stanislav
e518c44c7c show: UNIX sockets queue support
Based on xemul@ patches.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-29 17:42:30 +04:00
Cyrill Gorcunov
2acc741a3a files: Use sys_kcmp to find file descriptor duplicates v4
We switch generic-object-id concept with sys_kcmp approach,
which implies changes of image format a bit (and since it's
early time for project overall, we're allowed to).

In short -- previously every file descriptor had an ID
generated by a kernel and exported via procfs. If the
appropriate file descriptors were the same objects in
kernel memory -- the IDs did match up to bit. It allows
us to figure out which files were actually the identical
ones and should be restored in a special way.

Once sys_kcmp system call was merged into the kernel,
we've got a new opprotunity -- to use this syscall instead.
The syscall basically compares kernel objects and returns
ordered results suitable for objects sorting in a userspace.

For us it means -- we treat every file descriptor as a combination
of 'genid' and 'subid'. While 'genid' serves for fast comparison
between fds, the 'subid' is kind of a second key, which guarantees
uniqueness of genid+subid tuple over all file descritors found
in a process (or group of processes).

To be able to find and dump file descriptors in a single pass we
collect every fd into a global rbtree, where (!) each node might
become a root for a subtree as well.

The main tree carries only non-equal genid. If we find genid which
is already in tree, we need to make sure that it's either indeed
a duplicate or not. For this we use sys_kcmp syscall and if we
find that file descriptors are different -- we simply put new
fd into a subtree.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:47 +04:00
Pavel Emelyanov
7f96ec68e2 show: Show blocked sigset
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-20 17:10:16 +04:00
Kinsbursky Stanislav
4101487f87 IPC: show semaphores set
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-15 13:33:46 +04:00
Kinsbursky Stanislav
24c4381644 IPC: show message queue dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-14 20:21:30 +04:00
Pavel Emelyanov
227f177194 cr: Split dumped pages locations
This actually does two things:

1. The parasite code writes to pages _or_ to pages_shared file himself based
   on a hint given from the main program. This avoids shared pages copying
   in finalize_core.

2. The private pages are moved out of the core file into a separate one. This
   avoids private pages copying in finalize_core.

The goal of this patch is a) to avoid pages copying at all (we still have
one on restore, but fixing this requires Andrey's work on shared memory
dumping) and b) make big blobs with pages be stored in separate files (I
have plans on its format rework and unification).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-12 11:45:29 +04:00
Kinsbursky Stanislav
516099d885 IPC: show shared memory dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-09 13:21:46 +04:00
Kinsbursky Stanislav
a0ec1002b2 crtools: cleanup fdset initalization
v2: wrappers names become less obfuscating

This patch:
1) Updates function cr_fdset_open() to be suitable for handling fdset creation
for dump and show stages.
2) Replaces cr_fdset_open() by new wrapper function cr_fdset_dump().
3) Replaces prep_cr_fdset_for_restore() by new wrapper function cr_fdset_show().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 21:43:12 +04:00
Kinsbursky Stanislav
530f9d9030 IPC: collect and dump tunables sequentially
This patch removes collect stage and dumps tunables object right after
collect.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 16:31:41 +04:00
Stanislav Kinsbursky
5d40ea2ff1 IPC: show dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
Kir Kolyshkin
e70f8d2376 cr-show.c: fix printf format warnings
cr-show.c: In function ‘show_pipes’:
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 2 has type ‘u32’
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 3 has type ‘u32’
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 4 has type ‘u32’
cr-show.c:119:3: error: format ‘%8lx’ expects type ‘long unsigned int’, but argument 5 has type ‘u32’

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:57:16 +04:00
Kir Kolyshkin
0b237ae9f2 pr_perror(): print error at the end of line
This is a standard convention to print error message (i.e. strerror(errno))
at the end of line, like this:

        Cannot remove file: Permission denied

So pr_perror is fixed to follow this convention (using GNU extension
%m helps a lot here). Unfortunately, due to this we have to make
pr_perror() print a new line character, too, so we had to strip it
from the all pr_perror() invocations.

That (appending a newline) also makes pr_perror() a black sheep
in the herd of pr_* helpers, but what can we do? Worst case scenario
is an extra newline after an error message, not too harmful.

An alternative approach (stripping the newline from the passed format
string and re-adding it) was discussed thoroughly, and it was decided
that such a hack looks a bit too dirty.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:49:15 +04:00
Stanislav Kinsbursky
225d119e5d namespaces: split UTS and generic code
Generic code will be used for other namespaces.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 13:43:28 +04:00
Cyrill Gorcunov
ef57024410 show: Always print process' exit_code
It's suitable not for info only but in debug purpose as well.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
2012-01-30 17:39:21 +04:00
Cyrill Gorcunov
5345889886 show: Drop tab'ified empty lines
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
2012-01-30 17:39:21 +04:00
Pavel Emelyanov
beb158a66e cr: Task creds support
Dumping is simple. All but secbits can be read from proc, secbits
are got from parasite.

Restoring is a bit tricky -- when you change anything on kernel
cred's struct it performs sophisticated checks and can change
some more stuff than requested, so the creds restoration procedure
is carefully commented step-by-step.

Another thing to mention is that creds are restored after everything
else, i.e. right before performing final threads sync and sigreturns.
This is done to avoid potential problems with insufficient caps for
restoring other stuff (e.g. CAP_DAC_OVERRIDE or zero euid is most
likely required for opening any image file and the notorious control
/proc/sys/kernel/ns_last_pid, which in turn is performed till the
very last moment).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 13:00:50 +04:00
Pavel Emelyanov
98f4c2e4de ns: Support UTS namespace
Only two fields are modifiable -- hostname and domainname. So
read them on dump and write on restore.

File format is simple --

u32 magic
u32 length of nodename
u8[] nodename string
u32 length of domainname
u8[] domainname string

For OpenVZ we can write the release at the end, but this is later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
Pavel Emelyanov
3391416a1b crtools: Namespaces support skeleton
New option -n to dump/restore namespaces.

Fork the namespaces dumping task and write a helper for switching a namespace.

Prepare the restorer code for restoring namespaces before root task.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
Pavel Emelyanov
8e90a9e6d8 crtools: Tossing CR_FD_ bits around
Split the CR_FD_ bits into per-task and global ones and replace
of CR_FD_DESC_NOPSTREE with CR_FD_DESC_TASK, which is explicit
set of per-task bits.

The CR_FD_DESC_NS will appear soon.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
Pavel Emelyanov
b7de83aaf3 crtools: Interval timers support
Timers are dumped from inside parasite code, the format is plain -- just
3 pairs of interval/value one-by-one.

The restoration occurs in two stages -- first prepare the timer values in
restorer (and check for sanity), then setup the timers in the latest stage
before actually calling the sigreturn.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-24 18:41:49 +04:00
Cyrill Gorcunov
a9fd7583e6 show: Add cmdline args and envirion entries
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-01-24 18:01:08 +04:00
Pavel Emelyanov
18aaad6164 img: Extend task image with state and exit code
Introduce 3 states we will have to work with:

* alive for tasks sleeping or running
* dead for zombies
* stopped for stopped tasks. We cannot distinguish tasks in this state now,
  but with freezer cgroup this will become possible

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-23 01:43:36 +04:00
Pavel Emelyanov
164ccc095f crtools: R/W API rewrite
Kill all the macros for reading/writing image parts. New API looks like

* write_img_buf/write_img
  Write an object into an image. Reports 0 for OK, -1 for error. The _buf
  version accepts object size as an argument, the other one uses sizeof()

* read_img_buf/read_img
  Reads an object from image. Reports 0 for OK, -1 for error or EOF.

* read_img_buf_eof/read_img
  Reads an object from image. Reports 1 for OK, 0 for EOF and -1 for error.
  This is not symmetrical with the previous one, but it was done deliberately
  to make it possible to write code like

  ret = read_img_bug_eof();
  if (ret <= 0)
	return ret; /* 0 means OK, all is done, -1 means error was met */.

  ... /* 1 means object was read, can proceed */

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-23 01:43:10 +04:00
Pavel Emelyanov
dbf3c1a8cd crtools: Reformat core_entry
Keep task arch-independent fields in one struct (will be extended) in the
beginning of the image and make pads be located separately.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-23 01:43:00 +04:00
Stanislav Kinsbursky
7fc6561c0c show: inet sockets dump parsing support added
Nothing to comment there.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-18 12:39:04 +04:00
Pavel Emelyanov
6babc5c17b files: Show file ID in -s output
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-15 22:37:49 +04:00
Cyrill Gorcunov
893f5099cb show: Drop redundat rendering of process tree
Leftover from merging stage.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-15 22:36:18 +04:00
Cyrill Gorcunov
53d75ad13c show: Fixup nit left after merging
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-13 21:42:18 +04:00