2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 12:57:57 +00:00

136 Commits

Author SHA1 Message Date
Pavel Emelyanov
4a3861acb8 fdset: Introduce glbal fdset
This contains reg-files and sk-queues images, as they contain data
which is potentially generated by every task, so keep it open all
the time dump goes.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:57:07 +04:00
Pavel Emelyanov
1fb1d94186 fdset: Introduce new fdsets
Current fdsets are ugly, limited (bitmask will exhaust in several months) and
suffer from unknown problems with fdsets reuse :(

With new approach (this set) the images management is simple. The basic function
is open_image, which gives you an fd for an image. If you want to pre-open several
images at once instead of calling open_image every single time, you can use the
new fdsets.

Images CR_FD_ descriptors should be grouped like

_CR_FD_FOO_FROM,
CR_FD_FOO_ITEM1,
CR_FD_FOO_ITEM2,
..
CR_FD_FOO_ITEMN,
_CR_FD_FOO_TO,

After this you can call cr_fd_open() specifying ranges -- _FROM and _TO macros,
it will give you an cr_fdset object. Then the fdset_fd(set, type) will give you
the descriptor of the open "set" group corresponding to the "type" type.

3 groups are introduced in this set -- tasks, ns and global.

That's it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:57:04 +04:00
Pavel Emelyanov
3858ee4950 fdset: Introduce two fdsets -- task and ns
Write two helpers for opening an fdset for task and one for ns.

This probably can be done with some "generic" macro(s), but this
time it's simpler not to produce more code of that type.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:57:00 +04:00
Pavel Emelyanov
bcf9ee3d1c fdset: Helper for getting fd out of a set
This patch does

s/$fdset->fds[$nr]/fdset_fd($fdset, $nr)/

over the code.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:59 +04:00
Pavel Emelyanov
7241b9291b fdset: Kill ability to re-use fdset
It's not required any longer. Now fdsets are allocated one-by-one only
when required and there's no need in adding new fds to existing sets.

Thus just remove the last arg from cr_fdset_open.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-26 22:56:51 +04:00
Pavel Emelyanov
95f957b837 image: New image file for regfiles
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-25 21:11:58 +04:00
Pavel Emelyanov
97a1d8bb1c mm: Dump vmas into separate image file
The core image now contains only core per-task stuff.
The new file resurrects Tula magic number removed earlier.

Acked-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 18:17:12 +04:00
Andrey Vagin
e869c16df5 mm: rework of dumping shared memory
vma_entry contains shmid and all shared memory are dumped in own files.
The most interesting thing is restore.
A maping is restored by process with the smallest pid. The mamping
is created before executing restorer.
We map a full mapping and restore it's conten, then we open a file from
/proc/pid/map_files and store a descriptor in vma_info. The mapping is
unmaped. Now we can map any region of this mapping in the restorer.

We use this trick, because a target process may have this mapping in
some places and the restorer has not function to open proc files.

v2: fix error hangling
xemul: Fixed static-s and args for cr_dump_shmem

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 11:03:55 +04:00
Andrey Vagin
31feef8ab4 mm: s/PAGES_SHMEM/SHMEM_PAGES
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 10:57:23 +04:00
Andrey Vagin
5dda50468b mm: change offset of zero_page_entry to ~0LL
Because 0 is actually a valid value.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-21 10:57:14 +04:00
Andrey Vagin
5ca347889e crtools: support any format of image path (v3)
Now a name of an image file is hard coded ("smth-%d.img", pid),
but the images of namespaces, shared memery, etc belong to
not one task, so they may have other formats of names, which
will describe objects.

For example a image of shared memory content may have name like
this ("pages-shmem-%ld.img", shmid)

v2: fix comment
v3: rebase

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-19 15:45:12 +04:00
Pavel Emelyanov
ffacd0f17c image: Open images via openat
Using absolute paths for this is dangerous - while doing c/r we should
be extremely carefully and not change tasks' roots and mount namespaces
too early. Sometimes it will not work -- when restoring containers we'll
be unable to switch to new CT and still have the ability to open images.

Rework the images opening via openat and keep the image dir fd open all
the time as the service fd (introduced earlier).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-16 20:45:50 +04:00
Pavel Emelyanov
c39e759048 check: Initial skeleton
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-02 14:15:09 +04:00
Cyrill Gorcunov
827cabb480 show: Use pr_msg for showing contents on console
Due to code sharing, especially in IPC area,
the unbinding is done via helper macros and
sysclt engine tuning (new CTL_SHOW action
added).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-02 11:12:59 +04:00
Cyrill Gorcunov
7aa8e4b6e2 log: log-engine slight redesign
The messages are filtered by their type

    LOG_MSG     - plain messages, they escape any (!) log level
                  filtration and go to stdout
    LOG_ERROR   - error messages
    LOG_WARN    - warning messages
    LOG_INFO    - informative messages
    LOG_DEBUG   - debug messages

By default the LOG_WARN log level is used, thus LOG_INFO
and LOG_DEBUG messages will not appear in output stream.

pr_panic helper was replaced with pr_err, pr_warning
shorthanded to pr_warn and old printk if rather pr_msg
now.

Because we share messages between "show" and "dump" actions,
before the "show" action proceed we need to tune up
log level and set it to LOG_INFO.

Also note that printing of VMA and siginfo now
became LOG_INFO messages, it was not that correct
to print them regardless the log level.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-02 01:05:43 +04:00
Pavel Emelyanov
9e0b308af0 dump/restore: Rework final-state switch
Remove CR_TASK_XXX states, use the TASK_XXX ones (for image). This is
required to unseize tasks properly in the next patches.

Plus, make sure that pstree_list and the seized set coincide (i.e.
handle error in collect_task).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-01 19:31:20 +04:00
Kinsbursky Stanislav
c19012326d dump: socket queues support
This patch was designed to be generic and thus usable for all kinds of
sockets. Not sure, thah this goal has been reached, but at least I tried.

Key ideas:
1) On-stack structure for collecting sockets queues and then passing them to
   parasite code.
2) Singly linked list is used for collecting structures, representing sockets
   of any kind (!) with queues.

Based on xemul@ patches.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-29 17:42:30 +04:00
Cyrill Gorcunov
68654479c6 crtools: Drop pr_debug from fdset ops
They are redundant, and simply overlog the output

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-20 14:23:28 +04:00
Cyrill Gorcunov
ef97467da9 log: Add log-levels
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-20 14:23:28 +04:00
Kinsbursky Stanislav
4141296ed7 IPC: dump semaphores set
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-15 13:33:46 +04:00
Kinsbursky Stanislav
fa2ff60680 IPC: dump message queue
v2: New "MSG_STEAL" functionality is used

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-14 20:21:30 +04:00
Kinsbursky Stanislav
3d886be2c6 IPC: dump shared memory
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-09 13:21:46 +04:00
Kinsbursky Stanislav
a0ec1002b2 crtools: cleanup fdset initalization
v2: wrappers names become less obfuscating

This patch:
1) Updates function cr_fdset_open() to be suitable for handling fdset creation
for dump and show stages.
2) Replaces cr_fdset_open() by new wrapper function cr_fdset_dump().
3) Replaces prep_cr_fdset_for_restore() by new wrapper function cr_fdset_show().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 21:43:12 +04:00
Kinsbursky Stanislav
530f9d9030 IPC: collect and dump tunables sequentially
This patch removes collect stage and dumps tunables object right after
collect.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 16:31:41 +04:00
Cyrill Gorcunov
e61605169f ctrools: Rewrite task/threads stopping engine is back
This commit brings the former "Rewrite task/threads stopping engine"
commit back. Handling it separately is too complex so better try
to handle it in-place.

Note some tests might fault, it's expected.
---

Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE if enough. Moreover, just killing a task with STOP is also racy,
since task should be given some time to come to sleep before its proc
can be parsed.

Rewrite all this code to SEIZE task and all its threads from the very beginning.

With this we can distinguish stopped task state and migrate it properly (not
supported now, need to implement).

This thing however has one BIG problem -- after we SEIZE-d a task we should
seize
it's threads, but we should do it in a loop -- reading /proc/pid/task and
seizing
them again and again, until the contents of this dir stops changing (not done
now).

Besides, after we seized a task and all its threads we cannot scan it's children
list once -- task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also)

This patch is ugly, yes, but splitting it doesn't help to review it much, sorry
:(

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 19:49:28 +04:00
Cyrill Gorcunov
63b88720a3 Revert "ctrools: Rewrite task/threads stopping engine"
This reverts commit 6da51eee3f6cd7aca9dd88275844e73fb78b767b.

It breaks transition/file_read test case
2012-02-01 19:27:28 +04:00
Pavel Emelyanov
6da51eee3f ctrools: Rewrite task/threads stopping engine
Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE if enough. Moreover, just killing a task with STOP is also racy,
since task should be given some time to come to sleep before its proc
can be parsed.

Rewrite all this code to SEIZE task and all its threads from the very beginning.

With this we can distinguish stopped task state and migrate it properly (not
supported now, need to implement).

This thing however has one BIG problem -- after we SEIZE-d a task we should seize
it's threads, but we should do it in a loop -- reading /proc/pid/task and seizing
them again and again, until the contents of this dir stops changing (not done now).

Besides, after we seized a task and all its threads we cannot scan it's children
list once -- task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also)

This patch is ugly, yes, but splitting it doesn't help to review it much, sorry :(

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 17:29:13 +04:00
Kir Kolyshkin
1408ead858 Assorted trivial message fixes
* kid -> child
* First letter should be uppercase
* Misc typos in messages and comments

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 02:55:16 +04:00
Stanislav Kinsbursky
c75c33cb86 IPC: restore namespace itself
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
Stanislav Kinsbursky
c826057a9c IPC: dump namespace itself
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
Stanislav Kinsbursky
9cdfe71921 namespaces: docs updated
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
Stanislav Kinsbursky
0213d3ec64 namespaces: parametrized namespace option introduced
v2: strlen() check removed from parse_ns_string()

Now '-n' option must be followed by namespaces tags, separated by commas.
Currently, only "uts" namespace is supported.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
Cyrill Gorcunov
7705df5397 crtools: Make fdset operations robust against open() errors
There are two cases for cr_fdset_open

 - It might be called with already allocated
   memory so we should reuse it.

 - It might be called with NULL pointing out
   that caller expects us to allocate memory.

If an open() error happens somewhere inside cr_fdset_open
it requires two error paths

 - Just close all files opened but don't free memory
   if it was not allocated by us

 - Close all files opened *and* free memory allocated
   by us.

In any case we should close all files opened so close_cr_fdset()
helper is splitted into two parts.

Also the caller should be ready for such semantics as well and
do not re-assign pointers obtained but simply test for NULL
on results.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-01-31 19:15:32 +04:00
Kir Kolyshkin
0b237ae9f2 pr_perror(): print error at the end of line
This is a standard convention to print error message (i.e. strerror(errno))
at the end of line, like this:

        Cannot remove file: Permission denied

So pr_perror is fixed to follow this convention (using GNU extension
%m helps a lot here). Unfortunately, due to this we have to make
pr_perror() print a new line character, too, so we had to strip it
from the all pr_perror() invocations.

That (appending a newline) also makes pr_perror() a black sheep
in the herd of pr_* helpers, but what can we do? Worst case scenario
is an extra newline after an error message, not too harmful.

An alternative approach (stripping the newline from the passed format
string and re-adding it) was discussed thoroughly, and it was decided
that such a hack looks a bit too dirty.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:49:15 +04:00
Kir Kolyshkin
b8b0dd42a7 Remove duplicate strerror(errno) printing
Function pr_perror() already spits out strerror(errno), no need to do it
in the calling code.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 21:29:51 +04:00
Kir Kolyshkin
789a2c7f7a Trivial whitespace cleanup
Cleaning a few space-at-EOL occurences, plus one spaces-instead-of-tab.

Found using:

	git grep -n '[[:space:]]$'
	git grep -n '        '

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 21:29:42 +04:00
Pavel Emelyanov
beb158a66e cr: Task creds support
Dumping is simple. All but secbits can be read from proc, secbits
are got from parasite.

Restoring is a bit tricky -- when you change anything on kernel
cred's struct it performs sophisticated checks and can change
some more stuff than requested, so the creds restoration procedure
is carefully commented step-by-step.

Another thing to mention is that creds are restored after everything
else, i.e. right before performing final threads sync and sigreturns.
This is done to avoid potential problems with insufficient caps for
restoring other stuff (e.g. CAP_DAC_OVERRIDE or zero euid is most
likely required for opening any image file and the notorious control
/proc/sys/kernel/ns_last_pid, which in turn is performed till the
very last moment).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 13:00:50 +04:00
Pavel Emelyanov
98f4c2e4de ns: Support UTS namespace
Only two fields are modifiable -- hostname and domainname. So
read them on dump and write on restore.

File format is simple --

u32 magic
u32 length of nodename
u8[] nodename string
u32 length of domainname
u8[] domainname string

For OpenVZ we can write the release at the end, but this is later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
Pavel Emelyanov
3391416a1b crtools: Namespaces support skeleton
New option -n to dump/restore namespaces.

Fork the namespaces dumping task and write a helper for switching a namespace.

Prepare the restorer code for restoring namespaces before root task.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
Pavel Emelyanov
b7de83aaf3 crtools: Interval timers support
Timers are dumped from inside parasite code, the format is plain -- just
3 pairs of interval/value one-by-one.

The restoration occurs in two stages -- first prepare the timer values in
restorer (and check for sanity), then setup the timers in the latest stage
before actually calling the sigreturn.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-24 18:41:49 +04:00
Pavel Emelyanov
164ccc095f crtools: R/W API rewrite
Kill all the macros for reading/writing image parts. New API looks like

* write_img_buf/write_img
  Write an object into an image. Reports 0 for OK, -1 for error. The _buf
  version accepts object size as an argument, the other one uses sizeof()

* read_img_buf/read_img
  Reads an object from image. Reports 0 for OK, -1 for error or EOF.

* read_img_buf_eof/read_img
  Reads an object from image. Reports 1 for OK, 0 for EOF and -1 for error.
  This is not symmetrical with the previous one, but it was done deliberately
  to make it possible to write code like

  ret = read_img_bug_eof();
  if (ret <= 0)
	return ret; /* 0 means OK, all is done, -1 means error was met */.

  ... /* 1 means object was read, can proceed */

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-23 01:43:10 +04:00
Pavel Emelyanov
cf0550ce61 dump: Images opening rework
Rename prep_cr_fdset_for_dump into cr_fdset_open and make it reentable, i.e.
every next enter will open more files in the same fdset. Required for zombies
and makes the code cleaner.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-23 01:42:51 +04:00
Andrey Vagin
f00efadd2c This patches improves execution of zdtm tests.
It's really like in VZ.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-18 23:31:22 +04:00
Andrey Vagin
b601765782 devide commands and options
./crtools CMD [options]
CMD may be show, restore, dump

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-18 23:31:12 +04:00
Stanislav Kinsbursky
f3253a40d2 checkpoint: IPv4 listening sockets dumping support 2012-01-18 12:38:58 +04:00
Pavel Emelyanov
8c9c575ab4 util: Move get_image_path to util.c
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-16 13:43:23 +04:00
Cyrill Gorcunov
b0a9533080 crtools: Make sure there is enough space to hold image path
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-01-13 17:09:17 +04:00
Cyrill Gorcunov
9360a73cad crtools: Make close_cr_fdset being safe to be called with closed fd-set
It was being done intentionally to be able to call close_cr_fdset
several times in a row, bring this ability back. Otherwise I'm
getting glibc complains about attemt to free already freed memory.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-12 21:25:19 +04:00
Pavel Emelyanov
6b83aef6a1 crtools: Merge fdset free into close
The same as previous patch -- no need in two separate calls.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-12 18:51:21 +04:00
Pavel Emelyanov
871b73674d crtools: Merge fdset allocation into prep
They always go in pairs so there's no need in two calls.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-12 18:51:13 +04:00