mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 05:18:00 +00:00

1696 Commits

Author SHA1 Message Date
Pavel Emelyanov
c56574b411 dump: Obtain task brk via misc dump command
Right now we do syscall_seized for this, but we have the misc dumping command
and the core is (after patch #3) dumped after the parasite, so we can get brk from
the misc dump, thus avoiding one more switch to parasite.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-12 11:45:29 +04:00
Pavel Emelyanov
028d5df3b0 parasite: Remove kill_seized
It's not used in code.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-12 11:45:29 +04:00
Pavel Emelyanov
227f177194 cr: Split dumped pages locations
This actually does two things:

1. The parasite code writes to pages _or_ to pages_shared file itself based
   on a hint given from the main program. This avoids shared pages copying
   in finalize_core.

2. The private pages are moved out of the core file into a separate one. This
   avoids private pages copying in finalize_core.

The goal of this patch is a) to avoid pages copying at all (we still have
one on restore, but fixing this requires Andrey's work on shared memory
dumping) and b) make big blobs with pages be stored in separate files (I
have plans on its format rework and unification).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-12 11:45:29 +04:00
Kir Kolyshkin
82e548ecca socket.c: use const whenever possible
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-10 18:07:19 +04:00
Cyrill Gorcunov
4749ef2626 syscall: Drop unneeded addr variable
Otherwise it becomes a global variable that propagates
even to files that must not reference it
(such as parasite code)

Symbol table '.symtab' contains 22 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
...
     4: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS parasite.c
...
    21: 000000000000b208     8 OBJECT  GLOBAL DEFAULT    3 addr

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-10 17:38:43 +04:00
Kinsbursky Stanislav
516099d885 IPC: show shared memory dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-09 13:21:46 +04:00
Kinsbursky Stanislav
3d886be2c6 IPC: dump shared memory
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-09 13:21:46 +04:00
Kinsbursky Stanislav
a0ec1002b2 crtools: cleanup fdset initialization
v2: wrappers names become less obfuscating

This patch:
1) Updates function cr_fdset_open() to be suitable for handling fdset creation
for dump and show stages.
2) Replaces cr_fdset_open() by new wrapper function cr_fdset_dump().
3) Replaces prep_cr_fdset_for_restore() by new wrapper function cr_fdset_show().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 21:43:12 +04:00
Kinsbursky Stanislav
530f9d9030 IPC: collect and dump tunables sequentially
This patch removes collect stage and dumps tunables object right after
collect.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 16:31:41 +04:00
Pavel Emelyanov
82a8a8ff95 sockets: Decode kernel dev_t into stat's one
Unix diag and stat report dev_t-s in different formats. Cast one to
another when comparing in unix dumping.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-08 16:11:27 +04:00
Cyrill Gorcunov
76a249282e restore: Add checkpoint/restore for /proc/pid/exe symlink
This patch adds ability to checkpoint/restore
/proc/pid/exe symlink, so if a process we've just
checkpointed has been say /path/to/exe, then at restore
time we bring this path back.

There is a restriction on the kernel side: if the
existing /proc/pid/exe is already mapped more than
once, the kernel will refuse to change the symlink,
so we need to restore it late, when the mmaps of crtools
itself are already unmapped (i.e. via a late call in
restorer.c).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-07 20:08:01 +04:00
Andrey Vagin
4d962b27c0 crtools: dump and restore clear_tid_address
pthread_join works with this patch

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-03 17:28:04 +04:00
Andrey Vagin
e8c6c877ed syscall: add set_tid_address
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-03 17:27:31 +04:00
Cyrill Gorcunov
405985e964 Add sysctl handling engine
Since we need to operate on sysctls pretty heavily, it is
better to add a common engine for all handlers.

Based-on-patch-from: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-02 21:22:20 +04:00
Andrey Vagin
27b20dfcc6 restore: clean up code, which synchronizes resume of tasks (v2)
I added two synchronization mechanisms. The second one is better.
This patch deletes the first one.

Before, we had an entry (pid and lock) for each task, and all these
entries were shared between all processes. Now we don't need "lock",
and we use the pids from crtools to kill all processes if one of them fails.

v2: s/malloc/xmalloc

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-02 20:45:43 +04:00
Cyrill Gorcunov
e61605169f ctrools: Rewrite task/threads stopping engine is back
This commit brings the former "Rewrite task/threads stopping engine"
commit back. Handling it separately is too complex so better try
to handle it in-place.

Note some tests might fault, it's expected.
---

Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE is enough. Moreover, just killing a task with STOP is also racy,
since task should be given some time to come to sleep before its proc
can be parsed.

Rewrite all this code to SEIZE task and all its threads from the very beginning.

With this we can distinguish stopped task state and migrate it properly (not
supported now, need to implement).

This thing however has one BIG problem -- after we SEIZE-d a task we should seize
its threads, but we should do it in a loop -- reading /proc/pid/task and seizing
them again and again, until the contents of this dir stops changing (not done now).

Besides, after we seized a task and all its threads we cannot scan its children
list once -- task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also)

This patch is ugly, yes, but splitting it doesn't help to review it much, sorry :(

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 19:49:28 +04:00
Cyrill Gorcunov
ab82c2de98 Revert "ipc: Drop u32[2] from image, simply use u64 all the time"
This reverts commit 4f83d028ff3062d23357f62583f22381805c6bda.

It breaks IPC test-case, need to investigate.
2012-02-01 19:27:39 +04:00
Cyrill Gorcunov
63b88720a3 Revert "ctrools: Rewrite task/threads stopping engine"
This reverts commit 6da51eee3f6cd7aca9dd88275844e73fb78b767b.

It breaks transition/file_read test case
2012-02-01 19:27:28 +04:00
Andrey Vagin
d4c67416e9 parasite: transfer image fd in parasite
Don't open image files from parasite.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 19:26:58 +04:00
Andrey Vagin
13faf4ee7f parasite: create unix socket in parasite for transferring fd (v3)
Before, we used absolute paths to images and wrote logging messages to the
stderr of the target process.

Now we are able to create fd and send it to parasite.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 19:26:50 +04:00
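As an illustration of the fd transfer described in 13faf4ee7f, here is a minimal sketch of passing a descriptor over an AF_UNIX socket with SCM_RIGHTS. The names send_fd and recv_fd are taken from commit 1465fade3b, but the signatures and bodies below are assumptions, not the code from the tree.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one descriptor over a connected AF_UNIX socket. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
	char dummy = '*';	/* one byte of ordinary data carries the cmsg */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor. Returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The received descriptor refers to the same open file as the sent one, which is why the receiving side does not need to know any path.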
Andrey Vagin
1465fade3b util: move recv_fd and send_fd in a separate file
It will be used in the parasite

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 19:26:41 +04:00
Andrey Vagin
fd7c44dd86 util: add socket syscalls
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 19:26:33 +04:00
Pavel Emelyanov
6da51eee3f ctrools: Rewrite task/threads stopping engine
Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE is enough. Moreover, just killing a task with STOP is also racy,
since task should be given some time to come to sleep before its proc
can be parsed.

Rewrite all this code to SEIZE task and all its threads from the very beginning.

With this we can distinguish stopped task state and migrate it properly (not
supported now, need to implement).

This thing however has one BIG problem -- after we SEIZE-d a task we should seize
its threads, but we should do it in a loop -- reading /proc/pid/task and seizing
them again and again, until the contents of this dir stops changing (not done now).

Besides, after we seized a task and all its threads we cannot scan its children
list once -- task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also)

This patch is ugly, yes, but splitting it doesn't help to review it much, sorry :(

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 17:29:13 +04:00
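The thread-seizing loop that 6da51eee3f describes as "not done now" could look roughly like the sketch below: rescan /proc/pid/task until a full pass finds no thread that has not been seized yet. All names here are hypothetical; seize_stub stands in for a real ptrace-based seize and only reports success.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical callback type: seize one thread, return 0 on success. */
typedef int (*seize_fn)(pid_t tid);

/* Placeholder; a real version would attach to the thread via ptrace. */
int seize_stub(pid_t tid)
{
	(void)tid;
	return 0;
}

static int already_seen(const pid_t *tids, int n, pid_t tid)
{
	for (int i = 0; i < n; i++)
		if (tids[i] == tid)
			return 1;
	return 0;
}

/*
 * Rescan /proc/<pid>/task until one full pass finds no new thread.
 * Returns the number of threads seized, or -1 on error.
 */
int seize_all_threads(pid_t pid, seize_fn seize, pid_t *tids, int max)
{
	char path[64];
	int n = 0, found_new;

	snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);

	do {
		DIR *dir = opendir(path);
		struct dirent *de;

		if (!dir)
			return -1;

		found_new = 0;
		while ((de = readdir(dir)) != NULL) {
			pid_t tid = (pid_t)atoi(de->d_name);

			/* atoi() yields 0 for "." and "..", so they are skipped */
			if (tid <= 0 || already_seen(tids, n, tid))
				continue;
			if (n == max || seize(tid) < 0) {
				closedir(dir);
				return -1;
			}
			tids[n++] = tid;
			found_new = 1;
		}
		closedir(dir);
	} while (found_new);

	return n;
}
```

The loop terminates only after a pass in which every listed thread was already seized, which is the "until the contents of this dir stops changing" condition from the message. The children-list problem mentioned in the same commit is not covered by this sketch.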
Cyrill Gorcunov
4f83d028ff ipc: Drop u32[2] from image, simply use u64 all the time
This eliminates

 | ipc_ns.c:287:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]

and makes the code simpler.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-01 17:23:44 +04:00
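For context on the -Werror=strict-aliasing error quoted in 4f83d028ff: the commit avoids it by storing a u64 in the image. A different way to avoid the same warning, shown here only as an illustration and not as what the commit does, is to copy the bytes with memcpy instead of casting the pointer. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Reading the array through a cast such as *(uint64_t *)halves is the
 * kind of type-punned access the compiler rejects. Copying the bytes
 * gives the same value without violating the aliasing rules.
 */
uint64_t join_halves(const uint32_t halves[2])
{
	uint64_t v;

	memcpy(&v, halves, sizeof(v));
	return v;
}

/* Inverse operation: split a 64-bit value into two 32-bit words. */
void split_u64(uint64_t v, uint32_t halves[2])
{
	memcpy(halves, &v, sizeof(v));
}
```

Which half lands in halves[0] depends on the host byte order, so the pair is only meaningful as a round trip on the same machine.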
Cyrill Gorcunov
44a05e70ec ipc_ns.h: Add CR_ prefix to header defines
Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 14:39:01 +04:00
Cyrill Gorcunov
1824611b8e list: Cleanup spaces/tabs mess
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-01 14:38:49 +04:00
Cyrill Gorcunov
bbf71707b9 ipc: Add missing ipc_ns.h
It was not added to the git index
in one of the previous IPC patch series.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:42:33 +04:00
Stanislav Kinsbursky
c826057a9c IPC: dump namespace itself
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
Stanislav Kinsbursky
0213d3ec64 namespaces: parametrized namespace option introduced
v2: strlen() check removed from parse_ns_string()

Now '-n' option must be followed by namespaces tags, separated by commas.
Currently, only "uts" namespace is supported.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 22:32:22 +04:00
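A minimal sketch of parsing the comma-separated tag list that 0213d3ec64 introduces for '-n'. The function name, the flag value and the error convention are assumptions; only the "uts" tag and the comma separator come from the commit message.

```c
#include <string.h>

#define NS_FLAG_UTS	(1u << 0)	/* hypothetical flag value */

/*
 * Parse a list such as "uts" into a bit mask.
 * Returns 0 on success, -1 on an unknown or empty tag.
 */
int parse_ns_list(const char *arg, unsigned int *flags)
{
	*flags = 0;

	while (*arg) {
		size_t len = strcspn(arg, ",");	/* length of the current tag */

		if (len == 3 && !strncmp(arg, "uts", 3))
			*flags |= NS_FLAG_UTS;
		else
			return -1;

		arg += len;
		if (*arg == ',')
			arg++;
	}

	return 0;
}
```

Supporting another namespace in this sketch would mean adding one more flag and one more comparison in the loop.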
Kir Kolyshkin
fa5ce3112f Annotate printk with printf attribute
This way gcc will be able to catch invalid format bugs.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:56:33 +04:00
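To illustrate the annotation from fa5ce3112f: the format attribute tells gcc which argument is the format string and where the variadic arguments start. The log_msg function below is a hypothetical stand-in, not the printk from the tree.

```c
#include <stdarg.h>
#include <stdio.h>

/* Format string is argument 1; the values it consumes start at argument 2. */
int log_msg(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

int log_msg(const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = vfprintf(stderr, fmt, args);
	va_end(args);

	return ret;
}
```

With the attribute in place, a call such as log_msg("%s\n", 42) produces a -Wformat warning at compile time instead of misbehaving at run time.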
Kir Kolyshkin
6ce8d8ab93 Make BUG_ON() clang-compatible
When trying to compile the beast with clang, it complains:

====
./include/lock.h:33:2: error: indirection of non-volatile null pointer will be deleted, not trap
        BUG_ON(ret < 0);
        ^~~~~~~~~~~~~~~
In file included from restorer.c:18:
./include/util.h:118:27: note: instantiated from:
#define BUG_ON(condition)       BUG_ON_HANDLER((condition))
                                ^
./include/util.h:100:4: note: instantiated from:
                        *(unsigned long *)NULL = 0xdead0000 + __LINE__; \
                        ^
====

Make clang happy again by adding 'volatile'.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:56:23 +04:00
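A simplified sketch of the fix from 6ce8d8ab93, reconstructed from the macro lines quoted in the clang output. The real handler may do more (for example, log the location); only the volatile store is shown here.

```c
#include <stddef.h>

/*
 * Without 'volatile' clang treats the store through NULL as undefined
 * behavior and deletes it; the qualifier forces the faulting store to
 * be emitted.
 */
#define BUG_ON_HANDLER(condition)					\
	do {								\
		if ((condition)) {					\
			*(volatile unsigned long *)NULL =		\
				0xdead0000 + __LINE__;			\
		}							\
	} while (0)

#define BUG_ON(condition)	BUG_ON_HANDLER((condition))
```

When the condition is false the macro does nothing, so it can be placed after any call whose failure should never happen.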
Kir Kolyshkin
0b237ae9f2 pr_perror(): print error at the end of line
This is a standard convention to print error message (i.e. strerror(errno))
at the end of line, like this:

        Cannot remove file: Permission denied

So pr_perror is fixed to follow this convention (using GNU extension
%m helps a lot here). Unfortunately, due to this we have to make
pr_perror() print a newline character, too, so we had to strip it
from all pr_perror() invocations.

That (appending a newline) also makes pr_perror() a black sheep
in the herd of pr_* helpers, but what can we do? Worst case scenario
is an extra newline after an error message, not too harmful.

An alternative approach (stripping the newline from the passed format
string and re-adding it) was discussed thoroughly, and it was decided
that such a hack looks a bit too dirty.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 15:49:15 +04:00
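The convention from 0b237ae9f2 can be sketched as a one-line macro. This is a simplified assumption of how such a helper might look, not the definition from the tree; it relies on two GNU extensions, %m and ##__VA_ARGS__.

```c
#include <errno.h>
#include <stdio.h>

/*
 * %m expands to strerror(errno). The macro appends ": <error>" and the
 * newline itself, so callers pass a format without a trailing "\n".
 */
#define pr_perror(fmt, ...) \
	fprintf(stderr, fmt ": %m\n", ##__VA_ARGS__)
```

With errno set to EACCES, pr_perror("Cannot remove file") prints the message followed by a colon and the system error text, as in the example line in the commit message.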
Stanislav Kinsbursky
225d119e5d namespaces: split UTS and generic code
Generic code will be used for other namespaces.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-31 13:43:28 +04:00
Kir Kolyshkin
789a2c7f7a Trivial whitespace cleanup
Cleaning a few space-at-EOL occurrences, plus one spaces-instead-of-tab.

Found using:

	git grep -n '[[:space:]]$'
	git grep -n '        '

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 21:29:42 +04:00
Andrey Vagin
e6a52bdad1 parasite: remove unused commands
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 16:02:38 +04:00
Pavel Emelyanov
beb158a66e cr: Task creds support
Dumping is simple. All but secbits can be read from proc; secbits
are obtained from the parasite.

Restoring is a bit tricky -- when you change anything on kernel
cred's struct it performs sophisticated checks and can change
some more stuff than requested, so the creds restoration procedure
is carefully commented step-by-step.

Another thing to mention is that creds are restored after everything
else, i.e. right before performing final threads sync and sigreturns.
This is done to avoid potential problems with insufficient caps for
restoring other stuff (e.g. CAP_DAC_OVERRIDE or zero euid is most
likely required for opening any image file and the notorious control
/proc/sys/kernel/ns_last_pid, which in turn is performed till the
very last moment).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 13:00:50 +04:00
Pavel Emelyanov
d846d108f6 syscalls: Prepare syscalls and bits for (mostly) setting creds
These are setXXXid, capset and various bits for prctl and caps machinery.
The thing is that the caps API is not yet fully in glibc so we have to
declare some bits even for core code, not just for restorer/parasite.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 13:00:33 +04:00
Pavel Emelyanov
f382d2a376 proc_parse: Routine for reading creds from /proc/pid/status
All the IDs and caps are in there. Just read them for future use.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 13:00:18 +04:00
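As a sketch of the kind of parsing f382d2a376 refers to, the function below extracts the four user IDs from the "Uid:" line of /proc/pid/status text. The struct and function names are hypothetical, and the input is taken as a string so the example does not depend on a live process.

```c
#include <stdio.h>
#include <string.h>

/* The "Uid:" line lists real, effective, saved-set and filesystem UIDs. */
struct status_uids {
	unsigned int real, effective, saved, fs;
};

/* Returns 0 and fills *u if a well-formed "Uid:" line is found, else -1. */
int parse_status_uids(const char *status, struct status_uids *u)
{
	const char *line = status;

	while (line && *line) {
		if (!strncmp(line, "Uid:", 4)) {
			int n = sscanf(line + 4, "%u %u %u %u", &u->real,
				       &u->effective, &u->saved, &u->fs);
			return n == 4 ? 0 : -1;
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}
```

The "Gid:" line has the same four-column layout and could be handled the same way; the capability lines use hexadecimal masks and would need a different format string.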
Pavel Emelyanov
47c161f2dc parasite: Dump misc command
Add command and basis for dumping minor bits for task
from parasite code. It's supposed to retrieve minor bits
from tasks that cannot be read from /proc.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-30 12:59:54 +04:00
Cyrill Gorcunov
29bda9aae5 sockets: Restore in-flight unix stream sockets
It's done in two steps

 - On checkpoint we find which icons are present
   over all sockets and setup peer number to
   appropriate listening socket

 - On restore we collect listening sockets and once
   we find in-flight connection we search for appropriate
   listening socket name and use it to call connect() then

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-01-27 23:21:06 +04:00
Pavel Emelyanov
16c58dbd11 magic: Fix PIPEFS_MAGIC constant
This one is actually an internal kernel magic number for pipefs filesystem
and shouldn't be changed.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-01-26 20:42:45 +04:00
Pavel Emelyanov
60dee71484 magic: Change magic numbers
Existing ones are boring. Let's switch them into geographical coordinates
of various Russian towns in NNNNEEEE form.

4 digits for a coordinate give us up to 2km of inaccuracy, which is more
than enough to find a town. We cannot use longitude further than 99.99,
i.e. we won't cover the Far East region, but that's OK -- there are more than
enough good candidates in the European part of the country alone.

Feel free to extend.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 19:49:27 +04:00
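A small sketch of how a NNNNEEEE constant from 60dee71484 might be built. It assumes, and the commit message does not state this, that the decimal digits of the coordinates are written directly as the hex digits of a 32-bit value. The function names are hypothetical, and the coordinates are passed in hundredths of a degree to keep the arithmetic in integers.

```c
#include <stdint.h>

/* Map four decimal digits to four hex nibbles, e.g. 5575 -> 0x5575. */
static uint32_t dec4_to_hex(unsigned int v)
{
	uint32_t out = 0;

	for (int shift = 0; shift < 16; shift += 4) {
		out |= (uint32_t)(v % 10) << shift;
		v /= 10;
	}

	return out;
}

/* Latitude goes into the upper 16 bits, longitude into the lower 16. */
uint32_t geo_magic(unsigned int lat_centideg, unsigned int lon_centideg)
{
	return (dec4_to_hex(lat_centideg) << 16) | dec4_to_hex(lon_centideg);
}
```

For a point at roughly 55.75 N, 37.62 E, geo_magic(5575, 3762) gives 0x55753762. Two decimal places of a degree of latitude correspond to a bit over 1 km, which is in line with the accuracy estimate in the message.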
Pavel Emelyanov
98f4c2e4de ns: Support UTS namespace
Only two fields are modifiable -- hostname and domainname. So
read them on dump and write on restore.

File format is simple --

u32 magic
u32 length of nodename
u8[] nodename string
u32 length of domainname
u8[] domainname string

For OpenVZ we can write the release at the end, but this is later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
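The file layout listed in 98f4c2e4de can be sketched as an encoder into a memory buffer. The magic value, the function names, the host byte order and the absence of a terminating NUL in the stored strings are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define UTSNS_MAGIC_EXAMPLE	0x12345678u	/* hypothetical magic */

static size_t put_u32(uint8_t *p, uint32_t v)
{
	memcpy(p, &v, sizeof(v));	/* host byte order (assumption) */
	return sizeof(v);
}

/* Write a u32 length followed by the string bytes, without the NUL. */
static size_t put_str(uint8_t *p, const char *s)
{
	uint32_t len = (uint32_t)strlen(s);
	size_t off = put_u32(p, len);

	memcpy(p + off, s, len);
	return off + len;
}

/* Returns the number of bytes written, or 0 if the buffer is too small. */
size_t utsns_encode(uint8_t *buf, size_t size,
		    const char *nodename, const char *domainname)
{
	size_t need = 4 + 4 + strlen(nodename) + 4 + strlen(domainname);
	size_t off = 0;

	if (need > size)
		return 0;

	off += put_u32(buf + off, UTSNS_MAGIC_EXAMPLE);
	off += put_str(buf + off, nodename);
	off += put_str(buf + off, domainname);

	return off;
}
```

A reader would walk the same sequence in reverse: check the magic, read a length, read that many bytes, and repeat for the second string.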
Pavel Emelyanov
3391416a1b crtools: Namespaces support skeleton
New option -n to dump/restore namespaces.

Fork the namespaces dumping task and write a helper for switching a namespace.

Prepare the restorer code for restoring namespaces before root task.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
Pavel Emelyanov
8e90a9e6d8 crtools: Tossing CR_FD_ bits around
Split the CR_FD_ bits into per-task and global ones and replace
CR_FD_DESC_NOPSTREE with CR_FD_DESC_TASK, which is an explicit
set of per-task bits.

The CR_FD_DESC_NS will appear soon.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-26 16:54:22 +04:00
Pavel Emelyanov
b7de83aaf3 crtools: Interval timers support
Timers are dumped from inside parasite code, the format is plain -- just
3 pairs of interval/value one-by-one.

The restoration occurs in two stages -- first prepare the timer values in
restorer (and check for sanity), then set up the timers at the very last stage,
right before actually calling sigreturn.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-24 18:41:49 +04:00
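The "3 pairs of interval/value" from b7de83aaf3 match the three classic interval timers, each described by a struct itimerval. A minimal sketch of collecting them for the calling process follows; the function name is hypothetical, and in the commit this work is done from the parasite code inside the target task rather than by a plain libc call.

```c
#include <sys/time.h>

/*
 * Fill vals[0..2] with the real, virtual and profiling timers, in that
 * order. Each entry holds an it_interval/it_value pair.
 * Returns 0 on success, -1 on error.
 */
int dump_itimers(struct itimerval vals[3])
{
	static const int which[3] = {
		ITIMER_REAL, ITIMER_VIRTUAL, ITIMER_PROF
	};

	for (int i = 0; i < 3; i++)
		if (getitimer(which[i], &vals[i]) < 0)
			return -1;

	return 0;
}
```

On the restore side the same three structures would be passed to setitimer() in the same order.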
Pavel Emelyanov
6ab01f7a16 syscalls: Add set/get itimers syscalls
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-24 18:41:40 +04:00
Pavel Emelyanov
db7dd32203 parasite: Braces around set-status macro arguments
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-24 18:41:23 +04:00
Pavel Emelyanov
043bf95fb8 parasite: Arguments sanitation
Remove typedefs and make the common part of sigacts and pages dumping arguments
for dumping something into a specific file. Will be used almost as-is soon.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-01-24 18:40:32 +04:00
Cyrill Gorcunov
0008148686 restore: Restore cmdline arguments, environ and auxv
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-01-24 18:01:08 +04:00