mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 12:57:57 +00:00

542 Commits

Author SHA1 Message Date
Andrey Vagin
6bbdec26f3 files: add ability to set callbacks for files (v7)
Nothing interesting here. If a file can't be dumped by criu,
plugins are called. If one of the plugins knows how to dump the file,
the file entry is marked as need_callback. On restore, if we see
this mark, we execute plugins to restore the file.

v2: Callbacks are called for all files, which are not supported by CRIU.
v3: Call plugins for a file instead of file descriptor. A few file
descriptors can be associated with one file.
v4: A file descriptor is opened in a callback. It's required for
    restoring anon vmas.
v5: Add a separate type for unsupported files
v6: define FD_TYPES__UNSUPP
v7: s/unsupp/ext (external)

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 16:07:38 +04:00
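The dump-side flow described in commit 6bbdec26f3 can be sketched roughly as follows. This is only an illustration under assumptions: the callback type, the entry structure and the function names are hypothetical and are not taken from criu-plugin.h; only the need_callback mark comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 if the plugin dumped the file. */
typedef int (*dump_ext_file_cb)(int fd);

/* Hypothetical, simplified file entry; only the mark matters here. */
struct file_entry_sketch {
	int  id;
	bool need_callback;
};

/* Two toy plugins used for illustration only. */
int plugin_declines(int fd) { (void)fd; return -1; }
int plugin_accepts(int fd)  { (void)fd; return 0; }

/*
 * Offer an unsupported file to each plugin in turn. The first plugin that
 * handles it causes the entry to be marked. Returns 0 on success and -1
 * if no plugin knows how to dump the file.
 */
int dump_unsupported_file(const dump_ext_file_cb *plugins, size_t nr,
			  int fd, struct file_entry_sketch *entry)
{
	size_t i;

	assert(entry != NULL);
	for (i = 0; i < nr; i++) {
		if (plugins[i](fd) == 0) {
			entry->need_callback = true;
			return 0;
		}
	}
	return -1;
}
```

On restore, a reader of the image would check the mark and hand the entry to the plugins again instead of using the built-in file restore path.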
Andrey Vagin
d7cf271ed4 crtools: preload libraries (v2)
Libraries (plugins) are going to be used for dumping and restoring
external dependencies (e.g. dbus, systemd journal sockets, character
devices, etc.)

A plugin can have the cr_plugin_init() and cr_plugin_fini() functions for
initialization and deinitialization.

criu-plugin.h contains everything that can be used in plugins.

v2: rename lib to plugin
v3: add a default value for a plugin path.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 21:48:33 +04:00
Tikhomirov Pavel
5ec15bf25c page-read: add proper processing of return value
When we get an error reading an image file, we need to stop the restore.

Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-17 19:45:22 +04:00
Pavel Emelyanov
ae98ef6ae0 mount: Factor out mount tree build for NEWNS and non-NS cases
We build the tree anyway; in the NS case, a few calls later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-12 16:19:48 +04:00
Cyrill Gorcunov
cf1ce5f817 mount: Build mount tree on dump restore early, if needed
For path resolution we will need the mount tree to be parsed
and built, but it's not that simple -- the current code
implies that once parsed the tree must not be re-parsed,
so we pass the @parse argument from the caller: if the task
we're restoring does not use a mount namespace, we should parse
the mount tree early, otherwise defer this action until the mount
tree is read from the image.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-11 16:05:19 +04:00
Andrey Vagin
57d25e7cea mm: fix expression to determine which vma-s can be shared
Currently only addresses are compared. It's obviously not enough.

* First of all the parent vma must be private.
* Both vma-s must have an identical set of MAP_GROWSDOWN and MAP_FILES
  flags.
* Both vma-s must be linked to the same file.

https://bugzilla.openvz.org/show_bug.cgi?id=2824
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-22 18:19:23 +04:00
Andrey Vagin
7659c995f5 vm: don't overwrite vma->shmid for private mappings
shmid contains a file id for file mappings. It's required to determine
which VMA-s are COWed. The parent maps a VMA and saves the premapped
address. Then the child tries to determine which VMA-s must be inherited
from the parent; for that it compares addresses, flags and file id.

We don't want to transfer vma_area-s into the restorer, so when a VMA entry is
copied into restorer memory, the premapped address is saved in shmid.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-22 18:19:08 +04:00
Pavel Emelyanov
c3b9448cf7 pidfile: Don't push opts.pidfile as write_pidfile arg
opts are available criu-wide.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-20 14:26:41 +04:00
Ruslan Kuprieiev
dc80d6f125 log: get rid of LOG_DIR_FD_OFF and opening cwd in log_init()
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-15 21:38:41 +04:00
Pavel Emelyanov
a2917ffc87 rst: Initialize task restore args after rst-mem remap
The plan is to move the args onto rst-mem itself. Thus we need
to make sure it's initialized _after_ remap into restorer space.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-08 17:57:04 +04:00
Pavel Emelyanov
3e895cc2da rst: Rename task_restore_core_args
Remove the _core_ from it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-08 17:32:07 +04:00
Pavel Emelyanov
ec7e483e8b restorer: Make task- and thread- args go one-by-one
Currently the task args are page-aligned and the thread args
follow them. This is a waste of memory. Let's put them one
after another.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-08 17:29:55 +04:00
Andrey Vagin
4850fd94a8 crtools: move cr_options in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 18:17:52 +04:00
Andrey Vagin
0d1dfc2e08 crtools: move all stuff about vma together
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:43:49 +04:00
Andrey Vagin
824403a009 crtools: create new header for servicefd stuff (v2)
v2: generate patch relative to the official git.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:43:02 +04:00
Pavel Emelyanov
ba0527d42b restore: Remove actually unused variable from sigreturn_restore
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-04 00:16:34 +04:00
Pavel Emelyanov
32b4a26c6b restore: Comment why we need copy data on task restore args
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-04 00:16:14 +04:00
Pavel Emelyanov
ebd76f4bec restore: Move sigpending out of sigreturn_restore
The sigreturn_restore is the place where we prepare the restorer
layout and jump to it. Reading and decoding images should be done
earlier. The new rst-malloc engine allows for that.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-04 00:15:39 +04:00
Pavel Emelyanov
91f797f66f restore: Move posix-timers out of sigreturn_restore
The sigreturn_restore is the place where we prepare the restorer
layout and jump to it. Reading and decoding images should be done
earlier. The new rst-malloc engine allows for that.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-04 00:15:15 +04:00
Pavel Emelyanov
ed88f2df66 restore: Move rlimits out of sigreturn_restore
The sigreturn_restore is the place where we prepare the restorer
layout and jump to it. Reading and decoding images should be done
earlier. The new rst-malloc engine allows for that.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-04 00:14:42 +04:00
Pavel Emelyanov
4f675313cc rst-malloc: Switch to private allocations once forked
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-03 17:40:15 +04:00
Pavel Emelyanov
ca0b51bc00 rst: Close logdir earlier
Just a code cleanup.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-03 17:37:10 +04:00
Pavel Emelyanov
3e235f715e restore: Get self maps after allocating necessary memory
We're filling some rst-mem data _after_ we get the self maps
list. This is a bug, since the restorer vma gets forcibly mapped
into a place we choose based on the self-vmas-list.

Move getting the self-vmas-list to after we allocate the memory
we need.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-03 17:23:31 +04:00
Ruslan Kuprieiev
95a961b739 log: don't kill task, if unable to write pidfile
write_pidfile() was taken out of cr-restore.c, where it was supposed to kill
the child if unable to create the pidfile. Now we're also using it in the
service/page-server, where the kill is redundant. So let's move kill() from
write_pidfile() back to cr-restore.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-03 12:51:13 +04:00
Pavel Emelyanov
45f39e0415 rst: Make shmem restore to use rst-malloc
This actually fixes a bug -- memory for shmem info was
not allocated dynamically, thus we were limited in the
amount of shmems to be restored.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-02 01:06:31 +04:00
Pavel Emelyanov
7ed35d8a87 rst: Switch private rst-mem allocator on generic rst-malloc
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-02 01:05:13 +04:00
Cyrill Gorcunov
0707df7745 restore: Don't unmap vdso proxy on final cleanup
If we need to use the vdso proxy, the memory area
which holds the restorer also has a place for the vdso proxy
code itself, so on the final pass we should not unmap it,
otherwise any call to a vdso function will cause SIGSEGV.

IOW, the memory before the final "cleanup" pass of the restorer
might look like

    +-----------+---------+     +-------------+------+
    | bootstrap | rt-vdso | ... | application | vdso |
    +-----------+---------+     +-------------+------+
                       ^                         |
                       `-------------------------+

and we have redirected the "vdso" code to jump to "rt-vdso".
After the final pass the memory must look like

                +---------+     +-------------+------+
                | rt-vdso | ... | application | vdso |
                +---------+     +-------------+------+
                       ^                         |
                       `-------------------------+

I noticed this problem during container migration
testing: the container itself was suspended on a 2.6.32
OpenVZ kernel with apache running inside, and any attempt
to connect to apache caused apache to crash.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-30 16:30:57 +04:00
Pavel Emelyanov
389b814632 restore: Make optional images check right after open
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-29 23:00:24 +04:00
Pavel Emelyanov
bb5476cb63 restore: Put tgt vmas in rst-mem, not special purpose memory
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-29 23:00:06 +04:00
Pavel Emelyanov
d7db85e9dc restore: Iterate tgt vmas by number, not by terminating point
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-29 22:59:55 +04:00
Pavel Emelyanov
55a04580d5 restorer: Compact rst stack evaluation code
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-29 22:59:48 +04:00
Pavel Emelyanov
00dc26602a restorer: Remove unused heap
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-29 22:59:41 +04:00
Pavel Emelyanov
14a7aff288 rst: Read sys.last_cap only once in kerndat
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-12 00:03:25 +04:00
Pavel Emelyanov
f0a8643736 kerndat: Initialize necessary kerndats on restore
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-11 17:38:57 +04:00
Pavel Emelyanov
6bf63b3f01 security: Push full creds info into may_xxx checks
It's not enough to check only uids on dump and restore -- we need to
check e-ids and s-ids now (and caps in the future).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 15:48:44 +04:00
Ruslan Kuprieiev
547d9bf959 v2 security: set suid flag on crtools and check real uid on dump/restore
v2: remove redundant functions and variables.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-02 17:11:17 +04:00
Andrey Vagin
07930a8df4 ns: replace pid on id in per-namespace files
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:17:04 +04:00
Andrey Vagin
79d47a939d crtools: add support of stopped tasks (v2)
Currently we catch processes on the exit point from sigreturn.
If a task must be restored in the stopped state, we can send SIGSTOP
before detaching from it.

v2: add more descriptive comment about skipping SIGSTOP in ptrace.c
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 11:21:34 +04:00
Pavel Emelyanov
f1edcb32f5 rst: Introduce fine-grained pgid-restore synchronization
We can restore a task's pgid which is not equal to its pid
only when the respective group leader is alive. To make
restore reliable we wait for all group leaders to restore
using a separate restore stage.

It would be better to optimize this: each task has a pointer to
its group leader and could wait just for that one to become the leader.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-27 04:38:00 +04:00
Pavel Emelyanov
c378f790b8 fs: Restore root
First of all, this should be done strictly after we've stopped accessing
files by their paths, even absolute. This place is right before going
into restorer.

And the second thing is that we want to re-use the open_fd_by_id engine,
since it handles various tricky cases of open-file-by-path. And since
there's no such thing as fchroot(int fd), we emulate it using the
/proc/self/fd/ links.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-25 13:59:20 +04:00
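The fchroot() emulation mentioned in commit c378f790b8 relies on the fact that /proc/self/fd/N is a link to whatever descriptor N refers to. A minimal sketch, assuming Linux with /proc mounted; the function names are hypothetical and this is not the code from the commit:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the chroot() declaration in glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Build the /proc/self/fd/<fd> path for a descriptor.
 * Returns the path length, or -1 if the buffer is too small.
 */
int proc_fd_path(int fd, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/proc/self/fd/%d", fd);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/*
 * Emulate the non-existent fchroot(fd): chroot() to the link that
 * points at the already opened directory. Needs CAP_SYS_CHROOT.
 */
int fchroot_sketch(int fd)
{
	char path[64];

	if (proc_fd_path(fd, path, sizeof(path)) < 0)
		return -1;
	return chroot(path);
}
```

The point of going through a descriptor is that the directory can be opened by whatever engine handles the tricky open-by-path cases, and only the final step touches chroot().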
Pavel Emelyanov
75b1d4a1e3 rst: Open sys.ns_last_pid before diving into restorer
We restore the chroot before doing this, so by the time we need to
open it, we may have no access to the /proc/... paths.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-25 13:59:17 +04:00
Andrey Vagin
e1e1034786 restorer: rework unmapping old VMA-s (v3)
All process VMA-s are in the "premapped area". All restorer stuff is in
the "bootstrap area", so we have two areas.

So we don't need to unmap extra VMA-s one by one. We can call munmap
three times: for the region before the first area, for the hole between
the areas and for the region after the second area.

The old scheme didn't work, because the list of VMA-s can change
after it is collected, either due to memory allocations by libc or due to
stack growth.

v2: improve readability at the expense of beauty
v3: print return code of munmap in error messages
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:11 +04:00
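The three-munmap scheme from commit e1e1034786 boils down to computing three address ranges around the two areas that must survive. The sketch below only computes the ranges and does not unmap anything; the structure, the function name and the task_size parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a contiguous address range. */
struct range_sketch {
	uintptr_t start;
	size_t    len;
};

/*
 * Given the two areas to keep (in any order) and the top of the address
 * space, fill out[] with the regions to unmap: below the first area,
 * the hole between the areas, and above the second area.
 */
void ranges_to_unmap(struct range_sketch a, struct range_sketch b,
		     uintptr_t task_size, struct range_sketch out[3])
{
	if (b.start < a.start) {
		struct range_sketch tmp = a;
		a = b;
		b = tmp;
	}

	/* The areas must not overlap and must fit below task_size. */
	assert(a.start + a.len <= b.start);
	assert(b.start + b.len <= task_size);

	out[0].start = 0;
	out[0].len   = a.start;
	out[1].start = a.start + a.len;
	out[1].len   = b.start - out[1].start;
	out[2].start = b.start + b.len;
	out[2].len   = task_size - out[2].start;
}
```

A caller would pass each non-empty range to munmap(); zero-length ranges have to be skipped, since munmap() rejects a length of 0.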
Andrey Vagin
89d8b20186 restorer: unmap itself (v2)
This patch adds a function for removing the restorer blob. This function
never returns and the process must be trapped on the exit from the
munmap syscall.

v2: * release parasite_ctl structure and use the new interface of
      parasite_prep_ctl

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:09 +04:00
Andrey Vagin
66f21e6b71 restore: catch task on the exit from sigreturn (v4)
A task is stopped here to unmap the restorer blob and restore its state.

The method is the same as for the parasite. CRIU attaches to processes via
ptrace and starts tracing all syscalls.

v2: don't use a software breakpoint
v3: stop all threads on the exit from sigreturn
v4: attach to each thread
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:08 +04:00
Andrey Vagin
f43ac0643e restore: save a task state on pstree_item
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:07 +04:00
Andrey Vagin
5a37481914 restore: get real pid for each task (v2)
For the root task the clone syscall returns the pid in criu's pidns,
but for other processes the clone syscall returns the PID in the restored
namespace.

The /proc/self link contains the PID value of the current process, so if
we want to determine the PID in criu's pidns, we should use criu's
/proc.

v2: readlink() does not append a null byte to buf, so we must do that
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:22:58 +04:00
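The v2 note in commit 5a37481914 is worth spelling out: readlink() returns the number of bytes placed in the buffer and does not terminate the string. A minimal sketch of reading a PID from such a link; the function name is hypothetical, and the path is a parameter so the same helper works for any /proc mount:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the readlink() declaration in strict modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read a symlink whose target is a decimal PID (as /proc/self is)
 * and return the PID, or -1 if the link cannot be read.
 */
long read_pid_link(const char *path)
{
	char buf[32];
	ssize_t n;

	assert(path != NULL);

	/* Leave one byte free for the terminator. */
	n = readlink(path, buf, sizeof(buf) - 1);
	if (n < 0)
		return -1;

	/* readlink() does not append a null byte, so we must do that. */
	buf[n] = '\0';

	return strtol(buf, NULL, 10);
}
```

Calling read_pid_link() on the self link of a given /proc mount yields the caller's PID as seen in the pid namespace that /proc instance belongs to.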
Ruslan Kuprieiev
4eb2872b27 v2 crtools: write pidfile, when service/page server is run as daemon and "--pidfile" is set
When the service/page server becomes a daemon, we may need to know its pid.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-16 15:45:01 +04:00
Andrey Vagin
7d8ed36c33 cr-restore.c: fixed compilation errors on ARM
Use decode_pointer() to convert a virtual address into a native pointer.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 15:10:58 +04:00
Andrey Vagin
fd58e62b1c mm: map grow-down VMA-s with guard pages
In /proc/pid/maps grow-down VMA-s are shown without guard pages, but
sometimes these "guard" pages can contain useful data, for example if
a real guard page has been remapped by another VMA. Let's call such
pages fake guard pages.

So when a grow-down VMA is mmaped on restore, it should be mapped with
one more guard page to restore the content of the fake guard page.

https://bugzilla.openvz.org/show_bug.cgi?id=2715
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 14:25:14 +04:00
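The address arithmetic behind commit fd58e62b1c is small: a grow-down VMA is mapped one page lower than /proc/pid/maps reports, so the fake guard page has somewhere to be restored. The sketch below only computes the span to map; the structure and function names are hypothetical and the page size is passed in explicitly as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the span passed to mmap() on restore. */
struct map_span_sketch {
	uintptr_t addr;
	size_t    len;
};

/*
 * Compute the span to map for a VMA reported as [start, end).
 * A grow-down VMA gets one extra page below its reported start.
 */
struct map_span_sketch restore_span(uintptr_t start, uintptr_t end,
				    bool grows_down, size_t page_size)
{
	struct map_span_sketch s;

	assert(start < end);

	s.addr = start;
	s.len  = end - start;

	if (grows_down) {
		assert(start >= page_size);
		s.addr -= page_size;
		s.len  += page_size;
	}
	return s;
}
```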
Andrey Vagin
39e6d7f553 restore: decode exit status in sigchld_handler
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-28 17:16:55 +04:00