mirror of https://github.com/checkpoint-restore/criu synced 2025-08-30 05:48:05 +00:00

514 Commits

Author SHA1 Message Date
Pavel Emelyanov
8a827ba403 files: Make fd_id_generate_special return ID into pointer
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:49 +04:00
Pavel Emelyanov
8b611770aa files: Pass stat information into fd_id_generate_special
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:18 +04:00
Pavel Emelyanov
d29238db83 dump: Missing \n in auxv dumping message
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 18:58:00 +04:00
Pavel Emelyanov
54f4f889a5 mm: Move VmaEntries from separate image into Mm one
When writing VMAs we perform too many small writes into vma-.img files.
This can be easily fixed by moving the vma-s into mm-s, all the more
so since they cannot be split from each other.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 11:44:05 +04:00
Pavel Emelyanov
eb1ae0a025 vma: Turn embedded VmaEntry on vma_area into pointer
On restore we will read all VmaEntries in one big MmEntry object,
so to avoid copying them all into vma_areas, make them pointers.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 11:44:01 +04:00
Pavel Emelyanov
6da13c687f dump: Merge mm and vmas dumping functions
The plan is to merge vma images into mm ones (see further
patching), so prepare the dumping code for that.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 11:43:59 +04:00
Pavel Emelyanov
608db864a3 vmas: Don't call stat on vm file twice
When parsing mappings in proc, we fstat the vm file; later,
when dumping it, we stat it again to fill fd_parms.
The 2nd stat is not required, as we can keep the stat in
vma_area.

This removes 35% of all stat calls on dump of a basic container.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 00:18:32 +04:00
Pavel Emelyanov
740eb9c101 proc-parse: Don't open and stat every single map_files link
Quite a lot of VMAs in tasks map the same file with different
perms. In that case we may skip opening all these files, but
"borrow" one from the previous VMA parsed.

There's little sense in looking back further than the previous VMA,
as the same file is rarely (though it can be) mapped in different
locations.

After this on a basic Centos6 container the number of opens and
stats in this function drops from ~1500 to ~500.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-31 20:31:06 +04:00
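
The "borrow from the previous VMA" idea in 740eb9c101 can be illustrated with a minimal sketch. This is not CRIU's code: `VmaArea`, `attach_files` and the `open_map_file` callback are hypothetical names, and the (dev, ino) pair is assumed to come from the already parsed maps line.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class VmaArea:
    """Hypothetical stand-in for one parsed mapping."""
    start: int
    end: int
    dev: Tuple[int, int]      # (major, minor) from the maps line
    ino: int                  # 0 means an anonymous mapping (assumption)
    fd: Optional[int] = None
    borrowed: bool = False    # True if fd is shared with the previous VMA


def attach_files(vmas: List[VmaArea],
                 open_map_file: Callable[[VmaArea], int]) -> int:
    """Give each file-backed VMA an fd, opening a file only when the
    previous VMA does not already map the same (dev, ino).
    Returns the number of real opens performed."""
    opens = 0
    prev: Optional[VmaArea] = None
    for vma in vmas:
        if vma.ino != 0:
            same_file = (prev is not None and prev.fd is not None
                         and (prev.dev, prev.ino) == (vma.dev, vma.ino))
            if same_file:
                # Borrow: no open and no stat for this mapping.
                vma.fd = prev.fd
                vma.borrowed = True
            else:
                vma.fd = open_map_file(vma)
                opens += 1
        # Only the immediately preceding VMA is ever consulted.
        prev = vma
    return opens
```

Looking at only one previous entry keeps the check constant-time per mapping; a borrowed fd would need to be closed once, by its original owner.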
Pavel Emelyanov
c4b5a9d5b0 vma: Don't close socket's inode as fd
The vm_socket_id shares a union with vm_file_fd, so calling
close on it is wrong.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-31 03:14:46 +04:00
Pavel Emelyanov
cc918897b0 irmap: Introduce irmap on-disk cache
When dumping fsnotifies we may go to irmap to get inode->path
mapping. The irmap engine scans FS (in hinted locations) to
get one and it is slow even though we scan only part of the FS.

Since the above scanning is done while tasks are frozen the
freeze time goes up :(

Improve the situation by generating irmap cache in working dir
at pre-dump when tasks get unfrozen.

The on-disk irmap cache is a PB file; it sits in the -W directory
and can be loaded into memory on dump/pre-dump start. When
resolving the inode->path mapping irmap may meet these entries,
revalidate them and potentially save time.

After pre-dump the (re-)collected irmap data is written back
to the irmap cache image. Typically the entries written back are
the same ones read in on cache load.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:16 +04:00
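
The cache-then-revalidate flow described in cc918897b0 can be sketched roughly as follows. This is an illustration under assumptions, not the irmap implementation: the cache is a plain in-memory dict rather than a PB image, and `revalidate`, `scan` and `lookup` are hypothetical helper names.

```python
import os
from typing import Dict, Iterable, Optional, Tuple

Key = Tuple[int, int]  # (device, inode)


def revalidate(key: Key, path: str) -> bool:
    """Check that path still refers to the inode the cache claims."""
    try:
        st = os.lstat(path)
    except OSError:
        return False
    return (st.st_dev, st.st_ino) == key


def scan(key: Key, hints: Iterable[str]) -> Optional[str]:
    """Slow path: walk the hinted directories looking for the inode."""
    for root in hints:
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                candidate = os.path.join(dirpath, name)
                if revalidate(key, candidate):
                    return candidate
    return None


def lookup(cache: Dict[Key, str], key: Key,
           hints: Iterable[str]) -> Optional[str]:
    """Resolve inode->path, trusting a cached entry only after one lstat."""
    path = cache.get(key)
    if path is not None and revalidate(key, path):
        return path               # fast path: a single lstat
    path = scan(key, hints)       # stale or missing entry: rescan
    if path is not None:
        cache[key] = path         # (re-)collected data goes back to the cache
    else:
        cache.pop(key, None)
    return path
```

The point of the design is that a valid cached entry costs one lstat instead of a directory walk, which matters because the walk otherwise happens while tasks are frozen.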
Pavel Emelyanov
751856c8b8 files: Pre-dump file descriptors
We will generate some info about file-descriptors at that
stage. For now these pre-dumped ones would be fsnotifies,
so the pre-dump of a single fd is written as simple as
possible, but enough for that type of FDs pre-dump.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
84ebc64b1f pre-dump: Collect mount info, root and nsmask
Well, we want to pre-dump files (fsnotifies), for that we
will need mountinfo-s and root, and for the latter -- the
current ns mask.

The problem with current ns mask is that its generation is
incorporated into ns IDs generation and dumping. And since
the ids dumping is not performed on pre-dump, let's just
provide a helper for ns-mask generation.

Strictly speaking, the whole ns-mask idea is not great, but
it's to be fixed later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
040fe7712f pre-dump: Enforce track-mem and leave-running in cr_pre_dump_tasks
Service will call the pre-dump routine, so this factors out
enforcing the options for both CLI and RPC.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 15:58:39 +04:00
Cyrill Gorcunov
41a1b2da32 dump: collect_mappings -- Fix message text
@vma_area_list::longest is in pages not bytes.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-29 15:42:34 +04:00
Andrey Vagin
3b7cc1e45e pre-dump: handle errors from page_xfer_dump_pages()
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-24 15:47:45 +04:00
Andrey Vagin
e970fccb69 dump: call cpu_init() in pre-dump
It's required for dumping fpu state

[root@avagin-fc19-cr crtools]# bash  test/zdtm.sh  -d -P -i 2  static/fpu00
Execute zdtm/live/static/fpu00
./fpu00 --pidfile=fpu00.pid --outfile=fpu00.out
Dump 8411
Dump 8411
Check results 8411
01:29:30.550:  8411: FAIL: fpu00.c:78: 0.806165 != -nan
 (errno = 11 (Resource temporarily unavailable))

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-20 14:13:17 +04:00
Pavel Emelyanov
2ec003eb85 predump: Spot the VDSO page/vma in victim task
While doing pre-dump we don't do proper VDSO fixups, thus at
this stage we may fail the should_dump_page() checks -- it
will treat VDSO pages as 'regular' and may skip dumping some
of them.

This is not bad as is, but the subsequent dump will properly
spot the VDSO area and will try to dump _all_ pages from it. And
if checks for soft-dirty will report that some pages are clean,
dump will try to locate those in parent images and would fail.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-14 22:54:17 +04:00
Pavel Emelyanov
91011328fa criu: Several formatting fixes
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-14 09:33:19 +04:00
Andrew Vagin
a2eaa5cf44 ptrace: the task state is restored automatically
It's a feature of PTRACE_SEIZE.  So we need to do something, only
if we want to change the state.

[xemul: If task _was_ in stopped state before dump and we want them
 to stay alive after dump, the existing code queues one more STOP
 to it. This affects subsequent dump, as we seize a stopped task
 with STOP in queue.

 One more item in TODO list -- support stopped tasks with STOP in
 queue :)
]

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 20:36:12 +04:00
Andrey Vagin
800c4d09bb seize: add more detailed comment
Explain why we make several attempts to freeze tasks.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 16:41:14 +04:00
Andrey Vagin
bf67879062 crtools: rework freeze of processes (v2)
Before this patch crtools freezes processes and, if something has changed,
it unfreezes all processes and starts again from the beginning.

If we are going to dump a fork-bomb, this method doesn't work, because a
big tree is always changing.

We don't need to unfreeze processes which have already been frozen, and
this patch avoids doing that.

This patch uses depth-first search (DFS) for traversing a process tree.

The root task is frozen first, then a child is frozen, then a
child of that child and so on.

When all children of one process are frozen, criu reads the list of
children again and checks that nothing has changed. This process continues
until all of them are frozen. After that a new child cannot
appear, because all children of children are frozen too.

v2: add comments in code
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 15:16:11 +04:00
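
The traversal described in bf67879062 can be sketched as below. This is a toy model, not the crtools code: `list_children` and `freeze` are hypothetical callbacks standing in for reading a task's children from /proc and seizing it, and recursion is used only for brevity.

```python
from typing import Callable, List, Set


def freeze_tree(root: int,
                list_children: Callable[[int], List[int]],
                freeze: Callable[[int], None]) -> Set[int]:
    """Freeze a process tree depth-first without ever unfreezing.

    After freezing the known children of a task, its child list is
    read again; the loop ends only when a re-read shows no unfrozen
    children."""
    frozen: Set[int] = set()

    def visit(pid: int) -> None:
        freeze(pid)
        frozen.add(pid)
        while True:
            # Re-read the child list on every pass; it may have grown.
            pending = [c for c in list_children(pid) if c not in frozen]
            if not pending:
                return
            for child in pending:
                visit(child)

    visit(root)
    return frozen
```

Because already frozen tasks are never released, each pass can only shrink the set of running tasks, which is what lets the approach converge on a tree that keeps forking.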
Andrey Vagin
978badc196 dump: rework freeze of threads
When we try to freeze threads, some of them can exit
and a few new ones can be born. Currently we unfreeze the process tree
in this case, so we have the same chance to fail on the next attempt.

I suggest not unfreezing frozen threads; just update the thread list
and freeze the threads that are not frozen yet.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 15:16:10 +04:00
Pavel Emelyanov
a37cf7a407 seize: Add some comments about unseizing task
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 22:48:55 +04:00
Andrey Vagin
9cec289f74 dump: restore state correctly
Currently all tasks are restored as alive, but stopped tasks
must be restored as stopped.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 22:47:45 +04:00
Andrey Vagin
625ad48dfd dump: don't restore a state of threads
It is not needed, because state is a property of the task,
so we can restore the state of the task and it should be enough.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 22:47:43 +04:00
Andrey Vagin
d7cf271ed4 crtools: preload libraries (v2)
Libraries (plugins) are going to be used for dumping and restoring
external dependencies (e.g. dbus, systemd journal sockets, character
devices, etc.)

A plugin can have the cr_plugin_init() and cr_plugin_fini functions for
initialization and deinitialization.

criu-plugin.h contains all things, which can be used in plugins.

v2: rename lib to plugin
v3: add a default value for a plugin path.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 21:48:33 +04:00
Pavel Emelyanov
5535e44dac crtools: Keep cr_dump_task's post-dump ret code
Broken by 09b7b57c.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-13 14:32:18 +04:00
Pavel Emelyanov
ae98ef6ae0 mount: Factor out mount tree build for NEWNS and non-NS cases
We anyway build the tree, in the NS case -- few calls later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-12 16:19:48 +04:00
Cyrill Gorcunov
cf1ce5f817 mount: Build mount tree on dump restore early, if needed
For path resolution we will need the mount tree to be parsed
and built, but it's not that simple -- the current code
implies that once parsed the tree must not be parsed
again, so we pass the @parse argument from a caller: if a task
we're restoring does not use a mount namespace, we should parse
the mount tree early, otherwise defer this action until the mount
tree is read from the image.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-11 16:05:19 +04:00
Andrey Vagin
4850fd94a8 crtools: move cr_options in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 18:17:52 +04:00
Andrey Vagin
1300cf4915 crtools: move all stuff about fdset in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 15:24:48 +04:00
Andrey Vagin
0d1dfc2e08 crtools: move all stuff about vma together
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:43:49 +04:00
Andrey Vagin
824403a009 crtools: create new header for servicefd stuff (v2)
v2: generate patch relative to the official git.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:43:02 +04:00
Andrey Vagin
a6edbcf669 crtools: don't include restorer.h in proc_parse.h
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:37:55 +04:00
Cyrill Gorcunov
fcfa58026c dump: Don't forget to cleanup link remap if needed
If the checkpoint fails or the -R option is passed,
we need to remove the link remap files created during
the dump procedure.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-01 13:36:07 +04:00
Cyrill Gorcunov
7fe6220883 dump: Unlock network if -R option passed
It's been found that if the -R (leave task running after checkpoint)
option is passed we don't unlock the network, nor do we clean service
files (such as link remaps).

After a long discussion we chose the following path: if the -R option
is passed, it means a user is quite confident in what he is doing
and consistency of the resources (file system) is achieved by
the user himself with the help of a post-dump script. Also the user knows
that the network will be unlocked and accepts that.

So here we check for -R being passed on the command line and once
the checkpoint completes we unlock the network.

Cleaning up of link remaps is addressed in another patch.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-01 13:35:48 +04:00
Jamie Liu
71e1a99523 pre-dump: do not disconnect from page server before writing to it
Signed-off-by: Jamie Liu <jamieliu@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-10 22:56:33 +04:00
Pavel Emelyanov
00ae0d330a dump: Add comment how we dump zombies in pidns case
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-10 15:28:29 +04:00
Pavel Emelyanov
20d64b4326 dump: Install target ns' proc fd as service fd
Don't carry it around in a static global variable. Would
be useful for pidns leaks (processes entered one) scan.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-10 15:07:01 +04:00
Pavel Emelyanov
6bf63b3f01 security: Push full creds info into may_xxx checks
It's not enough to check only uids on dump and restore -- we need to
check e-ids and s-ids now (and caps in the future).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 15:48:44 +04:00
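
Why checking only the real uid is not enough (6bf63b3f01) can be shown with a small sketch. The policy below is hypothetical and deliberately strict; `Creds` and `may_dump_sketch` are illustrative names, not the actual may_xxx checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Creds:
    """Real, effective and saved ids of a task (illustrative only)."""
    uid: int
    euid: int
    suid: int
    gid: int
    egid: int
    sgid: int


def may_dump_sketch(req_uid: int, req_gid: int, target: Creds) -> bool:
    """Assumed policy: root may dump anything; anyone else may dump a
    task only if every one of its uids and gids matches the requester."""
    if req_uid == 0:
        return True
    uids_ok = target.uid == target.euid == target.suid == req_uid
    gids_ok = target.gid == target.egid == target.sgid == req_gid
    return uids_ok and gids_ok
```

A suid-root program started by uid 1000 keeps a real uid of 1000 but has an effective uid of 0, so a check on the real uid alone would wrongly allow the dump.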
Ruslan Kuprieiev
9907302823 dump: initialize vmas in the very beginning
When dump/pre-dump fails before initializing vmas, free_mappings(&vmas)
is called and this causes a segfault. Let's initialize vmas at the very
beginning of dump.

Signed-off-by: Ruslan Kuprieiev <kurpuser@gmail.com>

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-02 13:27:32 +04:00
Pavel Emelyanov
e8f4840049 dump: Add some comments to tasks collecting code
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 06:28:33 +04:00
Pavel Emelyanov
91389f8782 security: Introduce (rather basic) security restrictions for C/R
Right now we have the ability to launch the C/R service from root
and execute dump requests from unprivileged users. Not to be bad
guys, we deny dumping tasks belonging to a user that cannot be
"watched" (traced, read /proc, etc.) by the dumper.

In the future we will use this "engine" when launched with suid
bit, and (probably) will have more sophisticated policy.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 06:16:17 +04:00
Andrey Vagin
07930a8df4 ns: replace pid on id in per-namespace files
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:17:04 +04:00
Andrey Vagin
79d47a939d crtools: add support of stopped tasks (v2)
Currently we catch processes on the exit point from sigreturn.
If a task must be restored in the stopped state, we can send SIGSTOP
before detaching from it.

v2: add more descriptive comment about skipping SIGSTOP in ptrace.c
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 11:21:34 +04:00
Pavel Emelyanov
2169020bea dump: Add comment saying why we dump zombies separately from alive tasks
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 11:17:28 +04:00
Andrey Vagin
ca3a23ec9c dump: transfer pstree_item in dump_task_core_all
Currently we take pid and core from it and we are going to take state.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 11:14:44 +04:00
Andrey Vagin
dbd41b522f proc: allow parse_thread to use an existent buffer
parse_thread allocated a buffer for threads and then initialized the real
pid for each thread.

Now we want to use it on restore, and at this moment we already have
a buffer with initialized virt pid-s, so we need to initialize the real
pid-s only.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:04 +04:00
Andrey Vagin
3e5ad587f4 parse_proc: move parse_threads from cr-dump.c
It will be used in cr-restore.c for stopping threads on the exit from
sigreturn.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:03 +04:00
Ruslan Kuprieiev
4c504f5ddd dump: Don't dump if children's uids are not equal to client's uid
This is for security -- the service is run from root, but serves requests from non-root
processes. Thus, we shouldn't allow anyone to dump suid programs that are run from
under unprivileged processes.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-13 15:53:42 +04:00