mirror of https://github.com/checkpoint-restore/criu synced 2025-08-27 20:37:57 +00:00

1696 Commits

Author SHA1 Message Date
Pavel Emelyanov
eb1ae0a025 vma: Turn embedded VmaEntry on vma_area into pointer
On restore we will read all VmaEntries in one big MmEntry object,
so to avoid copying them all into vma_areas, reference them by pointer.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 11:44:01 +04:00
Andrey Vagin
0ad373ba6c make: config add test for ptrace_peeksiginfo_args
Currently we check PTRACE_PEEKSIGINFO and if it's defined in a system
header, we suppose that ptrace_peeksiginfo_args is defined there too.

But due to a bug in glibc, this check doesn't work. Now we have F20,
where ptrace_peeksiginfo_args is defined in sys/ptrace and F21 where
it isn't defined.

commit 9341dde4d56ca71b61b47c8b87a06e6d5813ed0e
Author: Mike Frysinger <vapier@gentoo.org>
Date:   Sun Jan 5 16:07:13 2014 -0500

    ptrace.h: add __ prefix to ptrace_peeksiginfo_args

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 23:36:12 +04:00
Pavel Emelyanov
446fdd7200 rst: Collect VmaEntries only once on restore
Right now we do it two times -- on shmem prepare and
on the restore itself. Make collection only once as
we do for fdinfo-s -- root task reads all stuff in and
populates tasks' rst_info with it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 23:35:03 +04:00
Pavel Emelyanov
0786f831d7 mem: Move shmem preparation routine and rename
We'll collect VmaEntries early before fork.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 23:34:12 +04:00
Pavel Emelyanov
98fbeb8d0a vma: Vma allocation helper is now function
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 17:18:42 +04:00
Pavel Emelyanov
608db864a3 vmas: Don't call stat on vm file twice
When parsing mappings in proc, we fstat vm file, later,
when dumping it, we stat it again to fill fd_parms.
The 2nd stat is not required, we can keep the stat in
vma_area.

This removed 35% of all stat calls on dump of basic container.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-03 00:18:32 +04:00
Pavel Emelyanov
bd7bf7bd39 anon-inode: Don't readlink fd/fd multiple times
The is_foo_link readlinks the lfd to check. This makes
anon-inodes dumping readlink several times to find proper
dump ops. Optimize this thing.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-02 22:14:29 +04:00
Pavel Emelyanov
e4a2618724 pb: Number PB_ constants
Makes it easier to map values seen in logs to constants by eye.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-02 01:20:49 +04:00
Pavel Emelyanov
740eb9c101 proc-parse: Don't open and stat every single map_files link
Quite a lot of VMAs in tasks map the same file with different
perms. In that case we may skip opening all these files, but
"borrow" one from the previous VMA parsed.

There's little sense in looking further back than the previous VMA,
as the same file is rarely (though it can be) mapped in different
locations.

After this on a basic Centos6 container the number of opens and
stats in this function drops from ~1500 to ~500.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-31 20:31:06 +04:00
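
The "borrow from the previous VMA" idea in 740eb9c101 can be illustrated with a minimal sketch. This is not the CRIU code: `struct map_line` and `count_map_file_opens()` are hypothetical names, and identifying a file by its device and inode pair is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one parsed line of /proc/PID/maps. */
struct map_line {
	unsigned long dev;	/* device number, 0 for anonymous mappings */
	unsigned long ino;	/* inode number, 0 for anonymous mappings */
};

/*
 * Count how many map_files links would have to be opened when each
 * mapping may borrow the file from the mapping parsed right before it.
 */
size_t count_map_file_opens(const struct map_line *lines, size_t nr)
{
	size_t i, opens = 0;

	assert(lines != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (lines[i].ino == 0)
			continue;	/* anonymous mapping, nothing to open */

		if (i > 0 && lines[i - 1].dev == lines[i].dev &&
		    lines[i - 1].ino == lines[i].ino)
			continue;	/* same file as the previous VMA, borrow it */

		opens++;
	}

	return opens;
}
```

Only the immediately preceding entry is checked, which mirrors the trade-off described in the message: a file mapped again in a distant location is simply opened a second time.
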
Pavel Emelyanov
a90613cf9d mem: Fix zero_page_pfn type
It is compared to a u64, so it should be a u64 as well.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-31 00:08:47 +04:00
Pavel Emelyanov
f9c8e3a2cd pagemap: Factor out pfn retrieving for vdso and zero page
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 23:34:53 +04:00
Pavel Emelyanov
ab57e56202 stats: Add irmap resolve time
It's useful to know this value.

W/o cache (first pre-dump) on minimal container the irmap
resolve time is ~0.2 sec. With cache (next pre-dumps or
final dump) on the same container the irmap resolve time
is 10 times less.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:16 +04:00
Pavel Emelyanov
cc918897b0 irmap: Introduce irmap on-disk cache
When dumping fsnotifies we may go to irmap to get inode->path
mapping. The irmap engine scans FS (in hinted locations) to
get one and it is slow even though we scan only part of the FS.

Since the above scanning is done while tasks are frozen the
freeze time goes up :(

Improve the situation by generating irmap cache in working dir
at pre-dump when tasks get unfrozen.

The on-disk irmap cache is PB file, it sits in -W directory
and can be loaded on dump/pre-dump start in memory. When
resolving the inode->path mapping irmap may meet these entries,
revalidate them and potentially save time.

After pre-dump the (re-)collected irmap data is written back
to irmap cache image. Typically entries written back are the
same read in on cache load.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:16 +04:00
Pavel Emelyanov
529f1a099e img: Needed declarations for irmap-cache image file
The irmap-cache is PB-file (like the stat-* ones).
See comments in the next patches for more details.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
751856c8b8 files: Pre-dump file descriptors
We will generate some info about file-descriptors at that
stage. For now these pre-dumped ones would be fsnotifies,
so the pre-dump of a single fd is written as simple as
possible, but enough for that type of FDs pre-dump.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
84ebc64b1f pre-dump: Collect mount info, root and nsmask
Well, we want to pre-dump files (fsnotifies), for that we
will need mountinfo-s and root, and for the latter -- the
current ns mask.

The problem with current ns mask is that its generation is
incorporated into ns IDs generation and dumping. And since
the ids dumping is not performed on pre-dump, let's just
provide a helper for ns-mask generation.

Strictly speaking, the whole ns-mask idea is not great, but
it's to be fixed later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
c18c733b7c proc-parse: Parse pid's fdinfo entries
The existing code opens "self" and parses what's in there,
just twist the code a little to accept generic pid.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
9753501297 rpc: Introduce CLI's --action-script analogue
The service shouldn't call client-provided scripts, as that
creates a security issue (the client may be unprivileged,
while the service is privileged).

In order to let the caller do what it would normally do with
criu-scripts, make criu notify it about scripts. The caller
then does whatever it needs and responds back.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 15:58:45 +04:00
Pavel Emelyanov
29952618e3 daemon: Write own daemon routine
RPC will start the page-server daemon and needs to get
control back to report to the caller, but glibc's
daemon() does exit() in the parent context, preventing it.

Thus -- introduce own daemonizing routine.

Strictly speaking, this is not a pure daemon() clone, as the
parent process has to exit by itself. But this is OK for now.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 15:58:41 +04:00
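
A daemonizing routine that returns to the parent instead of exiting, as 29952618e3 describes, might be sketched as follows. `daemonize_sketch()` is a hypothetical name, and the arguments only mimic glibc's daemon(nochdir, noclose); the actual CRIU routine may differ.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns the child's pid in the parent, 0 in the detached child and
 * -1 if fork() failed. Unlike daemon(), the parent is NOT terminated,
 * so it can still report back to whoever called it.
 */
pid_t daemonize_sketch(int nochdir, int noclose)
{
	pid_t pid = fork();

	if (pid != 0)
		return pid;	/* parent (pid > 0) or fork error (-1) */

	/* Child: detach from the controlling terminal and session. */
	if (setsid() < 0)
		_exit(1);

	if (!nochdir && chdir("/") < 0)
		_exit(1);

	if (!noclose) {
		int fd = open("/dev/null", O_RDWR);

		if (fd < 0)
			_exit(1);
		dup2(fd, STDIN_FILENO);
		dup2(fd, STDOUT_FILENO);
		dup2(fd, STDERR_FILENO);
		if (fd > STDERR_FILENO)
			close(fd);
	}

	return 0;
}
```

The caller decides what the parent does next; in the RPC case described above it would send its response and only then exit.
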
Andrey Vagin
af510ae01a mm: don't dump the zero page
If someone reads untouched page, the kernel maps the zero page
to this address. This page will not have the SOFT_DIRTY bit and it must
not be dumped.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 14:31:39 +04:00
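
The check from af510ae01a can be sketched against the documented /proc/PID/pagemap entry layout (bit 63 present, bit 62 swapped, bits 0-54 PFN when present). `should_dump_page()` and the `PME_*` names are chosen for this example only, and how the zero page's PFN is obtained is outside the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PME_PRESENT	(1ULL << 63)		/* page is in RAM */
#define PME_SWAP	(1ULL << 62)		/* page is swapped out */
#define PME_PFN_MASK	((1ULL << 55) - 1)	/* bits 0-54 hold the PFN */

/*
 * Decide whether a page needs dumping, given its pagemap entry and
 * the PFN of the shared zero page (kept as a 64-bit value so that it
 * compares cleanly with the masked entry).
 */
bool should_dump_page(uint64_t pme, uint64_t zero_pfn)
{
	assert(zero_pfn <= PME_PFN_MASK);

	if (!(pme & (PME_PRESENT | PME_SWAP)))
		return false;	/* never touched, nothing to dump */

	if ((pme & PME_PRESENT) && (pme & PME_PFN_MASK) == zero_pfn)
		return false;	/* read-only touch mapped the zero page */

	return true;
}
```

Keeping `zero_pfn` 64 bits wide is the same concern a90613cf9d addresses for zero_page_pfn.
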
Pavel Emelyanov
839a3c6122 files: Don't call fstatfs twice
When filling fd_parms we do call statfs, no need to call it
again later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-29 20:09:27 +04:00
Pavel Emelyanov
7c22422659 irmap: Reverse dev:inode to path mapping
For inotify/fanotifies we cannot always open inodes by a
handle and have to scan directories searching for the inode
path :(

Fortunately, in most of the containers' cases fsnotifies are
put in "typical" places. These are used as hints for scanner.

The best way to go is use openvz's ploop over such filesystems.
Long term solution is to fix NFS to provide opening by handle.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-29 17:30:05 +04:00
Pavel Emelyanov
2c500de8ed criu: Remove parent-img service fd
It's no longer needed. All parent manipulations are done in-place
using CR_PARENT_LINK name.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-24 16:12:46 +04:00
Pavel Emelyanov
91011328fa criu: Several formatting fixes
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-14 09:33:19 +04:00
Pavel Emelyanov
52dec20a17 files: Save fstype on fd_parms
We will need to special-care NFS silly-rename files, thus
we need to know which FS a file belongs to.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-27 21:14:09 +04:00
Pavel Emelyanov
7ab8a3261b show: Implement simple images filtering
The -F|--fields option specifies which fields (by name, comma
separated) should be printed.

For nested fields all names in path should be specified.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-27 15:58:27 +04:00
Pavel Emelyanov
a4525522b5 net: Add ability to dump external links with plugins
If we meet a link we cannot dump we call plugin to check
whether it's the link, that should be treated as external.

Note, that on restore we don't call any plugins, but
consider the setup-namespace script to move the respective
link into the namespace. Links are not hierarchical and
can be moved between namespaces easily, so it's OK to
delegate the link creation to the script.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 22:39:04 +04:00
Pavel Emelyanov
c18b30d0a9 mount: Restore external bind-mounts with plugins
All the entries with with_plugin set will be mounted by plugin.
The interesting case is when we do the pivot-root restore. In this
case we call restore callback very early (before we unmount the old
tree) and ask it to create the mountpoint at temporary location.
Later we move the mount to proper place.

The old_root argument of the callback is where it can find files
in the original mount namespace.

The is_file is a return argument. Since files and directories cannot be
bind-mounted to each other, the callback should create the mountpoint
itself and report whether it created a file or a directory.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 11:07:41 +04:00
Pavel Emelyanov
d21ff39aab mount: Dump external bind-mounts with plugins
External bind mounts are those with source sitting outside of the
current FS view. Such are detected in validate_mounts(), so we
just go ahead and call plugins.

The plugin is provided with the mountpoint to decide whether it's
his or not (what else does the guy need?) and an ID with which it
can identify the mountpoint in /proc. The same ID will be used at
restore time to find the needed restore info.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 11:07:39 +04:00
Andrew Vagin
c8e9701f21 ptrace: check, that PTRACE_LISTEN isn't defined yet
In file included from arch/x86/crtools.c:11:0:
include/ptrace.h:16:0: error: "PTRACE_LISTEN" redefined [-Werror]
 #define PTRACE_LISTEN  0x4208
 ^
In file included from include/ptrace.h:5:0,
                 from arch/x86/crtools.c:11:
/usr/include/sys/ptrace.h:150:0: note: this is the location of the previous definition
 #define PTRACE_LISTEN PTRACE_LISTEN
 ^
cc1: all warnings being treated as errors
make[1]: *** [arch/x86/crtools.o] Error 1

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-23 13:54:19 +04:00
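
The redefinition error quoted in c8e9701f21 is the kind that an `#ifndef` guard avoids. The sketch below assumes the fallback value 0x4208 shown in the compiler output and is not copied from CRIU's include/ptrace.h.

```c
#include <sys/ptrace.h>

/*
 * Provide a fallback only when the system header has not already
 * defined the request, so -Werror builds no longer trip over a
 * duplicate definition.
 */
#ifndef PTRACE_LISTEN
# define PTRACE_LISTEN	0x4208
#endif
```

On systems whose sys/ptrace.h already provides PTRACE_LISTEN, the guarded block is skipped and the system definition is used.
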
Andrew Vagin
a2eaa5cf44 ptrace: the task state is restored automatically
It's a feature of PTRACE_SEIZE.  So we need to do something, only
if we want to change the state.

[xemul: If task _was_ in stopped state before dump and we want them
 to stay alive after dump, the existing code queues one more STOP
 to it. This affects subsequent dump, as we seize a stopped task
 with STOP in queue.

 One more item in TODO list -- support stopped tasks with STOP in
 queue :)
]

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 20:36:12 +04:00
Kir Kolyshkin
fc9fea2fdc criu-log.h: fit macros in 80 columns
Before this patch, the backslash was at the 81st column, which makes the text
twice as long on a standard 80-column terminal, which is quite annoying.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 16:09:13 +04:00
Andrey Vagin
6bbdec26f3 files: add ability to set callbacks for files (v7)
Nothing interesting here. If a file can't be dumped by criu,
plugins are called. If one of plugins knows how to dump the file,
the file entry is marked as need_callback. On restore if we see
this mark, we execute plugins for restoring the file.

v2: Callbacks are called for all files, which are not supported by CRIU.
v3: Call plugins for a file instead of file descriptor. A few file
descriptors can be associated with one file.
v4: A file descriptor is opened in a callback. It's required for
    restoring anon vmas.
v5: Add a separate type for unsupported files
v6: define FD_TYPES__UNSUPP
v7: s/unsupp/ext (external)

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 16:07:38 +04:00
Andrey Vagin
9e3f4451e1 unix: add ability to set callbacks for external sockets (v5)
We don't know a state behind an external socket. It depends on logic
of the program, which handles this socket.

This patch adds ability to load a library with callbacks for dumping
and restoring external sockets.

This patch introduces two callbacks cr_plugin_dump_unix_sk and
cr_plugin_restore_unix_sk. If a callback can not handle a socket, it
must return -ENOTSUP.

The main question is what kind of information should be transferred in
these callbacks. Please think about that for a few minutes and send me
your opinion.

v2: Use uflags instead of adding a new field
v3: clean up
v4: Unsuitable callbacks return -ENOTSUP.
v5: set USK_CALLBACK, if a socket was dumped by callback.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 15:28:41 +04:00
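
The -ENOTSUP convention from 9e3f4451e1 lends itself to a simple dispatch loop, sketched below. The callback signature `(int fd, int id)`, the typedef and all function names are assumptions for illustration; only the "return -ENOTSUP when the socket is not yours" rule comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed callback shape; the real plugin API may pass other arguments. */
typedef int (*dump_unix_sk_cb)(int fd, int id);

/*
 * Try each loaded plugin in turn. A plugin that does not recognise
 * the socket returns -ENOTSUP and the next one is asked; any other
 * value (success or a real error) ends the search.
 */
int run_dump_unix_sk(const dump_unix_sk_cb *cbs, size_t nr, int fd, int id)
{
	size_t i;

	assert(cbs != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		int ret = cbs[i](fd, id);

		if (ret != -ENOTSUP)
			return ret;
	}

	return -ENOTSUP;	/* nobody can handle this socket */
}

/* Two toy plugins: the first handles nothing, the second only id 42. */
int plugin_a_dump(int fd, int id)
{
	(void)fd;
	(void)id;
	return -ENOTSUP;
}

int plugin_b_dump(int fd, int id)
{
	(void)fd;
	return id == 42 ? 0 : -ENOTSUP;
}
```

With this convention a plugin never needs to know about other plugins; it only has to refuse sockets it does not own.
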
Andrey Vagin
e027f116e4 plugin: add a function to get a descriptor to the image dir
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 21:49:39 +04:00
Andrey Vagin
d7cf271ed4 crtools: preload libraries (v2)
Libraries (plugins) are going to be used for dumping and restoring
external dependencies (e.g. dbus, systemd journal sockets, character
devices, etc.)

A plugin can have the cr_plugin_init() and cr_plugin_fini() functions for
initialization and deinitialization.

criu-plugin.h contains all things, which can be used in plugins.

v2: rename lib to plugin
v3: add a default value for a plugin path.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 21:48:33 +04:00
Andrey Vagin
2add5b87fa plugin: allow to use logging function in plugins
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 21:52:12 +04:00
Pavel Emelyanov
ba96e646a4 page-read: Fix naming
The open_page_at one is quite obfuscating.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:55:22 +04:00
Tikhomirov Pavel
41433f4043 v3 deduplication: add auto-dedup local
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:52:04 +04:00
Tikhomirov Pavel
4904878258 v3 deduplication: add auto-dedup option
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:51:47 +04:00
Tikhomirov Pavel
d866bd196b v3 deduplication: look up old pages in previous snapshot
The old snapshot is found via the "parent" symlink, and pids are taken from pagemap-PID.img files.

Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:51:42 +04:00
Tikhomirov Pavel
ea403c7a08 v3 deduplication: add dedup command to criu
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:51:22 +04:00
Tikhomirov Pavel
5db1adc567 v3 page-read: add open_page_rw to open pages in O_RDWR mode
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:51:20 +04:00
Tikhomirov Pavel
6336d537ac v3 page-read: make seek_pagemap_page function usable in dedup
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:51:09 +04:00
Tikhomirov Pavel
0f90727ddb v3 page-read: add reuse possibility of pagemap2iovec
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:50:49 +04:00
Ruslan Kuprieiev
a1e7407397 service: move constants to cr-service-const.h
Constants such as CR_MAX_MSG_SIZE and CR_DEFAULT_SERVICE_ADDRESS need to be used in both the service and the lib.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:26:10 +04:00
Kir Kolyshkin
b11f24fd5d criu check: don't run as non-root
When criu check is run as non-root, a lot of information is printed
to the user, but the one missing bit is that it should be run as root.

Fix it.

I still don't like the fact that some other stuff is printed here,
like the timestamp and the __FILE__:__LINE__, but this should be
fixed separately.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-13 13:58:45 +04:00
Pavel Emelyanov
ae98ef6ae0 mount: Factor out mount tree build for NEWNS and non-NS cases
We anyway build the tree, in the NS case -- few calls later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-12 16:19:48 +04:00
Cyrill Gorcunov
e9f9fdb9b3 headers: Drop uintX_t usage
We have a mess of uintX_t and uX usage. Drop off uintX_t ones.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-12 10:03:07 +04:00
Kir Kolyshkin
26fda7a319 space-before-tab whitespace cleanup
Remove space before tab characters.

Found by git grep ' 	' (Space, Ctrl-V, Tab in shell).

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-12 10:00:53 +04:00