2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 13:28:27 +00:00

28 Commits

Author SHA1 Message Date
Pavel Emelyanov
350a7a982a Revert "cgroups: Add ability to reuse existing cgroup yard directory"
Reasoning: some systems have /sys/fs/cgroup stuff mounted as read-only
and we have to either remount it rw or create our own set. The former
doesn't look sane as this rw remounting is also done by ststemd, so
let's return back to manual cgyard construction.

This reverts commit 860df95f859cf7ba23b57fc832793c623a5897e4.

Conflicts:
	cgroup.c
	include/cr_options.h

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-16 19:15:20 +03:00
Cyrill Gorcunov
c7d646afb3 cgroups: Introduce cgroup management modes
When been playing wich checkpoint/restore of container I found
that we can't reuse existing controller if they were pre-created.
For example currently in PCS7 we're bindmount cgroups which belong
to a container in a form of

 /sys/fs/cgroup/<controller>/<container> ==> /sys/fs/cgroup/<controller>

so that CRIU dumps such configuration fine but on restore
it recreates controllers from the scratch which we would
like to bindmount them and ask CRIU to restore subcgroups
and their parameters.

So I extended --manage-cgroups option to take <mode> arguments.
Detailed description in docs.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-15 21:21:56 +03:00
Cyrill Gorcunov
860df95f85 cgroups: Add ability to reuse existing cgroup yard directory
Currently we always create temporary directory where we restore
cgroups, but this won't work in case if mounting cgroups is forbidden
from inside of a container for some reason (as in OpenVZ kernel).

So one can pass --cgroup-yard option to specify an existing
directory where cgroups are living. By default we assume it
lays in /sys/fs/cgroup.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-15 21:21:54 +03:00
Tycho Andersen
fcae4f3954 mnt: add --enable-external-masters option
This option enables external (slave) bind mounts to be resolved.

v2: don't always assume that when the master id matches, the mounts match

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-10 17:54:51 +03:00
Tycho Andersen
0afffc9dc1 mnt: add --enable-external-sharing flag
With this flag, external shared bind mounts are attempted to be resolved
automatically.

v2: don't always assume when the sharing matches that the mount matches

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-10 17:54:12 +03:00
Tycho Andersen
aebfabb5ad mnt: add --ext-mount-map auto option
When this option is specified, if an external (private) bind mount is not
specified by --ext-mount-map KEY:VAL then it is attempted to be resolved
automatically.

v2: introduce find_best_external_match, which looks for the best match based on
    sharing/slave ids; don't try to resolve fsroot_mounted() mountpoints
v3: get rid of really_collect_self_mounts
v4: get rid of fsroot_mounted() check when autodetecting external mounts

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-10 17:52:14 +03:00
Cyrill Gorcunov
391d589482 options: Use union for @daemon and @restore_detach
They both are using 'd' option in different context
though, lets give them two names.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-04-06 18:06:16 +03:00
Cyrill Gorcunov
fd07bc7791 cpu: Add 'ins' mode to --cpu-cap option
In this mode we test if target cpu has all features present
in image file but do not require bit to bit match: target cpu
may be a new one with more features present.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-26 18:15:46 +03:00
Saied Kazemi
0412152fc5 Add inherit fd support
There are cases where a process's file descriptor cannot be restored
from the checkpoint images.  For example, a pipe file descriptor with
one end in the checkpointed process and the other end in a separate
process (that was not part of the checkpointed process tree) cannot be
restored because after checkpoint the pipe will be broken.

There are also cases where the user wants to use a new file during
restore instead of the original file at checkpoint time.  For example,
the user wants to change the log file of a process from /path/to/oldlog
to /path/to/newlog.

In these cases, criu's caller should set up a new file descriptor to be
inherited by the restored process and specify the file descriptor with the
--inherit-fd command line option.  The argument of --inherit-fd has the
format fd[%d]:%s, where %d tells criu which of its own file descriptors
to use for restoring the file identified by %s.

As a debugging aid, if the argument has the format debug[%d]:%s, it tells
criu to write out the string after colon to the file descriptor %d.  This
can be used, for example, as an easy way to leave a "restore marker"
in the output stream of the process.

It's important to note that inherit fd support breaks applications
that depend on the state of the file descriptor being inherited.  So,
consider inherit fd only for specific use cases that you know for sure
won't break the application.

For examples please visit http://criu.org/Category:HOWTO.

v2: Added a check in send_fd_to_self() to avoid closing an inherit fd.
    Also, as an extra measure of caution, added checks in the inherit fd
    look up functions to make sure that the inherit fd hasn't been reused.
    The patch also includes minor cosmetic changes.

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-12-10 12:48:30 +03:00
Cyrill Gorcunov
ff1a751a89 opt: cpu-cap -- Introduce "none" and "cpuinfo" arguments
They will serve to choose capability level when migrating
images between various hardware nodes.

Note it's bare functionality introduced in this commit,
the real implementation is in next patches.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:25:56 +04:00
Pavel Emelyanov
53957fadc3 restore: Introduce the --restore-sibling option
We have a slight mess with how criu restores root task.
Right now we have the following options.

1) CLI
	a) Usually
	task calling criu
	 `- criu
	     `- root restored task

	b) when --restore-detached AND root has pdeath_sig

	task calling criu
	 `- criu
	 `- root restored task

2) Library/SWRK
	task using lib/swrk
	 `- criu
	 `- root restored task

3) Standalone service
	a) Usually
	service
	 `- service sub task
	     `- root restored task

	b) when root has pdeath_sig
	criu service
	 `- criu sub task
	 `- root restored task

It would be better is CRIU always restored the root task as sibling,
but we have 3 constraints:

First, the case 1.a is kept for zdtm to run tests in pid namespaces
on 3.11, which in turn doesn't allow CLONE_PARENT | CLONE_NEWPID.

Second, CLI w/o --restore-detach waits for the restored task to die and
this behavior can be "expected" already.

Third, in case of standalone service tasks shouldn't become service's
children.

And I have one "plan". The p.haul project while live migrating tasks
on destination node starts a service, which uses library/swrk mode. In
this case the restored processes become p.haul service's kids which is
also not great.

That said, here's the option called --restore-child that pairs the
--restore-detach like this:

* detached AND child:

task
 `- criu restore (exits at the end)
 `- root task

The root task will become task's child.
This will be default to library/swrk.
This is what LXC needs.

* detach AND !child

task
 `- criu restore (exits at the end)
     `- root task

The root task will get re-parented to init.
This will be compatible with 1.3.
This will be default to standalone service and
to my wish with the p.haul case.

* !detach AND child

task
 `- criu restore (waits for root task to die)
 `- root task

This should be deprecated, so that criu restore doesn't mess
 with task <-> root task signalling.

* !detach AND !child

task
 `- criu restore (waits for root task to die)
     `- root task

This is how plain criu restore works now.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
2014-09-10 18:30:30 +04:00
Pavel Emelyanov
069bdd9674 scripts: Move scripts code into separate sources
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-05 13:48:21 +04:00
Pavel Emelyanov
7058714fda service: Add ability to inherit page server socket
The swrk action is turning out to be a cool thing. We can
spawn criu with swrk action with some FD being open, then
ask for dump/pre-dump/page-server telling it that some
descriptor it needs is "out there".

This patch lets us specify that the page server communication
channel is already in criu's fdtable.

TODO: teach regular service to accept fd via service socket.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-03 20:50:12 +04:00
Saied Kazemi
9eec8b03af Use --root instead of --aufs-root
When dumping Docker containers using the AUFS graph driver, we can
use the --root option instead of --aufs-root for specifying the
container's root.  This patch obviates the need for --aufs-root
and makes dump CLI more consistent with restore CLI.

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-27 14:31:40 +04:00
Saied Kazemi
d8b41b6525 Added AUFS support.
The AUFS support code handles the "bad" information that we get from
the kernel in /proc/<pid>/map_files and /proc/<pid>/mountinfo files.
For details see comments in sysfs_parse.c.

The main motivation for this work was dumping and restoring Docker
containers which by default use the AUFS graph driver.  For dump,
--aufs-root <container_root> should be added to the command line options.
For restore, there is no need for AUFS-specific command line options
but the container's AUFS filesystem should already be set up before
calling criu restore.

[ xemul: With AUFS files sometimes, in particular -- in case of a
  mapping of an executable file (likekely the one created at elf load),
  in the /proc/pid/map_files/xxx link target we see not the path
  by which the file is seen in AUFS, but the path by which AUFS
  accesses this file from one of its "branches". In order to fix
  the path we get the info about branches from sysfs and when we
  meet such a file, we cut the branch part of the path. ]

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-21 18:35:22 +04:00
Tycho Andersen
94f6c87c9f cg: add --cgroup-root option
The motivation for this is to be able to restore containers into cgroups other
than what they were dumped in (if, e.g. they might conflict with an existing
container). Suppose you have a container in:

memory:/mycontainer
cpuacct,cpu:/mycontainer
blkio:/mycontainer
name=systemd:/mycontainer

You could then restore them to /mycontainer2 via --cgroup-root /mycontainer2.
If you want to restore different controllers to different paths, you can
provide multiple arguments, for example, passing:

--cgroup-root /mycontainer2 --cgroup-root cpuacct,cpu:/specialcpu \
--cgroup-root name=systemd:/specialsystemd

Would result in things being restored to:

memory:/mycontainer2
cpuacct,cpu:/specialcpu
blkio:/mycontainer2
name=systemd:/specialsystemd

i.e. a --cgroup-root without a controller prefix specifies the new default root
for all cgroups.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-19 12:58:36 +04:00
Tycho Andersen
f95b05eb75 opts: add --manage-cgroups option
criu managed cgroups is now an opt-in thing, so by default criu does not manage
(i.e. dump or restore) cgroups. This allows users to use the previous behavior.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-12 14:32:50 +04:00
Pavel Emelyanov
84eb0a1927 criu: Restore tasks as siblings in swrk
Andrey validly pointed out, that restoring pdeath_sig is not
compatible with criu_restore_child() call -- after criu restore
children, it will exit and fire the pdeath_sig into restored
tree root, potentially killing it.

The fix for that could be -- when started in swrk more, criu can
restore tree not as children tasks, but as siblings, using the
CLONE_PARENT flag when fork()-ing the root task.

With this we should also take care about errors handing -- right
now criu catches the SIGCHILD from dying children tasks, and
since we plan to create them be children of the criu parent (the
library caller) we will not be able to catch them. To do so we
SEIZE the root task in advance thus causing all SIGCHLD-s go to
criu, not to its parent.

Having this done we no longer need the SUBREAPER trick in the
library call -- tasks get restored right as callers kids :)

Some thoughts for future -- using this trick we can finally make
"natural" restoration of shell jobs. I.e. -- make criu restore
some subtree right under bash, w/o leaving itself as intermediate
task and w/o re-parenting the subtree to init after restore.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrey Vagin <avagin@parallels.com>
2014-07-01 16:16:07 +04:00
Pavel Emelyanov
c7e0042946 crtools: Introduce the --ext-mount-map option (v3)
On dump one uses one or more --ext-mount-map option with A:B arguments.
A denotes a mountpoint (as seen from the target mount namespace) criu
dumps and B is the string that will be written into the image file
instead of the mountpoint's root.

On restore one uses the same --ext-mount-map option(s) with similar
A:B arguments, but this time criu treats A as string from the image's
root field (foobar in the example above) and B as the path in criu's
mount namespace the should be bind mounted into the mountpoint.

v3:
* Added documentation
* Added RPC bits
* Changed option name into --ext-mount-map
* Use colon as key and value separator

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:36:30 +04:00
Deyan Doychev
69a6bf4439 criu: Add exec-cmd option (v3)
The --exec-cmd option specifies a command that will be execvp()-ed on successful
restore. This way the command specified here will become the parent process of
the restored process tree.

Waiting for the restored processes to finish is responsibility of this command.

All service FDs are closed before we call execvp(). Standad output and error of
the command are redirected to the log file when we are restoring through the RPC
service.

This option will be used when restoring LinuX Containers and it seems helpful
for perf or other use cases when restored processes must be supervised by a
parent.

Two directions were researched in order to integrate CRIU and LXC:

1. We tell to CRIU, that after restoring container is should execve()
   lxc properly explaining to it that there's a new container hanging
   around.

2. We make LXC set himself as child subreaper, then fork() criu and ask
   it to detach (-d) from restore container afterwards. Being a subreaper,
   it should get the container's init into his child list after it.

The main reason for choosing the first option is that the second one can't work
with the RPC service. If we call restore via the service then criu service will
be the top-most task in the hierarchy and will not be able to reparent the
restore trees to any other task in the system. Calling execve from service
worker sub-task (and daemonizing it) should solve this.

Signed-off-by: Deyan Doychev <deyandoichev@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-25 01:20:02 +04:00
Pavel Emelyanov
edde5fb461 irmap: Add option that forces fsnotify watches paths resolve
When migrating container with copying its FS, the inode numbers
and thus their handles wil change. This will make the restore of
inotify/fanotify fail, since they do it via fhandles.

We've already faced the problems with fsnotifies on NFS -- they
don't work there. To address this an irmap cache is created on
pre-dump, so to resolve the issue with changed inodes during
migration, we can force the irmap cache build.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-06 15:12:05 +04:00
Cyrill Gorcunov
056047bdf9 criu: Add --cpu-cap option
This option will serve to manage CPU capabilities
to be matched/ignored on restore procedure. At the
moment we introduce 'fpu','all' capability arguments.
By default 'all' is set.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-28 13:36:38 +04:00
Pavel Emelyanov
9753501297 rpc: Introduce CLI's --action-script analogue
Service shouldn't call client provided scripts, as it
creates a security issue (client may be unpriviledged,
while the service is).

In order to let caller do what it would normally do with
criu-scripts, make criu notify it about scripts. Caller
then do whatever it needs and responds back.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 15:58:45 +04:00
Pavel Emelyanov
7ab8a3261b show: Implement simple images filtering
The -F|--fields option specifies which fields (by name, comma
separated) should be printed.

For nested fields all names in path should be specified.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-27 15:58:27 +04:00
Andrey Vagin
d7cf271ed4 crtools: preload libraries (v2)
Libraries (plugins) is going to be used for dumping and restoring
external dependencies (e.g. dbus, systemd journal sockets, charecter
devices, etc)

A plugin can have the cr_plugin_init() and cr_plugin_fini functions for
initialization and deinialization.

criu-plugin.h contains all things, which can be used in plugins.

v2: rename lib to plugin
v3: add a default value for a plugin path.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-19 21:48:33 +04:00
Tikhomirov Pavel
4904878258 v3 deduplication: add auto-dedup option
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-18 14:51:47 +04:00
Cyrill Gorcunov
3f03d139d3 headers: Add missing __CR_ at last endif
For big #ifdef/#endif chunks we do a comment /* */
at #endif. Add missing ones.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-15 16:59:57 +04:00
Andrey Vagin
4850fd94a8 crtools: move cr_options in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 18:17:52 +04:00