mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 04:48:16 +00:00

80 Commits

Author SHA1 Message Date
Pavel Emelyanov
2d3fa5e7d0 net: Use ns walking helper
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-14 18:01:38 +04:00
Pavel Emelyanov
aeb3f547f7 net: Move NETLINK_INET_DIAG from socket.c to net.c
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:48:28 +04:00
Pavel Emelyanov
c57c2cfa64 predump: Collect mnt and net namespaces properly
On pre-dump we collect only two namespaces -- the mnt one
for criu and mnt one again for root task.

This is not correct. We need all mount namespaces to make
the irmap generation work properly and we need all net
namespaces to have parasite sockets created.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:30:31 +04:00
Pavel Emelyanov
45fd143409 parasite: Precreate daemon control sockets
Now that we have netns on pstree-item, we have a place
to pre-create the daemon socket in the needed namespace.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:35:26 +04:00
Pavel Emelyanov
3c7d01f6a7 net: Pre-create nl diag sk
The setns() syscall (called by switch_ns()) can be extremely
slow. If we call it two or more times from the same task the
kernel will synchronously go on a very slow routine called
synchronize_rcu() trying to put a reference on old namespaces.

To avoid doing this more than once I propose to create all
per-ns sockets in one place with one setns call. In this
patch the nl diag socket used to collect other sockets
is created this way.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:34:29 +04:00
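
The "one setns call, many sockets" idea from commit 3c7d01f6a7 can be sketched roughly as follows. This is not CRIU's code: `struct sk_spec` and `precreate_ns_sockets()` are hypothetical names, and the sketch assumes the caller already holds an fd for the target net namespace (or passes -1 to stay in the current one). It relies only on the fact that a socket stays bound to the network namespace it was created in.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical description of one socket to create inside the namespace. */
struct sk_spec {
	int domain;
	int type;
	int protocol;
};

/*
 * Enter the net namespace behind ns_fd once (skipped when ns_fd < 0) and
 * create every requested socket there. On failure the sockets opened so far
 * are closed and -1 is returned. The sketch does not switch back to the
 * original namespace; a real caller would have to handle that itself.
 */
int precreate_ns_sockets(int ns_fd, const struct sk_spec *specs, int n, int *sks)
{
	int i;

	assert(specs && sks && n >= 0);

	/* The single, potentially slow, namespace switch. */
	if (ns_fd >= 0 && setns(ns_fd, CLONE_NEWNET) < 0)
		return -1;

	for (i = 0; i < n; i++) {
		sks[i] = socket(specs[i].domain, specs[i].type, specs[i].protocol);
		if (sks[i] < 0) {
			while (i-- > 0)
				close(sks[i]);
			return -1;
		}
	}

	return 0;
}
```

A caller would list every per-namespace socket it needs (for example a netlink diag socket and a control socket) in one `specs` array, so the namespace is entered once rather than once per socket.
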
Pavel Emelyanov
4f9acb6a7c net: Do walk net namespaces to collect
Right now we don't support multiple net namespaces,
but some day we will. Other than this we have a logic
to distinguish cases with no namespaces vs one namespace,
so this walking already makes sense.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:34:05 +04:00
Pavel Emelyanov
7327ffe6a7 ns: Introduce collect_net_namespaces
And move sockets collection there.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:33:56 +04:00
Pavel
8ac80915e0 ns: Factor out namespace switching call
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-30 21:54:11 +04:00
Pavel Emelyanov
295090c1ea img: Introduce the struct cr_img
We want to have buffered images to speed up dump and,
slightly, restore. Right now we use plain file descriptors
to write and read images to/from. Making them buffered
cannot be gracefully done on plain fds, so introduce
a new class.

This will also help if (when?) we will want to do more
complex changes with images, e.g. store them all in one
file or send them directly to the network.

For now the cr_img just contains one int _fd variable.

This patch changes the prototype of open_image() to
return struct cr_img *, pb_(read|write)* to accept one
and fixes the compilation of the rest of the code :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:13 +04:00
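
A minimal sketch of the wrapper described in commit 295090c1ea, assuming nothing beyond what the message states: the struct holds a single `int _fd`, and open returns a `struct cr_img *` instead of a raw descriptor. The helper names (`img_open`, `img_raw_fd`, `img_write`, `img_close`) are made up for illustration and are not CRIU's API.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* For now the image object is just a thin shell around one descriptor. */
struct cr_img {
	int _fd;
};

/* Hypothetical opener: returns an image object or NULL on failure. */
struct cr_img *img_open(const char *path, int flags)
{
	struct cr_img *img = malloc(sizeof(*img));

	if (!img)
		return NULL;

	img->_fd = open(path, flags, 0600);
	if (img->_fd < 0) {
		free(img);
		return NULL;
	}

	return img;
}

/* Callers reach the descriptor through an accessor, never directly. */
int img_raw_fd(const struct cr_img *img)
{
	assert(img);
	return img->_fd;
}

/* Unbuffered for now; buffering could later be added here without touching callers. */
ssize_t img_write(struct cr_img *img, const void *buf, size_t len)
{
	return write(img_raw_fd(img), buf, len);
}

void img_close(struct cr_img *img)
{
	if (!img)
		return;
	close(img->_fd);
	free(img);
}
```

Because callers only see the opaque pointer, a buffer or a different backend could be added to the struct later without changing their code, which is the motivation the commit message gives.
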
Pavel Emelyanov
5f2a7ac27b img: Rename fdset -> imgset
Since we're going to switch from int-fd-s to class-image
soon the fdset name will not fit into the new terminology.

This patch is

 sed -e 's/fdset/imgset/g' -i *
 sed -e 's/imgset_fd/img_from_set/g' -i *
 git mv include/fdset.h include/imgset.h

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:10 +04:00
Pavel Emelyanov
17d44de9af scripts: Use numeric script names
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-05 13:48:26 +04:00
Pavel Emelyanov
069bdd9674 scripts: Move scripts code into separate sources
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-05 13:48:21 +04:00
Cyrill Gorcunov
3146f58317 plugin: Rework plugins API, v2
Here we define new api to be used in plugins.

 - A plugin should provide a descriptor with the help of the
   CR_PLUGIN_REGISTER macro or, if the plugin requires
   no init/exit functions, with CR_PLUGIN_REGISTER_DUMMY.

 - A plugin should define a plugin hook with the help of the
   CR_PLUGIN_REGISTER_HOOK macro.

 - Now the init/exit functions of plugins take a @stage
   argument which tells the plugin at which stage of criu
   (dump/restore) it has been called. For exit it
   also takes @ret, which lets the plugin know if
   something went wrong and it needs to clean up
   its own resources.

The idea is not to limit plugin authors in the names
of functions they might need to use for a particular hook.

The new API deprecates the old plugin structure, but to keep
backward compatibility we will provide a tiny layer of
additional code to support old plugins for at least a couple
of release cycles.

For example a trivial plugin might look like

 | #include <sys/types.h>
 | #include <sys/stat.h>
 | #include <fcntl.h>
 | #include <libgen.h>
 | #include <errno.h>
 |
 | #include <sys/socket.h>
 | #include <linux/un.h>
 |
 | #include <stdio.h>
 | #include <stdlib.h>
 | #include <string.h>
 | #include <unistd.h>
 |
 | #include "criu-plugin.h"
 | #include "criu-log.h"
 |
 | static int dump_ext_file(int fd, int id)
 | {
 |	pr_info("dump_ext_file: fd %d id %d\n", fd, id);
 |	return 0;
 | }
 |
 | CR_PLUGIN_REGISTER_DUMMY("trivial")
 | CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__DUMP_EXT_FILE, dump_ext_file)

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-03 20:48:36 +04:00
Andrey Vagin
d2012883ab criu: rename current_ns_mask to root_ns_mask (v2)
Now we support sub-mntns, so root_ns_mask sounds more correct than
current_ns_mask.

v2: typo fix
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-21 22:38:33 +04:00
Deyan Doychev
69a6bf4439 criu: Add exec-cmd option (v3)
The --exec-cmd option specifies a command that will be execvp()-ed on successful
restore. This way the command specified here will become the parent process of
the restored process tree.

Waiting for the restored processes to finish is responsibility of this command.

All service FDs are closed before we call execvp(). Standard output and error of
the command are redirected to the log file when we are restoring through the RPC
service.

This option will be used when restoring LinuX Containers and it seems helpful
for perf or other use cases when restored processes must be supervised by a
parent.

Two directions were researched in order to integrate CRIU and LXC:

1. We tell CRIU that after restoring the container it should execve()
   lxc, properly explaining to it that there's a new container hanging
   around.

2. We make LXC set itself as child subreaper, then fork() criu and ask
   it to detach (-d) from the restored container afterwards. Being a subreaper,
   it should get the container's init into its child list after that.

The main reason for choosing the first option is that the second one can't work
with the RPC service. If we call restore via the service then criu service will
be the top-most task in the hierarchy and will not be able to reparent the
restore trees to any other task in the system. Calling execve from service
worker sub-task (and daemonizing it) should solve this.

Signed-off-by: Deyan Doychev <deyandoichev@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-25 01:20:02 +04:00
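
The mechanism behind --exec-cmd in commit 69a6bf4439 (close service FDs, then execvp() so the command takes over as parent of the restored tree) might look roughly like this. The function name and the idea of passing the service FDs as a numeric range are assumptions made for this sketch, not CRIU's implementation.

```c
#include <assert.h>
#include <sys/wait.h>	/* waitpid(), for callers that fork before exec */
#include <unistd.h>

/*
 * Close the descriptors in [first_fd, last_fd] (standing in for the service
 * FDs) and replace the current process image with the given command. The
 * process keeps its PID, so children it already has stay its children.
 * Returns -1 only if execvp() fails.
 */
int exec_cmd_after_restore(char *const argv[], int first_fd, int last_fd)
{
	int fd;

	assert(argv && argv[0]);

	for (fd = first_fd; fd <= last_fd; fd++)
		close(fd);	/* errors ignored: the fd may simply not be open */

	execvp(argv[0], argv);
	return -1;		/* only reached when execvp() failed */
}
```

Since exec keeps the PID, the restored processes remain children of the process that now runs the command, which is why waiting for them becomes that command's job.
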
Pavel Emelyanov
01e88d1c87 rpc: Add ability to specify veth pairs (--veth-pair option)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-12 00:33:02 +04:00
Pavel Emelyanov
eeec1b40a7 net: Add comment about not using ifindex for setlink
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-27 16:00:57 +04:00
Pavel Emelyanov
b88d8e1bf6 net: Don't specify link index when restoring its params
Well, when we create external link with ip utility we cannot
specify its index. Thus we will not be able to move external
device to namespace and ask criu to restore link params.

However, RTM_SETLINK can happily work with the link name only.
And since we do have one in the images, we can omit setting
the index in the request.

TODO: Send patch with index specification to iproute2.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 22:40:24 +04:00
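
To illustrate commit b88d8e1bf6, here is a sketch of an RTM_SETLINK request that identifies the link by name only: `ifi_index` stays 0 and the name travels in an IFLA_IFNAME attribute. The struct and function names are hypothetical, the MTU attribute is just an example parameter, and the sketch only builds the message; sending it over a NETLINK_ROUTE socket is left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

#define SKETCH_IFNAMSIZ 16	/* same value as the kernel's IFNAMSIZ */

/* Hypothetical request layout: header, link info, then room for attributes. */
struct setlink_req {
	struct nlmsghdr h;
	struct ifinfomsg i;
	char attrs[256];
};

/* Append one rtattr at the current end of the message. */
static int add_attr(struct setlink_req *req, unsigned short type,
		    const void *data, size_t len)
{
	size_t off = NLMSG_ALIGN(req->h.nlmsg_len);
	size_t rta_len = RTA_LENGTH(len);
	struct rtattr *rta;

	if (off + RTA_ALIGN(rta_len) > sizeof(*req))
		return -1;

	rta = (struct rtattr *)((char *)req + off);
	rta->rta_type = type;
	rta->rta_len = rta_len;
	memcpy(RTA_DATA(rta), data, len);
	req->h.nlmsg_len = off + RTA_ALIGN(rta_len);
	return 0;
}

/* Build a SETLINK request addressed by link name instead of ifindex. */
int build_setlink_by_name(struct setlink_req *req, const char *name,
			  unsigned int flags, unsigned int mtu)
{
	assert(req && name);

	if (strlen(name) >= SKETCH_IFNAMSIZ)
		return -1;

	memset(req, 0, sizeof(*req));
	req->h.nlmsg_len = NLMSG_LENGTH(sizeof(req->i));
	req->h.nlmsg_type = RTM_SETLINK;
	req->h.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	req->i.ifi_family = AF_UNSPEC;
	req->i.ifi_index = 0;	/* deliberately unset: the name selects the link */
	req->i.ifi_flags = flags;

	if (add_attr(req, IFLA_IFNAME, name, strlen(name) + 1) < 0)
		return -1;
	return add_attr(req, IFLA_MTU, &mtu, sizeof(mtu));
}
```
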
Pavel Emelyanov
a4525522b5 net: Add ability to dump external links with plugins
If we meet a link we cannot dump we call plugin to check
whether it's the link, that should be treated as external.

Note that on restore we don't call any plugins, but
rely on the setup-namespace script to move the respective
link into the namespace. Links are not hierarchical and
can be moved between namespaces easily, so it's OK to
delegate the link creation to the script.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 22:39:04 +04:00
Pavel Emelyanov
6d683556b2 net: Factor out unknown devices dumping
For now this "dumping" is just a warning. In the next patches
there will be call to plugins.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 22:39:01 +04:00
Pavel Emelyanov
8df4d6769c net: Factor out empty-kind devices checking
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 22:38:54 +04:00
Pavel Emelyanov
d5790de623 net: Move void devices dumping in a separate helper
To make it look symmetrical to other types dumping.
And for simpler further patching.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 22:38:42 +04:00
Pavel Emelyanov
d52e000152 net: Don't create lo on netns restore
For devices that are already available in the netns we have a special
routine that just restores link params.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-08 16:52:20 +04:00
Andrey Vagin
4850fd94a8 crtools: move cr_options in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 18:17:52 +04:00
Andrey Vagin
1300cf4915 crtools: move all stuff about fdset in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 15:24:48 +04:00
Andrey Vagin
1a0ee90d2b tcp: disable repair mode for sockets on rollback (v2)
Currently if a network namespace is dumped and something fails, sockets
remain in repair mode. It's because cpt_unlock_tcp_connections is
executed only if network namespace is not dumped.

cpt_unlock_tcp_connections disables repair mode for sockets and drops
netfilters. netfilters are not used in case of network namespaces.

v2: don't execute network-unlock scripts if network namespaces are not
    dumped.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-31 20:12:55 +04:00
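
The rollback step from commit 1a0ee90d2b, switching repair mode off for every socket that was put into it, could be sketched like this. `tcp_repair_off()` and `unlock_tcp_sockets()` are hypothetical helpers; the sketch assumes the caller tracked the affected sockets in a plain array and has the privileges needed to touch TCP_REPAIR.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_REPAIR
#define TCP_REPAIR 19	/* value from linux/tcp.h, for older libc headers */
#endif

/* Take a single socket out of repair mode. */
int tcp_repair_off(int sk)
{
	int off = 0;

	return setsockopt(sk, IPPROTO_TCP, TCP_REPAIR, &off, sizeof(off));
}

/*
 * Rollback helper: try every socket even if some fail, so that one bad
 * descriptor does not leave the rest stuck in repair mode.
 */
int unlock_tcp_sockets(const int *sks, int n)
{
	int i, ret = 0;

	assert(sks || n == 0);

	for (i = 0; i < n; i++)
		if (tcp_repair_off(sks[i]) < 0)
			ret = -1;

	return ret;
}
```
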
Cyrill Gorcunov
9dd6887d7a net: Dump EXTLINK devices
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-10 15:12:54 +04:00
Cyrill Gorcunov
d0a323cb1f net: Restore EXTLINK devices
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-10 15:12:36 +04:00
Pavel Emelyanov
28014d7eb4 net: Save and restore iptables in net namespace
By default just use the iptables-save and iptables-restore commands.
User may define CR_IPTABLES variable, in this case the "sh -c $CR_IPTABLES"
would be called.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-04 02:51:33 +04:00
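
The command selection described in commit 28014d7eb4 (default tool unless CR_IPTABLES is set, then run through `sh -c`) can be sketched as below. Only the CR_IPTABLES variable and the `sh -c` invocation come from the commit message; the function names are made up and the fd redirection a real dump or restore needs is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Pick the user's override from CR_IPTABLES, else the default command. */
const char *iptables_cmd(const char *def_cmd)
{
	const char *cmd = getenv("CR_IPTABLES");

	assert(def_cmd);
	return cmd ? cmd : def_cmd;
}

/*
 * Replace the current process with "sh -c <cmd>". Meant to be called in a
 * forked child whose stdin/stdout already point at the image file.
 * Returns -1 only if execl() fails.
 */
int exec_iptables_tool(const char *def_cmd)
{
	execl("/bin/sh", "sh", "-c", iptables_cmd(def_cmd), (char *)NULL);
	return -1;
}
```

With this shape, dump would pass "iptables-save" and restore "iptables-restore" as `def_cmd`, and a user could substitute any shell command by exporting CR_IPTABLES.
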
Pavel Emelyanov
0327d5511b fdset: Beautify fdset opening
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 05:39:52 +04:00
Andrey Vagin
faf7b94868 netns: don't use global fdset for dumping namespace
We are going to replace pid with id in names of image files. The id is
unique for each namespace, so it's more convenient if image files are
opened per namespace.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:10:45 +04:00
Cyrill Gorcunov
c721a2751f net: Print link name when restore it
For debug purpose.

 | (00.013002)      1: Restoring link lo type 1
 | (00.013002)      1: Restoring netdev lo idx 1
 | (00.015002)      1: Restoring link venet0 type 4
 | (00.015002)      1: Restoring link eth0 type 2
 | (00.015002)      1: Restoring netdev eth0 idx 3

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-18 20:43:59 +04:00
Pavel Emelyanov
02650b0711 net: Sanitize dump_links() function code
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-10 12:45:42 +04:00
Andrey Vagin
cec93fa155 net: mount sysfs in a new mount name-space
The current scheme is racy. It uses open_detache_mount in the current
name-space. If a mount namespace is created by someone else between
mount and umount(detach) in open_detache_mount, the mount will be
propagated to the new mntns, then it is detached in the current ns and
rmdir fails, because it's still mounted in another mntns.

This patch creates a new mount namespace for mounting sysfs.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-29 17:46:28 +04:00
Pavel Emelyanov
b18fb09eb9 show: Replace one-line show_foo calls with args array
We have generic do_pb_show() call and tons of show_foo
routines, that just call one with proper args. Compact
the code by putting the args into array and calling
the do_pb_show() in one place.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-24 04:00:32 +04:00
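
The refactoring in commit b18fb09eb9 follows a common table-driven pattern: per-image arguments move into an array and a single call site hands them to the generic routine. The sketch below shows the pattern only; the magic numbers, field names, and `show_image()` are invented for illustration and do not reflect CRIU's real table or `do_pb_show()` signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical arguments that used to live in separate show_foo() wrappers. */
struct show_args {
	unsigned int magic;	/* made-up image magic */
	const char *name;	/* human-readable image name */
	const char *hints;	/* made-up pretty-printing hints, may be NULL */
};

static const struct show_args show_table[] = {
	{ 0x1001, "netdev", "2:%d" },
	{ 0x1002, "ifaddr", NULL },
	{ 0x1003, "route", NULL },
};

/* Look up the arguments for an image by its magic. */
const struct show_args *find_show_args(unsigned int magic)
{
	size_t i;

	for (i = 0; i < sizeof(show_table) / sizeof(show_table[0]); i++)
		if (show_table[i].magic == magic)
			return &show_table[i];

	return NULL;
}

/* Single call site standing in for the generic show routine. */
int show_image(unsigned int magic)
{
	const struct show_args *a = find_show_args(magic);

	if (!a)
		return -1;

	assert(a->name);
	printf("image %s (hints: %s)\n", a->name, a->hints ? a->hints : "none");
	return 0;
}
```
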
Pavel Emelyanov
022cfc30ae net: Dump and restore netdev address
Tap-s and Veth-s can change one, need to keep it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-24 01:07:33 +04:00
Pavel Emelyanov
1ac6d76cbd tun: Restore tun files and tun links
This thing is pretty straightforward -- on netns creation
populate it with tun-s, after this collect tun files, open
and attach them with regular fd-s engine.

One tricky thing -- when populating namespace with tun links
make them all persistent and drop this flag (if required)
later, when the first live open file appears.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 19:10:15 +04:00
Pavel Emelyanov
a3e53658f7 tun: Dump tun files and tun links
The major issue with dump is -- some info is obtained via netlink,
some via sysfs and some (!) via opened and attached tun file.
But the latter cannot be created, if there's another one attached
(or the mq device is full with threads).

Thus we have to dump this info via existing tun file and keep one
in memory till the link dump code takes place.

Opposite situation is also possible -- we can have a persistent
unattached device. In this case we have to attach to it, dump
things and detach back.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 19:02:55 +04:00
Pavel Emelyanov
92cc20c07c net: Ability to restore existing link's params with rtm
TUN devices are created with ioctl, but their parameters (e.g.
flags with state, mtu, etc.) are to be restored with generic
RTM_SETLINK message.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 18:46:48 +04:00
Pavel Emelyanov
4869da1781 net: Read ns' sysfs file helper
Just a small helper, that reads string from ns' sysfs mount.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 16:14:37 +04:00
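
A helper like the one commit 4869da1781 describes might look as follows, assuming the namespace's sysfs mount is available as a directory fd. The name `read_sys_str()` and its exact signature are assumptions for this sketch; the newline stripping is a convenience added here, since sysfs attributes usually end with one.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read a short string from `path`, resolved relative to the directory fd of
 * the namespace's sysfs mount. The result is NUL-terminated and one trailing
 * newline is stripped. Returns the string length or -1 on error.
 */
int read_sys_str(int sysfs_dirfd, const char *path, char *buf, size_t len)
{
	ssize_t rlen;
	int fd;

	assert(path && buf && len > 0);

	fd = openat(sysfs_dirfd, path, O_RDONLY);
	if (fd < 0)
		return -1;

	rlen = read(fd, buf, len - 1);
	close(fd);
	if (rlen < 0)
		return -1;

	if (rlen > 0 && buf[rlen - 1] == '\n')
		rlen--;
	buf[rlen] = '\0';

	return (int)rlen;
}
```

Using openat() with a directory fd keeps the lookup tied to the detached mount, so no path to that mount has to exist in the caller's own mount tree.
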
Pavel Emelyanov
8d2e0d5d14 net: Mount ns' sysfs before dump
Some information about network devices may hide in sysfs, thus
it's required to have one at hand while dumping the netns.

Create the detached mount for that.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 16:13:31 +04:00
Pavel Emelyanov
c7afbae598 net: Prepare to dump netdev entry with extensions
Some (most) network devices would like to have NetDeviceEntry with
more fields, than currently present (and enough for lo and veth).
Prepare for that by allowing them to define their own callback that
would fill the rest of the pb entry and call write_netdev_img().

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 16:08:46 +04:00
Pavel Emelyanov
40ed18839e net: Print link kind when reporting inability to dump such
Kernel has more and more links with rtnl-ops, which report
a string kind of the device, which is handy for debugging.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-06-11 16:49:11 +04:00
Cyrill Gorcunov
30936058a0 ns: Extend ns_desc to carry the length of the ns name
This will be needed for fast parsing of procfs ns references.

[ xemul: Add user_ns_desc here ]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-18 03:36:56 +04:00
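
Commits 30936058a0 and 3a1c7d1d76 describe a descriptor that ties a namespace's name to its CLONE_ flag and carries the name's length for fast parsing of procfs references. The sketch below is one way that could look; the field and macro names are a plausible reading of the commit messages rather than a verified copy of CRIU's header, and `parse_ns_link()` is a hypothetical consumer of a link target such as `net:[4026531956]`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/* Ties the ns name to its CLONE_ flag; len avoids a strlen() per lookup. */
struct ns_desc {
	unsigned int cflag;
	const char *str;
	size_t len;
};

#define NS_DESC_ENTRY(_cflag, _str) \
	{ .cflag = (_cflag), .str = (_str), .len = sizeof(_str) - 1 }

const struct ns_desc net_ns_desc = NS_DESC_ENTRY(CLONE_NEWNET, "net");
const struct ns_desc mnt_ns_desc = NS_DESC_ENTRY(CLONE_NEWNS, "mnt");

/*
 * Parse a /proc/<pid>/ns/<name> link target of the form "name:[inode]".
 * Returns 0 and stores the inode number on a match, -1 otherwise.
 */
int parse_ns_link(const char *link, const struct ns_desc *d, unsigned long *ino)
{
	const char *num;
	char *end;

	assert(link && d && ino);

	if (strncmp(link, d->str, d->len) != 0 ||
	    link[d->len] != ':' || link[d->len + 1] != '[')
		return -1;

	num = link + d->len + 2;
	*ino = strtoul(num, &end, 10);
	if (end == num || *end != ']')
		return -1;

	return 0;
}
```
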
Pavel Emelyanov
add21b75c9 show: Remove options args from ->show callback
This thing is global, we can address one explicitly.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-08 00:23:42 +04:00
Kir Kolyshkin
d90d4b1b88 Fix typos in log messages
Someone has to do it, right?..

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-15 12:46:25 +04:00
Pavel Emelyanov
5cae819d8c img: Get rid of open_image_ro helper
O_RSTR flag should be used instead for regular open_image

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-09 19:22:21 +04:00
Cyrill Gorcunov
7d8b5da7d6 net: Do not BUG() if unsupported link type met
But rather exit gracefully.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-11 17:14:20 +04:00
Pavel Emelyanov
ac845bd1d8 cr: Obsolete the --namespaces option
It's no longer required to use this option -- two currently
supported cases (tasks on host and tasks in containers) can
be detected automatically. Keep this option for future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-18 13:25:16 +04:00
Pavel Emelyanov
3a1c7d1d76 ns: Introduce ns descriptors
These are structs that (now) tie together ns string
and the CLONE_ flag. It's nice to have one (some code
becomes simpler) and will help us with auto-namespaces
detection.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 23:24:01 +04:00