mirror of https://github.com/checkpoint-restore/criu synced 2025-08-22 18:07:57 +00:00

122 Commits

Author SHA1 Message Date
Kir Kolyshkin
15f914f20a pr_perror(): don't supply \n
pr_perror() is special, it adds \n at the end so there is
no need to supply one.
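
A minimal sketch of why the caller should not pass its own "\n". The helper `demo_perror` below is a hypothetical stand-in, not CRIU's actual pr_perror() implementation: it formats the caller's message, then appends the errno description and the trailing newline itself, so a newline in the format string would end up in the middle of the line.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical pr_perror()-like helper (illustration only).
 * Writes "<message>: <strerror(errno)>\n" into buf.
 * Returns 0 on success, -1 if the result does not fit.
 */
int demo_perror(char *buf, size_t size, const char *fmt, ...)
{
	int saved_errno = errno;	/* formatting may clobber errno */
	va_list ap;
	int n, m;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= size)
		return -1;

	/* The helper, not the caller, supplies the final newline. */
	m = snprintf(buf + n, size - n, ": %s\n", strerror(saved_errno));
	if (m < 0 || (size_t)m >= size - n)
		return -1;
	return 0;
}
```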

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-10-09 18:29:55 +03:00
Kir Kolyshkin
17b92fa542 Append newline when using pr_err()
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-10-09 18:28:00 +03:00
Cyrill Gorcunov
90eae6a168 sk-unix: Don't affect cwd for relative named sockets
When we restore sockets with relative names we change
current working directory into the one provided by
socket image data. This actually affects current
criu state because the rest of code doesn't know
about such tricks and may rely on working dir
consistency.

So remember the current working dir and restore it
back once socket cwd operations are complete.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-10-08 13:18:10 +03:00
Cyrill Gorcunov
7bda5bd649 sk-unix: Fix memory leak on error path
Dynamically allocated @name is not released
if an error happens.

 | ** CID 129898:    (RESOURCE_LEAK)
 | /sk-unix.c: 505 in unix_process_name()
 | /sk-unix.c: 509 in unix_process_name()
 | /sk-unix.c: 519 in unix_process_name()

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-08-05 13:45:37 +03:00
Andrey Vagin
c674906f69 sk-unix: queuer should be set for peer
because we are going to restore data of peer.

Anyway this is wrong, because we need to restore a message with a sender
address.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-08-05 13:44:34 +03:00
Cyrill Gorcunov
2143d7e9ca sk-unix: Fix name resolving on nested mount points
If a socket's cwd lies on a nested mount point
we might resolve its path incorrectly
(the mount_resolve_path helper should not be given
 paths with a leading dot).

Thus pass a path without the leading dot for correct
name resolving.

Also add some error messages.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-31 15:45:22 +03:00
Cyrill Gorcunov
7a6e97b096 sk-unix: Add more verbosity to error paths
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-31 15:45:20 +03:00
Cyrill Gorcunov
3e0b09b1b5 sk-unix: protobuf -- Use string type instead of bytestream
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-30 16:35:16 +03:00
Cyrill Gorcunov
48dbef3ecc sk-unix: unix_process_name -- Defer lookup until required
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-30 16:35:10 +03:00
Artem Kuzmitskiy
b790b586eb Add restoring of unnamed unix sockets.
Added functionality for restoring unnamed unix sockets
using already implemented feature - inherit fd and using same command line
option.
Usage example:
criu restore -d -D images -o restore.log --pidfile restore.pid -v4 \
     -x --inherit-fd fd[3]:socket:[9677263]

Signed-off-by: Artem Kuzmitskiy <artem.kuzmitskiy@lge.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-29 17:53:36 +03:00
Artem Kuzmitskiy
79fd764ae6 Add dumping of unnamed unix sockets.
* Added functionality for dumping unnamed unix sockets.
  When we call CRIU with the dump option, for an unnamed socket we
  should pass its inode into --ext-unix-sk. Details about this problem
  are described in http://criu.org/External_UNIX_socket#What_to_do_with_socketpair.28.29-s.3F.
  Usage example:
    criu dump -D images -o dump.log -v4 --ext-unix-sk=4529709 -t 13506

* fix typo error in log output

Signed-off-by: Artem Kuzmitskiy <artem.kuzmitskiy@lge.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-29 17:51:51 +03:00
Cyrill Gorcunov
8788054e33 sk-unix: Add trivial name resolver for sockets with relative names
Unix sockets may be created with a non-absolute (relative) path
(when the kernel creates one it always uses AT_FDCWD for name resolving),
so when we collect sockets we see them as having names without a leading
slash.

In the common case the application doesn't change its working
directory after creating such a socket, but this is not always true.
So we need some kind of name resolver. A good candidate is the
IRMAP cache, but after a number of tests I found that it might
slow things down dramatically. Thus we need a more
intelligent way here.

For now, for common applications such as postfix, fetching the
dumpee's working directory and root is enough. So here is what we do:

 - when a socket gets collected from the diag interface we remember
   its relative name parameters (device and inode) but postpone
   name resolving to avoid a performance penalty until really needed

 - when we meet a socket to dump with a relative name assigned we
   try $cwd/name and $root/name for this socket to check
   whether it was created in one of those directories. On success we
   simply remember the directory in the image, and when restoring such
   a socket we call the chdir helper to change the working dir and
   generate the relative name

v2:

 - Use new unlink_stale to remove sockets we're to restore
 - Use *at() helpers once we've changed working dir in bind_unix_sk
 - Add more debug output

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-29 17:44:08 +03:00
Cyrill Gorcunov
a6543f8d33 sk-unix: Move name handling into separate routine
It is going to be extended to support relative names.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-28 15:02:53 +03:00
Cyrill Gorcunov
350b8f2107 sk-unix: Defile log prefix
For grepability sake.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-28 15:01:53 +03:00
Cyrill Gorcunov
e559f26909 sockets: unix -- Drop redundant empty lines
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-28 15:01:52 +03:00
Tycho Andersen
83f1f7e588 rst: only restore dgram socket queue once
In the case where there were multiple clients for a dgram socket, we were
restoring the queue for each client. Instead, we should pick one client and
she should restore the queue while the rest skip it.
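
A minimal sketch of the "pick one client" rule, assuming a per-socket field that records which client owns the queue. The structure and function names here are hypothetical and only illustrate the idea; the first client to ask becomes the owner and everyone else skips the queue.

```c
/* Hypothetical per-socket state: id of the client chosen to restore
 * the queue, or 0 if nobody has been chosen yet. */
struct dgram_peer_demo {
	int queuer;
};

/*
 * Returns 1 if client_id is the one that should restore the queue
 * of this peer, 0 if it should skip it.
 */
int should_restore_queue(struct dgram_peer_demo *peer, int client_id)
{
	if (peer->queuer == 0)
		peer->queuer = client_id;	/* first client wins */

	return peer->queuer == client_id;
}
```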

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-28 13:53:11 +03:00
Tycho Andersen
3122529fc8 unix: wait for listen() as well as bind()
We need to wait for listen() as well as bind() for internal unix sockets, or we
can race like this:

(00.135950)      1: Opening standalone socket (id 0xb ino 0x9422f peer 0)
(00.135974)    353: Error (sk-unix.c:701): Can't connect 0x947c4 socket: Connection refused
(00.136390)      1: Error (cr-restore.c:1228): 353 exited, status=1
(00.136407)      1:  Putting 0x9422f into listen state

(where 0x9422f is the peer for 0x947c4)

This race was pretty rare for me, but I've run 1000 tests and it didn't
happen so hopefully this patch fixes it :)

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-07-16 14:25:39 +03:00
Pavel Emelyanov
25d074ad91 unix: Don't dump external peer w/o name
On restore we will use the peer's name to connect() the
socket back, so if there's no name dump should be aborted.

This situation happens when we create a socketpair(), fork
and dump only one task with one pair end.

Reported-by: Artem Kuzmitskiy <artem.kuzmitskiy@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2015-06-08 23:36:21 +03:00
Andrey Vagin
30711b109d userns: save uid-s from a target userns (v2)
We are going to support user namespaces and uid-s will be converted
according to userns mappings.

v2: convert id-s for sockets too
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-11-07 17:15:45 +04:00
Pavel Emelyanov
5f2a7ac27b img: Rename fdset -> imgset
Since we're going to switch from int-fd-s to class-image
soon the fdset name will not fit into the new terminology.

This patch is

 sed -e 's/fdset/imgset/g' -i *
 sed -e 's/imgset_fd/img_from_set/g' -i *
 git mv include/fdset.h include/imgset.h

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:10 +04:00
Cyrill Gorcunov
3146f58317 plugin: Rework plugins API, v2
Here we define new api to be used in plugins.

 - Plugin should provide a descriptor with help of
   CR_PLUGIN_REGISTER macro, or in case if plugin require
   no init/exit functions -- with CR_PLUGIN_REGISTER_DUMMY.

 - Plugin should define a plugin hook with help of
   CR_PLUGIN_REGISTER_HOOK macro.

 - Now the init/exit functions of plugins take a @stage
   argument which tells the plugin at which stage of criu
   (dump or restore) it has been called. For exit it
   also takes @ret, which lets the plugin know if
   something went wrong and it needs to clean up
   its own resources.

The idea behind this is to not limit plugin authors to particular
function names for a particular hook.

This new API deprecates the old plugin structure, but to keep
backward compatibility we will provide a tiny layer of
additional code to support old plugins for at least a couple
of release cycles.

For example a trivial plugin might look like

 | #include <sys/types.h>
 | #include <sys/stat.h>
 | #include <fcntl.h>
 | #include <libgen.h>
 | #include <errno.h>
 |
 | #include <sys/socket.h>
 | #include <linux/un.h>
 |
 | #include <stdio.h>
 | #include <stdlib.h>
 | #include <string.h>
 | #include <unistd.h>
 |
 | #include "criu-plugin.h"
 | #include "criu-log.h"
 |
 | static int dump_ext_file(int fd, int id)
 | {
 |	pr_info("dump_ext_file: fd %d id %d\n", fd, id);
 |	return 0;
 | }
 |
 | CR_PLUGIN_REGISTER_DUMMY("trivial")
 | CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__DUMP_EXT_FILE, dump_ext_file)

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-03 20:48:36 +04:00
Pavel Emelyanov
090587e1a1 stat: Pass namespace into phys_stat_dev_match, not mnt tree
This makes the API simpler.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-06 10:57:25 +04:00
Pavel Emelyanov
74357818f8 unix: Get ns root fd only once.
This makes mntns_get_root_fd usage more natural.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-23 02:51:11 +04:00
Pavel Emelyanov
68e2841a9b mnt: Turn mntns_get_root_fd into accepting mnt ns_id
The only exception (for now) is the irmap -- it should
operate on ns as well.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-23 02:31:16 +04:00
Pavel Emelyanov
1435617c40 mnt: Rename _collect_root into _get_root_fd
Nowadays this routine is mainly used for getting an
fd, rather than keeping one for future reference.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-23 01:38:58 +04:00
Andrey Vagin
2f4be997b6 mount: use per-namespace mntinfo_tree (v2)
This patch removes the global mntinfo_tree and collect_mount_info where
it was constructed. The mntinfo list is filled from dump_mnt_ns,
rst_collect_local_mntns, collect_mnt_namespaces and read_mnt_ns_img.

A mountinfo entry contains a reference to a proper ns_id entry, so
we can use mnt_id to look up a proper mount namespace.

v2: remove trash after rebasing.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-21 22:40:19 +04:00
Andrey Vagin
87a49bdfaf servicefd: add a service fd for current root
It's already used for dumping files and it will be used for restoring,
so it should be a service fd to avoid intersection with restored
descriptors.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-17 12:03:11 +04:00
Pavel Emelyanov
8a827ba403 files: Make fd_id_generate_special return ID into pointer
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:49 +04:00
Pavel Emelyanov
8b611770aa files: Pass stat information into fd_id_generate_special
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:18 +04:00
Pavel Emelyanov
261fdce39d unix: Don't drop whole socket with unlinked path
When stat() on a socket name fails with ENOENT, it
means that the socket is bound and unlinked. Don't
drop the whole socket info, just treat it as nameless.
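
The distinction can be sketched as a small classifier. The enum and function names are hypothetical, chosen for this illustration; the only point is that ENOENT is handled as "bound but unlinked" rather than as a reason to discard the socket.

```c
#include <errno.h>
#include <sys/stat.h>

enum sk_name_state {
	SK_NAME_OK,		/* path exists, name can be used */
	SK_NAME_UNLINKED,	/* bound, then unlinked: treat as nameless */
	SK_NAME_ERROR,		/* any other stat() failure */
};

/* Classify a socket path; *st is filled only for SK_NAME_OK. */
enum sk_name_state check_sk_name(const char *path, struct stat *st)
{
	if (stat(path, st) == 0)
		return SK_NAME_OK;

	return errno == ENOENT ? SK_NAME_UNLINKED : SK_NAME_ERROR;
}
```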

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-29 15:43:11 +04:00
Andrey Vagin
f0e0ee7a5c unix: fix calculation of pointers
CID 1141016 (#1 of 1): Extra sizeof expression (SIZEOF_MISMATCH)
suspicious_pointer_arithmetic: Adding "40UL /* sizeof (FilePermsEntry)
*/" to pointer "(FownEntry *)perms" of type "FownEntry *" is suspicious
because adding an integral value to this pointer automatically scales
that value by the size, 48 bytes, of the pointed-to type, "FownEntry".
Most likely, "sizeof (FilePermsEntry)" is extraneous and should be
replaced with 1.
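
The scaling that the report complains about can be shown with two stand-in structures. The sizes 48 and 40 are taken from the report, but the `_demo` types below are hypothetical placeholders rather than the real FownEntry and FilePermsEntry: adding sizeof() of one type to a pointer of the other advances by 40 elements, not 40 bytes.

```c
#include <stddef.h>

/* Stand-ins with the sizes quoted in the report (assumption). */
struct fown_demo  { char pad[48]; };
struct perms_demo { char pad[40]; };

/* Large enough that index 40 is still inside the array. */
static struct fown_demo slots[64];

/* Buggy form: the sizeof value is scaled by sizeof(struct fown_demo). */
ptrdiff_t step_bytes_buggy(void)
{
	return (char *)(slots + sizeof(struct perms_demo)) - (char *)slots;
}

/* Fixed form: "+ 1" already means "one whole element further". */
ptrdiff_t step_bytes_fixed(void)
{
	return (char *)(slots + 1) - (char *)slots;
}
```

Here the buggy variant moves 40 * 48 bytes, while the fixed one moves exactly one 48-byte element.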

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-23 13:59:59 +04:00
Andrey Vagin
07d9e11374 unix: fix double free on error paths
CID 1141011 (#1 of 1): Double free (USE_AFTER_FREE)
24. double_free: Calling "free(void *)" frees pointer "ue" which has
already been freed.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-23 13:59:23 +04:00
Andrey Vagin
9e3f4451e1 unix: add ability to set callbacks for external sockets (v5)
We don't know a state behind an external socket. It depends on logic
of the program, which handles this socket.

This patch adds ability to load a library with callbacks for dumping
and restoring external sockets.

This patch introduces two callbacks cr_plugin_dump_unix_sk and
cr_plugin_restore_unix_sk. If a callback can not handle a socket, it
must return -ENOTSUP.
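
The -ENOTSUP convention suggests a dispatch loop along these lines. This is a sketch under assumptions: the callback type, the `(fd, id)` argument list and the function names are illustrative and not taken from the patch; the loop simply moves on while callbacks decline the socket.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed callback shape for this illustration. */
typedef int (*dump_sk_cb)(int fd, int id);

/* Example callback that cannot handle the socket. */
static int cb_not_mine(int fd, int id)
{
	(void)fd;
	(void)id;
	return -ENOTSUP;
}

/* Example callback that handles the socket successfully. */
static int cb_handles(int fd, int id)
{
	(void)fd;
	(void)id;
	return 0;
}

/*
 * Try each callback in turn. -ENOTSUP means "not my socket, ask the
 * next one"; any other value (success or a real error) is final.
 */
int run_dump_callbacks(const dump_sk_cb cbs[], size_t nr, int fd, int id)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int ret = cbs[i](fd, id);

		if (ret != -ENOTSUP)
			return ret;
	}

	return -ENOTSUP;
}
```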

The main question is what kind of information should be transferred in
these callbacks. Please think about that for a few minutes and send me
your opinion.

v2: Use uflags instead of adding a new field
v3: clean up
v4: Unsuitable callbacks return -ENOTSUP.
v5: set USK_CALLBACK, if a socket was dumped by callback.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 15:28:41 +04:00
Andrey Vagin
afc61ba562 unux: postpone dumping sockets v2
Unix sockets are dumped when a peer socket is found.
We are going to dump external sockets with the help of plugins. For that we need
to set the USK_CALLBACK flag in the unix entry. Currently a socket is
dumped immediately when it's transferred, but we can be sure that a
socket is not external only when we have its peer.

v2: add comments in code
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 15:27:36 +04:00
Andrey Vagin
59162cccf7 sk-unix/dump: allocate UnixSkEntry dynamically
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 15:27:35 +04:00
Andrey Vagin
8f66bf9794 unix: link a socket to its peer
We are going to add callback-s for dumping external sockets.
All external sockets are added into unix_list, but for dumping we need
to know all peers.

And one more thing is that a socket is not closed until its peer has
been dumped. We are going to transfer the socket descriptor in the
callback.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-20 15:27:34 +04:00
Pavel Emelyanov
873d1dac9d unix: Move odev to kdev conversion into phys_stat_dev_match
This is more correct, as if st_dev == phys_dev check fails
we have to treat phys_dev as kdev for path resolve device
comparison.

However, this is not the case for non-btrfs FSs, and for the
latter one doesn't change anything as it uses anon devices
which are equal for kdev and odev cases.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-12 08:44:15 +04:00
Cyrill Gorcunov
1ba08ca664 mount: Extend phys_stat_dev_match to use path resolving instead of btrfs engine
Instead of scanning btrfs subvolumes (which can even be inaccessible
if the mount point lies on a directory instead of the subvolume itself) we use
the path resolving feature here -- once we need to figure out if some
device number needs to be altered up to the mount point (as we know, stat()
called on a subvolume returns the st_dev of the subvolume itself, not
the one associated with the superblock and shown in /proc/self/mountinfo
output).

This also implies that we need to check whether device numbers for ghost
files are to be updated to match mountinfo, thus we use the phys_stat_resolve_dev
helper here.

After this patch the previously merged btrfs engine is no longer needed
(at least it seems so) and can be dropped.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-11 16:05:22 +04:00
Cyrill Gorcunov
9e6bd8c512 sk-unix: Don't fail if socket path lays on btrfs volume
Because a socket might lie on a btrfs volume (thus the device
number reported by the diag module won't be the same as the one obtained
from stat(2)) we need to do an additional test and try
to resolve the device number with the help of the btrfs engine.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-04 19:23:33 +04:00
Cyrill Gorcunov
d7141750d2 sk-unix: Use pr_warn instead of pr_perror if socket path can't be stat'ed
The error message is rather confusing to people. In the worst case (if
it happens that we need this uncollected socket), criu will
print out a real error message later.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-28 15:38:52 +04:00
Cyrill Gorcunov
75ebb6d02c sk-unix: Print the name of sockets being dropped
Useful for bug hunting.

 | (00.005209) unix: Dropping path /mnt/disk1/new_subvol/criu/test/zdtm/live/static/sockets00.test
 | for unlinked bound sk 0x26.0x1d4e real 0x23.0x1d4e

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-28 15:36:33 +04:00
Andrey Vagin
4850fd94a8 crtools: move cr_options in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 18:17:52 +04:00
Andrey Vagin
1300cf4915 crtools: move all stuff about fdset in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 15:24:48 +04:00
Pavel Emelyanov
0a8a162146 unix: Increase verbosity of "can't dump this socket" check
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-10 11:17:42 +04:00
Pavel Emelyanov
cfe72ab77a service: Put service sk inode into separate variable
I'm about to get rid of service state struct.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 06:06:53 +04:00
Ruslan Kuprieiev
3f1aeb2c86 unix: SOCK_SEQPACKET
Everything in the sk-unix.c is ready for seq-packet sockets.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-30 18:44:57 +04:00
Pavel Emelyanov
4b6e1d6dc0 unix: Print sockets IDs in hex when collect skips them
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-25 11:40:40 +04:00
Pavel Emelyanov
3ebb368299 unix: Print "socket not found" error message on dump
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-25 11:39:54 +04:00
Pavel Emelyanov
5f47e0a67f service: Simplify dump-responce sending
We need 2 parameters only to form it properly.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-16 15:40:04 +04:00
Pavel Emelyanov
e866b7c043 rpc: Split rpc req-s from rpc-resps
Now we don't have a generic criu_msg thing -- instead, we have
an explicit request (with per-type args) and an explicit response
(yet again -- with per-type args).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-16 15:36:12 +04:00