pr_perror() is special: it adds \n at the end itself, so there is
no need to supply one.
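A minimal sketch of the idea, not criu's actual macro: format_perror() is a hypothetical stand-in and the "Error: ..." layout is an assumption. It only shows why the caller's message must not carry its own newline.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a perror-style logger. It formats
 * "Error: <msg>: <strerror(err)>\n" into buf and returns what
 * snprintf() returns. The caller passes msg WITHOUT a trailing
 * newline, because the helper appends one itself.
 */
int format_perror(char *buf, size_t size, const char *msg, int err)
{
	return snprintf(buf, size, "Error: %s: %s\n", msg, strerror(err));
}
```

Passing "Can't open file\n" to such a helper would put the newline in the middle of the line, before the errno description.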
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we restore sockets with relative names we change
the current working directory to the one provided by
the socket image data. This affects the current
criu state, because the rest of the code doesn't know
about such tricks and may rely on working dir
consistency.
So remember the current working dir and restore it
back once socket cwd operations are complete.
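A minimal sketch of the save/restore pattern, assuming the cwd is remembered as a directory descriptor. with_cwd() and cwd_equals() are illustrative names, not criu helpers.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/*
 * Run fn(arg) with the working directory temporarily switched to dir,
 * then switch back. Returns fn()'s result, or -1 if changing or
 * restoring the directory fails.
 */
int with_cwd(const char *dir, int (*fn)(void *), void *arg)
{
	int prev, ret;

	/* Remember the current working directory as a descriptor. */
	prev = open(".", O_RDONLY | O_DIRECTORY);
	if (prev < 0)
		return -1;

	if (chdir(dir) < 0) {
		close(prev);
		return -1;
	}

	ret = fn(arg);

	/* Restore the previous working directory even if fn() failed. */
	if (fchdir(prev) < 0)
		ret = -1;

	close(prev);
	return ret;
}

/* Example callback: returns 0 if the cwd equals the expected path. */
int cwd_equals(void *expected)
{
	char buf[PATH_MAX];

	if (!getcwd(buf, sizeof(buf)))
		return -1;
	return strcmp(buf, (const char *)expected) == 0 ? 0 : -1;
}
```

Holding a descriptor rather than a path string means the restore still works if the old directory is renamed in the meantime.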
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
because we are going to restore data of peer.
Anyway this is wrong, because we need to restore a message with a sender
address.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If a socket's cwd lies on a nested mount point
we might resolve its path incorrectly
(the mount_resolve_path helper should not be given
paths with a leading dot).
Thus pass a path without the leading dot for correct
name resolving.
Also add some error messages.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* Added functionality for dumping unnamed unix sockets.
When we call CRIU with the dump option, for an unnamed socket we
should pass its inode via --ext-unix-sk. Details about this problem
are described in http://criu.org/External_UNIX_socket#What_to_do_with_socketpair.28.29-s.3F.
Usage example:
criu dump -D images -o dump.log -v4 --ext-unix-sk=4529709 -t 13506
* Fixed a typo in log output.
Signed-off-by: Artem Kuzmitskiy <artem.kuzmitskiy@lge.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Unix sockets may be created with a non-absolute (relative) path
(when the kernel creates one it always uses AT_FDCWD for name resolving),
so when we collect sockets we see them as having names without a leading
slash.
In common cases the application doesn't change its
working directory after creating such a socket, but this is not always true.
So we need to invent some name resolver. A good candidate is the
IRMAP cache, but after a number of tests I found that it might
slow things down dramatically. Thus we need some more
intelligent way here.
For now, for common applications such as postfix, fetching the
dumpee's working directory and root is enough. So here is what we do
- when a socket gets collected from the diag interface we remember
  its relative name parameters (device and inode) but postpone
  name resolving to avoid a perf penalty until really needed
- when we meet a socket to dump with a relative name assigned we
  try $cwd/name and $root/name for this socket to check
  if it has been created in one of those directories. On success we
  simply remember the directory in the image, and on restore of such a
  socket we call the chdir helper to change the working dir and generate
  the relative name
v2:
- Use new unlink_stale to remove sockets we're to restore
- Use *at() helpers once we've changed working dir in bind_unix_sk
- Add more debug output
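The dump-side check described above can be sketched roughly as follows. resolve_rel_name() is a hypothetical helper, not the function added by this patch. It assumes the candidate directories (the dumpee's cwd and root) are already available as path strings.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Return the first directory in dirs[] under which "name" exists with
 * the expected device and inode, or NULL if none of them matches.
 */
const char *resolve_rel_name(const char *name, dev_t dev, ino_t ino,
			     const char *const dirs[], int nr_dirs)
{
	char path[PATH_MAX];
	struct stat st;
	int i, len;

	for (i = 0; i < nr_dirs; i++) {
		len = snprintf(path, sizeof(path), "%s/%s", dirs[i], name);
		if (len < 0 || len >= (int)sizeof(path))
			continue; /* candidate path too long, skip it */
		if (stat(path, &st) < 0)
			continue; /* no such name under this directory */
		if (st.st_dev == dev && st.st_ino == ino)
			return dirs[i];
	}

	return NULL;
}
```

The device and inode compared here stand for the values remembered at collect time. Only sockets with a relative name pay for the stat() calls.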
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It is going to be extended to support relative names.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In the case where there were multiple clients for a dgram socket, we were
restoring the queue for each client. Instead, we should pick one client to
restore the queue while the rest skip it.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We need to wait for listen() as well as bind() for internal unix sockets, or we
can race like this:
(00.135950) 1: Opening standalone socket (id 0xb ino 0x9422f peer 0)
(00.135974) 353: Error (sk-unix.c:701): Can't connect 0x947c4 socket: Connection refused
(00.136390) 1: Error (cr-restore.c:1228): 353 exited, status=1
(00.136407) 1: Putting 0x9422f into listen state
(where 0x9422f is the peer for 0x947c4)
This race was pretty rare for me, but I've run 1000 tests and it didn't
happen so hopefully this patch fixes it :)
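The refused connect in the log can be reproduced outside criu with a small sketch. The helper name is made up for illustration, and it only demonstrates the kernel behaviour, not the synchronization added by this patch.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Bind a stream socket to path and connect to it twice: once before
 * listen() and once after. *early_errno and *late_errno receive the
 * errno of each connect() (0 on success). Returns 0 if the experiment
 * ran, -1 on a setup error.
 */
int connect_before_and_after_listen(const char *path, int *early_errno,
				    int *late_errno)
{
	struct sockaddr_un addr;
	int srv = -1, c1 = -1, c2 = -1, ret = -1;

	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	srv = socket(AF_UNIX, SOCK_STREAM, 0);
	c1 = socket(AF_UNIX, SOCK_STREAM, 0);
	c2 = socket(AF_UNIX, SOCK_STREAM, 0);
	if (srv < 0 || c1 < 0 || c2 < 0)
		goto out;

	unlink(path); /* drop a stale name from a previous run, if any */
	if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	/* The peer is bound but not listening yet: the racy window. */
	*early_errno = connect(c1, (struct sockaddr *)&addr, sizeof(addr)) < 0 ? errno : 0;

	if (listen(srv, 1) == 0) {
		*late_errno = connect(c2, (struct sockaddr *)&addr, sizeof(addr)) < 0 ? errno : 0;
		ret = 0;
	}

	unlink(path);
out:
	if (srv >= 0)
		close(srv);
	if (c1 >= 0)
		close(c1);
	if (c2 >= 0)
		close(c2);
	return ret;
}
```

On Linux the first connect() is expected to fail with ECONNREFUSED, which is why waiting for bind() alone is not enough.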
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On restore we will use the peer's name to connect() the
socket back, so if there's no name dump should be aborted.
This situation happens when we create a socketpair(), fork
and dump only one task with one end of the pair.
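A small sketch of why such a peer cannot be reconnected by name. It uses getpeername() for illustration only; criu itself collects socket info differently, and peer_is_nameless() is a hypothetical helper.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Return 1 if the peer of fd has no name, 0 if it has one, -1 on error. */
int peer_is_nameless(int fd)
{
	struct sockaddr_un addr;
	socklen_t len = sizeof(addr);

	if (getpeername(fd, (struct sockaddr *)&addr, &len) < 0)
		return -1;

	/* An unnamed unix socket reports only the address family. */
	return len <= offsetof(struct sockaddr_un, sun_path);
}
```

Both ends of a socketpair() are unnamed, so there is nothing to pass to connect() on restore.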
Reported-by: Artem Kuzmitskiy <artem.kuzmitskiy@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to support user namespaces and uids will be converted
according to userns mappings.
v2: convert ids for sockets too
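For illustration, a sketch of id translation through a mapping laid out like /proc/<pid>/uid_map (inside id, outside id, count). The struct and function names are hypothetical. Which direction criu applies at dump and at restore time is not stated in this message.

```c
#include <stdint.h>

/*
 * One line of a uid_map-style table: ids [first, first + count) inside
 * the namespace correspond to [lower_first, lower_first + count) outside.
 */
struct id_extent {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Inside-namespace id -> outside id, or -1 if the id is not mapped. */
int64_t userns_id_to_outer(const struct id_extent *map, int nr, uint32_t id)
{
	int i;

	for (i = 0; i < nr; i++)
		if (id >= map[i].first && id - map[i].first < map[i].count)
			return (int64_t)map[i].lower_first + (id - map[i].first);
	return -1;
}

/* Outside id -> inside-namespace id, or -1 if the id is not mapped. */
int64_t userns_id_to_inner(const struct id_extent *map, int nr, uint32_t id)
{
	int i;

	for (i = 0; i < nr; i++)
		if (id >= map[i].lower_first && id - map[i].lower_first < map[i].count)
			return (int64_t)map[i].first + (id - map[i].lower_first);
	return -1;
}
```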
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since we're going to switch from int-fd-s to class-image
soon the fdset name will not fit into the new terminology.
This patch is
sed -e 's/fdset/imgset/g' -i *
sed -e 's/imgset_fd/img_from_set/g' -i *
git mv include/fdset.h include/imgset.h
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Here we define a new API to be used in plugins.
- A plugin should provide a descriptor with the help of the
  CR_PLUGIN_REGISTER macro or, if the plugin requires
  no init/exit functions, with CR_PLUGIN_REGISTER_DUMMY.
- A plugin should define a plugin hook with the help of the
  CR_PLUGIN_REGISTER_HOOK macro.
- The init/exit functions of plugins now take a @stage
  argument which tells the plugin at which stage of criu
  (dump or restore) it is being called. The exit function
  also takes @ret, which lets the plugin know if
  something went wrong and it needs to clean up
  its own resources.
The idea behind this is to not limit plugin authors in the names
of the functions they might need to use for a particular hook.
This new API deprecates the old plugin structure, but to keep
backward compatibility we will provide a tiny layer of
additional code to support old plugins for at least a couple
of release cycles.
For example, a trivial plugin might look like
| #include <sys/types.h>
| #include <sys/stat.h>
| #include <fcntl.h>
| #include <libgen.h>
| #include <errno.h>
|
| #include <sys/socket.h>
| #include <linux/un.h>
|
| #include <stdio.h>
| #include <stdlib.h>
| #include <string.h>
| #include <unistd.h>
|
| #include "criu-plugin.h"
| #include "criu-log.h"
|
| static int dump_ext_file(int fd, int id)
| {
| 	pr_info("dump_ext_file: fd %d id %d\n", fd, id);
| 	return 0;
| }
|
| CR_PLUGIN_REGISTER_DUMMY("trivial")
| CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__DUMP_EXT_FILE, dump_ext_file)
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Nowadays this routine is mainly used for getting an
fd, rather than keeping one for future reference.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch removes the global mntinfo_tree and collect_mount_info where
it was constructed. The mntinfo list is filled from dump_mnt_ns,
rst_collect_local_mntns, collect_mnt_namespaces and read_mnt_ns_img.
A mountinfo entry contains a reference to the proper ns_id entry, so
we can use mnt_id to look up the proper mount namespace.
v2: remove trash after rebasing.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's already used for dumping files and it will be used for restoring,
so it should be a service fd to avoid clashing with restored
descriptors.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When stat() on a socket name fails with ENOENT, this
means that the socket is bound and unlinked. Don't
drop the whole socket info, just treat it as nameless.
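The bound-and-unlinked situation is easy to reproduce. The helper below is only a sketch with a made-up name. It shows the stat() result such a socket produces.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Bind a unix socket to path, unlink the name while the socket is
 * still open, then stat() the name. Returns the errno reported by
 * stat() (0 if it unexpectedly succeeds), or -1 if the setup fails.
 */
int stat_errno_after_unlink(const char *path)
{
	struct sockaddr_un addr;
	struct stat st;
	int sk, ret;

	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	sk = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (sk < 0)
		return -1;

	unlink(path); /* drop a stale name from a previous run, if any */
	if (bind(sk, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(sk);
		return -1;
	}

	unlink(path); /* the socket stays bound, but its name is gone */

	ret = stat(path, &st) < 0 ? errno : 0;
	close(sk);
	return ret;
}
```

The socket is still alive and usable through its descriptor, so losing the name is not a reason to drop it.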
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 1141016 (#1 of 1): Extra sizeof expression (SIZEOF_MISMATCH)
suspicious_pointer_arithmetic: Adding "40UL /* sizeof (FilePermsEntry)
*/" to pointer "(FownEntry *)perms" of type "FownEntry *" is suspicious
because adding an integral value to this pointer automatically scales
that value by the size, 48 bytes, of the pointed-to type, "FownEntry".
Most likely, "sizeof (FilePermsEntry)" is extraneous and should be
replaced with 1.
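The scaling Coverity complains about can be seen in a stand-alone sketch. The two structs merely mimic the 40- and 48-byte sizes from the report and are not the real protobuf types.

```c
#include <stddef.h>

struct perms_like { char pad[40]; }; /* stands in for the 40-byte type */
struct fown_like { char pad[48]; };  /* stands in for the 48-byte pointed-to type */

/* Big enough that both pointer computations stay inside the array. */
static struct fown_like pool[64];

/* Bytes skipped by the suspicious form: ptr + sizeof(other type). */
ptrdiff_t bytes_skipped_suspicious(void)
{
	struct fown_like *p = pool;

	/* Advances 40 elements of 48 bytes each, not 40 bytes. */
	return (char *)(p + sizeof(struct perms_like)) - (char *)p;
}

/* Bytes skipped when the addend is 1, as Coverity suggests. */
ptrdiff_t bytes_skipped_by_one(void)
{
	struct fown_like *p = pool;

	return (char *)(p + 1) - (char *)p;
}
```

With these sizes the suspicious form moves 1920 bytes past the object, while adding 1 moves exactly one element.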
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We don't know the state behind an external socket. It depends on the logic
of the program that handles this socket.
This patch adds the ability to load a library with callbacks for dumping
and restoring external sockets.
This patch introduces two callbacks, cr_plugin_dump_unix_sk and
cr_plugin_restore_unix_sk. If a callback cannot handle a socket, it
must return -ENOTSUP.
The main question is what kind of information should be transferred in
these callbacks. Please think about that for a few minutes and send me
your opinion.
v2: Use uflags instead of adding a new field
v3: clean up
v4: Unsuitable callbacks return -ENOTSUP.
v5: set USK_CALLBACK, if a socket was dumped by callback.
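A rough sketch of how a caller could treat the -ENOTSUP convention. The callback type, the dispatcher and the two toy callbacks are all hypothetical and do not reproduce criu's plugin loader.

```c
#include <errno.h>

/*
 * Hypothetical callback type: returns 0 if the socket was handled,
 * -ENOTSUP if this callback does not recognise it, and any other
 * negative value on a real error.
 */
typedef int (*unix_sk_cb_t)(int fd, int id);

/* Try callbacks in order until one handles the socket or fails hard. */
int try_unix_sk_callbacks(const unix_sk_cb_t *cbs, int nr, int fd, int id)
{
	int i, ret;

	for (i = 0; i < nr; i++) {
		ret = cbs[i](fd, id);
		if (ret != -ENOTSUP)
			return ret; /* handled (0) or hard error (< 0) */
	}

	return -ENOTSUP; /* nobody claimed this socket */
}

/* Toy callbacks used only to exercise the dispatcher. */
int cb_only_even_ids(int fd, int id)
{
	(void)fd;
	return (id % 2 == 0) ? 0 : -ENOTSUP;
}

int cb_only_id_7(int fd, int id)
{
	(void)fd;
	return (id == 7) ? 0 : -ENOTSUP;
}
```

A distinct "not mine" code lets several libraries coexist without one failing the dump for sockets it knows nothing about.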
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Unix sockets are dumped when a peer socket is found.
We are going to dump external sockets with the help of plugins. For that we need
to set the USK_CALLBACK flag in the unix entry. Currently a socket is
dumped immediately when it's transferred, but we can be sure that a
socket is not external only when we have its peer.
v2: add comments in code
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to add callbacks for dumping external sockets.
All external sockets are added into unix_list, but for dumping we need
to know all peers.
One more thing is that a socket is not closed until its peer has
been dumped. We are going to pass the socket descriptor to the
callback.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is more correct: if the st_dev == phys_dev check fails
we have to treat phys_dev as kdev for the path resolve device
comparison.
However, this is not the case for non-btrfs FSs, and for the
latter one doesn't change anything as it uses anon devices
which are equal for kdev and odev cases.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Instead of scanning btrfs subvolumes (which can even be inaccessible
if the mount point lies on a directory instead of the subvolume itself) we use
the path resolving feature here -- once we need to figure out if some
device number needs to be altered up to the mount point (as we know, stat()
called on a subvolume returns st_dev for the subvolume itself, not the
one associated with the superblock and shown in /proc/self/mountinfo
output).
This also implies that we need to check if device numbers for ghost
files are to be updated to match mountinfo, thus we use the phys_stat_resolve_dev
helper here.
After this patch the previously merged btrfs engine is no longer needed
(at least it seems so) and can be dropped.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Because a socket might lie on a btrfs volume (thus the device
number reported by the diag module won't be the same as the one obtained
from stat(2)) we need to do an additional test and try
to resolve the device number with the help of the btrfs engine.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The error message confuses people. In the worst case (if
it turns out that we need this uncollected socket), criu will
print out a real error message later.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Everything in the sk-unix.c is ready for seq-packet sockets.
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now we don't have a generic criu_msg thing -- instead, we have
an explicit request (with per-type args) and an explicit response
(yet again -- with per-type args).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>