mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 12:57:57 +00:00

1696 Commits

Author SHA1 Message Date
Andrey Vagin
5a37481914 restore: get real pid for each task (v2)
For the root task the clone syscall returns the pid in criu's pidns,
but for other processes the clone syscall returns PID in the restored
namespace.

The /proc/self link contains the PID value of the current process, so if
we want to determine the PID in criu's pidns, we should use criu's
/proc.

v2: readlink() does not append a null byte to buf, so we must do that
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:22:58 +04:00
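The readlink() detail from the v2 note above can be sketched as follows. This is a minimal illustration and not the code from the commit: the helper name read_self_pid and the idea of passing the /proc mount path as a string are assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the PID that the <proc_dir>/self symlink
 * points to, i.e. the caller's PID as seen by that /proc instance.
 * Returns -1 if the link cannot be read.
 */
static pid_t read_self_pid(const char *proc_dir)
{
	char path[256];
	char buf[32];
	ssize_t len;

	snprintf(path, sizeof(path), "%s/self", proc_dir);

	/* Leave one byte free: readlink() does not null-terminate buf. */
	len = readlink(path, buf, sizeof(buf) - 1);
	if (len < 0)
		return -1;
	buf[len] = '\0';

	return (pid_t)atoi(buf);
}
```

Calling read_self_pid() with the path of a /proc mounted in a different pid namespace would give the PID as that namespace sees it, which is the property the commit message relies on.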
Ruslan Kuprieiev
4eb2872b27 v2 crtools: write pidfile, when service/page server is run as daemon and "--pidfile" is set
When the service/page server becomes a daemon, we may need to know its pid.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-16 15:45:01 +04:00
Pavel Emelyanov
5f47e0a67f service: Simplify dump-response sending
We need 2 parameters only to form it properly.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-16 15:40:04 +04:00
Pavel Emelyanov
e866b7c043 rpc: Split rpc req-s from rpc-resps
Now we don't have the generic criu_msg thing -- instead, we have
an explicit request (with per-type args) and an explicit response
(yet again -- with per-type args).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-16 15:36:12 +04:00
Ruslan Kuprieiev
d74073a593 unix: Handle service socket on dump and restore
Service connection is actually an 'external' one from unix sockets engine POV,
but we don't want to dump it as such. Thus, we explicitly find one and dump it
as half-closed connection. On restore we push an artificial message into it
to report to the program that the dump-request was served, but the program is
restored.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-13 15:53:20 +04:00
Ruslan Kuprieiev
8fddfd2ff4 crtools: Add cr_service meat
The need in service is described at http://criu.org/Self_dump

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-13 15:53:06 +04:00
Ruslan Kuprieiev
eb79300dfd crtools: initial skeleton for cr_service
The criu service is a daemon, that opens a unix socket and listens for
incoming requests. The requests will be declared in protobuf/rpc.proto
and for now will only contain the 'dump' request.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-13 15:50:39 +04:00
Andrey Vagin
b67afac623 util: remove the unused service fd constant
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-12 18:37:08 +04:00
Pavel Emelyanov
9a6bc0fdab unix: Add comments about icons and external sockets
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-10 12:24:11 +04:00
Pavel Emelyanov
b4c09afd43 files: Add more comments about shared files dump/restore
This is not trivial codeflow, let's document it till we
remember what and why it does.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-10 12:13:36 +04:00
Pavel Emelyanov
7a16734ed5 rst: Formalize the shared resource restore order
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-29 08:07:15 +04:00
Andrey Vagin
02d8a72bb5 mounts: find mounts, which are propagated from a current one (v2)
A few sentences that are required for understanding this patch:

2a) A shared mount can be replicated to as many mountpoints and all the
replicas continue to be exactly same.
2b) A slave mount is like a shared mount except that mount and umount
events only propagate towards it.
2c) A private mount does not forward or receive propagation.

All the rules are in Documentation/filesystems/sharedsubtree.txt

If it's a first mount in a group, all group members should be
bind-mounted from this one.

Each mount propagates to all members of the parent's group. The group can
contain a few slaves.

Mounts, which have propagated to slaves, are unmounted, because we can't
be sure, that they propagated in real life. For example:

mount --bind --make-slave /share /slave1
mount --bind --make-slave /share /slave2
mount /share/test
umount /slave2/test
mount --make-share /slave1/test
mount --bind --make-share /slave1/test /slave2/test

41 40 0:33 / /share rw,relatime shared:28 - tmpfs xxx rw
42 40 0:33 / /slave1 rw,relatime master:28 - tmpfs xxx rw
43 40 0:33 / /slave2 rw,relatime master:28 - tmpfs xxx rw
44 41 0:34 / /share/test rw,relatime shared:29 - tmpfs xxx rw
46 42 0:34 / /slave1/test rw,relatime shared:30 master:29 - tmpfs xxx rw
45 43 0:34 / /slave2/test rw,relatime shared:30 master:29 - tmpfs xxx rw

/slave1/test and /slave2/test depend on each other, and at least one of them
doesn't propagate from /share/test

v2: use false and true for bool

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 15:01:30 +04:00
Andrey Vagin
e01072c05a mounts: if a mount can't be mounted, it is queued in postpone list (v4)
Keep trying to restore mounts while the postpone list isn't empty, and check
that each iteration makes some progress; otherwise fail, to
prevent infinite loops.

v2: rework logic about postpone list
    add more comments

v3: one more attempt to make it more readable
v4: Here is a master class from Pavel how to write self-documented code.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 15:01:28 +04:00
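The retry-with-progress-check idea described in the commit above can be sketched like this. It is a simplified model, not the commit's code: struct mnt, can_mount() and mount_all() are hypothetical names, and "mounting" is reduced to setting a flag once the mount's single dependency is done.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified mount record. */
struct mnt {
	int parent;	/* index of the mount that must be done first, -1 if none */
	bool done;	/* set once the mount has been "mounted" */
};

static bool can_mount(const struct mnt *m, const struct mnt *all)
{
	return m->parent < 0 || all[m->parent].done;
}

/*
 * Retry postponed mounts until none are left.
 * Returns 0 on success, -1 if a whole pass makes no progress
 * (which would otherwise loop forever).
 */
static int mount_all(struct mnt *mnts, size_t n)
{
	size_t pending = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (!mnts[i].done)
			pending++;

	while (pending) {
		size_t progress = 0;

		for (i = 0; i < n; i++) {
			if (mnts[i].done || !can_mount(&mnts[i], mnts))
				continue;	/* stays postponed for the next pass */
			mnts[i].done = true;
			progress++;
		}

		if (!progress)
			return -1;	/* unresolvable dependency */
		pending -= progress;
	}

	return 0;
}
```

The progress counter is what bounds the loop: every successful pass reduces the number of pending mounts by at least one, and a pass that changes nothing is reported as an error.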
Andrey Vagin
d00e7c6f88 mount: link dependent mounts (v3)
All shared mounts from one group are connected into a circular list.
All slaves are added to the proper master's list.

v2: change variable name and fix a bug about adding shared mounts in a
circular list.
v3: handle errors of collect_shared

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 15:01:26 +04:00
Andrew Vagin
5bf25d36c0 files: declare fd_params->pos as off_t
Currently pos has type unsigned long, so its size depends on the
architecture. pos is saved as a 64-bit value in the image file and it
isn't restored if it is equal to -1. Due to the conversion, on 32-bit
platforms -1 is converted into UINT_MAX and we get an error on restore.

$ zdtm.sh ns/static/tun
...
(00.398513)      5: Error (files-reg.c:534): Can't restore file pos: Illegal seek
(00.398888)      5: Error (files-reg.c:489): Can't open file /dev/net/tun: Illegal seek
...
id: 0x15 flags: 0x2 pos: 0x000000ffffffff fown: { uid: 0 euid: 0 signum: 0 pid_type: 0 pid: 0 }  name: "/dev/net/tun"

crtools is compiled with _FILE_OFFSET_BITS=64, so off_t is always 64-bit.

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-29 14:55:37 +04:00
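The widening problem described in the commit above can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not crtools code: uint32_t stands in for unsigned long on a 32-bit platform, int64_t stands in for a 64-bit off_t, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Old behaviour on a 32-bit platform (simulated): pos passes through a
 * 32-bit unsigned variable before being stored in the 64-bit image field.
 */
static uint64_t save_pos_u32(int64_t pos)
{
	uint32_t narrow = (uint32_t)pos;	/* -1 becomes 0xffffffff */

	return (uint64_t)narrow;		/* zero-extended, not sign-extended */
}

/* Fixed behaviour: pos stays 64-bit wide all the way. */
static uint64_t save_pos_i64(int64_t pos)
{
	return (uint64_t)pos;			/* -1 becomes 0xffffffffffffffff */
}

/* The restore side skips the seek only when the saved value is -1. */
static int pos_is_unset(uint64_t saved)
{
	return saved == (uint64_t)-1;
}
```

With the narrow type the saved value is 0xffffffff, the "unset" check fails, and restore attempts a seek on a file that doesn't support it, which matches the "Illegal seek" errors in the log above.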
Cyrill Gorcunov
bc002e8537 Add strlcpy helper
Same as the kernel provides, adapted from Linux sources.

strlcpy is similar to strncpy but _always_ adds \0
at the end of string even if destination is shorter.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-28 19:06:43 +04:00
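The difference from strncpy mentioned in the commit above can be shown with a small sketch. The function is named my_strlcpy here (a hypothetical name, chosen to avoid clashing with a libc that already ships strlcpy); the body follows the well-known kernel-style semantics and is not copied from the commit.

```c
#include <string.h>

/*
 * Copy src into dest, which is size bytes long. The result is always
 * null-terminated when size > 0, truncating src if necessary.
 * Returns strlen(src), so the caller can detect truncation.
 */
static size_t my_strlcpy(char *dest, const char *src, size_t size)
{
	size_t ret = strlen(src);

	if (size) {
		size_t len = (ret >= size) ? size - 1 : ret;

		memcpy(dest, src, len);
		dest[len] = '\0';
	}

	return ret;
}
```

A return value greater than or equal to size tells the caller that the string did not fit and was truncated.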
Pavel Emelyanov
633e559274 show: Remove dead code
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-24 04:03:31 +04:00
Pavel Emelyanov
b18fb09eb9 show: Replace one-line show_foo calls with args array
We have generic do_pb_show() call and tons of show_foo
routines, that just call one with proper args. Compact
the code by putting the args into array and calling
the do_pb_show() in one place.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-24 04:00:32 +04:00
Pavel Emelyanov
84737e2796 build: Generate most of the pb-desc automatically
These contain linkage between number, data type and routines
for pb messages we write/read to/from image files. Most of them
have simple number-type-routines mapping, so introduce a generating
script for that.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 21:47:31 +04:00
Pavel Emelyanov
c2b7800740 check: Add tun support
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 19:18:47 +04:00
Pavel Emelyanov
1ac6d76cbd tun: Restore tun files and tun links
This thing is pretty straightforward -- on netns creation
populate it with tun-s, after this collect tun files, open
and attach them with regular fd-s engine.

One tricky thing -- when populating namespace with tun links
make them all persistent and drop this flag (if required)
later, when the first alive opened appears.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 19:10:15 +04:00
Pavel Emelyanov
a3e53658f7 tun: Dump tun files and tun links
The major issue with dump is -- some info is obtained via netlink,
some via sysfs and some (!) via opened and attached tun file.
But the latter cannot be created, if there's another one attached
(or the mq device is full with threads).

Thus we have to dump this info via existing tun file and keep one
in memory till the link dump code takes place.

Opposite situation is also possible -- we can have a persistent
unattached device. In this case we have to attach to it, dump
things and detach back.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 19:02:55 +04:00
Pavel Emelyanov
9615c3b96b tun: Initial skeleton for tun support
There will be two entities handled:

1. tun file -- an opened char device with misc major and tun minor
   that can be attached to item #2

2. tun netdevice -- another type of links

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 18:57:40 +04:00
Pavel Emelyanov
92cc20c07c net: Ability to restore existing link's params with rtm
TUN devices are created with ioctl, but their parameters (e.g.
flags with state, mtu, etc.) are to be restored with generic
RTM_SETLINK message.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 18:46:48 +04:00
Pavel Emelyanov
4869da1781 net: Read ns' sysfs file helper
Just a small helper, that reads string from ns' sysfs mount.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 16:14:37 +04:00
Pavel Emelyanov
c7afbae598 net: Prepare to dump netdev entry with extensions
Some (most) network devices would like to have NetDeviceEntry with
more fields than are currently present (and enough for lo and veth).
Prepare for that by allowing them to define their own callback that
would fill the rest of the pb entry and call write_netdev_img().

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-23 16:08:46 +04:00
Pavel Emelyanov
60e6d38868 collect: Shorten common images collecting code
Now that we have a set of cinfo-s, it's possible to collect all
this stuff in a plain for-loop.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-21 03:52:18 +04:00
Pavel Emelyanov
64e7d2435a collect: Reduce amount of args to collect_image call
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-21 03:27:06 +04:00
Pavel Emelyanov
9917c4fe34 rst: Compact file-descs collects a bit
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-21 01:06:58 +04:00
Andrey Vagin
ef3ca3a104 restore: do not kill processes if not-all of them have been created
If processes are restored without pidns, criu knows their pid-s from images,
but some of those tasks may not have forked yet, and thus the pids may not
exist or (!) may be used by other processes.

To address that we abort stages RESTORE_NS and FORKING without killing tasks,
by writing STATE_FAIL into the task_entries->start futex and making
the tasks check that. Since during the RESTORE_NS and FORKING stages tasks can
only block on the mentioned futex, we can safely do it.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-20 15:17:30 +04:00
Cyrill Gorcunov
71f7f7546c atomic: Use atomic_read instead of atomic_get
To switch to kernel's style.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-16 19:37:06 +04:00
Cyrill Gorcunov
aea8a605f3 atomic -- Switch to linux kernel templates
Use the same code as provided in the kernel. In the first place
we used our own prototypes for the sake of simplicity (they
were all based on the "lock xadd" instruction). There is
no more need for that and we can switch to the well-known
kernel API.

Because kernel uses plain int type to carry atomic
counters I had to add explicit u32 type for futexes,
as well as a couple of fixes for new api usage.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-16 19:37:03 +04:00
Pavel Emelyanov
01f113ecd3 rst: Remove threads restore serialization
This thing was introduced by 01f8f8f4 to help not mixing
per-thread error messages in log files. Now messages are
not mixed by other means, so this thing is useless.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-12 09:17:02 +04:00
Pavel Emelyanov
9b45833b81 stats: Account total time to restore
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 21:25:42 +04:00
Pavel Emelyanov
2df39a4b47 stats: Account for time to fork tasks on restore
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 21:22:40 +04:00
Pavel Emelyanov
c65f068489 stats: Add timing stats for restore
This will only work if timings are reported by a single
task. Collecting them from several tasks is to be done.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 21:15:58 +04:00
Pavel Emelyanov
8ff15e5c41 util: Make set_proc_mountpoint static
And rename it to better reflect what it does.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 20:02:33 +04:00
Pavel Emelyanov
e99576f655 rst: Collect stats about checked-vs-cowed pages
On restore we compare pages' contents with memcmp to check which
of them can remain shared. Report this info in restore stats.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 13:36:24 +04:00
Pavel Emelyanov
9bb545011c stats: Introduce counters for restore
These are atomic_add-s on shmalloc-ed stats.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 13:24:13 +04:00
Pavel Emelyanov
d77a05b6dc stats: Rename existing timing and cnt counters into dump_... ones
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 13:07:21 +04:00
Pavel Emelyanov
6ac4870181 stats: Prepare for collecting restore stats
Restore stats are difficult -- we have to collect them from several
tasks and thus existing plain variables would not work. We'll need
shared memory with stats, so prepare for allocating one.

Other than this -- put call to write_stats() where appropriate for
restore.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 13:00:45 +04:00
Pavel Emelyanov
7f9302505c page-server: Convert opts.addr into char *
We'll have --address argument reused for library, so make this
abstract address, not ipv4 one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-11 12:01:14 +04:00
Pavel Emelyanov
f61eb82f8b files: Convert fdescs hash from list-s into hlist-s
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-10 16:41:18 +04:00
Andrey Vagin
d5540a2de5 page-server: check that all data have been accepted
Currently criu sends data to the page server, but it doesn't get any
feedback, so it can't be sure that all data have been accepted.

This patch adds a flush command, which requires an answer from the page
server. This command is sent before disconnecting.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-09 15:39:37 +04:00
Pavel Emelyanov
9cef1a00ce mount: Factor out detached mountpoint opening
The difficulty is that this code is required in both -- pie
and non-pie contexts.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-29 13:12:00 +04:00
Pavel Emelyanov
72ec39f10c util: Rename pie's util-net.c into util.c
Will put more things there.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-29 12:43:30 +04:00
Pavel Emelyanov
987de2de05 parasite: Rename ack-waiting function to look better
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-17 08:56:17 +04:00
Pavel Emelyanov
f2978d2ac7 parasite: Reshuffle sync and async daemon-node executing routines
The version, that might not wait for ack is always called with
"async" flag set. Cleanup things according to this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-17 08:54:24 +04:00
Pavel Emelyanov
d20089fb32 dump: Merge thread and task parasite-core info dumping
Some of the dumped parts are common to thread and task,
and this stuff is represented by the parasite_dump_thread structure.

Merge this into parasite_dump_misc and factor out the dumping code.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-16 19:47:52 +04:00
Andrey Vagin
f3f72cd7f8 parasite: use only one command for executing parasite in a daemon mode
If a kernel supports PTRACE_SETSIGMASK, criu doesn't need to execute
PARASITE_CMD_INIT and PARASITE_CMD_DAEMONIZE, because the first command
is used only for blocking signals. If criu crashes between these
commands, the process state will be corrupted, because all signals remain
blocked.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-16 17:24:18 +04:00