mirror of https://github.com/checkpoint-restore/criu synced 2025-08-31 06:15:24 +00:00
Commit Graph

129 Commits

Author SHA1 Message Date
Alexander Kartashov
5c01178279 protobuf: replace the constant 4294967295 with 0xFFFFFFFF in generated sources
This works around a compilation warning on ARM:

	packet-sock.pb-c.c: In function 'packet_sock_entry__init':
	packet-sock.pb-c.c:98:3: error: this decimal constant is unsigned only in ISO C90 [-Werror]
	packet-sock.pb-c.c: At top level:
	packet-sock.pb-c.c:318:1: error: this decimal constant is unsigned only in ISO C90 [-Werror]

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-16 19:41:01 +04:00
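
The constant replacement described in 5c01178279 can be pictured as a text pass over the generated `*.pb-c.c` files. The following is only a minimal sketch: the function name and the sample text are made up for illustration, and how the project actually applies the replacement in its build is not shown in this log.

```python
import re

# Match the bare decimal literal only, not digits embedded in a longer
# number or identifier.
UINT32_MAX_DECIMAL = re.compile(r"(?<![\w.])4294967295(?![\w.])")


def patch_generated_source(text: str) -> str:
    """Rewrite the decimal UINT32_MAX literal as a hexadecimal one."""
    return UINT32_MAX_DECIMAL.sub("0xFFFFFFFF", text)


# Hypothetical fragment of generated code, for illustration only.
sample = "  4294967295, /* default */\n  142949672950,\n"
patched = patch_generated_source(sample)
```

The lookarounds keep the pass from touching longer numbers that merely contain the same digits.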
Cyrill Gorcunov
eb8f8c12cd fsnotify: fanotify -- Group objects in image
As Pavel proposed, we can refine the fanotify image objects by
squeezing the common part into a separate entry. The objects
are now grouped as

enum mark_type {
	INODE	= 1;
	MOUNT	= 2;
}

message fanotify_inode_mark_entry {
	required uint64		i_ino		= 1;
	required fh_entry	f_handle	= 2;
}

message fanotify_mount_mark_entry {
	required uint32		mnt_id		= 1;
}

message fanotify_mark_entry {
	required uint32		id		= 1;
	required mark_type	type		= 2;

	required uint32		mflags		= 3;
	required uint32		mask		= 4;
	required uint32		ignored_mask	= 5;
	required uint32		s_dev		= 6;

	optional fanotify_inode_mark_entry ie	= 7;
	optional fanotify_mount_mark_entry me	= 8;
}

This required some tuning in the fdinfo parsing and in the
fsnotify code itself, but the result looks good to me.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 23:17:57 +04:00
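
The grouping in eb8f8c12cd puts the common fields in one entry and selects an inode or mount sub-entry by `type`. A rough model of that shape is sketched below. The Python class names and the `validate` rule (exactly the sub-entry matching `type` is present) are assumptions made for illustration, not code from the project.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class MarkType(IntEnum):
    INODE = 1
    MOUNT = 2


@dataclass
class InodeMark:
    i_ino: int
    f_handle: bytes  # stand-in for the fh_entry message


@dataclass
class MountMark:
    mnt_id: int


@dataclass
class FanotifyMark:
    id: int
    type: MarkType
    mflags: int
    mask: int
    ignored_mask: int
    s_dev: int
    ie: Optional[InodeMark] = None
    me: Optional[MountMark] = None


def validate(mark: FanotifyMark) -> bool:
    """Assumed rule: only the sub-entry matching `type` may be set."""
    if mark.type == MarkType.INODE:
        return mark.ie is not None and mark.me is None
    if mark.type == MarkType.MOUNT:
        return mark.me is not None and mark.ie is None
    return False
```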
Cyrill Gorcunov
f385d32da5 image: fanotify -- Add scaffold code for fanotify objects
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 18:34:32 +04:00
Cyrill Gorcunov
4629ecd391 protobuf: fsnotify -- Add fanotify entries
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 18:34:31 +04:00
Cyrill Gorcunov
b724096f0f fsnotify: Rename inotify files to fsnotify
We will be handling both inotify and fanotify
objects here, so to reduce confusion rename
the files to fsnotify.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 18:34:26 +04:00
Andrey Vagin
d50c786c7e files: dump fdinfo per files_id instead of pid
A few processes can share one fdtable.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-12 00:54:44 +04:00
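
The point of d50c786c7e is that tasks sharing one descriptor table need that table dumped only once. A minimal sketch of the grouping follows, assuming each task is described by a hypothetical `(pid, files_id)` pair; the helper names are invented for illustration.

```python
from collections import defaultdict


def group_by_files_id(tasks):
    """tasks: iterable of (pid, files_id) pairs; returns files_id -> [pids]."""
    groups = defaultdict(list)
    for pid, files_id in tasks:
        groups[files_id].append(pid)
    return dict(groups)


def plan_fdinfo_dumps(tasks):
    """Pick one representative pid per shared descriptor table."""
    return {fid: pids[0] for fid, pids in group_by_files_id(tasks).items()}
```

For example, `plan_fdinfo_dumps([(10, 1), (11, 1), (12, 2)])` selects pid 10 for table 1 and pid 12 for table 2.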
Pavel Emelyanov
a82cb23f98 pipe: Dump and restore pipe size
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-11 17:45:54 +04:00
Pavel Emelyanov
748be83181 cr: Support rlimits
Dump them with the "new" prlimit syscall, which works on an arbitrary pid.

Restore is done in restorer _after_ mappings mixup and _before_
caps drop to make it set any max value.

The RLIM_INFINITY is handled explicitly to help future 64<->32
bits migration.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-10 20:08:38 +04:00
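
The explicit RLIM_INFINITY handling mentioned in 748be83181 can be illustrated by mapping the host's "no limit" value to a fixed marker in the image. This is a sketch under assumptions: the all-ones 64-bit marker and the function names are chosen here for illustration and are not taken from the project. `resource.prlimit` is Python's wrapper for the prlimit syscall and is available on Linux only.

```python
import resource

# Assumed image encoding: a fixed all-ones 64-bit value means "no limit",
# regardless of how RLIM_INFINITY is represented on the dumping host.
IMAGE_INFINITY = 0xFFFFFFFFFFFFFFFF


def encode_limit(value: int) -> int:
    return IMAGE_INFINITY if value == resource.RLIM_INFINITY else value


def decode_limit(value: int) -> int:
    return resource.RLIM_INFINITY if value == IMAGE_INFINITY else value


def dump_rlimit(pid: int, res: int):
    """Read (soft, hard) for an arbitrary pid; needs Linux and privileges."""
    soft, hard = resource.prlimit(pid, res)
    return encode_limit(soft), encode_limit(hard)
```

Keeping the marker independent of the host's value is what would let an image move between 32-bit and 64-bit hosts without "unlimited" turning into a finite number.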
Pavel Emelyanov
d703f8260e fs: Support umask dump/restore
This one is bound to task's fs info (with cwd and root)
thus put it in the fs.img file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-10 12:48:31 +03:00
Cyrill Gorcunov
c1f7ab2150 checkpoint: Add dumping of FPU state
The dumping of FPU state is done with help of ptrace
facility. There are two cases which we need to handle
depending on which features are available on host machine

1) The dump via ptrace(PTRACE_GETFPREGS ...)

   In this case the kernel will use fxsave approach
   inside the kernel and provides us back the data
   encoded in i387_fxsave_struct format.

2) The dump via ptrace(PTRACE_GETREGSET ...)

   In this case the kernel will use xsave approach
   inside the kernel and provides us back the data
   encoded in xsave_struct format.

In any case we decode data and save it in protobuf format.
This is why core.proto file has been extended to keep new
entries.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-21 17:35:39 +04:00
Cyrill Gorcunov
1256c390b6 dump: Drop FPU padding allocation
Actually it was never used, just drop it.
Because of backward compatibility problem we
can't just zap it in protofile.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-21 17:35:37 +04:00
Cyrill Gorcunov
7eb33a76a8 ghost-files: Save device and inode in image
Because we need to look up ghost files from the
inotify code, where we only have device/inode
as a key, we save dev/ino in the ghost image entry.

Note we use in-kernel format for device to be
consistent with inotify and mount related
code base.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-06 11:11:10 +04:00
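
The (device, inode) key from 7eb33a76a8 can be sketched as follows. The in-kernel device number packs the major above 20 minor bits. The `GhostIndex` class and its method names are hypothetical and exist only to show a lookup by that key.

```python
KDEV_MINOR_BITS = 20  # in-kernel dev_t layout: (major << 20) | minor


def kernel_dev(major: int, minor: int) -> int:
    """Build a device number in the in-kernel format."""
    return (major << KDEV_MINOR_BITS) | minor


class GhostIndex:
    """Hypothetical lookup table keyed by (kernel-format device, inode)."""

    def __init__(self):
        self._by_key = {}

    def add(self, dev: int, ino: int, ghost_id: int) -> None:
        self._by_key[(dev, ino)] = ghost_id

    def lookup(self, dev: int, ino: int):
        # Returns None when no ghost file was recorded for this key.
        return self._by_key.get((dev, ino))
```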
Pavel Emelyanov
f43c1c2ade sk: Rework bound-dev dump/restore according to new API
The SO_BINDTODEVICE getter is changed in the kernel (before
official release) to report not index, but name to be in
harmony with setter.

Fix crtools accordingly.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-28 20:24:42 +03:00
Stanislav Kinsbursky
760e595f09 make: cleanup the whole infrastructure a bit
Main things:
1) Variables are defined properly (":=" or ":=" instead of "+"). Otherwise,
because we call nested makefiles, variables such as CFLAGS
inherit their previous state.
2) SYS-OBJ renamed to SYSCALL-LIB.
3) Include of Makefile.inc removed from protobuf/Makefile

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Looks-good-to: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-22 18:51:41 +04:00
Cyrill Gorcunov
4be701794e protobuf: Add @blk_sigset to thread_core_entry
It will hold the blocked signals for threads.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-12 17:42:47 +04:00
Pavel Emelyanov
d3397e8f8c sk: Support socket filters
One thing to note: the socket filter proggie is a set of structs
with 8- and 16-bit values in them. Protobuf doesn't support such fields,
and it's too annoying to mess with yet another message for that.
Instead, I encode all this stuff into an array of fixed64 fields to
handle endianness (yes, protobuf handles it, but each field is not
just a 64-bit value, but a structure).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-01 19:20:33 +03:00
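
Packing a filter instruction into one 64-bit value, as d3397e8f8c describes, might look like the sketch below. A `struct sock_filter` holds a 16-bit `code`, 8-bit `jt` and `jf`, and a 32-bit `k`. The bit positions used here are an assumption chosen for illustration; the layout the project actually uses is not stated in the commit message.

```python
def pack_insn(code: int, jt: int, jf: int, k: int) -> int:
    """Pack one instruction into an unsigned 64-bit integer.

    Assumed layout: code in bits 48-63, jt in 40-47, jf in 32-39, k in 0-31.
    """
    return (
        ((code & 0xFFFF) << 48)
        | ((jt & 0xFF) << 40)
        | ((jf & 0xFF) << 32)
        | (k & 0xFFFFFFFF)
    )


def unpack_insn(value: int):
    """Inverse of pack_insn; returns (code, jt, jf, k)."""
    return (
        (value >> 48) & 0xFFFF,
        (value >> 40) & 0xFF,
        (value >> 32) & 0xFF,
        value & 0xFFFFFFFF,
    )
```

Because the value is handled as one integer, its byte order on disk is left to the serializer, which is the property the commit message is after.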
Cyrill Gorcunov
cc6af3898e memory: Add parsing of VmFlags
The kernel now supports providing VMA flags via smaps
interface so add parsing of them.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-26 00:16:05 +04:00
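
The smaps output carries the flags as a `VmFlags:` line of two-letter mnemonics. A minimal parsing sketch is shown below; the function name is invented here, and the project's parser will differ.

```python
def parse_vmflags(line: str):
    """Return the set of mnemonics from a 'VmFlags:' line, or None otherwise."""
    key, sep, rest = line.partition(":")
    if not sep or key.strip() != "VmFlags":
        return None
    return set(rest.split())
```

For a line such as `VmFlags: rd wr mr mw me ac sd` this yields the seven mnemonics as a set, and any other smaps line yields `None`.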
Pavel Emelyanov
8471081244 unix: Add support for shutdown sockets
Get the info from kernel diag message (it should always be there)
and restore the shutdown at the very end.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-24 18:39:54 +04:00
Pavel Emelyanov
b42f8fa1ac sk: Support SO_BINDTODEVICE option
The kernel SO_BINDTODEVICE option is not symmetrical --
set requires a device name, but get reports an index. Thus
we need an index-to-name resolver.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-19 17:36:44 +04:00
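
The index-to-name resolution that b42f8fa1ac calls for can be sketched with the standard library's `socket.if_indextoname`. The wrapper name and its treatment of index 0 as "not bound" are illustrative choices, not code from the project.

```python
import socket


def bound_device_name(index: int):
    """Resolve an interface index to its name; None if unbound or unknown."""
    if index == 0:  # 0 means the socket is not bound to any device
        return None
    try:
        return socket.if_indextoname(index)
    except OSError:
        # No interface with this index exists in the current namespace.
        return None
```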
Pavel Emelyanov
aa731ee1d7 core: Support task scheduler policies and priorities
No magic here, just fetch info using getpriority and sched_getxxx calls.
Good news is that the mentioned syscalls take pid as argument and do work
with it, i.e. -- no need in parasite help here.

Restore is split into prep -- copy sched bits from image on restorer
args -- and the restore itself. It's done to avoid restoring tasks info
with IDLE priority ;) To make restorer not-fail sched bits are validated
for sanity on prep stage.

Minimal sanity test is also there.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-17 00:23:25 +04:00
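
Fetching the scheduler bits by pid and validating them at the prep stage, as aa731ee1d7 describes, could look roughly like this. The dictionary shape and the exact sanity rules are assumptions made for the sketch. The policy numbers are the Linux values, and `fetch_sched` works on Linux only.

```python
import os

# Linux scheduling policy numbers.
SCHED_OTHER, SCHED_FIFO, SCHED_RR, SCHED_BATCH, SCHED_IDLE = 0, 1, 2, 3, 5


def fetch_sched(pid: int) -> dict:
    """Collect policy, RT priority and nice value for pid (Linux only)."""
    return {
        "policy": os.sched_getscheduler(pid),
        "prio": os.sched_getparam(pid).sched_priority,
        "nice": os.getpriority(os.PRIO_PROCESS, pid),
    }


def sched_is_sane(entry: dict) -> bool:
    """Assumed prep-stage validation of values read from an image."""
    policy, prio, nice = entry["policy"], entry["prio"], entry["nice"]
    if not -20 <= nice <= 19:
        return False
    if policy in (SCHED_FIFO, SCHED_RR):
        return 1 <= prio <= 99  # real-time policies need a priority
    if policy in (SCHED_OTHER, SCHED_BATCH, SCHED_IDLE):
        return prio == 0  # non-real-time policies use priority 0
    return False
```

Rejecting bad values before the restorer runs keeps the failure in a place where it can still be reported cleanly.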
Pavel Emelyanov
293eca3127 sk: Support SO_NO_CHECK option
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-15 15:45:57 +04:00
Pavel Emelyanov
ed745fa9cf sk: Support SO_DONTROUTE option
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-12 20:32:06 +04:00
Pavel Emelyanov
5e1a9f840c sk: Support SO_PASSSEC and SO_PASSCRED options
There's some bug in show fn for options :( Will be fixed later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-12 14:01:18 +04:00
Pavel Emelyanov
b53d6d90b7 sk: Support SO_MARK socket option
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-12 13:48:18 +04:00
Pavel Emelyanov
9929d2efdf sock: Handle rcvlowat and priority options
And write a test for them.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-11 22:02:25 +04:00
Pavel Emelyanov
f429de662e creds: Support supplementary groups
Dumping them is performed via parasite, since calling the getgroups
is the only way of getting the complete list. Currently the nr of
groups to dump is limited explicitly with the size of shared memory
between crtools and parasite. This is MUCH more than we have seen
on real apps so far.

Restoring is done early, before restorer blob not to carry the undefined
array of groups in there. This is OK, since groups do not affect us at
that point and are not affected by subsequent creds restore.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-11 17:07:02 +04:00
Andrey Vagin
d7d600c127 tcp: save and restore rcv_wscale (v2)
rcv_wscale is a symmetric parameter with snd_wscale.

Both of these parameters are set during the connection handshake.

Without this value a remote window size can't be interpreted correctly,
because a value from a packet should be shifted by rcv_wscale.

This patch doesn't break backward compatibility; a rcv window
will be restored with the same bug (rcv_wscale = 0).

v2: Update to a new kernel interface:
	[PATCH] tcp: restore rcv_wscale in a repair mode (v2)

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-19 16:17:11 +04:00
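
The shift mentioned in d7d600c127 is plain arithmetic: the 16-bit window field carried in a segment is scaled by the shift count negotiated at the handshake. A small sketch, with an illustrative function name:

```python
def effective_window(raw_window: int, wscale: int) -> int:
    """Scale the 16-bit window field from a TCP header by the negotiated shift."""
    return raw_window << wscale
```

With a raw field of 1024 and a scale of 7 the window is 1024 * 2^7 = 131072 bytes. With the scale lost (treated as 0) the same field would be read as 1024 bytes.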
Cyrill Gorcunov
978cecd629 tty: Make termios and winsize being optional params
The dangling slave peers might have no data associated
with them if the master peer is closed and the link is hanging
up. Thus make these parameters optional so as not to bloat the
image with data which will never be used.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-14 17:50:41 +04:00
Cyrill Gorcunov
89a7a45d37 tty: Add checkpoint/restore for unix terminals v6
Usually the PTYs represent a pair of links -- master peer and slave
peer. Master peer must be opened before slave. Internally, when kernel
creates master peer it also generates a slave interface in a form of
/dev/pts/N, where N is that named pty "index". Master/slave connection
unambiguously identified by this index.

Still, one master can carry multiple slaves -- for example a user opens
one master via /dev/ptmx and appropriate /dev/pts/N in sequence.
The result will be the following

master
`- slave 1
`- slave 2

Both slaves will have the same master index but different file descriptors.
Still, inside the kernel the pty parameters are the same for both slaves. Thus
only one slave's parameters should be restored; there is no need to carry
all parameters for every slave peer we've found.

Not yet addressed problems:

- At moment of restore the master peer might be already closed for
  any reason so to resolve such problem we need to open a fake master
  peer with proper index and hook a slave on it, then we close
  master peer.

- Need to figure out how to deal with ttys which have some
  data in buffers not yet flushed, at moment this data will
  be simply lost during c/r

- Need to restore control terminals

- Need to fetch tty flags such as exclusive/packet-mode,
  this can't be done without kernel patching

[ avagin@:
   - ideas on control terminals restore
   - overall code redesign and simplification
]

v4:
 - drop redundant pid from dump_chrdev
 - make sure optional fown is passed on regular ptys
 - add comments about zeroifying termios
 - get rid of redundant empty line in files.c

v5 (by avagin@):
 - complete rework of tty image format, now we have
   two files -- tty.img and tty-info.img. The idea
   behind this is to reduce the data being stored.

v6 (by xemul@):
 - packet mode should be set to true in image,
   until properly fetched from the kernel
 - verify image data on retrieval

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-12 20:00:54 +04:00
Cyrill Gorcunov
3234308d62 make: Fix build bug introduced in 726a1180
In 726a1180 we made protobuf library to depend
on *.ch which is good thing but a bit incomplete.
We need a rule to generate headers if they are
missing.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-03 13:26:14 +04:00
Cyrill Gorcunov
726a1180aa make: Make protobuf target to depend on *.ch
Our general source code depends on headers
generated during protobuf library building
but if library is already built and *.ch
files are removed we might hit a problem
where dep files can't be generated.

Thus add explicit rule pointing out that
library depends on generated *.ch files.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-02 00:59:25 +04:00
Cyrill Gorcunov
40688d1945 make: Use PROTO_ prefix for protobuf targets
It's easier to handle things if we know that names
in makefiles are never intersected.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-02 00:59:05 +04:00
Cyrill Gorcunov
d88eebc9e0 make: Move common definitions to Makefile.inc
To eliminate code duplication.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-02 00:58:37 +04:00
Andrey Vagin
a74605a78d sk-inet: restore option REUSEADDR (v2)
All sockets are created with SO_REUSEADDR, it's needed for restoring.
E.g.: A listen socket is created after a connected socket. Both of them
are bound to one port.

So SO_REUSEADDR should be restored, when all sockets on a port were created.

This code creates a structure for each port of one type of sockets
and accounts a number of sockets, which are not restored yet.

Sockets have a hook post_open(), in which a socket waits until all sockets for
a given port have been created and then restores SO_REUSEADDR.

struct port contains a type (udp, tcp, etc) and a port number.
It doesn't contain family or addr, because that would add a lot of extra logic
without bringing a significant benefit.

v2: fix according with comments from Pavel

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-20 17:50:08 +04:00
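
The per-port accounting from a74605a78d can be modelled as a counter keyed by (type, port) that must reach zero before SO_REUSEADDR is put back to its dumped value. The class and method names below are hypothetical, and the waiting that the real post_open() hook performs is reduced to a yes/no query.

```python
class PortTracker:
    """Hypothetical per-(type, port) counter of sockets not yet created."""

    def __init__(self):
        self._pending = {}

    def expect(self, sk_type: str, port: int) -> None:
        """Register one more socket that will be created on this port."""
        key = (sk_type, port)
        self._pending[key] = self._pending.get(key, 0) + 1

    def created(self, sk_type: str, port: int) -> None:
        """Record that one registered socket now exists."""
        self._pending[(sk_type, port)] -= 1

    def may_restore_reuseaddr(self, sk_type: str, port: int) -> bool:
        """True once every socket registered for the port has been created."""
        return self._pending.get((sk_type, port), 0) == 0
```

As in the commit message, the key deliberately leaves out family and address.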
Pavel Emelyanov
2660b810d9 packet: Rings support
There's no way (currently) to check that the ring got restored.
Will do it once we implement mapping of a packet socket and
tcpdump app test.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-20 15:32:55 +04:00
Pavel Emelyanov
4ee3345beb packet: Support fanout
This one may be present and may be not, thus it's optional in the image.
The C-binding we use reports the field absence in the parsed stream via
the has_xxx field, but in the google docs it's stated, that

	"When a message is parsed, if it does not contain an optional
	 element, the corresponding field in the parsed object is set
	 to the default value for that field."

Thus, I also declare the default value for it to be not zero as 0 is
a valid fanout configuration.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-20 15:32:27 +04:00
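
The reasoning in 4ee3345beb (0 is a valid fanout value, so the default must be something else) can be shown with a small model of optional-field parsing. The `0xFFFFFFFF` default and the dictionary-based entry are assumptions for this sketch. The commit message only says that the default is non-zero.

```python
NO_FANOUT = 0xFFFFFFFF  # assumed default; the source only says "not zero"


def read_fanout(entry: dict) -> int:
    """Mimic optional-field parsing: an absent field yields its default."""
    return entry.get("fanout", NO_FANOUT)


def has_fanout(entry: dict) -> bool:
    """A fanout is configured only if the value differs from the default."""
    return read_fanout(entry) != NO_FANOUT
```

With a zero default, an entry carrying `fanout = 0` and an entry without the field would be indistinguishable by value alone.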
Pavel Emelyanov
69acf64f57 packet: Add support for mclists
The implementation is rather straightforward. One thing to note
is that non-single membership of each type is not supported. It
can be done, but I'm unaware of any software doing so.

Note: the pb show routine should be tuned to support showing bytes.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-15 21:04:49 +04:00
Pavel Emelyanov
7e3463a855 packet: Add PACKET_COPY_THRESH into dump/restore
No test for it, sorry :( There's no easy way to check it works.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-15 19:41:58 +04:00
Pavel Emelyanov
fb57cd126e proto: Add comments describing why we need two IDs for unix and inet sockets
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-15 17:45:40 +04:00
Andrey Vagin
acf73093df sk-inet: save the socket option IPV6_V6ONLY (v2)
Most services (ssh, httpd, ...) create two separate sockets,
one for ipv4 and one for ipv6. If IPV6_V6ONLY isn't dumped, bind() returns
EADDRINUSE.

v2: use do_dump_opt and initialize yes = 1

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-14 12:46:43 +04:00
Cyrill Gorcunov
097d73a101 dump: Add futex robust list dumping v3
This patch introduces ThreadCoreEntry protobuf structure which is to carry
thread-specific arch-independent information.

Now put there the c/r futex robust lists.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-10 20:28:59 +04:00
Pavel Emelyanov
4ae4c4acc9 net: Dump veth device
These devices can be distinguished by type ETHER and kind "veth".
Some problems with peer detection exist (described in comment), but
we cannot handle them at the moment.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-10 17:24:11 +04:00
Pavel Emelyanov
7f1c9af0f8 vma: State that vma->fd is -1 constant in the image
This field was lost while switching to protobuf -- the vma images
were used by parasite as plain array and it was easier to reserve
this space in the image. Now it's too late to change this, so make
it be -1 always.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-10 10:17:50 +04:00
Pavel Emelyanov
7422176427 packetsk: Add loss and timestamp sockoptions
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 19:23:31 +04:00
Pavel Emelyanov
d29feb9103 packetsk: Add support for auxdata, origdev and vnethdr bits
These are boolean in reality.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 18:13:02 +04:00
Pavel Emelyanov
3ef8d138ab packetsk: Support PACKET_RESERVE option
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 17:38:15 +04:00
Pavel Emelyanov
1259a9ad80 packetsk: Support PACKET_VERSION sockoption
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 17:34:40 +04:00
Pavel Emelyanov
fc7071d05e net: Packet sockets basic support
Support only basic packet socket functionality -- create and bind.
This should be enough to start testing dhclient inside container.
Other stuff (filter, mmaps, fanouts, etc.) will come later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 16:17:41 +04:00
Pavel Emelyanov
0401664144 signalfd: Add protobuf descriptions
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-02 12:24:27 +04:00
Pavel Emelyanov
4943eb43fd netns: Basic link dump, restore and show
Only support the lo device. This is not final yet (much more
stuff is to be handled for a link) but is rather a skeleton
showing how to do it and letting us check the LXC container
early.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-02 08:17:27 +04:00