This workarounds a compilation warning on ARM:
packet-sock.pb-c.c: In function 'packet_sock_entry__init':
packet-sock.pb-c.c:98:3: error: this decimal constant is unsigned only in ISO C90 [-Werror]
packet-sock.pb-c.c: At top level:
packet-sock.pb-c.c:318:1: error: this decimal constant is unsigned only in ISO C90 [-Werror]
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will be handling both inotify and fanotify
objects here thus to make less confusion rename
the files to fsnotify.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dump the with "new" prlimit syscall that works on arbitrary pid.
Restore is done in restorer _after_ mappings mixup and _before_
caps drop to make it set any max value.
The RLIM_INFINITY is handled explicitly to help future 64<->32
bits migration.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The dumping of FPU state is done with help of ptrace
facility. There are two cases which we need to handle
depending on which features are available on host machine
1) The dump via ptrace(PTRACE_GETFPREGS ...)
In this case the kernel will use fxsave approach
inside the kenrel and provides us back the data
encoded in i387_fxsave_struct format.
2) The dump via ptrace(PTRACE_GETREGSET ...)
In this case the kernel will use xsave approach
inside the kernel and provides us back the data
encoded in xsave_struct format.
In any case we decode data and save it in protobuf format.
This is why core.proto file has been extended to keep new
entries.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Actually it was never used, just drop it.
Because of backward compatibility problem we
can't just zap it in protofile.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Because we need to lookup for ghost files from
inotify system where we only have device/inode
as a key, we save dev/ino in ghost image entry.
Note we use in-kernel format for device to be
consistent with inotify and mount related
code base.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The SO_BINDTODEVICE getter is changed in the kernel (before
official release) to report not index, but name to be in
harmony with setter.
Fix crtools accordingly.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Main things:
1) Variables are defined properly (":=" or ":=" instead of "+"). Otherwise,
because we call nested makefiles, and such variables like CFLAGS are
inheriting it's previous state.
2) SYS-OBJ renamed to SYSCALL-LIB.
3) Inlcude of Makefile.inc removed from protobuf/Makefile
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Looks-good-to: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
One thing to note. The socket filter proggie is a set of struct-s
wuth 8 and 16 bits values in it. Protobuf doesn't support such thing
and it's too annoying to mess with yet another message for that.
Instead, I encode all this stuff into array of fixed64 fields to
handle endianity (yes, protobuf handles it, but each field is not
just 64-bit value, but a structure).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The kernel now supports providing VMA flags via smaps
interface so add pasting of them.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Get the info from kernel diag message (it should always be there)
and restore the shutdown at the very end.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The kernel SO_BINDTODEVICE option is not symmetrical --
set required device name, but get reports index. Thus
need the index to name resolver.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
No magic here, just fetch info using getpriority and sched_getxxx calls.
Good news is that the mentioned syscalls take pid as argument and do work
with it, i.e. -- no need in parasite help here.
Restore is splitted into prep -- copy sched bits from image on restorer
args -- and the restore itself. It's done to avoid restoring tasks info
with IDLE priority ;) To make restorer not-fail sched bits are validated
for sanity on prep stage.
Minimal sanity test is also there.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dumping them is performed via parasite, since calling the getgroups
is the only way of getting the complete list. Currently the nr of
groups to dump is limited explicitly with the size of shared memory
between crtools and parasite. This is MUCH more that we have seen
on real apps so far.
Restoring is done early, before restorer blob not to carry the undefined
array of grpous in there. This is OK, since groups do not affect us at
that point and are not affected by subsequent creds restore.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
rcv_wscale is a symetric parameter with snd_wscale.
Both this parameters are set on a connection handshake.
Without this value a remote window size can't be interpreted correctly,
because a value from a packet should be shifted on rcv_wscale.
This patch doesn't break a back compatibility, a rcv window
will be restored with the same bug (rcv_wscale = 0).
v2: Update to a new kernel interface:
[PATCH] tcp: restore rcv_wscale in a repair mode (v2)
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The dangling slave peers might have no data associate
with them if master peer is closed and link is hanging
up. Thus make this parameters optional to not blow the
image with data which never will be used.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Usually the PTYs represent a pair of links -- master peer and slave
peer. Master peer must be opened before slave. Internally, when kernel
creates master peer it also generates a slave interface in a form of
/dev/pts/N, where N is that named pty "index". Master/slave connection
unambiguously identified by this index.
Still, one master can carry multiple slaves -- for example a user opens
one master via /dev/ptmx and appropriate /dev/pts/N in sequence.
The result will be the following
master
`- slave 1
`- slave 2
both slave will have same master index but different file descriptors.
Still inside the kernel pty parameters are same for both slaves. Thus
only one slave parameters should be restored, there is no need to carry
all parameters for every slave peer we've found.
Not yet addressed problems:
- At moment of restore the master peer might be already closed for
any reason so to resolve such problem we need to open a fake master
peer with proper index and hook a slave on it, then we close
master peer.
- Need to figure out how to deal with ttys which have some
data in buffers not yet flushed, at moment this data will
be simply lost during c/r
- Need to restore control terminals
- Need to fetch tty flags such as exclusive/packet-mode,
this can't be done without kernel patching
[ avagin@:
- ideas on contol terminals restore
- overall code redesign and simplification
]
v4:
- drop redundant pid from dump_chrdev
- make sure optional fown is passed on regular ptys
- add a comments about zeroifying termios
- get rid of redundant empty line in files.c
v5 (by avagin@):
- complete rework of tty image format, now we have
two files -- tty.img and tty-info.img. The idea
behind to reduce data being stored.
v6 (by xemul@):
- packet mode should be set to true in image,
until properly fetched from the kernel
- verify image data on retrieval
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In 726a1180 we made protobuf library to depend
on *.ch which is good thing but a bit incomplete.
We need a rule to generate headers if they are
missed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Our general source code depends on headers
generated during protobuf library building
but if library is already built and *.ch
files are removed we might hit a problem
where dep files can't be generated.
Thus add explicit rule pointing out that
library depends on generated *.ch files.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's easier to handle things if we know that names
in makefiles are never intersected.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
All sockets are created with SO_REUSEADDR, it's needed for restoring.
E.g.: A listen socket is created after a connected socket. Both of them
are binded to one port.
So SO_REUSEADDR should be restored, when all sockets on a port were created.
This code creates a structure for each port of one type of sockets
and accounts a number of sockets, which are not restored yet.
Sockets have a hook post_open(), in which it waits when all sockets for
a defined port would be created and then it will restore SO_REUSEADDR.
struct port contains a type (udp, tcp, etc) and a port number.
It doesn't contain family or addr, because it's extra loads of logic,
which doesn't bring a significant profits.
v2: fix according with comments from Pavel
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There's no way (currently) to check that the ring got restored.
Will do it once we implement mapping of a packet socket and
tcpdump app test.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This one may be present and may be not, thus it's optional in the image.
The C-binding we use report the field absense in the parsed stream via
the has_xxx field, but in the google docs it's stated, that
"When a message is parsed, if it does not contain an optional
element, the corresponding field in the parsed object is set
to the default value for that field."
Thus, I also declare the default value for it to be not zero as 0 is
a valid fanout configuration.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The implementation is rather straightforward. One thing to note
is that non-single membership of each type is not supported. It
can be done, but I'm unaware of any software doing so.
Note: the pb show routine should be tuned to support showing bytes.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Most part of services (ssh, httpd, ...) create two separate sockets
one for ipv4 and one for ipv6. If IPV6_V6ONLY isn't dumped, bind() returns
EADDRINUSE
v2: use do_dump_opt and initialize yes = 1
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch introduces ThreadCoreEntry protobuf structure which is to carry
thread-specific arch-independent information.
Now put there the c/r futex robust lists.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These devices can be distinguished by type ETHER and kind "veth".
Some problems with peer detection exists (described in comment), but
we cannot handle them at the moment.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This field was lost while switching to protobuf -- the vma images
were used by parasite as plain array and it was easier to reseve
this space in the image. Now it's too late to change this, so make
it be -1 always.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Support only basic packet socket functionality -- create and bind.
This should be enough to start testing dhclient inside container.
Other stuff (filter, mmaps, fanouts, etc.) will come later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Only support the lo device. This is not final yet (much more
stuff is to be handled for a link) but is rather a skeleton
showing how to do it and letting us check the LXC container
early.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>