All sockets are created with SO_REUSEADDR, it's needed for restoring.
E.g.: A listen socket is created after a connected socket. Both of them
are binded to one port.
So SO_REUSEADDR should be restored, when all sockets on a port were created.
This code creates a structure for each port of one type of sockets
and accounts a number of sockets, which are not restored yet.
Sockets have a hook post_open(), in which it waits when all sockets for
a defined port would be created and then it will restore SO_REUSEADDR.
struct port contains a type (udp, tcp, etc) and a port number.
It doesn't contain family or addr, because it's extra loads of logic,
which doesn't bring a significant profits.
v2: fix according with comments from Pavel
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There's no way (currently) to check that the ring got restored.
Will do it once we implement mapping of a packet socket and
tcpdump app test.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This one may be present and may be not, thus it's optional in the image.
The C-binding we use report the field absense in the parsed stream via
the has_xxx field, but in the google docs it's stated, that
"When a message is parsed, if it does not contain an optional
element, the corresponding field in the parsed object is set
to the default value for that field."
Thus, I also declare the default value for it to be not zero as 0 is
a valid fanout configuration.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The implementation is rather straightforward. One thing to note
is that non-single membership of each type is not supported. It
can be done, but I'm unaware of any software doing so.
Note: the pb show routine should be tuned to support showing bytes.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We hash socket with inode and family, but search only by inode. Well,
yes, family should match, since all sockets are in one sockfs in kernel,
but let's make sure _we_ did things right by checking the family as well.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's in net-next already and does provide all we currently
need (and more). Implementation is like for inet and unix
sockets.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There is no need in 6 receive callbacks -- we can find out family and protocol
(and thus -- type) out of the inet_diag_req_v2 passed through transparent arg
of nlk request engine.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
All sockets should be collected in a target net name-space when the -n net
is specified.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Support only basic packet socket functionality -- create and bind.
This should be enough to start testing dhclient inside container.
Other stuff (filter, mmaps, fanouts, etc.) will come later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The function that does socket collecting is actually a rtnl
request sending one. It will be usefull for netns dump/restore,
so move it to generic place.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We've switched to SkOptsEntry, no need to carry this
obsolete one.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch prepares the ground for further patches.
sk_opts_entry get converted to PB format but not
yet used anywhere in code. Also a few additional
pb_ helpers are added.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2:
- Use do_dump_opt in dump_socket to not call
lookup_socket redundantly
v3:
- use getsockopt to check that unconnected
socket has no peername and it's not listening
- do test only socket protocol since family and
type will be called in dump_one_inet_fd.
v4:
- Move proto tests to can_dump_inet_sk
- Use SOL_TCP/TCP_INFO to gather info about tcp sockets
v5:
- hash unconnected sockets to run BUG_ON(already-dumped) logic
- no default value for sk->wqlen, use zero from xzalloc
- drop bogus xfree on error
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The socket_desc is being looked up in dump_unix/dump_inet. In
the dump_socket this desc is only required to obtain a family,
which in turn can be done with a getsockopt.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Use fdtype_ops facility to c/r inet sockets.
v2:
- Use BUG_ON if socket is attempted to be dumped
several times
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Use fdtype_ops facility to c/r unix sockets.
v2
- BUG_ON added in dump_one_unix_fd if socket
is already dumped since we never should dump
same socket several times
- The order of restore remains as it was before,
the lookup is done via socket inode numbers
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The task is not complete - this is just a part of what have to be done. I.e.
looks like a lot of excessive deps can be fixed.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Those sitting on the SOL_SOCKET level are common to different
socket families and will be handled in a generic code.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
First of all -- to make crtools dump/restore established tcp sockets
you have to specify the --tcp-established options. By doing so you
inform crtools that
a) you know, that after dump there will be a netfilter rules blocking
the dumped connections
b) you guarantee, that before restore this netfilter block is still
there
What else this patch does is simple -- collects establised sockets and
calls the dump/restore tcp function (now empty) where appropriate.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Funny, but after this git thinks, that I've renamed the sockets.c
file into sk-unix.c one and fixed it a little bit %)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Collect all unix sockets in a list while dumping to check
whether we've missed something early (at the restore time
it can be already late for roll-back).
Plus, fix the checks for external stream sockets.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is common, that opened fd fix its fowner and flags. Make
a cuntion for this. Those that obtain fd with open() don't need
to restore flags though.
A thought -- do we need yet another abstraction between fdinfo and
type-d files?
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The conn job misses fowner restoratio. Other places are ok, but toss
the code for easier next patching.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
printf() has standard flag for adding "0x" prefix before hexadecimal integer.
MOD='[-+]\?[0-9]*.\?[0-9]*[lL]*'
git grep -l "0x%${MOD}x" | xargs -n1 sed -e "s/0x%\(${MOD}\)x/%#\1x/g" -i
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Easier to read.
[ xemul: There's a silent change in how sk buffer is read in -- before
the patch there was a static buffer for data, now this thing is
xrealloc-ed ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if dgram socket peer is not connected back
we can try to resolve peer by name.
For security reason this happens only if '-x' option
is passed at checkpoint and restore time.
In particular this is needed for programs which do
use dgram socket to send messages to /dev/log.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Command below was executed several times:
sed 's/\(pr_.*[^%,x,X]\)\(\%[0-9,l,L]*x\)/\10x\2/g' -i *.c
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some ino/peers are printed with %d, some with %x,
get rid of it and print all in %x.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is trivial. Just copy-n-paste pieces from the udp dumping
and restoring code. The zdtm test is also a compy of the _udp
test with respecitve changes.
Reuired for httpd, this one uses udplite sockets.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if unix socket was not found don't call for container_of.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>