mirror of https://github.com/checkpoint-restore/criu synced 2025-08-22 18:07:57 +00:00

53 Commits

Author SHA1 Message Date
Andrey Vagin
98efb3c904 tcp: restore the boundary between sent and unsent data
All data in a write buffer can be divided into two parts: data that has
been sent but not yet acknowledged, and unsent data.

Currently the boundary between sent and unsent data is not dumped and
all the data are restored as if they have already been sent.
This method can provoke long delays in a TCP connection, because the
kernel can wait before retransmitting data.
https://bugzilla.openvz.org/show_bug.cgi?id=2808

The TCP stack must know which data have been sent, because
acknowledgment can be received for them. These data must be restored in
repair mode.

The second part of the data has never been sent out, so it can be
restored without any tricks. This data can be sent into the socket as
usual.

To restore unsent data, repair mode is disabled for the socket and
re-enabled after the data is restored. It is disabled again after the
network is unlocked. In this case a window probe is sent, which is
required for waking the connection.

This patch fixes long delays in tcp connections after dumping and
restoring.

Thanks Pavel for the idea of disabling repair mode for restoring
unsent data.

https://bugzilla.openvz.org/show_bug.cgi?id=2808

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-14 17:15:44 +04:00
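The split described in this commit can be sketched as plain arithmetic. This is a minimal illustration, not the code from the patch: `struct sendq_split` and `split_send_queue` are hypothetical names, and the sketch assumes the dump recorded both the total send-queue length and the unsent length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of a dumped TCP write queue. */
struct sendq_split {
	size_t sent_len;   /* sent, not yet acknowledged: restore in repair mode */
	size_t unsent_len; /* never sent: write normally with repair mode off */
};

/*
 * Unsent bytes sit at the tail of the queue, so the boundary lies
 * outq_len - unsent_len bytes from the start of the dumped buffer.
 */
static struct sendq_split split_send_queue(size_t outq_len, size_t unsent_len)
{
	struct sendq_split s;

	assert(unsent_len <= outq_len);
	s.sent_len = outq_len - unsent_len;
	s.unsent_len = unsent_len;
	return s;
}
```

A restorer would then push the first `sent_len` bytes while the socket is in repair mode and the remaining `unsent_len` bytes with repair mode switched off.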
Andrey Vagin
d917ff15bb tcp: save the amount of unsent data in the socket send queue
TCP send queue contains two types of data:
* unacknowledged data, which have been sent out
* data, which have not been sent

Currently only the total size of the data is saved, which is not enough
to restore TCP connections properly.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-14 17:14:31 +04:00
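One way to obtain both sizes on Linux is the pair of ioctls `SIOCOUTQ` (all bytes still in the send queue) and `SIOCOUTQNSD` (bytes not yet sent). The sketch below is an assumption about how such a query could look, not a copy of the patch; `struct sendq_sizes` and `read_sendq_sizes` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/sockios.h>
#include <netinet/in.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical container for the two send-queue sizes. */
struct sendq_sizes {
	int total;  /* unacknowledged + unsent bytes (SIOCOUTQ) */
	int unsent; /* bytes that have not been sent yet (SIOCOUTQNSD) */
};

/* Query both sizes for a TCP socket; returns 0 on success, -1 on error. */
static int read_sendq_sizes(int sk, struct sendq_sizes *out)
{
	assert(out != NULL);

	if (ioctl(sk, SIOCOUTQ, &out->total) < 0)
		return -1;
	if (ioctl(sk, SIOCOUTQNSD, &out->unsent) < 0)
		return -1;
	return 0;
}
```

The difference `total - unsent` is then the amount of data that was sent but not yet acknowledged.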
Andrey Vagin
dd407dd04e hdrs: minor cleanup
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-07 15:13:50 +04:00
Andrey Vagin
4850fd94a8 crtools: move cr_options in a separate header
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 18:17:52 +04:00
Andrey Vagin
1a0ee90d2b tcp: disable repair mode for sockets on rollback (v2)
Currently, if a network namespace is dumped and something fails, sockets
remain in repair mode. This is because cpt_unlock_tcp_connections is
executed only if the network namespace is not dumped.

cpt_unlock_tcp_connections disables repair mode for sockets and drops
netfilter rules. Netfilter rules are not used in the case of network
namespaces.

v2: don't execute network-unlock scripts if the network namespace is not
    dumped.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-31 20:12:55 +04:00
Cyrill Gorcunov
664659a0ad inet: tcp -- Find size of max memory allowed to restore TCP data
The maximum size the kernel allows for sending TCP data on restore
varies depending on how much memory is installed on the system;
moreover, the memory allocated for the "read queue" is bigger than that
used for the "write queue". Thus, when we have checkpointed a big slab of
data, we need to figure out which size is allowed for sending data on restore.

For this we read /proc/sys/net/ipv4/tcp_[wmem|rmem] on restore and calculate
the size needed, then we simply chop the data into segments and send them
in a loop.

Typical output on restore is something like

 | (00.013001)  30110: TCP queue memory limits are 2097152:3145728

https://bugzilla.openvz.org/show_bug.cgi?id=2751

[xemul: moved stuff to kerndat.c]

Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-04 16:18:24 +04:00
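The two steps in this commit (reading a limit from the sysctl file, then sending in bounded pieces) can be sketched as below. This is an illustration under assumptions: the commit does not say which of the three sysctl fields is used, so the sketch simply extracts the last one, and `parse_tcp_mem_max` and `send_in_chunks` are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Parse one line of /proc/sys/net/ipv4/tcp_wmem or tcp_rmem, which holds
 * three numbers: "min default max". Stores the max; returns 0 or -1.
 */
static int parse_tcp_mem_max(const char *line, long *max_out)
{
	long min, def, max;

	assert(max_out != NULL);
	if (sscanf(line, "%ld %ld %ld", &min, &def, &max) != 3)
		return -1;
	*max_out = max;
	return 0;
}

/* Write len bytes to fd in pieces of at most chunk bytes each. */
static int send_in_chunks(int fd, const char *buf, size_t len, size_t chunk)
{
	if (chunk == 0)
		return -1;

	while (len > 0) {
		size_t n = len < chunk ? len : chunk;
		ssize_t ret = write(fd, buf, n);

		if (ret < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += ret;
		len -= (size_t)ret;
	}
	return 0;
}
```

In a restorer the chunk size would be derived from the parsed limit, so that no single write exceeds what the kernel accepts for a queue.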
Pavel Emelyanov
fe3fb8851e tcp: Support CORK and NODELAY options
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-25 11:43:02 +04:00
Pavel Emelyanov
360c1c13b2 tcp: Show tcp queues contents when requested
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-19 12:16:07 +04:00
Pavel Emelyanov
b18fb09eb9 show: Replace one-line show_foo calls with args array
We have generic do_pb_show() call and tons of show_foo
routines, that just call one with proper args. Compact
the code by putting the args into array and calling
the do_pb_show() in one place.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-24 04:00:32 +04:00
Andrew Vagin
d8cf988ab9 tcp: show PB_TCP_STREAM as a single entry
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-31 17:14:18 +04:00
Cyrill Gorcunov
27582e3272 inet: Restore SO_REUSEADDR in case of rollback
tcp_repair_off implicitly modifies SO_REUSEADDR option
inside the kernel (thanks avagin@ for pointing this
feature out) thus if we are to rollback and restore
the former settings of socket -- don't forget to
repair this particular one.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-16 14:48:24 +04:00
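The save-and-restore pattern this commit describes can be sketched as follows. It is a hedged illustration, not the patch itself: switching TCP repair mode needs CAP_NET_ADMIN, so `fake_repair_off` is a hypothetical stand-in that only mimics the side effect (clearing SO_REUSEADDR), and `run_preserving_reuseaddr` is likewise an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

typedef int (*sk_op_t)(int sk);

/*
 * Hypothetical stand-in for turning repair mode off: it only reproduces
 * the side effect discussed above by clearing SO_REUSEADDR.
 */
static int fake_repair_off(int sk)
{
	int zero = 0;

	return setsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &zero, sizeof(zero));
}

/* Run op on sk, then put SO_REUSEADDR back to the value it had before. */
static int run_preserving_reuseaddr(int sk, sk_op_t op)
{
	int val = 0;
	socklen_t len = sizeof(val);

	assert(op != NULL);
	if (getsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
		return -1;
	if (op(sk) < 0)
		return -1;
	return setsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &val, sizeof(val));
}
```

Calling `run_preserving_reuseaddr(sk, fake_repair_off)` leaves the option at its original value even though the operation in the middle reset it.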
Pavel Emelyanov
79dfbe6cc2 tcp: Switch to use rst memory allocator on repair off
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-05 15:04:57 +04:00
Cyrill Gorcunov
66cc9b6657 make: Introduce compile time include/config.h generation
It has been reported that some systems (such as Ubuntu 13.04) already
have a struct tcp_repair_opt definition in their system headers.

| sk-tcp.c:25:8: error: redefinition of struct tcp_repair_opt
| sk-tcp.c:31:2: error: redeclaration of enumerator TCP_NO_QUEUE

So add a facility for compile-time testing of whether the reported
entities are present on a system. For this we generate include/config.h,
where all tested entries will live; source code needs to include
it only in places where it is really needed.

Reported-by: Vasily Averin <vvs@parallels.com>
Acked-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-20 16:02:14 +04:00
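A source file could consume such a generated header roughly as sketched here. The macro name `CONFIG_HAS_TCP_REPAIR` is an assumption made for illustration (the commit message does not name the macros), and the fallback definitions mirror the layout of the kernel's TCP repair interface.

```c
/*
 * CONFIG_HAS_TCP_REPAIR is a hypothetical macro that a generated
 * include/config.h could define when the system headers already
 * provide the TCP repair definitions.
 */
#ifdef CONFIG_HAS_TCP_REPAIR
#include <netinet/tcp.h>
#else
/* Fallback definitions, used only when the system headers lack them. */
struct tcp_repair_opt {
	unsigned int opt_code;
	unsigned int opt_val;
};

enum {
	TCP_NO_QUEUE,
	TCP_RECV_QUEUE,
	TCP_SEND_QUEUE,
	TCP_QUEUES_NR,
};
#endif

/* Compile-time check that the queue ids match the kernel ABI values. */
_Static_assert(TCP_SEND_QUEUE == 2, "queue ids must match the kernel ABI");
```

With the guard in place, the local definitions are compiled only when the probe found nothing, which avoids the redefinition errors quoted above.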
Kir Kolyshkin
3e8b82d367 Change crtools to criu in comments
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-10 11:02:27 +04:00
Pavel Emelyanov
add21b75c9 show: Remove options args from ->show callback
This thing is global, we can address one explicitly.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-08 00:23:42 +04:00
Cyrill Gorcunov
921dbf23de Don't use \Newline in pr_perror
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-05-02 22:44:24 +04:00
Pavel Emelyanov
5f4759202d tcp: Don't tolerate absence of the TCP_TIMESTAMP socket option
It has been merged, so do require it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-30 18:39:07 +04:00
Kir Kolyshkin
d90d4b1b88 Fix typos in log messages
Someone has to do it, right?..

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-15 12:46:25 +04:00
Pavel Emelyanov
2ab06c9c5a tcp: Schedule tcp socket for repair-off with proper fd
The fd in the ->open callback is temporary (the files restoring
engine will re-open it under some other fd). But since we
add this fd to the future repair-off list, the repair-off will fail
by working on the wrong fd.

Move scheduling for repair-off into post-open, where the correct
fd is known.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-11 21:10:37 +04:00
Pavel Emelyanov
babc9c617c tcp: Print message when scheduling socket for repair off in pie
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-11 19:36:06 +04:00
Andrey Vagin
17e2daddf2 sk-tcp: fix memory leak
CID 996187 (#1 of 1): Resource leak (RESOURCE_LEAK)
10. leaked_storage: Variable "buf" going out of scope leaks the storage it points to.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-10 00:56:20 +04:00
Pavel Emelyanov
5cae819d8c img: Get rid of open_image_ro helper
O_RSTR flag should be used instead for regular open_image

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-09 19:22:21 +04:00
Pavel Emelyanov
c5aa77f88b tcp: Use protobuf showing fn for tcp-stream images
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-02 14:18:11 +04:00
Andrey Vagin
ef4783a646 check: Check for ability to get tcp timestamp
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-14 20:27:55 +04:00
Andrey Vagin
04c48fefbc tcp: dump and restore tcp_timestamp for each socket (v2)
If a TCP socket gets live-migrated from one box to another, the
timestamps (which are typically ON) will get screwed up -- the new
kernel will generate TS values that have nothing to do with what they
were on dump. The solution is to yet again fix the kernel and put a
"timestamp offset" on a socket.

v2: don't fail if TCP_TIMESTAMP is unsupported

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-14 20:27:39 +04:00
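Reading the per-socket timestamp while tolerating kernels without the option (the v2 behaviour) might look like the sketch below. `dump_tcp_timestamp` is a hypothetical helper, the fallback value 24 for `TCP_TIMESTAMP` matches the Linux UAPI headers, and treating `ENOPROTOOPT` or `EOPNOTSUPP` as "unsupported" is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef TCP_TIMESTAMP
#define TCP_TIMESTAMP 24 /* value from the Linux UAPI headers */
#endif

/*
 * Read the TCP timestamp of sk. Returns 0 on success and sets *supported
 * to 1, or to 0 when the kernel does not know the option. Returns -1 on
 * any other error.
 */
static int dump_tcp_timestamp(int sk, unsigned int *ts, int *supported)
{
	socklen_t len = sizeof(*ts);

	assert(ts != NULL && supported != NULL);

	if (getsockopt(sk, IPPROTO_TCP, TCP_TIMESTAMP, ts, &len) == 0) {
		*supported = 1;
		return 0;
	}
	if (errno == ENOPROTOOPT || errno == EOPNOTSUPP) {
		*supported = 0;
		*ts = 0;
		return 0;
	}
	return -1;
}
```

On restore, the saved value would be written back with `setsockopt` on the same option while the socket is in repair mode.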
Pavel Emelyanov
ac845bd1d8 cr: Obsolete the --namespaces option
It's no longer required to use this option -- two currently
supported cases (tasks on host and tasks in containers) can
be detected automatically. Keep this option for future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-18 13:25:16 +04:00
Andrey Vagin
b1abc3b21c restore: don't disable tcp repair mode twice
TCP repair mode should be disabled after unlocking connections.
Disabling of repair mode drops SO_REUSEADDR, so it should be restored.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-14 18:47:19 +04:00
Alexander Kartashov
6f61488f21 x86: moved x86-specific files into the directory arch/x86.
* The following files go into the directory arch/x86/include/asm unmodified:
  - include/atomic.h,
  - include/linkage.h,
  - include/memcpy_64.h,
  - include/types.h,
  - include/bitops.h,
  - pie/parasite-head-x86-64.S,
  - include/processor-flags.h,
  - include/syscall-x86-64.def.

* Changed include directives in the source files that include the headers
  listed above.

* Modified build scripts to reflect the source moves.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-09 17:02:47 +04:00
Andrey Vagin
456413c98f tcp: check a state in refresh_inet_sk (v2)
A socket can get a FIN and its state will change to CLOSE_WAIT,
which is not supported yet.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 17:17:51 +04:00
Andrey Vagin
e10829370d tcp: refresh a data about tcp connection after blocking it (v5)
We have a window between getting info about tcp connections
and blocking them.

#2419

v2: clean up
v3: don't update lengths of queues for listen sockets,
    they are not used.
v4: check that a state of a tcp connection is ESTABLISHED or CLOSE
v5: * don't check state, because it can be changed only on TCP_CLOSE.
    In this case it will be changed again after restoring.
    * refresh a socket after enabling the repair mode

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 17:17:30 +04:00
Andrey Vagin
0fa83b0394 socket: increase socket buffers for restoring queues (v2)
Sizes of send and recv buffers are set to maximum to restore queues,
after that sizes of buffers are restored.

O_NONBLOCK is set to a socket to prevent blocking during restore.

#2411

v2: do the same for unix sockets.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-02 13:40:54 +04:00
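The sequence in this commit (save settings, enlarge buffers, set O_NONBLOCK, restore the queues, put settings back) can be sketched as below. The helper names and `struct sk_saved` are hypothetical, and the caller-supplied `max_size` is an assumption, since the commit does not say how the maximum is chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical snapshot of the settings changed for queue restore. */
struct sk_saved {
	int sndbuf;
	int rcvbuf;
	int flags;
};

static int get_int_opt(int sk, int opt, int *val)
{
	socklen_t len = sizeof(*val);

	return getsockopt(sk, SOL_SOCKET, opt, val, &len);
}

static int set_int_opt(int sk, int opt, int val)
{
	return setsockopt(sk, SOL_SOCKET, opt, &val, sizeof(val));
}

/* Save current settings, then enlarge both buffers and set O_NONBLOCK. */
static int prepare_queue_restore(int sk, int max_size, struct sk_saved *saved)
{
	assert(saved != NULL);

	if (get_int_opt(sk, SO_SNDBUF, &saved->sndbuf) < 0 ||
	    get_int_opt(sk, SO_RCVBUF, &saved->rcvbuf) < 0)
		return -1;
	saved->flags = fcntl(sk, F_GETFL);
	if (saved->flags < 0)
		return -1;

	if (set_int_opt(sk, SO_SNDBUF, max_size) < 0 ||
	    set_int_opt(sk, SO_RCVBUF, max_size) < 0)
		return -1;
	return fcntl(sk, F_SETFL, saved->flags | O_NONBLOCK);
}

/* Put buffer sizes and file status flags back after the queues are filled. */
static int finish_queue_restore(int sk, const struct sk_saved *saved)
{
	/*
	 * Linux doubles the value passed to SO_SNDBUF/SO_RCVBUF and
	 * getsockopt() reports the doubled value, so halve it here.
	 */
	if (set_int_opt(sk, SO_SNDBUF, saved->sndbuf / 2) < 0 ||
	    set_int_opt(sk, SO_RCVBUF, saved->rcvbuf / 2) < 0)
		return -1;
	return fcntl(sk, F_SETFL, saved->flags);
}
```

The same pair of calls works for unix sockets, which matches the v2 note above.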
Pavel Emelyanov
bf8b7c4f89 inet: Reshuffle proto-level socket dumping
We'll support other tcp states and udp-specific info eventually.
This introduced switch() looks more friendly to this future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-30 22:25:57 +03:00
Andrey Vagin
d7d600c127 tcp: save and restore rcv_wscale (v2)
rcv_wscale is the symmetric counterpart of snd_wscale.

Both of these parameters are set during the connection handshake.

Without this value a remote window size can't be interpreted correctly,
because the value from a packet should be shifted by rcv_wscale.

This patch doesn't break backward compatibility: a rcv window
will be restored with the same bug (rcv_wscale = 0).

v2: Update to a new kernel interface:
	[PATCH] tcp: restore rcv_wscale in a repair mode (v2)

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-19 16:17:11 +04:00
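The role of the scale factor can be shown with a small worked example. `scaled_window` is a hypothetical helper, not a function from the patch; it applies the left shift defined by the TCP window scale option, whose shift count is capped at 14.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Interpret the 16-bit window field of a TCP header using the window
 * scale negotiated at the handshake.
 */
static uint32_t scaled_window(uint16_t raw_window, uint8_t wscale)
{
	assert(wscale <= 14); /* the window scale option caps the shift at 14 */
	return (uint32_t)raw_window << wscale;
}
```

For instance, a raw window of 1000 with a scale of 7 means 128000 bytes, while a restored scale of 0 would read the same field as only 1000 bytes.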
Andrey Vagin
08e559b3f2 tcp: rename functions for unlocking tcp connections
One function is used on restore and one on dump, so each function has
its own prefix, rst or cpt.
Both functions have the same effect, so the main part of the names
is the same: "unlock_tcp_connections".

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:07:03 +04:00
Andrey Vagin
fbea445df4 cd-dump: lock connection with iptables rules only in a current netns
For another netns we don't need to lock separate connections;
an external channel can be locked.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:05:49 +04:00
Andrey Vagin
45fa18487d tcp: split the list tcp_repair_sockets on rst and cpt parts
because there are two types of entries

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:05:32 +04:00
Andrey Vagin
c27ff2baac tcp: unset TCP_REPAIR at the last moment after unlocking network (v2)
TCP_REPAIR should be dropped when the network is unlocked.
The network should be unlocked at the last moment, because
after this moment restore must not fail; otherwise the state of
a tcp connection can change and the state of one side in our image
will be invalid.

v2: use xremalloc instead of mmap and remmap

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:02:57 +04:00
Andrey Vagin
7e22e60f83 net: add ability to use tcp_repair_off from a restorer code
Use sys_setsockopt and declare a function in a header as "static inline"

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:02:39 +04:00
Andrey Vagin
458cb41f57 restore: disable repair mode in post_open()
Disabling repair mode drops SO_REUSEADDR.

We can set SO_REUSEADDR after disabling repair mode, but
a small race window exists in this case.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-27 23:16:50 +04:00
Pavel Emelyanov
b1b0a39a58 pb: Rewrite object reading to use pb-descs
The pb_read thing is no longer a macro. This will allow us to
factor out object collecting on restore.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-07 19:22:00 +04:00
Pavel Emelyanov
2398c55e41 pb: Rewrite object writing to use pb-descs
The pb_write thing is no longer a macro.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-07 19:21:59 +04:00
Pavel Emelyanov
4ee52f3403 tcp: Use sk ino number for opening tcp stream image on restore
It was accidentally broken by 424a4adb6. It's better to use sk ino
instead of sk id, since tcp connection may be unbound from any fds
(not supported now) and thus there may be no ID for those.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-31 13:57:44 +04:00
Cyrill Gorcunov
7818863ad0 protobuf: Convert tcp_stream_entry to PB engine
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-17 07:52:44 +04:00
Cyrill Gorcunov
65570d9559 sockets, inet: Use inet_sk_entry as a reference in inet_sk_info
For PB transition.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-16 07:03:59 +04:00
Cyrill Gorcunov
424a4adb6f sockets, inet: Use general mechanism for checkpoint/restore v2
Use fdtype_ops facility to c/r inet sockets.

v2:
 - Use BUG_ON if socket is attempted to be dumped
   several times

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-30 12:56:04 +04:00
Stanislav Kinsbursky
41195598cf parasite: remove excessive header deps from parasite.h and friends
The task is not complete - this is just a part of what has to be done. I.e.,
it looks like a lot of excessive deps can be fixed.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-30 12:50:18 +04:00
Andrey Vagin
066ec066a0 crtools: remove unused variables (v3)
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-18 19:01:21 +04:00
Pavel Emelyanov
4ec76b285f tcp: Add code for "check"-ing TCP repair support
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-05-03 15:21:37 +04:00
Pavel Emelyanov
6b2a0205c7 tcp: Fix queue contents restore
The wrong descriptor was passed to the read_img_buf.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-29 09:08:11 +04:00
Pavel Emelyanov
08788c835d tcp: Show queue contents when -c is given
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-04-29 09:06:03 +04:00