mirror of https://github.com/checkpoint-restore/criu synced 2025-09-05 08:45:49 +00:00
2223 Commits

Author SHA1 Message Date
Radostin Stoyanov
4fe75ecfc4 zdtm: Specify --address for remote_lazy_pages
When --remote-lazy-pages is used the address option was not specified.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:01 +03:00
Cyrill Gorcunov
19d3da0d9b test: static/tun -- More detailed errors and code shrink
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
6b8852cfcb test: static/socket-tcp -- Check for iptables success
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
33dd782c1d test: static/tun -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
1efdb547e2 test: static/socket-tcp -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
dbdd19fa1b test: static/sk-unix-mntns -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
15e6662472 test: static/netns_sub -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
9121d8b78c test: static/mntns_shared_bind -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
caca7c1ad2 test: static/mntns_root_bind -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
76669e307b test: static/mnt_ext_master -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Cyrill Gorcunov
4e212cebb6 test: static/mnt_ext_auto -- Check if unshare successed
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Andrei Vagin
4ce3fc0b98 jenkins/criu-lazy-migration: clean old files
Otherwise files from a previous build can affect the current one.

Reported-by: Mr Jenkins
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:28:00 +03:00
Radostin Stoyanov
d40c702dd8 zdtm: Call criu.available() only for run action
When zdtm.py is executed with `list` sub-command the 'criu_bin'
option is not defined and criu.available() fails.

$ python test/zdtm.py list
Traceback (most recent call last):
  File "test/zdtm.py", line 2243, in <module>
    criu.available()
  File "test/zdtm.py", line 1185, in available
    if not os.access(opts['criu_bin'], os.X_OK):
KeyError: u'criu_bin'

However, we don't need to check the existence of criu_bin
unless we use the `run` action.
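
The guard described above can be sketched roughly as follows. This is an illustration only, not the actual zdtm.py change: the layout of the `opts` dictionary (in particular the 'action' key holding a string) and the helper names `criu_available` and `check_prerequisites` are assumptions made for the example.

```python
import os


def criu_available(opts):
    """Return True if the configured criu binary is executable."""
    return os.access(opts['criu_bin'], os.X_OK)


def check_prerequisites(opts):
    """Check for the criu binary only when the action needs it.

    Hypothetical helper: sub-commands such as `list` never define
    'criu_bin', so looking it up unconditionally raises KeyError.
    """
    if opts.get('action') != 'run':
        return True
    return criu_available(opts)
```

With this shape, `check_prerequisites({'action': 'list'})` never touches 'criu_bin', while the `run` action still verifies the binary.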

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:58 +03:00
Radostin Stoyanov
221f115189 Fix typos
Most of the typos were found by codespell.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:58 +03:00
Alice Frosi
5f8ee1053f s390: Fix to skip the test if GS not supported
Signed-off-by: Alice Frosi <alice@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:58 +03:00
Andrei Vagin
3795598cd5 zdtm: handle inherit_fd-s in the rpc mode
./test//zdtm.py --set inhfd run --all --rpc

Cc: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:58 +03:00
Pavel Tikhomirov
f194124e0a zdtm: cleanup zdtmtst and zdtmtst.defaultroot cgroups after finishing test
After running the criu tests, docker on the node becomes unusable, as it
is confused by our leftover cgroups. Surely docker should be fixed to
ignore custom cgroups (https://github.com/moby/moby/issues/37601), but
we should not leave them behind after the test either.

v2: rmdir the holder only if it exists, remove racy wait and remove
wrongly added cleanup method in class criu
v3: bring back missed semicolon

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:58 +03:00
Pavel Tikhomirov
1c37c0d46b zdtm/fork: print children wait status
05:57:35.640:  1238: FAIL: fork.c:80: Task 16275 didn't exit (errno = 10 (No child processes))

There is no info about the killing signal in logs, add it.
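
The test itself is written in C (fork.c); purely as an illustration, the same decoding of a wait status can be sketched in Python with the standard `os.WIF*` helpers. The function name `describe_wait_status` is hypothetical and the message wording is not taken from the patch.

```python
import os


def describe_wait_status(status):
    """Turn a raw waitpid() status into a human-readable description."""
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # This is the piece of information that was missing from the log.
        return "killed by signal %d" % os.WTERMSIG(status)
    return "unexpected status 0x%x" % status
```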

https://jira.sw.ru/browse/PSBM-87579
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:58 +03:00
Adrian Reber
dabe778c38 tests: add configuration file test via RPC
This test checks the following things:

 * Does configuration file parsing work at all.
 * Does the parser detect wrong options.
 * Does the configuration file work via RPC.
 * Do the configuration file options not overwrite the RPC settings in
   the default setup.
 * Is it possible to tell CRIU to prefer the configuration file via RPC.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:57 +03:00
Andrei Vagin
75d049c5a7 Revert "test: check criu restore with --auto-dedup"
This reverts commit 8f1ba5892d.

This test doesn't work right now:
=[log]=> dump/zdtm/transition/maps007/174/3/restore.log
------------------------ grep Error ------------------------
(00.237564)      1:    `- FD 2 pid 6
(00.237566)      1:  `- type 1 ID 0xa
(00.237567)      1:    `- FD 3 pid 5
(00.237568)      1:    `- FD 3 pid 6
(00.240025)      1: Error (criu/image.c:432): Unable to open pages-3.img: Permission denied
(00.270600) uns: calling exit_usernsd (-1, 1)
(00.270640) uns: daemon calls 0x469f80 (199, -1, 1)
(00.270648) uns: `- daemon exits w/ 0
(00.271193) uns: daemon stopped
(00.271199) Error (criu/cr-restore.c:2308): Restoring FAILED.
------------------------ ERROR OVER ------------------------

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:56 +03:00
Andrei Vagin
6ef193b0a0 test: check criu restore with --auto-dedup
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:55 +03:00
Andrei Vagin
15b4d9205b test/others: check external network namespaces
Acked-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:54 +03:00
Pavel Tikhomirov
3a66660e3e zdtm: shared options should not be lost for bind mounts
1150 1371 0:169 / /zdtm/static/private_bind_propagation.test rw,relatime shared:920 - tmpfs zdtm_fs rw
1151 1150 0:170 / /zdtm/static/private_bind_propagation.test/share1 rw,relatime shared:921 - tmpfs share rw
1152 1150 0:170 / /zdtm/static/private_bind_propagation.test/share2 rw,relatime shared:921 - tmpfs share rw
1153 1151 0:169 /source /zdtm/static/private_bind_propagation.test/share1/child rw,relatime - tmpfs zdtm_fs rw
1154 1152 0:169 /source /zdtm/static/private_bind_propagation.test/share2/child rw,relatime - tmpfs zdtm_fs rw

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2018-10-30 19:27:54 +03:00
Pavel Tikhomirov
6c81e8773e zdtm.py: also check that sharing options are restored for mounts
We already check that (root, mountpoint) pairs are preserved; do the same
for (root, mountpoint, shared, slave) tuples.
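
A minimal sketch of how such a tuple could be derived from one /proc/<pid>/mountinfo line is given below. It assumes that "shared" and "slave" are recorded only as booleans (presence of the `shared:N` and `master:N` optional fields); the function name `mount_signature` is hypothetical and the real zdtm.py check may be organised differently.

```python
def mount_signature(line):
    """Return (root, mountpoint, shared, slave) for one mountinfo line."""
    fields = line.split()
    # Optional fields sit between the mount options and the "-" separator.
    sep = fields.index('-')
    root, mountpoint = fields[3], fields[4]
    shared = slave = False
    for opt in fields[6:sep]:
        if opt.startswith('shared:'):
            shared = True
        elif opt.startswith('master:'):
            slave = True
    return (root, mountpoint, shared, slave)
```

Comparing the sets of such tuples before dump and after restore catches a mount that came back at the right place but lost its propagation settings.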

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:54 +03:00
Pavel Tikhomirov
382756a4f5 zdtm: add a test for non-uniform shares
Create a tree of shared mounts where shared mounts have different sets
of children (while having the same root):

First share1 is mounted shared tmpfs and second share1/child1 is mounted
inside, third share1 is bind-mounted to share2 (now share1 and share2
have the same shared id, but share2 has no child), fourth share1/child2
is bind-mounted from share1, and also propagated to share2/child2 (now
all except share1/child1 have the same shared id), fifth share1/child3
is mounted and propagates inside the share.

Finally we have four mounts shared between each other with different
sets of children mounts, and moreover two of them are children of the
other two:

495 494 0:62 / /zdtm/static/non_uniform_share_propagation.test/share1 rw,relatime shared:235 - tmpfs share rw
496 495 0:63 / /zdtm/static/non_uniform_share_propagation.test/share1/child1 rw,relatime shared:236 - tmpfs child1 rw
497 494 0:62 / /zdtm/static/non_uniform_share_propagation.test/share2 rw,relatime shared:235 - tmpfs share rw
498 495 0:62 / /zdtm/static/non_uniform_share_propagation.test/share1/child2 rw,relatime shared:235 - tmpfs share rw
499 497 0:62 / /zdtm/static/non_uniform_share_propagation.test/share2/child2 rw,relatime shared:235 - tmpfs share rw
500 495 0:64 / /zdtm/static/non_uniform_share_propagation.test/share1/child3 rw,relatime shared:237 - tmpfs child3 rw
503 497 0:64 / /zdtm/static/non_uniform_share_propagation.test/share2/child3 rw,relatime shared:237 - tmpfs child3 rw
502 499 0:64 / /zdtm/static/non_uniform_share_propagation.test/share2/child2/child3 rw,relatime shared:237 - tmpfs child3 rw
501 498 0:64 / /zdtm/static/non_uniform_share_propagation.test/share1/child2/child3 rw,relatime shared:237 - tmpfs child3 rw

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2018-10-30 19:27:54 +03:00
Pavel Tikhomirov
24bd5fcfff zdtm: check children of shared slaves restore
495 494 0:62 / /zdtm/static/shared_slave_mount_children.test/share rw,relatime shared:235 - tmpfs share rw
496 494 0:62 / /zdtm/static/shared_slave_mount_children.test/slave1 rw,relatime shared:236 master:235 - tmpfs share rw
497 494 0:62 / /zdtm/static/shared_slave_mount_children.test/slave2 rw,relatime shared:236 master:235 - tmpfs share rw
498 496 0:63 / /zdtm/static/shared_slave_mount_children.test/slave1/child rw,relatime shared:237 - tmpfs child rw
499 497 0:63 / /zdtm/static/shared_slave_mount_children.test/slave2/child rw,relatime shared:237 - tmpfs child rw

Before the fix we had:

(00.167574)      1: Error (criu/mount.c:1769): mnt: A few mount points can't be mounted
(00.167577)      1: Error (criu/mount.c:1773): mnt: 498:496 / /tmp/.criu.mntns.o2Op5j/9-0000000000/zdtm/static/shared_slave_mount_children.test/slave1/child child
(00.167580)      1: Error (criu/mount.c:1773): mnt: 497:494 / /tmp/.criu.mntns.o2Op5j/9-0000000000/zdtm/static/shared_slave_mount_children.test/slave2 share

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2018-10-30 19:27:54 +03:00
Pavel Tikhomirov
fc01e18b47 zdtm: add a test for unsupported children collision
This test is not automatic because the behaviour changes after kernel
v4.11; on older kernels we get a children collision:

817 188 0:48 / /zdtm/static/unsupported_children_collision.test/share1 rw,relatime shared:942 - tmpfs share rw
> 818 817 0:124 / /zdtm/static/unsupported_children_collision.test/share1/child rw,relatime shared:943 - tmpfs child1 rw
819 188 0:48 / /zdtm/static/unsupported_children_collision.test/share2 rw,relatime shared:942 - tmpfs share rw
820 819 0:125 / /zdtm/static/unsupported_children_collision.test/share2/child rw,relatime shared:944 - tmpfs child2 rw
> 821 817 0:125 / /zdtm/static/unsupported_children_collision.test/share1/child rw,relatime shared:944 - tmpfs child2 rw

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
2018-10-30 19:27:54 +03:00
Adrian Reber
35fbc373a9 test rpc: remove unnecessary import, close fd
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:54 +03:00
Andrei Vagin
bdbd7c8f14 zdtm/static: add a test to check epoll file descriptors
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:53 +03:00
Andrei Vagin
d6ec834757 zdtm: handle errors of make
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:53 +03:00
Adrian Reber
ae55a6ccd5 tests: fix builds on alpine and centos
Install sudo, create test user with ID 1000, install bash,
fix pidfile creation and pidfile chmod.

v2:
 * use sleep to give the criu daemon some time to start up

v3:
 * Andrei is of course right and sleep is not a good solution.
   After adding --status-fd support to criu service, this
   is how we now detect that criu is ready.

v4:
 * This was much more complicated than expected which is related
   to the different versions of the tools on the different travis
   test targets. There seems to be a bug in bash on Ubuntu
    https://lists.gnu.org/archive/html/bug-bash/2017-07/msg00039.html
   which prevents using 'read -n1' on Ubuntu. As a workaround
   the result from CRIU's status FD is now read via python.

   Another problem was discovered on alpine with the loop restore test.
   CRIU says to use setsid even if the process is already using setsid.
   As a workaround, still with setsid, this process is now using
   shell-job true for checkpoint and restore.

Parts of v2 have been committed before. So the changes from this commit
are partially already in another commit.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:52 +03:00
Adrian Reber
2c5b2785ed tests: fix builds on alpine and centos
Install sudo, create test user with ID 1000, install bash,
fix pidfile creation and pidfile chmod.

v2:
 * use sleep to give the criu daemon some time to start up

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:52 +03:00
Adrian Reber
46b35b2c2f test/others/rpc: also run RPC version command via service
This extends the test.py to also run the RPC command VERSION via 'criu
service'. It was already running using 'criu swrk'.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:52 +03:00
Adrian Reber
03d636afed test/other/rpc: resurrect the RPC test cases
In this directory there are various test cases using CRIU in RPC mode
(or SWRK mode).

This fixes the broken tests by moving the start of 'criu service' from
run.sh to the Makefile, as the test cases are run using "sudo -g
'#1000' -u '#1000'" and the PID file created by CRIU can only be read by
the root user. When 'criu service' is started before run.sh, the PID
file can still be changed to 0666, which fixes the test script.

This also adds version.py to the test cases that are executed.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:52 +03:00
Veronika Kabatova
7c9e7cfc3e Modify and add test for configuration file functionality
Create a test for verifying the configuration parsing feature. The
test is created by reusing the already present inotify_irmap test.

Because of the addition of default configuration files, the
--no-default-config option is added to zdtm.py to not break the test
suite on systems with these files present.

Signed-off-by: Veronika Kabatova <vkabatov@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-30 19:27:52 +03:00
Pavel Emelyanov
e51960ea3e test, pipes: Exhaustive test of shared pipes
So, here's the next test that just enumerates all possible states and checks
that CRIU C/R-s it well. This time -- pipes. The goal of the test is to load
the fd-sharing engine, so pipes are chosen, as they not only generate shared
struct files, but also produce 2 descriptors in CRIU's fdesc->open callback
which is handled separately.

It's implemented slightly differently from the unix test, since we don't want
to check sequences of syscalls on objects, we need to check the task to pipe
relations in all possible ways.

The 'state' is several tasks, several pipes and each generated test includes
pipe ends sitting in all possible combinations in the tasks' FDTs.

Also note that states that seem to be equal to each other, e.g. pipe between
tasks A->B and pipe B->A, are really different, as CRIU picks the pipe-restorer
based on task PIDs. So whether the picked task has the read end or the write
end in its FDT makes a difference on restore.

The number of tasks is limited with the --tasks option, the number of pipes
with the --pipes one. The test just runs everything: it generates the states,
creates them and C/R-s them. To check the restored result, /proc/pid/fd/ and
/proc/pid/fdinfo/ are analyzed for all restored tasks.
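
The enumeration idea can be sketched as below. This is an assumption-laden illustration rather than the test's own generator: it simply lets every task hold no end, the read end, the write end or both ends of every pipe, and does no pruning, whereas the real script may well drop uninteresting states. `END_CHOICES` and `enumerate_states` are hypothetical names.

```python
import itertools

# Which end(s) of one pipe a task may hold in its FD table.
END_CHOICES = ("none", "read", "write", "both")


def enumerate_states(tasks, pipes):
    """Return an iterator over all assignments of pipe ends to tasks.

    Each state is a tuple of per-task tuples; state[t][p] names the
    end(s) of pipe p that task t holds.
    """
    per_task = list(itertools.product(END_CHOICES, repeat=pipes))
    return itertools.product(per_task, repeat=tasks)
```

Note that the A->B and B->A layouts show up as two distinct states, which matches the remark above that they restore differently.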

Right now CRIU works OK for --tasks 2 --pipes 2 (for more -- didn't check).
Kirill, please, check that your patches pass this test.

TODO:

 - Randomize FDs under which tasks see the pipes. Now all tasks if they have
   some pipe, all see it under the same set of FDs.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-27 10:49:46 +03:00
Pavel Emelyanov
e098b119f3 test, unix: Exhaustive testing of states (v2)
By exhaustive testing I understand a test suite that generates as many
states to try to C/R as possible by trying all the possible sequences
of system calls. Since such a generation, if done on all the Linux API
we support in CRIU, would produce bazillions of processes, I propose to
start with something simple.

As a starting point -- unix stream sockets with abstract names that
can be created and used by a single process :)

The script generates situations that unix sockets can get into by
using a pre-defined set of system calls. In this patch the syscalls
are socket, listen, bind, accept, connect and send. Also the number
of system calls to use (i.e. -- the depth of the tree) is limited by
the --depth option.

There are three things that can be done with a generated 'state':

I) Generate :) and show

Generation is done by recursively doing everything that is possible
(and makes sense) in a given state. To reduce the size of the tree
some meaningless branches are cut, e.g. creating a socket and closing
it right after that, creating two similar sockets one-by-one and some
more.

Shown on the screen is a cryptic string, e.g. 'SA-CX-MX_SBL',
describing the sockets in the state. This is how it can be decoded:

 - sockets are delimited with _
 - first goes type (S -- stream, D -- datagram)
 - next goes name state (A -- no name, B -- with name, X -- socket is
   not in FD table, i.e. closed or not yet accepted)
 - next may go letter L meaning that the socket is listening
 - -Cx -- socket is connected and x is the peer's name state
 - -Ixyz -- socket has incoming connections queue and xyz are the
   connect()-ors name states
 - -Mxyz -- socket has messages and xyz are the senders' name states

The example above means, that we have two sockets:

 - SA-CX-MX: stream, with no name, connected to a dead one and with a
   message from a dead one
 - SBL: stream, with name, listening
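
The decoding rules above can be written down as a small parser. This is a sketch that follows only the rules listed in this message; the function names `decode_socket` and `decode_state` and the dictionary layout are hypothetical and not taken from the test script.

```python
TYPES = {'S': 'stream', 'D': 'datagram'}
NAMES = {'A': 'no name', 'B': 'with name', 'X': 'not in FD table'}


def decode_socket(desc):
    """Decode one socket description such as 'SA-CX-MX' or 'SBL'."""
    head, *parts = desc.split('-')
    info = {
        'type': TYPES[head[0]],
        'name': NAMES[head[1]],
        'listening': head[2:] == 'L',
        'peer': None,        # name state of the connected peer, if any
        'incoming': [],      # name states of pending connect()-ors
        'messages': [],      # name states of the message senders
    }
    for part in parts:
        tag, states = part[0], part[1:]
        if tag == 'C':
            info['peer'] = NAMES[states]
        elif tag == 'I':
            info['incoming'] = [NAMES[s] for s in states]
        elif tag == 'M':
            info['messages'] = [NAMES[s] for s in states]
    return info


def decode_state(state):
    """Decode a whole state string; sockets are delimited with '_'."""
    return [decode_socket(d) for d in state.split('_')]
```

For 'SA-CX-MX_SBL' this yields an unnamed stream socket connected to a dead peer with one message from a dead sender, followed by a named, listening stream socket.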

Next printed is the sequence of system calls to get into it, e.g. this
is how to get into the state above:

	socket(S) = 1
	bind(1, $name-1)
	listen(1)
	socket(S) = 2
	connect(2, $name-1)
	accept(1) = 3
	send(2, $message-0)
	send(3, $message-0)
	close(3)

The program has created a stream socket, bound it, listened on it, then
created another stream socket, connected it to the 1st one, then accepted
the connection, sent one message in each direction and closed the
accepted end, so the connecting socket (SA-CX-MX in the state string) is
left connected to the dead socket with a message from it.

II) Run the state

This is when test actually creates a process that does the syscalls
required to get into the generated state (and hopefully gets into it).

III) Check C/R of the state

This is the trickiest part when it comes to the R step -- it's not
clear how to validate that the restored state is correct. But if we
only try to dump the state -- it's just calling criu dump. The state
string description is used as the images dir.

One may choose only to generate the states with --gen option. One may
choose only to run the states with --run option. The latter is useful
to verify that the states generator is actually producing valid
states. If no options given, the state is also dump-ed (restore is to
come later).

For now the usage experience is like this:

- Going --depth 10 --gen (i.e. just generating all possible states
  that are achievable with 10 syscalls) produces 44 unique states in
  0.01 seconds. The generated result covers some static tests we have
  in zdtm :)  More generation stats:
   --depth 15 : 1.1 sec   / 72 states
   --depth 18 : 13.2 sec  / 89 states
   --depth 20 : 1 m 8 sec / 101 states

- Running and trying with criu is checked with --depth 9. Criu fails
  to dump the state SA-CX-MX_SBL (shown above) with the error

  Error (criu/sk-queue.c:151): recvmsg fail: error: Connection reset by peer

Nearest plans:

1. Add generators for on-disk sockets names (now only abstract).
   Here an interesting case is when names overlap and one socket gets
   a name of another, but isn't accessible by it

2. Add datagram sockets.
   Here it'd be fun to look at how many-to-one connections are
   generated and checked.

3. Add socketpair()-s.

Farther plans:

1. Cut the tree better to allow for deeper tree scan.

2. Add restore.

3. Add SCM-s

4. Have the exhaustive testing for other resources.

Changes since v1:

* Added DGRAM sockets :)

  Dgram sockets are trickier than STREAM, as they can reconnect from
  one peer to another. Thus just limiting the tree depth results in
  weird states when a socket just changes peer. In the v1 of this patch
  new sockets were added to the state only when old ones reported that
  there's nothing that can be done with them. This limited the amount
  of stupid branches, but this strategy doesn't work with dgram due to
  reconnect. Due to this, change #2:

* Added the --sockets NR option to limit the amount of sockets.

  This allowed throwing new sockets into the state on each step, which
  made a lot of interesting states for DGRAM ones.

* Added the 'restore' stage and checks after it.

  After the process is restored the script performs as many checks as
  possible having the expected state description in memory. The checks
  verify that the values below, obtained from real sockets, match the
  expectations in the generated state:

   - socket itself
   - name
   - listen state
   - pending connections
   - messages in queue (sender is not checked)
   - connectivity

  The latter is checked last, after all queues should be empty, by
  sending control messages with socket.recv() method.

* Added --keep option to run all tests even if one of them fails.

  And print nice summary at the end.

So far the test found several issues:

- Dump doesn't work for half-closed connection with unread messages
- Pending half-closed connection is not restored
- Socket name is not restored
- Message is not restored

New TODO:

- Check listen state is still possible to accept connections (?)
- Add socketpair()s
- Add on-disk names
- Add SCM-s
- Exhaustive script for other resources

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-10-27 10:49:46 +03:00
Andrei Vagin
5c76d06186 zdtm: open notify file descriptors in a binary mode
Send pre-dump notify to 36
  Traceback (most recent call last):
    File "zdtm.py", line 2161, in <module>
      do_run_test(tinfo[0], tinfo[1], tinfo[2], tinfo[3])
    File "zdtm.py", line 1549, in do_run_test
      cr(cr_api, t, opts)
    File "zdtm.py", line 1264, in cr
      test.pre_dump_notify()
    File "zdtm.py", line 490, in pre_dump_notify
      fdin.write(struct.pack("i", 0))
  TypeError: write() argument 1 must be unicode, not str
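
The traceback comes from writing packed bytes to a file object opened in text mode. A minimal sketch of the binary-mode variant is shown below; it uses an anonymous pipe instead of the /proc/<pid>/fd path that zdtm.py opens, and the helper names `send_notify`, `recv_notify` and `demo` are hypothetical.

```python
import os
import struct


def send_notify(fd, value=0):
    """Write one native int to fd through a binary-mode file object."""
    with os.fdopen(fd, "wb") as fdin:
        fdin.write(struct.pack("i", value))


def recv_notify(fd):
    """Read one native int back from fd, again in binary mode."""
    with os.fdopen(fd, "rb") as fdout:
        (value,) = struct.unpack("i", fdout.read(struct.calcsize("i")))
    return value


def demo():
    r, w = os.pipe()
    send_notify(w)          # closes the write end when done
    return recv_notify(r)   # closes the read end when done
```

Since `struct.pack` returns bytes, the file object has to be opened with a "b" mode on both python2 (with the `io`-style `open`) and python3.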

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-10 07:01:35 +03:00
Adrian Reber
a9e29bc596 Fix building unlink_fstat00 unlink test case
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:52 +03:00
Pavel Tikhomirov
4d9bf608b5 zdtm: check that pid-reuse does not break iterative memory dump
The idea of the test is:

1) mmap a separate page and put a variable there, so that other usage does
not dirty this region. Initialize the variable with VALUE_A.

2) fork a child with special pid == CHILD_NS_PID. Only if it is the first
child overwrite the variable with VALUE_B.

3) wait for the end of the next predump or end of restore with
test_wait_pre_dump_ack/test_wait_pre_dump pair and kill our child.

Note: The memory region is "clean" in parent.

4) goto (2) unless end of cr is reported by test_waitpre

So on the first iteration the child with pid CHILD_NS_PID was dumped with
VALUE_B; on all other iterations and on the final dump another child with
the same pid exists but with VALUE_A. But on all iterations after the first
one this memory region is "clean". So criu before the fix would
have restored VALUE_B, taking it from the first child's image, but should
restore VALUE_A.

Note: The child in its turn waits for termination and performs a check that
the variable value doesn't change after c/r.

We should run the test with at least one predump to trigger the problem:

[root@snorch criu]# ./test/zdtm.py run --pre 1 -k always -t zdtm/transition/pid_reuse
Checking feature ns_pid
Checking feature ns_get_userns
Checking feature ns_get_parent

=== Run 1/1 ================ zdtm/transition/pid_reuse

===================== Run zdtm/transition/pid_reuse in ns ======================
DEP       pid_reuse.d
CC        pid_reuse.o
LINK      pid_reuse
Start test
Test is SUID
./pid_reuse --pidfile=pid_reuse.pid --outfile=pid_reuse.out
Run criu pre-dump
Send the 10 signal to  52
Run criu dump
Run criu restore
Send the 15 signal to  73
Wait for zdtm/transition/pid_reuse(73) to die for 0.100000
Test output: ================================
14:47:57.717: 11235: ERR: pid_reuse.c:76: Wrong value in a variable after restore
14:47:57.717:     4: FAIL: pid_reuse.c:110: Task 11235 exited with wrong code 1 (errno = 11 (Resource temporarily unavailable))

<<< ================================

https://jira.sw.ru/browse/PSBM-67502

v3: simplify waitpid's status check
v9: switch to test_wait_pre_dump(_ack)

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:52 +03:00
Mike Rapoport
d6d9d7f7ac zdtm/lib: don't close bad criu_status_in file descriptor in signal handler
The criu_status_in is not always used and it may be -1 when the signal
handler closes it. With lazy-pages we hit a corner case which clobbers the
errno value. This happens when we resume the process inside glibc syscall
wrapper and get the signal before the page containing errno is copied. In
this case, signal handler is invoked before the syscall return value is
written to errno and the actual value of errno seen by the process becomes
-EBADF because of close(-1) in the signal handler.

Let's ensure that close() in signal handler does not fail to make Jenkins
happier while the proper solution for the lazy-pages issue is found.
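
The actual fix lives in the C test library; the following Python analogue only illustrates the idea of not calling close() on a descriptor that was never opened. The names `close_if_valid` and `unguarded_close_errno` are hypothetical.

```python
import errno
import os


def close_if_valid(fd):
    """Close fd only if it looks like a real descriptor.

    Returns True if close() was attempted, False if it was skipped.
    """
    if fd < 0:
        return False
    os.close(fd)
    return True


def unguarded_close_errno(fd):
    """Show what an unconditional close() reports for a bad descriptor."""
    try:
        os.close(fd)
    except OSError as exc:
        return exc.errno
    return 0
```

Here `unguarded_close_errno(-1)` returns `errno.EBADF`, which is the value that ended up clobbering errno in the scenario described above.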

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:51 +03:00
Andrei Vagin
b4afa08060 test: make zdtm.py python2/python3 compatible
Cc: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:51 +03:00
Andrei Vagin
c971d7a97b zdtm/socket-ext: clean up test files
Reported-by: Dmitry Safonov <dima@arista.com>
Cc: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:51 +03:00
Andrei Vagin
40ab3c18c8 zdtm: show a process tree if a test doesn't show signs of life
Call "ps axf" if waitpid() is running more than 10 seconds
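
One way to express this watchdog, assuming Python 3's `subprocess` timeout support (which the original python2-era zdtm.py could not rely on), is sketched below. The function name `wait_with_diagnostics` and its parameters are hypothetical.

```python
import subprocess
import sys


def wait_with_diagnostics(proc, timeout=10.0, diag_cmd=("ps", "axf")):
    """Wait for proc; if it takes too long, dump a process tree first."""
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        print("Test shows no signs of life, process tree:", file=sys.stderr)
        try:
            subprocess.call(diag_cmd)
        except OSError:
            # The diagnostic tool may be missing; keep waiting regardless.
            pass
        return proc.wait()
```

The diagnostic command runs at most once, and the final wait is unbounded, so the sketch only adds information and never kills the test.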

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:51 +03:00
Cyrill Gorcunov
b4451351a5 seccomp: test, seccomp_filter_threads -- Use multiple threads
Andrew proposed the test which actually triggered the issue
in the current seccomp series; run it on a regular basis.

Suggested-by: Andrey Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:51 +03:00
Cyrill Gorcunov
96a30f068c seccomp: test -- Add seccomp_filter_threads
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:51 +03:00
Pavel Tikhomirov
e071c2a610 zdtm/lib: add pre-dump-notify test flag
If the pre-dump-notify flag is set, zdtm sends a notify to the test after
pre-dump has finished and waits for the test to send back a reply that
the test did all its work and is now ready for the next pre-dump/dump.

How it can be used:

while (!test_wait_pre_dump()) {
	/* Do something after predump */
	test_wait_pre_dump_ack();
}
/* Do something after restore */

Internally we open two pipes for the test: one for receiving the notify (with
both ends open) and one for replying to it (only the write end open). Fds of
the pipes are dupped to predefined numbers and zdtm opens these fds through
/proc/<test-pid>/fd/{100,101} and communicates with the test.

v9: switch to a two-way interface to remove the race where the operation we
try to run after predump may still be unfinished at the time of the next dump.

Suggested-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:50 +03:00
Cyrill Gorcunov
339446a7cc unix: test, sk-unix01 -- Fix data sending for be machines
Reported-by: Andrey Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:50 +03:00
Cyrill Gorcunov
7ac24b5348 tests: others,rpc -- Tune up header
- string.h is missing for memX helpers
- fcntl.h should be included instead of sys

With this patch the test-c is compiled on alpine tests,
but there is a problem related to the container itself

 | protoc-c --proto_path=. --c_out=. rpc.proto
 | gcc -g -Werror -Wall -I.   -c -o test-c.o test-c.c
 | gcc -g -Werror -Wall -I.   -c -o rpc.pb-c.o rpc.pb-c.c
 | gcc   test-c.o rpc.pb-c.o  -lprotobuf-c -o test-c
 | protoc --proto_path=. --python_out=. rpc.proto
 | cp ../../../criu/criu criu
 | chmod u+s criu
 | mkdir -p build
 | chmod a+rwx build
 | sudo -g '#1000' -u '#1000' ./criu service -v4 -W build -o service.log --address criu_service.socket -d --pidfile pidfile
 | make: sudo: Command not found

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:50 +03:00
Pavel Tikhomirov
70e18b846b zdtm.py: ignore unicode encode errors
We have a problem after commit 212e4c771a ("test: make zdtm.py
python2/python3 compatible") when running tests on python2:

https://ci.openvz.org/job/CRIU/job/CRIU-virtuozzo/job/criu-dev/3804/console

Traceback (most recent call last):
  File "./test/zdtm.py", line 2249, in <module>
    opts['action'](opts)
  File "./test/zdtm.py", line 2001, in run_tests
    launcher.run_test(t, tdesc, run_flavs)
  File "./test/zdtm.py", line 1680, in run_test
    self.wait()
  File "./test/zdtm.py", line 1737, in wait
    self.__wait_one(0)
  File "./test/zdtm.py", line 1725, in __wait_one
    print(open(sub['log']).read())
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in
position 258: ordinal not in range(128)

print does not like "‘" symbol in zdtm_static_cgroup04.log:
...
rmdir: failed to remove ‘cgclean.sKFHLm/zdtmtst/special_prop_check’: No
such file or directory

Small reproducer:

[snorch@snorch ~]$ cat test_ascii.py
from __future__ import absolute_import, division, print_function, unicode_literals
from builtins import (str, open, range, zip, int, input)

f = open('./zdtm_static_cgroup04.log')
s = f.read()
print(s)

[snorch@snorch ~]$ python test_ascii.py | grep ""
Traceback (most recent call last):
  File "test_ascii.py", line 6, in <module>
    print(s)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in
position 258: ordinal not in range(128)

So just ignore this quote symbol when printing logs.
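
One way to drop such characters before printing is sketched below. It is an illustration written for Python 3 and is not necessarily how the commit implements it; `ascii_safe` is a hypothetical helper name.

```python
def ascii_safe(text):
    """Drop every character that cannot be encoded as ASCII."""
    return text.encode("ascii", "ignore").decode("ascii")
```

Applied to the log line above, the typographic quotes around the path disappear and the rest of the message is printed unchanged.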

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2018-07-09 18:26:50 +03:00