mirror of https://github.com/checkpoint-restore/criu synced 2025-08-31 14:25:49 +00:00
8907 Commits

Author SHA1 Message Date
Rodrigo Bruno
c45eb121e5 img: Introduce O_FORCE_LOCAL flag for images
criu/image-desc.c    | 4 ++--
 criu/image.c         | 4 ++--
 criu/include/image.h | 1 +
 3 files changed, 5 insertions(+), 4 deletions(-)

In order to prepare for remote snapshots (possible with Image Proxy and Image
Cache) the O_FORCE_LOCAL flag is added to force some images not to be remote
and stay as local files in the file system.

Signed-off-by: Rodrigo Bruno <rbruno@gsd.inesc-id.pt>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:18:53 +03:00
Pavel Emelyanov
d9220e958b lib: Add simple Go wrappers for swrk mode
We'll need some docs :) but the API is

criu := MakeCriu()

criu.Dump(opts, notify)
criu.Restore(opts, notify)
criu.PreDump(opts, notify)
criu.StartPageServer(opts)

where opts is the object from rpc.proto, Go has almost native support
for those, so caller should

- compile .proto file
- export it and golang/protobuf/proto
- create and initialize the CriuOpts struct

and notify is an interface with callbacks that correspond to criu
notification messages.

A stupid dump/restore tool in src/test/main.go demonstrates the above.

Changes since v1:

* Added keep_open mode for pre-dumps. To use it, one needs
  to call criu.Prepare() right after creation and criu.Cleanup()
  right after .Dump()

* Report resp.cr_errmsg string on request error.

Further TODO:

- docs
- code comments

travis-ci: success for libphaul (rev2)
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:18:53 +03:00
Pavel Emelyanov
090ce500d3 test, pipes: Exhaustive test of shared pipes
So, here's the next test that just enumerates all possible states and checks
that CRIU C/R-s it well. This time -- pipes. The goal of the test is to load
the fd-sharing engine, so pipes are chosen, as they not only generate shared
struct files, but also produce 2 descriptors in CRIU's fdesc->open callback
which is handled separately.

It's implemented slightly differently from the unix test, since we don't want
to check sequences of syscalls on objects, we need to check the task to pipe
relations in all possible ways.

The 'state' is several tasks, several pipes and each generated test includes
pipe ends sitting in all possible combinations in the tasks' FDTs.

Also note that states that seem to be equal to each other, e.g. pipe between
tasks A->B and pipe B->A, are really different, as CRIU picks the pipe-restorer
based on task PIDs. So whether the picked task has the read end or the write
end in its FDT makes a difference on restore.
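
The state space described above can be sketched roughly as follows. This is only an illustration, not the code of the test itself: the function name pipe_states, the ENDS tuple and the rule that drops pipes held by nobody are assumptions made for the example.

```python
from itertools import product

# What one task may hold for one pipe: nothing, read end, write end, or both.
ENDS = ("", "r", "w", "rw")


def pipe_states(tasks, pipes):
    """Yield every assignment of pipe ends to the tasks' FD tables.

    state[t][p] tells which ends of pipe p task t holds.  States where
    some pipe is held by nobody are skipped (an assumption of this sketch).
    """
    for flat in product(ENDS, repeat=tasks * pipes):
        state = tuple(flat[t * pipes:(t + 1) * pipes] for t in range(tasks))
        if all(any(state[t][p] for t in range(tasks)) for p in range(pipes)):
            yield state


states = list(pipe_states(2, 2))
print(len(states))
```

In this sketch the A->B and B->A pipes show up as two separate states, (("w",), ("r",)) and (("r",), ("w",)), which matches the note that they must be tested separately.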

Number of tasks is limited with the --tasks option, number of pipes with the
--pipes one. Test just runs all -- generates states, makes them and C/R-s
them. To check the restored result, /proc/pid/fd/ and /proc/pid/fdinfo/
for all restored tasks are analyzed.
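
A minimal sketch of what such a /proc check can look like on Linux, assuming it is enough to compare pipe inodes and access modes. The helper names pipe_inode and fd_flags are hypothetical and not taken from the test.

```python
import os


def pipe_inode(fd, pid="self"):
    """Return the pipe inode behind fd, or None if fd is not a pipe."""
    target = os.readlink(f"/proc/{pid}/fd/{fd}")
    if target.startswith("pipe:[") and target.endswith("]"):
        return int(target[6:-1])
    return None


def fd_flags(fd, pid="self"):
    """Return the open flags listed in /proc/<pid>/fdinfo/<fd> (octal there)."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        for line in f:
            if line.startswith("flags:"):
                return int(line.split()[1], 8)
    raise ValueError("no flags line in fdinfo")


r, w = os.pipe()
same_pipe = pipe_inode(r) == pipe_inode(w)     # both ends share one inode
read_mode = fd_flags(r) & os.O_ACCMODE         # O_RDONLY for the read end
write_mode = fd_flags(w) & os.O_ACCMODE        # O_WRONLY for the write end
os.close(r)
os.close(w)
```

Passing a real PID instead of "self" reads the same files for another task, which is what a check of restored tasks would need.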

Right now CRIU works OK for --tasks 2 --pipes 2 (for more -- didn't check).
Kirill, please, check that your patches pass this test.

TODO:

 - Randomize FDs under which tasks see the pipes. Currently all tasks that
   have a given pipe see it under the same set of FDs.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:18:53 +03:00
Pavel Emelyanov
fbc26a980a test, unix: Exhaustive testing of states (v2)
By exhaustive testing I understand a test suite that generates as many
states to try to C/R as possible by trying all the possible sequences
of system calls. Since such a generation, if done on all the Linux API
we support in CRIU, would produce bazillions of processes, I propose to
start with something simple.

As a starting point -- unix stream sockets with abstract names that
can be created and used by a single process :)

The script generates situations that unix sockets can get into by
using a pre-defined set of system calls. In this patch the syscalls
are socket, listen, bind, accept, connect and send. Also the number
of system calls to use (i.e. -- the depth of the tree) is limited by
the --depth option.

There are three things that can be done with a generated 'state':

I) Generate :) and show

Generation is done by recursively doing everything that is possible
(and makes sense) in a given state. To reduce the size of the tree
some meaningless branches are cut, e.g. creating a socket and closing
it right after that, creating two similar sockets one-by-one and some
more.

Shown on the screen is a cryptic string, e.g. the 'SA-CX-MX_SBL' one,
describing the sockets in the state. This is how it can be decoded:

 - sockets are delimited with _
 - first goes type (S -- stream, D --datagram)
 - next goes name state (A -- no name, B with name, X socket is not in
   FD table, i.e. closed or not yet accepted)
 - next may go letter L meaning that the socket is listening
 - -Cx -- socket is connected and x is the peer's name state
 - -Ixyz -- socket has incoming connections queue and xyz are the
   connect()-ors name states
 - -Mxyz -- socket has messages and xyz is senders' name states

The example above means, that we have two sockets:

 - SA-CX-MX: stream, with no name, connected to a dead one and with a
   message from a dead one
 - SBL: stream, with name, listening
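
The decoding rules above can be sketched as a small parser. The function names decode_socket and decode_state and the dictionary layout are made up for this example; only the letters and their meaning come from the description.

```python
TYPES = {"S": "stream", "D": "datagram"}
NAMES = {"A": "no name", "B": "named", "X": "not in FD table"}


def decode_socket(desc):
    """Decode one socket description such as 'SA-CX-MX' or 'SBL'."""
    head, *extras = desc.split("-")
    sock = {
        "type": TYPES[head[0]],
        "name": NAMES[head[1]],
        "listening": head[2:3] == "L",
        "peer": None,        # name state of the connected peer (-Cx)
        "incoming": [],      # name states of queued connect()-ors (-Ixyz)
        "messages": [],      # name states of message senders (-Mxyz)
    }
    for extra in extras:
        kind, states = extra[0], [NAMES[c] for c in extra[1:]]
        if kind == "C":
            sock["peer"] = states[0]
        elif kind == "I":
            sock["incoming"] = states
        elif kind == "M":
            sock["messages"] = states
    return sock


def decode_state(state):
    """Sockets in a state string are delimited with '_'."""
    return [decode_socket(d) for d in state.split("_")]


for sock in decode_state("SA-CX-MX_SBL"):
    print(sock)
```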

Next printed is the sequence of system calls to get into it, e.g. this
is how to get into the state above:

	socket(S) = 1
	bind(1, $name-1)
	listen(1)
	socket(S) = 2
	connect(2, $name-1)
	accept(1) = 3
	send(2, $message-0)
	send(3, $message-0)
	close(3)

The program has created a stream socket, bound it, listened on it, then
created another stream socket, connected it to the 1st one, then accepted
the connection, sent two messages vice-versa and closed the accepted
end, so the 2nd socket is left connected to the dead socket with a
message from it.
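
For illustration, the same sequence can be replayed by hand with Python's socket module on Linux. This is a sketch and not the test's code; the abstract name gets the PID appended (an assumption of this example) so that parallel runs do not collide.

```python
import os
import socket

# Abstract unix socket names start with a NUL byte (Linux only).
name = b"\0name-1-%d" % os.getpid()

s1 = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)   # socket(S) = 1
s1.bind(name)                                             # bind(1, $name-1)
s1.listen(1)                                              # listen(1)
s2 = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)   # socket(S) = 2
s2.connect(name)                                          # connect(2, $name-1)
s3, _ = s1.accept()                                       # accept(1) = 3
s2.send(b"message-0")                                     # send(2, $message-0)
s3.send(b"message-0")                                     # send(3, $message-0)
s3.close()                                                # close(3)

# s2 is now connected to a dead peer and has one message queued from it,
# s1 is the named listening socket.
```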

II) Run the state

This is when test actually creates a process that does the syscalls
required to get into the generated state (and hopefully gets into it).

III) Check C/R of the state

This is the trickiest part when it comes to the R step -- it's not
clear how to validate that the state restored is correct. But if only
trying to dump the state -- it's just calling criu dump. As images dir
the state string description is used.

One may choose only to generate the states with --gen option. One may
choose only to run the states with --run option. The latter is useful
to verify that the states generator is actually producing valid
states. If no options given, the state is also dump-ed (restore is to
come later).

For now the usage experience is like this:

- Going --depth 10 --gen (i.e. just generating all possible states
  that are achievable with 10 syscalls) produces 44 unique states in
  0.01 seconds. The generated result covers some static tests we have
  in zdtm :)  More generation stats:
   --depth 15 : 1.1 sec   / 72 states
   --depth 18 : 13.2 sec  / 89 states
   --depth 20 : 1 m 8 sec / 101 states

- Running and trying with criu is checked with --depth 9. Criu fails
  to dump the state SA-CX-MX_SBL (shown above) with the error

  Error (criu/sk-queue.c:151): recvmsg fail: error: Connection reset by peer

Nearest plans:

1. Add generators for on-disk sockets names (now only abstract).
   Here an interesting case is when names overlap and one socket gets
   a name of another, but isn't accessible by it

2. Add datagram sockets.
   Here it'd be fun to look at how many-to-one connections are
   generated and checked.

3. Add socketpair()-s.

Farther plans:

1. Cut the tree better to allow for deeper tree scan.

2. Add restore.

3. Add SCM-s

4. Have the exhaustive testing for other resources.

Changes since v1:

* Added DGRAM sockets :)

  Dgram sockets are trickier than STREAM, as they can reconnect from
  one peer to another. Thus just limiting the tree depth results in
  weird states when socket just changes peer. In the v1 of this patch
  new sockets were added to the state only when old ones reported that
  there's nothing that can be done with them. This limited the amount
  of stupid branches, but this strategy doesn't work with dgram due to
  reconnect. Due to this, change #2:

* Added the --sockets NR option to limit the amount of sockets.

  This allows throwing new sockets into the state on each step, which
  made a lot of interesting states for DGRAM ones.

* Added the 'restore' stage and checks after it.

  After the process is restored the script performs as many checks as
  possible having the expected state description in memory. The checks
  verify that the values below, obtained from real sockets, match the
  expectations in the generated state:

   - socket itself
   - name
   - listen state
   - pending connections
   - messages in queue (sender is not checked)
   - connectivity

  The latter is checked last, after all queues should be empty, by
  sending control messages with socket.recv() method.

* Added --keep option to run all tests even if one of them fails.

  And print nice summary at the end.

So far the test found several issues:

- Dump doesn't work for half-closed connection with unread messages
- Pending half-closed connection is not restored
- Socket name is not restored
- Message is not restored

New TODO:

- Check listen state is still possible to accept connections (?)
- Add socketpair()s
- Add on-disk names
- Add SCM-s
- Exhaustive script for other resources

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:18:53 +03:00
Andrei Vagin
3121d90de0 criu: print a criu version with the info level
We always ask users what version of criu they use to investigate a problem,
so it is better to have it in a log.

Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:26:02 +03:00
Andrei Vagin
ffee07723e criu: remap soccr log levels to criu levels
criu and soccr have different values for log levels, so
someone has to remap them.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:25:49 +03:00
root
638c14f2ed zdtm: grep errors from page-server.log and lazy-pages.log
This can help to investigate logs from Mr Jenkins.

Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:23 +03:00
Cyrill Gorcunov
e6537f3d8d fsnotify: open_handle -- Handle multiple mounts with same s_dev
When inotify is lying on an overmounted fs we should walk over
all mountpoints with same s_dev to find openable path.

Note on restore the path is usually already allocated during
dump stage so get_mark_path won't call for open_handle(), in
turn on dump stage the positive return from open_handle()
will cause fsnotify engine to find openable path, thus there
is kind of double work to be optimized in future.

For example we got a container where systemd-udevd inside
opens inotify for /dev/X entry then overmounts ./dev path
with slave option and as a result irmap engine on predump
can't figure out where the inotify is sitting, causing
migration to abort.

Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-11-23 20:23:23 +03:00
Dmitry Safonov
d541bc797c build: Move generated config.h into include/common/
config.h is a generated file with "build-features" defines.
We use it for several purposes:
o to check that compiler can do its job
o to complement user-visible API between distributions
o to add compile-time options from .config global file

It's used in criu and soccr, but compel also needs such a thing.

Previously, soccr had a link to config.h in criu includes,
but it would be much cleaner to move it to other headers
that are shared between sub-projects, into include/common.

Reported-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Tested-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:23 +03:00
Dmitry Safonov
585dda236c ia32: Get rid of R_X86_64_32S relocation
Distributions start to supply GCC that is configured to compile
-pie and -fPIC code by default due to security reasons.

CONFIG_COMPAT was unfriendly to -pie by the reason of R_X86_64_32S
relocation in call32.S helper:
  LINK     criu/criu
/usr/bin/ld: criu/arch/x86/crtools.built-in.o: relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
make[1]: *** [criu/Makefile:92: criu/criu] Error 1
make: *** [Makefile:225: criu] Error 2

Use %rip-relative addressing to avoid ld errors for shared binary linking.
Puff, all needs to be done with bare hands!

Now CONFIG_COMPAT can be used with -pie binaries and all should
also work for debian toolchain (#315).

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:23 +03:00
Andrei Vagin
008db0cb7a zdtm: run page-server via rpc
v2: typo fix
v3: run criu pre-dump via rpc
v4: don't use status-fd for rpc

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:23 +03:00
Andrei Vagin
397df9c035 lib/py: allow to execute page-server as a child process
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:23 +03:00
Andrei Vagin
ebd64bddfe service: allow to execute page-server as a child process
In this case we can wait it and get an exit code.

For example, it will be useful for p.haul where one connection
is used several times, so we need a way to find out that
page-server exited unexpectedly.
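
The idea of waiting for a child and reading its exit code can be sketched with Python's subprocess module. The child below is a stand-in Python one-liner, not criu, and the helper name run_child is hypothetical; how the service actually spawns page-server is not shown here.

```python
import subprocess
import sys


def run_child(argv):
    """Start argv as a child process, wait for it and return its exit code."""
    child = subprocess.Popen(argv)
    return child.wait()


# A stand-in child that exits with status 3; a non-zero code is how the
# parent learns that the child went away unexpectedly.
code = run_child([sys.executable, "-c", "import sys; sys.exit(3)"])
print(code)
```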

v2: don't write ps_info if a start descriptor isn't set

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:23 +03:00
Cyrill Gorcunov
0b6f9c7975 build: Reused .FORCE from nmk
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
8db0f03758 build: nmk -- Move phony targets to include.mk
So they can be reused.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
d62e01929e build: nmk -- Add .FORCE target
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Avindra Goolcharan
8e45ce4905 images.py: remove shebang
This file is not executable directly, so it should not have the shebang.

Signed-off-by: Avindra Goolcharan <aavindraa@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Pavel Begunkov
3e0200e571 build: get rid of warnings with sysmacro
warning: In the GNU C Library, "major" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "major", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "major", you should undefine it after including <sys/types.h>.
  if (major(st.st_rdev) != major(st_rtc.st_rdev) ||

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
30b2630e80 test: static,aio01 -- Use proper type for context
aio_context_t is 8 bytes long so in 32 bit mode it might be
stripped off when unsigned long is used instead. Fix this typo.

Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Andrei Vagin
cae6262ce9 zdtm: add an option to show criu statistics
v2: defining crit_bin and using it for Popen() // Mike

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Andrei Vagin
7d6d795d28 stats: add counters for pipes and page_pipe_bufs
The number of pipes is limited in a system, so it is better to know how
many we use.

Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Andrei Vagin
00413cb5a7 test: check ipv6 sockets which handle ipv4 connections
A server socket is created with AF_INET6, but a client
socket is created with AF_INET.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Andrei Vagin
6e770e50b8 soccr: c/r ipv6 sockets which handles ipv4 connections
IPv6 listening sockets can accept both ipv4 and ipv6 connections,
in both cases a family of an accepted socket will be AF_INET6.

But we have to send tcp packets according to the connection type.

------------------------ grep Error ------------------------
(00.002320)     53: Debug: 		Will set rcv_wscale to 7
(00.002325)     53: Debug: 		Will turn timestamps on
(00.002331)     53: Debug: Will set mss clamp to 65495
(00.002338)     53: Debug: 	Restoring TCP 1 queue data 2 bytes
(00.002403)     53: Error (soccr/soccr.c:673): Unable to send a fin packet: libnet_write_raw_ipv6(): -1 bytes written (Network is unreachable)

(00.002434)     53: Error (criu/files.c:1191): Unable to open fd=3 id=0x6
(00.002506) Error (criu/cr-restore.c:2171): Restoring FAILED.
------------------------ ERROR OVER ------------------------

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Andrei Vagin
09b93e0ac5 sk-inet: restore a value of SO_REUSEPORT
The SO_REUSEPORT option allows multiple sockets on the same
host to bind to the same port. This option has to be restored when all
sockets are bound to a port. The same logic is already used to restore
SO_REUSEADDR.
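
One way to read this ordering, sketched with plain Python sockets on Linux: switch the option on for every socket so that the binds cannot conflict, bind them all, and only then put the dumped values back. The function restore_bound and the list of saved values are assumptions of this example, not CRIU code.

```python
import socket


def restore_bound(saved_values, host="127.0.0.1"):
    """Bind one TCP socket per saved value to a shared port.

    saved_values holds the SO_REUSEPORT value each socket had at dump time.
    """
    socks, port = [], 0
    for _ in saved_values:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Force the option on first, otherwise the second bind would fail.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        port = s.getsockname()[1]   # the kernel picks a port for the first one
        socks.append(s)
    # All sockets are bound, so the real values can be restored now.
    for s, value in zip(socks, saved_values):
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, value)
    return socks


first, second = restore_bound([1, 0])
ports = (first.getsockname()[1], second.getsockname()[1])
values = (first.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT),
          second.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT))
first.close()
second.close()
```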

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Andrei Vagin
3503332086 zdtm: check a case when one port is shared between two sockets
SO_REUSEPORT allows multiple sockets on the same host to bind to the
same port.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
2e16cc1e4f compel: Do not loose sign of result in compat syscall
Regs are present in unsigned format so convert them
into signed first to provide results.

In particular if memfd_create syscall failed we won't
notice -ENOMEM error but rather treat it as unsigned
hex value

 | (05.303002) Putting parasite blob into 0x7f1c6ffe0000->0xfffffff4
 | (05.303234) Putting tsock into pid 42773
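
The 0xfffffff4 in the log above is what -ENOMEM looks like when a 32-bit result is kept unsigned. A small sketch of the conversion (the helper name to_signed32 is made up for the example):

```python
import errno


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


ret = to_signed32(0xfffffff4)
print(ret, errno.errorcode[-ret])
```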

Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
fc21d6fb53 crit: Add socket states decoding
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
4d0fc1a496 crit: Add socket types decoding
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
d4c29ab7cb crit: Add protocols decoding
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
56cd56706d crit: Add more families into socket decoding
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
ec273275fe crit: Add INET6 familiy
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Pavel Tikhomirov
926e42ac63 zdtm: test overmounting with shared parent works
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Pavel Tikhomirov
2d2381e01d zdtm: test shared mount propagation is preserved
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Pavel Tikhomirov
bf1f9c61c1 mount: allow overmount on parent with shared group
In CT, we do:

mkdir -p /a/b/c1
mkdir -p /c2
mount --bind /c2 /a/b/c1
mount --rbind /a/b /a

And after that container is not dumpable with error:

mnt: Unable to handle mounts under 146:./a

Just because overmounts with shared parent group are prohibited,
but I can't see any problem with enabling them.

https://jira.sw.ru/browse/PSBM-69501
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Pavel Tikhomirov
2214568de9 mount: fix parent shared group dependency in can_mount_now
What we do before patch:

1) If we are NOT in the same shared group - if we have some parent's
shared group member unmounted, we just wait for it.
2) If we are in the same group - we wait only for members with root
path len shorter than ours.

That is done to make child mount propagate in all shared group,
but I think it is wrong, e.g.:

mkdir -p /dir/a/b/c /d /e /f
mount --bind /dir/a /d
mount --bind /dir/a/b /e
mount --bind /f /e/c

Before c/r we have:

507 114 182:1017985 / / rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
144 507 182:1017985 /dir/a /d rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
146 507 182:1017985 /dir/a/b /e rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
148 146 182:1017985 /f /e/c rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
150 507 182:1017985 /f /dir/a/b/c rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
149 144 182:1017985 /f /d/b/c rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12

After c/r we have:

600 132 182:1017985 / / rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
602 600 182:1017985 /f /dir/a/b/c rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
603 600 182:1017985 /dir/a /d rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12
604 600 182:1017985 /dir/a/b /e rw,relatime shared:63 master:60 - ext4 /dev/ploop63624p1 rw,data=ordered,balloon_ino=12

There is no propagation as all mounts are in same shared group and
602(150) has shorter root than 603(144) and 604(146).

What we should do:

Wait for a member of our parent's shared group only if it has our 'sibling'
mount in it. Sibling mount is the one which had propagated to shared
mount of our parent for us when we were mounted. We need to enforce
propagation only for this case.

https://jira.sw.ru/browse/PSBM-69501
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:14 +03:00
Cyrill Gorcunov
b2d2574955 image-desc: Encode pagemap in unsigned long format
The anonymous shared memory is using shmid for image
name encoding, which is unsigned long, and we already
met a scenario where high bits get stripped off, thus
the restore failed.

Let's use unsigned long here, and because pagemap code
is shared between plain memory and anon shared memory
use unsigned long everywhere.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
bf341b463e image-desc: Use unsigned format for tmpfs image
The index comes from mnt_id which is signed integer
both in kernel and in userspace, but negative value
is never valid, thus don't use it.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
d9952dcd4f image-desc: Use unsigned format for pid driven entries
For images which are using pid as id for image names
use unsigned format since there is no negative pid
in a real system.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
582f104d04 image-desc: Use unsigned format for userns
It uses ns->id

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
af5a6b52cb image-desc: Use unsigned format for netns
Since it uses ns->id

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
a8d7857303 image-desc: Use unsigned format for netdev
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
ba10ae1ac0 image-desc: Use unsigned for mountpoints
Just as we declare it in ns_id structure.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
6a08b0cdf1 image-desc: Use unsigned int for old binfmt-misc
Same as for autofs.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
9020d18d44 image-desc: Use unsigned int for tmpfs dev image
Same as for autofs.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
2ca78e9a00 image-desc: Use unsigned format for autofs
Both the kernel and criu use unsigned int
for it, make the format appropriate.

	| struct mount_info {
	|	...
	|	unsigned int		s_dev;
	|	...
	| }

We didn't see a negative number here in real life so
I don't think such a %d to %u conversion will ever
cause a backward compatibility problem.
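
Why the switch is harmless for values seen in practice can be shown by mimicking how a 32-bit value prints under each format. The helpers fmt_signed and fmt_unsigned are illustrative stand-ins for "%d" and "%u", not CRIU functions.

```python
import ctypes


def fmt_signed(value):
    """What "%d" prints for a 32-bit value."""
    return str(ctypes.c_int32(value).value)


def fmt_unsigned(value):
    """What "%u" prints for a 32-bit value."""
    return str(ctypes.c_uint32(value).value)


# Below 2^31 both formats give the same text, so old image names still match.
print(fmt_signed(42), fmt_unsigned(42))
# With the top bit set the two differ, which is the case never seen so far.
print(fmt_signed(0x80000000), fmt_unsigned(0x80000000))
```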

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
1fbc51f8e2 image-desc: Fix fdinfo format
We pass unsigned 4 byte integer here, so
use appropriate format.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
57e4ad5ba8 namespaces: __get_ns_id -- Use safe snprintf
Namespace descriptors are not promised to have
constant short names, so just to be on a safe
side.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
78d4f7c352 test: jenkins -- Add huge shmid fault into the tests
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Cyrill Gorcunov
c94fb7d069 fault-injection: Add FI_HUGE_ANON_SHMEM_ID type
To test if we can survive with shmid more than 4 bytes
long in image formats.

Without the fix for shmid

 | [root@uranus criu] test/zdtm.py run -t zdtm/static/maps01 --fault 132 -f h -k always
 | === Run 1/1 ================ zdtm/static/maps01
 |
 | ========================= Run zdtm/static/maps01 in h ==========================
 | Start test
 | Test is SUID
 | ./maps01 --pidfile=maps01.pid --outfile=maps01.out
 | Run criu dump
 | Forcing 132 fault
 | Run criu restore
 | Forcing 132 fault
 | =[log]=> dump/zdtm/static/maps01/36/1/restore.log
 | ------------------------ grep Error ------------------------
 | (00.016464)     37: Opening 0x007f39c04b5000-0x007f3a004b5000 0000000000000000 (101) vma
 | (00.016465)     37: Search for 0x007f39c04b5000 shmem 0x10118e915 0x7f97f7ae4ae8/36
 | (00.016470)     37: Waiting for the 10118e915 shmem to appear
 | (00.016479)     36: No pagemap-shmem-18409749.img image
 | (00.016481)     36: Error (criu/shmem.c:559): Can't restore shmem content
 | (00.016501)     36: Error (criu/mem.c:1208): `- Can't open vma
 | (00.016552) Error (criu/cr-restore.c:2449): Restoring FAILED.
 | ------------------------ ERROR OVER ------------------------
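
The log above waits for shmem 0x10118e915 but looks for pagemap-shmem-18409749.img, which is the same id with everything above 32 bits cut off. A short sketch of the arithmetic; the name pattern is copied from the log, while the two helper functions are assumptions of this example.

```python
SHMID = 0x10118e915   # the id from the restore log above


def image_name_32(shmid):
    """Image name when only the low 32 bits of the id survive."""
    return "pagemap-shmem-%d.img" % (shmid & 0xFFFFFFFF)


def image_name_full(shmid):
    """Image name when the whole id is kept."""
    return "pagemap-shmem-%d.img" % shmid


print(image_name_32(SHMID))
print(image_name_full(SHMID))
```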

And with the fix

 | [root@uranus criu] test/zdtm.py run -t zdtm/static/maps01 --fault 132 -f h -k always
 | === Run 1/1 ================ zdtm/static/maps01
 |
 | ========================= Run zdtm/static/maps01 in h ==========================
 | Start test
 | Test is SUID
 | ./maps01 --pidfile=maps01.pid --outfile=maps01.out
 | Run criu dump
 | Forcing 132 fault
 | Run criu restore
 | Forcing 132 fault
 | Send the 15 signal to  36
 | Wait for zdtm/static/maps01(36) to die for 0.100000
 | ========================= Test zdtm/static/maps01 PASS =========================

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00
Andrei Vagin
5785dbd93d zdtm.py: fix decode_flav()
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-23 20:23:13 +03:00