mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 21:07:43 +00:00

4796 Commits

Author SHA1 Message Date
Andrey Vagin
9391034f12 zdtm.sh: don't change owner of a test directory
After changing an owner the current user will not be able to remove or
change the directory. It isn't convenient.

Reported-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-07 15:45:35 +04:00
Andrey Vagin
82d509cb48 zdtm: don't execute pipe00 a few times simultaneously
zdtm-pre-dump, zdtm-snapshot, zdtm-iter, zdtm execute pipe00, so
these targets should be executed one by one.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-07 15:45:11 +04:00
Pavel Emelyanov
ebada5b274 cpuinfo: Don't report negative values to shell
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-07 15:44:12 +04:00
Pavel Emelyanov
4cfd9197e1 cpuinfo: Fail if cpuinfo check is requested, but file is missing
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-10-07 15:43:49 +04:00
Tycho Andersen
003aed3b97 cg: restore special cpuset props only in cgs criu creates
Previously we were trying to copy the root cg properties recursively to attempt
to correct invalid restores. However, based on some ML discussion, we should
only restore exactly what was dumped. Users need to set up their cg hierarchies
correctly (that is, they should not set up any options in the parent that would
conflict with what was dumped) in order to restore successfully.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-07 12:57:28 +04:00
Tycho Andersen
de055b7992 cg: use one path style throughout cg restore code
This commit is in preparation for the (hopefully last :) restore special cpuset
patch.

Previously, we installed the cgroup service fd after calling
prepare_cgroup_dirs, which meant that we had to carry around the temporary
directory name in order to put things in the right place. The
restore_cgroup_prop function uses the cg service fd instead of carrying around
the full path. This means that we can't use restore_cgroup_prop, without first
sanitizing the path. Instead, we install the service fd before calling
prepare_cgroup_dirs, and all the code just references that instead of carrying
around the temporary path.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-07 12:56:52 +04:00
Nicolas Dichtel
25e1997fde test/zdtm: fix compilation of maps02.c when MADV_DONTDUMP is unknown
Error was:
maps02.c: In function ‘main’:
maps02.c:57:74: error: ‘MADV_DONTDUMP’ undeclared (first use in this function)
maps02.c:57:74: note: each undeclared identifier is reported only once for each function it appears in
make: *** [maps02] Error 1
ERROR: fail to start /home/root/criu/test/zdtm/live/static/maps02

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 18:57:32 +04:00
Cyrill Gorcunov
b52935d7b9 cpuinfo: arm -- Fix build
Add missing implementations for ARM platforms.

Reported-by: Mr. Travis
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 16:20:10 +04:00
Cyrill Gorcunov
cd1a6dc97f cpuinfo: rpc -- Add CPUINFO_DUMP/CPUINFO_CHECK commands, v2
On Tue, Sep 30, 2014 at 09:18:55PM +0400, Cyrill Gorcunov wrote:
> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>

Updated

>From fea15362291a525f4b00f7e070968c6890cc831e Mon Sep 17 00:00:00 2001
From: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Fri, 19 Sep 2014 17:56:11 +0400
Subject: [PATCH 12/12] cpuinfo: rpc -- Add CPUINFO_DUMP/CPUINFO_CHECK commands

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:27:00 +04:00
Cyrill Gorcunov
73b9a2ebe3 cpuinfo: Add "cpuinfo [dump|check]" commands, v2
On Wed, Oct 01, 2014 at 05:51:09PM +0400, Pavel Emelyanov wrote:
> > Yes, what you've been expecting?
>
> if (!strcmp(argv[optind]))
> 	return cpu_cap_check()
>
> or smth like this.

updated. So if it becomes confusing -- feel free to merge [1;9] and
ping me to resend the rest, or pick up from attachments.

>From 6af96ff63ac82f9566c3cba9c116dc67698c9797 Mon Sep 17 00:00:00 2001
From: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Tue, 30 Sep 2014 18:33:40 +0400
Subject: [PATCH] cpuinfo: Add "cpuinfo [dump|check]" commands

They allow validating cpuinfo information
without running complete dump/restore actions.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:26:58 +04:00
Cyrill Gorcunov
87273ccdb8 cpuinfo: x86 -- Add dump and validation of cpuinfo image, v2
On Wed, Oct 01, 2014 at 04:57:40PM +0400, Pavel Emelyanov wrote:
> On 10/01/2014 01:07 AM, Cyrill Gorcunov wrote:
> > On Tue, Sep 30, 2014 at 09:18:53PM +0400, Cyrill Gorcunov wrote:
> >> If a user requested criu to dump cpuinfo image then we
> >> write one on dump and verify on restore. At the moment
> >> we require all cpu feature bits to match the destination
> >> cpu for the sake of simplicity, but in future we need deps
> >> engine which would filter out bits and test if cpu we're
> >> restoring on is more capable than one we were dumping at
> >> allowing to proceed restore procedure.
> >>
> >> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
> >
> > Updated to new img format

Something like attached?

>From 59272a9514311e6736cddee08d5f88aa95d49189 Mon Sep 17 00:00:00 2001
From: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Thu, 25 Sep 2014 16:04:10 +0400
Subject: [PATCH] cpuinfo: x86 -- Add dump and validation of cpuinfo image

If a user requested criu to dump cpuinfo image then we
write one on dump and verify on restore. At the moment
we require all cpu feature bits to match the destination
cpu for the sake of simplicity, but in future we need deps
engine which would filter out bits and test if cpu we're
restoring on is more capable than one we were dumping at
allowing to proceed restore procedure.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:26:57 +04:00
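The two matching policies described in the commit above (exact match now, "destination is at least as capable" later) can be sketched as bitmask checks. This is only an illustration: FEATURE_WORDS, cpu_features_equal and cpu_features_superset are hypothetical names, not CRIU's, and the relaxed check ignores the dependency filtering the message says a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of 32-bit words that hold CPU feature bits. */
#define FEATURE_WORDS 4

/* Strict policy: every feature word must be identical on both CPUs. */
static bool cpu_features_equal(const uint32_t *dumped, const uint32_t *local)
{
	assert(dumped != NULL && local != NULL);
	for (size_t i = 0; i < FEATURE_WORDS; i++)
		if (dumped[i] != local[i])
			return false;
	return true;
}

/* Relaxed policy: the local CPU must provide every bit that was dumped. */
static bool cpu_features_superset(const uint32_t *dumped, const uint32_t *local)
{
	assert(dumped != NULL && local != NULL);
	for (size_t i = 0; i < FEATURE_WORDS; i++)
		if (dumped[i] & ~local[i])
			return false; /* a dumped feature is missing locally */
	return true;
}
```

With the strict check, restoring on a newer CPU that merely adds features is refused; the relaxed check would accept it.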
Cyrill Gorcunov
e07b4a0e7a cpuinfo: x86 -- Add protobuf entry
At the moment only x86 is covered, ARM needs own handler.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:26:56 +04:00
Cyrill Gorcunov
de770e023a cpuinfo: x86 -- Rework cpuinfo features fetching
Instead of parsing procfs let's use native cpuid(); it's way faster.
The dark side is that the kernel may disable some features via
bootline options even if they are present on hardware, but for us
it's fine -- we will be testing the hardware cpu for features anyway.

The X86_FEATURE_ bits are gathered from two sources: linux kernel
and cpu specifications.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:26:54 +04:00
Cyrill Gorcunov
2581dcfb22 cpuinfo: x86 -- Add cpuid helpers
We will use them to fetch cpu caps.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:26:50 +04:00
Cyrill Gorcunov
7c732deb44 cpuinfo: x86 -- Add feature bits definitions
We will need them to carry CPU's caps.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:26:05 +04:00
Cyrill Gorcunov
b191244ea0 cpuinfo: Update documentation for --cpu-cap
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:25:57 +04:00
Cyrill Gorcunov
ff1a751a89 opt: cpu-cap -- Introduce "none" and "cpuinfo" arguments
They will serve to choose capability level when migrating
images between various hardware nodes.

Note it's bare functionality introduced in this commit,
the real implementation is in next patches.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:25:56 +04:00
Cyrill Gorcunov
88f488e31f opt: cpu-cap -- Make it as optional_argument
This slightly changes the visible API, but i think it's safe now.
The idea behind it is to make a bare --cpu-cap mean "--cpu-cap all".

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:25:55 +04:00
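How an option with an optional value behaves under getopt_long() can be sketched as follows. The helper parse_cpu_cap is hypothetical and is not taken from the CRIU sources; only the "--cpu-cap" name and the "all" default come from the commit above. Resetting optind to 0 is a glibc/musl convention for re-initialising the parser, which is an assumption about the C library in use.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns the value given to --cpu-cap, "all" when
 * the option appears without a value, or NULL when the option is absent
 * or an unknown option is seen.
 */
static const char *parse_cpu_cap(int argc, char *argv[])
{
	static const struct option opts[] = {
		{ "cpu-cap", optional_argument, NULL, 'c' },
		{ NULL, 0, NULL, 0 },
	};
	const char *cap = NULL;
	int opt;

	assert(argv != NULL);
	optind = 0; /* glibc/musl: force a full re-scan on every call */
	while ((opt = getopt_long(argc, argv, "", opts, NULL)) != -1) {
		if (opt != 'c')
			return NULL;
		/* optarg is NULL when no "=value" was attached */
		cap = optarg ? optarg : "all";
	}
	return cap;
}
```

Note that with optional_argument the value has to be attached as "--cpu-cap=none"; a separate word after "--cpu-cap" is not consumed as its value.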
Cyrill Gorcunov
3914b180d3 cpuinfo: Drop cpu_set_feature from exporting
It's redundant, should be cpu local.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:23:34 +04:00
Cyrill Gorcunov
4ab2a3ec73 cpuinfo: x86 -- Drop redundant cpu_has helper
It's simply a wrapper over cpu_has_feature,
so use it directly instead.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:22:41 +04:00
Cyrill Gorcunov
4ad462b459 mount: proc-parse -- Show @mnt_id on debug print as well
This is convenient when one needs to look into debug prints
and check which mount point was used somewhere else
(in particular I will need @mnt_id in tty code so
 on error I can easily figure out which mountpoint has
 been used).

No func changes.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-03 13:21:16 +04:00
Andrey Vagin
f582f16db0 test: expand the default test set
* check page server
* check snapshots
* check a few iterations of dump/restore

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:58:37 +04:00
Tycho Andersen
169f25b667 restore: copy special cpuset props recursively
The symptom of this bug was that users restoring tasks to a nested cgroup where
the top level group was created by criu (and not previously configured) e.g.
cpuset:/lxc/u1 would get an ENOSPC. criu would try to copy the special
properties into /lxc/u1 directly and (silently) fail, and then tried to copy
the task into the cg and fail with ENOSPC:

ENOSPC Attempted  to  write(2)  an empty cpuset.cpus or cpuset.mems setting to
       a cpuset that has tasks attached.

After turning the silent failure into a loud one, it gave EACCES:

EACCES Attempted to add, using write(2), a CPU or memory node to a cpuset, when
       that CPU or memory node was not already in its parent.

So, we need to copy the special props down the entire tree. Additionally,
we shouldn't copy props directly from the top, since some intermediate point in
the tree could add restrictions. We first walk back up the tree to find the
first point where the props are empty, and then copy that parent's props all
the way down.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:57:36 +04:00
Cyrill Gorcunov
ae96d21a07 bfd: Use ERR_PTR and such instead of BREADERR
No need to invent new error codes here, simply
use ERR_PTR/IS_ERR_OR_NULL and such.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:56:39 +04:00
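The ERR_PTR convention named in the commit above encodes a small negative errno value in the pointer itself, so a function can return "valid pointer", "NULL" or "error" through one return value. The sketch below is modelled on the Linux kernel helpers of the same names and is not copied from CRIU; next_line is a hypothetical reader used only to show the calling pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 addresses. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/*
 * Hypothetical reader: returns the line, NULL at end of input,
 * or ERR_PTR(-errno) on failure.
 */
static char *next_line(char *buf, int fail_with)
{
	assert(fail_with >= 0 && fail_with <= MAX_ERRNO);
	if (fail_with)
		return ERR_PTR(-fail_with);
	return buf;
}
```

A caller then needs a single check, IS_ERR_OR_NULL(p), to stop reading, and can still tell end of input from failure with IS_ERR(p).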
Pavel Emelyanov
aeb3f547f7 net: Move NETLINK_INET_DIAG from socket.c to net.c
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:48:28 +04:00
Tycho Andersen
8bd3b7a99a cg: don't leak properties when cg dir exists
We're using NULL as a sentinel here to indicate that we shouldn't restore any
cgroup properties. We should make sure that we don't leak this information and
instead check the n_properties field, which we should also set correctly.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:42:06 +04:00
Pavel Emelyanov
c57c2cfa64 predump: Collect mnt and net namespaces properly
On pre-dump we collect only two namespaces -- the mnt one
for criu and mnt one again for root task.

This is not correct. We need all mount namespaces to make
the irmap generation work properly and we need all net
namespaces to have parasite sockets created.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-02 14:30:31 +04:00
Pavel Emelyanov
45fd143409 parasite: Precreate daemon control sockets
Now that we have the netns on the pstree-item, we have a place
to pre-create the daemon socket in the needed namespace.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:35:26 +04:00
Pavel Emelyanov
80efa564f4 parasite: Reshuffle tsock creation
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:35:20 +04:00
Pavel Emelyanov
8ad653c732 pstree: Store task's netns on pstree-item
Will be needed for parasite sockets.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:35:11 +04:00
Pavel Emelyanov
678d19be26 ns: Reshuffle nsid generation code
To make it possible to get ns_id object together
with its ID.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:35:01 +04:00
Pavel Emelyanov
3f38145163 pstree: Introduce item's dump info
Empty for now, will be filled soon.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:34:53 +04:00
Pavel Emelyanov
c443b03e10 rst: Rework the rst_info referencing
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:34:38 +04:00
Pavel Emelyanov
3c7d01f6a7 net: Pre-create nl diag sk
The setns() syscall (called by switch_ns()) can be extremely
slow. If we call it two or more times from the same task the
kernel will synchonously go on a very slow routine called
synchronize_rcu() trying to put a reference on old namespaces.

To avoid doing this more than once I propose to create all
per-ns sockets in one place with one setns call. In this
patch the one nl diag socket used to collect other sockets
is created this way.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:34:29 +04:00
Pavel Emelyanov
4f9acb6a7c net: Do walk net namespaces to collect
Right now we don't support multiple net namespaces,
but some day we will. Other than this we have a logic
to distinguish cases with no namespaces vs one namespace,
so this walking already makes sense.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:34:05 +04:00
Pavel Emelyanov
7327ffe6a7 ns: Introduce collect_net_namespaces
And move sockets collection there.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:33:56 +04:00
Pavel Emelyanov
01f6f890c2 ns: Introduce collect_namespaces routine
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-10-01 13:33:42 +04:00
Pavel Emelyanov
b476879239 irmap: Get root mntfd before releasing tasks on predump
We have a use-after-free in predump code:

1st the free_pstree() is called in pre_dump_tasks(), then we
go to irmap_predump_run() which may call the lookup_irmap()
which, in turn, dereferences the root_item to get the root
mount ns fd.

But the problem is bigger than that. After we've released the
tasks (done before freeing pstree on predump) we can no longer
access them by PIDs, so keeping the root-item after irmap
scan is not a fix.

The fix is to get the root fd before releasing the tasks and use
it in the irmap scanner.

Caught recently on iterative inotify_irmap test.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-10-01 09:37:04 +04:00
Andrey Vagin
6694403214 zdtm/tempfs: set mode for O_CREAT
man 2 open:
"""
mode specifies the permissions to use in case a new file is cre‐
ated.  This argument must be supplied when O_CREAT or O_TMPFILE
is specified in flags;
"""

Cc: Konstantin Neumoin <kneumoin@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-30 21:55:01 +04:00
Andrey Vagin
c7390d2d3f zdtm/cwd01: don't forget to set '\0' after readlink()
Reported-by: Konstantin Neumoin <kneumoin@parallels.com>
Cc: Konstantin Neumoin <kneumoin@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-30 21:54:34 +04:00
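readlink() does not append a terminating NUL byte, which is the pitfall the commit above addresses. A minimal sketch of a wrapper that always terminates the result follows; readlink_str is a hypothetical name, not the helper used in the test.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a symlink target into buf as a C string.
 * Returns the target length, or -1 on error (errno set by readlink).
 * A target longer than size - 1 bytes is silently truncated.
 */
static ssize_t readlink_str(const char *path, char *buf, size_t size)
{
	ssize_t n;

	assert(path != NULL && buf != NULL && size > 0);
	n = readlink(path, buf, size - 1); /* keep one byte for the NUL */
	if (n < 0)
		return -1;
	buf[n] = '\0'; /* readlink() itself never writes this */
	return n;
}
```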
Pavel
8ac80915e0 ns: Factor out namespace switching call
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-30 21:54:11 +04:00
Andrey Vagin
3bc0936ae7 criu: add .travis.yml (v3)
"""
Travis CI is configured by adding a file named .travis.yml, which is a
YAML format text file, to the root directory of the GitHub
repository.

Travis CI automatically detects when a commit has been made and pushed
to a GitHub repository that is using Travis CI, and each time this
happens, it will try to build the project and run tests.
""" https://en.wikipedia.org/wiki/Travis_CI

Currently Travis CI builds criu for x86_64 and ARM

v2: move travis-ci.sh in scripts
v3: fix path to the script in the script
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-30 21:51:16 +04:00
Pavel Emelyanov
8651f43bc0 bfd: Implement buffered reads
The restore times look like

Before patch:
	  futex:      370  3.554482 (84.2%)
	 umount:       41  0.234796 (5.6%)
	   read:     4737  0.113987 (2.7%)
	recvmsg:       43  0.100083 (2.4%)
	  wait4:       10  0.033344 (0.8%)

After patch:
	  futex:      187  1.547642 (72.9%)
	 umount:       41  0.234595 (11.0%)
	recvmsg:       43  0.075738 (3.6%)
	  flock:       42  0.038696 (1.8%)
	  clone:       35  0.037699 (1.8%)

Most of the time we wait for other processes to restore,
but that's OK (would only affect parallel restore). And
we see that read-s really go away (onto 7th position).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:57 +04:00
Pavel Emelyanov
b46409349b bfd: Implement buffered writes
Dump times (top-5) look like

Before patch:
	writev:     1595  0.048337 (15.1%)
	openat:     1326  0.041976 (13.1%)
	 close:     1434  0.034661 (10.8%)
	  read:      988  0.028760 (9.0%)
	 wait4:      170  0.028271 (8.8%)

After patch:
	openat:     1326  0.040010 (16.4%)
	 close:     1434  0.030039 (12.3%)
	  read:      988  0.025827 (10.6%)
	 wait4:      170  0.025549 (10.5%)
	ptrace:      834  0.021624 (8.9%)

So write-s go away from top list (turn into 8th position).

Funny thing is that all object writes get merged with the
magic writes, so the total amount of write()-s (not writev-s)
in the strace remain intact :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:55 +04:00
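The effect reported in the commit above, many small writes merged into fewer system calls, can be illustrated with a tiny write buffer. This is a sketch of the general technique only: struct bwriter, BW_SIZE, bw_write and bw_flush are hypothetical names and do not reflect the actual bfd code, and EINTR handling is left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BW_SIZE 4096 /* hypothetical buffer size */

struct bwriter {
	int fd;
	size_t used;
	char data[BW_SIZE];
};

/* Write the whole range, looping over short writes. */
static int write_all(int fd, const char *p, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Push everything collected so far to the file descriptor. */
static int bw_flush(struct bwriter *w)
{
	assert(w != NULL);
	if (write_all(w->fd, w->data, w->used) < 0)
		return -1;
	w->used = 0;
	return 0;
}

/* Append to the buffer; flush first when the new data does not fit. */
static int bw_write(struct bwriter *w, const void *p, size_t len)
{
	assert(w != NULL && p != NULL);
	if (len > BW_SIZE - w->used && bw_flush(w) < 0)
		return -1;
	if (len > BW_SIZE) /* larger than the buffer: bypass it */
		return write_all(w->fd, p, len);
	memcpy(w->data + w->used, p, len);
	w->used += len;
	return 0;
}
```

Two consecutive bw_write() calls, for example a magic value followed by an object, reach the kernel as one write() when bw_flush() runs.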
Pavel Emelyanov
b90ae65c4c img: Prepare to use bfd engine
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:53 +04:00
Pavel Emelyanov
67bbc7ea0b bfd: Rename fields
For reads and writes the names pos and bleft will
have strange meaning, so rename them into smth more
appropriate.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:51 +04:00
Pavel Emelyanov
166c58d5bb img: Mark unbuffered images
We have some images that store raw data together with
the pb objects (and one that just stores raw data) and
use custom access to this. E.g. pipe-data images splice
data into them and sk-queue one lseeks the image for
queue packets.

For those, using buffered mode mixed with raw access may lead
to trouble. Explicitly mark such images, so that the
buffering (next patches) handles such images carefully.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:15 +04:00
Pavel Emelyanov
cf64851b2a pb: Pass cr_img into image_name()
The pb_(read|write)-s will stop using plain fd soon.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:14 +04:00
Pavel Emelyanov
295090c1ea img: Introduce the struct cr_img
We want to have buffered images to speed up dump and,
slightly, restore. Right now we use plain file descriptors
to write and read images to/from. Making them buffered
cannot be gracefully done on plain fds, so introduce
a new class.

This will also help if (when?) we will want to do more
complex changes with images, e.g. store them all in one
file or send them directly to the network.

For now the cr_img just contains one int _fd variable.

This patch changes the prototype of open_image() to
return struct cr_img *, pb_(read|write)* to accept one
and fixes the compilation of the rest of the code :)

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:13 +04:00
Pavel Emelyanov
0c5dc93bd0 Subject: [PATCH 07/14] pstree: Subblock for ids read on task restore
Ugly, but it's for easier further patching.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-09-30 21:48:12 +04:00