2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 12:57:57 +00:00

77 Commits

Author SHA1 Message Date
Andrey Vagin
33c75d0df9 eventpoll: parse_fdinfo_pid_s() returns allocated object for eventpol tfd
We are going to collect all objects in a list and write them into
the eventpoll image. The eventpoll tfd image will be depricated.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-02 16:08:17 +04:00
Andrey Vagin
78a54bd87c fsnotify: parse_fdinfo_pid_s() returns allocated object for fanotify marks
We are going to collect all objects in a list and write them into
the fanotify image. The fanotify mark image will be depricated.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-02 16:07:44 +04:00
Andrey Vagin
7079bb1086 fsnotify: parse_fdinfo_pid_s() returns allocated object for inotify wd (v2)
We are going to collect all objects in a list and write them into
the inotify image. The inotify wd image will be depricated.

v2: cb() must always free an entry
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-09-02 16:07:43 +04:00
Saied Kazemi
d8b41b6525 Added AUFS support.
The AUFS support code handles the "bad" information that we get from
the kernel in /proc/<pid>/map_files and /proc/<pid>/mountinfo files.
For details see comments in sysfs_parse.c.

The main motivation for this work was dumping and restoring Docker
containers which by default use the AUFS graph driver.  For dump,
--aufs-root <container_root> should be added to the command line options.
For restore, there is no need for AUFS-specific command line options
but the container's AUFS filesystem should already be set up before
calling criu restore.

[ xemul: With AUFS files sometimes, in particular -- in case of a
  mapping of an executable file (likekely the one created at elf load),
  in the /proc/pid/map_files/xxx link target we see not the path
  by which the file is seen in AUFS, but the path by which AUFS
  accesses this file from one of its "branches". In order to fix
  the path we get the info about branches from sysfs and when we
  meet such a file, we cut the branch part of the path. ]

Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-21 18:35:22 +04:00
Cyrill Gorcunov
ecd432fe27 timerfd: Implement c/r procedure
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-06 19:20:09 +04:00
Andrey Vagin
ce5aa74d10 mount: save local mount point paths on restore
On restore we add a temporary root to a mount point path. It's convinient
for restoring mount namespaces, but real paths are used for restoring
link-remap files.

v2: replace the offset field on a char * field

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-06 19:14:15 +04:00
Pavel Emelyanov
5289ea973a mnt: Extend comment about how mntinfo->mountpoint path looks like
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-08-06 12:04:22 +04:00
Tycho Andersen
51876eea5d Attempt to restore cgroups
During the dump phase, /proc/cgroups is parsed to find co-mounted cgroups.
Then, for each task /proc/self/cgroup is parsed for the cgroups that it is a
member of, and that cgroup is traversed to find any child cgroups which may
also need restoring. Any cgroups not currently mounted will be temporarily
mounted and traversed. All of this information is persisted along with the
original cg_sets, which indicate which cgroups a task is a member of.

On restore, an initial phase creates all the cgroups which were saved. Tasks
are then restored into these cgroups via cg_sets as usual.

Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-07-10 17:00:28 +04:00
Andrey Vagin
494c044384 mount: dump one file system only once (v2)
A file system can be bind-mounted a few times and some of these mounts
can be non-root. We need to find one of root mounts and dump it.

v2: don't forget to check pm->dumped and pm->parent
    don't dump a root file system, it's always external for now.

Reported-by: Saied Kazemi <saied@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:40:00 +04:00
Pavel Emelyanov
c7e0042946 crtools: Introduce the --ext-mount-map option (v3)
On dump one uses one or more --ext-mount-map option with A:B arguments.
A denotes a mountpoint (as seen from the target mount namespace) criu
dumps and B is the string that will be written into the image file
instead of the mountpoint's root.

On restore one uses the same --ext-mount-map option(s) with similar
A:B arguments, but this time criu treats A as string from the image's
root field (foobar in the example above) and B as the path in criu's
mount namespace the should be bind mounted into the mountpoint.

v3:
* Added documentation
* Added RPC bits
* Changed option name into --ext-mount-map
* Use colon as key and value separator

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-06-17 10:36:30 +04:00
Pavel Emelyanov
b48e4cbfb8 proc: Introduce helper for parsing /proc/$pid/cgroup file
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-05-27 23:48:06 +04:00
Andrey Vagin
85569e8dd4 mount: prevent dumping nested mount namespace without mnt_id in fdinfo
When we don't know mnt_id, we don't know to which namespace a file
belongs.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-21 22:40:27 +04:00
Andrey Vagin
1b3fa9bc25 mount: set nsid for each mount point
We want to look up mntns by mnt_id.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-21 22:39:50 +04:00
Andrey Vagin
84c48e6244 mounts: Mark ns' roots in the list of mount points (v2)
When we'll restore nested mount namespaces, all but root ones (sub-namespaces)
will be restored as sub-mounts in the root mount namespace. So mi->mountpoint
will be not '/' even if a mount is root for its mntns.

v2: s/is_root/is_ns_root/
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-21 22:37:57 +04:00
Andrey Vagin
8df879941d mount: save relative path in mi->mountpoint
"relative path" is absolute path with dot at the beginning.

We already use relative paths on restore. In this patch we add "."
on dump too. It's convinient, because we needed to add dot each time
when we want to access this mount point.
Before this patch we had to created a temporary copy.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-17 12:05:58 +04:00
Andrey Vagin
bed13a58ec proc_parse: parse mnt_id from /proc/PID/fdinfo/FD
It will be used for restoring files from proper mounts.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-09 16:43:50 +04:00
Andrey Vagin
8bcffef6b9 proc_parse: parse fdinfo to get pos and flags
We are going to parse fdinfo for getting mnt_id,
so we can take there pos and flags and don't call
fcntl and lseek for that.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-09 16:42:53 +04:00
Pavel Emelyanov
fd41201975 restore: Parse /proc/self/maps for self mappings
On restore we only need to know currnet task mappings' start and end
to find where to put the restorer blob. And since the smaps file in
/proc/pid is up to 3 times slower, than the maps one, it makes
perfect sense just to parse the latter one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-07 13:32:21 +04:00
Pavel Emelyanov
c18c733b7c proc-parse: Parse pid's fdinfo entries
The existing code opens "self" and parses what's in there,
just twist the code a little to accept generic pid.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-01-30 16:20:15 +04:00
Pavel Emelyanov
c18b30d0a9 mount: Restore external bind-mounts with plugins
All the entries with with_plugin set will be mounted by plugin.
The interesting case is when we do the pivot-root restore. In this
case we call restore callback very early (before we unmount the old
tree) and ask it to create the mountpoint at temporary location.
Later we move the mount to proper place.

The old_root argument of the callback is where it can find files
in the original mount namespace.

The is_file is return-argument. Sine files and directories cannot be
bind-mounted to each-other, the callback should create the mountpoint
itself and report whether it created file or directory.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 11:07:41 +04:00
Pavel Emelyanov
d21ff39aab mount: Dump external bind-mounts with plugins
External bind mounts are those with source sitting outside of the
current FS view. Such are detected in validate_mounts(), so we
just go ahead and call plugins.

The plugin is provided with the mountpoint to decide whether it's
his or not (what else does the guy need?) and an ID with this it
can identify the mountpoint in /proc. The same ID will be used at
restore time to find the needed restore info.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-26 11:07:39 +04:00
Cyrill Gorcunov
e9f9fdb9b3 headers: Drop uintX_t usage
We have a mess of uintX_t and uX usage. Drop off uintX_t ones.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-12 10:03:07 +04:00
Pavel Emelyanov
3708ecb499 mount: Introduce generic FSs parsing callback
And make use of it in for btrfs.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2013-12-04 20:03:17 +04:00
Pavel Emelyanov
b6e2dfd2de mount: Prepare fstypes to contain more unsupported FSs
We will need to parse btrfs stuff, but this one is not
in the supported list yet (as it's bound to hardware).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2013-12-04 20:03:15 +04:00
Cyrill Gorcunov
232e3a28a7 proc_parse: Introduce @private member into mount_info structure
It will hold auxiliary data associated with mount point. We
will use it for btrfs handling but in future it can be extended.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-04 18:18:22 +04:00
Cyrill Gorcunov
eeb21b8a34 proc_parse: Remember a type of FS provided by a kernel
We will need it for btrfs handling. Also print out the
FS type for easier debug

 | (00.003545)     type unsupported (cgroup) source cgroup 1c / @ /sys/fs/cgroup/blkio flags 30000e options blkio,
 | (00.003558)     type unsupported (cgroup) source cgroup 1d / @ /sys/fs/cgroup/perf_event flags 30000e options perf_event,
 | (00.003571)     type unsupported (cgroup) source cgroup 1e / @ /sys/fs/cgroup/hugetlb flags 30000e options hugetlb,
 | (00.003584)     type unsupported (ext4) source /dev/sda2 800002 / @ / flags 300000 options data=ordered,
 | (00.003670)     type tmpfs (tmpfs) source tmpfs 20 / @ /tmp flags 100000 options
 | (00.003696)     type unsupported (mqueue) source mqueue d / @ /dev/mqueue flags 300000 options

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-12-04 18:18:21 +04:00
Andrey Vagin
a6edbcf669 crtools: don't include restorer.h in proc_parse.h
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:37:55 +04:00
Andrey Vagin
3e5ad587f4 parse_proc: move parse_threads from cr-dump.c
It will be used in cr-restore.c for stopping threads on the exit from
sigreturn.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:03 +04:00
Andrey Vagin
02d8a72bb5 mounts: find mounts, which are propagated from a current one (v2)
A few sentences, which are required for understanging this patch

2a) A shared mount can be replicated to as many mountpoints and all the
replicas continue to be exactly same.
2b) A slave mount is like a shared mount except that mount and umount
events only propagate towards it.
2c) A private mount does not forward or receive propagation.

All rules is there Documentation/filesystems/sharedsubtree.txt

If it's a first mount in a group, all group members should be
bind-mounted from this one.

Each mount propagates to all members of parent's group. The group can
contains a few slaves.

Mounts, which have propagated to slaves, are unmounted, because we can't
be sure, that they propagated in real life. For example:

mount --bind --make-slave /share /slave1
mount --bind --make-slave /share /slave2
mount /share/test
umount /slave2/test
mount --make-share /slave1/test
mount --bind --make-share /slave1/test /slave2/test

41 40 0:33 / /share rw,relatime shared:28 - tmpfs xxx rw
42 40 0:33 / /slave1 rw,relatime master:28 - tmpfs xxx rw
43 40 0:33 / /slave2 rw,relatime master:28 - tmpfs xxx rw
44 41 0:34 / /share/test rw,relatime shared:29 - tmpfs xxx rw
46 42 0:34 / /slave1/test rw,relatime shared:30 master:29 - tmpfs xxx rw
45 43 0:34 / /slave2/test rw,relatime shared:30 master:29 - tmpfs xxx rw

/slave1/test and /slave2/test depend on each other and minimum one of them
doesn't propagate from /share/test

v2: use false and true for bool

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 15:01:30 +04:00
Andrey Vagin
e01072c05a mounts: if a mount can't be mounted, it is queued in postpone list (v4)
Try to restore mounts while a postpone list isn't empty and check
that each iteration has some progress, otherwice it will fails for
preventing infinite loops

v2: rework logic about postpone list
    add more comments

v3: one more attempt to make it more readable
v4: Here is a master class from Pavel how to write self-documented code.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 15:01:28 +04:00
Andrey Vagin
d00e7c6f88 mount: link dependent mounts (v3)
All shared mounts from one group are connected to circular list.
All slave are added into the proper master list.

v2: change variable name and fix a bug about adding shared mounts in a
circular list.
v3: handle errors of collect_shared

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-08-30 15:01:26 +04:00
Pavel Tikhomirov
d992960fa7 posix-timer: Parse proc /proc/<pid>/timers and save info in list
Signed-off-by: Pavel Tikhomirov <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-07-01 16:19:26 +04:00
Andrey Vagin
6a49f82fb6 mount: free all parts of mnt entries
mnt_entry contains a few strings and they should be release too

CID 996198 (#4 of 4): Resource leak (RESOURCE_LEAK)
20. leaked_storage: Variable "pm" going out of scope leaks the storage
it points to.

CID 996190 (#1 of 1): Resource leak (RESOURCE_LEAK)
13. leaked_storage: Variable "new" going out of scope leaks the storage
it points to.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-04-10 01:26:59 +04:00
Pavel Emelyanov
b71f9e80be vma: Introduce list-of-vmas object
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-03-01 20:11:51 +04:00
Cyrill Gorcunov
fcb9a9bfb1 cpu: Make cpu routines being per-acrh
They are really depends on CPU we're running on.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-18 18:42:08 +04:00
Cyrill Gorcunov
fdfef4b485 headers: Change "../protobuf/" to "protobuf/"
No need to walk up the directories if we need
to include protobuf file. This was always a bad
use of ability to walk the filesystem from other
headers.

Same time we don't need -I$(SRC_DIR)/protobuf/
in general makefile anymore.

[xemul: Small fixlet in head Makefile, since patch
 it out-of-order]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-15 17:33:06 +04:00
Qiang Huang
286801d4c1 crtools: collect and check file locks
We collect all file locks to a golbal list, so we can use them easily
in dump_one_task. For optimizaton, we only collect file locks hold by
tasks in the pstree.

Thanks to the ptrace-seize machanism, we can aviod the blocked file lock
issue, makes the work simpler.

Right now, the check handles only one situation:
-- Dumping tasks with file locks hold without the -l option.

This covers for the most part. But we still need some more work to make
it perfect robust in the future.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-17 14:42:07 +04:00
Cyrill Gorcunov
d5927a47f1 proc-parse: Add parsing of fanotify objects
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 18:34:49 +04:00
Cyrill Gorcunov
b724096f0f fsnotify: Rename inotify files to fsnotify
We will be handling both inotify and fanotify
objects here thus to make less confusion rename
the files to fsnotify.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 18:34:26 +04:00
Alexander Kartashov
6f61488f21 x86: moved x86-specific files into the directory arch/x86.
* The following files goes into the directory arch/x86/include/asm unmodified:
  - include/atomic.h,
  - include/linkage.h,
  - include/memcpy_64.h,
  - include/types.h,
  - include/bitops.h,
  - pie/parasite-head-x86-64.S,
  - include/processor-flags.h,
  - include/syscall-x86-64.def.

* Changed include directives in the source files that include the headers
  listed above.

* Modified build scripts to reflect the source moves.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-09 17:02:47 +04:00
Cyrill Gorcunov
830d92b0f0 headers: Unify include guards (in comments) and a few fixes
- fix names in comments
 - add empty lines where needed
 - fix rbtree.h
 - fix syscall-types.h

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-25 22:40:24 +04:00
Pavel Emelyanov
b4c2160449 hdrs: Fixup reinclusion preprocessor constants
Make them look like __CR_<smth>_H__ with

sed -e '1,2s/#\(ifndef\|define\) _\?_\?\(CR_\)\?/#\1 __CR_/' -e '1,2s/_H_\?_\?.*$/_H__/'

on every header file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-24 15:36:14 +04:00
Cyrill Gorcunov
cf2e6c54c9 cpu: Add code to fetch/test cpuinfo data
This patch add ability to test /proc/cpuinfo data
we're interested in at the moment.

The code provides the following functionality

 - cpu_init, to parse cpuinfo and check if the
   host cpu we're running on is suitable enough
   for FPU checkpoint/restore. If FPU present then
   there must be at least fxsave capability present

 - cpu_set_feature/cpu_has_feature helpers which
   provides to test certain bits and set them where
   needed (we need to set bits when parse cpuinfo)

Note, we reserve space for all cpuinfo bits known
by the kernel at moment, while use only three FPU
related bits for a while. This is done because we might
need to use or find out other features in future.

After all it's just 40 bytes of memory needed to keep
all possible bits.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-21 17:35:34 +04:00
Andrey Vagin
6949a848d3 mount: Add abstraction layer for dumping file systems (v2)
We need to dump content of some fs like binfmt_misc, tmpfs, ... To facilitate
this the existing list of filesystems is turned into an array of structures
with dump and restore callbacks. Each FS may declare them they need.

v2: rework encode/decode_fstype not to do it twice.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 16:27:30 +04:00
Pavel Emelyanov
8097a8dc09 signalfd: Add proc fdinfo parsing facility
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-02 12:25:18 +04:00
Pavel Emelyanov
ffd40996ea pb: Switch creds to protobuf format
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-19 12:35:25 +04:00
Cyrill Gorcunov
28638b611c protobuf: Convert inotify data to protobuf engine
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-17 07:25:42 +04:00
Cyrill Gorcunov
fa923ee14e protobuf: Convert eventpoll data to protobuf engine
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-17 07:25:40 +04:00
Cyrill Gorcunov
ca21674573 protobuf: Convert eventfd data to protobuf engine
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-17 07:24:54 +04:00
Pavel Emelyanov
feb6624ddf inotify: Wire into and use Generic fdinfo parsing engine
With this the code looks clearer and finally a ground for "check" code is prepared.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-07-11 09:35:36 +04:00