2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-08-29 13:28:27 +00:00

176 Commits

Author SHA1 Message Date
Pavel Emelyanov
8801f59615 mem: Protobuf format for page dumps
Since now we drain pages out of parasite, we can invent any format for
page dumps. Let is be ... prorobuf one! :)

Another thing to keep in mind, is that we're about to use splices and
implement iterative migration, so it's better to have actual pages be
page-aligned in the image.

And -- backward compatibility. That said the new format is:

1. pagemap-... file which contains a header (currently with a ID of
   the image with pages, see below) and an array of <nr_pages:vaddr>
   pairs. The first value means "how many pages to take from the
   file with pages (see below)" and the second -- where in the task
   address space to put them. Simple.

2. pages-... file which containes only pages one by one (thus aligned
   as we want).

This patch breaks backward compatibility (old images with pages wil
be restored and then crash). Need to do it before v0.5 release.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-03-14 18:16:41 +04:00
Pavel Emelyanov
4d0b24b52a vma: Keep track of lonest vma in list and sum of its lengths
I will have to push some sort of map of pages to dump into parasite.
For this, I need to have estimation of how much memory I'd need for
than in parasite args. These two values will help with it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-03-01 20:12:33 +04:00
Pavel Emelyanov
8fdc707084 vma: Helper, that checks whether vma is dumped via parasite
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-03-01 20:12:02 +04:00
Pavel Emelyanov
b71f9e80be vma: Introduce list-of-vmas object
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-03-01 20:11:51 +04:00
Cyrill Gorcunov
fdfef4b485 headers: Change "../protobuf/" to "protobuf/"
No need to walk up the directories if we need
to include protobuf file. This was always a bad
use of ability to walk the filesystem from other
headers.

Same time we don't need -I$(SRC_DIR)/protobuf/
in general makefile anymore.

[xemul: Small fixlet in head Makefile, since patch
 it out-of-order]

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-15 17:33:06 +04:00
Cyrill Gorcunov
90fbbabb34 make: Generate crtools version from Makefile definition
This allows us to have a unique place where version lives
and what is more important other subsystems (such as docs
generation) may reuse version definitions.

At moment EXTRA (which corresponds kernels -rc tag) and
NAME is not yet used, but I desided to put them in place
for future needs.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-02-11 19:24:06 +04:00
Pavel Emelyanov
475bb1e775 rst: Evaluate per-task clone mask early
When we've read all pstree-items and their ids we
can get the desired clone-flags early and avoid all
these dances with flag calculations in fork_with_pid
and company.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-19 01:16:19 +04:00
Pavel Emelyanov
ac845bd1d8 cr: Obsolete the --namespaces option
It's no longer required to use this option -- two currently
supported cases (tasks on host and tasks in containers) can
be detected automatically. Keep this option for future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-18 13:25:16 +04:00
Qiang Huang
de54cd4385 crtools: add image contents for filelocks
Some contents needed by filelock's image.

Originally-signed-off-by: Zheng Gu <cengku.gu@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-17 14:40:00 +04:00
Qiang Huang
0de267ccdc crtools: add a callback to show filelocks-%d.img
Add a callback for filelock's image show.

Originally-signed-off-by: Zheng Gu <cengku.gu@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-17 14:38:40 +04:00
Qiang Huang
8f96ff85ff crtools: Add --file-locks option
Add --file-locks/-l option to support handling file locks, for safety,
only used for container.

Originally-signed-off-by: Zheng Gu <cengku.gu@huawei.com>
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-17 14:36:47 +04:00
Cyrill Gorcunov
f385d32da5 image: fanotify -- Add scaffold code for fanotify objects
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-15 18:34:32 +04:00
Andrey Vagin
979eb2a179 restore: restore pocesses which share one fdtable (v5)
Currenly crtools supports a case when a child shared a fd table
with parent.

Here is only two interesting things.
* Service descriptors should be cloned for each process
  who shared one fd table.
* One task should restore files and other tasks should sleep in this
* time.

v2: * allocate fdt_lock from shared memory
    * don't wait a child, if it doesn't share fdtable
v3: * don't move ids on the pstree image
v4: * save ids in a separate image
    * save fdinfo per id instead of pid
v5: fix alignment of service_fd_id

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-12 00:54:46 +04:00
Andrey Vagin
d50c786c7e files: dump fdinfo per files_id instead of pid
A few processes can share one fdtable.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-12 00:54:44 +04:00
Andrey Vagin
3c8bd5ccd1 pstree: allocate fdt shared data
fdt shared data contains PID of process, which will restore file
descriptors and a futex for synchronization.

A process with mimimal pid restores file descriptors.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-12 00:54:41 +04:00
Andrey Vagin
20b5ed916f crtools: put kobj-ids into separate image file (v2)
It is read together with pstree items for checking what kind of
resources should be shared. Core is too big for reading it in
this place.

v2: fix check_core

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-12 00:54:39 +04:00
Andrey Vagin
4ea6214391 crtools: add ability to create and close a service fd (v3)
A service fd should be created, otherwise get_service_fd returns -1.

This patch removes this functionality from other subsystems and
allows to clone service descriptors.

v2: rename open_service_fd to install_service_fd
v3: two patches were merged for bisecting

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-12 00:54:37 +04:00
Andrey Vagin
e6106956af util: clone service descriptors, if fd tables are shared for tasks (v3)
It looks like a namespace for service descriptors.
It will be used for restoring tasks with shared fd tables.
Service descriptors should be own for each process.

v2: clone_service_fd doesn't know about sub-systems like log, proc, etc
v3: Don't try to find a free name-space.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-12 00:54:36 +04:00
Pavel Emelyanov
748be83181 cr: Support rlimits
Dump the with "new" prlimit syscall that works on arbitrary pid.

Restore is done in restorer _after_ mappings mixup and _before_
caps drop to make it set any max value.

The RLIM_INFINITY is handled explicitly to help future 64<->32
bits migration.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-10 20:08:38 +04:00
Alexander Kartashov
6f61488f21 x86: moved x86-specific files into the directory arch/x86.
* The following files goes into the directory arch/x86/include/asm unmodified:
  - include/atomic.h,
  - include/linkage.h,
  - include/memcpy_64.h,
  - include/types.h,
  - include/bitops.h,
  - pie/parasite-head-x86-64.S,
  - include/processor-flags.h,
  - include/syscall-x86-64.def.

* Changed include directives in the source files that include the headers
  listed above.

* Modified build scripts to reflect the source moves.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-01-09 17:02:47 +04:00
Cyrill Gorcunov
830d92b0f0 headers: Unify include guards (in comments) and a few fixes
- fix names in comments
 - add empty lines where needed
 - fix rbtree.h
 - fix syscall-types.h

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-25 22:40:24 +04:00
Pavel Emelyanov
b4c2160449 hdrs: Fixup reinclusion preprocessor constants
Make them look like __CR_<smth>_H__ with

sed -e '1,2s/#\(ifndef\|define\) _\?_\?\(CR_\)\?/#\1 __CR_/' -e '1,2s/_H_\?_\?.*$/_H__/'

on every header file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-24 15:36:14 +04:00
Pavel Emelyanov
1e538cf02b dump: Drop unused image file perm change before sending it to parasite
Since we send _fd_ it's pointless to call chmod on image file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-18 21:01:53 +03:00
Pavel Emelyanov
4bb4ece76e exec: Initial skeleton
Reserve the cmdline option for this and link empty file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-17 22:56:06 +03:00
Pavel Emelyanov
9428f3a9a2 dump: Make collect_mappings non static
Will be used by other code.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-17 22:50:06 +03:00
Pavel Emelyanov
d721e9d48c criu: Version 0.3 release
This is mostly bugfix and improvements release.
Nonetheless, some new features exists, the most interesting are:

  * proper COW mappings handling
  * full packet sockets support, thus supporting the tcpdump tool
  * the --shell-job option, which makes it possible to dump apps
    launched from one shell and restore them in another

Some features are available with the custome kernel, but the good
news is that now _all_ of the patches we need are in one of the
-next trees or in the -mm one, and thus have good chances to get
merged in 3.8 (or soon after it).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-11 21:22:55 +04:00
Pavel Emelyanov
56e1decf48 show: Factor out skipping of data dump into show_image_data
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-10 18:57:54 +03:00
Pavel Emelyanov
da8dbe43bc rst: Move premmaped* variables on rst_info
This thing on pstre_item was created to carry task-specific
information across the "restore" code-flow.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-12-06 13:19:01 +03:00
Andrey Vagin
5d712b7430 cr-restore: remove unshared pages from inherited private mappings (v2)
A parent process can change a few pages after forking a child and
all this pages should not be avaliable from the child.

Each vma has a bitmap of existent pages. Parent's and child's bitmaps
can be compared and all pages which are not present in a child bitmap
are dropped.

v2: don't check page_bitmap on NULL

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-20 20:48:33 +04:00
Pavel Emelyanov
d4735a22fa packet: Support mmap-ing of packet sockets
Three parts.

Proc: open of map_files' link doesn't work on sockets. We fstatat
it and check that it's a socket (it will be packet), then save
the socket inode on vma_area.

Dump: we resolve socket inode to socket id and save it on vma.
We use id, not inode, since on restore we'll have to mmap some
opened file, not just abstract socket with inode.

Restore: when reading vma-s we just need to find out on what fd
the respective packet socket is opened (i.e. -- no map-and-close
sockets supported by now) and dup() it to let restorer mmap it
back.

All this make it possible to c/r the tcpdump tool!

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-11-02 16:00:18 +03:00
Cyrill Gorcunov
9ba5c935f1 options: Add "--shell-job" option
This option will tells the tool to procceed dumping
even if a root task is not a session leader.

This implies that this option will allow to "migrate"
one external tty connection. Say a person may dump
"top" application in one bash shell and restore it
in another shell session.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-18 15:51:57 +04:00
Cyrill Gorcunov
9c579cfd02 sfd_type: Add SELF_STDIN_OFF service fd and call helpers where needed
We will need it for slave ttys migration. They serve for one purpose --
to clone self stdio descriptor and use it with tty layer, which will
be addressed in further patches.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-18 15:51:56 +04:00
Cyrill Gorcunov
e0be540401 pstree: Move struct pid to pstree.h
I believe this make sense to keep this structure
in pstree.h where pstree related data lives.

Also I've added some comments on struct pid members.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-10-08 18:59:36 +04:00
Andrey Vagin
d33d2290bd files: rework a function for closing all descriptors (v2)
It reads /proc/PID/fd and close all descriptors except service fds.

v2: s/is_one_of_service_fds/is_any_service_fd

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-29 11:01:53 +04:00
Pavel Emelyanov
1be08acc7c remap: Add cmdline option to allow linked remap
By default crtools shouldn't modify the environment, except for
killing the dumped tasks. The link remap does so and should sit
under explicit cmdline option.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-26 06:39:23 +04:00
Pavel Emelyanov
d81c9a4618 criu: Version 0.2 release
The biggest acheivement since v0.1 -- initial support for LXC containers!

Other less notable (but still great) things done are:

* Implemented support for TTY-s
* Added support for packet sockets
* Bug-fix here and there

Note, that images generated by v0.1 tool are accepted by v0.2 one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-20 13:44:18 +04:00
Andrey Vagin
4c2a226458 crtools: add ability to execute external scripts (v2)
Scripts are executed when external actions required.
CRTOOLS_SCRIPT_ACTION contains a required action.
If a script doesn't know a current action, it should exit with 0.

The first usecase will be lock/unlock network.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-17 20:05:23 +04:00
Cyrill Gorcunov
45cc85eea4 Speed up service-fd retrieval
We're using get_service_fd in file engine,
better to make it fast. This patch caches
the limits system provides us, instead of
calling getrlimit() every time.

This patch introduces is_service_fd helper
which will be used instead of get_service_fd
where it make sense.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-13 14:45:06 +04:00
Cyrill Gorcunov
20d6762d93 tty: Add restoration of controlling terminal v4
The idea behind is pretty simple -- once we find
that there is a controlling terminal present we
do call ioctl on appropriate /dev/pts/N.

This is done in a bit unusuall manner. When we
find that there is a controling terminal present
we do create an additional FdinfoEntry for it
with object id taken from existing master peer.

The file engine stack this new FdinfoEntry on
fd_info_head head list. Thus we will have at
least two entries on this list. One for real
Fdinfo associated with master peer and one for
our new generated Fdfinfo entry, it depends on
pid which one become a file master.

Finally we do use post_open_fd hook in our
tty code which allows us to open controlling
terminal and yield proper ioctl on it.

v2:
 - restore control terminals via service fd,
   still need to speedup service fd retrieval.

v3:
 - use prepare_ctl_tty() helper to generate
   control terminal fdinfo entry

v4:
 - use post_open_fd

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-12 20:00:58 +04:00
Cyrill Gorcunov
a365befaea tty: Add show method
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-12 20:00:55 +04:00
Cyrill Gorcunov
89a7a45d37 tty: Add checkpoint/restore for unix terminals v6
Usually the PTYs represent a pair of links -- master peer and slave
peer. Master peer must be opened before slave. Internally, when kernel
creates master peer it also generates a slave interface in a form of
/dev/pts/N, where N is that named pty "index". Master/slave connection
unambiguously identified by this index.

Still, one master can carry multiple slaves -- for example a user opens
one master via /dev/ptmx and appropriate /dev/pts/N in sequence.
The result will be the following

master
`- slave 1
`- slave 2

both slave will have same master index but different file descriptors.
Still inside the kernel pty parameters are same for both slaves. Thus
only one slave parameters should be restored, there is no need to carry
all parameters for every slave peer we've found.

Not yet addressed problems:

- At moment of restore the master peer might be already closed for
  any reason so to resolve such problem we need to open a fake master
  peer with proper index and hook a slave on it, then we close
  master peer.

- Need to figure out how to deal with ttys which have some
  data in buffers not yet flushed, at moment this data will
  be simply lost during c/r

- Need to restore control terminals

- Need to fetch tty flags such as exclusive/packet-mode,
  this can't be done without kernel patching

[ avagin@:
   - ideas on contol terminals restore
   - overall code redesign and simplification
]

v4:
 - drop redundant pid from dump_chrdev
 - make sure optional fown is passed on regular ptys
 - add a comments about zeroifying termios
 - get rid of redundant empty line in files.c

v5 (by avagin@):
 - complete rework of tty image format, now we have
   two files -- tty.img and tty-info.img. The idea
   behind to reduce data being stored.

v6 (by xemul@):
 - packet mode should be set to true in image,
   until properly fetched from the kernel
 - verify image data on retrieval

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-12 20:00:54 +04:00
Cyrill Gorcunov
5340807b87 Add separate list for slave tty fds on restore v2
When we will be restoring ttys we need that restore
procedure for master peers will be yielded earlier
than for slave peers due to ttys specifics. With this
commit we introduce @tty_slaves list which will allow
us to order tty file restore procesure.

Because we need to fetch which list to be used depending
on tty type this patch extend select_ps_list with fdinfo_list_entry
parameter.

v2 (by xemul@):
 - make sure the epoll list is still last

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-12 20:00:52 +04:00
Andrey Vagin
9508e39e9b crtools: use pit_t for PIDs
Here is a bit mess, because we used unsigned int instead of pid_t.
A negative value is used for uninitialized PID's variables.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-07 19:16:35 +04:00
Andrey Vagin
ae70bc0ad6 net: add ability to set names for outside links of veth devices
When restoring a container crtools create veth pair inside it and then
pushed one end to the namespaces crtools live in (outside). To facilitate
the subsequent management of the otter end of the veth pair this option
is added -- one can specifu a name by which the respective end would be
visible. E.g.: --veth-pair eth0=veth101.0

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-02 01:07:32 +04:00
Andrey Vagin
9ec01ff307 log: don't create a log file in a current directory
We can set a directory for log and image files.
crtools sets it as a current directory and then creates all files in it.
It works before we don't decide to change a mount name space.

I suggest to open a log dir and create files for help openat.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-09-02 01:02:30 +04:00
Andrey Vagin
aabb56bd66 crtools: write a pid of a root task in a specified file
When we restore a pid namespace the root task will get some unknown pid
in the original (i.e. -- the ns crtools a launched from) one. To find
this pid out one can use this option -- it will make the pid obtained by
the new init to be written into a pid file.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-14 12:54:00 +04:00
Andrey Vagin
58e22f31fc mount: Add support of tmpfs (v3)
When dumping a tmpfs mount we need to take its contents with us.
So, use tar for it and put it into the image dir.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 19:51:22 +04:00
Pavel Emelyanov
fc7071d05e net: Packet sockets basic support
Support only basic packet socket functionality -- create and bind.
This should be enough to start testing dhclient inside container.
Other stuff (filter, mmaps, fanouts, etc.) will come later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-09 16:17:41 +04:00
Andrey Vagin
ba0d5fb226 dump: handle overmounted devices (v2)
"init" in LXC opens /dev/null and then mounts devtmpfs in /dev,
so crtools can not resolve the path to the origin /dev/null.

crtools with the option --evasive-devices will check the origin
device and a new device are the same and if it's true, crtools will
dump a new path.

v2: add a description for the option

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-07 19:33:49 +04:00
Andrey Vagin
420325dca6 restore: add an option for changing a root file system (v2)
The option is -r|--pivot-root and an argument is a path to new root.
A root task will make pivot_root. LXC CT does that, so we need that
for restoring.

v2: s/pivot-root/root/

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2012-08-02 16:07:43 +04:00