Based on work done by Cyrill Corcunov (many thanks for that).
In this commit we implement c/r for files which have opened
/proc/$pid/ns/$ids entries.
The idea is rather simple one
Checkpoint
==========
- Check if the file name is the one of known to be ns ref
- If match then write protobuf entry
Restore
=======
- Read all ns entries from the image
- When criu tries to open one we lookup over process
tree to figure out which PID should be used in path
and then just open it
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To be able to dump procfs namespace entries we will need to
analyze the link name first and then figure out if the file
referred indeed a procfs entry or it's plain regular file.
This means we read link early, and to escape double reading
of same link, once we read it we remember it in fd_parms
structure which we pass to dump_one_reg_file.
Still the dump_one_reg_file is not solely used from inside
of dump_one_file, but from a number of other places (fifo,
special files) where fd_parms is filled only partially
and fill_fd_params is not even called. Thus dump_one_reg_file
must be ready for such scenario and read the link name by own
if the caller has not provided one.
To achieve all this we do
- extend fd_parms structure to have a reference on a new
fd_link structure, thus if caller already read the link
name it might assign the reference and call for dump_one_reg_file
- tune up dump_one_reg_file to fill own copy of fd_link
structure if the caller has not provied one
[ xemul: Added const to fill_fdlink arg ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
struct fd_parms is reused for two calls, so don't
forget to initialize it before dump_one_reg_file
call.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
No need to walk up the directories if we need
to include protobuf file. This was always a bad
use of ability to walk the filesystem from other
headers.
Same time we don't need -I$(SRC_DIR)/protobuf/
in general makefile anymore.
[xemul: Small fixlet in head Makefile, since patch
it out-of-order]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A few processes can share one fd table. Each process has own set of
service file descriptors and a process knows nothing about servic fds
of another processes. So if two process share one fd table,
close_old_fds will close servic descriptors of another process.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently fdinfo dumps for each task, so CR_FD_FDINFO is in cr_fdset.
A few tasks can share one fd table and the set of descriptors will be
dumped once and a image name will contain files_id instead of pid.
In this case CR_FD_FDINFO will go away from cr_fdset.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The following files goes into the directory arch/x86/include/asm unmodified:
- include/atomic.h,
- include/linkage.h,
- include/memcpy_64.h,
- include/types.h,
- include/bitops.h,
- pie/parasite-head-x86-64.S,
- include/processor-flags.h,
- include/syscall-x86-64.def.
* Changed include directives in the source files that include the headers
listed above.
* Modified build scripts to reflect the source moves.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Make them look like __CR_<smth>_H__ with
sed -e '1,2s/#\(ifndef\|define\) _\?_\?\(CR_\)\?/#\1 __CR_/' -e '1,2s/_H_\?_\?.*$/_H__/'
on every header file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some file-type specific parameters can be fetched with
parasite code only, so lets carry parasite control block
pointer in struct fd_parms.
This is a bit ugly but requires less code to touch and
enough for now. In long terms we need some more generalized
routine/hooks which would depends on file type.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
By default crtools shouldn't modify the environment, except for
killing the dumped tasks. The link remap does so and should sit
under explicit cmdline option.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These are not ghost, as they are still on fs, so we cannot take
them with us in the image. Neither we can easily find the other name
of that file. Sad :(
To make it work we linkat() the new name to that file using the
AT_EMPTY_PATH flag to link directly to the opened fd. If we could
openat() the fd's parent we would better do it, but we can't and
thus have to create the link name by explicit absolute path :(
This modifies the fs we're dumping, so I'll introduce one more cmd
line option for that soon.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There are places when we have to select which fd to ->open
and which to ->receive. To avoid deadlocks we sort them in
an ascending order on { pid, fd } pair.
Make this idea more formal by introducing an explicit function
doing this check and call it where appropriate (pipe.c master
selection is also simplified to fit new ... API).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Don't do explicit switch by file type in files.c and don't export
intimate knowledge of pty being master/slave. Use a file desc op
for that.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The idea behind is pretty simple -- once we find
that there is a controlling terminal present we
do call ioctl on appropriate /dev/pts/N.
This is done in a bit unusuall manner. When we
find that there is a controling terminal present
we do create an additional FdinfoEntry for it
with object id taken from existing master peer.
The file engine stack this new FdinfoEntry on
fd_info_head head list. Thus we will have at
least two entries on this list. One for real
Fdinfo associated with master peer and one for
our new generated Fdfinfo entry, it depends on
pid which one become a file master.
Finally we do use post_open_fd hook in our
tty code which allows us to open controlling
terminal and yield proper ioctl on it.
v2:
- restore control terminals via service fd,
still need to speedup service fd retrieval.
v3:
- use prepare_ctl_tty() helper to generate
control terminal fdinfo entry
v4:
- use post_open_fd
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for restoring epollfd.
Currently a transport fd may be added to epollfd.
epollfd should be populated, when all descriptors were already received.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for restoring inet sockets. An inet socket is created with
the option REUSEADDR, because the restore logic requires this.
The origin value can be restored only when all sockets were restored.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's unclear why we've had two different names
for same argument. Moreover, "tsk" is definitely
misnamed feature.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It doesn't makemuch sense in pulling this further. The generic genid generation seems to
be enough for eny file type.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2:
- Use regular uint types in message proto
- Use PB engine for "show"
v3:
- drop usage of temp. variable in prepare_shmem_pid
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are ready to use FownEntry everywhere,
so drop fown_t type and clean up source code.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To not spread opencoded copying of PB entries and image structures
we add a few pb_ helpers to be used until PB code completely merged.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A short story -- there were a long conversation on which format should
be used to keep checkpointed data on disk image. We ended up in using
Google's Protocol Buffers (see https://developers.google.com/protocol-buffers/
for detailed description). Thus image entries should be convered to PB.
This patch converts fdinfo_entry to PB "message fdinfo_entry".
Build note: one should have protobuf and protobuf-c installed to be able
to build crtools.
- http://code.google.com/p/protobuf/
- http://code.google.com/p/protobuf-c/
Inspired-by: Pavel Emelianov <xemul@parallels.com>
Inspired-by: Kinsbursky Stanislav <skinsbursky@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need to use pointer here (to PB object) anyway
so better to make it in a separate patch for bisectability sake.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
- add argument names
- align members
- add extern kw
- group struct decl. on top
- make file_desc_ops::type being unsigned
integer similar to fdtype_ops::type
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These are declared in files-reg.h, so get rid of
them and add files-reg.h inclusion where needed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In the open_fdinfo we need to get a file_desc associated with the given
fdinfo_list_entry. This is done by searching the hash of descs, but this
can be speeded up by saving the desc pointer on the fdinfo at the time
of collecting them.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's only required for making pipe_id, but making it is better to
be done in place and using the st_ino only.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This saves some space on dynamic file_desc and makes file_desc_ops
looks closer to fdinfo_ops used on dump.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
No functional changes, just a code move.
TODO:
* need to write personal make_gen_id implementations for special
files (device and pos is const for them effectively)
* need to merge dump ops with restore ones (file_desc_ops)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is common, that opened fd fix its fowner and flags. Make
a cuntion for this. Those that obtain fd with open() don't need
to restore flags though.
A thought -- do we need yet another abstraction between fdinfo and
type-d files?
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Don't re-read fdinfo image 4 times on restore, just use those collected
on me pstree_entry instance.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Completely unlinked file is the one with n_link count being zero.
Such files only allow to read their contents and carry with us.
In order to dump this thing I introduce the "path remap" technology.
For reg file a remapping entry is dumped which describes, that at
restore stage before opening a regfile->path this path should be
linked to some other name and then (after open) unlinked.
For completely unlinked files the remap path would be a path to
a "ghost" file, i.e. a file which is created only at the time of
restore and which is removed completely at the end of it.
Partially unlinked files (i.e. those having n_link != 0, but a
path by which we see them in someone's fd is not accessible) should
be handled in another way.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The base idea is trivial, once file descriptor
created the owner is read and set up.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For regfiles this is done at open() time, for pipes thit is done with fcntl. Use
the same fcntl approach for sockets.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>