This is required to use criu swrk in libcontainer.
v2: remove useless function declaration
allow to set inherit_fd only for swrk
v3: check swrk out of loop
Cc: Saied Kazemi <saied@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If a process is executed in another pidns, a /proc/PID doesn't link with
the proper process.
This patch fixes a problem like this:
1: Error (util.c:106): Unable to close fd 33: Bad file descriptor
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have two places where we lookup the inherited-fd list
by name and dup() the descriptor found. I propose to factor
out this piece in a single inherited_fd() call. When
we will want to support inheritance for sockets or any
other files we'll simply add the inherited_fd() call
there.
I'm also thinking about moving the call to inherited_fd
into generic level, but the open_path() routine doesn't
allow to do it in a simple manner.
Also we have not yet finished issue with files-vs-inodes
mapping. Keeping all the logic in one function should
make the solution simpler.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Add comment to inherit_fd_reused() explaining how an inherit fd may be
closed or reused outside the inherit fd logic.
Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There are cases where a process's file descriptor cannot be restored
from the checkpoint images. For example, a pipe file descriptor with
one end in the checkpointed process and the other end in a separate
process (that was not part of the checkpointed process tree) cannot be
restored because after checkpoint the pipe will be broken.
There are also cases where the user wants to use a new file during
restore instead of the original file at checkpoint time. For example,
the user wants to change the log file of a process from /path/to/oldlog
to /path/to/newlog.
In these cases, criu's caller should set up a new file descriptor to be
inherited by the restored process and specify the file descriptor with the
--inherit-fd command line option. The argument of --inherit-fd has the
format fd[%d]:%s, where %d tells criu which of its own file descriptors
to use for restoring the file identified by %s.
As a debugging aid, if the argument has the format debug[%d]:%s, it tells
criu to write out the string after colon to the file descriptor %d. This
can be used, for example, as an easy way to leave a "restore marker"
in the output stream of the process.
It's important to note that inherit fd support breaks applications
that depend on the state of the file descriptor being inherited. So,
consider inherit fd only for specific use cases that you know for sure
won't break the application.
For examples please visit http://criu.org/Category:HOWTO.
v2: Added a check in send_fd_to_self() to avoid closing an inherit fd.
Also, as an extra measure of caution, added checks in the inherit fd
look up functions to make sure that the inherit fd hasn't been reused.
The patch also includes minor cosmetic changes.
Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To use it in tty code even when file
descriptor is not added into files chain.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will use this path in reg-files engine anyway
so simply switch to this ability now.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Instead of calling case() with majors all over the places lets
introduce own enum for tty types and use it instead.
Because we're using not @major numbers now but taking @minors
into account as well, this brings more strict check of which
kind of terminals we can dump now thus it's potentially should
fix the cases when we're trying to c/r terminals which we don't
understand yet (in particular /dev/console [5:1]).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
TASK_HELPERs are created with CLONE_FILES, so if we always close the cg yard
here, it will close it for the other helpers and cause problems. Instead, we
close it much later, in code only called by alive tasks, to ensure that there
is no conflict.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We want to have buffered images to speed up dump and,
slightly, restore. Right now we use plan file descriptors
to write and read images to/from. Making them buffered
cannot be gracefully done on plain fds, so introduce
a new class.
This will also help if (when?) we will want to do more
complex changes with images, e.g. store them all in one
file or send them directly to the network.
For now the cr_img just contains one int _fd variable.
This patch chages the prototype of open_image() to
return struct cr_img *, pb_(read|write)* to accept one
and fixes the compilation of the rest of the code :)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
There will be no int-fd soon, so one more preparation
to this fact.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Since we're going to switch from int-fd-s to class-image
soon the fdset name will not fit into the new terminology.
This patch is
sed -e 's/fdset/imgset/g' -i *
sed -e 's/imgset_fd/img_from_set/g' -i *
git mv include/fdset.h include/imgset.h
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
We have a bug. If someone opens proc with open_pid_proc or alike
with PROC_SELF of real PID before going to restore fds, then the
fd cached by proc helpers would be cached in fd 0 (we close all
fds beforehead) and it may clash with restored fds.
We don't hit this right now simply due to being too lucky -- we
call open_proc(PROC_GEN) on "locks" which first closes the cached
the per-pid descriptor and then reports back just the /proc one
which sits in service area.
But once we change this (next patch) things would get broken.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have a problem with file locks (bug #2512) -- the /proc/locks
file shows the ID of lock creator, not the owner. Thus, if the
creator died, but holder is still alive, criu fails to dump the
lock held by latter task.
The proposal is to find who _might_ hold the lock by checking
for dev:inode pairs on lock vs file descriptors being dumped.
If the creator of the lock is still alive, then he will take
the priority.
One thing to note about flocks -- these belong to file entries,
not to tasks. Thus, when we meet one, we should check whether
the flock is really held by task's FD by trying to set yet
another one. In case of success -- lock really belongs to fd
we dump, in case it doesn't trylock should fail.
At the very end -- walk the list of locks and dump them all at
once, which is possible by merge of per-task file-locks images
into one global one.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When cwd is removed (it can be) we need to collect the respective
file_desc before starting opening any files to properly handle
ghost refcounts. Otherwise we will miss one refcount from the
cwd's on ghost, which in turn will either BUG inside ghost removal,
or will fail the cwd due to the respective dir being removed too
early.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It uses absolute file names, so any open-s should happen _before_
we change tasks' root.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
If fchroot() succeeds the further failures don't get
noticed by caller.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
There's no such thing as fchroot() in Linux, but we need to do
chroot() into existing file descriptor. Before this patch we did
this by chroot()-ing into /proc/self/fd/$fd. W/o proc mounted it's
no longer possible, so do this like
fchdir(proc_service_fd);
chroot("./self/fd/$root_fd");
fchdir($cwd_fd);
Thanks to Andrey Vagin for this trick ;)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
One device can be mounted a few times, so files are identical only,
if they have the same mnt_id.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for restoring files from proper mounts.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to parse fdinfo for getting mnt_id,
so we can take there pos and flags and don't call
fcntl and lseek for that.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Using O_PATH known to be buggy on 3.11 kernel so use
direct link reading procedure here.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
During the time some files become obsolete and might be missing
in checkpoint image set, but to keep backward compatibility we
still trying to open them, which might print out error like
| Unable to open 'path-to-file'
and confuse a reader why criu prints error but continue working.
To eliminate this problem O_OPT flag has been introduced in
commit 16b5692061e2, which suppress error message priting
if the flag is set.
Now start using O_OPT in the following functions
- open_irmap_cache: irmap cache is relatively new optional feature
- prepare_rlimits, open_signal_image, restore_file_locks,
prepare_fd_pid, prepare_mm_pid, collect_image: all these
helpers are trying to open image files which can be missing.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For example opening the FIFO blocks until the other end is opened also.
We can use O_PATH to avoid sleeping in the open() call.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We do it first -- on collect, second -- on restore. The
2nd lookup is excessive, we can put fd pointer on vm_area
at lookup and reuse one later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For unlinked opened and mmaped files we'd need to
care about remaps, for this the callback with both
file_desc and fdinfo_list_entry will be required.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The is_foo_link readlinks the lfd to check. This makes
anon-inodes dumping readlink several times to find proper
dump ops. Optimize this thing.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will generate some info about file-descriptors at that
stage. For now these pre-dumped ones would be fsnotifies,
so the pre-dump of a single fd is written as simple as
possible, but enough for that type of FDs pre-dump.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need to special-care NFS silly-rename files, thus
we need to know which FS a file belongs to.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Here is nothing interecting. If a file can't be dumped by criu,
plugins are called. If one of plugins knows how to dump the file,
the file entry is marked as need_callback. On restore if we see
this mark, we execute plugins for restoring the file.
v2: Callbacks are called for all files, which are not supported by CRIU.
v3: Call plugins for a file instead of file descriptor. A few file
descriptors can be associated with one file.
v4: A file descriptor is opened in a callback. It's required for
restoring anon vmas.
v5: Add a separate type for unsupported files
v6: define FD_TYPES__UNSUPP
v7: s/unsupp/ext (external)
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Remove whitespace at EOL (found by git grep ' $')
To people using vim, I'd suggest adding the following code to ~/.vimrc:
let c_space_errors = 1
highlight FormatError ctermbg=darkred guibg=darkred
match FormatError /\s\+$\|\ \+\t\|\%80v.\|\ \{8\}/
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
crtools.h is too heavy to be included in many sources
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There's ... a number of places where we want to do something
with /proc/self/fd/%d path. Each time we guess buffer size
that is enough for this. Make standard constant for this and
save some space on stack and drop args for some functions.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
First of all, this should be done strictly after we've stopped accessing
files by their paths, even absolute. This place is right before going
into restorer.
And the second thing is that we want to re-use the open_fd_by_id engine,
since it handles various tricky cases of open-file-by-path. And since
there's no such thing as fchroot(int fd), we emulate it using the
/proc/self/fd/ links.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>