When cwd is removed (it can be) we need to collect the respective
file_desc before starting opening any files to properly handle
ghost refcounts. Otherwise we will miss one refcount from the
cwd's on ghost, which in turn will either BUG inside ghost removal,
or will fail the cwd due to the respective dir being removed too
early.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It uses absolute file names, so any open-s should happen _before_
we change tasks' root.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
If fchroot() succeeds the further failures don't get
noticed by caller.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
There's no such thing as fchroot() in Linux, but we need to do
chroot() into existing file descriptor. Before this patch we did
this by chroot()-ing into /proc/self/fd/$fd. W/o proc mounted it's
no longer possible, so do this like
fchdir(proc_service_fd);
chroot("./self/fd/$root_fd");
fchdir($cwd_fd);
Thanks to Andrey Vagin for this trick ;)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
One device can be mounted a few times, so files are identical only,
if they have the same mnt_id.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for restoring files from proper mounts.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to parse fdinfo for getting mnt_id,
so we can take there pos and flags and don't call
fcntl and lseek for that.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Using O_PATH known to be buggy on 3.11 kernel so use
direct link reading procedure here.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
During the time some files become obsolete and might be missing
in checkpoint image set, but to keep backward compatibility we
still trying to open them, which might print out error like
| Unable to open 'path-to-file'
and confuse a reader why criu prints error but continue working.
To eliminate this problem O_OPT flag has been introduced in
commit 16b5692061e2, which suppress error message priting
if the flag is set.
Now start using O_OPT in the following functions
- open_irmap_cache: irmap cache is relatively new optional feature
- prepare_rlimits, open_signal_image, restore_file_locks,
prepare_fd_pid, prepare_mm_pid, collect_image: all these
helpers are trying to open image files which can be missing.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For example opening the FIFO blocks until the other end is opened also.
We can use O_PATH to avoid sleeping in the open() call.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We do it first -- on collect, second -- on restore. The
2nd lookup is excessive, we can put fd pointer on vm_area
at lookup and reuse one later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For unlinked opened and mmaped files we'd need to
care about remaps, for this the callback with both
file_desc and fdinfo_list_entry will be required.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The is_foo_link readlinks the lfd to check. This makes
anon-inodes dumping readlink several times to find proper
dump ops. Optimize this thing.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will generate some info about file-descriptors at that
stage. For now these pre-dumped ones would be fsnotifies,
so the pre-dump of a single fd is written as simple as
possible, but enough for that type of FDs pre-dump.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need to special-care NFS silly-rename files, thus
we need to know which FS a file belongs to.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Here is nothing interecting. If a file can't be dumped by criu,
plugins are called. If one of plugins knows how to dump the file,
the file entry is marked as need_callback. On restore if we see
this mark, we execute plugins for restoring the file.
v2: Callbacks are called for all files, which are not supported by CRIU.
v3: Call plugins for a file instead of file descriptor. A few file
descriptors can be associated with one file.
v4: A file descriptor is opened in a callback. It's required for
restoring anon vmas.
v5: Add a separate type for unsupported files
v6: define FD_TYPES__UNSUPP
v7: s/unsupp/ext (external)
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Remove whitespace at EOL (found by git grep ' $')
To people using vim, I'd suggest adding the following code to ~/.vimrc:
let c_space_errors = 1
highlight FormatError ctermbg=darkred guibg=darkred
match FormatError /\s\+$\|\ \+\t\|\%80v.\|\ \{8\}/
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
crtools.h is too heavy to be included in many sources
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There's ... a number of places where we want to do something
with /proc/self/fd/%d path. Each time we guess buffer size
that is enough for this. Make standard constant for this and
save some space on stack and drop args for some functions.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
First of all, this should be done strictly after we've stopped accessing
files by their paths, even absolute. This place is right before going
into restorer.
And the second thing is that we want to re-use the open_fd_by_id engine,
since it handles various tricky cases of open-file-by-path. And since
there's no such thing as fchroot(int fd), we emulate it using the
/proc/self/fd/ links.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For the root task the clone syscall returns the pid in criu's pidns,
but for other processes the clone syscall returns PID in the restored
namespace.
The /proc/self link contains the PID value of the current process, so if
we want to determing the PID in a criu's pidns, we should use criu's
/proc.
v2: readlink() does not append a null byte to buf, so we must do that
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently pos has type unsigned long, so its size depends on
architecture. pos is saved as 64-bit value in the image file and it
isn't restored, if it is equal to -1. Due to convertation on 32-bit
platforms -1 is converted into UINT_MAX and we get error on restore.
$ zdtm.sh ns/static/tun
...
(00.398513) 5: Error (files-reg.c:534): Can't restore file pos: Illegal seek
(00.398888) 5: Error (files-reg.c:489): Can't open file /dev/net/tun: Illegal seek
...
id: 0x15 flags: 0x2 pos: 0x000000ffffffff fown: { uid: 0 euid: 0 signum: 0 pid_type: 0 pid: 0 } name: "/dev/net/tun"
crtools is compiled with _FILE_OFFSET_BITS=64, so off_t is always 64-bit.
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The recv-fd stage is only required if we created any transport fd.
The post-open one is only required if there's at least one fdtype
with post-open callback in ops.
Both things can be evaluated in advance and unneeded stages can
be skipped.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The major issue with dump is -- some info id get via netlink,
some via sysfs and some (!) via opened and attached tun file.
But the latter cannot be created, if there's another one attached
(or the mq device is full with threads).
Thus we have to dump this info via existing tun file and keep one
in memory till the link dump code takes place.
Opposite situation is also possible -- we can have a persistent
unattached device. In this case we have to attach to it, dump
things and detach back.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CID 1042292 (#1 of 1): Unsigned compared against 0 (NO_EFFECT)
unsigned_compare: This less-than-zero comparison of an unsigned value is
never true. "len < 0UL"
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since *all* of them just call do_dump_gen_file with proper ops,
just call one directly. Compacts the code.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Based on work done by Cyrill Corcunov (many thanks for that).
In this commit we implement c/r for files which have opened
/proc/$pid/ns/$ids entries.
The idea is rather simple one
Checkpoint
==========
- Check if the file name is the one of known to be ns ref
- If match then write protobuf entry
Restore
=======
- Read all ns entries from the image
- When criu tries to open one we lookup over process
tree to figure out which PID should be used in path
and then just open it
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To be able to dump procfs namespace entries we will need to
analyze the link name first and then figure out if the file
referred indeed a procfs entry or it's plain regular file.
This means we read link early, and to escape double reading
of same link, once we read it we remember it in fd_parms
structure which we pass to dump_one_reg_file.
Still the dump_one_reg_file is not solely used from inside
of dump_one_file, but from a number of other places (fifo,
special files) where fd_parms is filled only partially
and fill_fd_params is not even called. Thus dump_one_reg_file
must be ready for such scenario and read the link name by own
if the caller has not provided one.
To achieve all this we do
- extend fd_parms structure to have a reference on a new
fd_link structure, thus if caller already read the link
name it might assign the reference and call for dump_one_reg_file
- tune up dump_one_reg_file to fill own copy of fd_link
structure if the caller has not provied one
[ xemul: Added const to fill_fdlink arg ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>