v2: add a comment before mntns_get_root_by_mnt_id(-1);
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When a bind mount is present on the same mountpoint where
the mark is placed, both mountpoints have the same device,
so we might end up picking the wrong mountpoint.
Instead we should use the mount id, which is unique among
all mountpoints.
Note it's a quick fix; I need to review the fsnotify code
more and make sure all corner cases are covered.
https://bugzilla.openvz.org/show_bug.cgi?id=2974
Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
pre_dump_one_fanotify calls parse_fdinfo_pid_s, where
fsn_params must not be NULL, otherwise we get a NULL dereference.
Fix it by passing a real variable instead.
Reported-by: Pavel Tikhomirov <snorcht@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On this fs the path can be resolved via proc, so even if
we're asked to do force-irmap, try the regular resolve
first anyway.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When migrating a container by copying its FS, the inode numbers
and thus their handles will change. This makes the restore of
inotify/fanotify fail, since they are restored via fhandles.
We've already faced problems with fsnotifies on NFS -- they
don't work there. To address this, an irmap cache is created on
pre-dump, so to resolve the issue with changed inodes during
migration, we can force the irmap cache build.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently file handles are used for dumping {i,fs}notify watchers.
But inode numbers are not restored for tmpfs content, so watchers can't
be opened by handles on restore.
Pavel found that the tmpfs cache is not pruned, so a handle can be opened
at dump time, and readlink(/proc/PID/fd/X) will return the correct path
to the file.
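The readlink() step described above can be sketched as follows. This is
only an illustration, not CRIU's read_fd_link(): the helper name
fd_to_path is hypothetical, and the sketch assumes /proc is mounted.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not CRIU's read_fd_link): resolve an open
 * descriptor to a path by reading the /proc/self/fd/<fd> symlink.
 * Returns the path length, or -1 on error or possible truncation.
 */
ssize_t fd_to_path(int fd, char *buf, size_t size)
{
	char link[64];
	ssize_t n;

	if (size < 2)
		return -1;

	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);

	n = readlink(link, buf, size - 1);
	if (n < 0 || (size_t)n >= size - 1)
		return -1;

	buf[n] = '\0'; /* readlink() does not NUL-terminate */
	return n;
}
```

A caller would pass the descriptor obtained from opening the handle and
dump the returned path instead of the handle itself.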
v2: use read_fd_link()
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise, if the mark is set up on a symlink, we end up
with an -ELOOP error when trying to open it. Thus, use
O_PATH to tell the kernel that we're not going to
read or write this descriptor.
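The effect can be reproduced with a plain open(), shown here as an
analogy only (the patch itself concerns opening by handle). The helper
name try_open_nofollow is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_PATH */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a path without following a trailing
 * symlink. Returns 0 on success or the errno value on failure.
 */
int try_open_nofollow(const char *path, int extra_flags)
{
	int fd = open(path, O_NOFOLLOW | extra_flags);

	if (fd < 0)
		return errno;

	close(fd);
	return 0;
}
```

For a path that is a symlink, O_RDONLY | O_NOFOLLOW fails with ELOOP,
while O_PATH | O_NOFOLLOW yields a descriptor referring to the link
itself.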
Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Each is_foo_link helper readlinks the lfd to do its check. This
makes anon-inode dumping call readlink several times to find the
proper dump ops. Do the readlink once and reuse the result.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When dumping fsnotifies we may go to irmap to get inode->path
mapping. The irmap engine scans FS (in hinted locations) to
get one and it is slow even though we scan only part of the FS.
Since the above scanning is done while tasks are frozen the
freeze time goes up :(
Improve the situation by generating irmap cache in working dir
at pre-dump when tasks get unfrozen.
The on-disk irmap cache is a PB file; it sits in the -W directory
and can be loaded into memory on dump/pre-dump start. When
resolving the inode->path mapping, irmap may meet these entries,
revalidate them and potentially save time.
After pre-dump the (re-)collected irmap data is written back
to the irmap cache image. Typically the entries written back are
the same as those read in on cache load.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will generate some info about file descriptors at that
stage. For now the only pre-dumped ones are fsnotifies,
so the pre-dump of a single fd is kept as simple as
possible, yet sufficient for that type of fd.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If we failed to open inode by handle, try doing the irmap
search. If that's successful, dump the "real" path for the
inode.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Some filesystems do not provide open-by-handle functionality. For those,
we should fail when dumping fsnotifies, not when restoring them.
The open_mount() changes are about opening mountpoints inside another
mount namespace.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Remove whitespace at EOL (found by git grep ' $')
To people using vim, I'd suggest adding the following code to ~/.vimrc:
let c_space_errors = 1
highlight FormatError ctermbg=darkred guibg=darkred
match FormatError /\s\+$\|\ \+\t\|\%80v.\|\ \{8\}/
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
@path is always nil here, we actually need @remap->path
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We really don't need it spread over all headers. The file
handlers are used in fsnotify only, declare it there.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
crtools.h is too heavy to be included in many sources
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There's ... a number of places where we want to do something
with the /proc/self/fd/%d path. Each time we guess a buffer size
that is enough for this. Introduce a standard constant for this,
save some space on the stack and drop args from some functions.
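One way to define such a constant is to take sizeof() of the longest
possible path. A minimal sketch; the names PROC_SELF_FD_BUFSZ and
format_proc_self_fd are hypothetical, and fds are assumed non-negative.

```c
#include <stdio.h>

/*
 * Hypothetical constant: room for "/proc/self/fd/", the largest
 * 32-bit int and the terminating NUL, computed at compile time.
 */
#define PROC_SELF_FD_BUFSZ sizeof("/proc/self/fd/2147483647")

/* Format the path for @fd; returns its length or -1 if it doesn't fit. */
int format_proc_self_fd(char *buf, size_t size, int fd)
{
	int n = snprintf(buf, size, "/proc/self/fd/%d", fd);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Callers can then declare `char path[PROC_SELF_FD_BUFSZ];` instead of
guessing a size at every call site.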
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CRIU puts wd-s for one inotify in one row (one after another),
so when collecting the next wd, we can find the ify to attach it
to faster.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The inotify_add_watch generates wd-s one by one. We cannot
request a specific one, thus we call this syscall until the
required wd is generated.
Thus, if we want to restore several wd-s for an inotify, we
have to put them in ascending order. Otherwise we may restore
watch with higher wd earlier and will thus not be able to
generate the lower wd in a reasonable time.
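The ordering and the retry loop described above could look roughly like
this. It is a sketch under the assumption stated in this message (the
kernel hands out increasing wd-s); struct wd_entry, sort_watches and
restore_one_wd are hypothetical names, not the CRIU ones.

```c
#include <stdlib.h>
#include <sys/inotify.h>

/* Hypothetical record of one watch to restore. */
struct wd_entry {
	int wd;
	const char *path;
};

static int cmp_wd(const void *a, const void *b)
{
	const struct wd_entry *x = a, *y = b;

	return (x->wd > y->wd) - (x->wd < y->wd);
}

/* Put the watches in ascending wd order before restoring them. */
void sort_watches(struct wd_entry *w, size_t n)
{
	qsort(w, n, sizeof(*w), cmp_wd);
}

/*
 * Call inotify_add_watch() until the kernel generates @target.
 * Unwanted wd-s are removed again; overshooting is an error.
 */
int restore_one_wd(int ifd, const char *path, unsigned int mask, int target)
{
	for (;;) {
		int wd = inotify_add_watch(ifd, path, mask);

		if (wd < 0)
			return -1;
		if (wd == target)
			return 0;

		inotify_rm_watch(ifd, wd);
		if (wd > target)
			return -1;
	}
}
```

With the entries sorted, each restore_one_wd() call only has to step
forward from the previously restored wd.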
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To get a more detailed error description. Also print the watch
descriptor number if it exceeds the expected one, for better
error output.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have generic do_pb_show() call and tons of show_foo
routines, that just call one with proper args. Compact
the code by putting the args into array and calling
the do_pb_show() in one place.
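The table-driven approach can be illustrated like this. The types,
names and table contents below are made up for the example and do not
reflect the real do_pb_show() signature.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical image types, for illustration only. */
enum { IMG_INOTIFY = 1, IMG_INOTIFY_WD, IMG_FANOTIFY, IMG_FANOTIFY_MARK };

/* One table row replaces one hand-written show_foo() wrapper. */
struct show_info {
	int type;
	const char *name;
};

static const struct show_info show_infos[] = {
	{ IMG_INOTIFY,       "inotify" },
	{ IMG_INOTIFY_WD,    "inotify-wd" },
	{ IMG_FANOTIFY,      "fanotify" },
	{ IMG_FANOTIFY_MARK, "fanotify-mark" },
};

const struct show_info *find_show_info(int type)
{
	size_t i;

	for (i = 0; i < sizeof(show_infos) / sizeof(show_infos[0]); i++)
		if (show_infos[i].type == type)
			return &show_infos[i];
	return NULL;
}

/* The single place where the generic show routine would be called. */
int show_image(int type)
{
	const struct show_info *info = find_show_info(type);

	if (!info)
		return -1;

	printf("showing %s image\n", info->name);
	return 0;
}
```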
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These contain linkage between number, data type and routines
for pb messages we write/read to/from image files. Most of them
have simple number-type-routines mapping, so introduce a generating
script for that.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since *all* of them just call do_dump_gen_file with proper ops,
just call one directly. Compacts the code.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is less useful than fixing typos in output messages, but anyway.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch replaces the format specifier %ld with PRIx64
in the following places:
* the format string argument of the functions scanf() and printf(),
* in the macros GEN_SYSCTL_*_FUNC.
We need explicit specification of the integer size there.
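A small sketch of what such a format string looks like; the helper
names are hypothetical. SCNx64 is the scanf counterpart of PRIx64 from
<inttypes.h>.

```c
#include <inttypes.h>
#include <stdio.h>

/* Print a 64-bit value with an explicitly sized conversion. */
int format_hex64(char *buf, size_t size, uint64_t v)
{
	return snprintf(buf, size, "0x%" PRIx64, v);
}

/* Parse it back; the hex conversion accepts an optional 0x prefix. */
int parse_hex64(const char *s, uint64_t *out)
{
	return sscanf(s, "%" SCNx64, out) == 1 ? 0 : -1;
}
```

Unlike %ld or %lx, these macros expand to the right length modifier on
both 32-bit and 64-bit targets.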
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I accidentally read a FanotifyMarkEntry object as an InotifyWdEntry
in collect_one_fanotify_mark. This didn't trigger a bug in the test
since the events still occurred (and before the protobuf file
refinement the formats were close to each other), which means
the fanotify00 test case needs to be updated (which is addressed
in a further patch).
And don't forget to init the fields.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The object is allocated with malloc. The lack of initialization
is not a problem at the moment, since we assign the members in
collect_inotify_mark unconditionally, but it might cause problems
in the future, so it's better to init it as early as possible.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To be consistent with naming (we have collect_one_fanotify_mark
helper already).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>