We have a mess of uintX_t and uX usage. Drop off uintX_t ones.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need to parse btrfs stuff, but this one is not
in the supported list yet (as it's bound to hardware).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
It will hold auxiliary data associated with mount point. We
will use it for btrfs handling but in future it can be extended.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used in cr-restore.c for stopping threads on the exit from
sigreturn.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A few sentences, which are required for understanging this patch
2a) A shared mount can be replicated to as many mountpoints and all the
replicas continue to be exactly same.
2b) A slave mount is like a shared mount except that mount and umount
events only propagate towards it.
2c) A private mount does not forward or receive propagation.
All rules is there Documentation/filesystems/sharedsubtree.txt
If it's a first mount in a group, all group members should be
bind-mounted from this one.
Each mount propagates to all members of parent's group. The group can
contains a few slaves.
Mounts, which have propagated to slaves, are unmounted, because we can't
be sure, that they propagated in real life. For example:
mount --bind --make-slave /share /slave1
mount --bind --make-slave /share /slave2
mount /share/test
umount /slave2/test
mount --make-share /slave1/test
mount --bind --make-share /slave1/test /slave2/test
41 40 0:33 / /share rw,relatime shared:28 - tmpfs xxx rw
42 40 0:33 / /slave1 rw,relatime master:28 - tmpfs xxx rw
43 40 0:33 / /slave2 rw,relatime master:28 - tmpfs xxx rw
44 41 0:34 / /share/test rw,relatime shared:29 - tmpfs xxx rw
46 42 0:34 / /slave1/test rw,relatime shared:30 master:29 - tmpfs xxx rw
45 43 0:34 / /slave2/test rw,relatime shared:30 master:29 - tmpfs xxx rw
/slave1/test and /slave2/test depend on each other and minimum one of them
doesn't propagate from /share/test
v2: use false and true for bool
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Try to restore mounts while a postpone list isn't empty and check
that each iteration has some progress, otherwice it will fails for
preventing infinite loops
v2: rework logic about postpone list
add more comments
v3: one more attempt to make it more readable
v4: Here is a master class from Pavel how to write self-documented code.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
All shared mounts from one group are connected to circular list.
All slave are added into the proper master list.
v2: change variable name and fix a bug about adding shared mounts in a
circular list.
v3: handle errors of collect_shared
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
mnt_entry contains a few strings and they should be release too
CID 996198 (#4 of 4): Resource leak (RESOURCE_LEAK)
20. leaked_storage: Variable "pm" going out of scope leaks the storage
it points to.
CID 996190 (#1 of 1): Resource leak (RESOURCE_LEAK)
13. leaked_storage: Variable "new" going out of scope leaks the storage
it points to.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are really depends on CPU we're running on.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
No need to walk up the directories if we need
to include protobuf file. This was always a bad
use of ability to walk the filesystem from other
headers.
Same time we don't need -I$(SRC_DIR)/protobuf/
in general makefile anymore.
[xemul: Small fixlet in head Makefile, since patch
it out-of-order]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We collect all file locks to a golbal list, so we can use them easily
in dump_one_task. For optimizaton, we only collect file locks hold by
tasks in the pstree.
Thanks to the ptrace-seize machanism, we can aviod the blocked file lock
issue, makes the work simpler.
Right now, the check handles only one situation:
-- Dumping tasks with file locks hold without the -l option.
This covers for the most part. But we still need some more work to make
it perfect robust in the future.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will be handling both inotify and fanotify
objects here thus to make less confusion rename
the files to fsnotify.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The following files goes into the directory arch/x86/include/asm unmodified:
- include/atomic.h,
- include/linkage.h,
- include/memcpy_64.h,
- include/types.h,
- include/bitops.h,
- pie/parasite-head-x86-64.S,
- include/processor-flags.h,
- include/syscall-x86-64.def.
* Changed include directives in the source files that include the headers
listed above.
* Modified build scripts to reflect the source moves.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Make them look like __CR_<smth>_H__ with
sed -e '1,2s/#\(ifndef\|define\) _\?_\?\(CR_\)\?/#\1 __CR_/' -e '1,2s/_H_\?_\?.*$/_H__/'
on every header file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch add ability to test /proc/cpuinfo data
we're interested in at the moment.
The code provides the following functionality
- cpu_init, to parse cpuinfo and check if the
host cpu we're running on is suitable enough
for FPU checkpoint/restore. If FPU present then
there must be at least fxsave capability present
- cpu_set_feature/cpu_has_feature helpers which
provides to test certain bits and set them where
needed (we need to set bits when parse cpuinfo)
Note, we reserve space for all cpuinfo bits known
by the kernel at moment, while use only three FPU
related bits for a while. This is done because we might
need to use or find out other features in future.
After all it's just 40 bytes of memory needed to keep
all possible bits.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We need to dump content of some fs like binfmt_misc, tmpfs, ... To facilitate
this the existing list of filesystems is turned into an array of structures
with dump and restore callbacks. Each FS may declare them they need.
v2: rework encode/decode_fstype not to do it twice.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Build a tree of mountpoins that can be (u)mounted in a straight
(forward or backward) order without EBUSY errors.
The tree is built in two steps -- first create hierarchy based
on mount iDs. Next -- resort siblings in the path depth order.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This structure will be used on restore and will be created
from the image, thus the name proc_ is not suitable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On the restore path this structure will be used and it
will be better to have them char * rather than char[64].
When scanning proc use the %ms specifier for this.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The proc_parse file turns into a strange pile of homebrew
scanf/fgets/strtok/strchr/etc. combination. I don't like it :(
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Make the proc_mountinfo obtaned after parse form a single linked list.
That's much easier to handle and doesn't have an artificial limitation
of 64 items...
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
What is stored there is the path to mountpoint. The root
is the field previously named "parent root".
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To restore inotify we need to know the
mount point device numbers, so this helper
parses /proc/pid/mountinfo file for that.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are just two next in this file, so extend. This is
required for pgid/sid early read, see next patches for
details.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since proc_parse.h declares functions which have
list_head as arguments, it should include list.h
and basic types.
Otherwise if included into the files without list support
the compilation might fail as
|
| In file included from cr-check.c:5:
| ./include/proc_parse.h:83: error: expected declaration specifiers or ‘...’ before ‘bool’
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
While we try to seize task it can die and give its pid to
somebody else. This can break pstree consistency. Check for
parent being valid after task is seized.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This patch tries to introduce lazy and hidden pid_dir support,
meaning one don't have to worry about pid_dir but the optimization
is still there.
The patch relies on the fact that we work with many /proc/pid files for
one pid, then for another pid and so on, i.e. not in a random manner.
The idea is when we call open_proc() with a new pid for the first time,
the appropriate /proc/PID directory is opened and its fd is stored.
Next call to open_proc() with the same PID only need to check that
the PID is not changed. In case PID is changed, we close the old one
and open/store a new one.
Now the code using open_proc() and friends:
- does not need to carry proc_pid around, pid is enough
- does not need to call open_pid_proc()
The only thing that can't be done in that "lazy" mode is closing the last
PID fd, thus close_pid_proc().
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
...and make it correctly print the file name we were unable to open.
Also, error from fdopen[dir]() is now reported with file name as well.
Note that open_proc() and friends need to be macros in order for
pr_perror() to show actual file name and line number where error occured.
Historical note: the original version of this patch was way more radical,
changing openat() to open() and thus removing pid_dir (replacing with pid
when needed) and open_proc_dir(), changing openat() to open(). The word
from Pavel is he wants to keep the openat/pid_dir optimization because
it saves two dentry lookups in kernel code for each open(). Because of
this optimization (and desire to print correct file name in case
of error) we have to carry both pid and pid_dir everywhere.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This commit brings the former "Rewrite task/threads stopping engine"
commit back. Handling it separately is too complex so better try
to handle it in-place.
Note some tests might fault, it's expected.
---
Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE if enough. Moreover, just killing a task with STOP is also racy,
since task should be given some time to come to sleep before its proc
can be parsed.
Rewrite all this code to SEIZE task and all its threads from the very beginning.
With this we can distinguish stopped task state and migrate it properly (not
supported now, need to implement).
This thing however has one BIG problem -- after we SEIZE-d a task we should
seize
it's threads, but we should do it in a loop -- reading /proc/pid/task and
seizing
them again and again, until the contents of this dir stops changing (not done
now).
Besides, after we seized a task and all its threads we cannot scan it's children
list once -- task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also)
This patch is ugly, yes, but splitting it doesn't help to review it much, sorry
:(
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE if enough. Moreover, just killing a task with STOP is also racy,
since task should be given some time to come to sleep before its proc
can be parsed.
Rewrite all this code to SEIZE task and all its threads from the very beginning.
With this we can distinguish stopped task state and migrate it properly (not
supported now, need to implement).
This thing however has one BIG problem -- after we SEIZE-d a task we should seize
it's threads, but we should do it in a loop -- reading /proc/pid/task and seizing
them again and again, until the contents of this dir stops changing (not done now).
Besides, after we seized a task and all its threads we cannot scan it's children
list once -- task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also)
This patch is ugly, yes, but splitting it doesn't help to review it much, sorry :(
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
All the IDs and caps are in there. Just read them for future use.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Requires patch #14 (for kernel). Also check for number of entries read be
at least required, not exactly equal for forward compatibility.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>