This patch is intended to reduce the usage of the macro P() since
integer size mismatches sometimes may be fixed by this type generalization.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The following files goes into the directory arch/x86/include/asm unmodified:
- include/atomic.h,
- include/linkage.h,
- include/memcpy_64.h,
- include/types.h,
- include/bitops.h,
- pie/parasite-head-x86-64.S,
- include/processor-flags.h,
- include/syscall-x86-64.def.
* Changed include directives in the source files that include the headers
listed above.
* Modified build scripts to reflect the source moves.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch add ability to test /proc/cpuinfo data
we're interested in at the moment.
The code provides the following functionality
- cpu_init, to parse cpuinfo and check if the
host cpu we're running on is suitable enough
for FPU checkpoint/restore. If FPU present then
there must be at least fxsave capability present
- cpu_set_feature/cpu_has_feature helpers which
provides to test certain bits and set them where
needed (we need to set bits when parse cpuinfo)
Note, we reserve space for all cpuinfo bits known
by the kernel at moment, while use only three FPU
related bits for a while. This is done because we might
need to use or find out other features in future.
After all it's just 40 bytes of memory needed to keep
all possible bits.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since fdinfo patches were merged to -mm
tree the output format has been slightly
changed. So update our tool accordingly.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Same as eventpoll -- we might have no watchee assigned
but only inotify descriptor created.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Linux kernel emulates anon-shared mappings by mapping internal
tmpfs file in. We try to detect this by checking that the file
under map is such, but do it with error -- major == 0 check is
wrong, as regular tmpfs file can be such as well as btrfs or
ecryptfs can screw things up.
The only working way of doing this is to get the dev_t of this
internal tmpfs mount.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Three parts.
Proc: open of map_files' link doesn't work on sockets. We fstatat
it and check that it's a socket (it will be packet), then save
the socket inode on vma_area.
Dump: we resolve socket inode to socket id and save it on vma.
We use id, not inode, since on restore we'll have to mmap some
opened file, not just abstract socket with inode.
Restore: when reading vma-s we just need to find out on what fd
the respective packet socket is opened (i.e. -- no map-and-close
sockets supported by now) and dup() it to let restorer mmap it
back.
All this make it possible to c/r the tcpdump tool!
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The kernel now supports providing VMA flags via smaps
interface so add pasting of them.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Before 256 bytes were used for that, it may be not enough.
For example it was not enough for a NFS point on my test system.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is normal situation -- eventpoll fd may have no fds attached
thus resulting in empty fdinfo file. Report OK in that case.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
/proc/PID/maps can contains not up to date information about a stack vma.
A kernel marks a VMA as stack, if thread_struct->usersp is in it,
but usersp is updated, when a process calls a syscall.
This problem is occured, when we try to dump/restore a process in a loop.
When a restorer resumes a process, a restorer vma will be marked as stack.
A thread stack should not be marked as stack, because its vma is mapped
w/o MAP_GROWSDOWN.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We need to dump content of some fs like binfmt_misc, tmpfs, ... To facilitate
this the existing list of filesystems is turned into an array of structures
with dump and restore callbacks. Each FS may declare them they need.
v2: rework encode/decode_fstype not to do it twice.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This types specifies a strict set of what is hidden behind
the fd. Thus these numbers should be in the description of
the fdinfo message.
Plus protobuf makes shure nothing else will be there.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Build a tree of mountpoins that can be (u)mounted in a straight
(forward or backward) order without EBUSY errors.
The tree is built in two steps -- first create hierarchy based
on mount iDs. Next -- resort siblings in the path depth order.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This structure will be used on restore and will be created
from the image, thus the name proc_ is not suitable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On the restore path this structure will be used and it
will be better to have them char * rather than char[64].
When scanning proc use the %ms specifier for this.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The task is not complete - this is just a part of what have to be done. I.e.
looks like a lot of excessive deps can be fixed.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The proc_parse file turns into a strange pile of homebrew
scanf/fgets/strtok/strchr/etc. combination. I don't like it :(
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We'll have to parse it all and the format is not scanf
friendly, so it's simpler to have it in a separate fn.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Make the proc_mountinfo obtaned after parse form a single linked list.
That's much easier to handle and doesn't have an artificial limitation
of 64 items...
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
What is stored there is the path to mountpoint. The root
is the field previously named "parent root".
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To restore inotify we need to know the
mount point device numbers, so this helper
parses /proc/pid/mountinfo file for that.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A few of our functions use buffer and string functions.
All these functions require that a string contains '\0' at the end.
Before this patch we didn't guarantee that.
I've seen segmentation fault in parse_pid_stat_small.
v2: simplify code
v3: remove assignment, because it's redundant and a size of object file
become bigger.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Kernel 3.4-rc4 has two formats for stack
"[stack]" and "[stack:%d]" (for threads)
so adopt our parsing routine for both.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
printf() has standard flag for adding "0x" prefix before hexadecimal integer.
MOD='[-+]\?[0-9]*.\?[0-9]*[lL]*'
git grep -l "0x%${MOD}x" | xargs -n1 sed -e "s/0x%\(${MOD}\)x/%#\1x/g" -i
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The task name may include spaces so %s
scanning format will fail (in particular
dumping rsyslogd fails since thread has
name "rs:main Q:Reg"). Fix it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Command below was executed several times:
sed 's/\(pr_.*[^%,x,X]\)\(\%[0-9,l,L]*x\)/\10x\2/g' -i *.c
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are just two next in this file, so extend. This is
required for pgid/sid early read, see next patches for
details.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used to restore shared mappings
v2: clean up
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
While we try to seize task it can die and give its pid to
somebody else. This can break pstree consistency. Check for
parent being valid after task is seized.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This patch tries to introduce lazy and hidden pid_dir support,
meaning one don't have to worry about pid_dir but the optimization
is still there.
The patch relies on the fact that we work with many /proc/pid files for
one pid, then for another pid and so on, i.e. not in a random manner.
The idea is when we call open_proc() with a new pid for the first time,
the appropriate /proc/PID directory is opened and its fd is stored.
Next call to open_proc() with the same PID only need to check that
the PID is not changed. In case PID is changed, we close the old one
and open/store a new one.
Now the code using open_proc() and friends:
- does not need to carry proc_pid around, pid is enough
- does not need to call open_pid_proc()
The only thing that can't be done in that "lazy" mode is closing the last
PID fd, thus close_pid_proc().
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
...and make it correctly print the file name we were unable to open.
Also, error from fdopen[dir]() is now reported with file name as well.
Note that open_proc() and friends need to be macros in order for
pr_perror() to show actual file name and line number where error occured.
Historical note: the original version of this patch was way more radical,
changing openat() to open() and thus removing pid_dir (replacing with pid
when needed) and open_proc_dir(), changing openat() to open(). The word
from Pavel is he wants to keep the openat/pid_dir optimization because
it saves two dentry lookups in kernel code for each open(). Because of
this optimization (and desire to print correct file name in case
of error) we have to carry both pid and pid_dir everywhere.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This patch introduces the following changes:
1) introduces new flag VMA_AREA_SYSVIPC to mark corresponding vma entries.
2) enhance task /proc/<pid>/maps parsing to obtain first 5 letters of mapped
file. If device major file belong to ins equal to 0 (tmpfs) and it's name
starts with "/SYSV", then this mapping is considered as SYSV IPC and
corresponding vma entry status is updated with VMA_AREA_SYSVIPC flag.
3) omit dumping of mapping pages for SYSV IPC vmas.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>