It will fall through all the other checks, but let's do it
explicitly for better code understanding.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
No functional changes, just a code move.
TODO:
* need to write personal make_gen_id implementations for special
files (device and pos is const for them effectively)
* need to merge dump ops with restore ones (file_desc_ops)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Reg files are those obtained by open() syscall on restore.
Pipes should be checked to belong to pipefs (fifos are not supported
yet).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2:
- open_mount is cleaned up
- byte-stream hex conversion remains untouched since
strtol is flipping numbers to LE manner
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2:
- Pass initial counter value to eventfd call
(can't pass flags here since they are obtained
with fcntl and must be restored same way or
restore will fail)
- Use rst_file_params for flags and owner restore
- Use eventfd.[ch] instead of eventfs.[ch]
- Move show funcs to eventfd.c
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is also a code move plus two changes:
* shmems renamed to dump_shmems
* shmem area size is shared with restorer (it's one page for both for now)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We use mincore to check which pages we should take with us into
the image. The anonymour present or swapped should go, file present
but not cow-ed should not.
The mincore syscall wasn't very helpful with this -- it didn't detect
swap, but did detect some non present pages (pagecache). Plus it
didn't know anything about cow-ing filemaps.
Andrey Morton suggested to use the pagemap file in proc, but it lacked
the importaint stuff -- the cow filemap bit. Now it's there and we
can switch to using it.
The mincore usage for shared memory is still there, as for _that_ case
it's correct.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When dump finished with error we should unlock all locked
previously connections.
When restoring we should collect connctions and unlock them
all at the end.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
printf() has standard flag for adding "0x" prefix before hexadecimal integer.
MOD='[-+]\?[0-9]*.\?[0-9]*[lL]*'
git grep -l "0x%${MOD}x" | xargs -n1 sed -e "s/0x%\(${MOD}\)x/%#\1x/g" -i
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if dgram socket peer is not connected back
we can try to resolve peer by name.
For security reason this happens only if '-x' option
is passed at checkpoint and restore time.
In particular this is needed for programs which do
use dgram socket to send messages to /dev/log.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Command below was executed several times:
sed 's/\(pr_.*[^%,x,X]\)\(\%[0-9,l,L]*x\)/\10x\2/g' -i *.c
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Completely unlinked file is the one with n_link count being zero.
Such files only allow to read their contents and carry with us.
In order to dump this thing I introduce the "path remap" technology.
For reg file a remapping entry is dumped which describes, that at
restore stage before opening a regfile->path this path should be
linked to some other name and then (after open) unlinked.
For completely unlinked files the remap path would be a path to
a "ghost" file, i.e. a file which is created only at the time of
restore and which is removed completely at the end of it.
Partially unlinked files (i.e. those having n_link != 0, but a
path by which we see them in someone's fd is not accessible) should
be handled in another way.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is preriquisity for terminals handling and just a good
practice to save and restore everything we can :)
Not all combinations are supported. All the problems we still
have come from the inability to attach to group/session with
ID no tasks own as its PID.
This can be workarounded by fork()-ing this pid temporarily,
but we'd rather think in the direction of modifying the kernel
to give us direct syscall for this (oh my...)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I store them on _entry since sids can only be inherited or
set to current's pid. Thus the best we can do it restore sids
at fork time, thus save them in the image we use to fork.
Maybe when we submit patches that will give us ability to set
arbitrary pgid and sid we'll change this, but this is in the
future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is trivial change, but is required to check for pgid/sid
are in 'restorable' state, see for respective patch/code for
details.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Wile the task is infected we cannot just detach from it and go away. The victim will die after this.
So, call the parasite_cure_seized on any error handing between infect and cure.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is not very good practice :) Just leave them in the state they've been to before
dumping. Plz note, that tasks segfault for some reason after unseizeing, but this is
another story.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This bit is not per-file, but per-fd, thus put it on the fdinfo_entry.
Draing these bits from parasite together with the fds themselves, save
into image and restore with fcntl F_SETFD cmd where applicable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It used to be ulong, but it can be int now (no mapping addresses there). And the
name fd is better than fd_name (reason is the same).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Just dump their IDs and check they are not shared. For future.
IO and SEMUNDO is not there since tasks may have NO such objects
and currently we cannot detect whether they have them equal or
both don't have.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now we store only real fdtable entries in this file, so it's
time to name the field properly and change type to u32.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The mm_xxx bits are per-mm_struct, not per-task_struct in kernel.
Thus, when we support CLONE_VM we'd better have these bits in a
separate image file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Do not restore it yet -- the logic we're about to apply to
resolve tasks' paths relative to dumper/restorer is not yet
clear to me and it should better be hidden into a couple of
calls (dump_one_reg_file/open_fe_fd). But since we can't
chroot to fd we're about to expose the logic outside of the
open_fe_fd, which is not desirable ATM.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Why? Because one day we'll support various CLONE_ flags and
for fdtable and fs info we'd like to have separate images (since
these objects are separate in kernel).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The regfile's ID of a VMA is stored in its shmid field. And the
file itself if sumped into regfiles.img image with 'special'-ly
generated ID (i.e. -- just allocate a new unique one).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A file pointed by an fd should be not unlinked
and be open-able with the provided path (overmounts
and/or renames coupled with links can screw it up).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A pipe buffer has 16 slots. A slot is page, offset and size.
When we use splice and data is not aligned, splice connects
a page from file cache and set offset. For this reason we loose
a part of buffer.
If a data size is more than 15 pages, data will be aligned in a image.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Information about pipe's file structs saved in one global file and
fdinfo_entry is saved for each descriptor
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This was required when pages were stored in elf files for
exec. Now we can stop reading it on eof.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since now we have local copies of a remove FDs we can dump socket queus without entering
a parasite code. This makes the code MUCH simpler.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have two checks for fd being a chrdev:
One to skip stdio-s that are termilans and
The other one for any fd being a /dev/null or other special device.
Clean this a little bit.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The statfs is not required, we now check for fd being a socket with S_IFSOCK.
The 2nd stat is just not needed, the caller provides stat info.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>