We will need these helpers to transfer file
descriptors from the dumpee to our space.
Also make send_fd/recv_fd wrappers over
send_fds/recv_fds to avoid duplicating the code.
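A rough sketch of how such a helper pair can look, assuming a connected AF_UNIX socket and SCM_RIGHTS ancillary data. The signatures, the MAX_FDS_SKETCH cap and the one-byte dummy payload are assumptions of this sketch, not the code from this patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define MAX_FDS_SKETCH 16 /* hypothetical cap, not taken from the patch */

/* Control buffer, aligned for struct cmsghdr as cmsg(3) recommends. */
union fds_ctl {
	char buf[CMSG_SPACE(sizeof(int) * MAX_FDS_SKETCH)];
	struct cmsghdr align;
};

static int send_fds(int sock, const int *fds, int nr)
{
	union fds_ctl ctl;
	char dummy = '*'; /* stream sockets need at least one byte of real data */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(nr > 0 && nr <= MAX_FDS_SKETCH);

	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = CMSG_SPACE(sizeof(int) * nr);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int) * nr);
	memcpy(CMSG_DATA(cmsg), fds, sizeof(int) * nr);

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

static int recv_fds(int sock, int *fds, int nr)
{
	union fds_ctl ctl;
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(nr > 0 && nr <= MAX_FDS_SKETCH);

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = CMSG_SPACE(sizeof(int) * nr);

	if (recvmsg(sock, &msg, 0) <= 0 || (msg.msg_flags & MSG_CTRUNC))
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int) * nr))
		return -1;

	memcpy(fds, CMSG_DATA(cmsg), sizeof(int) * nr);
	return 0;
}

/* The single-fd variants are thin wrappers, so nothing is duplicated. */
static inline int send_fd(int sock, int fd)
{
	return send_fds(sock, &fd, 1);
}

static inline int recv_fd(int sock)
{
	int fd;

	return recv_fds(sock, &fd, 1) ? -1 : fd;
}
```

The received numbers are new descriptors in the receiving process that refer to the same open files as the ones sent.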
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This structure will be used to send and receive multiple fds.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need them for the file descriptor transfer addressed in later patches.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need it in parasite code where we can't use libc functions.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It is not good to update images while restoring.
Thus, read vma_entry-es once into a list, put opened (when required) fds
in there and make restorer walk the entries in mem, not those read from
the image file.
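The "read once into a list" idea can be sketched as below. The record layout (struct vma_sketch), the list node and the function names are hypothetical stand-ins, not the real vma_entry format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical fixed-size record; the real vma_entry has more fields. */
struct vma_sketch {
	uint64_t start;
	uint64_t end;
	int64_t fd; /* filled in memory when a file has to be opened */
};

struct vma_item {
	struct vma_sketch vma;
	struct vma_item *next;
};

static void free_vma_list(struct vma_item *head)
{
	while (head) {
		struct vma_item *next = head->next;

		free(head);
		head = next;
	}
}

/*
 * Read all records from img_fd into a list, preserving their order.
 * Returns the number of entries, or -1 on a short read or OOM.
 */
static int read_vma_list(int img_fd, struct vma_item **head)
{
	struct vma_item **tail = head;
	int nr = 0;

	assert(head != NULL);
	*head = NULL;

	for (;;) {
		struct vma_item *it = calloc(1, sizeof(*it));
		ssize_t ret;

		if (!it)
			goto err;

		ret = read(img_fd, &it->vma, sizeof(it->vma));
		if (ret == 0) { /* end of image */
			free(it);
			return nr;
		}
		if (ret != (ssize_t)sizeof(it->vma)) {
			free(it);
			goto err;
		}

		*tail = it;
		tail = &it->next;
		nr++;
	}
err:
	free_vma_list(*head);
	*head = NULL;
	return -1;
}
```

Opened fds are then stored in the in-memory items, so the image file itself is never rewritten.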
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now every inetsk fd dump results in a new entry in the fdinfo.img file. The sockets themselves are
dumped into the global inetsk.img image file. On restore the generic fdinfo redistribution algo
is used and inet sockets are opened only when required.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Each fdset item now has a callback which shows the contents of a magic-described
image file. Per-task and global show code is reworked to walk the respective fdsets
and call ->show on each file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
To be consistent. Mutexes are futex based but have
their own semantics, so it is better to be able to
distinguish the types.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Instead of poking open-coded u32 variables, let's use the
futex_t type and the appropriate helpers where needed.
This should increase readability.
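For illustration, a wrapper type with helpers might look like the sketch below. It is Linux-only and uses the GCC/Clang __atomic builtins; the helper names and their exact behaviour are assumptions of this sketch, not necessarily the ones introduced by the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* A distinct type, so a futex word cannot be mixed up with a plain u32. */
typedef struct {
	uint32_t raw;
} futex_t;

static_assert(sizeof(futex_t) == sizeof(uint32_t), "futex word must be 32 bits");

static inline void futex_init(futex_t *f)
{
	__atomic_store_n(&f->raw, 0, __ATOMIC_SEQ_CST);
}

static inline uint32_t futex_get(futex_t *f)
{
	return __atomic_load_n(&f->raw, __ATOMIC_SEQ_CST);
}

/* Publish a new value and wake every waiter. */
static inline void futex_set_and_wake(futex_t *f, uint32_t v)
{
	__atomic_store_n(&f->raw, v, __ATOMIC_SEQ_CST);
	syscall(SYS_futex, &f->raw, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
}

/* Sleep until the futex holds v; FUTEX_WAIT returns early if it changed. */
static inline void futex_wait_until(futex_t *f, uint32_t v)
{
	uint32_t cur;

	while ((cur = futex_get(f)) != v)
		syscall(SYS_futex, &f->raw, FUTEX_WAIT, cur, NULL, NULL, 0);
}
```

To synchronize separate processes, the futex_t has to live in memory shared between them (e.g. a MAP_SHARED mapping).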
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
After we removed the pid from the pstree image file, the -t and -p options for the show
command no longer make sense. Make 'show' mode rely on the -D option to find out
where to find the root (i.e. pstree.img) file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This contains the reg-files and sk-queues images. They contain data
which is potentially generated by every task, so keep them open for
the whole duration of the dump.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Current fdsets are ugly, limited (bitmask will exhaust in several months) and
suffer from unknown problems with fdsets reuse :(
With the new approach (this set), image management is simple. The basic function
is open_image, which gives you an fd for an image. If you want to pre-open several
images at once instead of calling open_image every single time, you can use the
new fdsets.
Images CR_FD_ descriptors should be grouped like
_CR_FD_FOO_FROM,
CR_FD_FOO_ITEM1,
CR_FD_FOO_ITEM2,
..
CR_FD_FOO_ITEMN,
_CR_FD_FOO_TO,
After this you can call cr_fd_open() specifying the range -- the _FROM and _TO macros --
and it will give you a cr_fdset object. Then fdset_fd(set, type) will give you
the descriptor from the open "set" group that corresponds to the "type" type.
3 groups are introduced in this set -- tasks, ns and global.
That's it.
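A minimal sketch of the range-based grouping described above. The item names, the struct layout and cr_fdset_open_range_sketch() are hypothetical, and a fake opener stands in for open_image so the example does not touch the filesystem.

```c
#include <assert.h>
#include <stdlib.h>

enum {
	_CR_FD_TASK_FROM,
	CR_FD_CORE, /* item names are illustrative only */
	CR_FD_PAGES,
	CR_FD_FDINFO,
	_CR_FD_TASK_TO,

	_CR_FD_NS_FROM,
	CR_FD_UTSNS,
	CR_FD_IPCNS,
	_CR_FD_NS_TO,
};

struct cr_fdset {
	int fd_off; /* type of the first item in the group */
	int fd_nr;  /* number of items in the group */
	int *fds;
};

typedef int (*open_image_fn)(int type);

/* Stand-in for open_image(): returns a fake, recognizable fd number. */
static int fake_open_image(int type)
{
	return 100 + type;
}

/* Open every item strictly between the _FROM and _TO sentinels. */
static struct cr_fdset *cr_fdset_open_range_sketch(int from, int to,
						   open_image_fn open_one)
{
	struct cr_fdset *set;
	int i;

	assert(to - from > 1);

	set = malloc(sizeof(*set));
	if (!set)
		return NULL;

	set->fd_off = from + 1;
	set->fd_nr = to - from - 1;
	set->fds = malloc(sizeof(int) * set->fd_nr);
	if (!set->fds)
		goto err;

	for (i = 0; i < set->fd_nr; i++) {
		set->fds[i] = open_one(set->fd_off + i);
		if (set->fds[i] < 0)
			goto err; /* a real version would close the opened fds */
	}
	return set;
err:
	free(set->fds);
	free(set);
	return NULL;
}

static int fdset_fd(const struct cr_fdset *set, int type)
{
	int idx = type - set->fd_off;

	assert(idx >= 0 && idx < set->fd_nr);
	return set->fds[idx];
}
```

Because the sentinels are not real items, adding an image to a group only means adding one enum entry between its _FROM and _TO.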
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Write two helpers for opening an fdset: one for a task and one for ns.
This probably can be done with some "generic" macro(s), but this
time it's simpler not to produce more code of that type.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's not required any longer. Now fdsets are allocated one-by-one only
when required and there's no need to add new fds to existing sets.
Thus just remove the last arg from cr_fdset_open.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This fd is global, so make it such. It will stop being just a global
variable soon.
Plus, remove the pid arg from format.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The pid number is redundant -- there is one such file for the whole tree.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
From now on the fdinfo image contains only plain fdinfo_entry-es.
Files with type == FDINFO_REG are described by regfiles.img entries
and are matched by the ID in both.
At dump stage each newly generated ID results in a new entry in
regfiles.img. At restore stage open_fe_fd should open a regfile by
the fdinfo's ID. For now this is done in a suboptimal way and needs improving.
Show shows both images separately.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Make fdinfo_entry carry only the minimal info describing a file
descriptor -- the fd value itself, the fd type (regular file, exe
link, cwd, filemap and, later, pipes, sockets, inotifies, etc.)
and the ID of the object that describes the file.
The mentioned ID will identify the typed object, e.g. for regfiles
this ID is already generated by the file-ids.c code.
The other part of this structure describes a regfile (i.e. a file
opened with open syscall). I put this new entry at the end of the
fdinfo_entry just to make the patching simpler. Soon this entry
will be dumped into its own file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The namelen is u16, since u8 is not enough to cover PATH_MAX.
The pos is u64, since a file offset is indeed that long.
The id is u32 as per the previous patch.
Fix the printf-s accordingly.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The unique id is 32 bits and consists only of the subid value. This
is _really_ enough. The genid part is just a hint for the tree-search
algorithm to avoid unneeded sys_kcmp calls.
Plus, generate IDs for special files. This will make it easier to
move the regfiles into separate files (see the respective patch
for details).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Open the exec link at fd restore stage as yet another service fd,
then pass it to the restorer via args and just call prctl on it.
This is good for several reasons -- the amount of code required for
this is less and opening files should better happen before we switch
to restorer (opening will be complex and it's MUCH easier to open all
we need in one place).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Hide the structure - it's not required.
[ xemul: Rename long id into u32 id and adapt to current tree ]
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is a precursor patch. A macro for the max possible fd type will be required,
and it's easier to use an enum in this case.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Just make the fixup_vma_fds read and write vma images and
those called by it provide an fd for this.
Acked-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The core image now contains only core per-task stuff.
The new file resurrects Tula magic number removed earlier.
Acked-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's a leftover from old times, when restore worked via execve.
Now we modify the core file in place to fix up vma-s.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now it has only one descriptor for dumping pages
v2: remove rudiments
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
vma_entry contains a shmid, and each shared memory area is dumped into its own file.
The most interesting thing is restore.
A mapping is restored by the process with the smallest pid. The mapping
is created before executing the restorer.
We map the full mapping and restore its content, then we open a file from
/proc/pid/map_files and store the descriptor in vma_info. The mapping is
then unmapped. Now we can map any region of this mapping in the restorer.
We use this trick because a target process may have this mapping in
several places and the restorer has no function to open proc files.
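The trick can be sketched as below. The function names are hypothetical, the content comes from a buffer rather than an image file, and opening map_files links may require extra privileges depending on the kernel, so the open can legitimately fail.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Build the /proc/<pid>/map_files/<start>-<end> name (addresses in hex). */
static int map_files_path(char *buf, size_t len, pid_t pid,
			  unsigned long start, unsigned long end)
{
	assert(start < end);
	return snprintf(buf, len, "/proc/%d/map_files/%lx-%lx",
			(int)pid, start, end);
}

/*
 * Create a shared anonymous mapping, fill it, grab an fd for it via
 * map_files and drop the mapping. The fd keeps the memory alive and
 * can later be mmap()-ed at any address. Returns the fd or -1.
 */
static int shmem_fd_via_map_files(const void *content, size_t size)
{
	unsigned long page = (unsigned long)sysconf(_SC_PAGESIZE);
	size_t len = (size + page - 1) & ~(page - 1); /* names are page-aligned */
	char path[64];
	void *addr;
	int fd;

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -1;

	memcpy(addr, content, size);

	map_files_path(path, sizeof(path), getpid(),
		       (unsigned long)addr, (unsigned long)addr + len);
	fd = open(path, O_RDWR);

	munmap(addr, len);
	return fd;
}
```

With the fd in hand, code that cannot open proc files only needs mmap() with MAP_SHARED on that fd to recreate any part of the mapping.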
v2: fix error handling
xemul: Fixed static-s and args for cr_dump_shmem
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used to restore shared mappings
v2: clean up
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now the name of an image file is hard coded ("smth-%d.img", pid),
but the images of namespaces, shared memory, etc. do not belong to
a single task, so they may have other name formats, which
describe the objects.
For example, an image of shared memory content may have a name like
this: ("pages-shmem-%ld.img", shmid)
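One way to express per-type name formats is a table of templates plus a variadic helper, as in the sketch below. The enum, the table and image_name_sketch() are hypothetical; only the "pages-shmem-%ld.img" template is taken from this message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

enum {
	IMG_TASK_SKETCH,  /* per-task image, named by pid */
	IMG_SHMEM_SKETCH, /* shared memory content, named by shmid */
	IMG_MAX_SKETCH,
};

/* Each image type carries its own name template. */
static const char *img_fmt[IMG_MAX_SKETCH] = {
	[IMG_TASK_SKETCH] = "smth-%d.img",
	[IMG_SHMEM_SKETCH] = "pages-shmem-%ld.img",
};

/*
 * Build the image name for the given type; the variadic arguments must
 * match the type's template. Returns the name length or -1 on error.
 */
static int image_name_sketch(char *buf, size_t len, int type, ...)
{
	va_list args;
	int n;

	assert(type >= 0 && type < IMG_MAX_SKETCH);

	va_start(args, type);
	n = vsnprintf(buf, len, img_fmt[type], args);
	va_end(args);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A caller would pass an int pid for the task template and a long shmid for the shared memory one, e.g. image_name_sketch(buf, sizeof(buf), IMG_SHMEM_SKETCH, 42L).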
v2: fix comment
v3: rebase
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It turned out that some routines in the parasite service code
were not following the calling convention, so I fixed the
callers and added a comment about adding new code here.
At the same time the 3 error code madness is dropped,
as requested by Pavel -- now we return one error
code only.
The PARASITE_ERR_ codes were dropped as well, since they
became redundant.
The status of a parasite service routine is set via the
SET_PARASITE_RET helper. If there is no error,
just return 0. The calling code should not expect to
find anything sane in parasite_status_t because parasite
code might not touch it at all.
Also, due to this convention the parasite's
dump_socket_queue gets rid of the third @err
member, because it's now returned as a regular
error code.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Using absolute paths for this is dangerous -- while doing c/r we should
be extremely careful and not change tasks' roots and mount namespaces
too early. Sometimes it will not work -- when restoring containers we'll
be unable to switch to the new CT and still have the ability to open images.
Rework image opening to use openat and keep the image dir fd open all
the time as a service fd (introduced earlier).
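A minimal sketch of opening images relative to a directory fd. The function names are hypothetical and the dir fd is kept in a plain global here for brevity, rather than being installed as a service fd.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

static int image_dir_fd = -1;

/* Open the images directory once, before any root or namespace switch. */
static int open_image_dir(const char *dir)
{
	image_dir_fd = open(dir, O_RDONLY | O_DIRECTORY);
	return image_dir_fd < 0 ? -1 : 0;
}

static void close_image_dir(void)
{
	if (image_dir_fd >= 0)
		close(image_dir_fd);
	image_dir_fd = -1;
}

/*
 * Open an image by a name relative to the directory fd. This keeps
 * working even if the caller's root or mount namespace has changed,
 * since no absolute path is resolved.
 */
static int open_image_at(int flags, const char *fmt, ...)
{
	char path[PATH_MAX];
	va_list args;
	int n;

	assert(image_dir_fd >= 0); /* open_image_dir() must come first */

	va_start(args, fmt);
	n = vsnprintf(path, sizeof(path), fmt, args);
	va_end(args);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	return openat(image_dir_fd, path, flags, 0600);
}
```

Only the one open() of the directory depends on the current root, so it can be done early and the fd reused afterwards.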
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These are the fds that help us to do c/r. We want them not to intersect
with any "real-life" ones and thus store them close to the file rlimit
boundary. For now the logfd is the only such fd.
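The placement rule can be sketched as below, assuming the slot is computed as the RLIMIT_NOFILE soft limit minus a per-type offset. The enum, the function names and the exact formula are assumptions of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <sys/resource.h>
#include <unistd.h>

/* Offsets from the rlimit boundary; only the log fd exists for now. */
enum sfd_type_sketch {
	LOG_FD_OFF = 1,
};

/* Compute the fd number reserved for the given service fd type. */
static int get_service_fd(int type)
{
	struct rlimit rl;

	assert(type > 0);

	if (getrlimit(RLIMIT_NOFILE, &rl))
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)INT_MAX ||
	    rl.rlim_cur <= (rlim_t)type)
		return -1;

	return (int)rl.rlim_cur - type;
}

/* Duplicate fd into its service slot; the caller may then close fd. */
static int install_service_fd(int type, int fd)
{
	int sfd = get_service_fd(type);

	if (sfd < 0)
		return -1;

	return dup2(fd, sfd) == sfd ? sfd : -1;
}
```

Since ordinary open() calls hand out the lowest free number, descriptors parked at the top of the range are unlikely to collide with the ones being restored.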
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is a cleanup patch. Use file entry type variable for special files
instead of file entry addr variable.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Reserve more mem for bootstrap code and put all self vmas at its tail.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
These are required for inet sockets, but were not added since listen
sockets do not have them.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>