Only two fields are modifiable -- hostname and domainname. So
read them on dump and write on restore.
File format is simple --
u32 magic
u32 length of nodename
u8[] nodename string
u32 length of domainname
u8[] domainname string
For OpenVZ we can write the release at the end, but this is later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
New option -n to dump/restore namespaces.
Fork the namespaces dumping task and write a helper for switching a namespace.
Prepare the restorer code for restoring namespaces before root task.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
In order to restore task in namespaces we'll have to clone() them,
not fork. Thus switch the restorer into using clone.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
I will need them in the place where we restore the root task.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Timers are dumped from inside parasite code, the format is plain -- just
3 pairs of interval/value one-by-one.
The restoration occurs in two stages -- first prepare the timer values in
restorer (and check for sanity), then setup the timers in the latest stage
before actually calling the sigreturn.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Dump the core-pid.img file only. On restore select the way of killing
task based on his exit_code -- exit or kill with a signal. Before dying
unblock all the handlers and set SIG_DFL to it (to make the dead signal
other than KILL be deliverable).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
The absent image file on shared resources preparation now means -- no resources
for this pid (zombies will not have these files).
This is not the most elegant solution, but I don't have anything better in mind.
Need to think over, all the more so we're most likely about to reimplement the
way image is stored some day in the future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Kill all the macros for reading/writing image parts. New API looks like
* write_img_buf/write_img
Write an object into an image. Reports 0 for OK, -1 for error. The _buf
version accepts object size as an argument, the other one uses sizeof()
* read_img_buf/read_img
Reads an object from image. Reports 0 for OK, -1 for error or EOF.
* read_img_buf_eof/read_img
Reads an object from image. Reports 1 for OK, 0 for EOF and -1 for error.
This is not symmetrical with the previous one, but it was done deliberately
to make it possible to write code like
ret = read_img_bug_eof();
if (ret <= 0)
return ret; /* 0 means OK, all is done, -1 means error was met */.
... /* 1 means object was read, can proceed */
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
It's not needed anymore, it was handing cases
where no fork-with-pid functionality were in
kernel, but now it's simply unneeded.
Also drop redundant getpid() calls.
Passes all tests (except fork test which known to fail).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parllels.com>
After Andrey's work with making restorer a regular .o file we can do it
(the pthread00 test doesn't fail on it).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Handle SIGCHLD and if someone failed, nr_in_progress is set to -1.
If crtools notices that nr_in_progress is negative, it kills all
tasks.
v2: * Use named constants for task_entries->start in restorer.c
* Use SA_NOCLDWAIT when setting sigchild handler,
this makes sigchild handler simpler.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Now we have only one mutex nr_in_progress, it says how many
tasks are not restored yet. A negative value signs that someone
failed.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
v2: add FIXME for linking restorer-log.c and restorer.c by ld
I don't know how to do it now.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Before this patch the restorer's code is linked in crtools and then
we copied functions from it. In this case all function should
be inline and we can't use a global variables.
I suggest to make it like parasite. The restorer's code is isolated in
own file and will be copied wholly. The restorer's code is compiled as
position-independent code, so we can use functions and global variale
(E.g. to save descriptor for log messages).
v2: correct indentions in a separate patch
v3: introduce a variable restore_task_exec_start symmetrical to
restore_thread_exec_start
v4: don't give command in restorer_thread()
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
This patch prepares code to handle errors. In the near future
we will handle SIGCHLD. If a restore of one task fails, we will
send a signal to other for completing.
For this we should have ability to wait until all task wills be
restored. This patch does it.
v2: Don't wait children.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Sometime we want to have a shared mapping in restorer. E.g. A storage
for shared memory entries. This entries contains locks, which should
be released in restorer.
v2: fixed according to Pavel's comments
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
There is no need to use sys_ versions of libc functions
when we run in non relocated code. It's a leftover from
early testing time. Fix it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reprimand to commits bd8b2b0f and d0a6e9a1 authors for not
cleaning after themselves...
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This one is skipped at restore and leaves an open core file
in target task's fdtable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Otherwise it pops up after restore in target task's fdtable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
create_pipe() may restore up to 3 descriptors. They may be both ends
of pipes and a target descriptor. The image fd may hold any of them.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This reverts commit 46c613cc7d869ebf39532a1def054de7678e441f.
Andrey posted a proper fix for it. Moreover, the problem in first
place was initiated by a parasite application running during test
case, crtools knows nothing about.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
It's perfectly fine to reuse descriptors here,
since we use plain pipe() call and it migh choose
the descriptor which we need after.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
The old code carried the path through the stack. Now we have
pstree_pid and handy helpers to get one.
Tested by pipes00 test from zdtm (it forks).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Instead of passing self-vma file path to restorer
code simply open it before restore_task call and
pass descriptor instead. This saves some memory.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This allows us to get rid of open-coded "/proc/pid/X".
Based-on-patch-from: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
It basically reverts e189efc1763d9cae55e1cafd7aff7ffef6e47303
which was overdone one.
Reported-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Some process can share one struct file-s, we may find them by "object IDs".
A file descriptor is opened in one process and send to other via unix socket.
The procedure of restoring files contains four stages.
* Collect data about all file's descriptors
On this stage we find process which will restore a file descriptor and
create a list of processes, who should get this descriptor.
* Create datagrams unix sockets
If a file descriptor should be received, a unix socket is created
instead of it.
* Open file descriptors
A process with the least pid opens a file and sends this file
descriptors to all one who wait it.
* Receive file descriptors.
When we were thinking up this algoritm, we wanted to minimize a number
of context switches. A number of context switches is proportional of a
number of processes.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Restorer does really restore shared memory (including page
contents restore) only on master process, while all other
processes do open such memory area via map_files/ procfs entry
so that we might have a situation when shared VMA is present
in some particular core-%d.img file but it's not listed in
collected shmems array and find_shmem_by_pid will return NULL.
This is perfectly fine, be ready for that.
Another issue is that shared memory might look like
CR_FD_SHMEM: /home/cyrill/projects/kernel/crtools/shmem-2641.img
----------------------------------------
0x7f2200775000-0x7f2200776000 id 19664
0x7f2200776000-0x7f2200777000 id 19663
----------------------------------------
So vma area is [x;y) range and we should distinguish two
shmem lookup cases
- one when we search for page in shmem area
- second when we lookup shmem area in collected ranges
They both have a different lookup conditions so single
find_shmem splitted into two helpers find_shmem and
find_shmem_page as appropriate.
This patch finally fixes the three process asynchronious
shared memory updates test-case.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
The remapping of /proc path to shmem should be
done with pid of process which did mmap() call
initially.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Several places just need to open an image, thus the helper is OK to use.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>