Due to code sharing, especially in the IPC area,
the unbinding is done via helper macros and
sysctl engine tuning (new CTL_SHOW action
added).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
The messages are filtered by their type
LOG_MSG - plain messages, they escape any (!) log level
filtration and go to stdout
LOG_ERROR - error messages
LOG_WARN - warning messages
LOG_INFO - informative messages
LOG_DEBUG - debug messages
By default the LOG_WARN log level is used, thus LOG_INFO
and LOG_DEBUG messages will not appear in output stream.
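The filtering rule above can be sketched roughly as follows. The numeric values of the levels and the log_should_print() helper are assumptions made for illustration, they are not taken from the patch.

```c
#include <stdbool.h>

/* Assumed ordering: lower value = more important. Values are illustrative. */
enum log_level {
	LOG_MSG   = 0,	/* plain messages, bypass filtering */
	LOG_ERROR = 1,
	LOG_WARN  = 2,
	LOG_INFO  = 3,
	LOG_DEBUG = 4,
};

#define DEFAULT_LOGLEVEL LOG_WARN

/* Hypothetical helper: is a message of 'level' printed at 'current'? */
static bool log_should_print(enum log_level level, enum log_level current)
{
	if (level == LOG_MSG)
		return true;	/* LOG_MSG escapes any log level filtration */
	return level <= current;
}
```

With the default level, LOG_ERROR and LOG_WARN pass while LOG_INFO and LOG_DEBUG are dropped. Raising the level to LOG_INFO lets LOG_INFO messages through.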
pr_panic helper was replaced with pr_err, pr_warning
was shortened to pr_warn, and the old printk is now
pr_msg.
Because we share messages between "show" and "dump" actions,
before the "show" action proceeds we need to tune up the
log level and set it to LOG_INFO.
Also note that printing of VMA and siginfo now
became LOG_INFO messages, it was not correct
to print them regardless of the log level.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Remove CR_TASK_XXX states, use the TASK_XXX ones (for image). This is
required to unseize tasks properly in the next patches.
Plus, make sure that pstree_list and the seized set coincide (i.e.
handle error in collect_task).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This patch was designed to be generic and thus usable for all kinds of
sockets. Not sure that this goal has been reached, but at least I tried.
Key ideas:
1) On-stack structure for collecting sockets queues and then passing them to
parasite code.
2) Singly linked list is used for collecting structures, representing sockets
of any kind (!) with queues.
Based on xemul@ patches.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
v2: New "MSG_STEAL" functionality is used
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
v2: wrappers names become less obfuscating
This patch:
1) Updates function cr_fdset_open() to be suitable for handling fdset creation
for dump and show stages.
2) Replaces cr_fdset_open() by new wrapper function cr_fdset_dump().
3) Replaces prep_cr_fdset_for_restore() by new wrapper function cr_fdset_show().
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This patch removes collect stage and dumps tunables object right after
collect.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This commit brings the former "Rewrite task/threads stopping engine"
commit back. Handling it separately is too complex so better try
to handle it in-place.
Note some tests might fault, it's expected.
---
Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE is enough. Moreover, just killing a task with STOP is also racy,
since the task should be given some time to come to sleep before its proc
can be parsed.
Rewrite all this code to SEIZE the task and all its threads from the very beginning.
With this we can distinguish the stopped task state and migrate it properly (not
supported now, need to implement).
This thing however has one BIG problem -- after we SEIZE-d a task we should seize
its threads, but we should do it in a loop -- reading /proc/pid/task and seizing
them again and again, until the contents of this dir stop changing (not done now).
Besides, after we seized a task and all its threads we cannot scan its children
list once -- the task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also).
This patch is ugly, yes, but splitting it doesn't help to review it much, sorry :(
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Stopping tasks with STOP and proceeding with SEIZE is actually excessive --
the SEIZE is enough. Moreover, just killing a task with STOP is also racy,
since the task should be given some time to come to sleep before its proc
can be parsed.
Rewrite all this code to SEIZE the task and all its threads from the very beginning.
With this we can distinguish the stopped task state and migrate it properly (not
supported now, need to implement).
This thing however has one BIG problem -- after we SEIZE-d a task we should seize
its threads, but we should do it in a loop -- reading /proc/pid/task and seizing
them again and again, until the contents of this dir stop changing (not done now).
Besides, after we seized a task and all its threads we cannot scan its children
list once -- the task can get reparented to init and any task's child can call clone
with CLONE_PARENT flag thus repopulating the children list of the already seized
task (not done also).
This patch is ugly, yes, but splitting it doesn't help to review it much, sorry :(
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
* kid -> child
* First letter should be uppercase
* Misc typos in messages and comments
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
v2: strlen() check removed from parse_ns_string()
Now '-n' option must be followed by namespaces tags, separated by commas.
Currently, only "uts" namespace is supported.
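A rough sketch of parsing such a comma-separated list is given below. The signature of parse_ns_string() and the NS_UTS_FLAG value are assumptions made for illustration. The actual code may well use CLONE_NEWUTS instead.

```c
#include <string.h>

#define NS_UTS_FLAG 0x1u	/* illustrative flag value, an assumption */

/*
 * Hypothetical signature: parse "tag,tag,..." into a flag mask.
 * Returns 0 on success, -1 on an unknown or empty tag.
 */
static int parse_ns_string(const char *ptr, unsigned int *flags)
{
	*flags = 0;

	for (;;) {
		const char *end = strchr(ptr, ',');
		size_t len = end ? (size_t)(end - ptr) : strlen(ptr);

		if (len == 3 && !strncmp(ptr, "uts", 3))
			*flags |= NS_UTS_FLAG;
		else
			return -1;	/* only "uts" is known so far */

		if (!end)
			return 0;
		ptr = end + 1;
	}
}
```

An empty tag is rejected by the tag comparison itself, so no separate strlen() check on the whole string is needed.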
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
There are two cases for cr_fdset_open
- It might be called with already allocated
memory so we should reuse it.
- It might be called with NULL pointing out
that caller expects us to allocate memory.
If an open() error happens somewhere inside cr_fdset_open
it requires two error paths
- Just close all files opened but don't free memory
if it was not allocated by us
- Close all files opened *and* free memory allocated
by us.
In any case we should close all files opened so the close_cr_fdset()
helper is split into two parts.
Also the caller should be ready for such semantics as well and
must not re-assign the pointers obtained but simply test the
results for NULL.
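The two error paths can be sketched like this. The struct layout, FDSET_MAX, the close_fds() helper and the argument list are simplified assumptions and do not match the real cr_fdset code.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define FDSET_MAX 2	/* illustrative size */

struct cr_fdset {
	int fds[FDSET_MAX];
};

/* Part one: close every opened file, keep the memory */
static void close_fds(struct cr_fdset *s)
{
	for (int i = 0; i < FDSET_MAX; i++) {
		if (s->fds[i] >= 0) {
			close(s->fds[i]);
			s->fds[i] = -1;
		}
	}
}

/* Part two: close the files and free the set */
static void close_cr_fdset(struct cr_fdset *s)
{
	if (!s)
		return;
	close_fds(s);
	free(s);
}

/* 'reuse' may be NULL, in which case the set is allocated here */
static struct cr_fdset *cr_fdset_open(const char *const paths[FDSET_MAX],
				      struct cr_fdset *reuse)
{
	struct cr_fdset *s = reuse;

	if (!s) {
		s = malloc(sizeof(*s));
		if (!s)
			return NULL;
		for (int i = 0; i < FDSET_MAX; i++)
			s->fds[i] = -1;
	}

	for (int i = 0; i < FDSET_MAX; i++) {
		if (s->fds[i] >= 0)
			continue;	/* already opened by an earlier call */
		s->fds[i] = open(paths[i], O_RDONLY);
		if (s->fds[i] < 0)
			goto err;
	}
	return s;

err:
	if (reuse)
		close_fds(s);		/* caller owns the memory */
	else
		close_cr_fdset(s);	/* we allocated it, so we free it */
	return NULL;
}
```

A caller that passed its own set only tests the result for NULL and keeps using its own pointer, since the set it owns is never freed here.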
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
This is a standard convention to print error message (i.e. strerror(errno))
at the end of line, like this:
Cannot remove file: Permission denied
So pr_perror is fixed to follow this convention (using GNU extension
%m helps a lot here). Unfortunately, due to this we have to make
pr_perror() print a newline character, too, so we had to strip it
from all pr_perror() invocations.
That (appending a newline) also makes pr_perror() a black sheep
in the herd of pr_* helpers, but what can we do? Worst case scenario
is an extra newline after an error message, not too harmful.
An alternative approach (stripping the newline from the passed format
string and re-adding it) was discussed thoroughly, and it was decided
that such a hack looks a bit too dirty.
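For illustration, a macro following this convention might look like the sketch below. This is not the actual definition from the tree. It assumes glibc (for %m) and GCC-style variadic macros, and remove_file() is a made-up caller.

```c
#include <stdio.h>

/*
 * Sketch only: "%m" is a glibc extension that expands to strerror(errno).
 * The macro appends ": <error>\n" itself, so callers pass no newline.
 */
#define pr_perror(fmt, ...) \
	fprintf(stderr, fmt ": %m\n", ##__VA_ARGS__)

/* Hypothetical caller showing the intended usage */
static int remove_file(const char *path)
{
	if (remove(path) < 0) {
		pr_perror("Cannot remove file %s", path);
		return -1;
	}
	return 0;
}
```

Because the format string is glued together at compile time, a caller that still passes a trailing "\n" would get the error text on the next line, which is why the newline had to be stripped from the call sites.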
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Function pr_perror() already spits out strerror(errno), no need to do it
in the calling code.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cleaning a few space-at-EOL occurrences, plus one spaces-instead-of-tab.
Found using:
git grep -n '[[:space:]]$'
git grep -n ' '
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Dumping is simple. All but secbits can be read from proc, secbits
are obtained from the parasite.
Restoring is a bit tricky -- when you change anything on kernel
cred's struct it performs sophisticated checks and can change
some more stuff than requested, so the creds restoration procedure
is carefully commented step-by-step.
Another thing to mention is that creds are restored after everything
else, i.e. right before performing final threads sync and sigreturns.
This is done to avoid potential problems with insufficient caps for
restoring other stuff (e.g. CAP_DAC_OVERRIDE or zero euid is most
likely required for opening any image file and the notorious control
/proc/sys/kernel/ns_last_pid, which in turn is performed till the
very last moment).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Only two fields are modifiable -- hostname and domainname. So
read them on dump and write on restore.
File format is simple --
u32 magic
u32 length of nodename
u8[] nodename string
u32 length of domainname
u8[] domainname string
For OpenVZ we can write the release at the end, but this is later.
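A minimal sketch of writing this layout is shown below. The magic value, the function names and the choice to store the strings without a trailing NUL are assumptions, the format description above does not specify them.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define UTSNS_MAGIC_EXAMPLE 0x12345678u	/* placeholder, not the real magic */

/* Write a u32 length followed by the string bytes (assumed: no NUL) */
static int write_str(int fd, const char *s)
{
	uint32_t len = (uint32_t)strlen(s);

	if (write(fd, &len, sizeof(len)) != (ssize_t)sizeof(len))
		return -1;
	if (write(fd, s, len) != (ssize_t)len)
		return -1;
	return 0;
}

/* Hypothetical dump helper: magic, nodename, domainname in that order */
static int dump_uts(int fd, const char *nodename, const char *domainname)
{
	uint32_t magic = UTSNS_MAGIC_EXAMPLE;

	if (write(fd, &magic, sizeof(magic)) != (ssize_t)sizeof(magic))
		return -1;
	if (write_str(fd, nodename) < 0)
		return -1;
	return write_str(fd, domainname);
}
```

The restore side would read the same fields back in order and feed them to sethostname() and setdomainname().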
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
New option -n to dump/restore namespaces.
Fork the namespaces dumping task and write a helper for switching a namespace.
Prepare the restorer code for restoring namespaces before root task.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Timers are dumped from inside parasite code, the format is plain -- just
3 pairs of interval/value one-by-one.
The restoration occurs in two stages -- first prepare the timer values in
restorer (and check for sanity), then setup the timers in the latest stage
before actually calling the sigreturn.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Kill all the macros for reading/writing image parts. New API looks like
* write_img_buf/write_img
Write an object into an image. Reports 0 for OK, -1 for error. The _buf
version accepts object size as an argument, the other one uses sizeof()
* read_img_buf/read_img
Reads an object from image. Reports 0 for OK, -1 for error or EOF.
* read_img_buf_eof/read_img_eof
Reads an object from image. Reports 1 for OK, 0 for EOF and -1 for error.
This is not symmetrical with the previous one, but it was done deliberately
to make it possible to write code like
	ret = read_img_buf_eof();
	if (ret <= 0)
		return ret; /* 0 means OK, all is done, -1 means error was met */
	... /* 1 means object was read, can proceed */
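The return conventions of the read helpers can be sketched as below. The (fd, ptr, size) argument list, the error messages and the read_img_eof() name for the sizeof() flavour are assumptions, only the return values follow the description above.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the object was read, 0 on clean EOF, -1 on error */
static int read_img_buf_eof(int fd, void *ptr, size_t size)
{
	ssize_t ret = read(fd, ptr, size);

	if (ret == (ssize_t)size)
		return 1;
	if (ret == 0)
		return 0;

	if (ret < 0)
		perror("Can't read img file");
	else
		fprintf(stderr, "Img trimmed %zd/%zu\n", ret, size);
	return -1;
}

/* Returns 0 if the object was read, -1 on error or EOF */
static int read_img_buf(int fd, void *ptr, size_t size)
{
	int ret = read_img_buf_eof(fd, ptr, size);

	if (ret == 0) {
		fprintf(stderr, "Unexpected EOF\n");
		return -1;
	}
	return ret < 0 ? -1 : 0;
}

/* sizeof() flavours; read_img_eof is an assumed name */
#define read_img(fd, ptr)	read_img_buf((fd), (ptr), sizeof(*(ptr)))
#define read_img_eof(fd, ptr)	read_img_buf_eof((fd), (ptr), sizeof(*(ptr)))
```

A loop over image entries can then return the result directly whenever it is <= 0, treating 0 as a normal end of image.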
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Rename prep_cr_fdset_for_dump into cr_fdset_open and make it reentrant, i.e.
every subsequent call will open more files in the same fdset. Required for zombies
and makes the code cleaner.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
It was being done intentionally to be able to call close_cr_fdset
several times in a row, bring this ability back. Otherwise I'm
getting glibc complaints about an attempt to free already freed memory.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
The same as previous patch -- no need in two separate calls.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
They always go in pairs so there's no need in two calls.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Path is not needed there -- we can call the get_image_path() in prep_cr_fdset_
routines and in parasite-syscall.c when required.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
This one is required on allocation -- it's already there as an argument.
It's also required on free, but we can check for fd being >= 0.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
All the places we need one in can use the direct reference on template.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Introduce a helper for walking the list and sending signals.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Currently it can only work with stream sockets, which have no skbs in queues
(listening or established -- both work OK).
The cpt part uses the sock_diag engine that was merged to Dave recently to
collect sockets. Then it dumps sockets by checking the filesystem ID of a
failed-to-open through /proc/pid/fd descriptors (sockets do not allow for
such tricks with opens through proc) against SOCKFS_TYPE.
The rst part is more tricky. Listen sockets are just restored, this is simple.
Connected sockets are restored like this:
1. One end establishes a listening anon socket at the desired descriptor;
2. The other end just creates a socket at the desired descriptor;
3. All sockets, that are to be connect()-ed call connect. Unix sockets
do not block connect() till the accept() time and thus we continue with...
4. ... all listening sockets call accept() and ... dup2 the new fd into the
accepting end.
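The four steps can be sketched in a single process as shown below. In the real restorer the two ends live in different tasks, so this is only an illustration. The restore_pair() and move_fd() names are made up, and the "anon" listening socket is modelled with Linux autobind, which is an assumption about what the patch does.

```c
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Place a descriptor at the number 'want' */
static int move_fd(int fd, int want)
{
	if (fd == want)
		return fd;
	if (dup2(fd, want) < 0)
		return -1;
	close(fd);
	return want;
}

/* Hypothetical helper: rebuild a connected stream pair at fd_a/fd_b */
static int restore_pair(int fd_a, int fd_b)
{
	struct sockaddr_un addr = { .sun_family = AF_UNIX };
	socklen_t len = sizeof(addr);
	int lsk, csk, ask;

	/* 1. One end: listening socket at the desired descriptor.
	 *    Binding with only the family set asks Linux to autobind. */
	lsk = move_fd(socket(AF_UNIX, SOCK_STREAM, 0), fd_a);
	if (lsk < 0 ||
	    bind(lsk, (struct sockaddr *)&addr, sizeof(sa_family_t)) < 0 ||
	    listen(lsk, 1) < 0 ||
	    getsockname(lsk, (struct sockaddr *)&addr, &len) < 0)
		return -1;

	/* 2. Other end: plain socket at the desired descriptor */
	csk = move_fd(socket(AF_UNIX, SOCK_STREAM, 0), fd_b);
	if (csk < 0)
		return -1;

	/* 3. connect() on unix sockets does not wait for accept() */
	if (connect(csk, (struct sockaddr *)&addr, len) < 0)
		return -1;

	/* 4. accept() and dup2 the new fd over the listening one */
	ask = accept(lsk, NULL, NULL);
	if (ask < 0)
		return -1;
	if (dup2(ask, fd_a) < 0)
		return -1;
	close(ask);
	return 0;
}
```

After step 4 the descriptor that used to be the listening socket refers to the accepted connection, so both ends sit at their original numbers.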
There's a problem with this approach -- socket names are not preserved, but
looking into our OpenVZ implementation I think this is OK for existing apps.
What should be done next is:
1. Need to merge the file IDs patches in our tree and have Andrey
support file sharing. This will solve the
sk = socket();
fork();
case. Currently it simply doesn't work :(
2. Need to add support for DGRAM sockets -- I wrote comment how to do it
in the can_dump_unix_sk()
3. Need to add support for in-flight connections
4. Implement support for UDP sockets (quite simple)
5. Implement support for listening TCP sockets (also not very complex)
6. Implement support for connected TCP sockets (hard one, Tejun's patches are not
very good for this from my POV)
Cyrill, plz, apply this patch and put the above descriptions onto wiki docs (do we
have the plans page yet?).
Andrey, plz, take care of unix sockets tests in zdtm. Most likely it won't work till
you do the shared files support for sockets.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Instead of keeping all helpers unrelated to
the C/R procedure in util.c, move
logging-related helpers to log.c.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>