Some file-type specific parameters can be fetched with
parasite code only, so lets carry parasite control block
pointer in struct fd_parms.
This is a bit ugly but requires less code to touch and
enough for now. In long terms we need some more generalized
routine/hooks which would depends on file type.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Will need it to fetch tty link parameters. This is
because the kernel provides SID/PGID related to the
caller context, and if we're dumping the process inside
namespace -- we need local SID/PGIDs, not global ones.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if here no task found which would restore
controlling terminal -- exit with error instead of
continue with just error message.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
No need for memcpy here, it's plain integer value
which need to be filled.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dumping them is performed via parasite, since calling the getgroups
is the only way of getting the complete list. Currently the nr of
groups to dump is limited explicitly with the size of shared memory
between crtools and parasite. This is MUCH more that we have seen
on real apps so far.
Restoring is done early, before restorer blob not to carry the undefined
array of grpous in there. This is OK, since groups do not affect us at
that point and are not affected by subsequent creds restore.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Getting groups can be done vie proc, but there's only 32 on them,
while task may have up to 65k :( We will use parasite for that and
thus require this syscall definition.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently move there the secbits dumping, which is not dumped
via misc-dumping command. This patch is required to support
per-task groups dumping (setgroups/getgroups) -- we'll have to
drain the groups from parasite.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I believe this make sense to keep this structure
in pstree.h where pstree related data lives.
Also I've added some comments on struct pid members.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
- @list member closer to @children
- add some comments on memebers
- add space lines for members grouping
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Remove the restorer-log and link log-simple into restorer
blob. Now we can use the normal pr_foo API.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's supposed to be used by parasite and restorer blob. It
has API equal to the core one -- with setfd, set_loglevel and
(the main thing) print_on_level fn. It currently supports only
strings, decimal and hex numbers (int and long).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It reads /proc/PID/fd and close all descriptors except service fds.
v2: s/is_one_of_service_fds/is_any_service_fd
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For executing an external tools we need to block a SIGCHLD
and to juggle file descriptors.
SIGCHLD is blocked for getting an exit code.
A problem with file descriptors can be if we want to set 2 to STDIN,
1 to STDERR, 0 to STDOUT for example.
v2: use helpers reopen_fd_as and move_img_fd
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
By default crtools shouldn't modify the environment, except for
killing the dumped tasks. The link remap does so and should sit
under explicit cmdline option.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These are not ghost, as they are still on fs, so we cannot take
them with us in the image. Neither we can easily find the other name
of that file. Sad :(
To make it work we linkat() the new name to that file using the
AT_EMPTY_PATH flag to link directly to the opened fd. If we could
openat() the fd's parent we would better do it, but we can't and
thus have to create the link name by explicit absolute path :(
This modifies the fs we're dumping, so I'll introduce one more cmd
line option for that soon.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For linked remaps we'll use similar technique as for ghost
files, but lighter. For that sake make reg_file_info remap
to file_remap, not to the whole host_file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The biggest acheivement since v0.1 -- initial support for LXC containers!
Other less notable (but still great) things done are:
* Implemented support for TTY-s
* Added support for packet sockets
* Bug-fix here and there
Note, that images generated by v0.1 tool are accepted by v0.2 one.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We don't support yet detached terminals migration,
so fail early if we can't proceed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In commit 97cfb707 we extended scm_fdset structure
up to 7K thus we need to increase the parasite stack
size enough to hold it. It was not noticed simply because
we were poking arguments area of memory.
The memory map of args and stack for parasite look like
+-----------+ min address
| | |
| arguments | v (8K)
| |
+-----------+
| |
| stack | ^ (16K)
| | |
+-----------+ max address
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise there is a race between files with same names:
link(name -> ghost) link(name->ghost)
open(name)
unlink(name)
open(name) -> ENOENT
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
One function is used on restoring and one is used on dumping,
so each function has own prefix rst or cpt.
The both functions have the same effect, so the main part of the names
is same and it describes "unlock_tcp_connections".
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Scripts are executed when external actions required.
CRTOOLS_SCRIPT_ACTION contains a required action.
If a script doesn't know a current action, it should exit with 0.
The first usecase will be lock/unlock network.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
TCP_REPAIR should be droppet when a network is unlocked.
A network should be unlocked at the last moment, because
after this moment restore must not failed, otherwise a state of
a tcp connection can be changed and a state of one side in our image
will be invalid.
v2: use xremalloc instead of mmap and remmap
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Use sys_setsockopt and declare a function in a header as "static inline"
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There are places when we have to select which fd to ->open
and which to ->receive. To avoid deadlocks we sort them in
an ascending order on { pid, fd } pair.
Make this idea more formal by introducing an explicit function
doing this check and call it where appropriate (pipe.c master
selection is also simplified to fit new ... API).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Don't do explicit switch by file type in files.c and don't export
intimate knowledge of pty being master/slave. Use a file desc op
for that.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We need to do two non-trivial things with ttys -- interconnect
slaves to masters (or to each other) and setup ctl-tty restoring
task.
Now this is done in subsequently depending on each other steps:
1. collect ttys
2. interconnect slaves and mark ctl-tty tasks
3. collect fake fds for tty-ctl tasks
4. setup orphaned slaves
We can relax this logic in two ways:
1. don't split marking ctl-tty tasks and then creating fds for them
do it in one step at the end
2. don't interconnect slaves with masters and orphaned slaves in
two steps -- do it in one place after fds are collected
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In case if there is no master peer associated
with a slave peer we have two cases
- the master peer was closed before slave
- we just have no master peer at all, but
only slave one
This patch addresses only first case -- we open
fake master and hook slaves on it, then close it
immediately.
The second case will be addressed later.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Add ability to use the same macros in restorer code.
In the future we will add ability to show arguments like printf.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We're using get_service_fd in file engine,
better to make it fast. This patch caches
the limits system provides us, instead of
calling getrlimit() every time.
This patch introduces is_service_fd helper
which will be used instead of get_service_fd
where it make sense.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The idea behind is pretty simple -- once we find
that there is a controlling terminal present we
do call ioctl on appropriate /dev/pts/N.
This is done in a bit unusuall manner. When we
find that there is a controling terminal present
we do create an additional FdinfoEntry for it
with object id taken from existing master peer.
The file engine stack this new FdinfoEntry on
fd_info_head head list. Thus we will have at
least two entries on this list. One for real
Fdinfo associated with master peer and one for
our new generated Fdfinfo entry, it depends on
pid which one become a file master.
Finally we do use post_open_fd hook in our
tty code which allows us to open controlling
terminal and yield proper ioctl on it.
v2:
- restore control terminals via service fd,
still need to speedup service fd retrieval.
v3:
- use prepare_ctl_tty() helper to generate
control terminal fdinfo entry
v4:
- use post_open_fd
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Usually the PTYs represent a pair of links -- master peer and slave
peer. Master peer must be opened before slave. Internally, when kernel
creates master peer it also generates a slave interface in a form of
/dev/pts/N, where N is that named pty "index". Master/slave connection
unambiguously identified by this index.
Still, one master can carry multiple slaves -- for example a user opens
one master via /dev/ptmx and appropriate /dev/pts/N in sequence.
The result will be the following
master
`- slave 1
`- slave 2
both slave will have same master index but different file descriptors.
Still inside the kernel pty parameters are same for both slaves. Thus
only one slave parameters should be restored, there is no need to carry
all parameters for every slave peer we've found.
Not yet addressed problems:
- At moment of restore the master peer might be already closed for
any reason so to resolve such problem we need to open a fake master
peer with proper index and hook a slave on it, then we close
master peer.
- Need to figure out how to deal with ttys which have some
data in buffers not yet flushed, at moment this data will
be simply lost during c/r
- Need to restore control terminals
- Need to fetch tty flags such as exclusive/packet-mode,
this can't be done without kernel patching
[ avagin@:
- ideas on contol terminals restore
- overall code redesign and simplification
]
v4:
- drop redundant pid from dump_chrdev
- make sure optional fown is passed on regular ptys
- add a comments about zeroifying termios
- get rid of redundant empty line in files.c
v5 (by avagin@):
- complete rework of tty image format, now we have
two files -- tty.img and tty-info.img. The idea
behind to reduce data being stored.
v6 (by xemul@):
- packet mode should be set to true in image,
until properly fetched from the kernel
- verify image data on retrieval
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we will be restoring ttys we need that restore
procedure for master peers will be yielded earlier
than for slave peers due to ttys specifics. With this
commit we introduce @tty_slaves list which will allow
us to order tty file restore procesure.
Because we need to fetch which list to be used depending
on tty type this patch extend select_ps_list with fdinfo_list_entry
parameter.
v2 (by xemul@):
- make sure the epoll list is still last
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for restoring epollfd.
Currently a transport fd may be added to epollfd.
epollfd should be populated, when all descriptors were already received.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A ghost file is used for restoring descriptors of an unlinked file.
It is created, opened and deleted.
Currently ghost files are collected in root task and then removed
by crtools when everybody is restored. This scheme doesn't work,
ghost_file_list is not shared, plus tasks may live in different mount
namespace.
It was broken by the following commit:
bd4e5d2f restore: prepare shared objects after initializing namespaces
We can't just move clear_ghost_files(), because we need to wait, until
all processes have not opened a ghost file.
We can add one more global barrier or move clear_ghost_files() in
a restore code bellow an existent barrier.
Here is a better sollution, a gost file is deleted by the last user.
v2: Use the type atomic_t and fix a commit message.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>