* The following files go into the directory arch/x86/include/asm unmodified:
- include/atomic.h,
- include/linkage.h,
- include/memcpy_64.h,
- include/types.h,
- include/bitops.h,
- pie/parasite-head-x86-64.S,
- include/processor-flags.h,
- include/syscall-x86-64.def.
* Changed include directives in the source files that include the headers
listed above.
* Modified build scripts to reflect the source moves.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Make them look like __CR_<smth>_H__ with
sed -e '1,2s/#\(ifndef\|define\) _\?_\?\(CR_\)\?/#\1 __CR_/' -e '1,2s/_H_\?_\?.*$/_H__/'
on every header file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is mostly a bugfix and improvements release.
Nonetheless, some new features exist; the most interesting are:
* proper COW mappings handling
* full packet sockets support, thus supporting the tcpdump tool
* the --shell-job option, which makes it possible to dump apps
launched from one shell and restore them in another
Some features are available with the custom kernel, but the good
news is that now _all_ of the patches we need are in one of the
-next trees or in the -mm one, and thus have good chances to get
merged in 3.8 (or soon after it).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This thing on pstree_item was created to carry task-specific
information across the "restore" code-flow.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A parent process can change a few pages after forking a child, and
none of these pages should be available to the child.
Each vma has a bitmap of existing pages. The parent's and child's bitmaps
can be compared, and all pages which are not present in the child's bitmap
are dropped.
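A minimal sketch of the bitmap comparison described above. The helper
names (test_page_bit, pages_to_drop) and the flat unsigned long bitmap
layout are assumptions made for illustration, not the actual crtools code.
The real restorer would drop the pages; the sketch only collects their
indexes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return non-zero if bit nr is set in the bitmap. */
static int test_page_bit(const unsigned long *map, size_t nr)
{
	return (int)((map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

/*
 * Collect the indexes of pages that are present in the parent's bitmap
 * but absent from the child's one. Returns the number of indexes stored
 * in drop[], which must have room for nr_pages entries.
 */
static size_t pages_to_drop(const unsigned long *parent,
			    const unsigned long *child,
			    size_t nr_pages, size_t *drop)
{
	size_t i, n = 0;

	assert(parent && child && drop);

	for (i = 0; i < nr_pages; i++)
		if (test_page_bit(parent, i) && !test_page_bit(child, i))
			drop[n++] = i;

	return n;
}
```

With parent pages 0-3 present and child pages 0 and 2 present, the
sketch reports pages 1 and 3 as the ones to drop.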
v2: don't check page_bitmap on NULL
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Three parts.
Proc: open of map_files' link doesn't work on sockets. We fstatat
it and check that it's a socket (it will be packet), then save
the socket inode on vma_area.
Dump: we resolve socket inode to socket id and save it on vma.
We use id, not inode, since on restore we'll have to mmap some
opened file, not just abstract socket with inode.
Restore: when reading vma-s we just need to find out on what fd
the respective packet socket is opened (i.e. -- no map-and-close
sockets supported by now) and dup() it to let restorer mmap it
back.
All this makes it possible to c/r the tcpdump tool!
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This option tells the tool to proceed with dumping
even if the root task is not a session leader.
This implies that the option makes it possible to "migrate"
one external tty connection. Say, a person may dump the
"top" application in one bash shell and restore it
in another shell session.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need it for migrating slave ttys. It serves one purpose --
to clone our own stdio descriptor and use it with the tty layer, which will
be addressed in further patches.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I believe it makes sense to keep this structure
in pstree.h where pstree-related data lives.
Also I've added some comments on struct pid members.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It reads /proc/PID/fd and closes all descriptors except service fds.
v2: s/is_one_of_service_fds/is_any_service_fd
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
By default crtools shouldn't modify the environment, except for
killing the dumped tasks. The link remap does so and should sit
under an explicit cmdline option.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The biggest achievement since v0.1 -- initial support for LXC containers!
Other less notable (but still great) things done are:
* Implemented support for TTY-s
* Added support for packet sockets
* Bug fixes here and there
Note that images generated by the v0.1 tool are accepted by the v0.2 one.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Scripts are executed when external actions are required.
CRTOOLS_SCRIPT_ACTION contains the required action.
If a script doesn't know the current action, it should exit with 0.
The first use case will be locking/unlocking the network.
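A minimal sketch of a helper that follows this contract. Only the
CRTOOLS_SCRIPT_ACTION variable comes from the text above; the action
names "network-lock" and "network-unlock" and the function names are
hypothetical, and the lock state is just a flag instead of real
firewall rules.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical action names, chosen for illustration only. */
#define ACT_NET_LOCK	"network-lock"
#define ACT_NET_UNLOCK	"network-unlock"

/*
 * Handle one action. Returns the exit code of the script:
 * unknown (or missing) actions are not an error and yield 0.
 */
static int handle_action(const char *action, int *net_locked)
{
	assert(net_locked);

	if (action == NULL)
		return 0;

	if (strcmp(action, ACT_NET_LOCK) == 0) {
		*net_locked = 1;	/* a real script would block traffic here */
		return 0;
	}

	if (strcmp(action, ACT_NET_UNLOCK) == 0) {
		*net_locked = 0;	/* ... and unblock it here */
		return 0;
	}

	return 0;	/* unknown action: exit with 0, as required */
}

/* Entry point of the helper: the action is taken from the environment. */
static int run_script(int *net_locked)
{
	return handle_action(getenv("CRTOOLS_SCRIPT_ACTION"), net_locked);
}
```

Exiting with 0 on unknown actions lets new actions be added later
without breaking existing scripts.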
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We're using get_service_fd in the file engine,
so it's better to make it fast. This patch caches
the limits the system provides us, instead of
calling getrlimit() every time.
This patch introduces the is_service_fd helper,
which will be used instead of get_service_fd
where it makes sense.
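A minimal sketch of the caching idea. The names get_service_fd and
is_service_fd come from the text above, but their signatures, the
sfd_type enum and the "service fds sit just below RLIMIT_NOFILE" layout
are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical set of service descriptors. */
enum sfd_type { SFD_LOG, SFD_IMG, SFD_PROC, SFD_MAX };

/* Cached RLIMIT_NOFILE soft limit, -1 until the first successful query. */
static int cached_nofile = -1;

/* Query the limit once and reuse the cached value afterwards. */
static int nofile_limit(void)
{
	struct rlimit rl;

	if (cached_nofile >= 0)
		return cached_nofile;

	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;

	/* Clamp so that huge or infinite limits still fit into an int. */
	cached_nofile = rl.rlim_cur > (rlim_t)INT_MAX ? INT_MAX : (int)rl.rlim_cur;
	return cached_nofile;
}

/* Service fds are assumed to occupy the topmost descriptor numbers. */
static int get_service_fd(enum sfd_type type)
{
	int limit = nofile_limit();

	assert((int)type >= 0 && (int)type < SFD_MAX);
	return limit < 0 ? -1 : limit - 1 - (int)type;
}

/* Cheap check: no syscall is made once the limit is cached. */
static int is_service_fd(int fd, enum sfd_type type)
{
	return fd >= 0 && fd == get_service_fd(type);
}
```

After the first call, every lookup is plain arithmetic on the cached
value, which is what makes the helper cheap enough for the file engine.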
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The idea behind it is pretty simple -- once we find
that there is a controlling terminal present, we
call ioctl on the appropriate /dev/pts/N.
This is done in a slightly unusual manner. When we
find that there is a controlling terminal present,
we create an additional FdinfoEntry for it
with the object id taken from the existing master peer.
The file engine stacks this new FdinfoEntry on the
fd_info_head list. Thus we will have at
least two entries on this list: one for the real
Fdinfo associated with the master peer and one for
our newly generated Fdinfo entry; which one becomes
the file master depends on the pid.
Finally we use the post_open_fd hook in our
tty code, which allows us to open the controlling
terminal and issue the proper ioctl on it.
v2:
- restore control terminals via service fd,
still need to speed up service fd retrieval.
v3:
- use prepare_ctl_tty() helper to generate
control terminal fdinfo entry
v4:
- use post_open_fd
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Usually PTYs represent a pair of links -- a master peer and a slave
peer. The master peer must be opened before the slave. Internally, when the
kernel creates the master peer it also generates a slave interface in the form
of /dev/pts/N, where N is the pty "index". The master/slave connection
is unambiguously identified by this index.
Still, one master can carry multiple slaves -- for example, a user opens
one master via /dev/ptmx and then the appropriate /dev/pts/N in sequence.
The result will be the following
	master
	`- slave 1
	`- slave 2
Both slaves will have the same master index but different file descriptors.
Still, inside the kernel the pty parameters are the same for both slaves. Thus
only one slave's parameters should be restored; there is no need to carry
all parameters for every slave peer we've found.
Not yet addressed problems:
- At the moment of restore the master peer might already be closed for
any reason, so to resolve this problem we need to open a fake master
peer with the proper index, hook a slave on it, and then close the
master peer.
- Need to figure out how to deal with ttys which have some
data in buffers not yet flushed; at the moment this data will
simply be lost during c/r
- Need to restore control terminals
- Need to fetch tty flags such as exclusive/packet-mode,
this can't be done without kernel patching
[ avagin@:
- ideas on control terminals restore
- overall code redesign and simplification
]
v4:
- drop redundant pid from dump_chrdev
- make sure optional fown is passed on regular ptys
- add comments about zeroifying termios
- get rid of redundant empty line in files.c
v5 (by avagin@):
- complete rework of tty image format, now we have
two files -- tty.img and tty-info.img. The idea
behind it is to reduce the data being stored.
v6 (by xemul@):
- packet mode should be set to true in image,
until properly fetched from the kernel
- verify image data on retrieval
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When restoring ttys we need the restore
procedure for master peers to be run earlier
than for slave peers due to tty specifics. With this
commit we introduce the @tty_slaves list, which will allow
us to order the tty file restore procedure.
Because we need to determine which list to use depending
on the tty type, this patch extends select_ps_list with an fdinfo_list_entry
parameter.
v2 (by xemul@):
- make sure the epoll list is still last
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Things are a bit of a mess here, because we used unsigned int instead of pid_t.
A negative value is used for uninitialized PID variables.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When restoring a container, crtools creates a veth pair inside it and then
pushes one end to the namespaces crtools lives in (outside). To facilitate
the subsequent management of the other end of the veth pair this option
is added -- one can specify a name by which the respective end would be
visible. E.g.: --veth-pair eth0=veth101.0
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We can set a directory for log and image files.
crtools sets it as the current directory and then creates all files in it.
This works only until we decide to change the mount namespace.
I suggest opening the log dir and creating files with the help of openat().
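A minimal sketch of the openat() approach: the directory is opened once
and files are later created relative to that descriptor, so the result
no longer depends on the current directory. The helper names and the
file mode are assumptions made for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open the working directory once; the fd stays valid after chdir(). */
static int open_work_dir(const char *path)
{
	return open(path, O_RDONLY | O_DIRECTORY);
}

/* Create (or truncate) a file relative to the saved directory fd. */
static int create_in_dir(int dirfd, const char *name)
{
	assert(name && name[0] != '/');	/* absolute paths would ignore dirfd */
	return openat(dirfd, name, O_CREAT | O_WRONLY | O_TRUNC, 0600);
}
```

A caller would keep the directory descriptor around (for example as a
service fd) and pass it to every file-creating helper.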
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When we restore a pid namespace the root task will get some unknown pid
in the original (i.e. -- the ns crtools is launched from) one. To find
this pid out one can use this option -- it will make the pid obtained by
the new init to be written into a pid file.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When dumping a tmpfs mount we need to take its contents with us.
So, use tar for it and put it into the image dir.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Support only basic packet socket functionality -- create and bind.
This should be enough to start testing dhclient inside a container.
Other stuff (filter, mmaps, fanouts, etc.) will come later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
"init" in LXC opens /dev/null and then mounts devtmpfs in /dev,
so crtools cannot resolve the path to the original /dev/null.
With the --evasive-devices option, crtools will check that the original
device and the new device are the same and, if so, will
dump the new path.
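A minimal sketch of what "the same device" could mean here: both nodes
are device files of the same type with equal device numbers. This
criterion and the function name are assumptions made for illustration;
the actual check in crtools may differ.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Return 1 if cand refers to the same device as orig, i.e. both are
 * char (or both block) device nodes with the same st_rdev, else 0.
 */
static int same_device_node(const struct stat *orig, const struct stat *cand)
{
	assert(orig && cand);

	/* Only device nodes qualify for the evasive lookup. */
	if (!S_ISCHR(orig->st_mode) && !S_ISBLK(orig->st_mode))
		return 0;

	/* The node type (char vs block) must match. */
	if ((orig->st_mode & S_IFMT) != (cand->st_mode & S_IFMT))
		return 0;

	return orig->st_rdev == cand->st_rdev;
}
```

The two stat structures would come from fstat() on the opened
descriptor and stat() on the path as it is currently visible.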
v2: add a description for the option
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The option is -r|--pivot-root and its argument is the path to the new root.
The root task will call pivot_root. An LXC CT does that, so we need it
for restoring.
v2: s/pivot-root/root/
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Only the fact of the fd presence, its flags and fown and the sigmask.
The sigpending state is tightly coupled with the task's sigpending
state which is not yet supported.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Only support the lo device. This is not final yet (much more
stuff is to be handled for a link) but is rather a skeleton
showing how to do it and letting us check the LXC container
early.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is the first release of the tool! :)
Supported features:
* x86_64 architecture
* process' linkage
* process groups and sessions (without ttys though :\ )
* memory mappings of any kind (shared, file, etc.)
* threads
* open files (shared between tasks and partially opened-and-unlinked)
* pipes and fifos with data
* unix sockets with packet queues contents
* TCP and UDP sockets (TCP connections support exists, but needs polishing)
* inotifies, eventpoll and eventfd
* tasks' sigactions setup, credentials and itimers
* IPC, mount and PID namespaces
Most of the above works with kernel v3.5!
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We haven't tested it for several months and there's no evidence
it is required at all. For dumping a single task -t option works
just fine.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we store the images version in the core file. This is
bad, since core file describes a single process (or thread) and
says nothing about the images set as a whole (let alone the fact
that it's being parsed too late).
Thus introduce the inventory image file which describes the image
set the way we need (want). For now the only entry in it is the
images version. In the future it can be extended.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2:
- Use regular uint types in message proto
- Use PB engine for "show"
v3:
- drop usage of temp. variable in prepare_shmem_pid
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are ready to use FownEntry everywhere,
so drop fown_t type and clean up source code.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Checkpoint and restore of fifos is similar to
pipes c/r, except that the pipe end-points are a named
file.
Because the fifo has a name, we use the regular files
facility for fifo path c/r.
Still there is a trick used to "open" a fifo:
the opening procedure might sleep if the fifo's peer
is not yet opened, so before doing a real open
we do a fake open (with the O_RDWR flag),
which prevents us from sleeping even if the peer
is not yet ready. Also we need a writable fifo
end to restore the queued data.
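A minimal sketch of the fake-open trick, assuming Linux semantics where
opening a fifo with O_RDWR never blocks. The helper names are
hypothetical and error handling is reduced to the bare minimum.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create the fifo node; restore code would do this before opening. */
static int create_fifo(const char *path)
{
	return mkfifo(path, 0600);
}

/*
 * Open the read end of a fifo without sleeping on a missing writer.
 * The O_RDWR "fake" descriptor is returned via *fake_fd so the caller
 * can push queued data through it and close it afterwards.
 */
static int open_fifo_read_end(const char *path, int *fake_fd)
{
	int fake, fd;

	assert(path && fake_fd);

	fake = open(path, O_RDWR);	/* does not block on Linux */
	if (fake < 0)
		return -1;

	fd = open(path, O_RDONLY);	/* a writer exists now, so no sleep */
	if (fd < 0) {
		close(fake);
		return -1;
	}

	*fake_fd = fake;
	return fd;
}
```

Data written into the fake descriptor becomes readable from the real
one, which is how the queued contents can be put back.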
v2:
- add open/priv members to reg_file_info
- make open_fifo_fd to use open_fe_fd
- comment on pipe_id
- make sure the fifo data is not restored twice
v3:
- drop useless fixme comment and add sane one
v4:
- Use restore_data flag to escape data restore duplication
- Use S_ISREG for file contents copying
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's a sign that a parent changed its sid after forking a child.
We should know the sid with which a process was born, because in a process
chain more than one process might change its SID.
v2: fix names of variables
v3: prevent rewriting of born_sid
v4: Abort the restorer with an error message if a born_sid can't be determined.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
v2: rework this by using openat() and service fds for proc root.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now the pid is dumped as seen from the pid ns; it's obtained from the parasite.
v2: fail if a zombie is in PIDNS
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>