mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 04:48:16 +00:00

1696 Commits

Author SHA1 Message Date
Andrey Vagin
9826d2dd04 crtools: don't include pstree.h in namespaces.h
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:39:50 +04:00
Andrey Vagin
82cd9e2c66 parasite: don't include restorer.h in parasite-syscall.c
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:39:36 +04:00
Andrey Vagin
924acd8450 shmem: don't include restorer.h in shmem.c
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:38:43 +04:00
Andrey Vagin
a6edbcf669 crtools: don't include restorer.h in proc_parse.h
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-06 12:37:55 +04:00
Pavel Emelyanov
45f39e0415 rst: Make shmem restore to use rst-malloc
This actually fixes a bug -- memory for shmem info was
not allocated dynamically, thus we were limited in the
amount of shmems to be restored.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-02 01:06:31 +04:00
Pavel Emelyanov
c9aaf9f3c4 rst: Introduce an engine to allocate memory on restore
On restore we need different types of memory allocation.
Here's an engine that tries to generalize them all. The
main difference is in how the buffer with objects is
grown.

There are 3 types of memory allocations:

1. shared memory -- objects, that will be used by all criu
   children, but will not reach the restorer
2. shared remappable -- the same, but the restorer needs
   access to them, i.e. the buffer with objects will get
   remapped into the restorer
3. private -- the same, but allocated by each task for itself

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-02 01:01:48 +04:00
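Sketch for c9aaf9f3c4: a minimal illustration of the shared versus private distinction described in the message, written as a bump allocator over an anonymous mapping. The names `arena_kind`, `arena_init` and `arena_alloc` are hypothetical and are not CRIU's API. The sketch uses a fixed-size buffer, so it deliberately leaves out buffer growth and the remapping into the restorer, which are the parts the commit says differ between the allocation types.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical names for illustration; not taken from the CRIU sources. */
enum arena_kind { ARENA_SHARED, ARENA_PRIVATE };

struct arena {
	char *buf;	/* start of the mapping */
	size_t size;	/* bytes mapped */
	size_t used;	/* bytes handed out so far */
};

/*
 * Map a fixed-size anonymous buffer. A MAP_SHARED mapping stays shared
 * with children created by fork(); a MAP_PRIVATE one is copy-on-write.
 */
static int arena_init(struct arena *a, enum arena_kind kind, size_t size)
{
	int flags = MAP_ANONYMOUS |
		    (kind == ARENA_SHARED ? MAP_SHARED : MAP_PRIVATE);
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);

	if (p == MAP_FAILED)
		return -1;

	a->buf = p;
	a->size = size;
	a->used = 0;
	return 0;
}

/* Bump allocation: round the request up to 8 bytes and advance the cursor. */
static void *arena_alloc(struct arena *a, size_t len)
{
	void *p;

	if (len > a->size)
		return NULL;
	len = (len + 7) & ~(size_t)7;
	if (len > a->size - a->used)
		return NULL;

	p = a->buf + a->used;
	a->used += len;
	return p;
}
```

Objects are never freed one by one in this scheme; the whole mapping would be dropped at once with munmap().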
Igor Sukhih
647207714a util: Update kdev_to_odev to respect BITS_PER_LONG
Depending on BITS_PER_LONG, the userspace representation of dev_t
may vary, so we need to choose the proper encoding.

Signed-off-by: Igor Sukhih <igor@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-01 17:40:54 +04:00
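Sketch for 647207714a: one way the encoding choice could look. It assumes the kernel-internal device number keeps a 12-bit major above a 20-bit minor, and that 32-bit userspace gets the old `(major << 8) | minor` form while 64-bit userspace gets the newer form that spreads the minor around the major. The function name and the run-time `bits_per_long` parameter are illustrative only; real code would select the branch at compile time.

```c
#include <stdint.h>

#define KDEV_MINORBITS	20
#define KDEV_MINORMASK	((1u << KDEV_MINORBITS) - 1)

/*
 * Convert a kernel-internal device number to the value userspace would
 * see in st_dev. Assumption: 12-bit major / 20-bit minor on input.
 */
static uint64_t kdev_to_odev_sketch(uint32_t kdev, int bits_per_long)
{
	uint64_t major = kdev >> KDEV_MINORBITS;
	uint64_t minor = kdev & KDEV_MINORMASK;

	if (bits_per_long == 32)
		return (major << 8) | minor;		/* old encoding */

	/* new encoding: low 8 minor bits, then major, then the rest of minor */
	return (minor & 0xff) | (major << 8) | ((minor & ~(uint64_t)0xff) << 12);
}
```

For minors below 256 both encodings give the same number, which is why a mismatch only shows up on devices with large minor numbers.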
Alexander Kartashov
03a9c6b0b7 parasite-syscall: use the ptrace requests PTRACE_(GET|SET)REGSET to retrieve and set CPU registers
This patch introduces the routines ptrace_get_gpregs() and ptrace_set_gpregs()
that wrap the ptrace interface to get and set CPU registers respectively.
The motivation is to make the CRIU code compatible with architectures that
don't support the PTRACE_GETREGS and PTRACE_SETREGS ptrace calls and
implement the requests PTRACE_GETREGSET and PTRACE_SETREGSET instead.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-01 14:26:27 +04:00
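Sketch for 03a9c6b0b7: how wrappers around PTRACE_GETREGSET and PTRACE_SETREGSET might look on Linux with glibc. The function names below are hypothetical stand-ins, not the routines from the patch. The regset interface takes a `struct iovec` describing the register buffer and selects the general-purpose set with `NT_PRSTATUS`. The demo helper forks a child that asks to be traced, reads its registers and writes them back unchanged.

```c
#define _GNU_SOURCE
#include <elf.h>		/* NT_PRSTATUS */
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>		/* struct iovec */
#include <sys/user.h>		/* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/* Read general-purpose registers of a stopped tracee via the regset call. */
static int get_gpregs_sketch(pid_t pid, struct user_regs_struct *regs)
{
	struct iovec iov = { .iov_base = regs, .iov_len = sizeof(*regs) };

	if (ptrace(PTRACE_GETREGSET, pid, (void *)(long)NT_PRSTATUS, &iov) == -1)
		return -1;
	return 0;
}

/* Write general-purpose registers of a stopped tracee via the regset call. */
static int set_gpregs_sketch(pid_t pid, struct user_regs_struct *regs)
{
	struct iovec iov = { .iov_base = regs, .iov_len = sizeof(*regs) };

	if (ptrace(PTRACE_SETREGSET, pid, (void *)(long)NT_PRSTATUS, &iov) == -1)
		return -1;
	return 0;
}

/* Fork a traced child, fetch its registers and store them back as-is. */
static int demo_regs_roundtrip(void)
{
	struct user_regs_struct regs;
	int status, ret = -1;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
			_exit(1);
		raise(SIGSTOP);		/* stop so the parent can inspect us */
		_exit(0);
	}

	if (waitpid(pid, &status, WUNTRACED) != pid)
		return -1;
	if (!WIFSTOPPED(status))
		return -1;		/* child exited and is already reaped */

	if (get_gpregs_sketch(pid, &regs) == 0 &&
	    set_gpregs_sketch(pid, &regs) == 0)
		ret = 0;

	kill(pid, SIGKILL);
	waitpid(pid, &status, 0);
	return ret;
}
```

The demo returns 0 only where ptrace is permitted for the calling process; some sandboxes block it.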
Cyrill Gorcunov
fcfa58026c dump: Don't forget to cleanup link remap if needed
If the checkpoint fails or the -R option is passed,
we need to remove the link remap files created during
the dump procedure.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-11-01 13:36:07 +04:00
Ruslan Kuprieiev
fdbedf5a88 crtools: add init_opts()
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-31 20:00:04 +04:00
Pavel Emelyanov
c8fe0e5ea7 parasite: Introduce thread_ctx structure
This one keeps registers and sigmask for running thread. Will
be used for simpler parasite management.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-30 16:49:22 +04:00
Cyrill Gorcunov
0707df7745 restore: Don't unmap vdso proxy on final cleanup
If we need to use the vdso proxy, the memory area
which holds the restorer also has a place for the vdso proxy
code itself, so on the final pass we should not unmap it;
otherwise any call to a vdso function will cause SIGSEGV.

IOW, the memory before the final "cleanup" pass of the restorer
might look like

    +-----------+---------+     +-------------+------+
    | bootstrap | rt-vdso | ... | application | vdso |
    +-----------+---------+     +-------------+------+
                       ^                         |
                       `-------------------------+

and we have redirected "vdso" code to jump to "rt-vdso".
After the final pass the memory must look like

                +---------+     +-------------+------+
                | rt-vdso | ... | application | vdso |
                +---------+     +-------------+------+
                       ^                         |
                       `-------------------------+

I noticed this problem during container migration
testing: the container was suspended on a 2.6.32
OpenVZ kernel with apache running inside, and any attempt
to connect to apache caused apache to crash.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-30 16:30:57 +04:00
Pavel Emelyanov
55a04580d5 restorer: Compact rst stack evaluation code
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-29 22:59:48 +04:00
Pavel Emelyanov
00dc26602a restorer: Remove unused heap
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-29 22:59:41 +04:00
Pavel Emelyanov
b978c6f873 util: Introduce buffer size for carrying /proc/self/fd/N path
There's ... a number of places where we want to do something
with a /proc/self/fd/%d path. Each time we guess a buffer size
that is large enough for this. Make a standard constant for it,
which saves some space on the stack and drops args from some functions.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-15 13:59:59 +04:00
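Sketch for b978c6f873: a single constant sized for the longest possible /proc/self/fd/N path, assuming N is a non-negative 32-bit int. The macro and function names are hypothetical; the idea is only that `sizeof` on a string literal with the largest fd gives a buffer size that is known at compile time and includes the terminating NUL.

```c
#include <stdio.h>

/* Hypothetical name: room for "/proc/self/fd/" + INT_MAX digits + NUL. */
#define PROC_SELF_FD_BUFSZ	(sizeof("/proc/self/fd/2147483647"))

/*
 * Format the path for fd into buf. Returns the path length on success
 * or -1 if the buffer was too small.
 */
static int proc_self_fd_path(char *buf, size_t size, int fd)
{
	int n = snprintf(buf, size, "/proc/self/fd/%d", fd);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}
```

A caller would declare `char path[PROC_SELF_FD_BUFSZ];` on the stack instead of guessing a size at each call site.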
Pavel Emelyanov
14a7aff288 rst: Read sys.last_cap only once in kerndat
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-12 00:03:25 +04:00
Pavel Emelyanov
c9d3145843 service: Change default socket path to /var/run/
This is where such stuff is typically placed.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-11 17:56:46 +04:00
Pavel Emelyanov
f0a8643736 kerndat: Initialize necessary kerndats on restore
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-11 17:38:57 +04:00
Pavel Emelyanov
20d64b4326 dump: Install target ns' proc fd as service fd
Don't carry it around in a static global variable. This will
be useful when scanning for pidns leaks (processes that entered one).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-10 15:07:01 +04:00
Cyrill Gorcunov
664659a0ad inet: tcp -- Find size of max memory allowed to restore TCP data
The maximal size which may be used in the kernel for sending TCP data
on restore varies depending on how much memory is installed on the
system; moreover, the memory allocated for the "read queue" is bigger than
that used for the "write queue". Thus when we have checkpointed a big slab of data
we need to figure out which size is allowed for sending data on restore.

For this we read /proc/sys/net/ipv4/tcp_[wmem|rmem] on restore and calculate
the size needed, then we simply chop the data into segments and send it
in a loop.

Typical output on restore is something like

 | (00.013001)  30110: TCP queue memory limits are 2097152:3145728

https://bugzilla.openvz.org/show_bug.cgi?id=2751

[xemul: moved stuff to kerndat.c]

Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-04 16:18:24 +04:00
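Sketch for 664659a0ad: the two steps from the message in isolation. The first helper parses a line in the three-number "min default max" form used by the tcp_wmem and tcp_rmem sysctl files; the second writes a buffer in pieces no larger than a given limit. Both names are hypothetical, the sketch uses plain write() on any descriptor rather than a TCP socket, and how the actual limit is derived from the parsed values is not shown here.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Parse "min default max" and return the max value, or -1 on bad input. */
static long parse_tcp_mem_max(const char *line)
{
	long min, def, max;

	if (sscanf(line, "%ld %ld %ld", &min, &def, &max) != 3)
		return -1;
	return max;
}

/*
 * Write len bytes to fd, never passing more than max_chunk bytes to a
 * single write(). Returns the number of write() calls or -1 on error.
 * EINTR handling is left out to keep the sketch short.
 */
static int send_in_chunks(int fd, const char *buf, size_t len, size_t max_chunk)
{
	int chunks = 0;

	if (max_chunk == 0)
		return -1;

	while (len > 0) {
		size_t want = len < max_chunk ? len : max_chunk;
		ssize_t done = write(fd, buf, want);

		if (done <= 0)
			return -1;

		buf += done;
		len -= (size_t)done;
		chunks++;
	}
	return chunks;
}
```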
Pavel Emelyanov
28014d7eb4 net: Save and restore iptables in net namespace
By default, just use the iptables-save and iptables-restore commands.
The user may define the CR_IPTABLES variable; in this case "sh -c $CR_IPTABLES"
is called instead.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-04 02:51:33 +04:00
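Sketch for 28014d7eb4: the command selection described in the message, reduced to building an argument vector. The function name is hypothetical, and the sketch stops short of forking and exec'ing the command or wiring its input and output to an image file.

```c
#include <stdlib.h>

/*
 * Fill argv (at least 4 slots) for the command to run and return argc.
 * If CR_IPTABLES is set, run it through the shell; otherwise run the
 * default tool (e.g. "iptables-save" on dump) directly.
 */
static int iptables_argv_sketch(const char *default_tool, const char *argv[4])
{
	const char *custom = getenv("CR_IPTABLES");

	if (custom) {
		argv[0] = "sh";
		argv[1] = "-c";
		argv[2] = custom;
		argv[3] = NULL;
		return 3;
	}

	argv[0] = default_tool;
	argv[1] = NULL;
	return 1;
}
```

The resulting vector could be handed to execvp(argv[0], ...) in a child process.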
Ruslan Kuprieiev
dbced2f013 log: one default log filename
Let's use one default log filename. The user can set it in the request, if needed.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-02 20:16:38 +04:00
Pavel Emelyanov
b4c8c5ae32 security: Also save gid of user requesting for C/R
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 15:51:09 +04:00
Pavel Emelyanov
6bf63b3f01 security: Push full creds info into may_xxx checks
It's not enough to check only uids on dump and restore -- we need to
check e-ids and s-ids now (and caps in the future).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 15:48:44 +04:00
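Sketch for 6bf63b3f01: what a check over the full set of ids, rather than the real uid alone, could look like. The structure, the function name and the policy itself (root may do anything; otherwise every real, effective and saved id of the target must match the requester's real id) are assumptions made for illustration and are not taken from the patch.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical container for the ids of one task or requester. */
struct creds_sketch {
	uid_t uid, euid, suid;
	gid_t gid, egid, sgid;
};

/*
 * Assumed policy: root is always allowed; anyone else only when all
 * three uids and all three gids of the target equal the requester's
 * real uid and gid.
 */
static bool may_handle_sketch(const struct creds_sketch *req,
			      const struct creds_sketch *tgt)
{
	if (req->uid == 0)
		return true;

	return tgt->uid == req->uid && tgt->euid == req->uid &&
	       tgt->suid == req->uid &&
	       tgt->gid == req->gid && tgt->egid == req->gid &&
	       tgt->sgid == req->gid;
}
```

Checking only the real uid would let through a set-uid program whose effective uid is 0; comparing the effective and saved ids as well closes that gap.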
Ruslan Kuprieiev
547d9bf959 v2 security: set suid flag on crtools and check real uid on dump/restore
v2: remove redundant functions and variables.

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-02 17:11:17 +04:00
Ruslan Kuprieiev
4d80f502e8 v2 rpc: add log_file field to opts, add defaults to log.h and use them where needed
[xemul: Simplified !log_file case and renumbered .proto fields]

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-02 16:01:44 +04:00
Pavel Emelyanov
2fe5884df3 service: Remove empty cr_service_client
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 06:18:44 +04:00
Pavel Emelyanov
91389f8782 security: Introduce (rather basic) security restrictions for C/R
Right now we have the ability to launch the C/R service as root
and execute dump requests from unprivileged users. To avoid being the bad
guys, we deny dumping a user's tasks that cannot be
"watched" (traced, read via /proc, etc.) by the dumper.

In the future we will use this "engine" when launched with suid
bit, and (probably) will have more sophisticated policy.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 06:16:17 +04:00
Pavel Emelyanov
cfe72ab77a service: Put service sk inode into separate variable
I'm about to get rid of service state struct.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 06:06:53 +04:00
Pavel Emelyanov
0acc2624d4 service: Remove sk fd from service state struct
This fd is an internal thing of the service. Remove it from
externally available structure.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 05:54:50 +04:00
Pavel Emelyanov
0521367f22 service: Remove actually unused pid variable from service state
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 05:52:18 +04:00
Pavel Emelyanov
0327d5511b fdset: Beautify fdset opening
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-28 05:39:52 +04:00
Andrey Vagin
07930a8df4 ns: replace pid with id in per-namespace files
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:17:04 +04:00
Andrey Vagin
51fca3806c namespaces: remove unused code
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:11:53 +04:00
Andrey Vagin
f995673d99 ipcns: don't use global fdset for dumping namespace
We are going to replace pid with id in the names of image files. The id is
unique for each namespace, so it's more convenient if image files are
opened per namespace.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:11:20 +04:00
Andrey Vagin
b895c73c82 mntns: don't use global fdset for dumping namespace
We are going to replace pid with id in the names of image files. The id is
unique for each namespace, so it's more convenient if image files are
opened per namespace.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:11:09 +04:00
Andrey Vagin
e63f8c20e9 uts: don't use global fdset for dumping namespace
We are going to replace pid with id in the names of image files. The id is
unique for each namespace, so it's more convenient if image files are
opened per namespace.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:10:57 +04:00
Andrey Vagin
faf7b94868 netns: don't use global fdset for dumping namespace
We are going to replace pid with id in the names of image files. The id is
unique for each namespace, so it's more convenient if image files are
opened per namespace.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:10:45 +04:00
Andrey Vagin
b1b02fe676 images: split namespace constants per subsystems
Currently all values of the constants must be contiguous,
because cr_fdset_open is used for opening images for all namespaces.

The next patches will rework this code so that image files are opened
per namespace; then all these ugly assignments of one constant to another
will be removed.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:10:07 +04:00
Andrey Vagin
8a23c3106d images: export cr_fdset_open
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-10-01 12:09:56 +04:00
Pavel Emelyanov
f1edcb32f5 rst: Introduce fine-grained pgid-restore synchronization
We can restore task's pgid which is not equal to its pid,
only when the respective group leader is alive. To make
restore reliable we wait for all group leaders to restore
using separate restore stage.

It's better to optimize this -- each task has a pointer to
its group leader and waits for that one to become a leader.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-27 04:38:00 +04:00
Pavel Emelyanov
75b1d4a1e3 rst: Open sys.ns_last_pid before diving into restorer
We restore chroot before doing this, so if we need to
open it later, we may have no access to the /proc/... paths.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-25 13:59:17 +04:00
Pavel Emelyanov
360c1c13b2 tcp: Show tcp queues contents when requested
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-19 12:16:07 +04:00
Andrey Vagin
e1e1034786 restorer: rework unmapping old VMA-s (v3)
All process VMA-s are in the "premmapped area". All restorer stuff is in
the bootstrap "area", so we have two areas.

So we don't need to unmap extra VMA-s one by one. We can call munmap
three times for the region before the first area, for the hole between
areas and for the region after the second area.

The old scheme didn't work, because the list of VMA-s can change
after it has been collected. This can be due to memory allocations by libc or due to
stack growth.

v2: improve readability at the expense of beauty
v3: print return code of munmap in error messages
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:11 +04:00
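Sketch for e1e1034786: the range arithmetic behind "call munmap three times". Given two areas to keep and the size of the address space, it computes the region before the first area, the hole between the areas and the region after the second one. All names are hypothetical, and the sketch only computes the ranges; it does not call munmap(), since doing so from an ordinary program would tear down the program itself.

```c
#include <stddef.h>
#include <stdint.h>

struct range {
	uintptr_t start;
	size_t len;
};

/*
 * Compute up to three ranges covering everything in [0, task_size)
 * except the two non-overlapping areas that must stay mapped.
 * Returns the number of ranges written to out[].
 */
static int unmap_ranges_sketch(uintptr_t a_start, size_t a_len,
			       uintptr_t b_start, size_t b_len,
			       uintptr_t task_size, struct range out[3])
{
	uintptr_t a_end, b_end;
	int n = 0;

	/* Make "a" the lower area so the arithmetic below stays simple. */
	if (b_start < a_start) {
		uintptr_t ts = a_start;
		size_t tl = a_len;

		a_start = b_start;
		a_len = b_len;
		b_start = ts;
		b_len = tl;
	}

	a_end = a_start + a_len;
	b_end = b_start + b_len;

	if (a_start > 0)			/* before the first area */
		out[n++] = (struct range){ 0, a_start };
	if (b_start > a_end)			/* hole between the areas */
		out[n++] = (struct range){ a_end, b_start - a_end };
	if (task_size > b_end)			/* after the second area */
		out[n++] = (struct range){ b_end, task_size - b_end };

	return n;
}
```

Because the ranges are derived from the two kept areas rather than from a collected VMA list, mappings that appear later (for example from libc allocations) are covered as well.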
Andrey Vagin
c900bf74b0 parasite: unmap itself (v2)
This patch adds a new parasite command, which unmaps the parasite blob.
This command never returns and the criu process traps the target process
on the exit from the munmap syscall.

v2: rename the function for unmapping a parasite blob so that it does not clash
with criu's functions.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:10 +04:00
Andrey Vagin
89d8b20186 restorer: unmap itself (v2)
This patch adds a function for removing the restorer blob. This function
never returns and the process must be trapped on the exit from the
munmap syscall.

v2: * release parasite_ctl structure and use the new interface of
      parasite_prep_ctl

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:09 +04:00
Andrey Vagin
3e5ad587f4 parse_proc: move parse_threads from cr-dump.c
It will be used in cr-restore.c for stopping threads on the exit from
sigreturn.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:03 +04:00
Andrey Vagin
b65ba9a8d6 parasite: add a function for unmapping bootstrap blob
The munmap syscall must be executed from the process's own memory. The code can
be injected into memory and then removed. But we can avoid all these
actions if the code is in the blob and the process is trapped
on the exit from the munmap syscall.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:02 +04:00
Andrey Vagin
f9f690262c parasite: allow to wait more than one process on the exit from syscall
All processes must be started by PTRACE_SYSCALL. The function calls wait
in a loop, and if a process is on the exit from the required syscall, it
is stopped; otherwise it is resumed again with PTRACE_SYSCALL.

The function doesn't know which processes should be trapped, so
the caller must make sure that wait() does not catch any other process.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:23:00 +04:00
Andrey Vagin
833b1f950a parasite: stop task on exit from a specified syscall
This patch adds nothing new, just splits the existing function.

Currently a parasite is stopped on sigreturn in order to unmap the parasite blob.
The same scheme will be used for restorer blob and this function will be
used to stop on exit from the munmap syscall.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2013-09-23 15:22:59 +04:00