mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 21:07:43 +00:00

1696 Commits

Author SHA1 Message Date
Pavel Emelyanov
e6c88abd62 check: Add some basic checks
* /proc/<pid>/map_files
* sock diag
* ns_last_pid sysctl
* SO_PEEK_OFF sockoptions

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-02 14:15:28 +04:00
Pavel Emelyanov
c39e759048 check: Initial skeleton
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-02 14:15:09 +04:00
Cyrill Gorcunov
8a8ce9b342 show: Don't forget to open sockets queue descriptor before accessing it
Otherwise I'm getting errors like

CR_FD_SK_QUEUES
----------------
Error (./include/util.h:102): Can't read img file: Bad file descriptor
----------------

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-02 11:58:55 +04:00
Cyrill Gorcunov
827cabb480 show: Use pr_msg for showing contents on console
Due to code sharing, especially in the IPC area,
the unbinding is done via helper macros and
sysctl engine tuning (a new CTL_SHOW action
is added).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-02 11:12:59 +04:00
Cyrill Gorcunov
7aa8e4b6e2 log: log-engine slight redesign
The messages are filtered by their type

    LOG_MSG     - plain messages, they escape any (!) log level
                  filtration and go to stdout
    LOG_ERROR   - error messages
    LOG_WARN    - warning messages
    LOG_INFO    - informative messages
    LOG_DEBUG   - debug messages

By default the LOG_WARN log level is used, thus LOG_INFO
and LOG_DEBUG messages will not appear in output stream.
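
The filtering rule described above can be sketched roughly as follows. This is a minimal illustration, not the project's log.c: the helper names log_enabled(), log_set_level() and log_print() are hypothetical, and sending non-LOG_MSG output to stderr is an assumption (the message only states that LOG_MSG goes to stdout).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Message types in the order listed above; LOG_MSG bypasses filtering. */
enum log_level { LOG_MSG, LOG_ERROR, LOG_WARN, LOG_INFO, LOG_DEBUG };

/* Default level per the text: LOG_INFO and LOG_DEBUG are suppressed. */
static enum log_level current_level = LOG_WARN;

/* Hypothetical helper: returns 1 if a message of this type would be printed. */
int log_enabled(enum log_level type)
{
	return type == LOG_MSG || type <= current_level;
}

/* Hypothetical setter, e.g. for "show", which raises the level to LOG_INFO. */
void log_set_level(enum log_level level)
{
	assert(level >= LOG_ERROR && level <= LOG_DEBUG);
	current_level = level;
}

/* Prints the message if its type passes the filter; returns 1 if printed. */
int log_print(enum log_level type, const char *fmt, ...)
{
	va_list ap;

	if (!log_enabled(type))
		return 0;

	va_start(ap, fmt);
	/* Assumption: everything except LOG_MSG goes to stderr. */
	vfprintf(type == LOG_MSG ? stdout : stderr, fmt, ap);
	va_end(ap);
	return 1;
}
```

With the default level, log_print(LOG_INFO, ...) prints nothing; after log_set_level(LOG_INFO) it does, while LOG_DEBUG stays hidden.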

The pr_panic helper was replaced with pr_err, pr_warning
was shortened to pr_warn, and the old printk is now
pr_msg.

Because we share messages between the "show" and "dump" actions,
we need to tune the log level and set it to LOG_INFO before
the "show" action proceeds.

Also note that VMA and siginfo printing has now
become LOG_INFO messages; it was not correct
to print them regardless of the log level.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-02 01:05:43 +04:00
Cyrill Gorcunov
95f770b61c logfd: Distinguish functions which do setup logfd
We have a few helper functions which all set up
logfd, but in different program-flow contexts. So
rename them for grepability.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-03-02 01:05:43 +04:00
Pavel Emelyanov
ae9f1bfdc4 dump: Restart seize in case reparent occurred
This can happen while dumping a pid-namespace (we can't do it now), so
add this check now so that it is not forgotten in the future.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-01 19:31:21 +04:00
Pavel Emelyanov
9e0b308af0 dump/restore: Rework final-state switch
Remove CR_TASK_XXX states, use the TASK_XXX ones (for image). This is
required to unseize tasks properly in the next patches.

Plus, make sure that pstree_list and the seized set coincide (i.e.
handle error in collect_task).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-01 19:31:20 +04:00
Pavel Emelyanov
0afad031d5 dump: Check for process/threads tree not to change after seizing
When we've seized all the tasks and threads found in /proc, check that
the /proc contents are still the same. Do it one-by-one as we descend the tree.
This is OK, since tasks cannot create kids for anyone but themselves or
their parents (reparent will be handled later).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-01 19:31:20 +04:00
Pavel Emelyanov
199e8d8248 dump: Check for pids reuse at suspend
While we try to seize a task, it can die and its pid can be given to
somebody else. This can break pstree consistency. Check that the
parent is still valid after the task is seized.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-01 19:31:20 +04:00
Pavel Emelyanov
f8a18edd44 dump: Remove SHOULD_BE_DEAD task state
Move proc checks for Z-state into seize_task().

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-01 19:31:20 +04:00
Pavel Emelyanov
3ab9285f0f parasite: Check for unexpected signals delivery
Signals shouldn't come to parasite after we've blocked them.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-03-01 19:31:20 +04:00
Kinsbursky Stanislav
46535130c6 restore: UNIX sockets queue support
Based on xemul@ patches.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-29 17:42:30 +04:00
Kinsbursky Stanislav
e518c44c7c show: UNIX sockets queue support
Based on xemul@ patches.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-29 17:42:30 +04:00
Kinsbursky Stanislav
c19012326d dump: socket queues support
This patch was designed to be generic and thus usable for all kinds of
sockets. Not sure that this goal has been reached, but at least I tried.

Key ideas:
1) On-stack structure for collecting sockets queues and then passing them to
   parasite code.
2) Singly linked list is used for collecting structures, representing sockets
   of any kind (!) with queues.

Based on xemul@ patches.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-29 17:42:30 +04:00
Kinsbursky Stanislav
8ce9e94705 parasite: support sockets queues
This patch adds sockets queue dump functionality. Key ideas:
1) sockets info is passed as a plain array in parasite args.
2) the new socket option SO_PEEK_OFF is used with MSG_PEEK to read the
   queue's packets.
3) The buffer for packets is allocated for each socket separately, with the
   size of the socket's send buffer. For stream sockets this means that the queue
   will be dumped in chunks of this size.

Note: loop around sys_msgrcv() is required for DGRAM sockets - sys_msgrcv()
with MSG_PEEK will return only one packet.
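
The peek loop described above could look roughly like this. It is a minimal sketch using the libc setsockopt()/recv() wrappers rather than the parasite's raw syscalls; the function name peek_queue() is hypothetical, and for simplicity it fills one caller-supplied buffer instead of a per-socket buffer sized after the send buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SO_PEEK_OFF
#define SO_PEEK_OFF 42 /* assumption: Linux value, used only if headers lack it */
#endif

/*
 * Copies up to `len` queued bytes from `sk` into `buf` without consuming them.
 * Returns the number of bytes copied, or -1 if SO_PEEK_OFF cannot be enabled.
 */
ssize_t peek_queue(int sk, char *buf, size_t len)
{
	int off = 0;
	size_t total = 0;

	assert(buf != NULL);

	/* Enable the peek offset so each MSG_PEEK continues where the last stopped. */
	if (setsockopt(sk, SOL_SOCKET, SO_PEEK_OFF, &off, sizeof(off)) < 0)
		return -1;

	/* For datagram sockets every recv() returns one packet, hence the loop. */
	while (total < len) {
		ssize_t n = recv(sk, buf + total, len - total,
				 MSG_PEEK | MSG_DONTWAIT);
		if (n <= 0)
			break; /* queue drained (EAGAIN), EOF, or error */
		total += (size_t)n;
	}

	/* Disable the peek offset again so later peeks start from the head. */
	off = -1;
	setsockopt(sk, SOL_SOCKET, SO_PEEK_OFF, &off, sizeof(off));
	return (ssize_t)total;
}
```

Because the data is only peeked, a later plain recv() on the same socket still returns the first queued packet.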

Based on xemul@ patches.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-29 17:42:30 +04:00
Kinsbursky Stanislav
fe559c7e48 parasite: remove unused headers
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-28 21:13:26 +04:00
Cyrill Gorcunov
2acc741a3a files: Use sys_kcmp to find file descriptor duplicates v4
We replace the generic-object-id concept with the sys_kcmp approach,
which changes the image format a bit (and since the project is
still at an early stage, we're allowed to).

In short -- previously every file descriptor had an ID
generated by the kernel and exported via procfs. If two
file descriptors were the same object in
kernel memory, their IDs matched bit for bit. This allowed
us to figure out which files were actually identical
and should be restored in a special way.

Once the sys_kcmp system call was merged into the kernel,
we got a new opportunity -- to use this syscall instead.
The syscall basically compares kernel objects and returns
ordered results suitable for sorting objects in userspace.

For us it means -- we treat every file descriptor as a combination
of 'genid' and 'subid'. While 'genid' serves for fast comparison
between fds, the 'subid' is a kind of second key, which guarantees
uniqueness of the genid+subid tuple over all file descriptors found
in a process (or group of processes).

To be able to find and dump file descriptors in a single pass we
collect every fd into a global rbtree, where (!) each node might
become a root for a subtree as well.

The main tree carries only distinct genids. If we find a genid which
is already in the tree, we need to check whether it is indeed
a duplicate. For this we use the sys_kcmp syscall, and if we
find that the file descriptors are different -- we simply put the new
fd into a subtree.
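
The two-level lookup can be illustrated with a simplified sketch. Everything here is an assumption made for illustration: it uses an unbalanced binary tree instead of an rbtree, a linear chain instead of an ordered subtree, and a comparator callback standing in for sys_kcmp. The names fd_node, fd_collect() and cmp_fd_value() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for sys_kcmp(): returns 0 when two fds refer to the same object. */
typedef int (*same_obj_fn)(int fd_a, int fd_b);

struct fd_node {
	uint32_t genid;               /* fast, possibly colliding key */
	uint32_t subid;               /* disambiguates entries sharing a genid */
	int fd;                       /* representative descriptor */
	struct fd_node *left, *right; /* main tree, ordered by genid */
	struct fd_node *sub;          /* other objects with the same genid */
};

/* Placeholder comparator: treats equal fd numbers as the same object. */
int cmp_fd_value(int fd_a, int fd_b)
{
	return fd_a != fd_b;
}

/*
 * Looks up (genid, fd) and inserts it if unseen. Returns the matching node
 * and sets *is_new to 1 on insertion. Nodes are never freed in this sketch.
 */
struct fd_node *fd_collect(struct fd_node **root, uint32_t genid, int fd,
			   same_obj_fn cmp, int *is_new)
{
	struct fd_node **p = root;
	struct fd_node *n;
	uint32_t subid = 1;

	/* Walk the main tree by genid only. */
	while (*p && (*p)->genid != genid)
		p = genid < (*p)->genid ? &(*p)->left : &(*p)->right;

	/* Same genid found: ask the comparator whether it is a real duplicate. */
	for (; *p; p = &(*p)->sub, subid++) {
		if (cmp((*p)->fd, fd) == 0) {
			*is_new = 0;
			return *p;
		}
	}

	n = calloc(1, sizeof(*n));
	assert(n != NULL);
	n->genid = genid;
	n->subid = subid;
	n->fd = fd;
	*p = n;
	*is_new = 1;
	return n;
}
```

Two different objects that collide on genid end up with subid 1 and 2, so the genid+subid pair stays unique, while a true duplicate returns the node that is already stored.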

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:47 +04:00
Cyrill Gorcunov
8b187454b4 files: Rename prepare_fdinfo_global to prepare_shared_fdinfo
This function simply allocates shared memory. Name it so
and move it closer to the variables it refers to.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:47 +04:00
Cyrill Gorcunov
69fefa4d65 files: Move structures and enums into the header
Having them in the header file will allow sharing these
structures with other callers. Moreover, it is good
practice to keep definition(s) in a header file unless
something else is really needed.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:47 +04:00
Cyrill Gorcunov
1bae745ddb Add rbtree helpers
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:47 +04:00
Cyrill Gorcunov
5e3c0dc742 util: Add missing \n into BUG_ON_HANDLER
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:47 +04:00
Cyrill Gorcunov
7b4c352be3 compiler: Add likely/unlikely hints
Will be needed for file-ids handling.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:46 +04:00
Cyrill Gorcunov
cff00de82d syscalls: Introduce sys_kcmp syscall
Though we can use libc's syscall() wrapper,
I would prefer to have our own implementation
in case we need it in code that cannot be
linked against libc.
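
For comparison, the libc route mentioned above (not the standalone implementation this commit adds) might look like the sketch below. The helper fds_share_file() is hypothetical; the KCMP_FILE fallback value is taken from <linux/kcmp.h>, and the sketch assumes the system headers define SYS_kcmp.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef KCMP_FILE
#define KCMP_FILE 0 /* value from <linux/kcmp.h> */
#endif

/*
 * Thin wrapper over the raw kcmp system call via libc's syscall().
 * Returns 0 if equal, 1 or 2 for ordering, 3 if unordered, -1 on error.
 */
long sys_kcmp(pid_t pid1, pid_t pid2, int type,
	      unsigned long idx1, unsigned long idx2)
{
	return syscall(SYS_kcmp, pid1, pid2, type, idx1, idx2);
}

/* Returns 1 if both fds share one open file, 0 if not, -1 on error. */
int fds_share_file(int fd_a, int fd_b)
{
	pid_t self = getpid();
	long ret;

	assert(fd_a >= 0 && fd_b >= 0);

	ret = sys_kcmp(self, self, KCMP_FILE,
		       (unsigned long)fd_a, (unsigned long)fd_b);
	if (ret < 0)
		return -1; /* e.g. ENOSYS or EPERM where kcmp is unavailable */
	return ret == 0;
}
```

A descriptor and its dup() share one open file, so fds_share_file() reports 1 for them wherever kcmp is permitted; callers should still handle -1, since some kernels and sandboxes do not allow the call.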

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-28 19:13:46 +04:00
Kinsbursky Stanislav
8cd120436e parasite: replace stack heap with mapped one
The stack heap is probably too small for dumping unix sockets' skbs (it's hard to
discover how big it needs to be; thus let's assume that 10MB is enough,
otherwise give up and throw an error).
So let's replace this stack heap with a 10 MB anonymous private mapping.
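
A heap backed by an anonymous private mapping can be sketched as follows. This is an illustration only: it calls libc mmap() rather than the parasite's own syscall wrappers, and the map_heap structure and heap_*() helpers are hypothetical names for a simple bump allocator.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define HEAP_SIZE (10UL << 20) /* 10 MB, the size assumed in the text */

/* Hypothetical bump allocator over one anonymous private mapping. */
struct map_heap {
	char *base;
	size_t used;
	size_t size;
};

/* Returns 0 on success, -1 if the mapping cannot be created. */
int heap_init(struct map_heap *h)
{
	void *p = mmap(NULL, HEAP_SIZE, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	h->base = p;
	h->used = 0;
	h->size = HEAP_SIZE;
	return 0;
}

/* Returns NULL once the heap is exhausted (the "give up" case). */
void *heap_alloc(struct map_heap *h, size_t len)
{
	void *p;

	assert(h->base != NULL);
	if (len > h->size - h->used)
		return NULL;
	len = (len + 7) & ~(size_t)7; /* keep allocations 8-byte aligned */
	if (len > h->size - h->used)
		return NULL;
	p = h->base + h->used;
	h->used += len;
	return p;
}

/* Releases the whole mapping at once. */
void heap_fini(struct map_heap *h)
{
	munmap(h->base, h->size);
	h->base = NULL;
}
```

Allocation never grows the mapping, so a request that does not fit fails immediately, which matches the "otherwise give up" behaviour described in the message.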

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-28 15:33:49 +04:00
Andrey Vagin
62ba357e4d dump: use prctl to dump clear_tid_address
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-22 15:06:51 +04:00
Cyrill Gorcunov
ef97467da9 log: Add log-levels
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-20 14:23:28 +04:00
Cyrill Gorcunov
fe99f501ef Move pr_ helpers to log.[ch]
This is the place where they belong.
util.c is too big already.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-20 14:23:28 +04:00
Pavel Emelyanov
097bc0b967 dump: Collect mem+regs+sigmask atomically
The ptrace seize doesn't prevent signals from delivery. That said,
we should block the signals in the target task before dumping
anything which is signals-related, i.e. memory and registers.

But once we've blocked signals, we should dump registers before
unblocking them, since any postponed signal will screw things up.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-20 12:33:43 +04:00
Kir Kolyshkin
447388d79b open_proc() and friends: hide pid_dir
This patch tries to introduce lazy and hidden pid_dir support,
meaning one doesn't have to worry about pid_dir but the optimization
is still there.

The patch relies on the fact that we work with many /proc/pid files for
one pid, then for another pid and so on, i.e. not in a random manner.

The idea is that when we call open_proc() with a new pid for the first time,
the appropriate /proc/PID directory is opened and its fd is stored.
The next call to open_proc() with the same PID only needs to check that
the PID has not changed. If the PID has changed, we close the old fd
and open/store a new one.

Now the code using open_proc() and friends:
- does not need to carry proc_pid around, pid is enough
- does not need to call open_pid_proc()

The only thing that can't be done in that "lazy" mode is closing the last
PID fd, thus close_pid_proc().
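
The lazy caching scheme can be sketched like this. It is a simplified illustration written as plain functions (elsewhere in this history open_proc() and friends are said to be macros); the helper get_pid_dir() and the static variable names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static pid_t cached_pid = -1; /* pid whose /proc/PID dir is currently open */
static int cached_dir = -1;   /* fd of that directory, or -1 */

/* Closes the cached /proc/PID fd; the one step the lazy scheme cannot do itself. */
void close_pid_proc(void)
{
	if (cached_dir >= 0)
		close(cached_dir);
	cached_dir = -1;
	cached_pid = -1;
}

/* Returns an fd for /proc/PID, reusing the cached one when the pid is unchanged. */
static int get_pid_dir(pid_t pid)
{
	char path[32];

	if (pid == cached_pid && cached_dir >= 0)
		return cached_dir;

	/* The pid changed: drop the old directory fd and open a new one. */
	close_pid_proc();
	snprintf(path, sizeof(path), "/proc/%d", (int)pid);
	cached_dir = open(path, O_RDONLY | O_DIRECTORY);
	if (cached_dir >= 0)
		cached_pid = pid;
	return cached_dir;
}

/* Opens /proc/PID/<name> read-only, relative to the cached directory fd. */
int open_proc(pid_t pid, const char *name)
{
	int dir;

	assert(name != NULL);
	dir = get_pid_dir(pid);
	return dir < 0 ? -1 : openat(dir, name, O_RDONLY);
}
```

Callers pass only the pid; consecutive calls for the same pid reuse one directory fd, and close_pid_proc() is called once at the end to release the last one.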

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-17 16:46:25 +04:00
Kir Kolyshkin
5661d806cb Move error reporting to inside open_proc and friends
...and make it correctly print the file name we were unable to open.
Also, error from fdopen[dir]() is now reported with file name as well.

Note that open_proc() and friends need to be macros in order for
pr_perror() to show the actual file name and line number where the error occurred.
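
Why a macro is needed can be shown with a small sketch: __FILE__ and __LINE__ expand at the call site only if the whole wrapper is a macro. The macro open_file_verbose() is a hypothetical stand-in for open_proc(), the pr_perror() shown here is a simplified assumption rather than the project's definition, and both rely on GNU C extensions (statement expressions and ##__VA_ARGS__).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Reports the caller's file and line because the macro expands at the call site. */
#define pr_perror(fmt, ...)						\
	fprintf(stderr, "Error (%s:%d): " fmt ": %s\n",			\
		__FILE__, __LINE__, ##__VA_ARGS__, strerror(errno))

/* Hypothetical open wrapper; evaluates to the fd, or -1 after reporting. */
#define open_file_verbose(path, flags)					\
	({								\
		int fd_ = open(path, flags);				\
		if (fd_ < 0)						\
			pr_perror("Can't open %s", path);		\
		fd_;							\
	})
```

Had open_file_verbose() been a function, every error would report the line inside that function instead of the line of the failing caller.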

Historical note: the original version of this patch was way more radical,
changing openat() to open() and thus removing pid_dir (replacing it with pid
when needed) and open_proc_dir(). The word
from Pavel is he wants to keep the openat/pid_dir optimization because
it saves two dentry lookups in kernel code for each open(). Because of
this optimization (and desire to print correct file name in case
of error) we have to carry both pid and pid_dir everywhere.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-17 16:46:25 +04:00
Kir Kolyshkin
03294077af util.c: introduce open_proc_rw()
To be used by the next patch

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-17 16:46:25 +04:00
Cyrill Gorcunov
cf8b39d4aa util: Drop jerr macros
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-16 22:05:40 +04:00
Cyrill Gorcunov
dcb1cbfb82 Rework parasite code
 - make control block to keep all information
   needed to run injected syscall and parasite
   blobs

 - add ptrace_swap_area helper

 - handle both parasite engine calls and injected
   syscalls by single __parasite_execute function

 - drop jerr() usage

 - bring back handling of inflight signals from
   original program inside parasite code

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-16 22:05:28 +04:00
Kinsbursky Stanislav
4101487f87 IPC: show semaphores set
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-15 13:33:46 +04:00
Kinsbursky Stanislav
4141296ed7 IPC: dump semaphores set
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-15 13:33:46 +04:00
Kinsbursky Stanislav
574302bf8d restore: support SYSV IPC vma
This patch introduces the following changes:
1) writing the shmid value into vma_area->fd instead of
   waiting for the shared memory region to be opened by the parent,
   reopening it and dumping the fd.
2) new syscall support: sys_shmat
3) use sys_shmat() to map memory region in restorer's
   mapping function if vma flag VMA_AREA_SYSVIPC is set.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-15 13:30:34 +04:00
Kinsbursky Stanislav
b3cfe73556 dump: support SYSV IPC vma
This patch introduces the following changes:

1) introduces new flag VMA_AREA_SYSVIPC to mark corresponding vma entries.
2) enhance task /proc/<pid>/maps parsing to obtain the first 5 letters of the mapped
   file name. If the major number of the device the file belongs to is 0 (tmpfs) and its name
   starts with "/SYSV", then this mapping is considered SYSV IPC and the
   corresponding vma entry status is updated with the VMA_AREA_SYSVIPC flag.
3) omit dumping of mapping pages for SYSV IPC vmas.
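
The detection rule in item 2 can be sketched as a small parser for one maps line. This is an illustration, not the project's parsing code: the function name maps_line_is_sysvipc() is hypothetical, and it assumes the usual "start-end perms offset major:minor inode path" layout of /proc/<pid>/maps.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if a /proc/<pid>/maps line looks like a SYSV IPC mapping:
 * device major 0 and a path starting with "/SYSV". Returns 0 otherwise.
 */
int maps_line_is_sysvipc(const char *line)
{
	unsigned long start, end, pgoff, ino;
	unsigned int dev_maj, dev_min;
	char perms[5];
	char name[6] = ""; /* first 5 characters of the mapped file name */
	int n;

	assert(line != NULL);

	n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %5s",
		   &start, &end, perms, &pgoff, &dev_maj, &dev_min, &ino, name);
	if (n < 7)
		return 0; /* malformed line */

	/* Anonymous mappings have no name, so the prefix check fails for them. */
	return dev_maj == 0 && strncmp(name, "/SYSV", 5) == 0;
}
```

A made-up line such as "7f2c4a000000-7f2c4a001000 rw-s 00000000 00:04 32769 /SYSV00000000 (deleted)" matches, while a regular file mapping on device 08:01 does not.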

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-15 13:30:34 +04:00
Kir Kolyshkin
7c961a7b8f include/types.h cleanup: remove *_FILENO
These defines are already provided by unistd.h, and the only user
is log.c which already includes unistd.h.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-14 21:16:00 +04:00
Kinsbursky Stanislav
24c4381644 IPC: show message queue dump content
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-14 20:21:30 +04:00
Kinsbursky Stanislav
fa2ff60680 IPC: dump message queue
v2: New "MSG_STEAL" functionality is used

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-14 20:21:30 +04:00
Cyrill Gorcunov
86392b7f2f compiler: Add compiler noinline attribute and barrier helper
They are not needed at moment but will be needed at
parasite/restorer code rework time, so add them now
just to not forget.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-14 17:18:09 +04:00
Cyrill Gorcunov
5988d401b7 Add processor-flags.h
We will need it for parasite.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-14 17:18:08 +04:00
Cyrill Gorcunov
ae817148e0 parasite: Some code style tuning in header
Easier to read.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
2012-02-14 17:18:08 +04:00
Kinsbursky Stanislav
f86d167bf1 ipc: rename struct ipc_seg
This name for the structure is obfuscating, because the structure
will be used also for queues and semaphores sets migration.

This patch renames this structure to ipc_desc_entry. It also renames
all related functions and prints to reflect the structure name change.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-13 21:04:23 +04:00
Pavel Emelyanov
e6823c274e parasite: Enlighten parasite cmd/arg injection
Since we now have the parasite memory shared with crtools process we
can just memcpy this data between them.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-13 01:06:07 +04:00
Pavel Emelyanov
354ab03a67 parasite: Enlighten parasite blob injection
I don't like that we poke the parasite into remote space with 4k calls
to ptrace. Now we have the /proc/pid/map_files/ dir, which lets us share
a mapping with some other process.

Use this -- map the remote area for parasite locally and put the parasite
blob into it with simple memcpy.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-13 01:06:06 +04:00
Pavel Emelyanov
2b1a58b3b4 parasite: Remove unneeded on-stack variables
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-13 01:06:06 +04:00
Pavel Emelyanov
cf662edc56 parasite: Make mmap/munmap_seized static
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-12 11:45:30 +04:00
Pavel Emelyanov
cdbd2563bf parasite: Simplify syscall_seized
There's no need for 3 instances of regs in the arguments. One is more than enough.
Plus, this one can be made static.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
2012-02-12 11:45:29 +04:00