Actually rt_sigset_t and k_rtsigset_t are the same
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise ppage_bitmap and page_bitmap will be updated for the wrong VMA.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This makes it easier to merge with anon vmas dumping and to make use of
the page server for shared memory.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Need to have proper fake-item state to make this code work ok:
get_task_ids
	if (item->state != TASK_DEAD) {
		ret = dump_task_kobj_ids(item);
		if (ret)
			goto err_free;
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In Fedora, /var/run is on tmpfs, so all directories should be
recreated each time.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently shmem generates page images in parallel
with page server and IDs may intersect. Fix this by
making page server create larger IDs.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The page server is a process that receives pages over the network
and puts them into pagemap- + pages- images. Right now it simply
gets the data and puts it into the image files. When we have dirty
set tracking in the kernel, the page server will have to collect
"page changes" and properly integrate them into images.
Running crtools with the page server looks like this:
dst_node# crtools page-server --port <port> -D dump/ ...
src_node# crtools dump -t <pid> --page-server --address <dst_node> --port <port> -D dump/ ...
After this images from dst_node/dump/ and src_node/dump/ should
be put into one place and tasks can be restored out of it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We'll send them over network soon, so prepare abstraction layer for
this. Shmem is not on this scheme yet.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Since now we drain pages out of parasite, we can invent any format for
page dumps. Let it be ... a protobuf one! :)
Another thing to keep in mind is that we're about to use splices and
implement iterative migration, so it's better to have actual pages be
page-aligned in the image.
And -- backward compatibility. That said the new format is:
1. pagemap-... file which contains a header (currently with an ID of
the image with pages, see below) and an array of <nr_pages:vaddr>
pairs. The first value means "how many pages to take from the
file with pages (see below)" and the second -- where in the task
address space to put them. Simple.
2. pages-... file which contains only pages one by one (thus aligned
as we want).
This patch breaks backward compatibility (old images with pages will
be restored and then crash). Need to do it before v0.5 release.
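The layout described above implies that the position of any run of pages inside the pages-... file can be derived from the preceding pairs alone. A minimal sketch of that lookup, assuming 4 KiB pages; the names `pagemap_entry`, `pagemap_pages_offset` and `SKETCH_PAGE_SIZE` are hypothetical, and the real entries are protobuf-encoded rather than raw C structs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical in-memory form of one <nr_pages:vaddr> pair. */
struct pagemap_entry {
	uint64_t vaddr;		/* where in the task address space the run goes */
	uint32_t nr_pages;	/* how many pages to take from the pages- file */
};

/*
 * Return the byte offset inside the pages- file at which the data for
 * entry 'idx' starts. Pages are stored back to back, so the offset is
 * the total size of all preceding runs and is always page-aligned.
 */
uint64_t pagemap_pages_offset(const struct pagemap_entry *map,
			      size_t nr_entries, size_t idx)
{
	uint64_t off = 0;
	size_t i;

	assert(idx < nr_entries);
	for (i = 0; i < idx; i++)
		off += (uint64_t)map[i].nr_pages * SKETCH_PAGE_SIZE;
	return off;
}
```

Because every offset is a multiple of the page size, the data can be moved with splice-style calls without realignment.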
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Generate random data only for buffers with sizes less than FAST_SIZE.
If the size of a buffer is more than FAST_SIZE, the first FAST_SIZE bytes
are filled by the random generator and then this chunk is used as a pattern
for all other chunks.
Without this patch:
$ time bash -x test/zdtm.sh static/maps04
real 0m16.777s
user 0m0.054s
sys 0m0.724s
With this patch:
$ time bash -x test/zdtm.sh static/maps04
real 0m1.865s
user 0m0.128s
sys 0m0.745s
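The idea can be sketched roughly as follows. This is an illustration only: the function name `datagen_fast`, the FAST_SIZE value and the use of rand() are assumptions, not the actual zdtm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FAST_SIZE 4096	/* assumption: the real threshold may differ */

/*
 * Fill 'buf' with pseudo-random bytes. Only the first FAST_SIZE bytes
 * come from the generator; the rest is filled by repeating that chunk.
 */
void datagen_fast(unsigned char *buf, size_t size, unsigned int seed)
{
	size_t head = size < FAST_SIZE ? size : FAST_SIZE;
	size_t off, len;

	assert(buf || !size);

	/* Slow part: one generator call per byte, bounded by FAST_SIZE. */
	srand(seed);
	for (off = 0; off < head; off++)
		buf[off] = (unsigned char)(rand() & 0xff);

	/* Fast part: copy the generated chunk over the remaining space. */
	for (off = head; off < size; off += len) {
		len = size - off < head ? size - off : head;
		memcpy(buf + off, buf, len);
	}
}
```

The cost of the generator is then bounded by FAST_SIZE regardless of the buffer size, which is where the speedup on large mappings would come from.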
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A signal can be handled by a non-leader thread, and then sigsuspend
will not be woken up.
kill can't send signals to a specified thread, so a futex is used for
synchronization.
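A minimal sketch of futex-based waiting and waking, shown here between a parent and a forked child over shared memory rather than between threads. All helper names (`sys_futex`, `futex_wait_until`, `futex_set_and_wake`, `futex_demo`) are hypothetical and this is not the test's actual code; it assumes Linux with the SYS_futex syscall available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

static long sys_futex(uint32_t *addr, int op, uint32_t val)
{
	return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Sleep until *addr equals 'val'; spurious wakeups just re-check. */
void futex_wait_until(uint32_t *addr, uint32_t val)
{
	uint32_t cur;

	assert(addr);
	while ((cur = __atomic_load_n(addr, __ATOMIC_ACQUIRE)) != val)
		sys_futex(addr, FUTEX_WAIT, cur);
}

/* Publish 'val' and wake everybody sleeping on the futex. */
void futex_set_and_wake(uint32_t *addr, uint32_t val)
{
	assert(addr);
	__atomic_store_n(addr, val, __ATOMIC_RELEASE);
	sys_futex(addr, FUTEX_WAKE, INT_MAX);
}

/* Returns 0 if the child saw the parent's notification and exited. */
int futex_demo(void)
{
	uint32_t *f = mmap(NULL, sizeof(*f), PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	int status;
	pid_t pid;

	if (f == MAP_FAILED)
		return -1;
	*f = 0;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		futex_wait_until(f, 1);
		_exit(0);
	}

	futex_set_and_wake(f, 1);
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	munmap(f, sizeof(*f));
	return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

Unlike a signal, the futex word keeps its value, so a waiter that arrives late still sees the notification.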
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This allows reusing magic numbers outside of the crtools code.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The ARM parasite head uses a pad of 228 bytes
to make the offset of the symbol __export_parasite_stack
representable in the ARM instruction set. This value
needs to be changed every time the value of the macro
PARASITE_STACK_SIZE changes.
This patch makes this manual intervention unnecessary.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently we dump pages directly from parasite into image files. This
is bad for several reasons:
1. We cannot use any more-or-less custom format for pages easily, since
parasite code cannot be linked with any libraries;
2. We will not be able to optimize migration with preliminary memory
migration (a.k.a. iterative migration) with it -- if we send pages
from parasite over network we are not able to let the task we dump
continue running.
That said, what is done is -- pages from target task are put into a
page-pipe in one go, then (not in this patch) parasite can be released
and we can do with pages whatever we want. For now pages are just
spliced from pipe into image file.
Some numbers:
In order to drain 1Gb of memory from task we need 1.5M of shared map
in args (for iovecs) and 4 pipes (8 descriptors) each referencing 128Mb
of pages, which in turn requires 4 x 640K chunks of sequential kernel
memory (for pipe_buffer). Not that big I guess.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The page-pipe is an object that can accumulate pages inside it. It
consists of a list of page-pipe-bufs, each of which has a pipe, an
array of iovecs that describe the pages' locations and some stats.
Users of it are supposed to vmsplice pages into pipes to accumulate
them for later use, and vmsplice them from pipes when required.
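A minimal sketch of the accumulate-then-drain pattern for a single page, assuming Linux and a default-sized pipe. The function name `page_pipe_roundtrip` is hypothetical, and the drain side uses a plain read() only so the result can be checked; this is not the page-pipe implementation itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Put one page-aligned page into a pipe with vmsplice() and take it
 * back out. Returns 0 if the data read back matches the page.
 */
int page_pipe_roundtrip(unsigned char fill)
{
	long psize = sysconf(_SC_PAGESIZE);
	struct iovec iov;
	void *page = NULL;
	unsigned char *out = NULL;
	int p[2], ret = -1;

	assert(psize > 0);
	if (posix_memalign(&page, (size_t)psize, (size_t)psize))
		return -1;
	out = malloc((size_t)psize);
	if (!out || pipe(p) < 0)
		goto out_free;

	memset(page, fill, (size_t)psize);
	iov.iov_base = page;	/* the iovec describes the page's location */
	iov.iov_len = (size_t)psize;

	/* Accumulate: the page is now held by the pipe. */
	if (vmsplice(p[1], &iov, 1, 0) != psize)
		goto out_close;
	/* Drain: read() here; a dumper could splice() into a file instead. */
	if (read(p[0], out, (size_t)psize) != psize)
		goto out_close;

	ret = memcmp(page, out, (size_t)psize) ? -1 : 0;
out_close:
	close(p[0]);
	close(p[1]);
out_free:
	free(out);
	free(page);
	return ret;
}
```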
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
I will have to push some sort of map of pages to dump into parasite.
For this, I need to have an estimate of how much memory I'd need for
that in parasite args. These two values will help with it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now when we collect list of vmas we need to know the
number of elements in it. In the future I will need to know
more, so it makes sense to create a vmas-list object for it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Just make use of previous patch. The creds dumping args are tuned to
fit one page (minimal static args size).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Sometimes we don't know the exact amount of data we would want
to send to parasite via args area (e.g. -- while draining fds).
Fix this, by moving the args area behind the parasite blob and
mmap-ing it with the run-time calculated size.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
PIE code can't use glibc helpers, so instead of passing the
CR_NOGLIBC macro in every source file that pie code uses, just
pass it in pie/Makefile.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We will need futexes in PIE code, but futex.h uses the
BUG_ON helper, so to slim down inclusions move the BUG_ON
code to include/bug.h.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CFLAGS can be overridden, so we need our own flags here.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
PIE code uses its own handmade stack, so we need the
-fno-stack-protector option to eliminate a compilation warning
if -fstack-protector is passed in command line CFLAGS.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's already defined in the general Makefile.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If CFLAGS is overridden from the command line, we don't
see our headers anymore. So provide mandatory options in
the ccflags-y variable to fix that.
https://bugzilla.openvz.org/show_bug.cgi?id=2521
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
| cr-check.c: In function ‘check_unaligned_vmsplice’:
| cr-check.c:372:2: error: ignoring return value of ‘pipe’, declared with attribute warn_unused_result [-Werror=unused-result]
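An error of this kind is usually resolved by checking the return value instead of discarding it. A minimal sketch, with a hypothetical helper name (`open_pipe_checked`); this is not the actual change made in cr-check.c:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Create a pipe and report failure instead of silently ignoring it. */
int open_pipe_checked(int p[2])
{
	assert(p);
	if (pipe(p) < 0) {
		perror("Can't create pipe");
		return -1;
	}
	return 0;
}
```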
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This factors out common core members freeing into a pstree.c
helper. Per-arch freeing helpers are now symmetrical to the
allocating ones.
This is a merge of two of Cyrill's patches.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
- For NT_X86_XSTATE we need a system elf.h
- Drop duplicated parasite-syscall.h
- Organize headers in the following order:
  - system headers
  - asm headers
  - regular headers
  - protobuf stuff
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This is the first release that actually doesn't require a custom
kernel in order to make all the tool features work. Just take
the v3.8 (with proper config) and that's it :)
Another coolness about this release is the ARM port. In this case,
however, one does require a custom kernel, since the kcmp system
call is not wired into the ARM table in the upstream kernel :(
What else? Quite a lot, actually:
* C/R ability of a LOT of new stuff
* Remote syscall execution
* Deprecation of --namespace option
* Build system rework
* Ability to collect gcov info
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>