Currently crtools supports the case where a child shares an fd table
with its parent.
There are only two interesting things here.
* Service descriptors should be cloned for each process
that shares the fd table.
* One task should restore the files while the other tasks sleep.
v2: * allocate fdt_lock from shared memory
* don't wait for a child if it doesn't share the fd table
v3: * don't move ids on the pstree image
v4: * save ids in a separate image
* save fdinfo per id instead of pid
v5: fix alignment of service_fd_id
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Several processes can share one fd table. Each process has its own set of
service file descriptors and knows nothing about the service fds
of other processes. So if two processes share one fd table,
close_old_fds will close the service descriptors of the other process.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The fdt shared data contains the PID of the process that will restore the
file descriptors, and a futex for synchronization.
The process with the minimal pid restores the file descriptors.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for determining which resources are shared.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It is read together with the pstree items to check what kind of
resources should be shared. The core image is too big to be read at
this point.
v2: fix check_core
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A service fd must be created first, otherwise get_service_fd returns -1.
This patch removes this functionality from other subsystems and
allows service descriptors to be cloned.
v2: rename open_service_fd to install_service_fd
v3: two patches were merged for bisecting
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It looks like a namespace for service descriptors.
It will be used for restoring tasks with shared fd tables.
Each process should have its own service descriptors.
v2: clone_service_fd doesn't know about sub-systems like log, proc, etc
v3: Don't try to find a free name-space.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Because /proc cannot be unmounted while any of its files is open.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It contains info about the pipe itself, not just one of its
ends. Thus if we want to add more (e.g. its size) we'll
have to put it there and thus have it always present.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently fdinfo is dumped for each task, so CR_FD_FDINFO is in cr_fdset.
Several tasks can share one fd table; in that case the set of descriptors
will be dumped once and the image name will contain files_id instead of pid.
CR_FD_FDINFO will then go away from cr_fdset.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
crtools should not fail if the new images are absent.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
zdtm.sh -b <commit>
* save the current head, roll back to <commit> and compile crtools
* execute a test and dump its processes
* check out the current head and compile crtools
* restore the test processes and check the results
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* Replaced the shell interpreter with bash to run
the script test/zdtm.sh correctly.
* Added the directories searched by the Debian version of the dynamic
linker to the routine construct_root().
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dump the limits with the "new" prlimit syscall, which works on an arbitrary pid.
Restore is done in the restorer _after_ the mappings mixup and _before_
the caps drop so that any max value can be set.
The RLIM_INFINITY value is handled explicitly to help future 64<->32
bit migration.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On my FC17 box calloc calls brk() and the subsequent mprotect(PROT_EXEC)
fails with EACCES. Using mmap is safer here.
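
For illustration, an allocation that avoids the heap could look like
this. The helper name alloc_exec_area() is hypothetical and the exact
protection flags used by crtools are an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Get zeroed memory from an anonymous private mapping rather than from
 * calloc(), then make it executable. Returns NULL on failure.
 */
static void *alloc_exec_area(size_t len)
{
	void *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* The area is not part of the brk() heap, unlike a calloc() result. */
	if (mprotect(p, len, PROT_READ | PROT_EXEC)) {
		munmap(p, len);
		return NULL;
	}

	return p;
}
```

The caller releases the area with munmap(), not free().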
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Too many arguments are already used to carry the data.
Just fill the required core fields inside.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The routine sigreturn_prep_xsave_frame() is renamed to sigreturn_prep_fpu_frame().
* Moved the routines sigreturn_prep_fpu_frame(), show_rt_xsave_frame(), and
valid_xsave_frame() to the file crtools.c.
* Introduced the structure fpu_state_t to pass the FPU state to the restorer
in a machine-independent way.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The current sys_mmap error analysis code doesn't work on 32-bit architectures
with a 3G/1G userspace/kernel virtual address space split, since the syscall
allocates anonymous memory above the first 2G of the address space. Such
an address is a negative integer, so it's interpreted as an error code.
The problem isn't encountered on x86-64 because it doesn't use negative
virtual addresses in userspace.
The 3G/1G split is used because memory allocation is currently broken for other
values of the split on ARM: the value of TASK_UNMAPPED_BASE (arch/arm/include/asm/memory.h)
isn't page-aligned if another split value is used, so the field
mm_struct::mmap_base is initialized with a page-unaligned value by
the function arch_pick_mmap_layout() (arch/arm/mm/mmap.c) in some circumstances,
which breaks page-alignment checks in the kernel memory management code.
This patch modifies the sys_mmap return value analysis code, replacing tests
for negativeness of the signed return value with tests that check that
the return value isn't greater than TASK_SIZE.
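
The difference between the two checks can be shown with a small model.
It uses 32-bit integers so it behaves the same on any host; the
constant TASK_SIZE_3G and both helper names are assumptions made for
this sketch, not the definitions used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed top of userspace for a 3G/1G split. */
#define TASK_SIZE_3G 0xC0000000u

/* Old style: treat the raw syscall return value as a signed number. */
static int mmap_failed_signed(uint32_t ret)
{
	return (int32_t)ret < 0;
}

/*
 * New style: a valid mapping never starts above TASK_SIZE, while error
 * codes (-errno) land in the last page of the address space.
 */
static int mmap_failed_task_size(uint32_t ret)
{
	return ret > TASK_SIZE_3G;
}
```

For an address such as 0xB6F00000, which lies between 2G and 3G, the
signed test reports a failure while the TASK_SIZE test accepts it. A
real error such as -ENOMEM is still rejected by both.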
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
An auxv entry has the size of a machine pointer, but a 64-bit integer is
reserved in a MmEntry protobuf message to store each one. Moreover, the
number of auxv entries varies from one architecture to another. So the
following is proposed to alleviate the issue.
* Introduced the type auxv_t representing a machine-pointer sized integer.
* The size of auxv array is extracted from a MmEntry message instead of using
the value of the macro AT_VECTOR_SIZE.
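
The two points above can be sketched as follows. The typedef shown here
and the helper restore_auxv() are assumptions for illustration; the
real auxv_t is defined per architecture, and the saved array stands in
for the values carried by a MmEntry message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition: an integer as wide as a machine pointer. */
typedef uintptr_t auxv_t;

/*
 * Copy n_saved 64-bit image values into a native auxv array.
 * The count comes from the image, not from a compile-time constant.
 * Returns the number of entries copied, or -1 if dst is too small.
 */
static int restore_auxv(auxv_t *dst, size_t dst_len,
			const uint64_t *saved, size_t n_saved)
{
	size_t i;

	if (n_saved > dst_len)
		return -1;

	for (i = 0; i < n_saved; i++)
		dst[i] = (auxv_t)saved[i];

	return (int)n_saved;
}
```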
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
VM above TASK_SIZE is read-only, but on ARM some areas there are mapped
into the process address space.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* The linker script pie/pie.lds.S is generated from the template
pie/pie.lds.S.in by prepending the output architecture specification.
The output architecture is defined by the variable LDARCH.
* Blobs are generated by objcopy instead of ld because the ARM linker
fails to produce a binary when supplied a script.
(See http://lists.gnu.org/archive/html/bug-binutils/2008-10/msg00091.html).
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>