mirror of https://github.com/checkpoint-restore/criu synced 2025-08-27 20:37:57 +00:00

1696 Commits

Author SHA1 Message Date
Andrey Vagin
8bcffef6b9 proc_parse: parse fdinfo to get pos and flags
We are going to parse fdinfo to get mnt_id, so we can
take pos and flags from there as well and avoid calling
fcntl and lseek for them.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-09 16:42:53 +04:00
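As a rough illustration of commit 8bcffef6b9, the sketch below pulls pos, flags and mnt_id out of text laid out like /proc/<pid>/fdinfo/<fd>. The struct fdinfo_common and parse_fdinfo_text() names are hypothetical and are not taken from proc_parse.c; the only thing assumed about the kernel format is "key:<whitespace>value" lines with flags printed in octal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields every fdinfo file starts with. */
struct fdinfo_common {
	unsigned long long pos;   /* file offset, decimal in fdinfo */
	unsigned int flags;       /* open flags, octal in fdinfo */
	int mnt_id;               /* -1 if the kernel does not report it */
};

/*
 * Parse the text of an fdinfo file. Returns 0 if both pos and flags
 * were found, -1 otherwise. Unknown lines are skipped.
 */
static int parse_fdinfo_text(const char *text, struct fdinfo_common *out)
{
	const char *line = text;
	int found = 0;

	assert(text != NULL && out != NULL);
	out->mnt_id = -1;

	while (*line) {
		if (sscanf(line, "pos: %llu", &out->pos) == 1)
			found |= 1;
		else if (sscanf(line, "flags: %o", &out->flags) == 1)
			found |= 2;
		else if (sscanf(line, "mnt_id: %d", &out->mnt_id) == 1)
			found |= 4;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (!line)
			break;
		line++;
	}

	return (found & 3) == 3 ? 0 : -1;
}
```

A caller would read the fdinfo file into a buffer once and get all three values from it, instead of issuing a separate fcntl() and lseek() per descriptor.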
Andrey Vagin
d36e07aabe crtools: close all descriptors only for the root task
For all other tasks only unused service descriptors will be closed.

This change allows us to keep file descriptors that may be used for
restoring namespaces. All non-service descriptors must be closed before
restoring files.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-09 15:50:40 +04:00
Alexander Kartashov
a75d39613f cr: collect short integer aliases in the single place
This patch moves the files arch/$ARCH/include/asm/int.h to
include/asm-generic/int.h and makes the types {u,s}{8,16,32}
be aliases of the fixed sized integer types [u]int{8,16,32}_t.

This makes it possible to use a single set of integer typedefs
in all architectural ports.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-08 15:35:46 +04:00
Andrey Vagin
33a50cfdc8 mount: detect the newinstance option for devpts (v2)
The devpts instance was mounted w/o the newinstance option if
the device number is equal to that of the root /dev/pts.

I think this condition is strong enough to not mount devpts in a
temporary place.

v2: move the host.bla-bla-bla in kerndat.c
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-04-08 15:32:35 +04:00
Cyrill Gorcunov
0bae3bc181 make: config -- Add testing if we have libbsd installed
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-26 01:44:23 +04:00
Deyan Doychev
69a6bf4439 criu: Add exec-cmd option (v3)
The --exec-cmd option specifies a command that will be execvp()-ed on successful
restore. This way the command specified here will become the parent process of
the restored process tree.

Waiting for the restored processes to finish is responsibility of this command.

All service FDs are closed before we call execvp(). Standard output and error of
the command are redirected to the log file when we are restoring through the RPC
service.

This option will be used when restoring LinuX Containers and it seems helpful
for perf or other use cases when restored processes must be supervised by a
parent.

Two directions were researched in order to integrate CRIU and LXC:

1. We tell CRIU that after restoring the container it should execve()
   lxc, properly explaining to it that there's a new container hanging
   around.

2. We make LXC set itself as child subreaper, then fork() criu and ask
   it to detach (-d) from the restored container afterwards. Being a
   subreaper, it should get the container's init into its child list after that.

The main reason for choosing the first option is that the second one can't work
with the RPC service. If we call restore via the service then criu service will
be the top-most task in the hierarchy and will not be able to reparent the
restore trees to any other task in the system. Calling execve from service
worker sub-task (and daemonizing it) should solve this.

Signed-off-by: Deyan Doychev <deyandoichev@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-25 01:20:02 +04:00
Alexander Kartashov
7535c3909a vdso.c: share the PIE part of the vDSO proxy machinery between all architectures
This patch splits the file arch/x86/vdso-pie.c into machine-dependent
and machine-independent parts by moving the routines vdso_fill_symtable(),
vdso_proxify(), and vdso_remap() to the file pie/vdso.c.

The ARM version of the routines is moved to the source pie/vdso-stub.c
to provide the vDSO proxy stub implementation for architectures
that don't provide the vDSO.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Looks-good-to: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-19 21:09:01 +04:00
Alexander Kartashov
19c6534d86 vdso: make vDSO symbol names architecture-specific
This patch moves the enum VDSO_SYMBOL_* and macros VDSO_SYMBOL_*_NAME
to the x86 specific header since different architectures export
different symbols from their vDSOs.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Looks-good-to: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-19 21:08:57 +04:00
Alexander Kartashov
7c42c9b5b6 cr: generalize the type to store the value of the TLS register
Supported machine architectures provide TLS storages of different sizes:
the size of the TLS storage in x86-64 is 24 bytes, ARM --- 4 bytes
and upcoming AArch64 --- 8 bytes. This means every supported architecture
needs a specific type to store the value of the TLS register.

This patch reworks the interface of the routines arch_get_tls()
and restore_tls() passing them the TLS storage by pointer
rather than by value to simplify the TLS stub for x86.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-19 10:41:39 +04:00
Tikhomirov Pavel
670d1ce856 v2 page-read: rework open_page_read to use in shmem restore
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-18 11:48:58 +04:00
Cyrill Gorcunov
ff50ef3b21 image.h: Include fcntl.h
Not all distros (Rpi) provide the O_PATH definition,
so include fcntl.h here so that we don't hit a compilation
problem like

 |  CC       image.o
 | image.c: In function ‘open_image_at’:
 | image.c:187:29: error: ‘O_PATH’ undeclared (first use in this function)

Reported-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-14 17:59:40 +04:00
Cyrill Gorcunov
bf76aa2068 rlimit: Move CR_FD_RLIMIT out of _CR_FD_TASK, v2
On Thu, Mar 13, 2014 at 02:30:50PM +0400, Cyrill Gorcunov wrote:
>
> This image is deprecated now so move it out of
> _CR_FD_TASK thus we won't be even generating it
> on the dump.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>

Updated

>From cb9c3953beac7d42de80635e7a6e537cc867c479 Mon Sep 17 00:00:00 2001
From: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Thu, 13 Mar 2014 14:24:50 +0400
Subject: [PATCH 7/7] rlimit: Move CR_FD_RLIMIT out of _CR_FD_TASK

This image is deprecated now so move it out of
_CR_FD_TASK thus we won't be even generating it
on the dump.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-14 15:44:51 +04:00
Cyrill Gorcunov
16b5692061 image: open_image_at -- Add O_OPT flag
This allows us to distinguish the situation where image
to be opened is missing but optional, thus no error message
should be printed.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-14 15:43:49 +04:00
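The idea behind commit 16b5692061 can be sketched as follows. This is only an illustration under assumptions: O_OPT_SKETCH, its bit value, open_image_sketch() and the nr_errors_logged counter are hypothetical and do not mirror the real open_image_at() code. The point shown is that the extra flag is stripped before reaching open(2) and that a missing optional file is reported to the caller without an error message.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/* Assumption: a bit that open(2) does not use on Linux. */
#define O_OPT_SKETCH (1 << 30)

/* Counts error messages so the behaviour can be observed. */
static int nr_errors_logged;

static int open_image_sketch(const char *path, int flags)
{
	int optional = flags & O_OPT_SKETCH;
	int fd;

	assert(path != NULL);

	/* The private flag must never be passed to the kernel. */
	fd = open(path, flags & ~O_OPT_SKETCH);
	if (fd < 0 && !(optional && errno == ENOENT)) {
		/* Mandatory image, or a failure other than "missing". */
		nr_errors_logged++;
		fprintf(stderr, "Unable to open %s: errno %d\n", path, errno);
	}

	return fd;
}
```

In both cases the caller still gets a negative return value; only the logging differs, so optional images can be probed quietly.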
Pavel Emelyanov
2616c87d9f core: Allocate CoreEntry (except arch) with single xmalloc
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
2014-03-14 13:39:28 +04:00
Andrey Vagin
5b564db91e namespace: move struct ns_id into namespace.h
It's going to be used for restoring namespaces. For example we need to
enumerate the ns_ids list for restoring mount namespaces.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-12 00:23:47 +04:00
Pavel Emelyanov
ee71c396fa bitops: Add comment about generic bitops header usage
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-12 00:20:22 +04:00
Alexander Kartashov
01c7a87988 asm: convert the ARM implementation of bit operations to the reference
The implementation of bit operations for ARM isn't actually
architecture-specific so it would rather be shared with
the upcoming port for AArch64 that won't provide optimized
implementation of bit operations.

Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-12 00:18:03 +04:00
Tikhomirov Pavel
32a48b67b7 v2 deduplication: bunch neighbour data to punch together
When we decide that data is no longer needed, there are two cases:
- if the data neighbours the previous block of "no needed" data, extend the
bunch block (it holds the beginning and size of the consecutive "no needed"
data) by the length of the current block and go to the next one.
- if the data does not neighbour the bunch block (or the bunch block size
would become bigger than MAX_BUNCH_SIZE), then we punch the bunch block and
set the bunch block to the current block.

At the end, make a cleanup to punch the last bunch block.

changes in v1:
punch_hole takes whole page_read
make restriction more precise

Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-07 14:31:29 +04:00
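The bunching logic described in commit 32a48b67b7 can be sketched as below. Everything except the MAX_BUNCH_SIZE name is an assumption made for illustration: the value of the limit, the struct names, and the fact that "punching" only records the range in an array. A real implementation would presumably release the range in the image file at that point, for example with fallocate() and FALLOC_FL_PUNCH_HOLE.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUNCH_SIZE 256 /* hypothetical limit, in bytes */
#define MAX_PUNCHES 16     /* capacity of the demo log */

struct range {
	unsigned long off;
	unsigned long len;
};

struct buncher {
	struct range cur;                  /* pending bunch, len == 0 means empty */
	struct range punched[MAX_PUNCHES]; /* ranges that were "punched" */
	size_t nr_punched;
};

/* Flush the pending bunch, if any. */
static void punch(struct buncher *b)
{
	if (b->cur.len == 0)
		return;
	assert(b->nr_punched < MAX_PUNCHES);
	b->punched[b->nr_punched++] = b->cur;
	b->cur.len = 0;
}

/* Called for every block that is no longer needed. */
static void drop_block(struct buncher *b, unsigned long off, unsigned long len)
{
	/* Case 1: adjacent to the bunch and still within the size limit. */
	if (b->cur.len && b->cur.off + b->cur.len == off &&
	    b->cur.len + len <= MAX_BUNCH_SIZE) {
		b->cur.len += len;
		return;
	}

	/* Case 2: punch what we have and start a new bunch. */
	punch(b);
	b->cur.off = off;
	b->cur.len = len;
}

/* Final cleanup: punch the last bunch. */
static void finish(struct buncher *b)
{
	punch(b);
}
```

Merging neighbours this way trades a small amount of bookkeeping for fewer punch operations, since adjacent blocks are released in one call.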
Pavel Emelyanov
edde5fb461 irmap: Add option that forces fsnotify watches paths resolve
When migrating a container together with a copy of its FS, the inode
numbers and thus their handles will change. This will make the restore of
inotify/fanotify fail, since they do it via fhandles.

We've already faced the problems with fsnotifies on NFS -- they
don't work there. To address this an irmap cache is created on
pre-dump, so to resolve the issue with changed inodes during
migration, we can force the irmap cache build.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-03-06 15:12:05 +04:00
Pavel Emelyanov
391e4bd7b9 page-read: Sanitize opening routines
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-28 15:19:19 +04:00
Tikhomirov Pavel
a355affd29 v2 deduplication: add separate function for punch to use on restore
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-28 14:10:51 +04:00
Cyrill Gorcunov
056047bdf9 criu: Add --cpu-cap option
This option will serve to manage CPU capabilities
to be matched/ignored on restore procedure. At the
moment we introduce 'fpu','all' capability arguments.
By default 'all' is set.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-28 13:36:38 +04:00
Cyrill Gorcunov
eee9ad2b44 log: Add pr_warn_once helper
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-28 13:34:29 +04:00
Andrey Vagin
06662f9f13 mount: rename TMPFS_MAGIC into TMPFS_IMG_MAGIC
TMPFS_MAGIC is already used in linux/magic.h

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-28 13:26:48 +04:00
Cyrill Gorcunov
6e95f544de headers: Add fs-magic.h
Not all distros provide magic numbers we might need
during build procedure, thus provide own definitions
in one known place.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-28 13:07:54 +04:00
Cyrill Gorcunov
e15914fb08 fsnotify: Open handle with O_PATH, v2
Otherwise, if the mark is set up on a link, we end up
with an -ELOOP error when trying to open it. Thus, use
O_PATH to tell the kernel that we're not going
to read/write this descriptor.

Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-25 23:38:35 +04:00
Cyrill Gorcunov
8bea49a74b pagemap-cache: Use page.h helpers
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-21 16:29:41 +04:00
Cyrill Gorcunov
e252e37201 asm-generic: Introduce page.h
At the moment we are using 4K pages all the time,
so instead of copying code over all archs we're
supporting -- add asm-generic/page.h header.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-21 16:29:40 +04:00
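For readers unfamiliar with such headers, a generic 4K page header of the kind commit e252e37201 describes might look roughly like this. The macro names follow common kernel conventions and are an assumption about the contents of asm-generic/page.h; nr_pages_spanned() is a hypothetical helper added only to show the macros in use.

```c
#include <assert.h>

/* Guarded so that a definition from a system header wins. */
#ifndef PAGE_SHIFT
#define PAGE_SHIFT 12 /* 4K pages, as the commit message states */
#endif
#ifndef PAGE_SIZE
#define PAGE_SIZE (1UL << PAGE_SHIFT)
#endif
#ifndef PAGE_MASK
#define PAGE_MASK (~(PAGE_SIZE - 1))
#endif
#ifndef PAGE_PFN
#define PAGE_PFN(addr) ((addr) / PAGE_SIZE)
#endif

/* Number of pages touched by the byte range [start, end). */
static unsigned long nr_pages_spanned(unsigned long start, unsigned long end)
{
	assert(start <= end);

	unsigned long first = start & PAGE_MASK;                /* round down */
	unsigned long last = (end + PAGE_SIZE - 1) & PAGE_MASK; /* round up */

	return PAGE_PFN(last - first);
}
```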
Cyrill Gorcunov
3ad89009a2 bug: Include <stdbool.h>
We have #define BUG() BUG_ON(true) here, where 'true'
is defined in the stdbool header, so to be able to include
bug.h on its own -- include the needed header in place.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-21 16:27:16 +04:00
Cyrill Gorcunov
9ddcc5d0d1 config-base: Add F_SETPIPE_SZ/F_GETPIPE_SZ
These are needed to compile project on CentOS 6.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-18 12:53:09 +04:00
Cyrill Gorcunov
e98eeaa8b9 config-base: Beautify header
- drop empty line
- add ending endif comment

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-18 12:53:07 +04:00
Cyrill Gorcunov
21e510dd0f pagemap-cache: Introduce engine
Pavel reported that if there is a big number
of small vmas present in the dumpee, we're reading
/proc/pid/pagemap too frequently.

To speed up this procedure we introduce a pagemap cache.

The interface is:
 - pmc_init/pmc_fini for cache initialization and freeing
 - pmc_get_map to retrieve specific PMEs array for VMA area

v2:
 - Move internal constants to pagemap-cache.c
 - Make PAGEMAP_LEN to accept virtual address/size
 - Don't adjust low bound in caching mode to save a couple of code bytes

Reported-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-18 12:49:08 +04:00
Pavel Emelyanov
4b0d41c542 image: Don't unlink image we're dumping into
We want to write into empty image files, so we
unlink them before dumping into them. Let's O_TRUNC
them instead.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
2014-02-14 16:47:02 +04:00
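The change in commit 4b0d41c542 amounts to letting open(2) empty the file rather than removing it first. A minimal sketch, assuming hypothetical helper names and a 0600 mode that are not taken from the CRIU sources:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Open an image file for dumping. O_TRUNC discards any previous
 * contents, so no unlink() is needed beforehand.
 */
static int open_image_for_dump(const char *path)
{
	assert(path != NULL);
	return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
}

/* Size of an open file in bytes, or -1 on error. */
static long image_size(int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	return (long)st.st_size;
}
```

One visible difference from unlink-then-create is that the inode is reused, so a process that already holds the old file open sees it shrink to zero instead of keeping a detached copy.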
Andrey Vagin
18607116fa page-pipe: move tunable constants into config.h
PIPE_MAX_SIZE is calculated according to the kernel code.
PPB_IOV_BATCH has been taken from my mind.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-14 16:43:21 +04:00
Pavel Emelyanov
01e88d1c87 rpc: Add ability to specify veth pairs (--veth-pair option)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-12 00:33:02 +04:00
Andrey Vagin
bb98a82098 page-pipe: split dumping memory on chunks (v3)
The problem is that vmsplice() to a big pipe fails very often.

The kernel allocates a linear chunk of memory for pipe buffer
descriptors, but a big allocation in the kernel can fail.

So we need to restrict the maximal capacity of pipes. But the number of
pipes is restricted too, so we need to split dumping memory into chunks.

In this patch we calculate the pipe size for which vmsplice() will not
fail.

v2: s/batch/chunk and a few other small fixes
v3: Remove callbacks from page_pipes and reuse pipes
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-10 15:06:39 +04:00
Christopher Covington
b0549064c1 Make tpacket_req3 definition conditional
The makefile includes need to be moved for everything to be
defined properly when the configure tests run.

The Ubuntu 12.04 x86_64 GCC and Linaro's 13.08 and newer AArch64
GCC's have the if_packet.h kernel header, but as of 13.12,
the Linaro AArch32 GCC does not.

Change-Id: I363c43fdb81b028f99aac77e15bff9462c87af4b

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-10 14:31:02 +04:00
Andrey Vagin
2824c968e7 ptrace: include config.h in ptrace.h
CONFIG_HAS_PEEKSIGINFO_ARGS is used there.

Cc: Christopher Covington <cov@codeaurora.org>
Reported-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-07 15:32:17 +04:00
Pavel Emelyanov
b0c0933744 fifo: Don't lookup reg path twice
Same as the previous patch with vmas -- we do it on collect and
on real open. Just put the pointer on the fifo_info structure.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-07 13:55:27 +04:00
Pavel Emelyanov
dc7abdfb92 vma: Don't lookup file_desc for vma twice
We do it first -- on collect, second -- on restore. The
2nd lookup is excessive, we can put fd pointer on vm_area
at lookup and reuse one later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-07 13:51:29 +04:00
Pavel Emelyanov
fd41201975 restore: Parse /proc/self/maps for self mappings
On restore we only need to know the current task mappings' start and end
to find where to put the restorer blob. And since the smaps file in
/proc/pid is up to 3 times slower than the maps one, it makes
perfect sense just to parse the latter one.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-07 13:32:21 +04:00
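Getting only the start and end of a mapping from a /proc/<pid>/maps line, as commit fd41201975 describes, needs very little parsing. The sketch below is an illustration with a hypothetical function name; it relies only on the documented "start-end perms ..." layout of a maps line and ignores every field after the address range.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Extract the address range from one line of /proc/<pid>/maps.
 * Returns 0 on success, -1 if the line is malformed.
 */
static int parse_maps_line(const char *line, unsigned long *start,
			   unsigned long *end)
{
	assert(line != NULL && start != NULL && end != NULL);

	/* Both addresses are hexadecimal without a 0x prefix. */
	if (sscanf(line, "%lx-%lx", start, end) != 2)
		return -1;

	return *start < *end ? 0 : -1;
}
```

A caller would feed it each line obtained with fgets() from /proc/self/maps and collect the ranges to find a free gap for the restorer blob.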
Pavel Emelyanov
490efb4695 files: Properly count number of users for mmaped/exe-d ghost files
If a file that is mmaped or pointed to by the exe link is unlinked, we
will generate a ghost file for it. On restore the ghost file will
be created with the users counter set to 1, and the very first open
(e.g. for mmap) will unlink the file.

Handle this by bumping up the user counter for every mapping
pointing to the file.

This appeared after previous patches that packed the reg-files
image. Before it each vma and exe link created separate entry
in the reg-files image.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:18:21 +04:00
Pavel Emelyanov
3de41b8070 files: Rework select_ps_list fdsec ops callback
For unlinked opened and mmaped files we'd need to
care about remaps, for this the callback with both
file_desc and fdinfo_list_entry will be required.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:59 +04:00
Pavel Emelyanov
8a827ba403 files: Make fd_id_generate_special return ID into pointer
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:49 +04:00
Pavel Emelyanov
9857acc0a2 files: Pass stat information into fd_id_generate
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:41 +04:00
Pavel Emelyanov
8b611770aa files: Pass stat information into fd_id_generate_special
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-05 16:17:18 +04:00
Pavel Emelyanov
49b427b721 log: Don't override -v0 with -v2
If we specify log level to none (0) the result is LOG_INFO (2).

Acked-by: Andrew Vagin <avagin@parallels.com>
Acked-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 20:54:25 +04:00
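The class of bug fixed by commit 49b427b721 typically comes from using 0 both as "not set" and as a valid level, so that an explicit -v0 gets replaced by the default. One common way to avoid it is a separate sentinel, sketched below. The names, the sentinel value and the default of 2 are assumptions for illustration; the commit itself may have solved the problem differently.

```c
#include <assert.h>

#define LOGLEVEL_UNSET   (-1) /* hypothetical "option not given" marker */
#define LOGLEVEL_DEFAULT 2    /* hypothetical default level */

/*
 * Resolve the level to use. Only the sentinel selects the default,
 * so an explicit 0 is preserved.
 */
static int effective_loglevel(int requested)
{
	assert(requested >= LOGLEVEL_UNSET);

	if (requested == LOGLEVEL_UNSET)
		return LOGLEVEL_DEFAULT;
	return requested;
}
```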
Pavel Emelyanov
d8071ffd1a stats: Fix restore pages stats
We erroneously report nr_compared as the total number of restored pages.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 14:03:10 +04:00
Pavel Emelyanov
54f4f889a5 mm: Move VmaEntries from separate image into Mm one
When writing VMAs we perform too many small writes into vma-.img files.
This can be easily fixed by moving the vma-s into mm-s, all the more
so since they cannot be split from each other.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 11:44:05 +04:00
Pavel Emelyanov
72e462ad67 mm: Read mmentry early
We'll merge mm and vma images, so mm should be read in the
same place where vmas are.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
2014-02-04 11:44:04 +04:00