Use explicit type to placate the compiler.
| proc_parse.c: In function 'vma_get_mapfile':
| proc_parse.c:282:6: error: format '%lx' expects argument of type 'long unsigned
| int', but argument 5 has type 'uint64_t' [-Werror=format=]
| pr_err("Strange file mapped at %lx [%s]:%d.%d.%ld\n",
| ^
| proc_parse.c:335:5: error: format '%lx' expects argument of type 'long unsigned
| int', but argument 5 has type 'uint64_t' [-Werror=format=]
| pr_err("Failed to resolve mapping %lx filename\n",
| ^
| cc1: all warnings being treated as errors
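For reference, one portable way to print a uint64_t is the PRIx64 macro from <inttypes.h>; an explicit cast to unsigned long with %lx is the other common option. A minimal sketch, where format_addr is a hypothetical helper and not a function from proc_parse.c:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit address as hex without
 * tripping -Werror=format on targets where uint64_t is not
 * 'unsigned long'. */
static int format_addr(char *buf, size_t size, uint64_t addr)
{
	/* PRIx64 expands to the correct length modifier for uint64_t. */
	return snprintf(buf, size, "%" PRIx64, addr);
}
```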
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We were not going to change it on dump
Cc: Dmitry Safonov <dsafonov@odin.com>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Loginuid tests should run only when kdat.has_loginuid set.
This is for > 3.13 kernels with CONFIG_AUDITSYSCALL enabled.
Signed-off-by: Dmitry Safonov <dsafonov@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Dump/Restore loginuid value only when kdat.has_loginuid set.
Signed-off-by: Dmitry Safonov <dsafonov@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This value will differ on C/R:
- on checkpoint it means that it's possible to dump loginuid values;
- on restore it means that it's possible to unset loginuid and write
the saved value to the unset loginuid.
Signed-off-by: Dmitry Safonov <dsafonov@odin.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
rst_mem_alloc might move previously allocated
blocks, which means we can't reuse a previously obtained
pointer once a new block is created. Thus remember pointer
positions where needed and adjust them accordingly.
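The general pattern can be sketched with a toy growable arena. This is not CRIU's rst_mem code; struct arena, arena_alloc and arena_ptr are made-up names that only illustrate why a position survives a move while a raw pointer does not:

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy growable arena, for illustration only. */
struct arena {
	char *base;
	size_t used;
	size_t cap;
};

/* Returns the position (offset) of the new block, or (size_t)-1 on
 * failure. The base address may change on every call, so callers
 * must not keep raw pointers across allocations. */
static size_t arena_alloc(struct arena *a, size_t size)
{
	size_t pos = a->used;

	if (a->used + size > a->cap) {
		size_t ncap = (a->used + size) * 2;
		char *nbase = realloc(a->base, ncap);

		if (!nbase)
			return (size_t)-1;
		a->base = nbase;
		a->cap = ncap;
	}
	a->used += size;
	return pos;
}

/* Turn a remembered position back into a pointer. The result is
 * valid only until the next arena_alloc() call. */
static void *arena_ptr(struct arena *a, size_t pos)
{
	return a->base + pos;
}
```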
https://github.com/xemul/criu/issues/97
Reported-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On restore we try to put data back into recv queue with quite
big chunks. However the kernel doesn't try hard to split the
data into packets in repair mode for this queue and just
allocates the linear skb of the given size. If the size is
moderately big, the allocation is likely to fail, since slab doesn't
reliably allocate memory over 4k.
So, when failing with big chunk on recv queue -- shrink the
chunk and try again.
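A rough sketch of the shrink-and-retry loop follows. The names (send_shrinking, fake_send, MIN_CHUNK), the ENOMEM check and the 1024-byte floor are assumptions made for the example, not the actual sk-tcp code; fake_send stands in for a kernel that refuses chunks above a given size:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

#define MIN_CHUNK 1024 /* assumed floor below which we give up */

/* Hypothetical sender: returns bytes queued, or -1 with errno set. */
typedef ssize_t (*send_fn)(const char *buf, size_t len, void *ctx);

static int send_shrinking(send_fn do_send, void *ctx,
			  const char *buf, size_t len, size_t chunk)
{
	size_t off = 0;

	while (off < len) {
		size_t want = len - off < chunk ? len - off : chunk;
		ssize_t ret = do_send(buf + off, want, ctx);

		if (ret < 0 && errno == ENOMEM && chunk > MIN_CHUNK) {
			chunk /= 2; /* allocation failed: halve and retry */
			continue;
		}
		if (ret <= 0)
			return -1;
		off += (size_t)ret;
	}
	return 0;
}

/* Fake queue that rejects any chunk larger than 'limit'. */
struct fake_queue {
	size_t limit;
	size_t queued;
	unsigned int calls;
};

static ssize_t fake_send(const char *buf, size_t len, void *ctx)
{
	struct fake_queue *q = ctx;

	(void)buf;
	q->calls++;
	if (len > q->limit) {
		errno = ENOMEM;
		return -1;
	}
	q->queued += len;
	return (ssize_t)len;
}
```

With a 4096-byte limit and a 16384-byte starting chunk, the first two attempts fail and the rest of the data goes through in 4096-byte pieces.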
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Some tests require criu to be root, e.g. tcp tests or unlink-mmaps ones,
so mark those in desc files as such.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
There are several restrictions:
1. Only dump is checked (--norst) for now
2. Only host flavor as tests have to start themselves in non-root mode
3. Only non-suid tests
4. TCP doesn't work too, should be manually excluded :\
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Currently parasite is loaded using the map_files dir,
which is guarded with CAP_SYS_ADMIN by default (which
is dropped in 4.2 series). So let's do a deal -- try
to use memfd interface first (which was introduced
in the 3.17 kernel series) and if we fail then switch to old
map_files interface.
With time all users will switch to new kernels, so
memfd is going to be the primary interface.
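The try-then-fall-back order might look roughly like this. It is a sketch only: open_parasite_fd and the fallback callback are hypothetical names, and the callback merely stands in for the old map_files path, which is not reproduced here:

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical callback standing in for the old map_files path. */
typedef int (*fallback_open_fn)(void *arg);

static int open_parasite_fd(const char *name,
			    fallback_open_fn fallback, void *arg)
{
#ifdef SYS_memfd_create
	/* Raw syscall, so this also builds against a libc that
	 * has no memfd_create() wrapper. */
	int fd = (int)syscall(SYS_memfd_create, name, 0U);

	if (fd >= 0)
		return fd;
#else
	(void)name;
#endif
	/* Old kernel or memfd refused: use the legacy interface. */
	return fallback ? fallback(arg) : -1;
}
```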
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Kernel doesn't allow to mess with map_files dir in proc. So,
when doing dump from user process, we should try to get
file path using path from smaps file. To be 100% sure the
path is correct we also get device and ino numbers and
check them against the stat()-ed path ones.
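The check itself is cheap. A minimal sketch, assuming the major:minor pair read from smaps has already been converted to a dev_t (path_matches is a hypothetical name):

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if 'path' lives on device 'dev' with inode 'ino',
 * 0 if it refers to some other file, -1 if stat() fails. */
static int path_matches(const char *path, dev_t dev, ino_t ino)
{
	struct stat st;

	if (stat(path, &st))
		return -1;
	return st.st_dev == dev && st.st_ino == ino;
}
```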
With this scheme we miss
- mapped packet sockets, but users don't use them
- AIOs, but this can be detected via device, inode and name
- several nested mntns's, but users don't use them
- mapped and unlinked files, but this can be fixed by
reading file via task's memory (slow, but still)
gorcunov@:
- For special mappings such as heap, vsyscall, vdso and such
the kernel provides names enclosed in brackets so exit
from vma_get_mapfile if we meet one and allow the caller
to handle it.
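The bracket test amounts to looking at the first character of the name. A sketch with a hypothetical helper name, not the code from vma_get_mapfile:

```c
#include <stdbool.h>
#include <stddef.h>

/* True for names such as "[heap]" or "[vdso]" that the kernel
 * reports for special mappings instead of a file path. */
static bool is_special_mapping(const char *name)
{
	return name != NULL && name[0] == '[';
}
```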
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
If criu is running in unprivileged mode and we can't
access dumpee's pagemap file -- simply switch to
greedy mode where all pages are gonna be dumped
regardless of their presence in memory.
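The per-page decision could be sketched as below. Bits 63 (present) and 62 (swapped) are the documented pagemap entry flags; should_dump_page and the greedy argument are illustrative names, not the actual CRIU code:

```c
#include <stdbool.h>
#include <stdint.h>

#define PME_PRESENT (UINT64_C(1) << 63) /* page is in RAM */
#define PME_SWAP    (UINT64_C(1) << 62) /* page is swapped out */

/* In greedy mode there is no pagemap entry to consult, so every
 * page is dumped; otherwise only pages that actually exist are. */
static bool should_dump_page(bool greedy, uint64_t pme)
{
	if (greedy)
		return true;
	return (pme & (PME_PRESENT | PME_SWAP)) != 0;
}
```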
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We use page frame number to detect vDSO which has been remapped
in-place from runtime vDSO during restore. In such case if the
kernel is older than 3.16 the "[vdso]" mark won't be reported
in procfs output.
Still to address recently reported CVEs and be able to run CRIU
in unprivileged mode we need to handle vDSO without pagemap access
and here is the deal -- when we find VMA which "looks like" vDSO
we try to scan it for vDSO symbols and if it matches we restore
its status without PFN access.
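The first and cheapest part of such a scan is checking that the area starts with an ELF header of a shared object. The sketch below covers only that step and assumes a 64-bit ELF; a real check has to go on and look up the vDSO symbols. looks_like_elf_dso is a hypothetical name:

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Cheap pre-check: does the memory start with the header of a
 * 64-bit ELF shared object? The vDSO image is of type ET_DYN. */
static bool looks_like_elf_dso(const void *mem, size_t size)
{
	Elf64_Ehdr ehdr;

	if (size < sizeof(ehdr))
		return false;
	/* Copy out to avoid alignment assumptions about 'mem'. */
	memcpy(&ehdr, mem, sizeof(ehdr));
	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG))
		return false;
	return ehdr.e_type == ET_DYN;
}
```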
Here are some details on @pagemap access in-kernel history:
- @pagemap introduced in commit 85863e475e59 where anyone
who can attach to a task via ptrace is allowed to read
data from @pagemap (Feb 4 2008, v2.6.25-rc1)
- in commit 006ebb40d3d65 ptrace attach rule has been changed
into ptrace read permission (May 19 2008, v2.6.27-rc1)
- in commit ab676b7d6fbf4 opening of @pagemap became guarded
with CAP_SYS_ADMIN because of leak of physical addresses
into userspace (Mar 9 2015, v4.0-rc5)
- in commit 1c90308e7a77a opening of @pagemap became available
for regular users again (with ptrace read permission) but
physical addresses of pages are hidden from non-privileged
users (Sep 8 2015, v4.3-rc1)
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
In particular, we won't be able to do memory tracking and
zero page detection.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Kernel doesn't allow to read /proc/pid/map_files. This file
is used to get pseudo device for anon shmem mappings, but
this info can be obtained by scanning /proc/self/maps file.
This works slower, but still.
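Each maps line carries the device as a hex major:minor pair followed by a decimal inode, so the scan comes down to parsing lines like the one below. This is a sketch; struct map_entry and parse_maps_line are made-up names:

```c
#include <stdio.h>

struct map_entry {
	unsigned long start, end;
	unsigned int dev_major, dev_minor;
	unsigned long ino;
};

/* Parse one line in the /proc/<pid>/maps format:
 *   start-end perms offset major:minor inode [path]
 * Returns 0 on success, -1 if the line does not match. */
static int parse_maps_line(const char *line, struct map_entry *e)
{
	char perms[5];
	unsigned long pgoff;
	int n;

	n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu",
		   &e->start, &e->end, perms, &pgoff,
		   &e->dev_major, &e->dev_minor, &e->ino);
	return n == 7 ? 0 : -1;
}
```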
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
When run as a regular user criu will get EACCES/EPERM from
opening proc, but in some situations criu will know how to
deal with it. So this patch makes it possible not to print
error message in logs for such cases.
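The idea, reduced to a standalone sketch (open_quiet_perm is a hypothetical helper; the real code goes through criu's own proc-opening macros and logging):

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Open 'path', staying silent on EACCES/EPERM so that a caller
 * with a fallback can handle them; other errors are logged. */
static int open_quiet_perm(const char *path, int flags)
{
	int fd = open(path, flags);

	if (fd < 0 && errno != EACCES && errno != EPERM) {
		int saved = errno;

		fprintf(stderr, "Can't open %s: %s\n", path, strerror(saved));
		errno = saved; /* keep errno intact for the caller */
	}
	return fd;
}
```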
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
We will need an extra space for memfd based
syscall (without poking the stack since it's
not that safe without additional tests).
So add @pad argument which will be used
to find proper memory for seized syscall
execution.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Right now we only get the first 31 characters of it, but in the
next patches full path would be required.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
We no longer support root-mode service and suid binaries, so
any artificial restrictions no longer make sense.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
To test c/r of creds we need a more precise way,
so let's add a few additional creds to test.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise we will not be able to access /proc/pid/* for the process.
v2: s/__NR_WAIT4/__NR_setresuid
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Modification time changes after cpt/rst file_attr test in VZ7CT:
CT-102 criu# cat test/zdtm/live/static/file_attr.out
15:05:05.315: 146: FAIL: file_attr.c:101: modification time has
changed (errno = 11 (Resource temporarily unavailable))
https://jira.sw.ru/browse/PSBM-41401
v2: add timeval message, test seems to pass now - remove noauto
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This as well as restore requires several steps to reach per-thread
support during dump stage
- @creds area to be fetched from the parasite is embedded into
parasite_dump_structure
- when testing whether a task is dumpable we no longer compare caps
because we now allow them to be different (and I renamed
proc_status_creds_eq to proc_status_creds_dumpable for this
sake)
- have to extend dump_thread_common to support dumping of
creds (we call for dump_thread_common in several places,
in particular when we need to fetch misc params we don't
need creds, here @creds option comes into the play)
- after this patch no creds-X.img file will be generated anymore,
I guess we might drop it off with time from descriptors
https://jira.sw.ru/browse/PSBM-41416
v2:
- In dump_task_creds() don't mangle the call for parasite_dump_creds
and collect_lsm_profile
- PARASITE_MAX_GROUPS takes parasite_dump_thread into account because
dump_thread_common now serves two cases: for plain misc parameters
fetching and for creds as well (depending on the context)
- when testing for dumpable we still require the seccomp filters
to match, they can be different and we need to support such
configuration too but not in this series
v3:
- Rip off dump_task_creds completely, together with PARASITE_CMD_DUMP_CREDS,
we dump creds unconditionally in dump_thread_common
- the group leader thread data is fetched via new
parasite_dump_thread_leader_seized helper
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Because the creds parameters are to be passed inside pie/restorer
code but read before thread_restore_args and task_restore_args
structures are allocated we need a small trick and prepare
creds in several stages
- collect all creds data into separate private memory blobs
- once all memory needed for restorer is allocated we relocate
pointers in these blocks and set up
thread_restore_args::thread_creds_args to appropriate
address
- restorer works as usual and sets up creds parameters as before
v2:
- fix addressing in positioning of rst_ memory (I accidentally
zapped pointers and forgot to merge the changes back when sending
the patches, so while I had the series successfully restoring
containers with different creds, the series as sent wouldn't work
if merged. All changes are now merged as appropriate)
- drop module's global @cap_last_cap from pie/restorer.c
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For easier comparison, which is going to be addressed in the next patch.
https://jira.sw.ru/PSBM-41416
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Creds are per-thread data, declare them appropriately.
We will need this data to restore threads with different
credentials.
(In a scope of https://jira.sw.ru/browse/PSBM-41416)
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The mountpoints.c test creates such mount and criu will try to
kerndat-check one, so this fs should be on "host".
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The mountpoints.c test creates such mount and criu will try to
kerndat-check one, so this fs should be on "host".
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A static test should not change its state during C/R
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Similar to devtmpfs and devpts, skip binfmt_misc
mount if it's not virtual.
Signed-off-by: Kirill Tkhai <ktkhai@odin.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>