This prepares CRIT for python2/python3 compatibility by auto-detecting
the installed python version. python2 is detected first and then the
variable PYTHON is set.
By setting the variable PYTHON to python2/python3 the user can override
the auto-detection.
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Otherwise, tests dependencies are not considered for build.
Add an error in Makefile.inc so this won't happen again.
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This reverts commit dcafa78b96084216e32b757be62b7fa8aebbdd87.
I've found that we already include deps in Makefile.inc,
was to fast on the first attempt and overlooked this.
(the include just doesn't work like it should yet..)
The origin patch may be just dropped before preparing master's merge.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This helper must init every member of a structure,
not some part of it. Beautify code as well.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Otherwise there is an imbalance in logs with number of
"Reverted working dir" message
| (00.018604) 36: unix: Connected 0x11ceff -> 0x11cf00 (0x11cf00) flags 0
| (00.018644) 36: unix: Reverted working dir
| (00.018652) 36: unix: Connected 0x11cefd -> 0x11cefe (0x11cefe) flags 0
| (00.018665) 36: unix: Reverted working dir
| (00.018688) 36: unix: Reverted working dir
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
To unify style of pointers fetching from memory slab.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
For readability sake. And use standart uint8_t types.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
CID 190174 (#1 of 1): Argument cannot be negative (NEGATIVE_RETURNS)
6. negative_returns: fd is passed to a parameter that cannot be negative.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
>>> CID 190177: Integer handling issues (NEGATIVE_RETURNS)
>>> rpc_sk is passed to a parameter that cannot be negative.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
CID 84654 (#1 of 1): Resource leak (RESOURCE_LEAK)
6. leaked_handle: Handle variable fd going out of scope leaks the handle.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The purpose of the num_children field in 'struct lazy_pages_info' was to
prevent closing the page-read while there are still active processes that
share it. It did work for the case when handling of the child processes
finished before the parent process. However, if the parent lpi is closed
first, we've got a dangling pointer at lpi->parent.
The obvious solution is to use simple reference counting.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It is possible that notification about restore finish arrives at the same
time with a fork event. In such case we return to epoll_run_rfds without
resetting the poll_timeout and then we'll keep polling for events
indefinitely. To avoid this, we reset the poll_timeout to 0 as soon as we
know that restore is finished.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The only use for the userfault file descriptor after the process exited is
for debug logs. Using negative value instead of 0 makes logs more readable.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Although they are the same, xfree() looks more consistent with other code
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
There are a lot of lines, which are longer than 79:
(04.331172)pie: 1: Error (criu/pie/restorer.c:460): seccomp: Unexpected tid ->
(04.331172)pie: 1: 1 != 1
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Credential commitment affects dumpable and pdeath signals
so we have to move their restore after the restore_creds,
just like we have in __export_restore_task (ie for
group leader).
https://jira.sw.ru/browse/PSBM-84198
Acked-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
kerndat_try_load_cache() fills kdat from /run/criu.kdat,
so it will contain some trash, if criu.kdat isn't compatible with the
current version of criu.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Looking up for pid in nesting pidns supposed to be done
for non group leaders only, thus __export_restore_thread
do this check on its own and we don't have to make
a similar lookup especially on group leader where
pids in args never were valid.
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Andrew proposed the test which actually triggered the issue
in current seccomp series, put it into a regular basis.
Suggested-by: Andrey Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
When considering if we to call PTRACE_O_SUSPEND_SECCOMP
on the tid we should take into account if there at least
one thread which has seccomp mode enabled, otherwise
we might miss filter suspension and restore procedure
might break due to own criu syscall get filtered out.
Same time we should move seccomp restore for threads
to take place after CR_STATE_RESTORE_SIGCHLD state
so that main criu code will attach to threads and
setup seccomp suspension flag before we start
restoring the filters.
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
To checkpoint per-thread seccomp filters we need
a significant rework of a dumping code. The general
idea is the following:
- Each thread is tracked by its tid inside global
seccomp rbtree thus we can easily add entries
there or lookup on demand.
- When we collect threads into pstree entries we fetch
its seccomp mode from procfs parsing routine and allocate
a new entry inside rbtree to remember the seccomp mode.
Note at this moment we're not dumping real filters yet
(because filter data image is a single one for all consumers)
- Once all tids are collected and our tree is complete we call for
seccomp_collect_dump_filters helper which walks every pstree entry
and iterate over each tid inside thread group calling
seccomp_dump_thread, which in turn uses ptrace engine to fetch
filters and keep this data in memory.
To optimize data usage we figure out if we can use TSYNC flag
on restore calling try_use_tsync helper: for TSYNC flag kernel
automatically propagate filter to all threads, thus we need to
compare all filters inside thread group for identity since there
is no other way to figure out if user passed TSYNC flag when
been creating filters.
- Finally dump_seccomp_filters is called which does real write
of seccomp filter data into an image file.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
At now we pretend that all threads are sharing seccomp chains
and at checkpoint moment we test seccomp modes to make sure
if this assumption is valid refusing to dump otherwise.
Still the kernel tacks seccomp filter chains per each thread
and now we've faced applications (such as java) where per-thread
chains are actively used. Thus we need to bring support of handling
filters via per-thread basis.
In this a bit intrusive patch the restore engine is lifted up
to treat each thread separately. Here what is done:
- Image core file is modified to keep seccomp filters
inside thread_core_entry. For backward compatibility
former seccomp_mode and seccomp_filter members in
task_core_entry are renamed to have old_ prefix and
on restore we test if we're dealing with old images.
Since per-thread dump is not yet implemeneted the
dumping procedure continue operating with old_ members.
- In pie restorer code memory containing filters are addressed
from inside thread_restore_args structure which now
contains seccomp mode itself and chain attributes
(number of filters and etc).
Reading of per-thread data is done in seccomp_prepare_threads
helper -- we take one pstree_item and walks over every thread
inside to allocate pie memory and pin data there.
Because of PIE specific, before jumping into pie code
we have to relocate this memory into new place and
for this seccomp_rst_reloc is served.
In restorer itself we check if thread_restore_args provides
us enabled seccomp mode (strict or filter passed) and call
for restore_seccomp_filter if needed.
- To unify names we start using seccomp_ prefix for all related
stuff involved into this change (prepare_seccomp_filters renamed
to seccomp_read_image because it only reads image and nothing
more, image handler is renamed to seccomp_img_entry instead
of too short 'se'.
With this change we're now allowed to start collecting and
dumping seccomp filters per each thread, which will be
done in next patch.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Note that there is no real usage of this flag on restore,
we simply save it in image and will make a real use
later.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This header is main place for all seccomp related
structures so move seccomp_info here. This will
allow to minimize changes area when need to update
definitions and such.
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>