As we have more than 1 working plugin right now, let's implement
the TODO item for "compel plugins" command.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Ubuntu 14.04 (Travis) doesn't have it.
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Looks-good-to: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Apparently, setup.py from distutils interprets --root= option without
an argument as "--root=." and we end up with what is described
in https://github.com/xemul/criu/issues/309.
Fix is to prepend DESTDIR value (if any) to --prefix argument.
v2: fix uninstall as well
v3: same code, resent via gmail
Reported-by: Juraj Oršulić <juraj.orsulic@fer.hr>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
soccr.h:
After previous patch <linux/types.h> is not needed anymore,
<netinet/tcp.h> is needed as we test for it in feature tests,
i.e., struct tcp_repair_window is declared there.
Also drop forward-declaration of (struct libsoccr_sk) - it's
declared again below.
soccr.c:
Sort headers by name so errors like twice-including errno.h
will not happen again. Also remove assert.h.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
As uint32_t already has occupied the header, use it for all 32-bit
unsigns. That will allow to drop additional include <linux/types.h>.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The same as for Checkpointing - we need to call 32-bit syscall
for compatible tasks here to correctly restore compat_robust_list,
not robust_list.
Note: I check here restorer's *task* arg for compatible mode, not
restorer *thread* arg. As changing application's mode during runtime
is very rare thing itself, application that runs different bitness
threads is most likely not present at all.
If we ever meet such application, this could be improved.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The kernel keeps two different pointers for 32-bit and 64-bit futex
lists: robust_list and compat_robust_list in task_struct.
So, dump compat_robust_list for ia32 tasks.
Note: this means that one can set *both* compat_robust_list and
robust_list pointers by using as we're here 32-bit and 64-bit syscalls.
That's one of mixed-bitness application questions.
For simplification (and omitting more syscalls), we dump here only
one of the pointers.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Check for presence of robust futex list is in the reality
(futex_rla_len != 0).
That does code on dumping, in get_task_futex_robust_list():
> ret = syscall(SYS_get_robust_list, pid, &head, &len);
> if (ret < 0 && errno == ENOSYS) {
[..]
> len = 0;
[..]
> }
[..]
> info->futex_rla_len = (u32)len;
And in images: futex_rla_len == 0 means that futex is not present.
So, we don't need additional restorer's parameter `has_futex'
which is always true, remove it.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Modules pre-load is also slow, but guarding this code with
the presence of criu.kdat cache file seems reasonable.
Or course, one may unload the needed modules by hands, but
such smart user may as well remove the /run/criu.kdat file :)
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Doing kerndat checks on every criu start is way too slow. We
need some way to speed this checks up on a particular box.
As suggested by Andre, Dima and Mike let's try to keep the
collected kdat bits into some tmpfs file. Keeping it on
tmpfs would invaludate this cache on every machine reboot.
There have been many suggestions how to generate this file,
my proposal is to create it once the kdat object is filled
by criu, w/o any explicit command. Optionally we can add
'criu kdat --save|--drop' actions to manage this file.
v2:
* don't ignore return code of write() (some glibcs complain)
* unlink tmp file in case rename failed
v3:
* add one more magic into kerndat_s which is the 'date +%s'
* ignore any errors opening or saving cache. Only size/magic
mismatch matters (and result in dropping the cache)
* cache file path is Makefile-configurable (RUNDIR)
* don't save cache if kerndat auto-detection failed
v4:
* Use ?= for RUNDIR definition.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
v2: When uffd is present, the reported features may still be 0,
so we need one more bool for uffd syscall itself.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Some get_status() methods may allocate data, because
not all of the fields in /proc/[pid]/status file
have the fixed size. For example, NSpid, which
size may vary.
Introduce new method free_status() in counterweight
for such type get_status() methods. it will be called
in case of we go to try_again and need to free allocated
data.
Also, introduce data parameter for a use in the future.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The goal of this function is to compare everything except caps,
but caps size is took to compare. It's wrong, there must be
used offsetof(struct proc_status_creds, cap_inh) instead.
Also, sigpnd may be different too.
v3: Move excluding sigpnd from comparation in this patch (was in another patch).
Reorder fields in seize_task_status().
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Having a "header library" is nice if it's small and clean, but
- we compile its code a few times;
- there is no distinction between internal and external functions.
Let's separate functions out of header and into a .c file.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The function is not included into the library, so having its prototype
there was a shortcut. Move it to a separate include file.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This is an auxiliary source file. The corresponding object file was
cleaned, but .d was not. Add it to SRC/OBJ/DEP so the appropriate files
will be cleaned automatically.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
MAKEFLAGS += -r only works for sub-make, and it is not applicable to
the current instance. Since previous commit make is not re-running
itself (after re-reading deps files), so MAKEFLAGS no longer works.
Use one more way to disable built-in rules that stand in our way.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
As it was pointed out by our esteemed maintainer (let his light shine),
after my recent changes to test/zdtm Makefiles all dependencies are
regenerated even if we only need to build a single test (for example,
cd test/zdtm/static && make env00).
This was caused by "-include $(DEP)" statement. Make sees that these files
are need to be included, but are missing, and since it knows how to generate
them it goes on to do so.
The solution is to use $(wildcard) function which returns the list of
_existing_ files, and so include will only receive the files that exist.
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
inotify_system_nodel.c is supposed to be a symlink to inotify_system.c,
but somehow the file was committed. This, together with the statement
in Makefile to recreate the file, lead to replacing the file with a
symlink during make.
Remove the file, add the symlink, and remove the Makefile rule.
PS yes, I have checked the files are identical.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
When we don't restore any namespaces criu forces tasks to wake
it up two times simply to no-op and wake up tasks back. This
can be optimized by simply omitting the not needed wakeups.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Here's why:
This stage is needed to make sure all tasks have appeared
and did some actions (that are called before restore_finish_stage()).
With this description there's no need in involving criu process
in it, this stage is purely inter-tasks sync point.
Taking into account we do already make root task wait for others
to complete forking (it calls restore_wait_ther_tasks()) we may
rework this stage not to involve criu process in it.
Here's how:
So the criu task starts the forking stage, then goes waiting for
"inprogress tasks". The latter wait is purely about nr_in_progress
counter, thus there's no strict requirement that the stage remains
the same by the time criu is woken up.
Siad that, the root task waits for other tasks to finish forking,
does fini_restore_mntns() (already in the code), then switches
the stage to the next (the RESTORE one). Other tasks do normal
staging barrier. Criu task is not woken up as nr_in_progress
always remains >= 1.
The result is -2 context switches -- from root task to criu and
back -- which gives us good boost when restoring single task app.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Describe what the tasks do during restore and what the
expectations at the sync points are.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Looks like this separate stage is not needed. The scripts
involved in ns restore are synchronized with existing stages
like this:
criu: root task:
ROOT_TASK stage
<appear>
"setup-ns" script
PREPARE_NAMESPACES
prepare_namespace()
"post-setup-ns" script
FORKING
restore_task_mnt_ns()
<everything else>
which seems to be OK.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This is wat root task does -- calls prepare_namespace().
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The stage name is what tasks do, not what criu waits for.
When the first stage is started we want the root task to
come up, rather than namespaces to get restored.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Don't set futexes by hands, use the restore_switch_stage
helpers explicitly (for code readability).
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The restore_switch_stage() already waits tasks at the
end, so there's no need in one more waiting.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It is used now to close descriptors of mount namespaces
and will be used for network namespaces too.
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It is required for cases when we inject a fault in criu restore.
In this case we execute "criu restore" and check that it fails,
then we execute "criu restore" without a fault and check that it passes.
If the first "criu restore" restores only a part of processes,
the second criu can get PID of one of restored processes.
https://github.com/xemul/criu/issues/282
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
We need compat realization for restorer unmap as after rt_sigreturn()
the task is stopped it 32-bit code and ptrace API doesn't allow
setting x86_64 full registers set to ia32 task.
Generic restorer has now x86-specific __export_unmap_compat()
function, which isn't right.
Clean restorer from x86-related realization.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
We don't need __export_unmap_compat() for !CONFIG_COMPAT in restorer
blob. This preparation will allow to move compatible unmap that's
written in x86 asm from generic restorer blob to arch/x86.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Each `make tags` resulted in running feature-tests.
That's not needed for tags generation.
Don't waste time on tests as we don't compile anything.
Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Let's pretend that we're doing something ;-D
FWIW: cleaning two lines in the test output:
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
========================== Run zdtm/static/tty03 in h ==========================
make[2]: Nothing to be done for `default'.
Start test
make[2]: Nothing to be done for `default'.
./tty03 --pidfile=tty03.pid --outfile=tty03.out
Run criu dump
Run criu restore
Send the 15 signal to 24
Wait for zdtm/static/tty03(24) to die for 0.100000
Removing dump/zdtm/static/tty03/24
========================= Test zdtm/static/tty03 PASS ==========================
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Right now they all sit in a separate file. Since we
don't support CLONE_SIGHAND (and don't plan to) it's
much better to have them in core, all the more so
by the time we dump/restore sigacts, the core entry
is at hands already.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
HOSTCFLAGS are populated with CFLAGS if they are set from environment:
CFLAGS=-O1 make
But it turns out that =? operator, which sets variable iff it was unset
previously - is recursive expanded operator.
Which means that value of HOSTCFLAGS is evaluated every time it's used.
Which is wrong with the current flaw in Makefile:
1. it assigns HOSTCFLAGS with CFLAGS from environment as recursive
2. it assigns target-related options to CFLAGS such as -march.
3. HOSTCFLAGS are used (with the current code - here they are expanded
from CFLAGS).
Which results in target-related options supplied to host objects building,
which breaks cross-compilation.
Fix by omitting recursive expansion for HOSTCFLAGS.
Still we need to keep $(WARNINGS) and $(DEFINES) in HOSTCFLAGS.
Link: https://lists.openvz.org/pipermail/criu/2017-April/037109.html
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Reported-by: "Brinkmann, Harald" <Harald.Brinkmann@bst-international.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
DEFINES, LDARCH, VDSO in one place - visually simpler, more terse.
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Arch-specific options will be clearer without support checks.
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It's defined in NMK - don't redefine it.
Remove BTW twice exports and twice definition of $(LDARCH).
Call 64-bit ARM as aarch64.
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Let's keep the same name for 64-bit ARM platform across source.
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Let's call it aarch64 across all CRIU - as I was confused at least once
with arm64 name in NMK and this arch-support check.
Yet allowed to be named arm64, as NMK patch is separated.
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
I bumped in this myself when I had libc6-dev-i386 installed,
while criu said I didn't. Save other guys'es time, spent
in this place.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
When restoring w/o namespaces it doesn't make sence to
mount /proc by hands and detach it. We can just use the
host-side one.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>