Will need them to mask some of the features from
command line options.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
On fedora rawhide seccomp_metadata for some
reason is not defined (while in kernel it introduced
together with PTRACE_SECCOMP_GET_METADATA). So
lets do a trick for a while -- define own alias.
Once system headers get settled down we might find
more suitable solution. Because it's a part of kernel
API we're on the safe side.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
We will use it to figure out if filter log target is used.
Metadata associated with seccomp filter is relatively new
feature which allows userspace to get and set it back.
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
For architectures like aarch64/ppc64 it's needed to propagate the size
of page inside PIEs. For the parasite page size will be defined during
seizing, and for restorer during early initialization.
Afterward we can use PAGE_SIZE in PIEs like we did before.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Ugh, I've spent 25 mins at 4 A.M. to figure out why the tests are failing.
And the reason is stupied me, who defined a new flag after 0x8
as 0x16, not as 0x10. Simplify those definitions for such simple-minded
living creatures like Dima.
Signed-off-by: Dmitry Safonov <dima@arista.com>
On Skylake processors and kernel older than v4.14
ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, iov)
may return not full xstate, ommiting FP part (that is XFEATURE_MASK_FP).
There is a patch which describes this bug:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1318800.html
Anyway, it's fixed in v4.14 kernel by (what we believe with Andrey) this:
https://patchwork.kernel.org/patch/9567939/
As we still support kernels from v3.10 and newer, we need to have a
workaround for this kernel bug on Skylake CPUs.
Big thanks to Shlomi for the reports, the effort and for providing an
Amazon VM to test this. I wish more bug reporters were like you.
Reported-by: Shlomi Matichin <shlomi@binaris.com>
Provided-test-env: Shlomi Matichin <shlomi@binaris.com>
Investigated-with: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
get_task_regs() needs to know if it needs to use workaround
for a Skylake ptrace() bug. The next patch will introduce a
new flag for that.
I also thought about making 3 versions of get_task_regs() and
adding them to ictx->get_task_regs() depending on the flags..
But get_task_regs() is a private function and infect_ctx is
a uapi.. So, let's just pass context flags to get_task_regs().
Signed-off-by: Dmitry Safonov <dima@arista.com>
As we anyway define save_regs_t for other purposes,
use it in the function declaration.
To unify infect_ctx style, add make_sigframe_t.
Mere cleanup.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Some get_status() methods may allocate data, because
not all of the fields in /proc/[pid]/status file
have the fixed size. For example, NSpid, which
size may vary.
Introduce new method free_status() in counterweight
for such type get_status() methods. it will be called
in case of we go to try_again and need to free allocated
data.
Also, introduce data parameter for a use in the future.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The goal of this function is to compare everything except caps,
but caps size is took to compare. It's wrong, there must be
used offsetof(struct proc_status_creds, cap_inh) instead.
Also, sigpnd may be different too.
v3: Move excluding sigpnd from comparation in this patch (was in another patch).
Reorder fields in seize_task_status().
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This naming is left from the first compatible kernel patches.
At that time to return to 32-bit task rt_sigreturn was used with
a special flag.
Now it's not true anymore, the naming doesn't relate.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
These helpers are valuable and can be used outside.
Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
When infecting victim we construct sigframe to
be able to self-rectore it in case if something
goes wrong. But in case is a targer been using
alternative stack for signal handling it will
be missed in sigframe since we don't fetch it.
Thus add fetching sas on infection stage and
put it into signal frame early.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
CID 73371 (#1 of 1): Big parameter passed by value (PASS_BY_VALUE)
pass_by_value: Passing parameter regs of type user_regs_struct_t
(size 224 bytes) by value.
Suggesting to do this until compel is released and API is cut in stone.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Found by Coverity error:
> CID 172193 (#1 of 1): Bad bit shift operation (BAD_SHIFT)
> 1. large_shift: In expression 1 << sig % 64, left shifting
> by more than 31 bits has undefined behavior. The shift amount,
> sig % 64, is as much as 63.
That is:
1. yes, UB
2. while adding a signal to mask, this has flushed all other
signals from mask.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This function works like printf, and it helps the compiler
to know that, so it can check whether arguments fit the
format string.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
These are part of compel UAPI so should be prefixed with COMPEL_
in order to not pollute the namespace. While at it, move from
set of defines to an enum, which looks a bit cleaner.
Also, kill LOG_UNDEF as it's not used anywhere.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Those macros look twice as long as they should be on my 80-columns
terminal. As there is nothing here to justify such width, go ahead
and remove the extra tabs, keeping the code within 80 cols.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It's now obsoleted by compel library.
Maybe-TODO: Add compel tool exec action?
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
First, TASK_* defines provided by compel should be prefixed
with COMPEL_. The complication is, same constants are also used
by CRIU, some are even writted into images (meaning we should
not change their values).
One way to solve this would be to untie compel values from CRIU ones,
using some mapping between the two sets when needed (i.e. in calls to
compel_wait_task() and compel_resume_task()).
Fortunately, we can avoid implementing this mapping by separating
the ranges used by compel and criu. With this patch, compel is using
values in range 0x01..0x7f, and criu is reusing those, plus adding
more values in range 0x80..0xff for its own purposes.
Note tha the values that are used inside images are not changed
(as, luckily, they were all used by compel).
travis-ci: success for compel uapi cleanups (rev2)
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
infect.h includes compel.h, and compel.h includes infect.h.
Surely, due to include guards it will be sorted out, but
we'd rather just include what we need.
travis-ci: success for compel uapi cleanups
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
We have ptrace defines and functions that are part of UAPI,
and we have some internal stuff not to be exposed. Split
ptrace.h into two files accordingly.
While at it, do some cleanups:
- add ptrace_ prefix to some functions and macros
- remove (duplicated) PTRACE_* defines from .c files
- rename ptrace_seccomp(), remove its duplicate
- remove unused ptrace defines
- remove unneeded (ptrace-related) includes
travis-ci: success for compel uapi cleanups
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Similar to the previous commit, there is absolutely no need
to create/remove this symlink from Makefiles, as it can be
made a constant one. Add the symlink to sources and save
a few lines in Makefiles.
travis-ci: success for More polishing for compel cli
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
There is absolutely no need to create/remove this symlink
from Makefiles, as it is constant. Just add the symlink to
sources and save a few lines in Makefiles.
travis-ci: success for More polishing for compel cli
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
I saw this line in the code
unsigned long sret = -ENOSYS;
and ended up with this patch. Note syscall(2) man page says return value
is long -- who am I to disagree?
travis-ci: success for More polishing for compel cli
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This -u option always looked wrong to me, I mean, how the hell a user
is supposed to know where the hell those headers are? It took quite
a while to figure out what to do with it, but the end result is --
this option is not needed at all and can easily be dropped.
For finding paths to includes, there is a -I compiler option,
there's no need to specify something to compel.
In fact, it should know by itself where its own headers are kept
(and emit -I... to cflags if needed), but that's another story
which is to be told when we'll decide to pack compel as a standalone
tool. For now, just add "#include <compel/compel.h>" and be done.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
They are no longer needed.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Currently, some compel internals are exposed to user API
(both C and CLI), making its usage more complicated than
it can be.
In particular, compel user have to specify a number of parameters
(names for various data) on the command line, and when in C code
assign a struc piegen_opt_t fields using the same names, without
using those identifiers anywhere else in the code.
It makes sense to hide this complexity from a user, which is what
this commit does.
First, remove the ability to specify individual names for data,
instead introducing a prefix that is prepended to all the names.
Second, generate a function %PREFIX%_setup_c_header() which does
all the needed assignments.
Third, convert users (criu/pie and compel test) to the new API.
NOTE that this patch breaks ARM, as compel hgen is not used for ARM.
This is to be fixed by a later patch in the series.
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Simply run tracee from specfied IP assuming
it's arelady have trapping instruction in
stream.
It's unsafe low-level function use with caution.
travis-ci: success for compel: A fix and new helper
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Why should we have self-unmapping code in parasite?
It looks like, we can drop this code using simple sys_unmap()
injection (like that I did for `criu exec` action and for cases where we
failed to insert parasite by some reason, but still need to unmap remotes).
It's an RFC, so just a suggestion - maybe I miss something you have in
mind - please, describe that/those things.
My motivation is:
- less code, defined commands for PIE, one BUG() less, one jump to PIE less
- I'm making one 64-bit parasite on x86 instead of two 32 and 64 bit.
It works (branch 32-one-parasite) with long-jump in the beginning to
64-bit code from 32-bit task.
On parasite curing it sig-returns from 64-bit parasite to 32-bit task,
this point we're trapping in CRIU. After that we command parasite to
unmap itself, so it long-jumps again to parasite 64-bit code, unmaps,
we caught task after sys_unmap and the task is with 64-bit CS.
We can't set 32-bit registers after this - kernel checks that
registers set is the same on PTRACE_SETREGSET:
> > static int ptrace_regset(struct task_struct *task, int req, unsigned int type,
> > struct iovec *kiov)
...
> > if (!regset || (kiov->iov_len % regset->size) != 0)
> > return -EINVAL;
So, to return again to 32-bit task I need sigreturn() again or add
long-jump with 32-bit CS.
I've disable that for 32-bit testing with (in compel_cure_remote):
- if (ctl->addr_cmd) {
+ if (ctl->addr_cmd && user_regs_native(&ctl->orig.regs)) {
And it works. It also works for native tasks, so why should we keep it?
travis-ci: success for compel: kill self-unmap in parasite
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It uses regs caller doesn't always know and is actually a
core routine under the API compel_syscall() one.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The size value should be page_size() aligned, which is
inconvenient for callers, and also differs from the bsize
only a little bit, so it's nicer to have the nr_gotpcrel
value which is anyway generated by compel hgen.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
These names are generated by compel hgen, so there's no
need in making callers know them.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The routine itself is in library, just forgot to putt the
declaration into UAPI header.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Because we build compel from toplevel directory
inclusion of "common/" doesn't cause any problem
but will in future (especially when our headers
start using it).
Thus add symlink immediately and it will be a notice
for installer that common directory in needed in uapi.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Right now we load blob into libcompel by providing values
from .h file which was generated by "compel hgen" command.
In the future we'd like to provide other ways (e.g. by
pusing mmap()-ed memory with .o file, or by .o file path),
so prepare for such future.
travis-ci: success for compel: Prepare for several ways to load blob into libcompel
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Now we have two routines one of which needs a callback for
proc parsing. This is complex, but needed by CRIU. For others
let's have a single "stop" call that would to everything.
travis-ci: success for compel: Contrinue improving library
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
CRIU keeps all registers on CoreEntry and makes sigframe from
them as well, which means anyone using the compel library
have to provide own handlers, which is inconvenient. So
now it's possible to leave this task for libcompel itself:
it will save the regs and prerare sigframe on its own.
travis-ci: success for compel: Contrinue improving library
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
CRIU sets up a child hander to get errors from tasks it
infects. For compel we'd have the same problem, so there's
a way to request for custom child handler, but compel
should provide some default by himself. And it's not clear
atm how this should look like, so here's a plain stub to
move forward.
travis-ci: success for compel: Contrinue improving library
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The original compel_prepare() also initializes the infect_ctx with
values suitable for simple usage. As a starting point the task_size
value is set.
The compel_prepare_noctx() allocates ctx-less handler that is to be
filled by the caller (CRIU).
travis-ci: success for compel: Contrinue improving library
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This one is needed only for task_size() on some arches and it is
simpler to keep this routine in compel .c rather than messing
with common/page.h installation.
https://travis-ci.org/xemul/criu/builds/177585567
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Currently we prepare a parasite socket only once and
save it in a static variable.
It's bad idea to use a static variable in a library.
In addition, it doesn't work if we have processes in
different network namespaces. In this case, we have to have
a separate socket for each namespace.
v2: fix compilation on Alpine
convert *p_sock into sock
travis-ci: success for compel: check whether a parasite socket is prepared each time (rev2)
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
The same as prev patch -- clean up the compel.h
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>