When PTRACE_GET_THREAD_AREA errors on kernels with
!CONFIG_IA32_EMULATION beacuse of missing support (-EIO), compel should
ignore uch errors in native mode.
However the check for error type uses return value of ptrace rather than
errno, which will always result in error propagation.
Use errno to detect type of error to fix this.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Parasite creation started to fail with GCC 12:
On x86_64 with:
./compel/compel-host hgen -f criu/pie/restorer.built-in.o -o criu/pie/restorer-blob.h
Error (compel/src/lib/handle-elf-host.c:337): Unexpected undefined symbol: `strlen'. External symbol in PIE?
On aarch64 with:
ld: criu/pie/restorer.o: in function `lsm_set_label':
/drone/src/criu/pie/restorer.c:174: undefined reference to `strlen'
Line 174 is: "for (len = 0; label[len]; len++)"
Adding '-ffreestanding' to parasite compilation fixes these errors
because, according to GCC developers:
"strlen is a standard C function, so I don't see any bug in that being used
unless you do a freestanding compilation (-nostdlib isn't that)."
Signed-off-by: Adrian Reber <areber@redhat.com>
This is a confusing change as it seems the original code was just wrong.
GCC 12 complains with:
In function ‘__conv_val’,
inlined from ‘std_strtoul’ at compel/plugins/std/string.c:202:7:
compel/plugins/std/string.c:154:24: error: array subscript 97 is above array bounds of ‘const char[37]’ [-Werror=array-bounds]
154 | return &conv_tab[__tolower(c)] - conv_tab;
| ^~~~~~~~~~~~~~~~~~~~~~~
compel/plugins/std/string.c: In function ‘std_strtoul’:
compel/plugins/std/string.c:10:19: note: while referencing ‘conv_tab’
10 | static const char conv_tab[] = "0123456789abcdefghijklmnopqrstuvwxyz";
| ^~~~~~~~
cc1: all warnings being treated as errors
Which sounds correct. The array conv_tab has just 37 elements.
If I understand the code correctly we are trying to convert anything
that is character between a-z and A-Z to a number for cases where
the base is larger than 10. For a base 11 conversion b|B should return 11.
For a base 35 conversion z|Z should return 35. This is all for a strtoul()
implementation.
The original code was:
static const char conv_tab[] = "0123456789abcdefghijklmnopqrstuvwxyz";
return &conv_tab[__tolower(c)] - conv_tab;
and that seems wrong. If conv_tab would have been some kind of hash it could
have worked, but '__tolower()' will always return something larger than
97 ('a') which will always overflow the array.
But maybe I just don't get that part of the code.
I replaced it with
return __tolower(c) - 'a' + 10;
which does the right thing: 'A' = 10, 'B' = 11 ... 'Z' = 35
Signed-off-by: Adrian Reber <areber@redhat.com>
Fixes: e2e8be37 ("x86/compel/fault-inject: Add a fault-injection for corrupting extended regset")
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Since
e2e8be37 ("x86/compel/fault-inject: Add a fault-injection for corrupting extended regset")
we doing fault-injection test for C/R of threads register set by filling tasks
xsave structures with the garbage. But there are some features for which that's not
safe. It leads to failures like described in #1635
In this particular case we meet the problem with PKRU feature, the problem
that after corrupting pkru registers we may restrict access to some vma areas,
so, after that process with the parasite injected get's segfault and crashes.
Let's manually specify which features is save to fill with the garbage by
keeping proper XFEATURE_MASK_FAULTINJ mask value.
Fixes: e2e8be37 ("x86/compel/fault-inject: Add a fault-injection for corrupting extended regset")
https://github.com/checkpoint-restore/criu/issues/1635
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
The mips64el-cross test target started to show following error:
error: listing the stack pointer register '$29' in a clobber list is deprecated [-Werror=deprecated]
This fixes it in three different places by removing $29' from the
clobber list. This is only compile tested as we have no mips hardware
for testing.
Signed-off-by: Adrian Reber <areber@redhat.com>
pidfd_getfd syscall will be needed later to send pidfds between
pre-dump/dump iterations for pid reuse detection.
v2:
- check size written/read of val_a/val_b is correct
- return with error when val_a != val_b
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
pidfd_open syscall will be needed later to send pidfds between
pre-dump/dump iterations for pid reuse detection.
v2:
- make kerndat_has_pidfd_open void since 0 is always returned
- fix missing tabs in syscall tables
Signed-off-by: Zeyad Yasser <zeyady98@gmail.com>
This is the workaround for #1429.
The parasite code contains instructions that trigger SIGTRAP to stop at
certain points. In such cases, the kernel sends a force SIGTRAP that
can't be ignore and if it is blocked, the kernel resets its signal
handler to a default one and unblocks it. It means that if we want to
save the origin signal handle
Signed-off-by: Andrei Vagin <avagin@gmail.com>
CRIU follows Linux kernel coding style. This patch updates the
architecture-specific code for MIPS to use tab indentation,
add whitespace between closing parenthesis and open bracket,
and changes the mode of source files from 755 to 644.
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
My editor (vim) auto-removes whitespace at EOL for *.c and *.h files,
and I think it makes sense to have a separate commit for this, rather
than littering other commits with such changes.
To make sure this won't pile up again, add a line to Makefile under
the linter target to check for such things (so CI will fail).
This is all whitespace except an addition to Makefile.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Another pr_perror spring cleaning time!
As pr_perror adds a semicolon, an strerror(errno), and a newline,
there's no need to add one manually.
Brought to you by
for f in $(git grep -l pr_perror); do
test -f $f || continue
echo $f
sed -i '\%^[[:space:]]*pr_perror(.*\\n"%s/\\n//' $f
done
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This reverts commit c98af78c58e2168d2322cd0ee15837468fd4ffb0.
Now FPU/SSE/MMX/etc can be used inside parasite.
Let's have compiler optimizations back.
Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Arch-dependend way to restore extended registers set.
Use it straight-away to restore per-thread registers.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Extended registers set for task is restored with rt_sigreturn() through
prepared sigframe. For threads it's currently lost.
Preserve it inside thread context to restore on thread curing.
Signed-off-by: Dmitry Safonov <dima@arista.com>
With pseudo-random garbage, the seed is printed with pr_err().
get_task_regs() is called during seizing the task and also for each
thread.
At this moment only for x86.
Signed-off-by: Dmitry Safonov <dima@arista.com>
To minimize things done in parasite, PTRACE_GET_THREAD_AREA can be
used to get remote tls. That also removes an additional compat stack
(de)allocation in the parasite (also asm-coded syscall).
In order to use PTRACE_GET_THREAD_AREA, the dumpee should be stopped.
So, let's move this from criu to compel to non-seized state and put tls
into thread info on x86.
Signed-off-by: Dmitry Safonov <dima@arista.com>
musl defines 'loff_t' in fcntl.h as 'off_t'.
This patch resolves the following error when running the compel tests
on Alpine Linux:
gcc -O2 -g -Wall -Werror -c -Wstrict-prototypes -fno-stack-protector -nostdlib -fomit-frame-pointer -ffreestanding -fpie -I ../../../compel/include/uapi -o parasite.o parasite.c
In file included from ../../../compel/include/uapi/compel/plugins/std/syscall.h:8,
from ../../../compel/include/uapi/compel/plugins/std.h:5,
from parasite.c:3:
../../../compel/include/uapi/compel/plugins/std/syscall-64.h:19:66: error: unknown type name 'loff_t'; did you mean 'off_t'?
19 | extern long sys_pread (unsigned int fd, char *buf, size_t count, loff_t pos) ;
| ^~~~~~
| off_t
../../../compel/include/uapi/compel/plugins/std/syscall-64.h:96:46: error: unknown type name 'loff_t'; did you mean 'off_t'?
96 | extern long sys_fallocate (int fd, int mode, loff_t offset, loff_t len) ;
| ^~~~~~
| off_t
../../../compel/include/uapi/compel/plugins/std/syscall-64.h:96:61: error: unknown type name 'loff_t'; did you mean 'off_t'?
96 | extern long sys_fallocate (int fd, int mode, loff_t offset, loff_t len) ;
| ^~~~~~
| off_t
make[1]: *** [Makefile:32: parasite.o] Error 1
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
These changes enable running all compel tests with a single
command from the root path of the repository:
# sudo make -C compel/test
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
1. The -ERESTART_RESTARTBLOCK case in get_task_regs() depends on kernel
internals too much, and for no reason. We shouldn't rely on fact that
a) we are going to do sigreturn() and b) restore_sigcontext() always
sets restart_block->fn = do_no_restart_syscall which returns -EINTR.
Just change this code to enforce -EINTR after restore, this is what
we actually want until we teach criu to handle ERESTART_RESTARTBLOCK.
2. Add pr_warn() to make the potential bug-reports more understandable,
a sane application should handle -EINTR correctly but this is not
always the case.
Fixes: #1325
Report-by: Mr Travis
Inspired-by: dd71cca58ada ("dump/x86: sanitize the ERESTART_RESTARTBLOCK -> EINTR transition")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
We've seen addresses in parasite.built-in.o precalculated by linker but
in some unexpected manner:
readelf -WS criu/pie/parasite.built-in.o
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 1] .text PROGBITS 0000000000000000 000040 00400a 00 AX 0 0 16
[87] .data PROGBITS 0000000000000000 005000 000068 00 WA 0 0 4096
[88] .rodata PROGBITS 0000000000000080 005080 001016 00 A 0 0 32
(Notes: All other sections does not have SHF_ALLOC or are of size 0, so I
skip them. Need to add "-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1" to
CFLAGS to reproduce.)
Section 88 has address 0x80 in elf file but in our "consequent"
addresses precalculation algorithm it should be at 0x5080:
addr(.text) == 0x0
addr(.data) == 0x400a + (0x1000 - 0x400a % 0x1000) + 0x68 == 0x5068
addr(.rodata) == 0x5068 + (0x20 - 0x5068 % 0x20) == 0x5080
Probably the linker advises us to move 4096 aligned section to the
beginning to save some space, but it's just a guess.
So probably we should be ready to "non-consequent" alignments
precalculated and just override them.
https://github.com/checkpoint-restore/criu/issues/1301
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
%u is quite common and I remember there were workarounds to print
(unsigned long) as long or whatever.
Just support it from now - it's not hard and not much code.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Currently if the specifier can't be parsed - error message is printed
and parsing of the format string continues. That's wrong as the argument
for the specifier will be used for the next specifier. I.e:
pr_info("[%zu]`%s`\n", 0UL, "")
will crash PIE because %u is not known and the argument (0UL) will be
used for dereferencing string for %s.
Stop parsing printf position arguments at an unknown specifier.
Make this string visible so that `grep Error` in zdtm.py will catch it:
=[log]=> dump/zdtm/static/busyloop00/52/1/restore.log
------------------------ grep Error ------------------------
b'(00.001847) pie: 52: vdso: ['
b'Error: Unknown printf format %u'
------------------------ ERROR OVER ------------------------
Send the 15 signal to 52
Wait for zdtm/static/busyloop00(52) to die for 0.100000
======================= Test zdtm/static/busyloop00 PASS =======================
Reported-by: @ashwani29
Signed-off-by: Dmitry Safonov <dima@arista.com>
Using scan-build there is a warning about
infect.c:231:17: warning: The left operand of '!=' is a garbage value
if (ss->state != 'Z') {
which is a false positive as every process will have a 'Status' field,
but initializing the structure makes the clang analyzer silent.
Signed-off-by: Adrian Reber <areber@redhat.com>
Some kernels have W^X mitigation, which means they won't execute memory
blocks if that memory block is also writable or ever was writable. This
patch enables CRIU to run on such kernels.
1. Align .data section to a page.
2. mmap a memory block for parasite as RX.
3. mprotect everything after .text as RW.
Signed-off-by: Michał Cłapiński <mclapinski@google.com>
GNU ld precalculates this information but lld does not. With this
change, handle-elf.c calculates those addresses on its own.
When calculating addresses sections with SHF_ALLOC bit are put one after
another, respecting their alignment requirements. This matches the way
how the blob is constructed by copying section contents.
Signed-off-by: Wojciech Marczenko <marczenko@google.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Previously, the GOT table was using the same memory location as the
args region, leading to difficult to debug memory corruption bugs.
We allocate the GOT table between the parasite blob and the args region.
The reason this is a good placement is:
1) Putting it after the args region is possible but a bit combersome as
the args region has a variable size
2) The cr-restore.c code maps the parasite code without the args region,
as it does not do RPC.
Another option is to rely on the linker to generate a GOT section, but I
failed to do so despite my best attempts.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
We don't need to push 0 on the stack. This seems to be a remnant of the
initial commit of 2011 that had a `pushq $0`.
The 16 bytes %rsp alignment was added with commit 2a0cea29 in 2012.
This is no longer necessary as we already guarantee that %rsp is 16
bytes aligned. A BUG_ON() is added to enforce this guarantee.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Previously, __export_parasite_cmd was located in parasite-head.S, and
__export_parasite_args located exactly at the end of the parasite blob.
This is not ideal for various reasons:
1) These two variables work together. It would be preferrable to have
them in the same location
2) This prevent us from allocating another section betweeen the parasite
blob and the args area. We'll need this to allocate a GOT table
This commit changes the allocation of these symbols from assembly/linker
script to a C file.
Moreover, the assembly entry points that invoke parasite_service()
prepares arguments with hand crafted assembly. This is unecessary.
This commit rewrite this logic with regular C code.
Note: if it wasn't for the x86 compat mode, we could remove all
parasite-head.S files and directly jump to parasite_service() via
ptrace. An int3 architecture specific equivalent could be called at the
end of parasite_service() with an inline asm statement.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
It is unnecessary and potentially confusing for understanding the memory
layout requirement of the parasite blob.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
It removes the potential confusion when it comes to virtual address vs
offsets. Further, doing so makes naming more consistent with the rest of
the parasite_blob_desc struct.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
compel_relocs_apply() was taking arguments mostly from the struct
parasite_blob_desc. Instead of passing all the arguments, we pass a
pointer to the struct itself.
This makes the code safer, as cr-restore.c calls compel_relocs_apply().
It previously needed to poke into what can be considered private
variables of the restorer-pie.h file.
To allow the parasite_blob_desc struct to be populated without a
parasite_ctl struct, we expand the compel API to export a
parasite_setup_c_header_desc() in the generated pie.h.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
COMMON symbols are emitted for global variable that are not initialized
in shared libraries.
They typically end up in the .bss section when linking an executable.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
We should follow Linux Kernel Codding Style:
... the closing brace is empty on a line of its own, except in the cases
where it is followed by a continuation of the same statement, ie ... an
else in an if-statement ...
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#placing-braces-and-spaces
Automaticly fixing with:
:!git grep --files-with-matches "^\s*else[^{]*{" | xargs
:argadd <files>
:argdo :%s/}\s*\n\s*\(else[^{]*{\)/} \1/g | update
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The file only includes other headers (which may be not needed).
If we aim for one-include-for-compel, we could instead paste all
subheaders into "compel.h".
Rather, I think it's worth to migrate to more fine-grained compel
headers than follow the strategy 'one header to rule them all'.
Further, the header creates problems for cross-compilation: it's
included in files, those are used by host-compel. Which rightfully
confuses compiler/linker as host's definitions for fpu regs/other
platform details get drained into host's compel.
Signed-off-by: Dmitry Safonov <dima@arista.com>
This patch fixes the problem with SSE (xmm) registers corruption on amd64
architecture. The problem was that gcc generates parasite blob that uses
xmm registers, but we don't preserve this registers in CRIU when injecting
parasite. Also, gcc, even with -nostdlib option uses builtin memcpy,
memset functions that optimized for amd64 and involves SSE registers.
It seems, that optimal solution is to use -ffreestanding gcc option
to compile parasite. This option implies -fno-builtin and also it designed
for OS kernels compilation/another code that suited to work on non-hosted
environments and could prevent future sumilar bugs.
To check that you amd64 CRIU build affected by this problem you could simply
objdump -dS criu/pie/parasite.o | grep xmm
Output should be empty.
Reported-by: Diyu Zhou <zhoudiyupku at gmail.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Just like on all other supported architectures gcc complains about the
stack pointer register being part of the clobber list:
error: listing the stack pointer register ‘15’ in a clobber list is deprecated [-Werror=deprecated]
This removes the stack pointer from the clobber list.
'zdtm.py run -a' still runs without any errors after this change.
Signed-off-by: Adrian Reber <areber@redhat.com>