This option was introduced with:
e2c38245c6
v2: (comment from Pavel Tikhomirov) --enable-fs does not fit with
--external dev[]:, see try_resolve_ext_mount, external dev mounts
only determined for FSTYPE__UNSUPPORTED.
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
In recent kernels the userfaultfd support for FORK events is limited to
CAP_SYS_PTRACE. That causes the following error when the ioctl(UFFDIO_API)
is executed from a non-privileged userns:
Error (criu/uffd.c:273): uffd: Failed to get uffd API: Operation not permitted
Wrapping the call to ioctl(UFFDIO_API) in userns_call() resolves the issue.
Fixes: #964
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
gcc8 in Fedora Rawhide has a new useful warning:
> criu/img-remote.c: In function 'push_snapshot_id':
> criu/img-remote.c:1099:2: error: 'strncpy' specified bound 4096 equals destination size [-Werror=stringop-truncation]
> 1099 | strncpy(rn.snapshot_id, snapshot_id, PATH_MAX);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From man 3 strncpy:
> Warning: If there is no null byte among the first n bytes of src,
> the string placed in dest will not be null-terminated.
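A minimal sketch of one common way to address this class of warning,
not necessarily the exact change made here: copy at most size - 1 bytes
and terminate explicitly. The helper name copy_terminated() is
hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (dst_size bytes in total) and always leave dst
 * null-terminated. Returns 1 if src had to be truncated, 0 otherwise.
 * copy_terminated() is an illustrative helper, not a CRIU function.
 */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0)
		return 1;

	/* Leave room for the terminator instead of filling the whole buffer. */
	strncpy(dst, src, dst_size - 1);
	dst[dst_size - 1] = '\0';

	return strlen(src) >= dst_size;
}
```

With a PATH_MAX sized destination this would be called as
copy_terminated(rn.snapshot_id, PATH_MAX, snapshot_id).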
Signed-off-by: Dmitry Safonov <dima@arista.com>
Testing for all the memfd features, namely support for CR of:
* the same fd shared by multiple processes
* the same file shared by multiple processes
* the memfd content
* file flags and fd flags
* mmaps, MAP_SHARED and MAP_PRIVATE
* seals, excluding F_SEAL_FUTURE_WRITE because this feature only exists
in recent kernels (5.1 and up)
* inherited fd
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
File pairs naturally block on read() until the write() happens (or the
writer is closed). This is not the case for regular files, so we
take extra precautions for these.
Also cleaned up an extra my_file.close().
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
See "man fcntl" for more information about seals.
memfds are currently the only files that can be sealed. For this
reason, we dump the seal values in the MEMFD_INODE image.
Restoring seals must be done carefully as the seal F_SEAL_FUTURE_WRITE
prevents future write access. This means that any memory mapping with
write access must be restored before restoring the seals.
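The ordering constraint can be sketched as follows. This is an
illustrative stand-alone program fragment, not the restore code itself:
the function name is hypothetical, the fallback constants mirror the
kernel uapi values for older headers, and the raw memfd_create syscall
is used only to avoid depending on a recent glibc.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older headers (values from the kernel uapi). */
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS
#define F_ADD_SEALS 1033
#define F_GET_SEALS 1034
#endif
#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010
#endif

/*
 * Hypothetical helper: returns 0 on success, 1 if memfd or the seal is
 * not available on this kernel, -1 on any other error.
 */
int restore_mapping_then_seal(void)
{
	const size_t len = 4096;
	int fd, ret = -1;
	char *map;

	fd = (int)syscall(SYS_memfd_create, "seal-order-sketch", MFD_ALLOW_SEALING);
	if (fd < 0)
		return 1;
	if (ftruncate(fd, len) < 0)
		goto out;

	/* Step 1: recreate the writable mapping while writes are still allowed. */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	/* Step 2: only now re-apply the seal. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE) < 0) {
		ret = (errno == EINVAL) ? 1 : -1;
		goto unmap;
	}

	/* The mapping created before sealing is still writable. */
	map[0] = 'x';
	ret = (fcntl(fd, F_GET_SEALS) & F_SEAL_FUTURE_WRITE) ? 0 : -1;
unmap:
	munmap(map, len);
out:
	close(fd);
	return ret;
}
```

Swapping steps 1 and 2 would make the mmap() with PROT_WRITE fail, which
is why the seals have to come last.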
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
* During checkpoint, we add a vma flag, VMA_AREA_MEMFD, to denote memfd
regions.
* Even though memfd is backed by the shmem device, we use the file
semantics of memfd (via /proc/map_files/<vma>) which we already have
support for.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Upon file restore, inherited_fd() is called to check for a user-defined
inherit-fd override. Note that the MEMFD_INODE image is read at each
invocation (the memfd name is not cached).
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
See "man memfd_create" for more information about what memfd is.
This adds support for memfd open files that are not memory mapped.
* We add a new kind of file: MEMFD.
* We add two image types MEMFD_FILE, and MEMFD_INODE.
MEMFD_FILE contains usual file information (e.g., position).
MEMFD_INODE contains the memfd name, and a shmid identifier
referring to the content.
* We reuse the shmem facilities for dumping memfd content as it
would be easier to support incremental checkpoints in the future.
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Podman changed the output of 'podman ps'. The test only cares about
running containers. Adding the filter '-f status=running' returns only
running containers, as before.
Signed-off-by: Adrian Reber <areber@redhat.com>
Apparently, C/R is broken when CONFIG_VDSO is not set.
Probably, I've broken it while adding arm vdso support.
Or maybe some commits after.
Repair it by adding checks into vdso_init_dump(), vdso_init_restore().
Also, don't try handling vDSO in the restorer if it wasn't present in
the parent. And avoid adding VDSO_BAD_SIZE to {vdso,vvar}_rt_size.
Reported-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
In the current version, the offsets of remapping vvar and vdso regions
are mixed up.
If vdso is before vvar, vvar has to be mapped with the vdso_size offset.
If vvar is before vdso, vdso has to be mapped with the vvar_size offset.
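The rule above can be written down as a small address computation. This
is only a sketch of the arithmetic: struct rt_placement and
place_vdso_vvar() are hypothetical names, not the restorer's own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: where each region ends up after the remap. */
struct rt_placement {
	unsigned long vdso_addr;
	unsigned long vvar_addr;
};

/*
 * Place vdso and vvar back to back starting at base. Whichever region
 * comes second is shifted by the size of the region that comes first.
 */
struct rt_placement place_vdso_vvar(unsigned long base, size_t vdso_size,
				    size_t vvar_size, bool vdso_first)
{
	struct rt_placement p;

	if (vdso_first) {
		p.vdso_addr = base;
		p.vvar_addr = base + vdso_size; /* vvar uses the vdso_size offset */
	} else {
		p.vvar_addr = base;
		p.vdso_addr = base + vvar_size; /* vdso uses the vvar_size offset */
	}
	return p;
}
```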
Signed-off-by: Andrei Vagin <avagin@gmail.com>
With gcc-10 (and gcc-9 with -fno-common) the build fails with:
```
ld: criu/arch/x86/crtools.o:criu/include/cr_options.h:159:
multiple definition of `rpc_cfg_file'; criu/arch/x86/cpu.o:criu/include/cr_options.h:159: first defined here
make[2]: *** [scripts/nmk/scripts/build.mk:164: criu/arch/x86/crtools.built-in.o] Error 1
```
gcc-10 will change the default from -fcommon to -fno-common:
https://gcc.gnu.org/PR85678.
The error also happens if CFLAGS=-fno-common is passed explicitly.
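A sketch of the usual remedy for this kind of failure, shown in a single
file for brevity and not necessarily identical to the change made here:
the header keeps only an extern declaration, and exactly one .c file
provides the definition. The accessor rpc_cfg_file_is_set() is
hypothetical.

```c
#include <stddef.h>

/* What the shared header would carry: a declaration only, so including
 * it from many .c files no longer creates many definitions. */
extern char *rpc_cfg_file;

/* What exactly one .c file would carry: the single definition. */
char *rpc_cfg_file = NULL;

/* Illustrative accessor, only here to exercise the variable. */
int rpc_cfg_file_is_set(void)
{
	return rpc_cfg_file != NULL;
}
```

With -fcommon the tentative definitions in each object file were merged
silently; with -fno-common each one is a real definition and the linker
rejects the duplicates.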
Reported-by: Toralf Förster
Bug: https://bugs.gentoo.org/707942
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Commit 0493724c8e added support for using asciidoctor
(instead of asciidoc + xmlto) to generate man pages.
For some reason, asciidoctor does not deal well with some
complex formatting that we use for options such as --external,
leading to literal ’ and ' appearing in the man page instead
of italic formatting. For example:
> --inherit-fd fd[’N']:’resource'
(here both N and resource should be in italic).
Asciidoctor documentation (asciidoctor --help syntax) says:
> == Text Formatting
>
> .Constrained (applied at word boundaries)
> *strong importance* (aka bold)
> _stress emphasis_ (aka italic)
> `monospaced` (aka typewriter text)
> "`double`" and '`single`' typographic quotes
> +passthrough text+ (substitutions disabled)
> `+literal text+` (monospaced with substitutions disabled)
>
> .Unconstrained (applied anywhere)
> **C**reate+**R**ead+**U**pdate+**D**elete
> fan__freakin__tastic
> ``mono``culture
so I had to carefully replace *bold* with **bold** and
'italic' with __italic__ to make it all work.
Tested with both terminal and postscript output, with both
asciidoctor and asciidoc+xmlto.
TODO: figure out how to fix examples (literal multi-line text),
since asciidoctor does not display it in monospaced font (this
is only true for postscript/pdf output so low priority).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Add a/the articles where I see them missing
2. s/Forbid/disable/
3. s/crit/crit(1)/ as we're referring to a man page
4. Simplify some descriptions
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
If asciidoc is installed and xmlto is not, make returns an error
but no diagnostics are shown, since "xmlto: command not found"
goes to /dev/null.
Remove the redirect.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This patch fixes corruption of the SSE (xmm) registers on the amd64
architecture. gcc generates a parasite blob that uses xmm registers, but
CRIU does not preserve these registers when injecting the parasite. In
addition, even with -nostdlib gcc uses its builtin memcpy and memset,
which are optimized for amd64 and involve SSE registers.
The best solution seems to be compiling the parasite with the
-ffreestanding gcc option. It implies -fno-builtin and is designed for
OS kernels and other code that runs in non-hosted environments, so it
could also prevent similar bugs in the future.
To check whether your amd64 CRIU build is affected by this problem, run:
objdump -dS criu/pie/parasite.o | grep xmm
The output should be empty.
Reported-by: Diyu Zhou <zhoudiyupku at gmail.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
This fixes the validation errors from Travis:
Build config validation
root: deprecated key sudo (The key `sudo` has no effect anymore.)
root: missing os, using the default linux
root: key matrix is an alias for jobs, using jobs
Signed-off-by: Adrian Reber <areber@redhat.com>
This is the last architecture specific change to make CRIU use clone3()
with set_tid if available. Just as on all other architectures this adds
a clone3() based assembler wrapper to be used in the restorer code.
Tested on Fedora 31 with the same 5.5.0-rc6 kernel as on the other
architectures.
Signed-off-by: Adrian Reber <areber@redhat.com>
clone3() explicitly blocks setting an exit_signal if CLONE_PARENT is
specified. With clone() this did not work either, but no error was
reported; the exit signal of the thread group leader is used instead.
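A minimal sketch of how a caller might pick the exit_signal value under
this rule. The helper name is hypothetical, SIGCHLD is assumed as the
default exit signal, and the fallback define only mirrors the kernel's
CLONE_PARENT value for builds without <sched.h>.

```c
#include <signal.h>
#include <stdint.h>

#ifndef CLONE_PARENT
#define CLONE_PARENT 0x00008000
#endif

/*
 * Hypothetical helper: value to put into clone_args.exit_signal.
 * clone3() rejects a non-zero exit_signal together with CLONE_PARENT,
 * so it is left at 0 in that case.
 */
uint64_t clone3_exit_signal(uint64_t clone_flags)
{
	return (clone_flags & CLONE_PARENT) ? 0 : SIGCHLD;
}
```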
Signed-off-by: Adrian Reber <areber@redhat.com>
Just like on all other supported architectures gcc complains about the
stack pointer register being part of the clobber list. This removes the
stack pointer from the clobber list.
Signed-off-by: Adrian Reber <areber@redhat.com>
This adds the parasite clone3() with set_tid wrapper for s390x.
In contrast to the x86_64 implementation the thread start address and
arguments are not put on the thread stack but passed via r4 and r5. As
those registers are caller-saved they still contain the correct value
(thread start address and arguments) after returning from the syscall.
Tested on 5.5.0-rc6.
Signed-off-by: Adrian Reber <areber@redhat.com>
Just like on all other supported architectures gcc complains about the
stack pointer register being part of the clobber list:
error: listing the stack pointer register ‘15’ in a clobber list is deprecated [-Werror=deprecated]
This removes the stack pointer from the clobber list.
'zdtm.py run -a' still runs without any errors after this change.
Signed-off-by: Adrian Reber <areber@redhat.com>
With clone3() with set_tid, introduced in Linux kernel 5.4, it is no
longer necessary to write to /proc/../ns_last_pid to influence the
next PID number. clone3() can directly select a PID for the newly
created process/thread.
After checking for the availability of clone3() with set_tid and adding
the assembler wrapper for clone3() in previous patches, this extends
criu/pie/restorer.c and criu/clone-noasan.c to use the newly added
assembler clone3() wrapper to create processes with a certain PID.
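For illustration, requesting a specific PID through clone3() could look
roughly like the sketch below. It uses the generic syscall() entry point
rather than the assembler wrapper from this series, the struct is a
local mirror of the first 80 bytes of the kernel's struct clone_args,
and all names are hypothetical. The caller typically needs
CAP_SYS_ADMIN in the PID namespace for set_tid to be accepted.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_clone3
#define __NR_clone3 435
#endif

/* Local mirror of struct clone_args up to and including set_tid_size. */
struct clone3_args_sketch {
	uint64_t flags;
	uint64_t pidfd;
	uint64_t child_tid;
	uint64_t parent_tid;
	uint64_t exit_signal;
	uint64_t stack;
	uint64_t stack_size;
	uint64_t tls;
	uint64_t set_tid;      /* pointer to an array of pid_t */
	uint64_t set_tid_size; /* number of elements in that array */
};

/* Fill the arguments for a fork-like clone3() that asks for *wanted_pid. */
void fill_clone3_args(struct clone3_args_sketch *args, const pid_t *wanted_pid)
{
	memset(args, 0, sizeof(*args));
	args->exit_signal = SIGCHLD;
	args->set_tid = (uint64_t)(uintptr_t)wanted_pid;
	args->set_tid_size = 1;
}

/* Returns the child's PID in the parent, 0 in the child, -1 on error. */
long clone3_with_pid(const pid_t *wanted_pid)
{
	struct clone3_args_sketch args;

	fill_clone3_args(&args, wanted_pid);
	return syscall(__NR_clone3, &args, sizeof(args));
}
```

Compared with the ns_last_pid approach there is no window between
writing the file and creating the task in which another process could
take the PID.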
This is an RFC and WIP, but I wanted to share it and run it through CI
for feedback. As the CI will probably not use a 5.4 based kernel it
should just keep on working as before.
Signed-off-by: Adrian Reber <areber@redhat.com>
To create a new process/thread with a certain PID based on clone3() a
new assembler wrapper is necessary as there is no glibc wrapper (yet).
Signed-off-by: Adrian Reber <areber@redhat.com>
Linux kernel 5.4 extends clone3() with set_tid to allow processes to
specify the PID of a newly created process. This introduces detection
of the clone3() syscall and of whether set_tid is supported.
This first implementation is X86_64 only.
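One possible way to probe for this without creating a process is
sketched below. It is a heuristic under stated assumptions and not the
detection code added here: the call passes a set_tid_size of 1 with a
NULL set_tid pointer, which a kernel that knows set_tid is expected to
reject with EINVAL, while older kernels are expected to fail with ENOSYS
or E2BIG. The struct and function names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __NR_clone3
#define __NR_clone3 435
#endif

/* Local mirror of struct clone_args up to and including set_tid_size. */
struct clone3_probe_args {
	uint64_t flags;
	uint64_t pidfd;
	uint64_t child_tid;
	uint64_t parent_tid;
	uint64_t exit_signal;
	uint64_t stack;
	uint64_t stack_size;
	uint64_t tls;
	uint64_t set_tid;
	uint64_t set_tid_size;
};

/* Returns 1 if clone3() with set_tid looks supported, 0 otherwise. */
int probe_clone3_set_tid(void)
{
	struct clone3_probe_args args;
	long ret;

	memset(&args, 0, sizeof(args));
	args.exit_signal = SIGCHLD;
	/* Deliberately inconsistent: a size is given but the pointer is NULL. */
	args.set_tid_size = 1;

	ret = syscall(__NR_clone3, &args, sizeof(args));
	if (ret == 0)
		_exit(0); /* unexpected child: leave immediately */
	if (ret > 0) {
		/* Unexpected success: reap the child and report "unknown". */
		waitpid((pid_t)ret, NULL, 0);
		return 0;
	}
	return errno == EINVAL;
}
```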
Signed-off-by: Adrian Reber <areber@redhat.com>
We are running each podman test loop 50 times. This takes more than 20
minutes in Travis. Reduce both test loops to only run 20 times.
Signed-off-by: Adrian Reber <areber@redhat.com>
To keep the runtime environment consistent, processes within a
container need to see the same start time values across suspend/resume
cycles. We introduce a new field in the core image structure to
store the start time of a dumped process. Later the same value would be
restored to a newly created task. The feature is likely to be pulled
here in the future, so we reserve the field id in the protobuf descriptor.
Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
Compiling 'criu-dev' on Fedora 31 gives two errors about wrong clobber
lists:
compel/include/uapi/compel/asm/sigframe.h:47:9: error: listing the stack pointer register ‘1’ in a clobber list is deprecated [-Werror=deprecated]
criu/arch/ppc64/include/asm/restore.h:14:2: error: listing the stack pointer register ‘1’ in a clobber list is deprecated [-Werror=deprecated]
There was also a bug report from Debian that CRIU does not build because
of this.
Each of these errors comes with the following note:
note: the value of the stack pointer after an ‘asm’ statement must be the same as it was before the statement
As far as I understand it, this should not be a problem in these cases,
as the code never returns anyway.
Running zdtm very rarely fails in 'zdtm/static/cgroup_ifpriomap'
with a double free or corruption. This happens rarely, and I cannot
verify whether it also happens without this patch, as CRIU does not
build without it.
Signed-off-by: Adrian Reber <areber@redhat.com>