Container runtimes like CRI-O and containerd utilize the freezer cgroup
to create a consistent snapshot of container root filesystem (rootfs)
changes. In this case, the container is frozen before invoking CRIU.
After CRIU successfully completes, a copy of the container rootfs diff
is saved, and the container is then unfrozen.
However, the `cuda-checkpoint` tool is not able to perform a 'lock'
action on frozen threads. To support GPU checkpointing with these
container runtimes, we need to unfreeze the cgroup and return it to its
original state once the checkpointing is complete.
To reflect this new behavior, the following changes are applied:
- `dont_use_freeze_cgroup(void)` -> `set_compel_interrupt_only_mode(void)`
- `bool freeze_cgroup_disabled` -> `bool compel_interrupt_only_mode`
- `check_freezer_cgroup(void)` -> `prepare_freezer_for_interrupt_only_mode(void)`
Note that when `compel_interrupt_only_mode` is set to `true`,
`compel_interrupt_task()` is used instead of `freeze_processes()`
to prevent tasks from running during `criu dump`.
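The save, thaw and restore sequence described above can be sketched in shell. This is only an illustration under stated assumptions: CRIU does this internally in C, the helper name `with_thawed_cgroup` is hypothetical, and the sketch assumes a cgroup v2 hierarchy where the freezer state is the `cgroup.freeze` file holding `0` or `1`.

```shell
#!/bin/sh
# Sketch only; not CRIU's implementation.
# Assumption: cgroup v2, where "cgroup.freeze" holds 0 (thawed) or 1 (frozen).

# with_thawed_cgroup CGROUP_DIR COMMAND [ARGS...]
# Saves the freezer state, thaws the cgroup, runs COMMAND, then puts the
# saved state back. Returns the exit status of COMMAND.
with_thawed_cgroup() {
    cg_dir=$1
    shift
    freeze_file="$cg_dir/cgroup.freeze"

    # Remember the state the container runtime left the cgroup in.
    saved_state=
    read -r saved_state < "$freeze_file" || [ -n "$saved_state" ] || return 1

    # Thaw, so that tools which need running threads can make progress.
    echo 0 > "$freeze_file" || return 1

    "$@"
    rc=$?

    # Return the cgroup to its original state.
    echo "$saved_state" > "$freeze_file" || return 1
    return "$rc"
}
```

A caller would wrap the checkpoint step, for example `with_thawed_cgroup /sys/fs/cgroup/<container> <checkpoint command>`, where both the path and the command are placeholders.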
Fixes: #2508
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This patch cleans up shell code that uses 'cat' to pipe a single file into a program.
This pattern is known as a Useless Use of Cat (UUOC).
It is more efficient to use redirection instead.
In some cases even the redirection operator '<' is unnecessary, because the program can take the file as an argument.
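As a minimal sketch of the three forms, using `grep -c` as a stand-in program (the function names are hypothetical and not taken from the patch):

```shell
#!/bin/sh
# Each function takes PATTERN and FILE and prints the number of matching lines.

# Useless use of cat: an extra process and a pipe just to feed one file.
count_with_cat() {
    cat "$2" | grep -c -- "$1"
}

# Redirection: the shell opens the file, no extra process is needed.
count_with_redirect() {
    grep -c -- "$1" < "$2"
}

# File argument: grep opens the file itself, so '<' is not needed either.
count_with_arg() {
    grep -c -- "$1" "$2"
}
```

All three print the same count for the same input; only the number of processes and pipes differs.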
Signed-off-by: KKrypt <sankalpacharya1211@gmail.com>
It looks like we get broken file handles from fdinfo
for inotify/fanotify on btrfs. I will look into that.
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Let's run zdtm in Jenkins with the --mntns-compat-mode option, and do the
same for the device-external mount test from others.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The CI logs often contain messages like:
"[WARNING] Option --keep-going is more useful when running multiple tests"
This commit removes '--keep-going' from single zdtm test runs.
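One way to avoid the warning in a wrapper script is to add the option only when more than one test is requested. The helper `zdtm_run_args` below is hypothetical and only builds the argument string:

```shell
#!/bin/sh
# zdtm_run_args TEST...
# Prints the arguments for a zdtm.py invocation. Hypothetical helper,
# not part of the CRIU test scripts.
zdtm_run_args() {
    args="run"
    for t in "$@"; do
        args="$args -t $t"
    done
    # --keep-going is only useful when several tests are run.
    if [ "$#" -gt 1 ]; then
        args="$args --keep-going"
    fi
    printf '%s\n' "$args"
}
```

The result would then be passed to the test runner, for example `./test/zdtm.py $(zdtm_run_args zdtm/static/maps04)`.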
Signed-off-by: Adrian Reber <areber@redhat.com>
Jenkins test runs are failing with:
./test/jenkins/run_ct ./test/jenkins/crit.sh
./test/jenkins/crit.sh: 3: source: not found
Switch to bash which has 'source'.
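The failure comes from `source` being a bash builtin that a plain POSIX `sh` may not provide. A small sketch of the two ways out, using a throwaway helper file whose name and content are made up for the illustration:

```shell
#!/bin/sh
# Write a tiny helper file to include (hypothetical content).
helper=$(mktemp)
echo 'GREETING=hello' > "$helper"

# 'source' is a bash builtin, so run the including script under bash ...
from_bash=$(bash -c 'source "$1"; echo "$GREETING"' _ "$helper")

# ... or use the portable '.' command, which POSIX sh also provides.
from_sh=$(sh -c '. "$1"; echo "$GREETING"' _ "$helper")

rm -f "$helper"
```

Both variables end up holding the value defined in the helper file. The fix here takes the first route and runs the script with bash.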
Signed-off-by: Adrian Reber <areber@redhat.com>
We permanently have issues like this:
./test/jenkins/criu-iter.sh: 3: source: not found
It looks like a good idea to use one shell to run our jenkins scripts.
Signed-off-by: Andrei Vagin <avagin@gmail.com>
The criu-lazy-migration.sh was copied from criu-lazy-pages.sh and was not
updated enough to actually run zdtm.py with --lazy-migrate.
Fix it.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
On loaded systems, running maps04 with lazy-pages takes too much time. The
same functionality is covered by other tests anyway, so excluding maps04
shouldn't decrease the test coverage.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
We don't tune tracers, so we don't expect to get anything.
In docker containers, debugfs isn't mounted and this command fails.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
With userfaultfd we cannot reliably service process_vm_readv calls. The
maps007 test that uses these calls passed previously by sheer luck.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Test for previously fixed bugs in vdso-trampoline insertion:
- unmapping original vvar (which went unnoticed)
- leaving rt-vvar after each C/R cycle and resulting pollution
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Even with 2 parallel jobs, maps04 takes too much time with
--remote-lazy-pages. Let's skip it for now.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Running zdtm/static/maps04 with --remote-lazy-pages in parallel with 3
other tests takes too much time on the Jenkins builder. Let's try running
with --parallel 2.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
* select excluded tests based on the kernel version
* test local and remote lazy-pages with and without pre-dump
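Selecting exclusions by kernel version could look roughly like this. The 4.11 threshold, the helper names and the excluded test names are placeholders chosen for the sketch, not the values used by the actual script; the comparison relies on `sort -V` being available.

```shell
#!/bin/sh
# version_ge A B: succeeds if version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# lazy_excludes KERNEL_VERSION
# Prints zdtm.py exclusion options. Threshold and test names are
# hypothetical placeholders.
lazy_excludes() {
    if version_ge "$1" 4.11; then
        echo "-x maps04"
    else
        echo "-x maps04 -x maps007"
    fi
}
```

A script would call it as `lazy_excludes "$(uname -r)"` and append the output to the zdtm.py command line.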
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Add pre-dump and remote-lazy-pages passes to criu-lazy-pages.sh
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Very minimalistic at the moment, no remote pages or namespaces.
Still better than nothing :)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
To check that jump trampolines to rt-vdso work.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
link-remap files have to be deleted only if processes have been
restored successfully. Otherwise we may want to repeat the attempt,
but that is impossible if some link-remap files were deleted.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Add a new option to zdtm.py to run "criu dedup" after "criu dump"
or "criu pre-dump".
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
There is a race when someone unmounts something and the operation
isn't propagated into our namespace.
CRIU                               | Another process
-----------------------------------------------------------------
pivot_root(".", put_root)          |
mount(put_root, MS_REC|MS_PRIVATE) |
                                   | umount /xxx/yyy
                                   | umount /xxx -> EBUSY
umount(put_root)                   |
We do this so as not to affect mounts in put_root, but we can mark
these mounts as slave instead, and this will work both for us and for
external users.
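In shell terms, the slave variant corresponds roughly to the sketch below. CRIU makes these calls from C; the function name and the `RUN` dry-run switch are hypothetical, and the lazy `umount -l` is an assumption, since the diagram only says umount(put_root).

```shell
#!/bin/sh
# detach_put_root PUT_ROOT
# Set RUN=echo to print the commands instead of executing them; really
# executing them needs root and a separate mount namespace.
detach_put_root() {
    put_root=$1
    # Slave instead of private: umounts done by other processes still
    # propagate into put_root, while our umount does not propagate out.
    ${RUN:-} mount --make-rslave "$put_root" || return 1
    # Assumption: detach the old root tree lazily.
    ${RUN:-} umount -l "$put_root"
}
```

With slave propagation the "umount /xxx/yyy" from the other process also removes our copy of that mount, so its later "umount /xxx" no longer fails with EBUSY.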
Reported-by: Mr Travis
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
No existing Jenkins script tests similar options,
so add a new one.
Signed-off-by: Eugene Batalov <eabatalov89@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Main test features:
- Non-trivial ps tree with non-trivial anon shmem regions
(no such test exists now).
- Each process in the ps tree continuously writes parts of anon shmem
vmas and validates these writes after restore
(required for dedup testing).
- Checks simultaneous changes to anon shmem contents from different
processes.
Signed-off-by: Eugene Batalov <eabatalov89@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
We expect:
- all 9 scripts are called
- the images dir variable is always set
- for 7 of those scripts the root-pid variable is set
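A check along these lines could be sketched as follows. The log format and both helper names are hypothetical, and the `CRTOOLS_*` variable names are an assumption about what CRIU exports to action scripts; paths containing whitespace are not handled.

```shell
#!/bin/sh
# log_action LOG
# Hypothetical action-script body: appends "action images-dir root-pid"
# to LOG, writing "-" for a variable that is unset or empty.
log_action() {
    printf '%s %s %s\n' \
        "${CRTOOLS_SCRIPT_ACTION:--}" \
        "${CRTOOLS_IMAGE_DIR:--}" \
        "${CRTOOLS_INIT_PID:--}" >> "$1"
}

# check_action_log LOG
# Succeeds only if 9 scripts were logged, all of them saw an images dir,
# and exactly 7 of them saw a root pid.
check_action_log() {
    total=0 with_dir=0 with_pid=0
    while read -r action dir pid; do
        total=$((total + 1))
        if [ "$dir" != "-" ]; then with_dir=$((with_dir + 1)); fi
        if [ "$pid" != "-" ]; then with_pid=$((with_pid + 1)); fi
    done < "$1"
    [ "$total" -eq 9 ] && [ "$with_dir" -eq 9 ] && [ "$with_pid" -eq 7 ]
}
```

The test would pass a script calling `log_action` to CRIU as the action script for dump and restore, then run `check_action_log` on the resulting file.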
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>