2
0
mirror of https://github.com/checkpoint-restore/criu synced 2025-09-02 07:15:31 +00:00

seize: enable support for frozen containers

Container runtimes like CRI-O and containerd utilize the freezer cgroup
to create a consistent snapshot of container root filesystem (rootfs)
changes. In this case, the container is frozen before invoking CRIU.
After CRIU successfully completes, a copy of the container rootfs diff
is saved, and the container is then unfrozen.

However, the `cuda-checkpoint` tool is not able to perform a 'lock'
action on frozen threads.  To support GPU checkpointing with these
container runtimes, we need to unfreeze the cgroup and return it to its
original state once the checkpointing is complete.

To reflect this new behavior, the following changes are applied:
 - `dont_use_freeze_cgroup(void)` -> `set_compel_interrupt_only_mode(void)`
 - `bool freeze_cgroup_disabled` -> `bool compel_interrupt_only_mode`
 - `check_freezer_cgroup(void)` -> `prepare_freezer_for_interrupt_only_mode(void)`

Note that when `compel_interrupt_only_mode` is set to `true`,
`compel_interrupt_task()` is used instead of `freeze_processes()`
to prevent tasks from running during `criu dump`.

Fixes: #2508

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
This commit is contained in:
Radostin Stoyanov
2024-11-04 19:57:30 +00:00
committed by Andrei Vagin
parent 216d804aab
commit f8f0e1df76
6 changed files with 32 additions and 26 deletions

View File

@@ -509,7 +509,7 @@ int cuda_plugin_init(int stage)
INIT_LIST_HEAD(&cuda_pids);
}
dont_use_freeze_cgroup();
set_compel_interrupt_only_mode();
return 0;
}