After the commit that walks /proc/self/fd/N path instead of the temporary
one, the add_cgroup() started trimming first several bytes from the cgroup
path.
Test passed, since all cgroups were left as is after dump, so criu restore
didn't recreate them but got EEXIST on all mkdir-s.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
A mount point, which is mounted by someone else, may be umounted in
any moment.
For example the test system executes tests concurrently and sometimes
one test looks up a mount point, which has been mounted by another test.
==================================== ERROR ====================================
Test: zdtm/live/static/inotify00, Namespace: 1
Dump log : /var/lib/jenkins/jobs/CRIU-dump/workspace/test/dump/inotify00/15535/1/dump.log
--------------------------------- grep Error ---------------------------------
(00.021951) Error (cgroup.c:409): cg: failed walking /var/lib/jenkins/jobs/CRIU-dump/workspace/test/dump/signalfd00/15538/1/.criu.cgmounts.UGj28v/ for empty cgroups
(00.021967) Error (cr-dump.c:1601): Dump core (pid: 15535) failed with -1
(00.025509) Error (cr-dump.c:1914): Dumping FAILED.
------------------------------------- END -------------------------------------
================================= ERROR OVER =================================
In the previous patch I suggested to open a mount point, but it brought
other problems. We may open a directory where a cgroup mount has been
umounted and an owner will get EBUSY on attempt to remove this
directory.
Reported-by: Jenkins Criuovich
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have two bugs actually.
First, the check for 'item == root_item' in dump_task_cgroup fires
twice: first when we rite inventory (item == NULL as argument and
root_item == NULL because we haven't yet collected tasks) and the
2nd time when we dump the root task itself.
The 2nd issue sits in dump_cgroups() -- if root_cgset == criu_cgset
we don't write cgroups information at all (checking that we don't
have them with list_is_singular() inside that if). That said, we
don't need to read the cgroups tree if we're not going to dump it.
This patch fixes both.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
Before we would not detect the mount point for co-mounted controllers. Things
still worked because we'd just re-mount them ourselves and traverse our own
mount point, but this saves an extra mount().
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The path in cc->path here always has a "/" prefix since it comes from
/proc/$pid/cgroup. The additional / confuses the string slinging because ftw()
normalizes paths to have one "/" when we start traversing a subdirectory.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have to many controllers names in cgroup.c file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
During the dump phase, /proc/cgroups is parsed to find co-mounted cgroups.
Then, for each task /proc/self/cgroup is parsed for the cgroups that it is a
member of, and that cgroup is traversed to find any child cgroups which may
also need restoring. Any cgroups not currently mounted will be temporarily
mounted and traversed. All of this information is persisted along with the
original cg_sets, which indicate which cgroups a task is a member of.
On restore, an initial phase creates all the cgroups which were saved. Tasks
are then restored into these cgroups via cg_sets as usual.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Skip the string "name=" when recreating cgroups directories in cgyard.
For example, systemd's entries in cgroup.img are:
name: "name=systemd"
path: "/user/1000.user/4.session"
When creating systemd subdir, named= should not be part of the name.
Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise cgroups sub-mounts may propagate to another namespaces
and the directory would become unremovable.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
On restore find out in which sets tasks live in and move
them there.
Optimization note -- move tasks into cgroups _before_ fork
kids to make them inherit cgroups if required. This saves
a lot of time.
Accessibility note -- when moving tasks into cgroups don't
search for existing host mounts (they may be not available)
and don't mount temporary ones (may be impossible due to
user namespaces). Instead introduce service fd with a yard
of mounts.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Each task points to a single ID of cgroup-set it lives in. This
is done so to save some space in the image, as tasks likely
live in the same set of cgroups.
Other than this we keep track of what cgroup set we dump the
subtree from. If it happens, that root task lives in the same
cgroup set as criu does, we don't allow for any other sub-cgroups
and make restore (next patch) much simpler and faster.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>