Doing it at inventory write time is too late. Other than this, inventory
isn't created for pre-dump, thus this one always generates v1 images.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These images have common magic in front of per-image one. With
this we have 3 "types" of images -- inventory (head), other
images, service files. The latter would be stats (not an image,
just happen to be in PB format) and irmap cache (not an image
again, just auxiliary thing which is in PB for convenience).
Since inventory file is the first one we read on restore it's
OK to set the global "new images" flag there. Dump (write) is
always in new format.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
Acked-by: Andrew Vagin <avagin@odin.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Currently on dump we generate too many image files, effectively
all the stuff from the GLOB set is created. The thing is that
sometimes some of created images can be empty (just contain the
magic number at the head). Thos images are useless and just
waste the space.
When applied after the "empty images" set, this introduces the
lazy images -- when we call open_image() the actual file is
only created (and the magic number is written into it) when the
very first object goes into it.
For example for the simplest test we have, then static/env00
one, the created image files are
core-7290.img
creds-7290.img
fdinfo-2.img
fs-7290.img
ids-7290.img
inventory.img
mm-7290.img
pagemap-7290.img
pages-1.img
pstree.img
reg-files.img
sigacts-7290.img
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
When an image of a certian type is not found, CRIU sometimes
fails, sometimes ignores this fact. I propose to ignore this
fact always and treat absent images and those containing no
objects inside (i.e. -- empty). If the latter code flow will
_need_ objects, then criu will fail later.
Why object will be explicitly required? For example, due to
restoring code reading the image with pb_read_one, w/o the
_eof suffix thus required the object to be in the image.
Another example is objects dependencies. E.g. fdinfo objects
require various files objects. So missing image files will
result in non-resolved searches later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Current code doesn't make any difference between OPT and no-OPT
except for the message is printed or not in the open_image().
So this particular change changes nothing but the availability of
this message.
In the next patches I wil introduce "empty images" to deal with
the ENOENT situation in a more graceful manner.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have some images that store raw data together with
the pb objects (and one that just stores raw data) and
use custom access to this. E.g. pipe-data images splice
data into them and sk-queue one lseeks the image for
queue packets.
For those using buffered mode mixed with raw may lead
to troubles. Explicitly mark such images, so that the
buffering (next patches) handle such images carefully.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
We want to have buffered images to speed up dump and,
slightly, restore. Right now we use plan file descriptors
to write and read images to/from. Making them buffered
cannot be gracefully done on plain fds, so introduce
a new class.
This will also help if (when?) we will want to do more
complex changes with images, e.g. store them all in one
file or send them directly to the network.
For now the cr_img just contains one int _fd variable.
This patch chages the prototype of open_image() to
return struct cr_img *, pb_(read|write)* to accept one
and fixes the compilation of the rest of the code :)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Since we're going to switch from int-fd-s to class-image
soon the fdset name will not fit into the new terminology.
This patch is
sed -e 's/fdset/imgset/g' -i *
sed -e 's/imgset_fd/img_from_set/g' -i *
git mv include/fdset.h include/imgset.h
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
This is to simplify the change from int fd to more
generic image class data-type.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
We drop the O_OPT from flags and will drop one more. So
instead of a set of bools let's have the flags copy at
hands.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
On restore find out in which sets tasks live in and move
them there.
Optimization note -- move tasks into cgroups _before_ fork
kids to make them inherit cgroups if required. This saves
a lot of time.
Accessibility note -- when moving tasks into cgroups don't
search for existing host mounts (they may be not available)
and don't mount temporary ones (may be impossible due to
user namespaces). Instead introduce service fd with a yard
of mounts.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Each task points to a single ID of cgroup-set it lives in. This
is done so to save some space in the image, as tasks likely
live in the same set of cgroups.
Other than this we keep track of what cgroup set we dump the
subtree from. If it happens, that root task lives in the same
cgroup set as criu does, we don't allow for any other sub-cgroups
and make restore (next patch) much simpler and faster.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This allows us to distinguish the situation where image
to be opened is missing but optional, thus no error message
should be printed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We want to write into empty image files, so we
unlink them before dumping into. Let's O_TRUNC
it instead.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
When pr_perror is used, an error message is appended with a comma
and an strerror(errno), so we should not put a period at the end,
otherwise we'll end up with something like this:
Error: Can't bind.: Permission denied
Found by git grep -w pr_perror | grep '\."'
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
After fixes with -W option we've changed the cwd at the
time parent images are opened. Use the -at syscall to
proerly access ones.
[ Cleanup and comment from xemul@ ]
Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
fcntl data is arch independent, so move it out of include/asm/type.h
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
crtools.h is too heavy to be included in many sources
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have generic do_pb_show() call and tons of show_foo
routines, that just call one with proper args. Compact
the code by putting the args into array and calling
the do_pb_show() in one place.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Move image object descriptors to own image-desc
file(s). This allow to reuse the code in other tools.
I had to move show declarations to cr-show.h as well.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
without this patch, fd will not be freeed
* Changelog from v1:
* just free fd, no crt.ids
Signed-off-by: Libo Chen <libo.chen@huawei.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
They are not documented, thus OK for now. Two options --
* one to specify where the parent images are
* one to reset dirty memory tracking
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We'll have one more "image" file generated by dump and (surprisingly)
restore commands -- the stats one. It will contain in a single pb
object all the statistics collected by dump/restore.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
1. Directory with images may have a "parent" symlink pointing to the
place where the previous snapshot is
2. Each pagemap will have "in_parent" bit, which means, that the
pages for this pagemap entry are not in the respective page.img
but in parent
3. New --leave-running option to use with --snapshot not to kill
tasks after snapshot
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>