When we restore nested mount namespaces, all but the root one (sub-namespaces)
will be restored as sub-mounts in the root mount namespace. So mi->mountpoint
will not be '/' even if a mount is the root of its mntns.
v2: s/is_root/is_ns_root/
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently the ns_ids list is filled only on dump. Soon we'll need this
list for mount namespaces on restore, e.g. to know which tasks share
the namespaces.
v2: merge the patch "namespace: add a function to search an ns_id
item by id" into this one.
v3: add prefix rst_ to add_ns_id
v4: look up namespace by two values -- type AND ID
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to restore nested mount namespaces and we will need to
change root for each of them.
v2: don't call chdir a second time, because a path may be relative
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Currently it's initialized for the root mount namespace, but we are
going to dump nested mount namespaces.
It's used in open_mountpoint(), which is used in dump_tmpfs() and in
other callbacks.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We're about to collect root several times in a row, so keeping
the old one isn't required.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have a race. Suppose there are 3 tasks, A, B and C. A and B
share an fdtable, C does not. We may then end up in a situation
where A is restoring memory by reading mem images while B is forking
the child C. In that case the descriptors held by A (for mem restore)
will be inherited by C and will not get closed.
This reverts commit d36e07aabe.
BTRFS returns subvolume dev-id instead of superblock dev-id,
so we need to know which mounts are btrfs.
The mi->fstype->name is "unsupported" here, because it is the fstype->code
that is saved in an image:
{
.name = "unsupported",
.code = FSTYPE__UNSUPPORTED,
},
{
.name = "btrfs",
.code = FSTYPE__UNSUPPORTED,
}
A second reason is that processes can be migrated from some other
filesystem to btrfs. All of this can happen _only_ for the root mount
and for bind mounts of the root mount...
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There was a bug: if, e.g., iterative snapshots are made and a new
process is created in the process tree between two of them,
that process can have pages which are not dirty, and they won't be
saved into the image. But there is no parent image for it.
Pages which are not soft-dirty appear when a process with some pages
in the non-dirty state forks: the child inherits those PTEs,
and if the child doesn't write to those pages, they will still be in the
non-soft-dirty state when the next dump comes.
This bug also was not caught because of an error in zdtm, see 3/3.
v2: simplify, add more justification in commit message.
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
"relative path" is absolute path with dot at the beginning.
We already use relative paths on restore. In this patch we add "."
on dump too. It's convenient, because previously we needed to add the dot
each time we wanted to access this mount point.
Before this patch we had to create a temporary copy.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Now we have two functions which do mostly the same thing, so this patch
merges them.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
st_dev and s_dev have different formats.
st_dev is (MAJOR(dev) << 8) | MINOR(dev)
s_dev is (MAJOR(dev) << 20) | MINOR(dev)
so we need to convert one of them
v2: use kdev_to_odev
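To illustrate the conversion described above, here is a minimal sketch of a kdev_to_odev()-style helper. The function name kdev_to_odev_sketch, the macro names, and the body are assumptions made for illustration; only the two bit layouts come from the text above.

```c
#include <stdint.h>

/* s_dev layout from the text above: (MAJOR << 20) | MINOR. */
#define KDEV_MINOR_BITS 20
#define KDEV_MINOR_MASK ((1u << KDEV_MINOR_BITS) - 1)

/* Hypothetical helper: convert the s_dev form into the st_dev form. */
static inline uint32_t kdev_to_odev_sketch(uint32_t kdev)
{
	uint32_t major = kdev >> KDEV_MINOR_BITS;
	uint32_t minor = kdev & KDEV_MINOR_MASK;

	/*
	 * st_dev layout from the text above: (MAJOR << 8) | MINOR.
	 * Minors above 255 would spill into the major bits in this form.
	 */
	return (major << 8) | minor;
}
```

For a device with major 8 and minor 3, the s_dev form is (8 << 20) | 3 and the converted value is (8 << 8) | 3.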
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Only one function uses DIR, so I don't see a reason to return it
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It's already used for dumping files and it will be used for restoring,
so it should be a service fd to avoid collisions with restored
descriptors.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Make explicit checks and helpers for legacy images.
This should facilitate its removal some day in the
future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This also gives us one image fewer per task and
allocates more space in the core task entry.
One thing to note -- the number of posix timers is
not easily accessible at core entry allocation
time, so the respective array is allocated on demand.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This allows us to have one image fewer per task, which in turn
reduces live migration time a little bit.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This one will hold all info about timers in the core_entry.
Since timers are always per-task, this one is on task core
entry.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
There are a few nits in protobuf generation (which in most
cases are harmless but sometimes may cause the build procedure
to fail)
- Deps files for generated sources should be produced only
when _all_ .proto files are handled, because we include headers
which are autogenerated as well; thus simply wait until everything
is complete, then use the compiler to generate deps files
- typo in non-clean targets; the proto-obj-y objects should be used instead
Reported-by: Andrey Vagin <avagin@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Otherwise it won't use/check parent snapshots on restore, only the
last one.
Signed-off-by: Tikhomirov Pavel <snorcht@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
A process can be restored in a new pidns. close_old_fds() opens
the /proc/PID directory. Without this patch we can see errors like this:
(00.333915) 1: Error (util.c:102): Unable to close fd 6: Bad file descriptor
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The root mount is an external mount and its source may not be '/'.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The source of the root mount may differ from "/", and we need to take
this into account when we bind-mount it somewhere.
For example:
11877 ? Ss 0:00 ./bind-mount --pidfile=bind-mount.pid --outfile=bind-mount.out --dirname=bind-mount.test
11880 ? Ss 0:00 \_ ./bind-mount --pidfile=bind-mount.pid --outfile=bind-mount.out --dirname=bind-mount.test
[root@avagin-fc19-cr crtools]# cat /proc/11880/mountinfo
68 42 8:3 /root/git/crtools/test / rw,relatime - ext4 /dev/sda3 rw,data=ordered
43 68 0:33 / /proc rw,relatime - proc proc rw
44 68 0:34 / /dev/pts rw,relatime - devpts pts rw,mode=666,ptmxmode=666
45 68 8:3 /root/git/crtools/test/zdtm/live/static/bind-mount.test/test /zdtm/live/static/bind-mount.test/bind rw,relatime - ext4 /dev/sda3 rw,data=ordered
Mount 45 is a bind mount of mount 68.
mi(45)->root = /root/git/crtools/test/zdtm/live/static/bind-mount.test/test
mi(68)->root = /root/git/crtools/test
so the common part is "/root/git/crtools/test" and the command is
mount --bind /zdtm/live/static/bind-mount.test/test /zdtm/live/static/bind-mount.test/bind
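The path arithmetic in this example can be sketched as follows. The helper name strip_ns_root and its exact behaviour are assumptions for illustration, not the code from this patch; it only shows how the root of the namespace's root mount is removed from the root of the bind mount.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the part of bind_root that lies below
 * ns_root, or NULL if bind_root is not inside ns_root.
 */
static const char *strip_ns_root(const char *bind_root, const char *ns_root)
{
	size_t len = strlen(ns_root);

	/* A root of "/" is a prefix of every absolute path. */
	if (len == 1 && ns_root[0] == '/')
		return bind_root;

	if (strncmp(bind_root, ns_root, len) != 0)
		return NULL;

	/* Reject partial component matches such as "/a/bc" under "/a/b". */
	if (bind_root[len] != '\0' && bind_root[len] != '/')
		return NULL;

	/* Identical roots mean the bind source is the namespace root itself. */
	return bind_root[len] ? bind_root + len : "/";
}
```

With the values above, stripping mi(68)->root from mi(45)->root leaves /zdtm/live/static/bind-mount.test/test, which is the source used in the mount --bind command.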
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
These messages are constructed in the same spirit as ARM and x86 ones
except for two major points:
* general-purpose registers are stored in a variable-length array
of uint64's: the architecture provides 32 general-purpose registers
that makes it unfeasible to create a separate protobuf field
for each of them since it requires a lot of "copy-paste" to convert
between the struct pt_regs and protobuf message; the length of
the array storing registers is to be checked by the architecture-
dependent CRIU code;
* AArch64 FP/SIMD registers are 128 bit long while protobuf lacks
the support for integers of this size; the FP/SIMD registers
are stored in an array of uint64, two consecutive elements
of the array represent a single FP/SIMD register.
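The FP/SIMD layout can be sketched roughly like this. The type vreg128_t and both function names are hypothetical, and the byte order within each pair of uint64 elements is simply whatever memcpy produces on the host; the real conversion code may differ.

```c
#include <stdint.h>
#include <string.h>

#define NR_VREGS 32 /* AArch64 has 32 FP/SIMD registers, V0..V31 */

/* Hypothetical stand-in for a 128-bit register value. */
typedef struct {
	uint8_t bytes[16];
} vreg128_t;

/* Store each 128-bit register in two consecutive uint64 elements. */
static void vregs_to_u64(const vreg128_t *vregs, uint64_t *out)
{
	for (int i = 0; i < NR_VREGS; i++)
		memcpy(&out[2 * i], vregs[i].bytes, sizeof(vregs[i].bytes));
}

/* Inverse operation: rebuild register i from elements 2*i and 2*i + 1. */
static void u64_to_vregs(const uint64_t *in, vreg128_t *vregs)
{
	for (int i = 0; i < NR_VREGS; i++)
		memcpy(vregs[i].bytes, &in[2 * i], sizeof(vregs[i].bytes));
}
```

The array passed to either function must hold 2 * NR_VREGS elements, which is the kind of length check the message above leaves to the architecture-dependent code.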
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
CRIU can handle stopped multithreaded processes when all threads
are stopped. Refine the check to allow this case.
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
It will be used for restoring files from proper mounts.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We are going to parse fdinfo to get mnt_id,
so we can take pos and flags from there as well and avoid calling
fcntl and lseek for that.
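As an illustration, the three fields can be pulled out of the text of /proc/<pid>/fdinfo/<fd> in a single pass. The struct and function names below are hypothetical, and the sketch parses an in-memory buffer rather than reading the file; note that the kernel prints flags in octal.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of interest. */
struct fdinfo_common_sketch {
	unsigned long long pos;
	unsigned int flags;
	int mnt_id;
};

/* Parse "pos:", "flags:" and "mnt_id:" lines; return 0 if all were found. */
static int parse_fdinfo_text(const char *text, struct fdinfo_common_sketch *out)
{
	const char *line = text;
	int found = 0;

	while (line && *line) {
		if (sscanf(line, "pos: %llu", &out->pos) == 1)
			found |= 1;
		else if (sscanf(line, "flags: %o", &out->flags) == 1)
			found |= 2;
		else if (sscanf(line, "mnt_id: %d", &out->mnt_id) == 1)
			found |= 4;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return found == 7 ? 0 : -1;
}
```

Any other lines in the fdinfo file are skipped, so extra per-file-type fields do not affect the result.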
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
For all other tasks only unused service descriptors will be closed.
This change allows us to keep file descriptors which may be used for
restoring namespaces. All non-service descriptors must be closed before
restoring files.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Opening /dev/null may fail, so check the return code.
CID 1168167
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
We have a condition
BUG_ON(kid > UINT_MAX);
but kid is unsigned int so it's never bigger than UINT_MAX,
use unsigned long instead.
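A small sketch of why the wider type matters. The parsing context and the name parse_kid_sketch are assumptions for illustration only; the point is that the comparison against UINT_MAX can only be true when the value is held in a type wider than unsigned int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an id from text into an unsigned int.
 * The value is held in an unsigned long first, so the range check
 * against UINT_MAX is meaningful where long is wider than int.
 */
static int parse_kid_sketch(const char *s, unsigned int *out)
{
	char *end;
	unsigned long kid;

	errno = 0;
	kid = strtoul(s, &end, 10);
	if (end == s || errno == ERANGE)
		return -1;

	/* Not a no-op: kid is unsigned long, not unsigned int. */
	if (kid > UINT_MAX)
		return -1;

	*out = (unsigned int)kid;
	return 0;
}
```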
CID 1042296
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The type of the field ucontext::uc_sigmask isn't k_rtsigset_t
if the struct ucontext is imported from system headers
rather than provided by an architecture-specific header.
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Cc: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
This patch moves the files arch/$ARCH/include/asm/int.h to
include/asm-generic/int.h and makes the types {u,s}{8,16,32}
aliases of the fixed-size integer types [u]int{8,16,32}_t.
This makes it possible to use a single set of integer typedefs
in all architectural ports.
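A minimal sketch of what such a generic header could contain, assuming it is built on <stdint.h>. Only the type names come from the description above; the header's actual contents may differ.

```c
#include <stdint.h>

/* Unsigned aliases: u8, u16, u32. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

/* Signed aliases: s8, s16, s32. */
typedef int8_t  s8;
typedef int16_t s16;
typedef int32_t s32;
```

Because the fixed-size types have the same width on every platform, the same typedefs work for each architecture without per-arch copies.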
Signed-off-by: Alexander Kartashov <alekskartashov@parallels.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
The devpts instance was mounted w/o the newinstance option if
the device number is equal to that of the root /dev/pts.
I think this condition is strong enough to not mount devpts in a
temporary place.
v2: move the host.bla-bla-bla in kerndat.c
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Fixes two issues with efe594f8f4 "criu: fix filemap open permissions":
- Permissions on files with both open file descriptors and mappings.
- Restore compatibility with dumps created by previous versions of criu.
Signed-off-by: Jamie Liu <jamieliu@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
maps03 should have caught the bug fixed by 288cf51741 "restore: mutate
tgt_addr in map_private_vma", but didn't because integer literals
(defaulting to 32-bit ints) were shifted out of range.
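The class of mistake can be illustrated with a short sketch. The helper name gib and the sizes used are assumptions, not the values from maps03; the point is that the operand must be widened before the shift.

```c
#include <stdint.h>

/*
 * An expression such as (5 << 30) is evaluated in int, so the result
 * does not fit in 32 bits. Widening the left operand first keeps the
 * shift in 64-bit arithmetic.
 */
static inline uint64_t gib(unsigned int n)
{
	return (uint64_t)n << 30;
}
```

An equivalent fix at the literal level is to add a suffix, for example 5ULL << 30.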
Signed-off-by: Jamie Liu <jamieliu@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>