mirror of https://github.com/checkpoint-restore/criu synced 2025-08-31 06:15:24 +00:00
Commit Graph

9272 Commits

Author SHA1 Message Date
Kirill Tkhai
a3ed2f284d pstree: Change arguments of read_pstree_ids()
Pass vpid instead of pstree_item as input argument,
and return ids to caller. No functional changes here.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:17 +03:00
Kirill Tkhai
4521b7823e pid: Pass thread pid to caller
This is refactoring, no functional changes.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
3c4fbc6caf pid: Alloc threads dynamically
Thread pid values may also be multi-level, so allocate
them dynamically.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
a11a465128 pid: Use last_level_pid() in restore_pgid()
This patch is a cleanup that makes the comparison use
values on a single pid level. It has no functional
effect, because the following patches turn off pgid setting
for multi-level pid cases until it is implemented.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
ba990b7d69 pid: Make pgid and sid be allocated dynamically
They may contain several levels, like a task's pid,
so they must be of struct pid type, not a scalar.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
2e1c6e000c pid: Add last_level_pid() helper
It returns pid in task's active pid namespace
(i.e., returned by getpid()).

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
ec87dd1072 pid: Add equel_pid() helper
This allows comparing pid values across the whole hierarchy.

v3: Do not use break as some travis builds don't like it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:16 +03:00
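The multi-level pid helpers named in the commits above (a whole-hierarchy comparison and a "last level" accessor) can be illustrated with a small sketch. Everything here is hypothetical: `struct ml_pid`, its fixed-size array, the outermost-first ordering, and both function names are assumptions made for illustration, not CRIU's actual `struct pid` or helpers. The loop is written without `break`, in the spirit of the v3 note.

```c
#include <stdbool.h>
#include <sys/types.h>

#define ML_PID_MAX_LEVELS 4 /* arbitrary limit chosen for this sketch */

/* Hypothetical stand-in for a multi-level pid; not CRIU's struct pid. */
struct ml_pid {
	int level;                     /* number of nested pid namespaces */
	pid_t vpid[ML_PID_MAX_LEVELS]; /* one value per level, outermost first (assumed) */
};

/* Compare two pids on every level of the hierarchy, without using break. */
static bool ml_pid_equal(const struct ml_pid *a, const struct ml_pid *b)
{
	bool equal = (a->level == b->level);
	int i;

	for (i = 0; equal && i < a->level; i++)
		equal = (a->vpid[i] == b->vpid[i]);

	return equal;
}

/* Value on the innermost level, i.e. what the task itself would see. */
static pid_t ml_pid_last_level(const struct ml_pid *p)
{
	return p->vpid[p->level - 1];
}
```

With `level` fixed at 1, as the `pid::level` commit describes for the current state, both helpers reduce to a plain scalar comparison and lookup.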
Kirill Tkhai
a2bfe6a607 pid: Add pid::level field and level argument for __alloc_pstree_item()
Pid may contain several levels, so add level field to this struct.
Currently, level is always "1".

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
b76154cf9c restore: Block SIGCHLD during root_item initialization
(Was "user_ns: Block SIGCHLD during namespaces generation")

We don't want an asynchronous signal handler to run during creation
of namespaces (for example, in create_user_ns_hierarhy()),
as we call wait() synchronously. So we need to block the signal.
Do this once globally.

v2: Set initial ret = 0
v3: Block signal globally in root_item before its children
are created.
v4: Move block to prepare_namespace()

Suggested-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
e0987dc238 ns: Use waitpid() in create_user_ns_hierarhy_fn()
We're interested only in the just-created child. Other possible
children will be handled in appropriate places later (the criu task
may have helper children).

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Dmitry Safonov
f456c6bd17 zdtm: rely on -D_GNU_SOURCE passed from Makefiles
After the commit
  02c763939c10 ("test/zdtm: unify common code")

CFLAGS with -D_GNU_SOURCE defined in the top Makefile
are being passed to tests Makefiles.
As _GNU_SOURCE is also defined by tests, that resulted in
zdtm tests build failures:

  make[2]: Entering directory `/home/criu/test/zdtm/lib'
   CC        test.o
  test.c:1:0: error: "_GNU_SOURCE" redefined [-Werror]
   #define _GNU_SOURCE
   ^
  <command-line>:0:0: note: this is the location of the previous definition
  cc1: all warnings being treated as errors
  make[2]: *** [test.o] Error 1

However, we didn't catch this in time by Travis-CI, as zdtm.py doesn't
do `make zdtm`, rather it does `make -C test/zdtm/{lib,static,transition}`.
By calling middle makefile this way, it doesn't have _GNU_SOURCE in
CFLAGS from top-Makefile.

I think the right thing to do here is to follow CRIU's way:
rely on the definition of _GNU_SOURCE by Makefiles.

This patch is almost fully generated with
  find test/zdtm/ -name '*.c' -type f					\
     -exec sed -i '/define _GNU_SOURCE/{n;/^$/d;}' '{}' \;		\
     -exec sed -i '/define _GNU_SOURCE/d' '{}' \;

With an exception for adding -D_GNU_SOURCE in tests Makefile.inc for
keeping the same behaviour for zdtm.py.
Also changed utsname.c to use utsname::domainname rather than the private
utsname::__domainname, as it is now exposed (from sys/utsname.h):
> struct utsname
>  {
...
> # ifdef __USE_GNU
>     char domainname[_UTSNAME_DOMAIN_LENGTH];
> # else
>     char __domainname[_UTSNAME_DOMAIN_LENGTH];
> # endif

Reported-by: Adrian Reber <areber@redhat.com>
Cc: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
d9a785adbb zdtm: Add proc-self01 test
Check that user ns files kept in fdstore are opened
correctly after restore.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
ec157994f8 ns: Fix wrong opened net ns file
Since the net ns is assigned after prepare_fds() and,
in the common case, at the moment of the open_ns_fd() call
the task points to a net ns that differs from its target
net ns, we can't get the ns from the task. So, get it
from fdstore. Also, support user ns fds.

v2: Add comment

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
e55cae10cc user_ns: Keep root_user_ns ns fd in fdstore
This improves uniformity. Also, this will be used in next patch.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
7af38c85cf ns: Pack functionality of storing ns fd to store_self_ns()
Move the code to simplify it and to allow others to use this function.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
47170c462d ns: Use CLONE_VM in create_user_ns_hierarhy_fn()
This function may call functions like open_proc(),
so use CLONE_VM so that files opened by the child are
reflected in the parent's memory.

v3: New

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kirill Tkhai
10989ef274 ns: Alloc child stack dynamically in create_user_ns_hierarhy_fn()
This will be used in next patch.

Also, check for MAP_FAILED instead of NULL before munmap().

v3: New

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
7c9b385cd2 util: block SIGCHLD to run a sub process
CRIU sets a SIGCHLD handler and calls waitpid() from it,
but when we run a sub-process, we want to wait for it ourselves.

v2: remove debug code

Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
c6cd71d43b restore: don't forget to update creds after rst_mem_remap_ptr()
rst_mem_alloc() can move a vma with previous objects, so
if we want to access them, we have to update their pointers.

https://github.com/xemul/criu/issues/304

Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes: 72e295ebbb26 ("ns: Convert task cred's xids to target user ns")
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Kir Kolyshkin
e4d1970427 Don't use open_proc() where it fails
This reverts a hunk from commit 4ad343c ("Use *open_proc* where
possible"), and adds a comment explaining why.

The bug was caught by ci [1] and wasn't caught by Travis because
the latter runs on an older kernel.

(00.271276)      1: Error (criu/util.c:204): fd 0 already in use
	(called at criu/files.c:1008)
(00.292162) Error (criu/cr-restore.c:1127): 425 exited, status=1
(00.295802) Error (criu/cr-restore.c:2059): Restoring FAILED.

[1] https://ci.openvz.org/view/CRIU/job/CRIU/job/CRIU-snap/job/criu-dev/2079/consoleFull

Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
b15a25c698 criu: don't use a glibc cached pid
In glibc 2.24, getpid() returns the parent's PID if a child was created
with the CLONE_VM flag.

https://sourceware.org/bugzilla/show_bug.cgi?id=17214
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909

The glibc git tree contains the following patch, which also removes the cached PID:
 commit c579f48edba88380635ab98cb612030e3ed8691e
 Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
 Date:   Mon Oct 10 15:08:39 2016 -0300

    Remove cached PID/TID in clone

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
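One common way to sidestep a cached PID is to ask the kernel directly through the raw system call. The sketch below shows that approach; the helper name is hypothetical, and the commit message does not say whether the patch does exactly this.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: bypass any libc-level PID cache by issuing
 * the getpid system call directly.
 */
static pid_t raw_getpid(void)
{
	return (pid_t)syscall(SYS_getpid);
}
```

In an ordinary process `raw_getpid()` and `getpid()` agree; the difference only matters in a child created with CLONE_VM on a glibc that still caches the value.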
Andrei Vagin
9fb0bd805e test: check veth devices from two network namespaces
We have a test case for external veth devices. This test case
checks veth devices that live in two dumped network
namespaces.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
9e917ab04d net: dump and restore connected to a bridge links
A network device, which is connected to a bridge, is restored
after the bridge. In this case we can set the master attribute and
the device will be connected to the bridge automatically.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
757cafa727 net: create a list of all links
We will need to enumerate links a few times.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
3944814754 net: split restore_links on read and restore parts
It's a preparation for enumerating links a few times.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
a6d717b7bc netns: restore internal veth devices
When we dump a veth device, the kernel reports where a peer device lives
and we use this information to restore this veth pair.

On restore we set a net ns id for a peer and it is created in the required
netns.

v2: add more comments
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
21c71776cd net: give ns_id to link_info functions
It will be used to restore links in different net namespaces.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
ad974be213 netns: dump and restore network namespace ID-s
In each network namespace we can set an id for another network namespace
to be able to address it in netlink messages.

For example, we can say that the peer of a veth device has to be created
in a network namespace with a specified id. If we request information about
a veth device, the kernel will report where the peer device lives.

A user is able to set these ID-s, so we have to dump and restore them.

v2: add more comments
v3: make a union of nsfd_id and ns_fd, they are not used together
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:16 +03:00
Andrei Vagin
f7d15e436f netns: create a netlink route socket out of dump_links()
It will be used to dump netns id-s too.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Andrei Vagin
0cd9ad0bd5 net: transfer ns_id structures to functions about c/r-ing netns
It will be used to get or set netns id-s.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Andrei Vagin
301f39a499 netlink: add nla_get_s32()
This function was added into libnl3 recently,
but we have to support old versions of this library.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Andrei Vagin
6c41550ffd kerndat: check whether a kernel supports netns id-s or not
Each network namespace has a list of ID-s for other namespaces,
so if we request information about a veth device, we get an id
for the namespace of the peer device.

These ID-s can be set by users or by kernel when they are required.
CRIU has to restore these ID-s for network namespaces. We have to
remember that one netns can have different id-s in different network
namespaces.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Andrei Vagin
b60a836f32 images: add a network namespace id into images
It is possible to assign an id to a network namespace, and
this id will be used by the kernel in some netlink messages.
If no id is assigned when the kernel needs it, it will be
automatically assigned by the kernel.

For example, this id is reported for peer veth devices.

v2: add a comment
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Cyrill Gorcunov
43e00a8cf5 images: tty -- Reserve entries for multiple devpts support
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Cyrill Gorcunov
87a269c9b2 images: sk-packet -- Reserve entries for ucreds messages
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Cyrill Gorcunov
2f89607bc4 images: sk-netlink -- Reserve entries for netlink queued messages
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Cyrill Gorcunov
827d9ff58d images: sk-inet -- Reserve entries for IP raw sockets
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Cyrill Gorcunov
0788b01772 images: remap-file-path -- Reserve entries for spfs manager
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kirill Tkhai
a591636f9f ns: Do not reuse PROC_SELF after CLONE_VM child
The child opens PROC_SELF, populates open_proc_self_pid and exits. If the parent later creates
one more child with the same pid, the new child will try to reuse the PROC_SELF
set by the exited child. So, we need to close PROC_SELF after the first child has finished.

We have this issue in two places, which have the same code. Let's move the code
into new function call_in_child_process() and fix the issue there using close_pid_proc().

https://travis-ci.org/tkhai/criu/builds/214182862

v2: Introduce the helper call_in_child_process() and fix issue there.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kir Kolyshkin
6a3280c747 No \n in pr_perror
It adds one for you already.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kir Kolyshkin
41bcdd8f0c copy_file: rm extra error message
No need to print an error after xmalloc(), it already printed one
for you.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kir Kolyshkin
2af2159176 criu/img-remote.c: use pr_err not pr_perror
In those error paths where we don't have errno set,
don't use pr_perror(), use pr_err() instead.

Cc: Rodrigo Bruno <rbruno@gsd.inesc-id.pt>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kir Kolyshkin
ae845f0079 criu/img-remote.c: use xmalloc
1. Use xmalloc() where possible.

2. There is no need to print an error message, as xmalloc()
   has already printed it for you.

Cc: Rodrigo Bruno <rbruno@gsd.inesc-id.pt>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kir Kolyshkin
47b5489548 criu/img-remote-proto.c: error printing fixes
OK, so we have pr_perror() for cases where errno is set (and it makes
sense to show it), and pr_err() for other errors. A correct function
is to be used, depending on the context.

1. pthread_mutex_*() functions don't set errno, therefore pr_perror()
   should not be used.

2. accept() sets errno => makes sense to use pr_perror().

3. read_header() arguably sets errno => use pr_err().

4. open_proc_rw() already prints an error message, there is no need
   for yet another one.

Cc: Rodrigo Bruno <rbruno@gsd.inesc-id.pt>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kir Kolyshkin
941929c61f criu/img-remote-proto.c: use static mutex init
I see no need to do dynamic init here.

Cc: Rodrigo Bruno <rbruno@gsd.inesc-id.pt>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
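The two img-remote-proto.c commits above touch on static mutex initialization and on pthread functions not setting errno. A minimal sketch of both points follows; the counter, its lock, and the plain fprintf() reporting are stand-ins chosen for illustration, not CRIU's pr_err() or its actual data structures.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Static initialization: no pthread_mutex_init() call is needed. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Hypothetical example: returns the new counter value, or -1 on error. */
static int counter_inc(void)
{
	int ret, val;

	ret = pthread_mutex_lock(&counter_lock);
	if (ret) {
		/*
		 * pthread functions return the error code and leave errno
		 * alone, so report strerror(ret) rather than using a
		 * perror-style helper.
		 */
		fprintf(stderr, "Error: can't lock mutex: %s\n", strerror(ret));
		return -1;
	}

	val = ++counter;
	pthread_mutex_unlock(&counter_lock);
	return val;
}
```

A perror-style helper would still be the natural choice after calls such as accept(), which do set errno on failure.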
Andrei Vagin
e84540e1f1 zdtm: show a process tree if a test doesn't show signs of life
Call "ps axf" if waitpid() has been running for more than 10 seconds.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Andrei Vagin
136ae76997 mount: use switch_ns_by_fd() instead of open_proc() + setns()
This function was added recently.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kirill Tkhai
6a430da7ca net: Create child with CLONE_VM in prepare_net_namespaces()
Some functions in prepare_net_ns() use vmalloc(), and this
memory should be visible to our children.

v4: munmap() stack on err path.
v3: Call prepare_userns_creds() before restore net.
v2: No functional changes, just killed continue.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kirill Tkhai
96cbd9f5bd zdtm: Add userns02 test
Differs from the userns01 test by unsharing net ns in child.
This should test nested user/net ns interaction.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00
Kirill Tkhai
6105dcd590 ns: export prepare_userns_creds()
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-11-30 01:22:15 +03:00