Implement support for simple configuration files. Before the command
line options are parsed, either the default configuration files
(/etc/criu/default.conf, then $HOME/.criu/default.conf, in this order)
or a specific config file passed by the user are parsed. Two new options
are introduced: "--config FILEPATH" lets users specify a single
configuration file to use, and "--no-default-config" forbids the
parsing of default configuration files. Both options can be passed only
on the command line.
Using configuration files is optional, which keeps backwards
compatibility. The implementation tries to be compatible with command
line usage: the user should get the same results whether they pass the
options (in the right order of parsing) on the command line or write
them in config files. This allows the user to:
1) Override boolean options if needed
2) Specify a partial configuration for options that can be passed
several times (e.g. "--external"), and pass the remaining options,
which depend on the process runtime, on the command line
The configuration file syntax allows comments marked with the '#' sign;
the rest of the line after '#' is ignored. The user can put one option
per line (with its argument, if needed, on the same line, separated by
whitespace characters). The options are the same as the long options
(without the "--" prefix used on the command line).
Configuration file example (for syntax illustration only, the combination is not meaningful):
$ cat ~/.criu/default.conf
tcp-established
work-dir /home/<USERNAME>/criu/work_directory
extra # inline comment
no-restore-sibling
tree 111111
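As an illustration of the syntax rules above, here is a minimal Python sketch (not CRIU's actual C implementation; the helper name config_to_argv and the sample path are assumptions) that turns config file text into long-option arguments. It assumes that a '#' never appears inside an argument.

```python
def config_to_argv(text):
    """Convert config file text to a list of command-line style arguments.

    Hypothetical helper for illustration only; assumes one option per line
    and whitespace between the option and its argument.
    """
    argv = []
    for line in text.splitlines():
        # Everything after '#' is a comment and is ignored.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The first word is the long option name; the rest, if any,
        # is its argument.
        parts = line.split(None, 1)
        argv.append("--" + parts[0])
        if len(parts) == 2:
            argv.append(parts[1].strip())
    return argv


example = """\
tcp-established
work-dir /tmp/criu/work_directory
extra # inline comment
tree 111111
"""
print(config_to_argv(example))
```

Because config files are parsed before the command line, the resulting list can be thought of as being placed ahead of the options the user typed.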
Signed-off-by: Veronika Kabatova <vkabatov@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
On s390 the first two parameters are swapped because we use
the CONFIG_CLONE_BACKWARDS2 kernel config option.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
If a task's pid was hashed before the task itself
(this may happen when another task has a sid or pgid
equal to this pid), the pid mustn't contain zero
levels. So, if a pgid or sid has zero levels, we should
not add it.
Otherwise, session04 --iter 3 fails with:
=[log]=> dump/zdtm/static/session04/30/2/restore.log
------------------------ grep Error ------------------------
(01.858187) 6: Restoring children in our session:
(01.858206) 6: Forking task with 303 pid (flags 0x600)
(01.869893) 1: PID: real 145 virt 15
(01.870247) 1: Forking task with 20 pid (flags 0x0)
(01.872948) Error (criu/cr-restore.c:381): 0: Write -1 to sys/kernel/ns_last_pid: Invalid argument
(01.873030) Error (criu/namespaces.c:2664): Can't set next pid
(01.873103) 1: Error (criu/ns-common.c:46): Error answer
(01.873123) 1: Error (criu/cr-restore.c:404): Can't request next pid
(01.873135) 1: Error (criu/cr-restore.c:1321): Can't set next pid
(01.873310) 1: Error (criu/cr-restore.c:1434): Can't fork for 20: No such file or directory
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Since commit 84eedc49a (pstree: Make lookup_create_pid() able to create
tasks with pid->level > 1) the read_pstree_image function presumes that
the namespaces image is already parsed.
This patch ensures that this is the case for users of prepare_dummy_pstree.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
wait() waits only for children created with the SIGCHLD signal.
Add it.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
There's an

	if (bad_thing) {
		ret = -1;
		break;
	}

block above this hunk, whose intention is to propagate -1 back to the
caller. This propagation is obviously broken.
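A minimal sketch of this kind of bug, written in Python for brevity (the real code is C). The names process_all_broken, process_all_fixed, items and bad_thing are hypothetical, and the exact way the return value got lost is an assumption, since the hunk itself is not shown here.

```python
def process_all_broken(items, bad_thing):
    ret = 0
    for item in items:
        if bad_thing(item):
            ret = -1
            break
    # Assumed form of the bug: the error code set above is dropped here.
    return 0


def process_all_fixed(items, bad_thing):
    ret = 0
    for item in items:
        if bad_thing(item):
            ret = -1
            break
    # Return what the loop decided, so that -1 reaches the caller.
    return ret
```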
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Commit 37e4c7bfc264 fixed arm, ppc and x86 (32bit),
but it introduced a wrong definition for x86_64. Fix that.
Also, add comments to the raw fork() implementation.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
This adds the option '--check-only' to zdtm.py. If specified, each test
case is first dumped with the '--check-only' option enabled before the
real dump. Likewise, during restore the test case is first restored with
'--check-only' before the real restore.
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Calling gone() without ever calling getpid() before leads to a
backtrace. Just call getpid() to avoid that.
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
In preparation for the zdtm option '--check-only', a new helper function
reset_pid() is added, which writes to ns_last_pid to avoid PID collisions
between the check-only restore and the real restore.
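A rough sketch of what such a helper could look like, assuming the usual /proc/sys/kernel/ns_last_pid location. The signature (a pid plus an overridable path) is an assumption made for illustration and is not taken from zdtm.py. The demonstration writes to a scratch file so that it runs without privileges.

```python
import os
import tempfile

NS_LAST_PID = "/proc/sys/kernel/ns_last_pid"


def reset_pid(pid, path=NS_LAST_PID):
    """Record pid as the last allocated one in the current pid namespace.

    Sketch only: the signature is an assumption. After the write, the
    kernel tries to allocate the following pid next. Writing the real
    file needs sufficient privileges in the pid namespace.
    """
    with open(path, "w") as f:
        f.write("%d" % pid)


# Demonstrate on a scratch file instead of the real sysctl.
with tempfile.NamedTemporaryFile("r", delete=False) as scratch:
    scratch_path = scratch.name
reset_pid(1234, scratch_path)
with open(scratch_path) as f:
    written = f.read()
os.unlink(scratch_path)
```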
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
1) Create a socket, bind it, then create a child in a lower user, pid and net ns.
2) Close the socket in the parent.
3) After the signal, check that the child can create a socket with the same name.
(It must, as it's in another net namespace.)
v2: Add uid/gid mapping.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Sockets are sent via SCM_CREDENTIALS, and this kernel interface
needs to have uid and gid mapped (see __scm_send() in the kernel).
So, set them before send_fds() is used.
Also, move prep_usernsd_transport() below, to be after this,
for uniformity.
v2: New
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Iterate over fake_master_head and add a fake fle of root_item,
which becomes the new master and has permissions to restore the file_desc.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
On Thu, Jun 15, 2017 at 12:16, Cyrill Gorcunov wrote:
> On Thu, Jun 15, 2017 at 12:10:43PM +0300, Kirill Tkhai wrote:
> > On Wed, Jun 14, 2017 at 23:32, Andrei Vagin wrote:
> > > On Wed, Jun 07, 2017 at 02:28:53PM +0300, Kirill Tkhai wrote:
> > > > 1)Find such fle, and link it at the beginning of list.
> > > > 2)Order by pid, where possible, if it does not contradict (1)
> > >
> > > Why do we need to order by pid?
> >
> > This was there initially, and I kept the logic. As far as I know,
> > it's needed for epoll, to place the master in the parent task.
> >
> > CC: gorcunov@virtuozzo.com
> > Cyrill, could you please say, why we need this, if you remember?
>
> I think it's the same as in the bug we met.
> ---
> commit 2df9c9dc6e0b926aaba00138e3e66295ebea76ce
> Author: Cyrill Gorcunov <gorcunov@virtuozzo.com>
> Date: Mon Apr 3 18:38:55 2017 +0300
>
> vz7: files -- Select proper master fd when collecting fd
>
> When choosing the master file which is going to send the file
> descriptor to the children, we must not only look at
> their PIDs but also consider process tree relations. In particular,
> a child of a process might be chosen as the master, and
> epoll restore will fail because the target files are simply
> not present in the child tree.
>
> | 31964 31964 31964 epoll
> | 585 31964 31964 epoll
> | 586 31964 31964 epoll
> |...
> | (04.797121) 585: Error (criu/eventpoll.c:180): epoll: Unexpected state for tfd (id 0 fd 8)
>
> That's because the target files belong to 31964 and are not
> present in child 585, but because a PID wrap happened it
> has been chosen as the leader, which is of course wrong.
>
> https://jira.sw.ru/browse/PSBM-63355
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
[PATCH v3 18/30] files: Choose file master with enough permissions
1) Find such an fle, and link it at the beginning of the list.
2) Order by pid, where possible, if it does not contradict (1).
3) If there is no master, leave the fdesc in fake_master_head.
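The three rules can be sketched as follows. This is a simplified Python model, not the C code from the patch: Fle, can_restore and order_fles are hypothetical names, and picking the lowest pid among the capable entries is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for a file list entry: the owning pid and whether
# the owner has enough permissions to restore the file.
Fle = namedtuple("Fle", ["pid", "can_restore"])


def order_fles(fles):
    """Return (ordered_list, has_master).

    1) An entry with enough permissions goes first (lowest pid among them).
    2) The remaining entries are ordered by pid.
    3) With no suitable entry, has_master is False and the caller would
       leave the descriptor on the fake master list.
    """
    by_pid = sorted(fles, key=lambda f: f.pid)
    capable = [f for f in by_pid if f.can_restore]
    if not capable:
        return by_pid, False
    master = capable[0]
    rest = [f for f in by_pid if f is not master]
    return [master] + rest, True
```

For example, with entries for pids 30 (capable), 10 (not capable) and 20 (capable), the sketch puts 20 first and then keeps 10 and 30 in pid order.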
v3: Describe pid order reasons
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Returns the user_ns of a file (currently it's not exported to userspace)
and the minimal user_ns needed to restore the file (for example, a socket's
net_ns->user_ns, which regulates setns() permissions).
This will be needed to choose the correct process as the owner of the file master.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The aim is to have top_user_ns set even if !(root_ns_mask & CLONE_NEWUSER).
This avoids additional comparisons of top_user_ns with NULL elsewhere.
Thus, move the fixup for old images to generic code, to support the case above.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
I'm going to use this in !(root_ns_mask & CLONE_NEWUSER) case,
so choose a better name to fit everything.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It will be needed to quickly obtain root_item's net_ns,
and to fix up old dumps.
v2: Add a comment to top_xxx_ns. Extend MARK_ROOT_NS().
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Just to avoid allocating the path buffer twice.
v2: Change debug message.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Plain wait() waits only for children created with the SIGCHLD flag.
Add it.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Wait for the child before daemonization, so that
zdtm.py cannot see the child's fds and maps before it
becomes a zombie.
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Wait for the child before daemonization, so that
zdtm.py cannot see the child's fds and maps before it
becomes a zombie.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
The original idea was to sort children and to keep child
reapers at the beginning of the list. But a mistake was
made there: we must check last_level_pid(), as it is
the indicator of a child reaper.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
The parasite returns the last-level pid (the pid in the task's pid
namespace), so we mustn't overwrite the vpid already collected from
/proc/[pid]/status.
We handle that correctly on dump; do the same on pre-dump.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
The leader of session 15 (20) is in the first pidns, one of its
processes is in the second pidns and one is in the third. So we create
two helpers here, one for each additional pidns.
(It is critical that
The full test now looks like this (mind that the pids here are real and
differ from their ids in the source code, e.g. 15 is 20 here):
(pid,ppid,sid)
session04(1, 0, 1)───session04(4, 1, 4)───session04(5, 4, 4)───session04(6, 5, 6,pid1)─┬─session04(8, 6, 8)───session04(9, 8, 7)
                                                                                       ├─session04(10, 6, 6)───session04(11, 10, 11)
                                                                                       ├─session04(13, 6, 13)───session04(14, 13, 11)
                                                                                       ├─session04(15, 6, 15)
                                                                                       ├─session04(17, 6, 17)─┬─session04(18, 17, 15)
                                                                                       │                      └─session04(19, 17, 17,pid2)───session04(22, 19, 20)
                                                                                       ├─session04(20, 6, 20)
                                                                                       └─session04(23, 6, 6,pid3)───session04(25, 23, 20)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Require the ns_pid, ns_get_userns and ns_get_parent features; otherwise
we get a "Can't do ns ioctl" error in criu:set_ns_opt().
v2: remove unused variable i in cleanup
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Before "pstree: rework init reparent handling for pid namespaces" patch
we would get:
$ ./test/zdtm.py run -t zdtm/static/session01
=== Run 1/1 ================ zdtm/static/session01
======================= Run zdtm/static/session01 in ns ========================
Start test
./session01 --pidfile=session01.pid --outfile=session01.out
Run criu dump
Run criu restore
=[log]=> dump/zdtm/static/session01/31/1/restore.log
------------------------ grep Error ------------------------
(00.001103) 8 was born with sid 4
(00.001105) 7 was born with sid 4
(00.001106) 21 was born with sid 17
(00.001108) 1 was born with sid 17
(00.001109) Error (criu/pstree.c:1005): Can't find a session leader for 17
------------------------ ERROR OVER ------------------------
Corresponding tree before dump:
(combined 'pstree -pS 1' and 'ps axf -o pid,ppid,sid')
session01(1, 0, 1)─┬─session01(3, 1, 1)───session01(4, 3, 4)─┬─session01(5, 4, 5)─┬─session01(23, 5, 5)
                   │                                         │                    ├─session01(24, 5, 5)
                   │                                         │                    └─session01(26, 5, 5)
                   │                                         ├─session01(6, 4, 4)
                   │                                         ├─session01(7, 4, 7)───session01(16, 7, 4)
                   │                                         └─session01(8, 4, 8)───session01(15, 8, 15)───session01(20, 15, 4)
                   ├─session01(12, 1, 12)───session01(17, 12, 17)───session01(18, 17, 18)───session01(27, 18, 4)
                   ├─session01(13, 1, 10)
                   ├─session01(14, 1, 4)
                   └─session01(21, 1, 21)───session01(22, 21, 17)
22 cannot be restored because it needs session 17, but the leader of
session 17 is not among its ancestors (21 had been reparented from 17;
12, 13 and 14 from 4).
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
These checks skip adding helpers and setting ids in case
of nested pid namespaces.
FIXME disable pgid, as it does not work yet
v2: add a comment near the added check for pgid
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
- Put the code into a new handle_init_reparent, make it pidns-relative
and call it for each pidns.
- Consider the case when a process tree branch (subtree) is reparented to init
(the parent of the root of this branch died), ripping some session into two
pieces, so that the representative of this session in the reparented branch
cannot inherit its session if we simply try to fork the tree as is.
The patch adds the helper can_inherit_sid to find such "adopted" branches and
re-reparent them to helpers.
Previously we handled only direct children of init.
- We need many helpers for one session because:
1) The leader of a session, if it is already dead, cannot be recreated as
a helper in an arbitrary pidns, but only in a pidns that is an ancestor of the
pidns of any alive process of this session (a session's processes can't leave
the pidns in which the session was created).
Moreover, a session can be created only on the proper level: the sid array of
an alive process can end with several zeroes, meaning that after the creation
of the session, processes entered several more pidnses, so we need to
cut these extra levels before creating the leader.
2) We cannot re-reparent a branch directly to the session leader, as the latter
can be in another pidns, so we create an additional helper in our init's pidns,
and its children will reparent to init.
If the parents of session processes are in multiple pidnses, we will need
a helper per each such pidns to be able to re-reparent them. See the test
with setns for an example.
- Collect all helper processes in a separate list, so that it is
easier to find them with get_helper_by_sid for other possibly
existing pieces of this sid. Branches re-reparented to such helpers
are temporarily out of the tree and are also skipped by the walk over items
in for_each_pssubtree_item.
- Collect zombies and helpers which will reparent to the init of a pidns in
collect_child_pids to the init of the pidns instead of the root task.
- A process tree which had only reparents to the pidns init process
(no child subreaper reparents) will be restored fine. One tricky case,
when we need to re-reparent, the session leader is in the same pidns as us
and our parent is in a lower pidns, will fail. It happens when somebody
enters the pidns, does setsid and then does clone(CLONE_PARENT).
v2: handle get_free_pids returns 0 as error
v3: rebase due to patchwork fail - use add_child_task and move_child_task
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
If a process belonging to some session is in a different pidns than the
leader of this session, its sid will have zeroes on all additional levels,
so although the levels of this process and the leader do not match, their sids do.
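A minimal sketch of such a comparison, assuming (hypothetically) that ids are represented as Python lists with one entry per pidns level, outermost first. The function name sids_match and the rejection of a sid shallower than the leader's pid are assumptions made for this illustration.

```python
def sids_match(proc_sid, leader_pid):
    """Compare a process's per-level sid with the leader's per-level pid.

    Both arguments are lists of ids, one per pidns level, outermost first
    (a hypothetical representation). Extra levels in proc_sid must be zero.
    """
    common = len(leader_pid)
    if len(proc_sid) < common:
        return False
    if proc_sid[:common] != leader_pid:
        return False
    # Levels below the leader's pidns carry no sid value: expect zeroes.
    return all(level == 0 for level in proc_sid[common:])
```

With this model, a sid of [20, 6, 0] matches a leader whose pid is [20, 6], while [20, 6, 3] does not.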
v2: change to static inline function as there is no more pr_err
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
It is needed to look up adoptive children of the pidns init. Also add
a skip_descendants flag to be able to skip unneeded subtrees.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
v2: handle get_free_pids returns 0 as error, remove unneeded iter var in
get_free_pids
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Create a child in a new pid_ns; then the child creates a thread and a zombie.
The zombie is in the second newly created pid_ns. Then the grandparent
calls setns() to its active pid_ns. So, let's draw the table:
                     pid_ns vs pid_for_children_ns
grandparent:         equal
child:               not equal
child thread:        equal
grandchild zombie:   zombies don't have pid_for_children_ns
After the signal, check that everything remains the same.
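One way to observe the "equal / not equal" column from userspace is to compare the namespace links under /proc. The sketch below assumes a kernel that exposes /proc/<pid>/ns/pid_for_children. The function name is hypothetical, and this is not necessarily how the test itself performs the check. For a thread, the links live under /proc/<pid>/task/<tid>/ns/ instead.

```python
import os


def same_pid_for_children(pid="self"):
    """Return True or False, or None if the links cannot be read
    (for example for zombies, for a missing task, or on kernels
    without /proc/<pid>/ns/pid_for_children)."""
    base = "/proc/%s/ns/" % pid
    try:
        return os.readlink(base + "pid") == os.readlink(base + "pid_for_children")
    except OSError:
        return None


result = same_pid_for_children()
```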
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Threads may have different pid_for_children ns.
Allow them to set it after they are created:
just get a pid_ns fd from fdstore, and setns()
to it, after thread creation.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Restore it depending on the number of threads:
1) single-threaded -- before user_ns assignment
2) multi-threaded -- after thread creation (in the next patch).
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
In the next patches set_pid_for_children_ns() will be used
without a pid, so move the pid check out of the function.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
No functional changes -- just to improve readability.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>