mirror of https://github.com/checkpoint-restore/criu synced 2025-08-28 21:07:43 +00:00

9635 Commits

Author SHA1 Message Date
Mike Rapoport
c89a22a8e9 zdtm: simulate lazy migration with page server that can send pages
Lazy migration requires both dumped and restored processes to coexist at
the same time. This breaks some basic assumptions in the zdtm design.
Simulation of lazy migration with the page server allows testing most of
the involved code paths without major intervention into zdtm
infrastructure.

travis-ci: success for lazy-pages: improve testability (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 11:47:02 +03:00
Mike Rapoport
ac6b3b0a1e zdtm: add 'nolazy' flag for tests not compatible with lazy pages
The kernel support for lazy pages (userfaultfd) lacks many important
features, which effectively prevents certain tests from passing.
Allow skipping such tests with a somewhat informative message.

travis-ci: success for lazy-pages: improve testability (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 11:47:01 +03:00
Mike Rapoport
8be54383ff page-xfer: add ability to send pages from local dump
Currently, standalone page-server can only receive pages from the remote
dump. Extend it with the ability to serve local memory dump to a remote
lazy-pages daemon.

travis-ci: success for lazy-pages: improve testability (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 11:47:01 +03:00
Mike Rapoport
22b5d5e92d page-xfer: page_server_get_pages: replace BUG_ONs with 'return -1'
Instead of crashing dump/page-server when a problem is detected after the
page-pipe was split, print nice error messages and return an error.

travis-ci: success for page-xfer: page_server_get_pages: replace BUG_ONs with 'return -1' (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 11:47:01 +03:00
Mike Rapoport
836b1bc87a page-pipe: (yet another) fix for split page-pipe buffers
Splitting of the trailing part of the page-pipe buffer worked by coincidence
for single-page requests. Requests longer than a single page were not
handled correctly.
The proper point for splitting the trailing part of the page-pipe buffer is
the IOV following the IOV containing the desired page(s).

travis-ci: success for page-pipe: (yet another) fix for split page-pipe buffers (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 11:47:01 +03:00
Pavel Emelyanov
1883b7692f uffd: Relax counting the number of sockets
travis-ci: success for Some more cleanups over uffd.c (rev3)
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 11:47:01 +03:00
Pavel Emelyanov
8edb68e9ec uffd: Hide page server socket back
With epoll helpers in util we can stop exposing the
page-server socket to the outer world.

travis-ci: success for Some more cleanups over uffd.c (rev3)
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 11:47:01 +03:00
Pavel Emelyanov
4cb743e48a util: Move epoll aux code from uffd to util (v2)
v2: Move epoll_prepare() too

travis-ci: success for Some more cleanups over uffd.c (rev3)
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 11:47:01 +03:00
Mike Rapoport
f30cca66ec uffd: Relax reading the pstree image (v2)
The uffd code only needs the pstree items themselves, not
any IDs and relations they might have.

travis-ci: success for Some more cleanups over uffd.c (rev3)
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 11:47:01 +03:00
Pavel Emelyanov
12c0f452fe uffd: Unify page handling in normal and remaining modes (v2)
This one ran away from the previous set :) The two routines are now
identical; only the page-read flags differ.

v2: Keep the uffd_handle_pages() name

travis-ci: success for Some more cleanups over uffd.c (rev3)
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 11:47:01 +03:00
Adrian Reber
e82f03e1eb cr-service: add lazy-pages RPC feature check
Extend the RPC feature check functionality to also test for lazy-pages
support. This does not check for certain UFFD features (yet). Right now
it only checks if kerndat_uffd() returns non-zero.

The RPC response is now transmitted from the forked process instead of
encoding all the results into the return code. The parent RPC process
now only sends an RPC message in the case of a failure.

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-09-16 11:47:01 +03:00
Mike Rapoport
3fdf7c02d5 lazy-pages: reduce amount of debug printouts
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
8cf8ddfb33 lazy-pages: use -PID instead of -1 for zombie processes
This gives somewhat saner debug messages

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
d66d5cdde8 lazy-dump: do not start page server if there were errors
Currently, lazy dump starts page server regardless of errors that might
have been encountered at earlier stages. Fix it.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Kir Kolyshkin
7999c84265 page_server_async_read: fix pr_perror usage
Le sigh.

travis-ci: success for more pr_perror() usage fixes
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Pavel Emelyanov
479c778a24 page-xfer: Introduce fully asynchronous read
Add a queue of async-read jobs into page-xfer. When the
page_server_sk gets a read event from epoll it reads as
many bytes into page_server_iov + page buffer as recv
allows and returns.

Once the full iov+data is ready the requestor is notified
and the next async read is started.

This patch removes calls to recv(...MSG_WAITALL) from all
remote async paths.
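
The read loop described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `struct async_read_job` and `async_read_step()` are hypothetical names, and the real page-xfer code keeps a queue of jobs and reads a `page_server_iov` header followed by page data. The sketch only shows the core idea of taking whatever `recv()` hands over without blocking and reporting completion once the whole buffer has arrived.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical job descriptor: one pending read of `len` bytes into `buf`. */
struct async_read_job {
	char   *buf;
	size_t  len;   /* total number of bytes expected */
	size_t  done;  /* number of bytes received so far */
};

/*
 * Call this when epoll reports the socket as readable.
 * Returns 1 once the job is complete, 0 if more data is still needed,
 * and -1 on a socket error or if the peer closed the connection early.
 */
static int async_read_step(int sk, struct async_read_job *job)
{
	ssize_t ret;

	assert(job->done <= job->len);
	if (job->done == job->len)
		return 1;

	/* MSG_DONTWAIT instead of MSG_WAITALL: never block the event loop. */
	ret = recv(sk, job->buf + job->done, job->len - job->done, MSG_DONTWAIT);
	if (ret < 0)
		return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
	if (ret == 0)
		return -1;

	job->done += (size_t)ret;
	return job->done == job->len;
}
```

A caller would keep the job around between epoll wakeups, notify the requestor when the function returns 1, and then start the next queued job.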

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:34 +03:00
Pavel Emelyanov
adea705b29 uffd: Unify local and remote PF handlers
Finally, page_fault_local and page_fault_remote are
absolutely identical, so we can just merge them.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:34 +03:00
Pavel Emelyanov
76bf7d4571 page-read: Callback on io completion
This one is called by PR once IO is complete (right now
for sync cases only, more work is required here) and
lets us unify local and remote PF code in uffd.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:34 +03:00
Pavel Emelyanov
c9d374bb01 uffd: Helper to complete the #PF
The _copy and _update_lazy_iovecs are both called by hand
once the data is ready.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:34 +03:00
Pavel Emelyanov
eb0e042666 page-read: Introduce PR_ASAP flag for read_pages
This flag means that PR_ASYNC is valid, but the IO
should be started ASAP. This is how the remote reader works,
so this flag is mostly for the local reader. It will let
us unify page-fault handlers for local and remote cases.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:34 +03:00
Pavel Emelyanov
275f81bcb5 page-read: Drop get_remote_pages
We already have routines that do send-req, recv-info
and recv-page, so there is no need for yet another one.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Pavel Emelyanov
edf5809f57 page-read: Only the top-most can be remote
All the "lower" page-read-s should have already arrived with
pre-dump. This fixes the combined scheme.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
4d9d7ae7e5 lazy-pages: unblock second receive in page_server_event
The page transfer protocol is completely synchronous on the dump side,
therefore we can presume that when we get a POLLIN event on the page server
socket it is either the page info response for the last sent page request or
the page data following the last page info.
In the first case we set the ev_data associated with page server socket events
to the values received in receive_remote_page_info, and in the second case we
reset ev_data to zero. This allows us to distinguish why
page_server_event has been called.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
0b27e6520e lazy-pages: implement semi-async remote page transfer
The synchronous remote page transfer prevents reception of uffd events
during communication with the page server on the dump side. Adding the
socket file descriptor to the epoll set allows processing of incoming uffd
events after a non-blocking request for a remote page is issued and before the
dump side page server replies.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
9537433fae pagemap: add ability to request remote pages
The asynchronous version of remote page_read will send the request to the
dump side and return happily.
The response will be handled by uffd.c because its epoll loop is the
only place where we can handle events.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
d4edd9bf56 lazy-pages: introduce uffd_seek_or_zero_pages
This part of the code is responsible for resetting the pagemap to the proper
location and mapping the requested address to the zero pfn if needed. The
upcoming additions to uffd.c will reuse this code.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
250448559f page-xfer: add methods for requesting and receiving remote pages
For asynchronous page transfers in post-copy migration we need to be able
to request remote pages, receive back information about the data that is going
to arrive, and receive the page data itself.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
57d341f105 page-xfer: make connect_to_page_server return socket fd
It will be used by the lazy-pages daemon to enable polling for reception of page
data from the remote dump.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
c907796dc6 lazy-pages: make uffd_{copy,zero} return 0 on success
In the early days of uffd.c the return value of uffd_copy was used to count
transferred pages. Since this is not the case anymore, we can use 0 to
indicate success.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
b72d5f2dca lazy-pages: extend the page_read with ability to read remote pages
Currently lazy-pages daemon uses either pr->read_pages or get_remote_pages
to get actual page data from local images or remote server. From now on,
page_read will be completely responsible for getting the page data.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
c23b83ce3a criu: page-xfer: get_remote_page: respect nr_pages parameter
travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
20d7eb4c56 criu: lazy-pages: copy remaining IOVs in chunks
travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
0d5d286feb criu: lazy-pages: add nr (of pages) parameter to handle_regular_pages
travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
2ce10690c7 criu: lazy-pages: add nr_pages parameter to uffd_{copy,zero}
travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
f45229886e lazy-pages: drop _page suffix from uffd_{copy,zero}_page
travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
1bdd3e46eb criu: lazy_pages: make buffer for copying pages per-process
Currently we allocate a single page to use as an intermediate buffer for
holding data that will be used in UFFDIO_COPY. Let's allocate a buffer per
process and make that buffer large enough to hold the largest contiguous
chunk.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:34 +03:00
Mike Rapoport
ec585622ef lazy-pages: fix zero pages handling
page_read->seek_page was restored to skip zero pagemaps, therefore we
should check its return value rather than underlying PME.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00
Mike Rapoport
a98eed95fa lazy-pages: refactor uffd_handle_page
Inline relevant parts of get_page inside uffd_handle_page and call
uffd_{copy,zero}_page after we've got the data.

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00
Pavel Emelyanov
7c6f72cf4d uffd: Introduce lazy_pages_fd as preparation for socket polling
We will want to poll not only a bunch of uffd-s, but also the lazy
socket, so here's "an fd and a callback" object to be pushed into
epoll.
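
One way to picture the "an fd and a callback" object is the sketch below. It is an illustration under assumptions rather than the definition introduced by this commit: `struct pollable_fd`, `pollable_add()`, `pollable_run_once()` and `read_one_byte()` are hypothetical names. The point is that `epoll_event.data.ptr` carries a pointer to the object, so the loop can dispatch on any descriptor (a uffd or a socket) without mapping the fd back to its owner.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical "fd and a callback" object; not CRIU's actual definition. */
struct pollable_fd {
	int fd;
	int (*event)(struct pollable_fd *pfd);
};

/* Register the object; epoll hands the same pointer back with each event. */
static int pollable_add(int epollfd, struct pollable_fd *pfd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = pfd };

	assert(pfd != NULL && pfd->event != NULL);
	return epoll_ctl(epollfd, EPOLL_CTL_ADD, pfd->fd, &ev);
}

/*
 * Wait for at most one event and dispatch it.
 * Returns the callback's result, 0 on timeout, or -1 if epoll_wait failed.
 */
static int pollable_run_once(int epollfd, int timeout_ms)
{
	struct epoll_event ev;
	struct pollable_fd *pfd;
	int n = epoll_wait(epollfd, &ev, 1, timeout_ms);

	if (n <= 0)
		return n;

	pfd = ev.data.ptr;
	return pfd->event(pfd);
}

/* Example callback: consume one byte and report its value (-1 on failure). */
static int read_one_byte(struct pollable_fd *pfd)
{
	unsigned char c;

	return read(pfd->fd, &c, 1) == 1 ? c : -1;
}
```

In a real daemon the structure would typically be embedded in a larger per-process or per-socket object, and the callback would recover that object from the embedded member.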

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:33 +03:00
Mike Rapoport
700419bc7e criu: lazy-pages: replace page list with IOVs list
Instead of tracking memory handled by userfaultfd on a per-page basis we can
use IOVs for contiguous chunks.
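
As a rough illustration of the idea (not the data structure used by this commit), a sorted list of page addresses can be collapsed into one `struct iovec` per contiguous run. The function name `pages_to_iovs()` and the fixed `SKETCH_PAGE_SIZE` are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumed page size, for illustration only */

/*
 * Collapse a sorted array of page-aligned addresses into iovecs, one per
 * contiguous run of pages. Returns the number of iovecs written, or -1 if
 * the output array is too small.
 */
static int pages_to_iovs(const uintptr_t *pages, size_t nr_pages,
			 struct iovec *iov, size_t max_iovs)
{
	size_t i, n = 0;

	for (i = 0; i < nr_pages; i++) {
		assert(pages[i] % SKETCH_PAGE_SIZE == 0);

		if (n > 0) {
			uintptr_t end = (uintptr_t)iov[n - 1].iov_base +
					iov[n - 1].iov_len;

			/* The page directly follows the previous run: extend it. */
			if (end == pages[i]) {
				iov[n - 1].iov_len += SKETCH_PAGE_SIZE;
				continue;
			}
		}

		if (n == max_iovs)
			return -1;

		/* Start a new run at this page. */
		iov[n].iov_base = (void *)pages[i];
		iov[n].iov_len = SKETCH_PAGE_SIZE;
		n++;
	}

	return (int)n;
}
```

For densely populated mappings this turns thousands of per-page entries into a handful of ranges, which is what makes chunked copies practical.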

travis-ci: success for uffd: A new set of improvements
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00
Pavel Emelyanov
0086dca47d uffdd: Implement --daemon mode
Right now zdtm.py hacks around the core code and waits for
a second for the socket to appear. Let's instead implement a proper
--daemon mode for the lazy-pages daemon, with pidfile generation.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:33 +03:00
Mike Rapoport
fe53a87f70 criu: lazy-pages: simplify initialization of lazy pages list
Instead of creating mm-related parts of restore info in process tree we
can directly use MmEntry for VMA traversals.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00
Mike Rapoport
c00dd3459f criu: lazy-pages: move find_vmas and related code around
Move the find_vmas and collect_uffd_pages functions before the point where
they are actually used. This allows dropping the forward declaration of
find_vmas and will make subsequent refactoring cleaner.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00
Pavel Emelyanov
6d405370da uffd: Check for uffd event being PF early
The event received should be checked to be #PF before
accessing its other arguments.

[ Mike:
    Well, looking forward to seeing non-cooperative userfaultfd patches in the
    kernel, we should have something like

    static int handle_uffd_event(struct lazy_pages_info *lpi)
    {
    	read(&msg...);

    	switch (msg.event) {
    	case UFFD_EVENT_PAGEFAULT:
    		handle_pagefault(lpi, msg);
    		break;
    	default:
    		return -1;
    	}
    }

    But since this patch is a bugfix anyway: <ack>
]

travis-ci: success for uffd: A set of improvements over criu/uffd.c
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:33 +03:00
Pavel Emelyanov
51e6e19d11 uffd: Turn lpi_hash into list
After the previous patch we no longer need this hash since
we don't need the fd -> lpi conversion.

travis-ci: success for uffd: A set of improvements over criu/uffd.c
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:33 +03:00
Pavel Emelyanov
6f4c60c611 uffd: Keep lpi pointer on epoll_event, not fd
This helps us get lpi MUCH faster on #PF.

travis-ci: success for uffd: A set of improvements over criu/uffd.c
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:33 +03:00
Pavel Emelyanov
345b4e77ef uffdd: Read pages directly into destination buffer
This avoids excessive memcpy() one instruction below.

travis-ci: success for uffd: A set of improvements over criu/uffd.c
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
2017-09-16 09:16:33 +03:00
Kir Kolyshkin
82c8229100 uffd.c: error logging nitpicks
In cases where errno is set, we need to use pr_perror() to print it.

In cases where errno is not set, we should use pr_err().

pr_perror() doesn't need a colon or a newline. pr_err() needs a newline.
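
The convention can be illustrated with two stand-in helpers. These are not CRIU's `pr_err()` and `pr_perror()` macros; `fmt_err()` and `fmt_perror()` are hypothetical functions that format into a buffer so the difference is easy to see. The assumption, consistent with the rule above, is that the perror-style helper appends the colon, the `strerror(errno)` text and the newline on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* pr_err()-like stand-in: the caller supplies the trailing newline. */
static int fmt_err(char *out, size_t sz, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(out, sz, fmt, ap);
	va_end(ap);
	return n;
}

/* pr_perror()-like stand-in: appends ": <strerror(errno)>\n" itself. */
static int fmt_perror(char *out, size_t sz, const char *fmt, ...)
{
	int saved_errno = errno;  /* formatting must not clobber errno */
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(out, sz, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= sz)
		return n;

	return n + snprintf(out + n, sz - (size_t)n, ": %s\n",
			    strerror(saved_errno));
}
```

With helpers shaped like this, a call site for a failed system call passes only the bare message, for example `fmt_perror(buf, sizeof(buf), "Can't open %s", path)`, while a logic error without errno uses `fmt_err(buf, sizeof(buf), "Bad value %d\n", v)`.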

Cc: Adrian Reber <areber@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
travis-ci: success for Assorted nitpicks
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00
Mike Rapoport
145c509495 lazy-pages: use relative path for UNIX socket
Use relative path for UNIX socket instead of absolute one.
This ensures we won't run into problems with invalid socket names.

travis-ci: success for lazy-pages: use relative path for UNIX socket
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00
Mike Rapoport
428226a8ac lazy-pages: handle zombie processes
Restore of a zombie process does not call setup_uffd, which causes the
lazy-pages daemon to get stuck forever waiting for a (pid, uffd) pair to arrive.
Let's extend the protocol between restore and lazy-pages so that for a zombie
process a (0, -1) pair will be sent instead of the actual (uffd, pid).

travis-ci: success for lazy-pages: misc fixes (rev4)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
2017-09-16 09:16:33 +03:00