mir/criu - criu - Mike's Git repositories

mir/criu

mirror of https://github.com/checkpoint-restore/criu synced 2025-08-26 11:57:52 +00:00

Author	SHA1	Message	Date
Kirill Tkhai	d3d17b0cbf	files: Move prepare_ctl_tty() to criu/tty.c Move the function and reduce its arguments number. This is cleanup needed to keep all tty code together. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2017-12-28 20:02:50 +03:00
Kirill Tkhai	56cd4b53c2	files: Close ctl tty via generic engine Just mark the fle as "fake" and the engine will do all the work. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>	2017-12-28 20:02:50 +03:00
Pavel Tikhomirov	b1a67f8572	zdtm: improve tempfs_overmounted test Unchanged test provided by Andrew. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	b286426642	mount: do remaps for child-overmount of another overmount In case we have mounts: 1 /mnt/ 2 /mnt/a with parent 1 3 /mnt/a/b with parent 1 4 /mnt/a with parent 2 We determine 2 as needing remap with does_mnt_overmount() and remap it. Next we mount 4 on top of 2. Next in fixup_remap_mounts() we want to move 2 back to its parent 1, but instead move 4 there. So in this case children-overmounts need to be remapped too. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	b82c935c27	mount: fix try_remap_mount Remaps in mnt_remap_list should follow same descending order which was setup in mnt_resort_siblings(), so don't reorder them. For instance if we have sibling mounts with mountpoints: 1) /dir1/dir2/dir3 2) /dir1/dir2 3) /dir1 Here (2) is sibling-overmount for (1). Mount (3) is sibling-overmount for both (1) and (2). So when we move overmounts back in fixup_remap_mounts() we should first move (2) and only then (3). Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	f083987f27	mount: fix mnt_resort_siblings to work as described We should add new entry _before_ first entry with less depth to sort in descending order. e.g: entries in list have depths [7,5,3], adding new entry m with depth 4 we would break list_for_each_entry loop on p with depth 3, before patch we would get [7,5,3,4] after list_add, which is wrong. Also we can relax "<=" check to "<" to avoid unnecessary reordering. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	7ca51df5a8	zdtm: now tempfs_overmounted will pass so remove crfail Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	b6cfb1ce29	mount: make open_mountpoint handle overmounts properly dump of VZ7 ct fails, if we have overmounted tmpfs inside: [root@silo ~]# prlctl enter su-test-2 entered into CT CT-829e7b28 /# mkdir /mnt/overmntedtmp CT-829e7b28 /# mount -t tmpfs tmpfs /mnt/overmntedtmp/ CT-829e7b28 /# mount -t tmpfs tmpfs /mnt CT-829e7b28 /# logout [root@silo ~]# prlctl suspend su-test-2 Suspending the CT... Failed to suspend the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Will skip in-flight TCP connections (01.657913) Error (criu/mount.c:1202): mnt: Can't open ./mnt/overmntedtmp: No such file or directory (01.662528) Error (criu/util.c:709): exited, status=1 (01.664329) Error (criu/util.c:709): exited, status=1 (01.664694) Error (criu/cr-dump.c:2005): Dumping FAILED. Failed to checkpoint the Container All dump files and logs were saved to /vz/private/829e7b28-f204-4bce-b09f-d203b99befd4/dump/Dump.fail Checkpointing failed ) Criu wants to dump the contents of /mnt/overmntedtmp/ mount but it is unavailable. So we copy the mount namespace in such a case and unmount overmounts to access what we want to dump. Actual use case here is dumping CT with active mariadb and ssh connection. Together they happen to create such overmount. As by default systemd creates a separate mount namespace for mysql and also mounts tmpfs to /run/user in it, and when ssh(root) is connected - systemd also mounts tmpfs in container root mount namespace to /run/user/0 for user files. As /run is slave mount /run/user/0 also propagates to mysql's mount namespace and initially becomes overmounted by /run/user. https://jira.sw.ru/browse/PSBM-57362 remove __maybe_unused for mnt_is_overmounted and umount_overmounts changes in v2: 1) Use clone not fork, share resources with parent same as in call_in_child_process. 2) Do not enter userns (create helper) for non-overmounted mounts. Thus return back setns/resorens logic. 3) Helper opens fd for parent directly due to CLONE_FILES, remove futex. 4) Check helper exit status properly. 5) Add get_clean_fd helper. 6) Add better comments. changes in v3: 1) Pass fd from helper through args instead of ret code, fix ret code checking. 2) Add \n to pr_err in open_mountpoint changes in v5: Make comments even better. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	24298e00b3	mount: add umount_overmounts helper to make mount visible also remove __maybe_unused for __umount_children_overmounts note: leave it __maybe_unused yet Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	9e71c1f284	mount: add __umount_children_overmounts helper to make mount visible note: leave it __maybe_unused yet Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Pavel Tikhomirov	d4fa52e6b9	mount: add mnt_is_overmounted helper to check mount visibility note: leave it __maybe_unused yet Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-26 21:11:45 +03:00
Andrei Vagin	c1b0a849e4	syscall: fix arguments for preadv() It has two arguments "pos_l" and "pos_h" instead of one "off". It is used to handle 64-bit offsets on 32-bit kernels. SYSCALL_DEFINE5(preadv, unsigned long, fd, const struct iovec __user *, vec, unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h) https://github.com/checkpoint-restore/criu/issues/424 Signed-off-by: Andrei Vagin <avagin@openvz.org> Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-15 21:54:58 +03:00
Vitaly Ostrosablin	6ccd871ba6	criu: Don't fail if ghost file has no parent dirs. Due to way CRIU handles paths (as relative to workdir), there's a case, where migration would fail. Simple example is a ghost file in filesystem root (with root being cwd). For example, "/unlinked" becomes "unlinked". And original code piece scans path for other slashes, which would be missing in this case. But it's still a perfectly valid case, and there's no need to fail. So if there's no parent dir - we just don't need to create one and we can just return 0 here instead of failing. Signed-off-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-15 21:54:58 +03:00
Andrei Vagin	c767b11c54	test: check that corked udp sockets are not dumped The kernel doesn't have an interface to get a sent queue for udp sockets, so currently we can't dump them and criu dump has to fail in such cases. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-15 09:14:03 +03:00
Andrei Vagin	776782fe20	sk-inet: detect corked sockets by getting a proper sock opt Now we block all sockets with non-zero idiag_wqueue, but it doesn't mean that a CORK option is enabled for a socket. A packet can be in a network stack and it is accounted into idiag_wqueue. https://github.com/checkpoint-restore/criu/issues/409 Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-15 09:14:03 +03:00
Andrei Vagin	9ca8953baa	test/docker: check a container with a read-only file system Now it's probably one valid use case, because there is no way to commit a container when a container is being checkpointed. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-07 19:56:24 +03:00
Andrei Vagin	514510ba87	zdtm/tempfs_subns: sync children with the main process All static tests have to stop any activity before C/R. ./tempfs_subns --pidfile=tempfs_subns.pid --outfile=tempfs_subns.out --dirname=tempfs_subns.test Run criu dump Unable to kill 128: [Errno 3] No such process Run criu restore 7: Old mounts lost: [] 7: New mounts appeared: [('/rootfs/criu/test', '/'), ('/', '/proc'), ('/', '/dev/pts')] : Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-07 19:55:23 +03:00
Pavel Tikhomirov	35f7f5e360	pr_err: add \n where we miss them Except for several false positives done by: find -type f -name "*.c" -not -path "./test/*" -exec sed -i 's/\(\<pr_err.*[^\][^n]\)\("[,)]\)/\1\\n\2/g' {} \; Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-07 19:52:13 +03:00
Pierre-Olivier Mercier	acde064232	compel: add missing header required by musl This fixes a compilation issue regarding undeclared NULL and memcpy when using musl on ARM. Signed-off-by: Pierre-Olivier Mercier <nemunaire@nemunai.re>	2017-12-06 21:44:24 -08:00
Andrei Vagin	0c1b1d0fe7	pipe: dump all data from a pipe Currently we use an additional pipe to steal data from a pipe, but we don't check that we steal all data. And the additional pipe can have a smaller size. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-05 04:49:50 +03:00
Andrei Vagin	b921ad2908	test: check a pipe with a custom size CRIU doesn't handle correctly pipes with sizes which are bigger than a default one. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-04 23:59:06 +03:00
Pavel Tikhomirov	4d3ae51725	remap: don't free rpath and don't shfree_last gf a) As we shmalloced rpath it can not be xfreed. b) As we shmalloc variables in collect_remap_ghost() far away from open_remap_ghost() where we want to free them, there is no guarantee that our shmalloc was last and we can't use shfree_last(). fixes commit 0c675a5e9d40 ("files: remove link_remaps when everything has been restored") When create_ghost() fails for some reason that produces a segfault for me. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-12-02 04:08:47 +03:00
Mike Rapoport	8c545089ed	zdtm: use {read,write}_data in fifo tests Reading and writing large buffers may result in short read/write. In cases we expect the entire buffer to be transferred use {read,write}_data rather than plain read/write syscalls. Reported-by: Mr Jenkins Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 21:59:03 +03:00
Mike Rapoport	a76ff847d6	zdtm: lib: add {read,write}_data helpers Some tests expect that all the data will be handled in a single invocation of read/write system call. But it is possible to have short read/write on a loaded system and this is not an error. Add helper functions that will reliably read/write the entire buffer. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 21:58:59 +03:00
Andrei Vagin	39d65fba82	compel: use a correct name format for vma files in /proc/pid/map_files/ Currently we use the "map_files/%p-%p" format, but actually it should be "map_files/%lx-%lx". The kernel could handle both formats, but recently Alexey Dobriyan fixed the kernel and it accepts only the second format. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	7491d9191c	lazy-pages: do not allow background fetch before restore is finished If we start background memory fetch before restore is completely finished, we may try to write to the memory areas which were not yet remapped to proper place and are not registered with userfaultfd. Add synchronization between restore and the lazy-pages so that lazy-pages will only handle #PFs before all the tasks are restored. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	b3a754d706	page-server: implement epoll->hangup_event The remote page read has nothing to do if the page-server on the source has closed the connection. Just report an error and abort. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	c6cb9d882a	util: epoll: add processing of EPOLL{RD}HUP Currently when we poll a file descriptor, we only process EPOLLIN events and if a connection is closed the receiving side has no means to deal with it. Add a callback for EPOLL{RD}HUP events and the default implementation for these events that removes the file descriptor from epoll and closes it. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	584123e99b	util: epoll: rename revent to read event A bit more readable and will be easy to distinguish from upcoming hevent^Whangup_event. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	307c55640c	util: epoll: move comment about timeout decrease to uffd.c The generic epoll_wait wrapper should not make any assumptions about timeout. It is up to lazy-pages daemon to make (future) policy decisions. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	de467742c6	netfilter: use ipv4 iptables rules to block IPv4-mapped IPv6 addresses If ipv6 socket has an IPv4-mapped address, it is used to handle ipv4 connection, so we have to use ipv4 iptables rules to block this connection. Reported-by: Mr Jenkins Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Adrian Reber	89cdc6bcb4	crtools: also print the current kernel version In addition to writing the CRIU version to the log file this adds the current kernel version to the log file: (00.000008) Version: 3.5 (gitid v3.5-511-ga8cc6cf) (00.000303) Running on node01 Linux 3.10.0-513.el7.x86_64 #1 SMP Tue Feb 29 06:78:90 EST 2017 x86_64 v2: - small changes as suggested by Dmitry (thanks) Signed-off-by: Adrian Reber <areber@redhat.com> Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Pavel Begunkov	6b7c744f7b	locks: skip 'lease correction' for non-regular files Leases can be set only on regular files. Thus, as optimization we can skip attempts to find associated leases in 'correct_file_leases_type' for other fd types. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Jacob Wen	5efcb6028d	phaul: use relative path for parent link Absolute paths for parent links may not work on restore. e.g: restore on a different server(during migration). See https://github.com/checkpoint-restore/criu/blob/criu-2.x-stable/criu/image.c#L432 Signed-off-by: Jacob Wen <jian.w.wen@oracle.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Pavel Begunkov	369b56068b	zdtm: Test inherited file leases -- check children's errors in file_leases03 -- test c/r of lease transferred to child process Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Pavel Begunkov	394875cdbc	locks: Remove duplicated locks CRIU creates distinct lock record for each file descriptor on the same OFD. The patch removes these duplicates. To do so, it adds new field into struct file_lock, which stores pid of fd, on which lock was found. 'owner pid' is not actually helpful, because the original fd, on which lock has been set, can be already closed. Also it purges crutches doing the same stuff but only for file leases. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	6db46e554f	lazy-pages: drop_iovs: mark iov as not queued If we receive only part of the IOV from the page-server we recalculate the IOV so it will point to the area we still have to fetch. During the split, the IOV covering the remaining area may remain marked as 'queued' and we'll never retry fetching it. Marking the IOV as not queued will ensure its pages will be requested again. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	5f87346f27	page-xfer: remote-pages: allow receiving partial data Since commit e609267f681062b4370e528a50f635222e0c2330 ("page-pipe: allow to share pipes between page pipe buffers") the assumption that we will receive the exact amount of pages we've requested with PS_IOV_GET does not always hold. In the case we serve pages data from the images using 'page-server --lazy-page' the IOVs seen by the pagemap may cross page-pipe buffer boundaries and read_page_pipe will clamp the pages in the response to those boundaries. Adjust page_server_read so it will not try to receive more pages than page-server is going to send. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Mike Rapoport	1c94e98bf1	debug_show_page_pipe: add PPB's pipe offset Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	59770d4f8b	phaul: run the phaul test in a docker container golang from the Ubuntu Trusty is too old. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	ff599cd966	travis: run phaul tests Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	eb2736dd56	phaul/Makefile: add a target to run tests Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	ddb09e6e18	phaul: check an exit code of a page-server Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	7f482d20e8	phaul/test: exit with a non-zero code in error cases Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	08a0555ea5	phaul: print a message from error objects It can help to understand an error. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	98888ec773	phaul: add a script to run tests Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:15 +03:00
Andrei Vagin	cc1c41a03c	phaul/test: add github.com/golang/protobuf in vendor/ In this case, we can compile tests without cloning third party libraries. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:14 +03:00
Andrei Vagin	3637c414ca	phaul: add phaul/src/stats/stats.pb.go This is required for "go get", it can't execute any commands. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:14 +03:00
Andrei Vagin	767d6e1ace	lib: add lib/go/src/rpc/rpc.pb.go It is required for "go get", it can't execute any commands. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:14 +03:00
Andrei Vagin	e5f1a37925	phaul: use full paths for modules It is a general practice in golang and "go get" works in this case. Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>	2017-11-30 01:36:14 +03:00
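
The ordering rule described in commit f083987f27 (insert a new entry before the first entry with a smaller depth, using a strict "<" comparison) can be sketched in a few lines. This is only an illustration on a plain Python list; the function name `insert_descending` is hypothetical, and the real code in criu operates on kernel-style linked lists in C.

```python
def insert_descending(depths, new_depth):
    """Insert new_depth into a list that is kept in descending order.

    Hypothetical helper: the new entry goes *before* the first entry
    whose depth is strictly smaller, as the commit message describes.
    """
    for i, depth in enumerate(depths):
        # Strict "<": entries of equal depth keep their existing order,
        # so nothing is reordered unnecessarily.
        if depth < new_depth:
            depths.insert(i, new_depth)
            return depths
    # No smaller entry was found, so the new one belongs at the tail.
    depths.append(new_depth)
    return depths


# The example from the commit message: depths [7, 5, 3], new depth 4.
print(insert_descending([7, 5, 3], 4))
```

Appending after the entry with depth 3 instead would give [7, 5, 3, 4], which is the broken ordering the commit message mentions.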
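
Commit a76ff847d6 notes that a single read or write call may transfer less than the whole buffer, and that this is not an error. A minimal sketch of the retry loop such helpers need is shown below. It is a Python analogue written for illustration only: the names `read_data` and `write_data` are borrowed from the commit subject, but the signatures and error handling here are assumptions, not the zdtm library's C implementation.

```python
import os


def write_data(fd, data):
    """Write all of data to fd, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)  # may write fewer bytes than requested
        view = view[written:]


def read_data(fd, size):
    """Read exactly size bytes from fd, retrying after short reads."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than requested
        if not chunk:
            # End of file before the full buffer arrived (assumed to be an error).
            raise EOFError("fd closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Round trip through a pipe; the payload is kept small enough to fit
# in the pipe buffer so that the writer does not block.
r, w = os.pipe()
payload = b"x" * 4096
write_data(w, payload)
assert read_data(r, len(payload)) == payload
os.close(r)
os.close(w)
```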


9635 Commits