Mirror of https://github.com/openvswitch/ovs, synced 2025-08-29 13:27:59 +00:00
docs: Document manual cluster recovery procedure.

Remove the notion of cluster/leave --force since it was never implemented.
Instead of these instructions, document how a broken cluster can be
re-initialized with the old database contents.

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
parent 139b564dbd
commit 01a0fff361
@@ -315,16 +315,11 @@ The above methods for adding and removing servers only work for healthy
 clusters, that is, for clusters with no more failures than their maximum
 tolerance. For example, in a 3-server cluster, the failure of 2 servers
 prevents servers joining or leaving the cluster (as well as database access).
+
 To prevent data loss or inconsistency, the preferred solution to this problem
 is to bring up enough of the failed servers to make the cluster healthy again,
 then if necessary remove any remaining failed servers and add new ones. If
-this cannot be done, though, use ``ovs-appctl`` to invoke ``cluster/leave
---force`` on a running server. This command forces the server to which it is
-directed to leave its cluster and form a new single-node cluster that contains
-only itself. The data in the new cluster may be inconsistent with the former
-cluster: transactions not yet replicated to the server will be lost, and
-transactions not yet applied to the cluster may be committed. Afterward, any
-servers in its former cluster will regard the server to have failed.
+this is not an option, see the next section for `Manual cluster recovery`_.
 
 Once a server leaves a cluster, it may never rejoin it. Instead, create a new
 server and join it to the cluster.
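Reviewer note on the hunk above: the "maximum tolerance" it mentions follows from the majority quorum that clustered OVSDB (Raft) relies on. A cluster of N servers stays healthy while a majority is up, so it tolerates floor((N-1)/2) failures. The sketch below only illustrates that arithmetic; the function names are hypothetical and are not part of OVS.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OVS): majority-quorum arithmetic for a
# cluster of N servers.

# Print the number of server failures an N-server cluster can tolerate,
# i.e. floor((N - 1) / 2).
max_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

# Succeed (exit status 0) if a cluster of $1 servers with $2 failed servers
# still has a majority, fail otherwise.
is_healthy() {
    [ "$2" -le "$(max_tolerance "$1")" ]
}

max_tolerance 3    # prints 1
max_tolerance 5    # prints 2

# The example from the documentation: 3 servers, 2 of them failed.
if is_healthy 3 2; then
    echo "healthy"
else
    echo "quorum lost"    # this branch is taken
fi
```

With 3 servers and 2 failures the check fails, which is the situation where servers can no longer join or leave and the manual recovery procedure becomes relevant.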
@@ -362,6 +357,40 @@ Clustered OVSDB does not support the OVSDB "ephemeral columns" feature.
 ones when they work with schemas for clustered databases. Future versions of
 OVSDB might add support for this feature.
 
+Manual cluster recovery
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. important::
+
+   The procedure below will result in ``cid`` and ``sid`` change. A *new*
+   cluster will be initialized.
+
+To recover a clustered database after a failure:
+
+1. Stop *all* old cluster ``ovsdb-server`` instances before proceeding.
+
+2. Pick one of the old members which will serve as a bootstrap member of the
+   to-be-recovered cluster.
+
+3. Convert its database file to the standalone format using ``ovsdb-tool
+   cluster-to-standalone``.
+
+4. Backup the standalone database file.
+
+5. Create a new single-node cluster with ``ovsdb-tool create-cluster``
+   using the previously saved standalone database file, then start
+   ``ovsdb-server``.
+
+6. Once the single-node cluster is up and running and serves the restored data,
+   new members should be created and added to the cluster, as usual, with
+   ``ovsdb-tool join-cluster``.
+
+.. note::
+
+   The data in the new cluster may be inconsistent with the former cluster:
+   transactions not yet replicated to the server chosen in step 2 will be lost,
+   and transactions not yet applied to the cluster may be committed.
+
 Upgrading from version 2.14 and earlier to 2.15 and later
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
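Reviewer note on the hunk above: the recovery steps it documents could be scripted roughly as follows. This is a minimal sketch, not a tested procedure. The database name, file paths, and addresses are hypothetical placeholders, and the `run` wrapper with its `DRY_RUN` switch is an illustration device so that the script only prints the commands by default. The `ovsdb-tool` argument orders used are `cluster-to-standalone db clusterdb`, `create-cluster db contents local`, and `join-cluster db name local remote`.

```shell
#!/bin/sh
# Sketch of the manual recovery steps. Every path, address, and database
# name below is a hypothetical placeholder; adjust them to the deployment.
set -e

DB_NAME=${DB_NAME:-OVN_Northbound}                  # example schema name
OLD_DB=${OLD_DB:-/var/lib/ovn/ovnnb_db.db}          # old clustered file
STANDALONE_DB=${STANDALONE_DB:-/var/tmp/ovnnb_db.standalone.db}
LOCAL_ADDR=${LOCAL_ADDR:-tcp:10.0.0.1:6643}         # bootstrap member address
DRY_RUN=${DRY_RUN:-1}                               # 1: only print commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 3-5, to be run on the member picked in step 2, after *all* old
# ovsdb-server instances have been stopped (step 1).
recover_bootstrap_member() {
    # Step 3: convert the clustered file to the standalone format.
    run ovsdb-tool cluster-to-standalone "$STANDALONE_DB" "$OLD_DB"
    # Step 4: back up the standalone file.
    run cp "$STANDALONE_DB" "$STANDALONE_DB.bak"
    # Step 5: keep the old clustered file aside, then create a new
    # single-node cluster from the standalone contents. Reusing the old
    # path for the new file is an assumption of this sketch.
    run mv "$OLD_DB" "$OLD_DB.old-cluster"
    run ovsdb-tool create-cluster "$OLD_DB" "$STANDALONE_DB" "$LOCAL_ADDR"
    # Starting ovsdb-server afterwards is left to the deployment's usual
    # service mechanism.
}

# Step 6, to be run on each new member: $1 = new database file,
# $2 = the new member's own address, $3 = address of an existing member.
join_new_member() {
    run ovsdb-tool join-cluster "$1" "$DB_NAME" "$2" "$3"
}

recover_bootstrap_member
join_new_member /var/lib/ovn/ovnnb_db.db tcp:10.0.0.2:6643 "$LOCAL_ADDR"
```

Keeping dry-run as the default makes it possible to review the exact command lines before anything touches the database files. Setting `DRY_RUN=0` executes them.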
@@ -461,8 +461,7 @@ This does not result in a three server cluster that lacks quorum.
 .
 .IP "\fBcluster/kick \fIdb server\fR"
 Start graceful removal of \fIserver\fR from \fIdb\fR's cluster, like
-\fBcluster/leave\fR (without \fB\-\-force\fR) except that it can
-remove any server, not just this one.
+\fBcluster/leave\fR, except that it can remove any server, not just this one.
 .IP
 \fIserver\fR may be a server ID, as printed by \fBcluster/sid\fR, or
 the server's local network address as passed to \fBovsdb-tool\fR's
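Reviewer note on the hunk above: a `cluster/kick` invocation through `ovs-appctl` might look like the sketch below. The control socket path, database name, and server identifiers are hypothetical placeholders, and the helper only prints the command line rather than running it.

```shell
#!/bin/sh
# Hypothetical placeholders: control socket path and database name.
CTL=${CTL:-/var/run/ovn/ovnnb_db.ctl}
DB_NAME=${DB_NAME:-OVN_Northbound}

# Print the ovs-appctl command that would start graceful removal of server
# $1 from the cluster. $1 is either a server ID (as printed by cluster/sid)
# or the server's local network address.
kick_cmd() {
    echo "ovs-appctl -t $CTL cluster/kick $DB_NAME $1"
}

kick_cmd SERVER_ID            # SERVER_ID stands in for a real server ID
kick_cmd tcp:10.0.0.3:6643    # removal by local address
```

Unlike `cluster/leave`, which affects only the server it is sent to, the same command can be directed at any running member to remove a different server.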