mirror of https://github.com/openvswitch/ovs synced 2025-09-01 14:55:18 +00:00

ovsdb-cs: Perform forced reconnects without a backoff.

The ovsdb-cs layer triggers a forced reconnect in various cases:
- when an inconsistency is detected in the data received from the
  remote server.
- when the remote server is running in clustered mode and transitioned
  to "follower", if the client is configured in "leader-only" mode.
- when explicitly requested by upper layers (e.g., by the user
  application, through the IDL layer).

In such cases it's desirable that reconnection should happen as fast as
possible, without the current exponential backoff maintained by the
underlying reconnect object.  Furthermore, since 3c2d6274bc ("raft:
Transfer leadership before creating snapshots."), leadership changes
inside the clustered database happen more often and, therefore,
"leader-only" clients need to reconnect more often too.

Forced reconnects call jsonrpc_session_force_reconnect(), which does not
reset the backoff.  To make sure clients reconnect as fast as possible
in the aforementioned scenarios, ovsdb-cs now first calls the new API,
jsonrpc_session_reset_backoff(), for sessions that are in state
CS_S_MONITORING (i.e., the remote is likely still alive and functioning
fine).

jsonrpc_session_reset_backoff() resets the number of backoff-free
reconnect retries to the number of remotes configured for the session,
ensuring that all remotes are retried exactly once with backoff 0.
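
As a rough illustration of the paragraph above, here is a toy model of a
backoff-free retry counter.  ToyBackoff, next_delay_ms() and the 1000/8000 ms
limits are hypothetical names and values chosen for this sketch; only
set_backoff_free_tries() mirrors a name used in this commit, and the real
reconnect object may behave differently in detail.

```python
class ToyBackoff:
    """Hypothetical stand-in for a reconnect object (not the OVS code)."""

    def __init__(self, min_backoff_ms=1000, max_backoff_ms=8000):
        self.min_backoff_ms = min_backoff_ms
        self.max_backoff_ms = max_backoff_ms
        self.backoff_ms = 0   # current exponential backoff
        self.free_tries = 0   # attempts still allowed with zero delay

    def set_backoff_free_tries(self, n):
        self.free_tries = n

    def next_delay_ms(self):
        """Return the delay to wait before the next connection attempt."""
        if self.free_tries > 0:
            # A free try is consumed and costs no delay.
            self.free_tries -= 1
            return 0
        if self.backoff_ms == 0:
            self.backoff_ms = self.min_backoff_ms
        else:
            self.backoff_ms = min(self.backoff_ms * 2, self.max_backoff_ms)
        return self.backoff_ms


def delays_after_reset(n_remotes, attempts):
    """Delays seen when free tries are set to the number of remotes."""
    backoff = ToyBackoff()
    backoff.set_backoff_free_tries(n_remotes)
    return [backoff.next_delay_ms() for _ in range(attempts)]
```

In this model, with three remotes each remote gets one attempt at delay 0,
and only later attempts start paying the exponential backoff.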

This commit also updates the Python IDL and jsonrpc implementations.
The Python IDL wasn't tracking the IDL_S_MONITORING state explicitly;
it now does.  Tests were also added to make sure that IDL forced
reconnects happen without backoff.

Reported-at: https://bugzilla.redhat.com/1977264
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Dumitru Ceara
2021-07-21 14:51:00 +02:00
committed by Ilya Maximets
parent 69b2bdfd3f
commit daf627f459
6 changed files with 123 additions and 4 deletions


@@ -102,6 +102,7 @@ class Idl(object):
    IDL_S_SERVER_MONITOR_REQUESTED = 2
    IDL_S_DATA_MONITOR_REQUESTED = 3
    IDL_S_DATA_MONITOR_COND_REQUESTED = 4
    IDL_S_MONITORING = 5

    def __init__(self, remote, schema_helper, probe_interval=None,
                 leader_only=True):
@@ -295,6 +296,7 @@ class Idl(object):
                    else:
                        assert self.state == self.IDL_S_DATA_MONITOR_REQUESTED
                        self.__parse_update(msg.result, OVSDB_UPDATE)
                    self.state = self.IDL_S_MONITORING

                except error.Error as e:
                    vlog.err("%s: parse error in received schema: %s"
@@ -442,6 +444,15 @@ class Idl(object):
    def force_reconnect(self):
        """Forces the IDL to drop its connection to the database and reconnect.
        In the meantime, the contents of the IDL will not change."""
        if self.state == self.IDL_S_MONITORING:
            # The IDL was in MONITORING state, so we either had data
            # inconsistency on this server, or it stopped being the cluster
            # leader, or the user requested to re-connect.  Avoiding backoff
            # in these cases, as we need to re-connect as soon as possible.
            # Connections that are not in MONITORING state should have their
            # backoff to avoid constant flood of re-connection attempts in
            # case there is no suitable database server.
            self._session.reset_backoff()
        self._session.force_reconnect()

    def session_name(self):


@@ -612,5 +612,18 @@ class Session(object):
    def force_reconnect(self):
        self.reconnect.force_reconnect(ovs.timeval.msec())

    def reset_backoff(self):
        """ Resets the reconnect backoff by allowing as many free tries as the
        number of configured remotes.  This is to be used by upper layers
        before calling force_reconnect() if backoff is undesirable."""
        free_tries = len(self.remotes)

        if self.is_connected():
            # The extra free try will be consumed when the current remote
            # is disconnected.
            free_tries += 1
        self.reconnect.set_backoff_free_tries(free_tries)

    def get_num_of_remotes(self):
        return len(self.remotes)
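
The interplay of the two hunks above can be exercised in isolation with
stubs.  The sketch below is not part of the commit: RecordingReconnect,
StubSession and the module-level force_reconnect() helper are hypothetical
stand-ins that copy only the logic visible in the diff (the state check, and
the "number of remotes, plus one if connected" calculation).

```python
IDL_S_DATA_MONITOR_REQUESTED = 3
IDL_S_MONITORING = 5


class RecordingReconnect:
    """Hypothetical stub that records the last free-tries value it was given."""

    def __init__(self):
        self.free_tries = None

    def set_backoff_free_tries(self, n):
        self.free_tries = n


class StubSession:
    """Hypothetical stub session mirroring reset_backoff() from the diff."""

    def __init__(self, remotes, connected):
        self.remotes = remotes
        self._connected = connected
        self.reconnect = RecordingReconnect()
        self.forced = 0

    def is_connected(self):
        return self._connected

    def reset_backoff(self):
        free_tries = len(self.remotes)
        if self.is_connected():
            # One extra try is consumed by dropping the current connection.
            free_tries += 1
        self.reconnect.set_backoff_free_tries(free_tries)

    def force_reconnect(self):
        self.forced += 1


def force_reconnect(state, session):
    """Same gating as Idl.force_reconnect() in the diff, on a bare state."""
    if state == IDL_S_MONITORING:
        session.reset_backoff()
    session.force_reconnect()


# Usage: a connected session with three remotes, forced while monitoring.
monitoring = StubSession(["tcp:a", "tcp:b", "tcp:c"], connected=True)
force_reconnect(IDL_S_MONITORING, monitoring)

# Usage: the same session shape, forced before monitoring has started.
early = StubSession(["tcp:a", "tcp:b", "tcp:c"], connected=True)
force_reconnect(IDL_S_DATA_MONITOR_REQUESTED, early)
```

In the first case the stub records four free tries (three remotes plus the
one consumed by the disconnect); in the second the backoff is left untouched,
which is the behaviour the code comment in force_reconnect() argues for.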