mirror of https://github.com/openvswitch/ovs synced 2025-08-30 05:47:55 +00:00

34 Commits

Author SHA1 Message Date
Terry Wilson
6f24c2bc76 ovsdb: Add Local_Config schema.
The only way to configure settings on a remote (e.g. inactivity_probe)
is via --remote=db:DB,table,row. There is no way to do this via the
existing CLI options.

For a clustered DB with multiple servers listening on unique addresses
there is no way to store these entries in the DB as the DB is shared.
For example, three servers listening on 1.1.1.1, 1.1.1.2, and 1.1.1.3
respectively would require a Manager/Connection row each, but then
all three servers would try to listen on all three addresses.

It is possible for ovsdb-server to serve multiple databases. This
means that we can have a local "config" database in addition to
the main database we are serving (Open_vSwitch, OVN_Southbound, etc.)
and this patch adds a Local_Config schema that currently just mirrors
the Connection table and a Config table with a 'connections' row that
stores each Connection.
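A minimal usage sketch of the idea (file paths and the schema file
name are illustrative, not taken from the patch): each server serves
its own local config database alongside the shared clustered one and
points --remote at it.

```shell
# Hypothetical sketch: create a per-server local config DB and serve
# it next to the clustered DB; per-server listener settings such as
# inactivity_probe then live in the local Connection table instead of
# the shared database.
ovsdb-tool create /etc/ovn/local-config.db local-config.ovsschema
ovsdb-server /etc/ovn/ovnsb_db.db /etc/ovn/local-config.db \
    --remote=db:Local_Config,Config,connections
```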

Signed-off-by: Terry Wilson <twilson@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-06-30 11:10:31 +02:00
Ilya Maximets
04e5adfedd ovsdb: raft: Fix transaction double commit due to lost leadership.
While becoming a follower, the leader aborts all the current
'in-flight' commands, so the higher layers can re-try corresponding
transactions when the new leader is elected.  However, most of these
commands are already sent to followers as append requests, hence they
will actually be committed by the majority of the cluster members,
i.e. will be treated as committed by the new leader, unless there is
an actual network problem between servers.  Meanwhile, the old leader
will decline append replies, since it's not the leader anymore and the
commands are already completed with RAFT_CMD_LOST_LEADERSHIP status.

The new leader will replicate the commit index back to the old leader.
The old leader will then re-try the previously "failed" transaction,
because "cluster error"s are temporary.

If a transaction had some prerequisites that didn't allow double
committing or there are other database constraints (like indexes) that
will not allow a transaction to be committed twice, the server will
reply to the client with a false-negative transaction result.

If there are no prerequisites or additional database constraints,
the server will execute the same transaction again, but as a follower.

E.g. in the OVN case, this may result in creation of duplicated logical
switches / routers / load balancers.  I.e. resources with the same
non-indexed name.  That may cause issues later where ovn-nbctl will
not be able to add ports to these switches / routers.

The suggested solution is to not complete (abort) the commands, but to
allow them to be completed with the commit index update from the new
leader.  This is similar to what we do to complete commands in the
reverse scenario, when a follower becomes the leader.  That scenario
was fixed by commit 5a9b53a51ec9 ("ovsdb raft: Fix duplicated
transaction execution when leader failover.").

The code paths for leader and follower inside raft_update_commit_index
were very similar, so they were refactored into one, since we also
needed the ability to complete more than one command for a follower.

A failure test is added to exercise the leadership-transfer scenario.

Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Fixes: 3c2d6274bcee ("raft: Transfer leadership before creating snapshots.")
Reported-at: https://bugzilla.redhat.com/2046340
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-26 11:43:53 +02:00
Ilya Maximets
8d480c5cec ovsdb-cluster.at: Avoid test failures due to different hashing.
Depending on compiler flags and CPU architecture, different hash
functions are used.  That impacts the order of tables and columns
in the database representation, making ovsdb report different columns
in the warning about ephemeral-to-persistent conversion.

Stripping out changing parts of the message to avoid the issue.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-11 21:13:33 +01:00
Ilya Maximets
999ba294fb ovsdb: raft: Fix inability to join the cluster after interrupted attempt.
If the joining server re-connects while catching up (e.g. if it crashed
or the connection got closed due to inactivity), the data we sent might
be lost, so the server will never reply to the append request or the
snapshot installation request.  At the same time, the leader will
decline all subsequent requests to join from that server with the
'in progress' resolution.  At this point the new server will never be
able to join the cluster, because it will never receive the raft log
while the leader thinks that it was already sent.

This happened in practice when one of the servers got preempted for a
few seconds, so the leader closed connection due to inactivity.

Destroy the joining server if a disconnection is detected.  This
allows the join to start from scratch when the server re-connects
and sends a new join request.

We can't track re-connection in raft_conn_run(), because it's an
incoming connection and jsonrpc will not keep it alive or try to
reconnect.  The next time the server re-connects it will be an
entirely new raft conn.

Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Reported-at: https://bugzilla.redhat.com/2033514
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
2022-02-25 14:15:12 +01:00
Ilya Maximets
c5a58ec155 python: idl: Allow retry even when using a single remote.
As described in commit [1], it's possible that the remote IP is backed
by a load-balancer and re-connection to this same IP will lead to a
connection to a different server.  This case is supported by the C
version of the IDL and should be supported in the same way by the
Python implementation.

[1] ca367fa5f8bb ("ovsdb-idl.c: Allows retry even when using a single remote.")

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
2021-06-11 01:11:57 +02:00
Dumitru Ceara
4c0d093b17 ovsdb-idl.at: Make test outputs more predictable.
IDL tests need predictable output from test-ovsdb.

This used to be done by first sorting the output of test-ovsdb and then
applying uuidfilt to predictably translate UUIDs.  This was not
reliable enough in case test-ovsdb processes two or more insert/delete
operations in the same iteration because the order of lines in the
output depends on the automatically generated UUID values.

To fix this we change the way test-ovsdb and test-ovsdb.py generate
outputs and prepend the table name and tracking information before
printing the contents of a row.

All existing ovsdb-idl.at and ovsdb-cluster.at tests are updated to
expect the new output format.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 13:53:20 +02:00
Ilya Maximets
35454eba79 ovsdb-cluster.at: Fix infinite loop in torture tests.
For some reason, while running cluster torture tests in the GitHub
Actions workflow, a failure of the 'echo' command doesn't break the
loop, and the subshell never exits, but keeps printing errors
indefinitely after the right side of the pipeline has broken out of
its loop:

  testsuite: line 8591: echo: write error: Broken pipe

Presumably, that is caused by some shell configuration option, but
I have no idea which one, and I'm not able to reproduce it locally
with the shell configuration options provided in the GitHub
documentation.  Let's just add an explicit 'exit' on 'echo' failure.
This will guarantee exit from the loop and the subshell regardless of
configuration.
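The pattern can be illustrated with a minimal stand-alone loop (not
the actual testsuite code): an explicit 'exit' after 'echo' guarantees
the feeder subshell dies as soon as a write fails, e.g. on a broken
pipe.

```shell
#!/bin/sh
# Illustrative feeder loop: exit explicitly when 'echo' fails instead
# of spinning forever printing write errors.
out_file=$(mktemp)
( i=0
  while :; do
    i=$((i + 1))
    echo "tick $i" || exit   # explicit exit ends the subshell
    [ "$i" -ge 100 ] && break
  done ) | head -n 3 > "$out_file"
lines=$(wc -l < "$out_file")
echo "$lines"
rm -f "$out_file"
```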

Fixes: 0f03ae3754ec ("ovsdb: Improve timing in cluster torture test.")
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-04 20:00:36 +01:00
Dumitru Ceara
f2cf667730 ovsdb-server: Replace in-memory DB contents at raft install_snapshot.
Every time a follower has to install a snapshot received from the
leader, it should also replace the data in memory. Right now this only
happens when snapshots are installed that also change the schema.

This can lead to inconsistent DB data on follower nodes and the snapshot
may fail to get applied.

Fixes: bda1f6b60588 ("ovsdb-server: Don't disconnect clients after raft install_snapshot.")
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-08-06 22:22:05 +02:00
Han Zhou
93ee420935 raft: Fix the problem of stuck in candidate role forever.
Sometimes a server can stay in the candidate role forever, even if the
server already sees the new leader and handles append-requests
normally.  However, because of the wrong role, it appears as
disconnected from the cluster and so its clients are disconnected.

This problem happens when 2 servers become candidates in the same
term, and one of them is elected as leader in that term. It can be
reproduced by the test cases added in this patch.

The root cause is that the current implementation only changes role to
follower when a bigger term is observed (in raft_receive_term__()).
According to the RAFT paper, if another candidate becomes leader with
the same term, the candidate should change to follower.

This patch fixes it by changing the role to follower when leader
is being updated in raft_update_leader().

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-06 14:32:32 -08:00
Han Zhou
2833885f7a raft: Fix raft_is_connected() when there is no leader yet.
If there is never a leader known by the current server, its status
should be "disconnected" from the cluster.  Without this patch, when
a server in a cluster is restarted, it will appear as connected before
it successfully connects back to the cluster, which is wrong.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-06 13:47:15 -08:00
Han Zhou
bda1f6b605 ovsdb-server: Don't disconnect clients after raft install_snapshot.
When the "schema" field is found in read_db(), there can be two cases:
1. There is a schema change in the clustered DB and the "schema" is the
new one.
2. An install_snapshot RPC happened, which caused log compaction on the
server, and the next log entry is just the snapshot, which always
contains the "schema" field, even though the schema hasn't changed.

The current implementation doesn't handle case 2) and always assumes
the schema has changed, hence disconnecting all clients of the server.
That can cause stability problems when a big number of clients are
connected, if this happens in a large scale environment.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-06 13:46:10 -08:00
Han Zhou
7efc698014 ovsdb raft: Fix the problem when cluster restarted after DB compaction.
The cluster doesn't work after all nodes are restarted following a DB
compaction, unless there was some transaction after the compaction and
before the restart.

Error log is like:
raft|ERR|internal error: deferred vote_request message completed but not ready
to send because message index 9 is past last synced index 0: s2 vote_request:
term=6 last_log_index=9 last_log_term=4

The root cause is that the log_synced member is not initialized when
reading the raft header.  This patch fixes it and removes the XXX
from the test case.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-20 12:57:38 -08:00
Han Zhou
6d03405394 ovsdb-cluster.at: Wait until leader is elected before DB compact.
In the test case "election timer change", before testing DB compaction
we had to insert some data.  Otherwise, inserting data after DB
compaction would cause a busy loop, as mentioned in the XXX comment.

The root cause of the busy loop is still not clear, but the test
itself didn't wait until leader election finished before initiating
DB compaction.  This patch adds the wait to make sure the test
continues only after the leader is elected, so that the following
tests are based on a clean state.  With this wait added, the busy loop
problem is gone even without inserting the data, so the additional
data insertion is also removed by this patch.

A separate patch will address the busy loop problem in the scenario:
1. Restart cluster
2. DB compact before the cluster is ready
3. Insert data

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-20 12:57:36 -08:00
Han Zhou
9bfb280a17 ovsdb raft: Fix election timer parsing in snapshot RPC.
Commit a76ba825 took care of saving and restoring the election timer
in the file header snapshot, but it didn't handle the parsing of the
election timer in the install_snapshot_request/reply RPC, which results
in problems: e.g. when an election timer change log is compacted into a
snapshot and then a new node joins the cluster, the new node will use
the default timer instead of the new value.  This patch fixes it by
parsing the election timer in the snapshot RPC.

At the same time, the patch updates the test case to cover the DB
compact and join scenario.  The test reveals two more problems related
to clustered DB compaction, as commented in the test case's XXX, which
need to be addressed separately.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-21 10:51:10 -08:00
Ilya Maximets
15394e0ff7 tests: Get rid of timeout options for control utilities.
'OVS_CTL_TIMEOUT' environment variable is exported in tests/atlocal.in
and controls timeouts for all OVS utilities in testsuite.

There should be no manual tweaks for each single command.

This helps with running tests under valgrind, where commands can
take a really long time, as you only need to change 'OVS_CTL_TIMEOUT'
in a single place.

A few manual timeouts were left in places where they make sense.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-10-16 12:11:32 +02:00
Ben Pfaff
817db73019 ovsdb-cluster: Use ovs-vsctl instead of ovn-nbctl and ovn-sbctl.
This removes a dependency on OVN from the tests.

This adds some options to ovs-vsctl to allow it to be used for testing
the clustering feature.  The new options are undocumented because
they're really just useful for testing clustering.

Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-30 13:13:26 -07:00
Han Zhou
8e35461419 ovsdb raft: Support leader election time change online.
A new unixctl command cluster/change-election-timer is implemented to
change leader election timeout base value according to the scale needs.

The change takes effect upon consensus of the cluster, implemented through
the append-request RPC.  A new field "election-timer" is added to raft log
entry for this purpose.
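A hypothetical invocation of the new command (the socket path and
value are illustrative; implementations may limit how much the timer
can change per call):

```shell
# Ask the current leader to raise the election timer base to 4000 ms;
# the change takes effect once the cluster reaches consensus on it.
ovs-appctl -t /var/run/ovn/ovnsb_db.ctl \
    cluster/change-election-timer OVN_Southbound 4000
```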

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:08 -07:00
Han Zhou
923f01cad6 raft.c: Set candidate_retrying if no leader elected since last election.
candidate_retrying is used to determine whether the current node is
disconnected from the cluster when the node is in the candidate role.
However, a node can flap between the candidate and follower roles
before a leader is elected when the majority of the cluster is down,
so is_connected() will flap, too, which confuses clients.

This patch avoids the flapping with the help of a new member,
had_leader: if no leader has been elected since the last election, we
know we are still retrying and stay reported as disconnected from the
cluster.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:07 -07:00
Han Zhou
89771c1e65 raft.c: Stale leader should disconnect from cluster.
As mentioned in the Raft paper, section 6.2:

Leaders: A server might be in the leader state, but if it isn’t the current
leader, it could be needlessly delaying client requests. For example, suppose a
leader is partitioned from the rest of the cluster, but it can still
communicate with a particular client. Without additional mechanism, it could
delay a request from that client forever, being unable to replicate a log entry
to any other servers. Meanwhile, there might be another leader of a newer term
that is able to communicate with a majority of the cluster and would be able to
commit the client’s request. Thus, a leader in Raft steps down if an election
timeout elapses without a successful round of heartbeats to a majority of its
cluster; this allows clients to retry their requests with another server.

Reported-by: Aliasgar Ginwala <aginwala@ebay.com>
Tested-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:05 -07:00
Han Zhou
ca367fa5f8 ovsdb-idl.c: Allows retry even when using a single remote.
When clustered mode is used, the client needs to retry connecting
to new servers when certain failures happen.  Today, retrying a new
connection is allowed only if multiple remotes are used, which
prevents using an LB VIP with clustered nodes.  This patch makes sure
the retry logic works when using an LB VIP: although the same IP is
used for retrying, the LB can actually redirect the connection to a
new node.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:03 -07:00
Han Zhou
4d9b28cbb7 ovsdb raft: Avoid unnecessary reconnecting during leader election.
If a server claims itself as "disconnected", all clients connected
to that server will try to reconnect to a new server in the cluster.

However, currently a server claims itself as disconnected even when
it is itself the candidate trying to become the new leader (which it
most likely will), and all its clients reconnect to another node.

During a leader fail-over (e.g. due to a leader failure), it is
expected that all clients of the old leader will have to reconnect
to other nodes in the cluster, but it is unnecessary for all the
clients of a healthy node to reconnect, which could cause more
disturbance in a large scale environment.

This patch fixes the problem by slightly changing the condition under
which a server regards itself as disconnected: if its role is
candidate, it is regarded as disconnected only if the election didn't
succeed on the first attempt.  Related failure test cases are also
unskipped and all pass with this patch.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-22 13:02:59 -07:00
Han Zhou
5a9b53a51e ovsdb raft: Fix duplicated transaction execution when leader failover.
When a transaction is submitted by a client connected to a follower,
and the leader crashes after receiving the execute_command_request
from the follower and sending out append requests to the majority of
followers, but before sending the execute_command_reply to the
follower, the transaction will finally get committed by the new
leader.  However, with the current implementation the transaction
would be committed twice.

For the root cause, there are two cases:

Case 1, the connected follower becomes the new leader. In this case,
the pending command of the follower will be cancelled during its role
changing to leader, so the trigger for the transaction will be retried.

Case 2, another follower becomes the new leader.  In this case, since
there is no execute_command_reply from the original leader (which has
crashed), the command will finally time out, causing the trigger for
the transaction to be retried.

In both cases, the transaction will be retried by the server node's
trigger retrying logic. This patch fixes the problem by below changes:

1) A pending command can be completed not only by
execute_command_reply, but also when the eid is committed, if the
execute_command_reply never came.

2) Instead of cancelling all pending commands during a role change,
let the commands continue waiting to be completed when the eid is
committed.  The timer is increased to twice the election base time,
so that the command has a chance to be completed when the leader
crashes.

This patch fixes the two raft failure test cases previously disabled.
See the test case for details of how to reproduce the problem.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-15 13:16:02 -07:00
Han Zhou
eb6922584c ovsdb raft: Test cases for cluster failures when there are pending transactions.
Implement test cases for the failure scenarios when there are pending
transactions from clients. This patch implements test cases for different
combinations of conditions with the help of previously added test
commands and options for cluster mode. The conditions include:

- Connected node from which client transaction is executed: leader, follower
- Crashed node: leader, follower that is connected, or the other follower
- Crash point:
    - For leader:
        - before/after receiving execute_command_request
        - before/after sending append_request
        - before/after sending execute_command_reply
    - For follower:
        - before/after sending execute_command_request
        - after receiving append_request

There are 16 test cases in total, and 9 of them are skipped purposely
because of the bugs found by the test cases to avoid CI failure. They will
be enabled in coming patches when the corresponding bugs are fixed.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-15 12:54:44 -07:00
Han Zhou
0f954f326d ovsdb raft: Sync commit index to followers without delay.
When an update is requested through a follower, the leader sends an
AppendRequest to all followers and waits until an AppendReply is
received from the majority, and then it updates the commit index: the
new entry is regarded as committed in the raft log.  However, this
commit is not notified to the followers (including the one that
initiated the request) until the next heartbeat (ping timeout), if
there are no other pending requests.  This results in long latency
for updates made through followers, especially when a batch of updates
is requested through the same follower.

$ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done

real    0m34.154s
user    0m0.083s
sys 0m0.250s

This patch solves the problem by sending a heartbeat as soon as the
commit index is updated on the leader.  It also avoids unnecessary
heartbeats by resetting the ping timer whenever an AppendRequest is
broadcast.  With this patch the performance is improved more than 50
times in the same test:

$ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done

real    0m0.564s
user    0m0.080s
sys 0m0.199s

Torture test cases are also updated; otherwise the tests would all
be skipped because of the improved performance.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-15 12:54:39 -07:00
Ilya Maximets
b6c9325b57 ovsdb-cluster.at: Make torture tests BSD compliant.
'read' requires an explicit argument.
'sed' is partially replaced with the more portable 'tr' because '\n'
is not recognized as a line break in the replacement on BSD.
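A stand-alone illustration of the two portability points (not the
testsuite code itself):

```shell
#!/bin/sh
# BSD sed does not treat '\n' in the replacement as a newline, so use
# tr to split on a separator instead.
out=$(printf 'a b c' | tr ' ' '\n')
echo "$out"
# POSIX requires 'read' to be given an explicit variable name.
printf 'hello\n' | { read -r word; echo "got: $word"; }
```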

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-12-20 10:15:16 -08:00
Ben Pfaff
d97af4288d ovsdb-client: Make "wait" command logging more sensible.
The "wait" command in ovsdb-client (which was introduced as part of the
clustering support) fairly often logs things that are normal for it but
in other circumstances might be cause for concern, for example messages
about being unable to connect to a remote.  Until now, it has tried to
suppress some of those itself by raising log levels.  Unfortunately, in
some cases this had the opposite effect because it overrode any settings on
the command line, such as an attempt in ovsdb-cluster.at to suppress all
logging related to the timeval module.  This commit drops the special
log levels from the "wait" command and puts equivalents into the tests
themselves.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2018-08-17 17:43:52 -07:00
Ben Pfaff
6c8dd8caaf ovsdb-cluster: Add comment to test.
I thought I had added this while revising a previous patch but oops.

Fixes: 7ee9c6e03416 ("tests: Fix cluster torture test.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-08-03 16:40:20 -07:00
Ben Pfaff
f39f02102f tests: Suppress "long poll interval" messages for ovsdb-cluster tests.
The cluster torture tests can provoke these messages, especially if run in
parallel or with valgrind, and they shouldn't cause a failure.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
2018-08-03 16:24:33 -07:00
Ben Pfaff
c7b5c53484 tests: Fix use of variable in cluster torture test.
remove_server() is supposed to deal with its argument $i, not $victim.  In
this case they happen to have the same value so the difference is moot,
but it's still best to be clear.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
2018-08-03 16:24:29 -07:00
Ben Pfaff
7ee9c6e034 tests: Fix cluster torture test.
A previous commit to improve timing also caused the cluster torture test to
be skipped (unless it failed early).  This is related to the shell "while"
loop's use of a variable $phase to indicate how far it got in the test
procedure.  A very fast machine, or one on which the races went just the
right way, might finish the test before all the torture properly starts, so
the code is designed to just skip the test if that happens.  However, a
commit to improve the accuracy ended up skipping it all the time.

Prior to the timing commit, the loop looked something like this:

    phase=0
    while :; do
        ...things that eventually increment $phase to 2...
    done
    AT_SKIP_IF([test $phase != 2])

This works fine.

The timing commit changed the "while :" to "(...something...) | while
read".  This looks innocuous but it actually causes everything inside the
"while" loop to run in a subshell.  Thus, the increments to $phase are not
visible after the loop ends, and the test always gets skipped.

This commit fixes the problem by storing the phase in a file instead of a
shell variable.
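A minimal reproduction of the pitfall and the file-based fix
(illustrative, not the actual test code):

```shell
#!/bin/sh
# Pitfall: the pipeline runs the while-loop in a subshell, so the
# $phase update is lost after the loop in shells such as bash.
phase=0
printf '1\n2\n' | while read -r x; do phase=2; done
echo "after pipeline: phase=$phase"

# Fix (as in the commit): persist the phase in a file instead of a
# shell variable.
phase_file=$(mktemp)
echo 0 > "$phase_file"
printf '1\n2\n' | while read -r x; do echo 2 > "$phase_file"; done
phase=$(cat "$phase_file")
echo "after file-based fix: phase=$phase"
rm -f "$phase_file"
```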

Fixes: 0f03ae3754ec ("ovsdb: Improve timing in cluster torture test.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
2018-08-03 16:23:04 -07:00
Ben Pfaff
5a0e4aec1a treewide: Convert leading tabs to spaces.
It's always been OVS coding style to use spaces rather than tabs for
indentation, but some tabs have snuck in over time.  This commit converts
them to spaces.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2018-06-11 15:32:00 -07:00
Ben Pfaff
0f03ae3754 ovsdb: Improve timing in cluster torture test.
Until now the timing in the cluster torture test has been pretty
inaccurate because it just worked by calling "sleep 1" in a loop that
did other things.  The longer those other things took, the more
inaccurate it got.

This commit changes to using a separate process for timing.  It still won't
be all that accurate but at least the timing loop doesn't try to do
anything else.

(I'm not sure how to actually get accurate timing in shell.)

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2018-05-25 14:25:40 -07:00
Ben Pfaff
dff60a1e71 ovsdb: Improve torture test for clusters.
This test is supposed to be parameterized, but one of the loops didn't
honor the parameterization and just had hardcoded values.  Also, the
output comparison didn't work properly for more than 100 client sets
(n1 > 100), so this adds some explicit sorting to the mix.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2018-05-25 14:25:25 -07:00
Ben Pfaff
1b1d2e6daa ovsdb: Introduce experimental support for clustered databases.
This commit adds support for OVSDB clustering via Raft.  Please read
ovsdb(7) for information on how to set up a clustered database.  It is
simple and boils down to running "ovsdb-tool create-cluster" on one server
and "ovsdb-tool join-cluster" on each of the others and then starting
ovsdb-server in the usual way on all of them.

Once you have a clustered database, you configure ovn-controller and
ovn-northd to use it by pointing them to all of the servers, e.g. where
previously you might have said "tcp:1.2.3.4" was the database server,
now you say that it is "tcp:1.2.3.4,tcp:5.6.7.8,tcp:9.10.11.12".

This also adds support for database clustering to ovs-sandbox.
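The workflow described above can be sketched as follows (database
names, file paths, and addresses are illustrative):

```shell
# On the first server: create the cluster from a schema file.
ovsdb-tool create-cluster /var/lib/ovs/ovn-sb.db \
    ovn-sb.ovsschema tcp:1.2.3.4:6644
# On each additional server: join, naming the schema, the local
# address, and at least one existing member.
ovsdb-tool join-cluster /var/lib/ovs/ovn-sb.db \
    OVN_Southbound tcp:5.6.7.8:6644 tcp:1.2.3.4:6644
# Then start ovsdb-server in the usual way on every node and point
# clients at all of the servers, e.g.:
#   tcp:1.2.3.4:6641,tcp:5.6.7.8:6641,tcp:9.10.11.12:6641
```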

Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: aginwala <aginwala@asu.edu>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-03-24 12:04:53 -07:00