It's useful to have a way to update all the JSON-RPC session
options all at once and not call 3 separate functions every
time. This may also allow the internals of these options
to be better abstracted, i.e. allow users to not know what
are these options exactly.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
We compare the logs in some tests, for example record/replay tests.
And those fail if for some reason the JSON object traversal happens
in the different order.
Sort the output in debug logs in order to fix sporadic test failures.
Should not affect performance in real-world cases as the actual
outgoing message is still not sorted.
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The ovsdb-cs layer triggers a forced reconnect in various cases:
- when an inconsistency is detected in the data received from the
remote server.
- when the remote server is running in clustered mode and transitioned
to "follower", if the client is configured in "leader-only" mode.
- when explicitly requested by upper layers (e.g., by the user
application, through the IDL layer).
In such cases it's desirable that reconnection should happen as fast as
possible, without the current exponential backoff maintained by the
underlying reconnect object. Furthermore, since 3c2d6274bcee ("raft:
Transfer leadership before creating snapshots."), leadership changes
inside the clustered database happen more often and, therefore,
"leader-only" clients need to reconnect more often too.
Forced reconnects call jsonrpc_session_force_reconnect() which will not
reset backoff. To make sure clients reconnect as fast as possible in
the aforementioned scenarios we first call the new API,
jsonrpc_session_reset_backoff(), in ovsdb-cs, for sessions that are in
state CS_S_MONITORING (i.e., the remote is likely still alive and
functioning fine).
jsonrpc_session_reset_backoff() resets the number of backoff-free
reconnect retries to the number of remotes configured for the session,
ensuring that all remotes are retried exactly once with backoff 0.
This commit also updates the Python IDL and jsonrpc implementations.
The Python IDL wasn't tracking the IDL_S_MONITORING state explicitly,
we now do that too. Tests were also added to make sure the IDL forced
reconnects happen without backoff.
Reported-at: https://bugzilla.redhat.com/1977264
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
If a new database server added to the cluster, or if one of the
database servers changed its IP address or port, then you need to
update the list of remotes for the client. For example, if a new
OVN_Southbound database server is added, you need to update the
ovn-remote for the ovn-controller.
However, in the current implementation, the ovsdb-cs module always
closes the current connection and creates a new one. This can lead
to a storm of re-connections if all ovn-controllers will be updated
simultaneously. They can also start re-dowloading the database
content, creating even more load on the database servers.
Correct this by saving an existing connection if it is still in the
list of remotes after the update.
'reconnect' module will report connection state updates, but that
is OK since no real re-connection happened and we only updated the
state of a new 'reconnect' instance.
If required, re-connection can be forced after the update of remotes
with ovsdb_cs_force_reconnect().
Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Current version of replay engine doesn't handle time-based internal
events that results in stream send/receive. Disabling jsonrpc inactivity
probes for now to not block process waiting for probe being sent.
The proper solution would be to implement correct record/replay
of time, probably, by recording time and using the time warping.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Open vSwitch has a few different jsonrpc-based protocols that depend on
jsonrpc_session to make sure that the connection is up and working.
In turn, jsonrpc_session uses the "reconnect" state machine to send
probes if nothing is received. This works fine in normal circumstances.
In unusual circumstances, though, it can happen that the program is
busy and doesn't even try to receive anything for a long time. Then the
timer can time out without a good reason; if it had tried to receive
something, it would have.
There's a solution that the clients of jsonrpc_session could adopt.
Instead of first calling jsonrpc_session_run(), which is what calls into
"reconnect" to deal with timing out, and then calling into
jsonrpc_session_recv(), which is what tries to receive something, they
could use the opposite order. That would make sure that the timeout
was always based on a recent attempt to receive something. Great.
The actual code in OVS that uses jsonrpc_session, though, tends to use
the opposite order, and there are enough users and this is a subtle
enough issue that it could get flipped back around even if we fixed it
now. So this commit takes a different approach. Instead of fixing
this in the users of jsonrpc_session, we fix it in the users of
reconnect: make them tell when they've tried to receive something (or
disable this particular feature).
This commit fixes the problem that way. It's kind of hard to reproduce
but I'm pretty sure that I've seen it a number of times in testing.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
RAFT messages could be fairly big. If something abnormal happens to
one of the servers in a cluster it may not be able to process all the
incoming messages in a timely manner. This results in jsonrpc backlog
growth on the sender's side. For example if follower gets many new
clients at once that it needs to serve, or it decides to take a
snapshot in a period of high number of database changes.
If backlog grows large enough it becomes harder and harder for follower
to process incoming raft messages, it sends outdated replies and
starts receiving snapshots and the whole raft log from the leader.
Sometimes backlog grows too high (60GB in this example):
jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:<ip>,
num of msgs: 15370, backlog: 61731060773.
In this case OS might actually decide to kill the sender to free some
memory. Anyway, It could take a lot of time for such a server to catch
up with the rest of the cluster if it has so much data to receive and
process.
Introducing backlog thresholds for jsonrpc connections.
If sending backlog will exceed particular values (500 messages or
4GB in size), connection will be dropped and re-created. This will
allow to drop all the current backlog and start over increasing
chances of cluster recovery.
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
It's pretty easy to get 0 remotes here from ovn-northd if you specify
--ovnnb-db='' or --ovnnb-db=' ' on the command line. The internals
of jsonrpc_session aren't equipped to cope with that, so just add a
dummy remote instead.
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Increase jsonrpc input buffer size from 512 to 4096 bytes in order to
reduce the syscall overhead when downloading huge db size
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch allows remotes not being shuffled if desired (mostly for
testing purpose, when we need the order of remotes during retrying
be predictable). By default it still shuffles as how it behaves today.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Several OVS structs contain embedded named unions, like this:
struct {
...
union {
...
} u;
};
C11 standardized a feature that many compilers already implemented
anyway, where an embedded union may be unnamed, like this:
struct {
...
union {
...
};
};
This is more convenient because it allows the programmer to omit "u."
in many places. OVS already used this feature in several places. This
commit embraces it in several others.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
This commit adds support for OVSDB clustering via Raft. Please read
ovsdb(7) for information on how to set up a clustered database. It is
simple and boils down to running "ovsdb-tool create-cluster" on one server
and "ovsdb-tool join-cluster" on each of the others and then starting
ovsdb-server in the usual way on all of them.
One you have a clustered database, you configure ovn-controller and
ovn-northd to use it by pointing them to all of the servers, e.g. where
previously you might have said "tcp:1.2.3.4" was the database server,
now you say that it is "tcp:1.2.3.4,tcp:5.6.7.8,tcp:9.10.11.12".
This also adds support for database clustering to ovs-sandbox.
Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: aginwala <aginwala@asu.edu>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The implementation cycles through the remotes in random order. This allows
clients to perform some load balancing across alternative implementations
of a service.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Poll-loop is the core to implement main loop. It should be available in
libopenvswitch.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The purpose of the sequence number is to allow the client to figure out
when the connection status has changed. The significant event for the
client is when a connection completes, not when a connection attempt
starts. Thus, this commit changes the code to increment the sequence
number at completion, not at the attempt.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
Add suport for ovsdb RBAC (role-based access control). This includes:
- Support for "RBAC_Role" table. A db schema containing a table
by this name will enable role-based access controls using
this table for RBAC role configuration.
The "RBAC_Role" table has one row per role, with each row having a
"name" column (role name) and a "permissions" column (map of
table name to UUID of row in separate permission table.) The
permission table has one row per access control configuration,
with the following columns:
"name" - name of table to which this row applies
"authorization" - set of column names and column:key pairs
to be compared against client ID to
determine authorization status
"insert_delete" - boolean, true if insertions and
authorized deletions are allowed.
"update" - Set of columns and column:key pairs for
which authorized updates are allowed.
- Support for a new "role" column in the remote configuration
table.
- Logic for applying the RBAC role and permission tables, in
combination with session role from the remote connection table
and client id, to determine whether operations modifying database
contents should be permitted.
- Support for specifying RBAC role string as a command-line option
to ovsdb-tool (Ben Pfaff).
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
To easily allow both in- and out-of-tree building of the Python
wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to
include/openvswitch. This also requires moving lib/{hmap,shash}.h.
Both hmap.h and shash.h were #include-ing "util.h" even though the
headers themselves did not use anything from there, but rather from
include/openvswitch/util.h. Fixing that required including util.h
in several C files mostly due to OVS_NOT_REACHED and things like
xmalloc.
Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This attempts to prevent namespace collisions with other list libraries
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
All code is now in include/openvswitch/list.h.
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Most vlog calls are for the log module owned by the translation unit being
compiled, but this module was referenced indirectly through a pointer
variable. That seems silly, so this commit changes the code so that the
local vlog module is referred to directly, as &this_module.
We could get rid of the global variables for vlog modules entirely, but
I like getting linker errors when there's a duplicate module name.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
This change reuses the string length that available from 'ds', saving
a strlen() call.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
In OVN, ovsdb-server is the daemon that manages the databases
and can be called as the central controller. So it would be
nice for ovsdb-server to be able to push its self-signed
certificate to all the other nodes where ovn-controller runs.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
We've been warning about the change since 2.1, which was released a year
ago.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
ofpbuf was complicated due to its wide usage across all
layers of OVS, Now we have introduced independent dp_packet
which can be used for datapath packet, we can simplify ofpbuf.
Following patch removes DPDK mbuf and access API of ofpbuf
members.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This function is really of marginal utility. This commit drops it and
makes the existing callers instead open a new pstream with the desired
dscp.
The ulterior motive here is that the set_dscp() function that actually sets
the DSCP on a socket really wants to know the address family (AF_INET vs.
AF_INET6). We could plumb that down through the stream code, and that's
one reasonable option, but I thought that simply eliminating some calls
to set_dscp() where we don't already have the address family handy was
another reasonable way to go.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
struct list is a common name and can't be used in public headers.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This commit adds a log message to notify the excessive backlog
for jsonrpc. Expectedly, this message should never be printed.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Trivial ID counters do not synchronize anything, therefore can use
atomic_count.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Until now, jsonrpc_recv() used separate iterations of its loop to receive
data, feed it to the JSON-RPC parser, and return the received message.
This is unnecessarily complicated and can occasionally mean that the
jsonrpc object has received and parsed but not returned a message. This
commit refactors the code to receive data, feed it to the parse, and
return the received message in a single iteration, and simplifies the code
in the process.
Reported-by: Chris Hydon <chydon@aristanetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
If ovs-vsctl has to wait for ovs-vswitchd to reconfigure itself
according to the new database, then sometimes ovs-vsctl could
end up stuck in the event loop if OVSDB connection was dropped
while ovs-vsctl was still running.
This patch fixes this problem by letting ovs-vsctl to reconnect
to the OVSDB, if it has to wait cur_cfg field to be updated.
Issue: 1191997
Reported-by: Spiro Kourtessis <spiro@nicira.com>
Signed-Off-By: Ansis Atteka <aatteka@nicira.com>
The OVS code has always made a distinction between the unencrypted (TCP)
and SSL port numbers for the OpenFlow and OVSDB protocols. The default
port numbers for both protocols has changed, and there continues to be
no distinction between the unencrypted and SSL versions. This
commit removes the distinction in port numbers. A future patch will
recognize the change in default port number.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This commit adds annotations for thread safety check. And the
check can be conducted by using -Wthread-safety flag in clang.
Co-authored-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
In C, one can do preprocessor tricks by making a macro expansion include
the macro's own name. We actually used this in the tree to automatically
provide function arguments, e.g.:
int f(int x, const char *file, int line);
#define f(x) f(x, __FILE__, __LINE__)
...
f(1); /* Expands to a call like f(1, __FILE__, __LINE__); */
However it's somewhat confusing, so this commit stops using that trick.
Reported-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ed Maste <emaste@freebsd.org>
Until now, ovs-vsctl has kept trying to the database server until it
succeeded or the timeout expired (if one was specified with --timeout).
This meant that if ovsdb-server wasn't running, then ovs-vsctl would hang.
The result was that almost every ovs-vsctl invocation in scripts specified
a timeout on the off-chance that the database server might not be running.
But it's difficult to choose a good timeout. A timeout that is too short
can cause spurious failures. A timeout that is too long causes long delays
if the server really isn't running.
This commit should alleviate this problem. It changes ovs-vsctl's behavior
so that, if it fails to connect to the server, it exits unsuccessfully.
This makes --timeout obsolete for the purpose of avoiding a hang if the
database server isn't running. (--timeout is still useful to avoid a hang
if ovsdb-server is running but ovs-vswitchd is not, for ovs-vsctl commands
that modify the database. --no-wait also avoids that issue.)
Bug #2393.
Bug #15594.
Reported-by: Jeff Merrick <jmerrick@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2012-09-14T05:38:26Z|00001|jsonrpc|WARN|tcp:127.0.0.1:6634: receive error: Con
ovsdb-client: transaction failed (Connection reset by peer)
NOTE: This occurs intermittently depending on how ovsdb-server runs.
Running ovsdb-client on a remote machine increases the possibility.
This is because ovsdb-server closes newly accepted tcp connection.
The following changesets caused it. struct jsonrpc_session::dscp isn't set
based on listening socket's dscp value.
- ovsdb-server creates passive connection and listens on it.
- ovsdb-server accepts connection by ovsdb_jsonrpc_server_run().
The accepted socket inherits from the listening sockets.
ovsdb_jsonrpc_server_run() creates json session, but leaves dscp of
struct jsonrpc_session zero.
- On calling reconfigure_from_db(), it resets dscp value to
all jsonrpc sessions. Eventually jsonrpc_session_set_dscp() is called.
Then jsonrpc_session_force_reconnect() closes the connection.
With this patch,
- struct jsonrpc_session::dscp is correctly set based on
listening sockets dscp value.
- dscp of listening socket is changed dynamically by setsockopt.
This leaves a window where accepted socket may have old dscp.
But it is ignored for now because it would complicates codes
too much.
The related change sets:
- 0442efd9b1a88d923b56eab6b72b6be8231a49f7
Reapplying the dscp changes: No need to restart DB/OVS on changing
dscp value.
- 59efa47adf3234ec51541405726d033173851285
Revert DSCP update changes.
- b2e18db292cd4962af3248f11e9f17e6eaf9c033
No need to restart DB / OVS on changing dscp value.
- f125905cdd3dc0339ad968c0a70128807884b400
Allow configuring DSCP on controller and manager connections.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Mehak Mahajan <mmahajan@nicira.com>