The ovsdb-cs layer triggers a forced reconnect in various cases:
- when an inconsistency is detected in the data received from the
remote server.
- when the remote server is running in clustered mode and transitioned
to "follower", if the client is configured in "leader-only" mode.
- when explicitly requested by upper layers (e.g., by the user
application, through the IDL layer).
In such cases it's desirable that reconnection should happen as fast as
possible, without the current exponential backoff maintained by the
underlying reconnect object. Furthermore, since 3c2d6274bc ("raft:
Transfer leadership before creating snapshots."), leadership changes
inside the clustered database happen more often and, therefore,
"leader-only" clients need to reconnect more often too.
Forced reconnects call jsonrpc_session_force_reconnect() which will not
reset backoff. To make sure clients reconnect as fast as possible in
the aforementioned scenarios we first call the new API,
jsonrpc_session_reset_backoff(), in ovsdb-cs, for sessions that are in
state CS_S_MONITORING (i.e., the remote is likely still alive and
functioning fine).
jsonrpc_session_reset_backoff() resets the number of backoff-free
reconnect retries to the number of remotes configured for the session,
ensuring that all remotes are retried exactly once with backoff 0.
This commit also updates the Python IDL and jsonrpc implementations.
The Python IDL wasn't tracking the IDL_S_MONITORING state explicitly,
we now do that too. Tests were also added to make sure the IDL forced
reconnects happen without backoff.
Reported-at: https://bugzilla.redhat.com/1977264
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This follows up on commit 4241d652e4 ("jsonrpc: Avoid disconnecting
prematurely due to long poll intervals."), which implemented the same
thing in C.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Requested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Since Python 2 support was removed in 1ca0323e7c ("Require Python 3 and
remove support for Python 2."), python3-six is not needed anymore.
Moreover python3-six is not available on RHEL/CentOS7 without using EPEL
and so this patch is needed in order to release OVS 2.13 on RHEL7.
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This fix was reverted because it depended on a small bit of code
in a patch that was reverted that changed some python/ovs testing
and build. The fix is still necessary.
The OVS C-based JSON parser operates on bytes, so the parser_feed
function returns the number of bytes that are processed. The pure
Python JSON parser currently operates on unicode, so it expects
that Parser.feed() returns a number of characters. This difference
leads to parsing errors when unicode characters are passed to the
C JSON parser from Python.
Acked-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
If attempt to open non-blocking connection results with EINPROGRESS,
further polling will trigger DISCONNECT action in case of failures.
While handling this action, jsonrpc python library closes the
connection but does not change the current remote. This leads to
subsequent connection to the same remote. And the story starts from
the beginning producing infinite attempts to connect to a single
remote regardless of existense of others. Like this:
reconnect | DBG | tcp:127.0.0.1:45932: entering BACKOFF
reconnect | INFO | tcp:127.0.0.1:45932: connecting...
reconnect | DBG | tcp:127.0.0.1:45932: entering CONNECTING
poller | DBG | 999-ms timeout
reconnect | INFO | tcp:127.0.0.1:45932: connection attempt timed out
reconnect | DBG | tcp:127.0.0.1:45932: entering BACKOFF
poller | DBG | 0-ms timeout
reconnect | INFO | tcp:127.0.0.1:45932: connecting...
<...>
reconnect | DBG | tcp:127.0.0.1:45932: entering CONNECTING
poller | DBG | 1999-ms timeout
reconnect | INFO | tcp:127.0.0.1:45932: connection attempt timed out
reconnect | INFO | tcp:127.0.0.1:45932: waiting 4 seconds before reconnect
reconnect | DBG | tcp:127.0.0.1:45932: entering BACKOFF
<...>
Fix that by always picking the new remote on disconnect.
This mimics the behaviour of jsonrpc C library.
Fixes "multiple remotes" tests on FreeBSD.
CC: Numan Siddique <nusiddiq@redhat.com>
Fixes: 31e434fc98 ("python jsonrpc: Allow jsonrpc_session to have more than one remote.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This reverts commit a7be68a4d7
and a subsequent commit 4617d1f6bd.
There are too many issues with these patches. It's better to revert
them for now and make a separate fixed versions later if needed.
List of issues (maybe not full):
1. 'make clean' removes entire 'python' directory.
2. Fully broken Travis-CI testsuite build:
building 'ovs._json' extension
creating build/temp.linux-x86_64-2.7
error: could not create 'build/temp.linux-x86_64-2.7': \
Permission denied
https://travis-ci.org/openvswitch/ovs/jobs/440693765
3. Broken local testsuite build on Ubuntu 18.04:
running build_ext
building 'ovs._json' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/ovs
<...>
/usr/bin/ld: .libs/libopenvswitch.a(util.o): \
relocation R_X86_64_TPOFF32 against `var.7749' can not be \
used when making a shared object; recompile with -fPIC
<...>
collect2: error: ld returned 1 exit status
4. Fedora build failure because of 'setuptools' ('distutils')
hard dependency on 'redhat-rpm-config' package:
building 'ovs._json' extension
<...>
gcc: error: <...>/redhat-hardened-cc1: No such file or directory
5. Looks like 'setuptools' also could download and install
unwanted python modules during package build.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The OVS C-based JSON parser operates on bytes, so the parser_feed
function returns the number of bytes that are processed. The pure
Python JSON parser currently operates on unicode, so it expects
that Parser.feed() returns a number of characters. This difference
leads to parsing errors when unicode characters are passed to the
C JSON parser from Python.
Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Python IDL implementation doesn't have the support to connect to the
cluster dbs. This patch adds this support. We are still missing the
support in python idl class to connect to the cluster master. That
support will be added in an upcoming patch.
This patch is similar to the commit 8cf6bbb184 which added multiple remote
support in the C jsonrpc implementation.
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
It can only receive 4096 bytes of data each time in jsonrpc,
when there are similar and Chinese characters occupy multiple bytes,
it may receive half a character, this time the decoding will be abnormal.
We need to receive the completed character to decode.
Signed-off-by: Guoshuai Li <ligs@dtdream.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ensure that JSON is utf-8 encoded and that bytes sent/received on
the stream sockets are in utf-8 form. Add a test case to verify
that unicode data can be sent/received successfully using Python
IDL module.
Co-authored-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
This patch is adding a new parameter called "probe_interval" to the
constructor of the Idl class. This new parameter will be used to tune
the database connection probing for that IDL session, some users might
want to tune it to be less agressive than the current 5s default in OVS
or even disable it.
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1680146
Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Acked-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
https://review.openstack.org/#/c/432906/
flake8-import-order adds 3 new flake8 warnings:
I100: Your import statements are in the wrong order.
I101: The names in your from import are in the wrong order.
I201: Missing newline between sections or imports.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Unix sockets (AF_UNIX) are not supported on Windows.
The replacement of Unix sockets on Windows is implemented
using named pipes, we are trying to mimic the behaviour
of unix sockets.
Instead of using Unix sockets to communicate
between components Named Pipes are used. This
makes the python sockets compatible with the
Named Pipe used in Windows applications.
Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Signed-off-by: Alin Balutoiu <abalutoiu@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions>
Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Python 3 has separate types for strings and bytes. Python 2 used the
same type for both. We need to convert strings to bytes before writing
them out to a socket. We also need to convert data read from the socket
to a string.
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
I've hit several bugs in this Python 3 work where the fix was some code
needed to be converted to use isinstance(). This has been primarily
around deadling with the changes to unicode handling. Go ahead and
convert the rest of the direct type comparisons to use isinstance(), as
it could avoid a bug I haven't hit yet and it's more Pythonic, anyway.
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Python 2 had str and unicode. Python 3 only has str, which is always a
unicode string. Drop use of unicode with the help of six.text_type
(unicode in py2 and str in py3) and six.string_types ([str, unicode] in
py2 and [str] in py3).
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Fix the following pep8 errors:
E201 whitespace after '('
E203 whitespace before ','
E222 multiple spaces after operator
E225 missing whitespace around operator
E226 missing whitespace around arithmetic operator
E231 missing whitespace after ':'
E241 multiple spaces after ':'
E251 unexpected spaces around keyword / parameter equals
E261 at least two spaces before inline comment
E262 inline comment should start with '# '
E265 block comment should start with '# '
E271 multiple spaces after keyword
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Resolve pep8 errors:
E711 comparison to None should be 'if cond is None:'
The reason comparing against None with "is None" is preferred over
"== None" is because a class can define its own equality operator and
produce bizarre unexpected behavior. Using "is None" has a very
explicit meaning that can not be overridden.
E721 do not compare types, use 'isinstance()'
This one is actually a mistake by the tool in most cases.
'from ovs.db import types' looks just like types from the Python stdlib.
In those cases, use the full ovs.db.types name. Fix one case where it
actually was types from the stdlib.
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
When opening a JSONRPC connection, the health probes
are incorrectly getting turned off for connections
that need probes.
In other words, when stream_or_pstream_needs_probes()
return non-zero, the probes are gettting disabled as
the probe interval is getting set to zero. This leads
to incorrect behavior such that probes are:
- not getting turned off for unix: connections
- getting turned off for tcp:/ssl: connections
The changes in this commit fix this issue.
Signed-off-by: Sumit Garg <sumit@extremenetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This suppresses a testsuite failure when the testsuite is run from a
directory whose name contains a non-ASCII character. I'd rather fix the
problem but webpages like the following make it sound difficult or
impossible on Python 2.x: http://stackoverflow.com/a/11742928
Signed-off-by: Ben Pfaff <blp@nicira.com>
When a JSON-RPC session receives bytes, or when it successfully sends
queued bytes, then it should count that as activity. However, the code
here was reversed, in that it used the wrong check in each place. That is,
when it tried to receive data, it would check whether data had just been
sent, and when it tried to send data, it would check whether data had just
been received. Neither one makes sense and doesn't work.
Bug #13214.
Reported-by: Luca Giraudo <lgiraudo@nicira.com>
CC: James Schmidt <jschmidt@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, the jsonrpc code has only counted receiving a full JSON-RPC
messages as activity. This could theoretically time out, then, while a
very long message is in transit or if a slow link is involved. This commit
changes this code to count receiving any part of a message as activity.
This isn't a problem for OpenFlow connections because OpenFlow messages are
at most 64 kB in size.
This problem hasn't actually been observed in practice.
Bug #12789.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, the jsonrpc module has used messages received from the
remote peer as the sole means to determine that the JSON-RPC
connection is up. This could in theory interact badly with a
remote peer that stops reading and processing messages from the
receive queue when there is a backlog in the send queue for a
given connection (ovsdb-server is an example of a program that
behaves this way). This commit fixes the problem by expanding
the definition of "activity" to include successfully sending
JSON-RPC data that was previously queued.
The above change is exactly analogous to the similar change
made to the rconn library in commit 133f2dc954 (rconn: Treat
draining a message from the send queue as activity.).
Bug #12789.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Receiving data is not the only reasonable way to verify that a connection
is up. For example, on a TCP connection, receiving an acknowledgment that
the remote side has accepted data that we sent is also a reasonable means.
Therefore, this commit generalizes the naming.
Also, similarly for the Python implementation: Reconnect.received() becomes
Reconnect.activity().
Signed-off-by: Ben Pfaff <blp@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
There isn't a lot of value in sending inactivity probes on unix
sockets. This patch changes the default to disable them.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
The ovs_error() and ovs_fatal() helper functions are useful enough
to be ported to Python. A user will be added in a future commit.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
This patch does minor style cleanups to the code in the python and
tests directory. There's other code floating around that could use
similar treatment, but updating it is not convenient at the moment.
These initial bindings pass a few hundred of the corresponding tests
for C implementations of various bits of the Open vSwitch library API.
The poorest part of them is actually the Python IDL interface in
ovs.db.idl, which has not received enough attention yet. It appears
to work, but it doesn't yet support writes (transactions) and it is
difficult to use. I hope to improve it as it becomes clear what
semantics Python applications actually want from an IDL.