I looked at almost every uint<N>_t in the tree to determine whether it was
really in network byte order, and converted the ones that were.
The only remaining ones, modulo my mistakes, are in openflow.h. I'm not
sure whether we should convert those, because there might be some value
in remaining close to upstream for this header.
I've never heard of anyone actually using controller discovery.
It adds a great deal of code to the source tree, and a little
bit of complication to ofproto, so this commit removes it.
ovs-vswitchd writes only the duration of its connection to or disconnection
from each controller to the database. This changes that behavior to write the
time since both the last connection and disconnection events regardless of
connection state. This mirrors the new behavior for reporting database manager
connection status.
Requested-by: Peter Balland <peter@nicira.com>
Bug #4833.
At first glance the vconn_wait() call looks risky because this function
checked whether rc->vconn is nonnull at the top. In fact it's OK because
rc->state will be S_ACTIVE or S_IDLE only if rc->vconn is nonnull, but
there's no harm in putting that check inside the block that only runs if
rc->vconn is nonnull.
Coverity #10714.
ovs_queue doesn't seem very useful; it's just a singly-linked list. It's
more generally useful to use a general-purpose "struct list" for lists of
packets, so this commit adds such a member to "struct ofpbuf" and shifts
the existing users to use it.
Until now, the collection of coverage counters supported by a given OVS
program was not specific to that program. That means that, for example,
even though ovs-dpctl does not have anything to do with mac_learning, it
still has a coverage counter for it. This is confusing, at best.
This commit fixes the problem on some systems, in particular on ones that
use GCC and the GNU linker. It uses the feature of the GNU linker
described in its manual as:
If an orphaned section's name is representable as a C identifier then
the linker will automatically see PROVIDE two symbols: __start_SECNAME
and __end_SECNAME, where SECNAME is the name of the section. These
indicate the start address and end address of the orphaned section
respectively.
Systems that don't support these features retain the earlier behavior.
This commit also fixes the annoyance that files that include coverage
counters must be listed on COVERAGE_FILES in lib/automake.mk.
This commit also fixes the annoyance that modifying any source file that
includes a coverage counter caused all programs that link against
libopenvswitch.a to relink, even programs that the source file was not
linked into. For example, modifying ofproto/ofproto.c (which includes
coverage counters) caused tests/test-aes128 to relink, even though
test-aes128 does not link again ofproto.o.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
Logging a few messages about a failed connection every few seconds for
every bridge is too much. This just logs failed connection messages for a
few attempts, then shuts them off until something changes.
Bug #2610.
The main purpose of the vconn code is to ship OpenFlow messages across
network connections. Over time a large number of utility functions related
to OpenFlow messages have also crept into vconn.c, but that's really
logically separate. This commit breaks those functions out into a new
file.
Until now, log messages about OpenFlow connections have named the target
of the connection, e.g. "tcp:1.2.3.4:5555", but they have not named the
datapath. Most often, every datapath has the same target, so this can
make it difficult to tell which connection is going wrong. Usually, that
isn't important, because all connections with the same target will have the
same problems, but it's probably better to be more informative.
This commit changes the log messages to include the datapath name, so that
"tcp:1.2.3.4:5555" becomes, e.g., "xenbr0<->tcp:1.2.3.4:5555".
Requested-by: Keith Amidon <keith@nicira.com>
The 'name' argument to these functions is actively unhelpful, because none
of the callers provided a better name than the one provided by
vconn_get_name(). So drop it.
In some cases the rconn library was logging disconnections twice, and in
other cases it was not logging them at all. This cleans it up, so that
each disconnection or connection failure should cause one log message.
This commit makes disconnect()'s 'error' argument unused, but it will be
used again in the following commit.
The vconn code is a relative fossil as OVS code goes. It was written
before we had really figured how code should fit together. Part of that
history is that it used poll_fd_callback() to register callbacks without
the assistance of other code. That isn't how the rest of OVS works now;
this code is the only remaining user of that function.
To make it more like the rest of the system, this code gets rid of the use
of poll_fd_callback(). It also adds vconn_run() and vconn_run_wait()
functions and calls to them from the places where they are now required.
When the switch is configured to connect to a controller that accepts
connections, waits a few seconds, and then disconnects without setting up
flows, currently this causes "fail-open" to flush the flow table and
stop setting up new flows during the connection duration. This is OK if
it happens once, but it can easily happen every 8 seconds with typical
backoff settings, and that isn't so great.
This commit changes fail-open to only flush the flow table once the switch
appears to have been admitted by the controller, which prevents these
frequent network interruptions.
Thanks to Jesse Gross for especially valuable feedback.
QA notes: Behavior in fail-open and especially behavior with a controller
that rejects the switch after it connects needs to be re-tested. The
ovs-controller --mute switch added by this commit is one simple way to
create such a controller.
CC: Peter Balland <peter@nicira.com>
Bug #1695. Bug #2055.
In-band control needs to know the IP and port of the controller, so that
it can set up the correct flows to talk to that controller. Until now,
the rconn code has only made this available when a connection was actually
in progress. This means that, say, ARP packets will not be allowed through
when the rconn backs off. The same is true of packets sent by switches
that access the controller through this one.
This commit makes the rconn cache the remote IP and port and local IP
across connection attempts, improving the situation. In particular, it
reduces the overall amount of time that it takes to connect in my own
simple test case from over 10 seconds to about 2 seconds.
This was a somewhat difficult merge since there was a fair amount of
superficially divergent development on the two branches, especially in the
datapath.
This has been build-tested against XenServer 5.5.0 and XenServer 5.7.0
build 15122. It has been booted and connected to XenCenter on 5.5.0.
The merge revealed a couple of outstanding bugs, which will be fixed on
citrix and then merged back into master.
Users at Nicira have commented that a maximum reconnection time of 15
seconds, which was the default, is too long. This commit cuts it to 8
seconds, on the theory that an administrator is willing to wait that long
before deciding that a change that should restore connectivity did not
work.
Previously, rconn and vconn only allowed users to find out about the
remote IP address. This set of changes allows users to retrieve the
remote port, local IP, and local port used for the connection.