Creating bonds sometimes fails due to ovs-vsctl timing out. This commit
increases the time interface-reconfigure gives ovs-vsctl from five to
twenty seconds. We should investigate why it's taking ovs-vsctl so
long, but this helps for now.
TAP devices need to be treated slightly differently from other
devices because they cannot be opened multiple times. Instead we
open them once and share the file descriptor. This means that if
the netdev is opened multiple times one reader can drain the buffers
of another. While this is a deviation from the normal convention,
it does not impact current or planned users.
In addition, this cleans up some confusion between the file
descriptor for tap devices versus other FDs.
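
The draining behavior described above can be illustrated with a small
Python sketch. It is not code from this commit: the SharedFdHandle class
is hypothetical, and a non-blocking pipe stands in for a TAP device (a
pipe is a byte stream, not a packet device). It only shows that two
handles sharing one descriptor also share one queue of pending data.

```python
import os


class SharedFdHandle:
    """Hypothetical handle that reads from a descriptor it does not own."""

    def __init__(self, fd):
        self.fd = fd

    def recv(self, max_len=4096):
        """Return pending data, or None if nothing is queued right now."""
        try:
            return os.read(self.fd, max_len)
        except BlockingIOError:
            # The descriptor is non-blocking and its queue is empty.
            return None
```

If two such handles wrap the same non-blocking descriptor, whichever one
calls recv() first consumes the data and the other sees nothing.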
This builds on earlier work that implemented netdev object refcounting.
However, rather than requiring explicit create and destroy calls,
these operations are now performed automatically based on the reference
count. This is important because in certain situations it is not
possible to know whether a netdev has already been created. A
workaround existed (which looked fairly similar to this paradigm) but
introduced its own issues. This simplifies and unifies the API.
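
A minimal Python sketch of the idea, for illustration only. The Netdev
and NetdevRegistry names and their methods are hypothetical and do not
reflect the actual netdev API. The first open of a name creates the
object and the last close destroys it, so callers never need to know
whether the device already exists.

```python
class Netdev:
    """Stand-in for a device object; a real one would hold OS resources."""

    def __init__(self, name):
        self.name = name
        self.destroyed = False

    def destroy(self):
        self.destroyed = True


class NetdevRegistry:
    """Creates devices on first open and destroys them on last close."""

    def __init__(self):
        self._entries = {}  # name -> [Netdev, reference count]

    def open(self, name):
        entry = self._entries.get(name)
        if entry is None:
            # First reference: create the device implicitly.
            entry = [Netdev(name), 0]
            self._entries[name] = entry
        entry[1] += 1
        return entry[0]

    def close(self, dev):
        entry = self._entries[dev.name]
        entry[1] -= 1
        if entry[1] == 0:
            # Last reference dropped: destroy the device implicitly.
            dev.destroy()
            del self._entries[dev.name]

    def refcount(self, name):
        entry = self._entries.get(name)
        return entry[1] if entry else 0
```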
Some systems, such as XenServer, expect that bonds have their own interface.
This commit adds the ability to do that with the "--fake-iface" option
in ovs-vsctl's add-bond command. It also has XenServer's
interface-reconfigure use it.
Part of the solution to Bug #2376
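
As a rough illustration, a caller such as interface-reconfigure might
build the command line along these lines. The add_bond_argv helper and
the bridge, bond, and interface names are hypothetical, and the argument
order (bridge, then bond port, then member interfaces) is an assumption
about add-bond rather than something stated in this commit.

```python
def add_bond_argv(bridge, bond, interfaces, fake_iface=False):
    """Build an ovs-vsctl add-bond command line as an argument list."""
    argv = ["ovs-vsctl"]
    if fake_iface:
        # Command-specific option, placed before the command name.
        argv.append("--fake-iface")
    argv += ["add-bond", bridge, bond]
    argv += list(interfaces)
    return argv
```

The resulting list could then be handed to subprocess.call() or a
similar function.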
The first change is to not propagate the IP DF bit from the inner
packet to the outer packet. Large TCP packets can get segmented
first, which will set the DF bit. However, these segmented packets
might still be too large after the GRE header is added, requiring
fragmentation.
The second change is to raise the MTU of the GRE tunnel device.
This prevents packets from being dropped in the datapath before
they can be fragmented. Since the datapath is layer 2, it does not
do any fragmentation and drops any packets that are too large.
Both of these are temporary workarounds that need to be addressed
more carefully in the future.
Bug #2379
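
The size problem can be sketched with some arithmetic. The header sizes
below are assumptions for illustration, not figures from this commit: a
20-byte outer IPv4 header without options, a 4-byte GRE header without
optional fields, a 14-byte inner Ethernet header, and a 1500-byte link
MTU.

```python
OUTER_IPV4_HEADER = 20      # bytes, assuming no IP options
GRE_BASE_HEADER = 4         # bytes, assuming no key, checksum, or sequence
INNER_ETHERNET_HEADER = 14  # bytes, assuming whole Ethernet frames are tunneled


def encapsulated_size(inner_ip_len):
    """Size of the outer IP packet that carries an inner IP packet."""
    return (inner_ip_len + INNER_ETHERNET_HEADER
            + GRE_BASE_HEADER + OUTER_IPV4_HEADER)


def needs_fragmentation(inner_ip_len, link_mtu=1500):
    """True if the encapsulated packet no longer fits in the link MTU."""
    return encapsulated_size(inner_ip_len) > link_mtu
```

Under these assumptions a full-size 1500-byte inner packet grows to 1538
bytes, so it must be fragmented after encapsulation, which a DF bit
copied to the outer header would forbid.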
ovs-vsctl supports the "--timeout" option, which specifies the amount
of time that the operation is allowed to take before a SIGALRM is
raised. The code that parsed options had a local "timeout" that masked
the global one that was supposed to be set.
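
A Python analogue of this kind of scoping mistake, for illustration only.
It is not the ovs-vsctl code, and the function names are hypothetical.
In Python, assigning to a name inside a function creates a local unless
the name is declared global, which produces the same symptom: the option
parses cleanly but the setting the rest of the program reads never
changes.

```python
timeout = 0  # module-level setting read by the rest of the program


def parse_options_buggy(argv):
    for arg in argv:
        if arg.startswith("--timeout="):
            # Bug: this assignment creates a local "timeout" that hides
            # the module-level one, which is left untouched.
            timeout = int(arg.split("=", 1)[1])


def parse_options_fixed(argv):
    global timeout  # assign to the module-level variable
    for arg in argv:
        if arg.startswith("--timeout="):
            timeout = int(arg.split("=", 1)[1])


def get_timeout():
    """Return the module-level timeout."""
    return timeout
```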
When configuring the slaves of a bond, interface-reconfigure was
directly writing the Xen OpaqueRef as the name instead of first
converting it into a standard device name.
Bug #2377
When ovs-brcompatd can't connect to the database, the "ovs" variable
is never set. The function "brc_recv_update" takes care of draining
brcompat kernel module's netlink messages. When the netlink message
comes in to modify the bridge, that function never gets called, so a
netlink message always appears to be ready and we consume 100% CPU
looping.
With this commit, we log a warning and drop the request on the floor.
Bug #2373
dp_dev_recv() is called from do_output(), which can be called from
execute_actions(), which can be called from process context with
preemption enabled, so it needs to disable preemption before it can
access per-CPU data.
Build tested on a few different kernels.
Bug #2316.
Reported-by: John Galgay <john@galgay.net>
CC: Dan Wendlandt <dan@nicira.com>
daemonize() now closes the standard file descriptors, but ovsdb-client's
"monitor" command uses stdout even after daemonizing. This caused
tests that used "ovsdb-client --detach monitor" to fail without printing
their complete output. This commit fixes the problem.
Before SSH terminates, it waits for the PTYs that it creates for use as
stdin, stdout, and stderr to be closed. When any of the Open vSwitch
daemons were started in the background over an SSH session, they held
those file descriptors open and thus the SSH session hung. This commit
fixes the problem by closing those file descriptors, allowing SSH to
terminate.
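
A minimal sketch of one common way to release such descriptors on a POSIX
system, assuming they are redirected to /dev/null rather than left
closed. The redirect_to_devnull helper is hypothetical and is not the
daemonize() implementation.

```python
import os


def redirect_to_devnull(fds=(0, 1, 2)):
    """Point each descriptor at /dev/null, releasing what it referred to."""
    devnull = os.open(os.devnull, os.O_RDWR)
    try:
        for fd in fds:
            # dup2() closes the old target of fd (for example, a PTY)
            # and makes fd refer to /dev/null instead.
            os.dup2(devnull, fd)
    finally:
        if devnull not in fds:
            os.close(devnull)
```

Once no process holds the original PTY open, the peer waiting on it (SSH
in the case above) sees it close and can exit.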
During the transition to the configuration database, not all code from
the bridge compatibility layer was updated. In particular, the code
which removes port configuration based on RTNL notifications of port
removal was not updated. This commit brings that code back.
Note that the code that removes ports based on actively checking whether
ports exist is still commented out pending a review of its impact on GRE
support.
Bug #2370
When printing the fail-mode, ovs-vsctl would always attempt to print the
top-level one--even if it didn't exist. So, in addition to sometimes
being wrong, it could cause segfaults.
Thanks to Peter Balland for reporting the error.
Bug #2374
reconnect_run() returns RECONNECT_CONNECT to tell the client that it should
start a new connection. The client is then supposed to call
reconnect_connecting() to tell the FSM that it has begun a connection
attempt. However, even after reconnect_connecting() was called,
reconnect_run() continued to return RECONNECT_CONNECT on each call until
the connection succeeded or failed. This confused the jsonrpc_session
client, which expected that it would get a 0 return value from
reconnect_run() while the connection attempt was in progress. Connections
that required multiple trips through the main poll loop, e.g. for SSL
negotiation, would often get cut off to start a second connection attempt.
This commit changes reconnect_run() to return RECONNECT_CONNECT only until
the client tells it that a connection is in progress, which fixes the
problem. This change entails a change to the internal details of the
reconnect FSM, so this commit also updates the reconnect tests to match.
Reported by Jeremy Stribling.
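
The corrected behavior can be sketched as a tiny state machine. This is
an illustration only: the state names, the numeric value of
RECONNECT_CONNECT, and the connected() and failed() methods are
hypothetical, and timers and backoff are left out entirely.

```python
RECONNECT_CONNECT = 1  # hypothetical value; only "nonzero" matters here


class ReconnectSketch:
    """Toy FSM: asks for a connection only until one is in progress."""

    def __init__(self):
        self.state = "BACKOFF"  # ready to start a connection attempt

    def run(self):
        # Request a new connection only while no attempt is under way.
        if self.state == "BACKOFF":
            return RECONNECT_CONNECT
        return 0

    def connecting(self):
        # The client reports that it has begun a connection attempt.
        self.state = "CONNECT_IN_PROGRESS"

    def connected(self):
        self.state = "ACTIVE"

    def failed(self):
        self.state = "BACKOFF"
```

With this shape, a client that polls run() several times during a slow
handshake gets 0 each time and does not start a second attempt.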
When a new record is inserted into a database, ovsdb logs the values of all
of the fields in the record. However, often new records have many columns
that contain default values. There is no need to log those values, so this
commit causes them to be omitted.
As a side effect, this also makes "ovsdb-tool show-log --more --more"
output easier to read, because record insertions print less noise. (Adding
--more --more to this command makes it print changes to database records.
The --more option will be introduced in an upcoming commit.)
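
The filtering idea can be sketched in a few lines of Python. This is not
the ovsdb implementation; the non_default_columns helper is hypothetical,
and rows and defaults are modeled as plain dictionaries.

```python
def non_default_columns(row, defaults):
    """Return only the columns whose values differ from their defaults."""
    return {
        column: value
        for column, value in row.items()
        if column not in defaults or value != defaults[column]
    }
```

For example, a new row in which only a (hypothetical) "name" column was
set would be logged with just that column.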
The sFlow license requires sFlow to be written as sFlow(R) for the first
reference in user documentation, so this commit implements that.
Suggested by Justin Pettit.
jsonrpc_session_connect() indirectly called reconnect_disconnected(), which
told the reconnect object that the connection had failed, before it told it
that the connection attempt had started. When the connection didn't
complete immediately, this caused the connection to time out immediately,
without any backoff.
Reported by Jeremy Stribling.
Debian's kernel-headers packages starting from 2.6.32 (or thereabouts) put
links to the kernel build and source directories at the same level, named
"build" and "source" respectively. Add support for this structure.