When the switch is configured to connect to a controller that accepts
connections, waits a few seconds, and then disconnects without setting up
flows, currently this causes "fail-open" to flush the flow table and
stop setting up new flows during the connection duration. This is OK if
it happens once, but it can easily happen every 8 seconds with typical
backoff settings, and that isn't so great.
This commit changes fail-open to only flush the flow table once the switch
appears to have been admitted by the controller, which prevents these
frequent network interruptions.
Thanks to Jesse Gross for especially valuable feedback.
QA notes: Behavior in fail-open and especially behavior with a controller
that rejects the switch after it connects needs to be re-tested. The
ovs-controller --mute switch added by this commit is one simple way to
create such a controller.
CC: Peter Balland <peter@nicira.com>
Bug #1695. Bug #2055.
Currently only ofproto.c ever composes OFPT_PACKET_IN messages, but some
upcoming code wants to do the same thing, so factor this out into a new
function to avoid code duplication.
The bonding code in vswitch sends out gratuitous learning packets that
are supposed to teach switches but not cause anything else to happen on
the network. Some upcoming code wants to synthesize packets with similar
properties, so factor this code into a new function so that it can be
used in both places.
Dan requested this change to make it less likely that a user encounter a
CA certificate expiring.
For the "citrix" branch instead of "master" in case a customer upgrades
(without generating new CA certificates) away from the beta.
CC: Dan Wendlandt <dan@nicira.com>
Whether a port is internal is cached to avoid requerying the kernel
every time stats are requested. However, the cache vality bit was
never being set so the cache wasn't used. This corrects that
oversight.
Thanks to Ben Pfaff for noticing.
A lock file in /var/lock/subsys must be created with the same name as
the initscript in order for the stop action to be automatically called
on runlevel change. This is true at least on Red Hat derived systems
such as XenServer where /etc/rcS contains:
# First, run the KILL scripts.
for i in /etc/rc$runlevel.d/K* ; do
check_runlevel "$i" || continue
# Check if the subsystem is already up.
subsys=${i#/etc/rc$runlevel.d/K??}
[ -f /var/lock/subsys/$subsys -o -f /var/lock/subsys/$subsys.init ] \
|| continue
...
(This could potentially expose bugs e.g. in the stop priority for the
script since I think it is likely that the stop action hasn't been
running to now. I haven't closely considered this case yet but vswitch
is currently scheduled at K91vswitch vs K90network which seems correct
at first glance)
Linux 2.6.27 introduces a new mechanism for sharing STP packets among
kernel modules, which means that the code in datapath.c to avoid loading
when the Linux bridging module is also loaded has false positives. So
fall back on these newer kernels to a less reliable way of avoiding the
bridge module, but one that does not have false positives.
CC: Jean Tourrihles <jt@hpl.hp.com>
Commit c798b21c6a "xenserver: Only consider the host we are running on in
interface-reconfigure" dropped the get_pifs_by_record function in favor
of get_pifs_by_device, but didn't adapt callers properly, so that the
XenServer network PIFs weren't properly found and thus the xs-network-uuids
keys weren't set correctly.
This fixes the caller.
Bug #2043.
Currently ov-vsctl tries to treat /var/run/ovs-vswitchd.*.ctl as a
file/pipe when it is actually a Unix domain socket:
# ovs-vsctl add-br TEST
Traceback (most recent call last):
File "/usr/bin/ovs-vsctl", line 498, in ?
main()
File "/usr/bin/ovs-vsctl", line 493, in main
function(*args)
File "/usr/bin/ovs-vsctl", line 345, in cmd_add_br
cfg_save(cfg, VSWITCHD_CONF)
File "/usr/bin/ovs-vsctl", line 142, in cfg_save
cfg_reload()
File "/usr/bin/ovs-vsctl", line 126, in cfg_reload
f = open(target, "r+")
IOError: [Errno 6] No such device or address: ' '
# ls -l /var/run/ovs-vswitchd.4173.ctl
srw------- 1 root root 0 Sep 14 12:25 /var/run/ovs-vswitchd.4173.ctl
From strace:
open("/var/run/ovs-vswitchd.4173.ctl", O_RDWR|O_LARGEFILE) = -1 ENXIO (No such device or address)
Internal ports appear to have their transmit and receive stats swapped
because from the kernel's point of view these ports are acting like
the machine connected to the switch, not the switch itself. This swaps
the stats for consistency with other ports.
Until now, when dp_output_control() queued a GSO packet to userspace, it
would first compute the checksum for the whole GSO packet, then break the
packet into segments. However this had two drawbacks:
1. The checksum had to be recomputed for each segment, wasting time.
2. Linux 2.6.22 and later would emit a warning in skb_gso_segment()
because the checksum was precomputed.
This commit changes dp_output_control() to instead break the packet into
segments, then compute the checksum across each of the segments
individually. This fixes both drawbacks.
This commit has seen light testing on Xen's 2.6.27. It has been build
tested on a few different kernel versions.
Our test automation needs to be able to validate that a VLAN bridge and
for this I needed two new operations in ovs-vsctl:
* The ability to query the VLAN tag for a bridge.
* The ability to query the 'parent' of a bridge. The parent is the
non-VLAN/untagged bridge with the same physical devices and
could be a bond.
So given xenbr0 (containing eth0) + xapi2 (VLAN 42 on eth0) and xapi1
(containing bond0 == eth2+eth3) + xapi3 (VLAN 23 on the bonded
interface):
[root@warlock ~]# ovs-vsctl br-to-vlan xapi2
42
[root@warlock ~]# ovs-vsctl br-to-vlan xapi3
23
[root@warlock ~]# ovs-vsctl br-to-parent xapi2
xenbr0
[root@warlock ~]# ovs-vsctl br-to-parent xapi3
xapi1
In the /proc compatibility layer, the bond member was reported as up
immediately after link recovery, regardless of the updelay. I believe
the compatibility code was correct if the check had been done with carrier,
but since 'iface->enabled' already does that calculation, we can use it
directly.
Additinally, when a bond slave was enabled or disabled, the bond
compatibility code was not being told to update its state. This commit
makes that call.
NIC-39
Given a possible 1,024 ports on a bridge the previous limit of 2,048
entries seems low.
If we want to increase this further we should introduce dynamic allocation
of table entries to avoid wasting memory in the common case.
CC: Keith Amidon <keith@nicira.com>
The original xen-bugtool did not collect any OVS logs. Now that more
logging is moving from /var/log/messages to ovs-vswitchd's and
ovs-brcompatd's private log files, we should include them in the
information collected for bug reports.
It seems really strange that this one slipped through. Perhaps this
means that we have never tested with any action other than OFPAT_OUTPUT
(which has value 0 and thus is not affected by byte-swapping).
/usr is the standard location for installation, so use that instead of our
nonstandard location under /root.
This migrates everything except the kernel modules to /usr. The kernel
modules will be migrated in an upcoming commit.
One possibly surprising change is that the manpages listed in the %files
section of vswitch-xen.spec not only moved but added .gz extensions. This
seems to be because RPM automatically compresses manpages, but only if they
are installed in a standard system location.
The RPM install was generating a database cache in Python pickle format in
/etc/ovs-vswitchd.conf, but interface-reconfigure was looking for it in
XML format in /var/lib/openvswitch/dbcache. This fixes the problem, by
adding an init-dbcache command to interface-reconfigure and then using that
at RPM install time.
This moves the database cache creation from %pre to %post. This is
necessary so that interface-reconfigure is available from the install
script.
Set the default log level for file logging to INFO for ovs-vswitchd and
ovs-brcompatd. This is done so that coverage messages are kept in the
log file, since we no longer log them through syslog due its synchronous
writing on Xen hosts. The issue is described in detail in commit 6bc995e.
Fix test for whether file logging should be enabled for ovs-brcompatd.
Reported by Ian Campbell.
By default, many OVS processes keep track of their time through a poll
loop. If it takes an unusually long time (measured as some distance
from the mean), the processes will log stats it has been keeping about
coverage. It was doing this at level WARN.
On Xen systems, syslog messages written at level INFO and higher are
written to /var/log/messages synchronously. This would mean that there
would be dire messages that it took a few dozen milliseconds to go
through the loop, meanwhile, it would take up to 6(!) seconds writing
those. Meanwhile, the process would do no other processing, which could
be quite serious in the case of a process such as ovs-vswitchd.
This problem was somewhat masked because the time used by this logging
was not used in the calculations for determining how long it was taking
to get through the loop.
This commit lowers the default log level for those coverage messages to
INFO. On Xen systems, it raises the default level at which messages are
written to syslog to WARN.
Diagnosed and fixed with the help of Ian Campbell.
Our hope is that this will resolve many of the issues we have seen
where temporary delays in the forwarding of packets have caused issues
of various types.
This is a crossport from master of commit 44cb492 by Ian Campbell.
This reverts commit 641a0a4ed0a79d53a52d4e78ce1d90140a768798.
This is a crossport from master of commit b2cdfea by Ian Campbell, with
the following commit message:
Do not renice the netback thread.
We should increase the vswitchd daemon's priority instead.
Our hope is that this will resolve many of the issues we have seen
where temporary delays in the forwarding of packets have caused issues
of various types.
The RPM spec file was trying to remove Open vSwitch log files as part of
the RPM uninstall process. It wasn't succeeding, however, since the glob
pattern was wrong.
Instead of fixing the glob pattern, just stop trying to remove the log
files. Log files are, after all, important for trying to debug problems,
and if we delete them at uninstall time then that makes life harder.
The vswitchd.conf manpage said that fail-open is disabled by default. This
is wrong: it is enabled by default. This commit fixes the documentation.
CC: Sujatha Sumanth <ssumanth@nicira.com>
Before transimitting a packet, the datapath checks that the packet
length is not greater than the MTU. It determines the length based on
the 'protocol' field in the skb. If 'protocol' is ETH_P_8021Q, it reduces
the packet length as stored in the 'len' field by four bytes, which
is the size of a VLAN tag header. Unfortunately, packets that arrived
from userspace were not having the 'protocol' field set, which would
cause MTU-sized packets to be dropped. This commit sets the 'protocol'
field appropriately.
Thanks to Ben Pfaff for the help diagnosing this issue.
NIC-17 and NIC-26
Until now, the vswitch RPM has installed /etc/sysconfig/vswitch.example
and made the system administrator copy it to /etc/sysconfig/vswitch if he
desires. This is slightly inconvenient, since it is slightly easier for
the admin if he can just edit /etc/sysconfig/vswitch directly. This commit
changes to the latter behavior.
Bug #1810.