On pool join, the bridge.<bridge>.xs-network-uuids key is not updated
properly for the primary management interface. We don't have a proper
fix for this problem yet, and probably won't ever have one for XenServer
5.5.0, so this commit adds a script that works around the problem.
Running the script is a shortcut for rebooting the XenServer host,
which should also solve the problem.
Bug #2097.
NOX packages depend on a particular version of openvswitch-pki, which
depends on openvswitch-common without specifying a version. This meant
that the installed versions of openvswitch-pki and openvswitch-common
could easily get out of sync. This commit makes all of the dependencies
among openvswitch packages specify an explicit version, which should fix
this problem.
CC: Dan Wendlandt <dan@nicira.com>
vlan.%s.* will match e.g. eth0.123 if the %s expands to eth0. We only
want it to match eth0 in that case.
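The over-matching is easy to reproduce. The following is a minimal sketch, assuming the pattern is applied as a shell-style glob over dotted configuration keys. The key names and both helper functions are hypothetical and are not the code changed by this commit.

```python
import fnmatch
import re

# Hypothetical configuration keys; the real key layout may differ.
KEYS = ["vlan.eth0.tag", "vlan.eth0.123.tag", "vlan.eth1.tag"]


def loose_match(iface, keys):
    """Glob match: '*' also swallows dots, so eth0 picks up eth0.123 keys."""
    return fnmatch.filter(keys, "vlan.%s.*" % iface)


def strict_match(iface, keys):
    """Allow exactly one dot-free component after the interface name."""
    pattern = re.compile(r"^vlan\.%s\.[^.]+$" % re.escape(iface))
    return [k for k in keys if pattern.match(k)]
```

With these keys, loose_match("eth0", KEYS) also returns the eth0.123 key, while strict_match("eth0", KEYS) returns only the eth0 key.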
This is based on code inspection. It may or may not fix a real problem.
The make_unix_socket() function that Unix vconns use to create their
bindings calls fatal_signal_add_file_to_unlink() to make sure that the
binding socket gets unlinked from the file system if the process is killed
by a fatal signal. However, the unlink doesn't happen until the process is
actually killed, even if the vconn that owns the socket has already been closed.
This wasn't a problem when the vconn-unix code was written, because all
of the unix vconns were created at process start time and never destroyed
during the normal process runtime. However, these days the vswitch can
create and destroy unix vconns at runtime depending on the contents of its
configuration file, so it's better to clean up the file system and free
the memory required to keep track of these sockets.
This commit makes unix vconns and pvconns delete their files and free
the memory used to track them when the (p)vconns are closed.
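The idea can be sketched as follows. This is an illustration only, in Python rather than the C of the actual change: UnlinkRegistry and UnixListener are hypothetical stand-ins for the fatal-signal file list and a unix pvconn, and a plain file is used in place of a bound socket.

```python
import os


class UnlinkRegistry:
    """Tracks files to delete if the process dies from a fatal signal."""

    def __init__(self):
        self._files = set()

    def add_file_to_unlink(self, path):
        self._files.add(path)

    def remove_file_to_unlink(self, path):
        self._files.discard(path)

    def pending(self):
        return sorted(self._files)


class UnixListener:
    """Hypothetical stand-in for a unix pvconn that owns a socket file."""

    def __init__(self, registry, path):
        self.registry = registry
        self.path = path
        open(path, "w").close()  # placeholder for binding a real socket
        registry.add_file_to_unlink(path)

    def close(self):
        # Delete the file and stop tracking it as soon as the owner closes,
        # instead of waiting for the process to exit.
        if os.path.exists(self.path):
            os.unlink(self.path)
        self.registry.remove_file_to_unlink(self.path)
```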
This is only a very minor leak most of the time.
Bug #1817.
Ben Pfaff dug through the kernel sources and reported that
bond_miimon_inspect() supports four BOND_LINK_* states:
* BOND_LINK_UP: carrier detected, updelay has passed.
* BOND_LINK_FAIL: carrier lost, downdelay in progress.
* BOND_LINK_DOWN: carrier lost, downdelay has passed.
* BOND_LINK_BACK: carrier detected, updelay in progress.
And that bond_info_show_slave() only considers BOND_LINK_UP to be "up"
and anything else to be "down".
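The mapping described above can be written down as a small sketch. The enum member names follow the BOND_LINK_* states listed here, but the member values are descriptive strings rather than the kernel's constants, and reported_status() is a hypothetical helper.

```python
from enum import Enum


class BondLink(Enum):
    # Values are the descriptions from the list above, not kernel constants.
    UP = "carrier detected, updelay has passed"
    FAIL = "carrier lost, downdelay in progress"
    DOWN = "carrier lost, downdelay has passed"
    BACK = "carrier detected, updelay in progress"


def reported_status(state):
    """Only BOND_LINK_UP counts as "up"; every other state is "down"."""
    return "up" if state is BondLink.UP else "down"
```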
Thanks for doing this and suggesting a fix, Ben!
Free dpif_names when we're done with it.
This memory leak is not a big deal since bridge_init() is only ever called
once in a given ovs-vswitchd execution.
ROUND_UP rounds up to a multiple of a given value. That means that
bitmap_allocate() was allocating one byte for each bit in the bitmap,
which is clearly excessive.
Instead, just allocate one bit for every bit in the bitmap.
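The arithmetic behind the mistake, as a minimal sketch. It assumes 8 bits per allocation unit for illustration, and the function names are lower-case stand-ins, not the actual macros.

```python
BITS_PER_UNIT = 8  # assumption for illustration: one byte per unit


def round_up(x, y):
    """Round x up to the nearest multiple of y (what ROUND_UP does)."""
    return (x + y - 1) // y * y


def div_round_up(x, y):
    """Number of y-sized units needed to hold x items."""
    return (x + y - 1) // y


def units_allocated_before(n_bits):
    # Rounding up to a multiple yields roughly one unit per bit.
    return round_up(n_bits, BITS_PER_UNIT)


def units_needed(n_bits):
    # Dividing and rounding up yields one bit of storage per bit.
    return div_round_up(n_bits, BITS_PER_UNIT)
```

For a 1000-bit bitmap this is the difference between 1000 units and 125 units.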
There have been numerous attempts at getting in-band correct. If
history is any guide, it probably still isn't. However, this is
an attempt to document its current design, so that we can understand
what our current thinking is.
The synopsis section of the man page for ovs-appctl incorrectly stated
that the target option takes "pid" as an argument. This commit corrects
that to say "socket".
* Drop "--test-mode" option -- it was never wired up to anything.
* Add some additional checks for valid parameter combinations.
* Raise some errors for unimplemented (but not currently used in
XenServer) options.
This manpage was using a nonstandard macro that it did not define. Fix
the problem by adding the definition.
Reported-by: Ian Campbell <Ian.Campbell@citrix.com>
Our test case automation has a requirement to know which hash value a
given MAC address hashes to, in order to validate that balancing is
happening as expected, etc. Rather than attempt to reimplement the hash
algorithm used by vswitchd in Python, instead expose an appctl command
which returns this information.
It's good to clean up.
Ported from "citrix" to "master" branch with file name updated.
CC: Keith Amidon <keith@nicira.com>
CC: Henrik Amren <henrik@nicira.com>
Commit ac9634f0af "xenserver: Make RPM install work again" introduced a
new command "init-dbcache" for the interface-reconfigure script. However
it is cleaner to simply make the PIF argument to the "rewrite" command
optional.
CC: Ian Campbell <Ian.Campbell@citrix.com>
When the switch is configured to connect to a controller that accepts
connections, waits a few seconds, and then disconnects without setting up
flows, currently this causes "fail-open" to flush the flow table and
stop setting up new flows during the connection duration. This is OK if
it happens once, but it can easily happen every 8 seconds with typical
backoff settings, and that isn't so great.
This commit changes fail-open to only flush the flow table once the switch
appears to have been admitted by the controller, which prevents these
frequent network interruptions.
Thanks to Jesse Gross for especially valuable feedback.
QA notes: Behavior in fail-open and especially behavior with a controller
that rejects the switch after it connects needs to be re-tested. The
ovs-controller --mute switch added by this commit is one simple way to
create such a controller.
CC: Peter Balland <peter@nicira.com>
Bug #1695. Bug #2055.
Currently only ofproto.c ever composes OFPT_PACKET_IN messages, but some
upcoming code wants to do the same thing, so factor this out into a new
function to avoid code duplication.
The bonding code in vswitch sends out gratuitous learning packets that
are supposed to teach switches but not cause anything else to happen on
the network. Some upcoming code wants to synthesize packets with similar
properties, so factor this code into a new function so that it can be
used in both places.
Dan requested this change to make it less likely that a user will
encounter an expired CA certificate.
For the "citrix" branch instead of "master" in case a customer upgrades
(without generating new CA certificates) away from the beta.
CC: Dan Wendlandt <dan@nicira.com>
Whether a port is internal is cached to avoid requerying the kernel
every time stats are requested. However, the cache validity bit was
never being set so the cache wasn't used. This corrects that
oversight.
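The shape of the bug, as a hedged sketch. InternalPortCache and query_kernel are hypothetical names, and the real code is C rather than Python.

```python
class InternalPortCache:
    """Caches whether a port is internal so the kernel is queried once."""

    def __init__(self, query_kernel):
        self._query_kernel = query_kernel  # hypothetical kernel query callback
        self._valid = False
        self._is_internal = False

    def is_internal(self):
        if not self._valid:
            self._is_internal = self._query_kernel()
            # Without this assignment the cached value is never trusted and
            # every call falls through to the kernel query again.
            self._valid = True
        return self._is_internal
```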
Thanks to Ben Pfaff for noticing.
A lock file in /var/lock/subsys must be created with the same name as
the initscript in order for the stop action to be automatically called
on runlevel change. This is true at least on Red Hat derived systems
such as XenServer where /etc/rcS contains:
    # First, run the KILL scripts.
    for i in /etc/rc$runlevel.d/K* ; do
        check_runlevel "$i" || continue
        # Check if the subsystem is already up.
        subsys=${i#/etc/rc$runlevel.d/K??}
        [ -f /var/lock/subsys/$subsys -o -f /var/lock/subsys/$subsys.init ] \
            || continue
        ...
(This could potentially expose bugs, e.g. in the stop priority for the
script, since I think it is likely that the stop action hasn't been
running until now. I haven't closely considered this case yet, but vswitch
is currently scheduled at K91vswitch vs K90network, which seems correct
at first glance.)
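The check that the rc script performs can be restated as a short sketch. It mirrors only the prefix stripping and the two file tests quoted above; the function names and the lock_dir parameter are introduced here for illustration.

```python
import os
import re


def subsys_name(kill_script):
    """Strip the directory and the 'K' plus two-character priority prefix."""
    base = os.path.basename(kill_script)
    return base[3:] if re.match(r"K..", base) else base


def stop_will_run(kill_script, lock_dir="/var/lock/subsys"):
    """True if a lock file named after the subsystem (or <name>.init) exists."""
    name = subsys_name(kill_script)
    candidates = (name, name + ".init")
    return any(os.path.isfile(os.path.join(lock_dir, c)) for c in candidates)
```

For /etc/rc3.d/K91vswitch the lock file must therefore be named vswitch (or vswitch.init) for the stop action to run.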
Linux 2.6.27 introduces a new mechanism for sharing STP packets among
kernel modules, which means that the code in datapath.c to avoid loading
when the Linux bridging module is also loaded has false positives. So
fall back on these newer kernels to a less reliable way of avoiding the
bridge module, but one that does not have false positives.
CC: Jean Tourrilhes <jt@hpl.hp.com>
Commit c798b21c6a "xenserver: Only consider the host we are running on in
interface-reconfigure" dropped the get_pifs_by_record function in favor
of get_pifs_by_device, but didn't adapt callers properly, so that the
XenServer network PIFs weren't properly found and thus the xs-network-uuids
keys weren't set correctly.
This fixes the caller.
Bug #2043.
Currently ovs-vsctl tries to treat /var/run/ovs-vswitchd.*.ctl as a
file/pipe when it is actually a Unix domain socket:
# ovs-vsctl add-br TEST
Traceback (most recent call last):
  File "/usr/bin/ovs-vsctl", line 498, in ?
    main()
  File "/usr/bin/ovs-vsctl", line 493, in main
    function(*args)
  File "/usr/bin/ovs-vsctl", line 345, in cmd_add_br
    cfg_save(cfg, VSWITCHD_CONF)
  File "/usr/bin/ovs-vsctl", line 142, in cfg_save
    cfg_reload()
  File "/usr/bin/ovs-vsctl", line 126, in cfg_reload
    f = open(target, "r+")
IOError: [Errno 6] No such device or address: ' '
# ls -l /var/run/ovs-vswitchd.4173.ctl
srw------- 1 root root 0 Sep 14 12:25 /var/run/ovs-vswitchd.4173.ctl
From strace:
open("/var/run/ovs-vswitchd.4173.ctl", O_RDWR|O_LARGEFILE) = -1 ENXIO (No such device or address)
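One way to avoid the failing open() is to check the file type first and connect with a socket when the target is a Unix domain socket. The following is a sketch under that assumption. The function names are hypothetical, and the payload is an opaque byte string, not the real control protocol.

```python
import os
import socket
import stat


def is_unix_socket(path):
    """True if path is a Unix domain socket rather than a file or pipe."""
    return stat.S_ISSOCK(os.stat(path).st_mode)


def send_to_target(path, data):
    """Send bytes to a control target, choosing the transport by file type."""
    if is_unix_socket(path):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            sock.sendall(data)
        finally:
            sock.close()
    else:
        # Regular files and pipes can still be opened directly.
        with open(path, "r+b") as f:
            f.write(data)
```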
Internal ports appear to have their transmit and receive stats swapped
because from the kernel's point of view these ports are acting like
the machine connected to the switch, not the switch itself. This swaps
the stats for consistency with other ports.
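A sketch of the swap, assuming the statistics arrive as a dictionary with rx_*/tx_* counters. The field names and the port_stats() helper are illustrative, not taken from the datapath code.

```python
# Counter pairs to exchange for internal ports (illustrative field names).
SWAPPED_PAIRS = (("rx_packets", "tx_packets"), ("rx_bytes", "tx_bytes"))


def port_stats(kernel_stats, is_internal):
    """Return stats from the switch's point of view.

    For internal ports the kernel counts from the host's side of the link,
    so receive and transmit counters are exchanged.
    """
    stats = dict(kernel_stats)
    if is_internal:
        for rx, tx in SWAPPED_PAIRS:
            stats[rx], stats[tx] = stats[tx], stats[rx]
    return stats
```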
Until now, when dp_output_control() queued a GSO packet to userspace, it
would first compute the checksum for the whole GSO packet, then break the
packet into segments. However this had two drawbacks:
1. The checksum had to be recomputed for each segment, wasting time.
2. Linux 2.6.22 and later would emit a warning in skb_gso_segment()
because the checksum was precomputed.
This commit changes dp_output_control() to instead break the packet into
segments, then compute the checksum across each of the segments
individually. This fixes both drawbacks.
This commit has seen light testing on Xen's 2.6.27. It has been build
tested on a few different kernel versions.
Our test automation needs to be able to validate VLAN bridges, and
for this I needed two new operations in ovs-vsctl:
* The ability to query the VLAN tag for a bridge.
* The ability to query the 'parent' of a bridge. The parent is the
non-VLAN/untagged bridge with the same physical devices and
could be a bond.
So given xenbr0 (containing eth0) + xapi2 (VLAN 42 on eth0) and xapi1
(containing bond0 == eth2+eth3) + xapi3 (VLAN 23 on the bonded
interface):
[root@warlock ~]# ovs-vsctl br-to-vlan xapi2
42
[root@warlock ~]# ovs-vsctl br-to-vlan xapi3
23
[root@warlock ~]# ovs-vsctl br-to-parent xapi2
xenbr0
[root@warlock ~]# ovs-vsctl br-to-parent xapi3
xapi1
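The two queries can be modelled against the example topology above. This is a toy model for test-writing purposes only: the VLAN_BRIDGES table and both functions are hypothetical, it does not reflect how ovs-vsctl stores or derives this information, and the result for a non-VLAN bridge (None) is an assumption.

```python
# VLAN bridge -> (parent bridge, VLAN tag), taken from the example above.
VLAN_BRIDGES = {
    "xapi2": ("xenbr0", 42),
    "xapi3": ("xapi1", 23),
}


def br_to_vlan(bridge):
    """VLAN tag of a VLAN bridge, or None for a bridge not in the table."""
    entry = VLAN_BRIDGES.get(bridge)
    return entry[1] if entry else None


def br_to_parent(bridge):
    """Untagged parent of a VLAN bridge, or None if not in the table."""
    entry = VLAN_BRIDGES.get(bridge)
    return entry[0] if entry else None
```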
In the /proc compatibility layer, the bond member was reported as up
immediately after link recovery, regardless of the updelay. I believe
the compatibility code would have been correct if the check had been done
with carrier, but since 'iface->enabled' already does that calculation, we
can use it directly.
Additionally, when a bond slave was enabled or disabled, the bond
compatibility code was not being told to update its state. This commit
makes that call.
NIC-39
Given a possible 1,024 ports on a bridge, the previous limit of 2,048
entries seems low.
If we want to increase this further we should introduce dynamic allocation
of table entries to avoid wasting memory in the common case.
CC: Keith Amidon <keith@nicira.com>
The original xen-bugtool did not collect any OVS logs. Now that more
logging is moving from /var/log/messages to ovs-vswitchd's and
ovs-brcompatd's private log files, we should include them in the
information collected for bug reports.