Ben Pfaff dug through the kernel sources and reported that
bond_miimon_inspect() supports four BOND_LINK_* states:
* BOND_LINK_UP: carrier detected, updelay has passed.
* BOND_LINK_FAIL: carrier lost, downdelay in progress.
* BOND_LINK_DOWN: carrier lost, downdelay has passed.
* BOND_LINK_BACK: carrier detected, updelay in progress.
And that bond_info_show_slave() only considers BOND_LINK_UP to be "up"
and anything else to be "down".
Thanks for doing this and suggesting a fix, Ben!
Free dpif_names when we're done with it.
This memory leak is not a big deal since bridge_init() is only ever called
once in a given ovs-vswitchd execution.
Our test case automation has a requirement to know which hash value a
given MAC address hashes to, in order to validate that balancing is
happening as expect etc.. Rather than attempt to reimplement the hash
algorithm used by vswitchd in python instead expose an appctl which
returns this information.
The bonding code in vswitch sends out gratuitous learning packets that
are supposed to teach switches but not cause anything else to happen on
the network. Some upcoming code wants to synthesize packets with similar
properties, so factor this code into a new function so that it can be
used in both places.
In the /proc compatibility layer, the bond member was reported as up
immediately after link recovery, regardless of the updelay. I believe
the compatibility code was correct if the check had been done with carrier,
but since 'iface->enabled' already does that calculation, we can use it
directly.
Additinally, when a bond slave was enabled or disabled, the bond
compatibility code was not being told to update its state. This commit
makes that call.
NIC-39
If all of the ports specified as mirror selection criteria actually do not
exist, then until now the bridge would mirror all incoming packets (on
specified VLAN(s), if any). This matches the behavior that occurs if no
mirror selection ports were specified at all, and so it makes a certain
amount of logical sense.
But it is far more likely that the user simply misspelled a port name, or
specified the name of a port that does not always exist. In fact we have
seen this behavior in practice when the controller has not caught up to
the switch's current configuration. So this commit changes the bridge to
instead disable a mirror if ports are specified and none of those ports
exist.
Bug #1904.
compose_dsts() was updating the VLAN of packets sent to VLAN mirrors
before it changed the VLAN value, but of course it's the final VLAN value
that actually matters.
Thanks to Reid for his good work tracking this one down.
Bug #1898.
This bug was introduced in the merge from the citrix branch in commit
8fef8c71 "Merge citrix into master."
Thanks to Reid for characterizing the problem.
Bug #1907.
This was a somewhat difficult merge since there was a fair amount of
superficially divergent development on the two branches, especially in the
datapath.
This has been build-tested against XenServer 5.5.0 and XenServer 5.7.0
build 15122. It has been booted and connected to XenCenter on 5.5.0.
The merge revealed a couple of outstanding bugs, which will be fixed on
citrix and then merged back into master.
Citrix QA scripts expect that "brctl show" shows a bond interface for each
bond that is added to a bridge. The only way to do that without modifying
brctl itself is to create an actual network device by that name, so this
commit adds a new bonding configuration key that causes an internal
device by the name of the bond to be created.
This feature is also necessary, but not sufficient, to allow XenCenter to
accurately show the link status and statistics of bridges (bug #1363).
This new configuration key is intentionally undocumented, because I don't
want anyone to use it.
Bug NIC-19.
Originally, the function dp_enumerate() initialized the 'all_dps'
argument. This is inconsistent with most other functions that take an
svec argument, which would only clear the contents. Further, if someone
were not careful when reusing the svec, it could lead to memory leaks.
With this change, the caller is expected to first call svec_init() on
the argument.
When a port's MAC is explicitly specified in the config file, we did not
initialize 'iface' and therefore later we could dereference a wild pointer.
This commit fixes the problem.
Justin reported that adding an internal port to a bridge caused
ovs-vswitchd to log a pair of warnings. This commit suppresses those
warnings, which were harmless.
For consistency, it's best if every netdev function takes a netdev instead
of a device name. The netdev_nodev_*() functions have always been a bit
ugly.
The netdev_nodev_*() functions have always been a bit of a kluge. It's
better to keep a network device open than to open it every time that it is
needed. This commit converts all of the users of these functions to use
the corresponding functions that take a "struct netdev *" instead.
The netdev_nodev_*() functions have always been a bit of a kluge. It's
better to keep a network device open than to open it every time that it is
needed.
Two different pieces of code in vswitchd were both iterating over all
the interfaces in a bridge and deleting some of them, then deleting any
ports that ended up with no interfaces because of this. This commit
factors this operation out into a helper function.
When there is the possibility of multiple classes of netdevs,
netdev_add_router() needs to know which of these to use, so it needs a
"struct netdev *" parameter.
Until now ovs-vswitchd has created the files in /proc/net/bonding, but not
updated them, because there was little need. But the Citrix QA tests check
that the list of bond hashes in that file is kept up-to-date, so we need
to update them whenever the bond hashes (or other data in the file) change.
This commit does that.
Bug NIC-16.
The Citrix QA scripts require the bond hashes and their assigned devices
to be noted in /proc/net/bonding. We weren't doing that, so this commit
adds them.
Bug NIC-16.
Previously, the only way to query the flow table was to run "ovs-ofctl
dump-flows". This returned most flows, but not those marked hidden by
secchan. Hidden flows are setup by mechanisms such as in-band control,
since they must not be modified by users of the controller. However,
when debugging problems on the switch, it is often useful to see what
the flow table is actually doing. The new "bridge/dump-flows" command
added to ovs-appctl shows all flows being used by the OpenFlow stack.
Before now, the default probe interval (the idle time after which an echo
request is sent on an OpenFlow connection) was set to 15 seconds. The
fail-open timeout is 3 times the probe interval, so this meant that it
took 45 seconds for a switch to fail open.
Users at Nicira have commented that this is too long. They don't like the
idea that the network will be down for most of a minute before it begins to
recover. So this commit changes the default probe interval to 5 seconds,
hence the fail-open timeout to 15 seconds.
Users at Nicira have commented that a maximum reconnection time of 15
seconds, which was the default, is too long. This commit cuts it to 8
seconds, on the theory that an administrator is willing to wait that long
before deciding that a change that should restore connectivity did not
work.
The xapi database for PIFs specifies the MAC address that should be used
for bonds, but interface-reconfigure didn't honor it and ovs-vswitchd
didn't have a way to configure it anyhow. This commit fixes both problems.
Bug #1645.
The "fdb/show" unixctl command was showing vswitch-internal port indexes,
which cannot be meaningfully interpreted by software outside vswitchd.
Also, they potentially change every time the vswitchd configuration file
changes. This commit changes it to use a datapath port index instead,
which are both more meaningful and more stable.
To implement "brctl showmacs" for bridge compatibility, brcompatd needs to
be able to extract the MAC learning table from ovs-vswitchd. This provides
a way, and it may be directly useful to switch administrators also.
If a network device takes a few seconds to detect carrier, as some do, then
when bringing up a network device and then immediately adding that device
to a bridge, the bond code would start out with that slave considered down
and apply the full updelay to it before bringing it up. With the 31-second
updelay set by XenServer, this is excessive: we end up having no
connectivity at all for 31 seconds even though there is no reason for it.
This commit makes the bond code disregard the updelay when an interface
comes up on a bond that has no enabled interfaces, and updates the
documentation to match.
Part of bug #1566.
At startup, the vswitch needs to delete datapaths that are not configured
by the administrator. Until now this was done by knowing the possible
names of Linux datapaths. This commit cleans up by allowing each
datapath class to enumerate its existing datapaths and their names.
Commit f4b96c92c "vswitch: Disallow bridges named "dpN" or "nl:N"" disabled
naming bridges "dpN" because the vswitchd code made the bad assumption that
the bridge's local port has the same name as the bridge, which was not
true (at the time) for bridges named dpN. Now that assumption has been
eliminated, so this commit eliminates the restriction too.
This change is also a cleanup in that it eliminates one form of the
vswitch's dependence on specifics of the dpif implementation.
Soon we will allow for multiple datapath implementations. By allowing
the datapath to choose the port numbers, we possibly simplify some datapath
implementations, and the datapath's clients don't have to guess (or to
check) what port numbers are free, so this seems like a better way to go.
dpif_id() is often used in error messages, e.g. "dp%u: screwed up". But
soon we will be generalizing the concept of a datapath, so it is better
to have a function that returns a full name, e.g. "%s: screwed up".
Accordingly, this commit replaces dpif_id() by a new function dpif_name()
that does so.