In vSwitch, the minimum probe interval is supposed to be 5 seconds, but
that was not enforced. If no interval was specified in the config file,
a value of 0 was being used, which would cause probes to never be sent
and the rconn not to move out of its ACTIVE state.
Possible fix to Bug #1466.
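
A minimal sketch of the clamping described above, written in Python purely
for illustration. The names MIN_PROBE_INTERVAL and effective_probe_interval
are hypothetical and do not come from the vSwitch source.

```python
MIN_PROBE_INTERVAL = 5  # seconds; the minimum stated in the message above


def effective_probe_interval(configured=None):
    """Return the probe interval to use, never below the minimum.

    `configured` is the value read from the config file, or None when the
    file does not specify one (hypothetical interface).
    """
    if configured is None:
        # Previously an unset interval ended up as 0, so probes never fired.
        return MIN_PROBE_INTERVAL
    return max(int(configured), MIN_PROBE_INTERVAL)
```

With this shape, an unset or too-small interval yields 5 seconds, while a
larger configured interval is passed through unchanged.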
When we create a datapath we do this:
1. Create local port.
2. Call add_dp hook.
3. Allow userspace to add more ports.
When we deleted a datapath we were doing this:
1. Call del_dp hook.
2. Delete all the ports.
Unfortunately step 1 destroys dp->ifobj, then dp_del_port on any port other
than the local port in step 2 tries to reference dp->ifobj through a call
to sysfs_remove_link().
This commit fixes the problem by changing datapath deletion to mirror
creation:
1. Delete all the ports but the local port.
2. Call del_dp hook.
3. Delete local port.
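
The corrected deletion order can be modeled with a toy Python class. This is
only a sketch of the ordering, not the kernel code: the Datapath class, its
methods, and the port names are hypothetical stand-ins.

```python
class Datapath:
    """Toy model of a datapath; `ifobj` stands in for dp->ifobj."""

    def __init__(self, extra_ports):
        self.ifobj = object()
        self.ports = ["local"] + list(extra_ports)
        self.log = []

    def del_dp_hook(self):
        # In the scenario described above, the hook destroys ifobj.
        self.ifobj = None
        self.log.append("del_dp hook")

    def del_port(self, port):
        # Non-local ports still need ifobj (the sysfs link removal).
        if port != "local" and self.ifobj is None:
            raise RuntimeError("ifobj used after the del_dp hook freed it")
        self.ports.remove(port)
        self.log.append("delete " + port)


def destroy_datapath(dp):
    """Delete in the mirrored order: other ports, hook, then local port."""
    for port in [p for p in dp.ports if p != "local"]:
        dp.del_port(port)
    dp.del_dp_hook()
    dp.del_port("local")
```

Calling del_dp_hook() before the loop in this model raises the RuntimeError,
which corresponds to the use-after-free that slab debugging exposed.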
Commit 010082639 "datapath: Add sysfs support for all (otherwise supported)
Linux versions" makes this problem obvious on a 2.6.25+ kernel configured
with slab debugging, because on such kernels the ifobj is a pointer to a
slab object that is freed by the del_dp hook function (when brcompat_mod
is loaded). This bug may be just as present on older kernels, but there
the ifobj is part of struct datapath, not a pointer, and thus it is much
harder to trigger.
Bug #1465.
If a user moves from one controller to another, we did not remove the
cacert. This prevents the switch from connecting to the new controller.
To ease confusion, we now delete the cacert when the user changes or
removes the controller in xsconsole.
Note: This commit has a minor security issue, since we do not remove
trust for the old certificate until the switch is restarted. In
general, users should only be connected to trusted servers, so the
impact should be low. Fixing this would require larger changes to the
vconn-ssl code, which we don't want to make so late in the release cycle.
Bug #1457
We've gone through a couple of iterations for names of these mailing
lists. Currently, there are three: announce, discuss, and git. There
are aliases that point "bugs" and "dev" to the "discuss" mailing list.
This commit drops the "ovs-" prefix from the mailing list names, since
we're not using it.
When a switch is using in-band control, the controller must be specified
in dotted quad format, since DNS names cannot be resolved until a
connection to the controller has been established. This commit
validates the user input in the xsconsole plugin.
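
A dotted quad check of the kind described above might look like the
following Python sketch. The function name is hypothetical and this is not
the plugin's actual code.

```python
def is_dotted_quad(text):
    """Return True if `text` looks like an IPv4 address such as 192.168.0.1.

    Hostnames are rejected on purpose: with in-band control they cannot be
    resolved before the controller connection exists.
    """
    parts = text.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Each field must be one to three ASCII digits.
        if not 1 <= len(part) <= 3:
            return False
        if any(ch not in "0123456789" for ch in part):
            return False
        if int(part) > 255:
            return False
    return True
```

A hostname such as "controller.example.com" fails the check, while
"192.168.0.1" passes.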
When a management connection is configured and then removed, putting it
back causes the management connection to never be reestablished. The
management code checks whether the configuration file has changed before
it attempts to reconfigure itself. If the only thing that changed was
the lack of a management connection, then it tore down the connection
but didn't update its view of the configuration. When the same
manager IP is configured, the cached version matches the new version, so
no changes are made. This commit clears the cached version, so that
removing and then re-adding the manager will be detected as a change.
Bug #1448
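
The caching problem can be illustrated with a small Python model. The class
and attribute names are hypothetical; only the behavior (clear the cached
value when the manager is removed) reflects the description above.

```python
class ManagementConfig:
    """Toy model of the change detection in the management code."""

    def __init__(self):
        self.cached_manager = None  # last configuration we acted on
        self.connected_to = None
        self.connects = 0

    def reconfigure(self, manager):
        if manager is None:
            # Manager removed: tear down AND forget the cached value.
            # Forgetting it is the fix; without it, re-adding the same
            # manager compares equal to the cache and is ignored.
            self.connected_to = None
            self.cached_manager = None
            return
        if manager == self.cached_manager:
            return  # nothing changed
        self.cached_manager = manager
        self.connected_to = manager
        self.connects += 1
```

Configuring a manager, removing it, and configuring the same manager again
results in two connections in this model, as intended.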
In Linux 2.6.30, the rtnl_notify() return type was changed from int to
void along with the following commit message:
This patch also modifies the rtnetlink code to ignore the return
value of rtnl_notify() in all callers. The function rtnl_notify()
(before this patch) returned the error of the unicast notification
which makes rtnl_set_sk_err() reports errors to all listeners. This
is not of any help since the origin of the change (the socket that
requested the echoing) notices the ENOBUFS error if the notification
fails and should resync itself.
Thus there's no point in checking the return value, even in older versions
of the kernel, and so this commit changes our code to ignore it, even
on older kernel versions. We also update the rtnl_notify() wrapper macros
to make the return type void on older kernel versions.
This has not been tested, just built.
Thanks to Mikio for spurring me to try building with Linux 2.6.29 and
2.6.30.
INFO level messages are meant to be logged in the ordinary case, and they
are useful for debugging problems, so turn them on by default.
It would be a good idea to do so for ovs-vswitchd also, but we have not
tested how much this would increase the log volume.
XenServer Tools version 5.0.0 destroys and recreates network devices with
the same name on boot of (at least) Windows VMs. We had a race such that
ovs-brcompatd would delete the new device from the vswitchd configuration
file (not the old one). This commit fixes that problem.
Bug #1429.
When we receive an OpenFlow management protocol Config Update, we
immediately force the switch to reconfigure itself. This is
functionally correct, but it can cause long delays before control is
returned to the switch. We now keep track of whether there were any changes
and then only force a reconfigure once per management run.
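
The batching described above amounts to a dirty flag. The Python sketch
below uses hypothetical names (ManagementRun, handle_config_update) and is
not taken from the management code.

```python
class ManagementRun:
    """Coalesce Config Update messages into one reconfigure per run."""

    def __init__(self, reconfigure):
        self._reconfigure = reconfigure  # callback that reconfigures the switch
        self._dirty = False

    def handle_config_update(self, update):
        # Apply the update to the stored configuration here (omitted),
        # but only record that a reconfigure is needed.
        self._dirty = True

    def run(self, messages):
        for message in messages:
            self.handle_config_update(message)
        if self._dirty:
            self._dirty = False
            self._reconfigure()
```

However many updates arrive in one run, the callback fires at most once.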
When cfg_lock() has to block for some time to obtain the configuration file
lock, it logs the amount of time that it waited. However, it did not
refresh the current time before it began waiting, so the time that it
logged could be off by a significant amount, which made interpreting the
log file more challenging than it should have been.
This change should mainly affect log output. It should have little or no
effect on Open vSwitch operation because the factor by which the timeouts
were off is an order of magnitude smaller than the actual timeouts that we
pass into the function.
This is related to bug #1426, but it is not a fix for that bug; the fix
will be committed separately.
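
The timing fix for cfg_lock() described above boils down to reading the
clock immediately before waiting rather than reusing a stale cached time.
The Python sketch below shows the idea; wait_for_lock and its parameters
are hypothetical, and the real code is C.

```python
import time


def wait_for_lock(try_lock, timeout, now=time.monotonic, sleep=time.sleep):
    """Poll `try_lock` until it succeeds or `timeout` seconds elapse.

    Returns (acquired, seconds_waited). The start time is read right
    before the wait begins, so the reported wait is accurate.
    """
    start = now()  # refreshed here, not taken from an older cached value
    while not try_lock():
        if now() - start >= timeout:
            return False, now() - start
        sleep(0.01)
    return True, now() - start
```

The `now` and `sleep` parameters exist only so the function can be
exercised with a fake clock.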
When the vSwitch xsconsole plugin is installed, it doesn't need execute
permissions. This commit changes the permissions from 755 to 644 to
match the other plugins.
When a slave cannot connect to the master, the vSwitch xsconsole plugin
complained with some Python style errors on the main display. This
commit cleans up that behavior.
Bug #1341
When a VIF is deleted, the "vif" script modifies "/etc/ovs-vswitchd.conf".
After changes are made to the config file, ovs-vswitchd should be told
to reload it, but this wasn't happening. Now it does.
Thanks to Natasha for catching this.
On the citrix branch we changed the license to Apache 2.0. Merging the
citrix branch into master hence updated the license of all the files that
existed in the citrix branch. However one file was added in master that
wasn't in citrix, so this commit updates the license on that new file.
The SHA-1 library that we used until now was taken from RFC 3174. That
library has no clearly free license statement, only a license on the text
of the RFC. This commit replaces this library with a modified version of
the code from the Apache Portable Runtime library from apr.apache.org,
which is licensed under the Apache 2.0 license, the same as the rest of
Open vSwitch.
veth_mod.ko is built only for Linux 2.6.18 (since later versions already
have it). Our XenServer build doesn't use it at all, so don't package it.
(This is in response to a build failure against a XenServer 5.7.0
prerelease, which uses a 2.6.27 kernel and thus for which veth_mod.ko is
not built.)
The 'packet' argument to process_flow() is allowed to be null, but some of
the code was assuming that it was always non-null, which caused a segfault
while revalidating ARP flows.
Bug #1394.
The TCP and SSL vconn implementations had a lot of common code to make
and accept TCP connections, which this commit factors out into common
functions in socket-util.c.
Also adds the ability to bind ptcp and pssl vconns to a particular IP
address instead of the wildcard address.
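
As an illustration of binding a passive connection to one address instead
of the wildcard, here is a short Python sketch. The listen_tcp helper is
hypothetical and is not the function added to socket-util.c.

```python
import socket


def listen_tcp(port, ip=None):
    """Open a listening TCP socket on `ip`, or on all addresses if None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # "0.0.0.0" is the IPv4 wildcard address.
    sock.bind((ip or "0.0.0.0", port))
    sock.listen(10)
    return sock
```

Passing an explicit address restricts the listener to that interface;
omitting it keeps the previous wildcard behavior.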
Older versions of Open vSwitch implemented OpenFlow in the kernel over
a Netlink channel, and this code was here to work around some issues with
that, but now it is unnecessary since the OpenFlow kernel implementation is
gone.
vconn_connect() is defined to return 0 on success or a positive errno
value on failure, but it was possible to get a negative value (EOF). This
commit changes this to ECONNRESET to match caller expectations.
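
The return-value contract can be sketched in Python as follows. The EOF
constant and the connect_status name are assumptions for illustration; the
real change is in the C vconn code.

```python
import errno

EOF = -1  # assumed negative end-of-file sentinel, as in C's <stdio.h>


def connect_status(raw):
    """Map a raw result to the documented contract: 0 or a positive errno."""
    if raw == EOF:
        # The peer closed the connection; report it as a reset.
        return errno.ECONNRESET
    return raw
```

Callers can then rely on the result being either 0 or a positive errno
value.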
This turned out to be less trouble than I expected.
This builds successfully against 2.6.18 through 2.6.28. Justin has lightly
tested it on a 2.6.27 kernel provided by Citrix.
The latest XenServer release is based on 2.6.27. The datapath code
defined "skb_checksum_setup", since it wasn't exported in their 2.6.18
kernels. This change causes it only to be built if the kernel is
version 2.6.18.
The "dump-vif-details" script adds the network UUID to the
ovs-vswitchd.conf file. Unfortunately, it wrote the key as
"network-uuid", but the code that retrieves it for the management
protocol checked for "net-uuid". The script now uses the key
"net-uuid".
Thanks to Natasha for catching the problem.
The controller discovery code has always had the capability to whitelist
only certain types of controller locations. Until now, we have only taken
advantage of this when SSL is enabled (so that all OpenFlow connections are
authenticated with SSL if SSL is configured).
However, it occurs to me that making the selection of connections entirely
unrestricted is too permissive. An attacker could make the vswitch connect
to an arbitrary Unix domain socket, for example. I don't have a
description of how this is an exploitable security vulnerability, but it
seems entirely too lax.
So: this commit changes the default to allow only TCP connections to
controllers in the non-SSL case.
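
The resulting policy can be summarized in a few lines of Python. This is a
sketch only: controller_allowed is a hypothetical name, and the "tcp:",
"ssl:", and "unix:" target prefixes are assumed here for illustration.

```python
def controller_allowed(target, ssl_configured):
    """Decide whether a discovered controller location may be used.

    With SSL configured, only SSL targets are accepted; otherwise only
    TCP targets are accepted (previously anything was).
    """
    allowed_prefix = "ssl:" if ssl_configured else "tcp:"
    return target.startswith(allowed_prefix)
```

Under this policy a discovered Unix domain socket target is rejected in
both cases.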
vNetManager needs to know the xapi UUIDs for the networks that correspond
to OpenFlow connections. For some time now we have passed these to it
over the management connection using bridge.<bridgename>.xs-network-uuids
configuration keys, but only now did we notice that this didn't get set for
internal networks.
The reason that it didn't get set is that interface-reconfigure is the
script that sets up these configuration keys, but interface-reconfigure
is never called for internal networks. Instead, xapi creates them itself
using direct calls to bridge ioctls. So no amount of tweaks to
interface-reconfigure will help.
This commit fixes the problem by modifying the vif script instead. This
works acceptably only because xapi is lazy about creating bridges for
internal networks: it creates them only just before it is about to add the
first vif to them. Thus, by setting up the configuration key in the vif
script, it gets added just after the bridge itself is created. There is
a race, of course, meaning that there may be a delay between the initial
OpenFlow connection and the time when the configuration key is set up,
but vNetManager can tolerate that.
OpenFlow uses a 16-bit field to describe the message length, which
limits messages to a maximum 65535 bytes. Some of the messages passed
by the management protocol may be larger than this, so a general
Extended Data message has been added to the management protocol. It
encapsulates a single giant OpenFlow-like message, and breaks it into
however many valid smaller ones are required.
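
The splitting step can be sketched in Python as below. The fragment and
reassemble helpers and the 16-byte header size are hypothetical; the actual
Extended Data wire format is not described in this message.

```python
MAX_OPENFLOW_LEN = 65535      # limit imposed by the 16-bit length field
ASSUMED_HEADER_LEN = 16       # hypothetical per-fragment header size
MAX_CHUNK = MAX_OPENFLOW_LEN - ASSUMED_HEADER_LEN


def fragment(message, max_chunk=MAX_CHUNK):
    """Split `message` (bytes) into pieces of at most `max_chunk` bytes."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    return [message[i:i + max_chunk]
            for i in range(0, len(message), max_chunk)]


def reassemble(chunks):
    """Join fragments back into the original oversized message."""
    return b"".join(chunks)
```

Each piece, once wrapped in its header, fits within the 16-bit length
limit, and the receiver recovers the original message by concatenation.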
When a resource update message is generated by vSwitch, it creates a
couple of svec objects that need to be explicitly destroyed. This
wasn't happening, so memory would leak after each resource update. This
commit properly destroys them after use.
For as long as bonding has been implemented, the vswitch has refused to learn
from multicast packets that arrive on a bond slave if it has already
learned any other port for that source MAC, because it is likely that we
sent the packet out ourselves and are only now receiving a copy of it on
our active slave.
This is entirely correct, but it does not go far enough. In fact, the
bridge needs to entirely drop such packets. Otherwise, a host whose MAC
is assigned to a slave other than the active slave will receive a second
copy of multicast packets that it sends out the bond, and other ports
will receive two copies of every multicast packet sent by such a host.
This commit implements this new policy, which simplifies the code at the
same time.
Bug #1387.
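
The new drop policy can be expressed as a small predicate. The Python
sketch below uses hypothetical names and data structures (a dict as the MAC
learning table, a set of bond slave ports); the bridge code itself is C.

```python
def is_multicast(mac):
    """True if the low bit of the first octet is set (includes broadcast)."""
    return (int(mac.split(":")[0], 16) & 1) == 1


def should_drop(dst_mac, src_mac, in_port, bond_slaves, learned):
    """Drop multicast packets that look like our own copies coming back.

    `learned` maps a source MAC to the port it was learned on.
    """
    if in_port not in bond_slaves:
        return False
    if not is_multicast(dst_mac):
        return False
    known_port = learned.get(src_mac)
    # Already learned on some other port: we most likely sent this packet
    # out the bond ourselves, so neither learn from it nor forward it.
    return known_port is not None and known_port != in_port
```

Packets that fail any of these tests go through normal learning and
forwarding in this model.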
The glibc 2.7 headers contain a bug that causes strtok_r() to segfault
in some circumstances. Until now, we have been working around this
problem at each invocation, but this depends on the programmer to remember
to do so each time.
This commit instead adds a shim that adds a work-around to the string.h
header itself, so that it is much more difficult to miss the workaround.
The man page for ovs-vswitchd.conf explains how ingress policing works.
However, what "ingress" means is a bit confusing depending on the
perspective. For vSwitch, it's from the switch's perspective. This
means on a PIF, it's the rate traffic comes into the box. On a VIF,
it's the rate traffic can be *transmitted* from a VM. This commit
clarifies the man page a bit.
Thanks to Johan for pointing out the problem.
The controller needs to know various things about virtual interfaces as
they move about the network. This commit sends the VIF, virtual
machine, and network UUIDs associated with the VIF, as well as its MAC
address over the management channel.
Feature #1324
An improper string comparison operator was used to check whether
FORCE_COREFILES was enabled. Further, the check to enable core files
was only done when vswitch was started, and not when restarted.
Thanks to Ben for help debugging the issue.