Until now, when dp_output_control() queued a GSO packet to userspace, it
would first compute the checksum for the whole GSO packet, then break the
packet into segments. However this had two drawbacks:
1. The checksum had to be recomputed for each segment, wasting time.
2. Linux 2.6.22 and later would emit a warning in skb_gso_segment()
because the checksum was precomputed.
This commit changes dp_output_control() to instead break the packet into
segments, then compute the checksum across each of the segments
individually. This fixes both drawbacks.
This commit has seen light testing on Xen's 2.6.27. It has been build
tested on a few different kernel versions.
Our test automation needs to be able to validate that a VLAN bridge and
for this I needed two new operations in ovs-vsctl:
* The ability to query the VLAN tag for a bridge.
* The ability to query the 'parent' of a bridge. The parent is the
non-VLAN/untagged bridge with the same physical devices and
could be a bond.
So given xenbr0 (containing eth0) + xapi2 (VLAN 42 on eth0) and xapi1
(containing bond0 == eth2+eth3) + xapi3 (VLAN 23 on the bonded
interface):
[root@warlock ~]# ovs-vsctl br-to-vlan xapi2
42
[root@warlock ~]# ovs-vsctl br-to-vlan xapi3
23
[root@warlock ~]# ovs-vsctl br-to-parent xapi2
xenbr0
[root@warlock ~]# ovs-vsctl br-to-parent xapi3
xapi1
In the /proc compatibility layer, the bond member was reported as up
immediately after link recovery, regardless of the updelay. I believe
the compatibility code was correct if the check had been done with carrier,
but since 'iface->enabled' already does that calculation, we can use it
directly.
Additinally, when a bond slave was enabled or disabled, the bond
compatibility code was not being told to update its state. This commit
makes that call.
NIC-39
Given a possible 1,024 ports on a bridge the previous limit of 2,048
entries seems low.
If we want to increase this further we should introduce dynamic allocation
of table entries to avoid wasting memory in the common case.
CC: Keith Amidon <keith@nicira.com>
The original xen-bugtool did not collect any OVS logs. Now that more
logging is moving from /var/log/messages to ovs-vswitchd's and
ovs-brcompatd's private log files, we should include them in the
information collected for bug reports.
It seems really strange that this one slipped through. Perhaps this
means that we have never tested with any action other than OFPAT_OUTPUT
(which has value 0 and thus is not affected by byte-swapping).
/usr is the standard location for installation, so use that instead of our
nonstandard location under /root.
This migrates everything except the kernel modules to /usr. The kernel
modules will be migrated in an upcoming commit.
One possibly surprising change is that the manpages listed in the %files
section of vswitch-xen.spec not only moved but added .gz extensions. This
seems to be because RPM automatically compresses manpages, but only if they
are installed in a standard system location.
The RPM install was generating a database cache in Python pickle format in
/etc/ovs-vswitchd.conf, but interface-reconfigure was looking for it in
XML format in /var/lib/openvswitch/dbcache. This fixes the problem, by
adding an init-dbcache command to interface-reconfigure and then using that
at RPM install time.
This moves the database cache creation from %pre to %post. This is
necessary so that interface-reconfigure is available from the install
script.
Set the default log level for file logging to INFO for ovs-vswitchd and
ovs-brcompatd. This is done so that coverage messages are kept in the
log file, since we no longer log them through syslog due its synchronous
writing on Xen hosts. The issue is described in detail in commit 6bc995e.
Fix test for whether file logging should be enabled for ovs-brcompatd.
Reported by Ian Campbell.
By default, many OVS processes keep track of their time through a poll
loop. If it takes an unusually long time (measured as some distance
from the mean), the processes will log stats it has been keeping about
coverage. It was doing this at level WARN.
On Xen systems, syslog messages written at level INFO and higher are
written to /var/log/messages synchronously. This would mean that there
would be dire messages that it took a few dozen milliseconds to go
through the loop, meanwhile, it would take up to 6(!) seconds writing
those. Meanwhile, the process would do no other processing, which could
be quite serious in the case of a process such as ovs-vswitchd.
This problem was somewhat masked because the time used by this logging
was not used in the calculations for determining how long it was taking
to get through the loop.
This commit lowers the default log level for those coverage messages to
INFO. On Xen systems, it raises the default level at which messages are
written to syslog to WARN.
Diagnosed and fixed with the help of Ian Campbell.
Our hope is that this will resolve many of the issues we have seen
where temporary delays in the forwarding of packets have caused issues
of various types.
This is a crossport from master of commit 44cb492 by Ian Campbell.
This reverts commit 641a0a4ed0.
This is a crossport from master of commit b2cdfea by Ian Campbell, with
the following commit message:
Do not renice the netback thread.
We should increase the vswitchd daemon's priority instead.
Our hope is that this will resolve many of the issues we have seen
where temporary delays in the forwarding of packets have caused issues
of various types.
The RPM spec file was trying to remove Open vSwitch log files as part of
the RPM uninstall process. It wasn't succeeding, however, since the glob
pattern was wrong.
Instead of fixing the glob pattern, just stop trying to remove the log
files. Log files are, after all, important for trying to debug problems,
and if we delete them at uninstall time then that makes life harder.
The vswitchd.conf manpage said that fail-open is disabled by default. This
is wrong: it is enabled by default. This commit fixes the documentation.
CC: Sujatha Sumanth <ssumanth@nicira.com>
Before transimitting a packet, the datapath checks that the packet
length is not greater than the MTU. It determines the length based on
the 'protocol' field in the skb. If 'protocol' is ETH_P_8021Q, it reduces
the packet length as stored in the 'len' field by four bytes, which
is the size of a VLAN tag header. Unfortunately, packets that arrived
from userspace were not having the 'protocol' field set, which would
cause MTU-sized packets to be dropped. This commit sets the 'protocol'
field appropriately.
Thanks to Ben Pfaff for the help diagnosing this issue.
NIC-17 and NIC-26
Until now, the vswitch RPM has installed /etc/sysconfig/vswitch.example
and made the system administrator copy it to /etc/sysconfig/vswitch if he
desires. This is slightly inconvenient, since it is slightly easier for
the admin if he can just edit /etc/sysconfig/vswitch directly. This commit
changes to the latter behavior.
Bug #1810.
/etc/ovs-vswitchd.conf should always be there. Nevertheless, it is not
nice to entirely break vswitch if it is accidentally deleted. This commit
makes /etc/init.d/vswitch create an empty configuration file if it is
missing.
Bug #1821.
In-band control needs to know the IP and port of the controller, so that
it can set up the correct flows to talk to that controller. Until now,
the rconn code has only made this available when a connection was actually
in progress. This means that, say, ARP packets will not be allowed through
when the rconn backs off. The same is true of packets sent by switches
that access the controller through this one.
This commit makes the rconn cache the remote IP and port and local IP
across connection attempts, improving the situation. In particular, it
reduces the overall amount of time that it takes to connect in my own
simple test case from over 10 seconds to about 2 seconds.
TCP vconns were reporting indeterminate remote IP and remote port, which
prevented in-band control from working for TCP vconns.
The code that this fixes is implemented differently on the citrix branch
and thus the bug was not present there.
Previously, in-band control was L2-based. This worked well when the
controller was on the same network segment as the switch. However, many
configurations are not set up this way. These changes allow a switch and
controller to be on different subnets.
This set of changes also fixes some problems related to passing DHCP
traffic as described in Bug #1618.
A full description of the reasoning and supported configurations of
in-band will be forthcoming.
The ability to match the IP addresses in ARP packets allows for fine-grained
control of ARP processing. Some forthcoming changes to allow in-band
control to operate over L3 requires this support if we don't want to
allow overly broad rules regarding ARPs to always be white-listed.
Unfortunately, OpenFlow does not support this sort of processing yet, so
we must treat OpenFlow ARP rules as having wildcarded those L3 fields.