mirror of https://github.com/openvswitch/ovs synced 2025-08-29 13:27:59 +00:00

19894 Commits

Author SHA1 Message Date
Ben Pfaff
b1bdd85cab xenserver: Create vswitchd configuration file if it does not exist.
/etc/ovs-vswitchd.conf should always be there.  Nevertheless, it is not
nice to entirely break vswitch if it is accidentally deleted.  This commit
makes /etc/init.d/vswitch create an empty configuration file if it is
missing.

Bug #1821.
2009-09-03 13:00:11 -07:00
Ben Pfaff
fdc8be4e6d vswitchd: Explain why mirroring to a VLAN can cause network problems.
Bug #1963.
2009-09-03 12:58:40 -07:00
Ben Pfaff
875673c139 xenserver: Document all the /etc/sysconfig/vswitch settings.
Bug #1853.
2009-09-03 11:53:31 -07:00
Ben Pfaff
19d1ab55ae rconn: Speed up in-band control connections, by caching the remote address.
In-band control needs to know the IP and port of the controller, so that
it can set up the correct flows to talk to that controller.  Until now,
the rconn code has only made this available when a connection was actually
in progress.  This means that, say, ARP packets will not be allowed through
when the rconn backs off.  The same is true of packets sent by switches
that access the controller through this one.

This commit makes the rconn cache the remote IP and port and local IP
across connection attempts, improving the situation.  In particular, it
reduces the overall amount of time that it takes to connect in my own
simple test case from over 10 seconds to about 2 seconds.
2009-09-02 12:52:50 -07:00
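The caching idea in this commit can be pictured with a minimal Python sketch. It is not the rconn code (which is C); the class and method names are hypothetical, and only the remote IP and port are modeled. The point it illustrates is that the last known peer address stays available while the connection is backing off.

```python
class CachingConnection:
    """Toy reconnecting connection that remembers the last known peer address."""

    def __init__(self):
        self.connected = False
        self.remote = None  # (ip, port), kept across connection attempts

    def on_connected(self, ip, port):
        # Record the peer address every time a connection succeeds.
        self.connected = True
        self.remote = (ip, port)

    def on_disconnected(self):
        # Back off, but deliberately keep self.remote so that callers
        # (for example, code installing in-band flows) can still use it.
        self.connected = False

    def remote_address(self):
        # Returns None only if no connection has ever succeeded.
        return self.remote
```

With this shape, a caller that asks for the remote address during backoff still gets the previous value instead of nothing.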
Ben Pfaff
c3a519b5c9 vconn-tcp: Report correct remote IP and remote port.
TCP vconns were reporting indeterminate remote IP and remote port, which
prevented in-band control from working for TCP vconns.

The code that this fixes is implemented differently on the citrix branch
and thus the bug was not present there.
2009-09-02 11:16:40 -07:00
Ben Pfaff
c0a5fd2aed ofproto: Fix bad merge in previous commit.
This was fixed in my working tree before I pushed it, but I forgot to
commit it.  Oops.
2009-09-02 11:11:38 -07:00
Ben Pfaff
f1acd62b54 Merge citrix branch into master.
2009-09-02 10:14:53 -07:00
Justin Pettit
0ad9b73291 in-band: Implement L3-based in-band control
Previously, in-band control was L2-based.  This worked well when the
controller was on the same network segment as the switch.  However, many
configurations are not set up this way.  These changes allow a switch and
controller to be on different subnets.

This set of changes also fixes some problems related to passing DHCP
traffic as described in Bug #1618.

A full description of the reasoning and supported configurations of
in-band will be forthcoming.
2009-09-01 14:48:34 -07:00
Justin Pettit
bd193a0aba dpif: Add dpif_port_get_name call
Add ability to lookup a device name by its dpif port number.
2009-09-01 14:48:34 -07:00
Justin Pettit
13063c3b38 netdev: Add netdev_get_next_hop call
Add ability to determine the next hop IP address and device used to
reach a given host.
2009-09-01 14:48:34 -07:00
Justin Pettit
a26ef51703 Add ability for the datapath to match IP address in ARPs
The ability to match the IP addresses in ARP packets allows for fine-grained
control of ARP processing.  Some forthcoming changes to allow in-band
control to operate over L3 require this support if we don't want to
allow overly broad rules regarding ARPs to always be white-listed.
Unfortunately, OpenFlow does not support this sort of processing yet, so
we must treat OpenFlow ARP rules as having wildcarded those L3 fields.
2009-09-01 14:48:34 -07:00
Justin Pettit
f10725fea5 Return netmask along with IP address when querying through netdev
The call netdev_get_in4() now allows the caller to also retrieve the
associated netmask.
2009-09-01 14:48:34 -07:00
Justin Pettit
a5f37a2deb secchan: Tighten in-band traffic always allowed into switch
In-band control sets up a bunch of invisible flows that allow the switch
and controller to communicate over OpenFlow.  The rules may have been a
bit too permissive, since they allowed any traffic to reach the
connection's interface.  This set of changes tries to tighten that to
only OpenFlow traffic and ARPs.
2009-09-01 14:48:34 -07:00
Justin Pettit
4701460af0 in-band: Fix status checks that could prevent in-band updates
The method the status callback was using to retrieve the local and
remote MAC addresses pushed back the refresh timer.  If this were done
frequently, it could prevent in-band control from updating its rules.
2009-09-01 14:48:34 -07:00
Justin Pettit
52ae00b3d3 ofproto: Cleanup bridge/dump-flows output
Add separator that was missing from the output of the "bridge/dump-flows"
command from ovs-appctl.
2009-09-01 14:48:33 -07:00
Justin Pettit
7f329488c5 netdev: Fix reversed arguments in netdev_recv warning.
2009-09-01 14:48:33 -07:00
Ben Pfaff
6fa58f7a15 datapath: Use hash table more tolerant of collisions for flow table.
The hash table used until now in the kernel datapath for storing the flow
table provides only two slots that a given flow can occupy.  If both of
those slots are already full, for a given flow, then that flow cannot be
added at all and its packets must be handled entirely in userspace, taking
a performance hit.  The code does attempt to compensate for this by making
the flow table rather large: 8 slots per flow actually in the flow table.
In practice, this is usually good enough, but some of the tests that we
have run show bad enough performance degradation or even timeouts of
various kinds that we want to implement something better.

This commit replaces the existing hash table by one with a completely
different design in which buckets are flexibly sized and can accept any
number of collisions.  By use of suitable levels of indirection, this
design is both simple and RCU-compatible.  I did consider other schemes,
but none of the ones that I came up with shared both of those two
properties.

This commit also adds kerneldoc comments for all of the flow table
non-static functions and data structures.

This has been lightly tested for correctness.  It has not been tested for
performance.

Bug #1656.  Bug #1851.
2009-09-01 10:36:42 -07:00
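The difference between the two designs described in this commit can be sketched in Python. This is an illustration only, not the kernel datapath code: both classes and the way the two candidate slots are derived are hypothetical, and RCU and the levels of indirection mentioned above are not modeled.

```python
class TwoSlotFlowTable:
    """Toy version of the old design: a flow may live in one of two slots only."""

    def __init__(self, n_slots=8):
        self.slots = [None] * n_slots

    def _candidates(self, key):
        n = len(self.slots)
        # Two hash positions per flow (the second hash is an arbitrary choice).
        return (hash(key) % n, hash((key, "second")) % n)

    def insert(self, key, actions):
        candidates = self._candidates(key)
        for i in candidates:  # update in place if the flow is already present
            if self.slots[i] is not None and self.slots[i][0] == key:
                self.slots[i] = (key, actions)
                return True
        for i in candidates:  # otherwise take the first free candidate slot
            if self.slots[i] is None:
                self.slots[i] = (key, actions)
                return True
        return False  # both slots full: the flow cannot be added at all


class BucketFlowTable:
    """Toy version of the new design: each bucket grows to hold any collisions."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, actions):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, actions)
                return True
        bucket.append((key, actions))  # a collision simply lengthens the bucket
        return True

    def lookup(self, key):
        for existing, actions in self._bucket(key):
            if existing == key:
                return actions
        return None
```

Shrinking both tables to a single slot or bucket makes the contrast easy to see: the two-slot table rejects the second distinct flow, while the bucket table accepts every flow at the cost of a longer scan within the bucket.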
Ben Pfaff
2c7807ac4f datapath: Remove WARN_ON_ONCE(1) now that this code has been exercised.
The code on one side of this #if fork was difficult to test until Xen
upgraded to a new enough kernel that it would exercise it.  Later Xen
kernels are now available and this code path has been tested, at least to
some extent, so remove the warning.

Thanks to Ian Campbell <Ian.Campbell@citrix.com> for pointing out the
warning.
2009-09-01 10:12:12 -07:00
Ben Pfaff
3c4fae5f45 corekeeper: Always include PID in core dump names.
Some distributions automatically set /proc/sys/kernel/core_uses_pid to 1
and others leave it at its default setting of 0.  That means that, with the
core_pattern that corekeeper was setting, on the former distributions the
PID would be included in core names and on the latter the PID would be
omitted.  For consistency, this commit forces the PID to be in the core
file name in either case (note that putting %p in core_pattern causes
the core_uses_pid setting to be disregarded).

CC: Martin Casado <casado@nicira.com>
2009-08-31 09:11:32 -07:00
Ben Pfaff
b6f67919d4 secchan: Avoid sending NetFlow packets for empty flows.
There is no value in sending out NetFlow messages when the byte counter
(hence, packet counter) is 0.  This does not often happen, but it can in
corner cases where a flow gets installed but never sees any traffic before
it is uninstalled.

CC: Peter Balland <peter@nicira.com>
2009-08-28 14:59:42 -07:00
Ben Pfaff
e0c27cffbc vswitchd: Mirror nothing, not everything, if mirror ports don't exist.
If all of the ports specified as mirror selection criteria actually do not
exist, then until now the bridge would mirror all incoming packets (on
specified VLAN(s), if any).  This matches the behavior that occurs if no
mirror selection ports were specified at all, and so it makes a certain
amount of logical sense.

But it is far more likely that the user simply misspelled a port name, or
specified the name of a port that does not always exist.  In fact we have
seen this behavior in practice when the controller has not caught up to
the switch's current configuration.  So this commit changes the bridge to
instead disable a mirror if ports are specified and none of those ports
exist.

Bug #1904.
2009-08-26 14:03:39 -07:00
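The selection rule after this change can be written as a short Python sketch. It is a hedged illustration, not the vswitchd implementation: the function name and arguments are hypothetical, and VLAN-based selection is left out.

```python
def mirror_selects(packet_port, configured_ports, existing_ports):
    """Return True if a packet arriving on packet_port should be mirrored."""
    if not configured_ports:
        # No selection ports were specified at all: mirror everything.
        return True
    live = set(configured_ports) & set(existing_ports)
    if not live:
        # Ports were specified but none exist (for example, a misspelled
        # name): the mirror is disabled instead of mirroring everything.
        return False
    return packet_port in live
```

The middle branch is the behavior change: before this commit, that case fell through to mirroring all incoming packets.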
Ben Pfaff
274de4d20f vswitchd: Avoid output port explosion with mirrors that output to VLANs.
compose_dsts() was updating the VLAN of packets sent to VLAN mirrors
before it changed the VLAN value, but of course it's the final VLAN value
that actually matters.

Thanks to Reid for his good work tracking this one down.

Bug #1898.
2009-08-26 14:03:39 -07:00
Ben Pfaff
0babc06fb3 vswitchd: Fix bug in Ethernet address selection for bridge.
This bug was introduced in the merge from the citrix branch in commit
8fef8c71 "Merge citrix into master."

Thanks to Reid for characterizing the problem.

Bug #1907.
2009-08-26 12:51:39 -07:00
Justin Pettit
d540d9cbb1 tests: Cleanup getsockname argument warning
The second argument was being passed in as a sockaddr_in, when it should
be a sockaddr.  This commit cleans up the warning by casting it.
2009-08-25 16:37:37 -07:00
Justin Pettit
bc5ef83d2f tests: Cleanup islower() warning.
NetBSD's gcc complains if islower()'s argument is an unadorned char.  This
provides an appropriate cast.
2009-08-25 16:37:28 -07:00
Justin Pettit
b9b0ce6111 Cleanup incorrect uninitialized variable warnings.
The NetBSD compiler warns that these variables may be used uninitialized.
They are not, but this commit gets rid of the warnings.
2009-08-25 16:26:36 -07:00
Justin Pettit
d87961bf84 tests: Rename NTOHL/NTOHS macros
NetBSD defines NTOHL and NTOHS macros that are used differently from how
they are defined in test-classifier.c.  This commit renames the local
definition so there's no conflict.
2009-08-25 16:22:44 -07:00
Justin Pettit
5b4994cd75 mgmt: Cleanup handling of extended messages
OpenFlow has a maximum message size of 65536 bytes, but management
messages can be greater than that.  The management protocol's Extended
Data message is used to get around that limitation.  This commit cleans
up some problems with our implementation and adds some additional
sanity-checking to received messages.

Related to vNetManager Bug #1843.
2009-08-25 15:47:02 -07:00
Justin Pettit
3c71830aef dpif: Address portability issues in dpif-netdev
There were a number of Linux assumptions in dpif-netdev that were not
necessary.  This commit cleans those up to aid portability.
2009-08-25 14:12:01 -07:00
Justin Pettit
be2c418b73 Cleanup isdigit() warnings.
NetBSD's gcc complains if isdigit()'s argument is an unadorned char.  This
provides an appropriate cast.
2009-08-25 14:11:44 -07:00
Justin Pettit
00908dc27a Merge commit 'origin/citrix'
2009-08-25 13:23:11 -07:00
Justin Pettit
4f0b85d66b datapath: Return EFBIG instead of EXFULL when no room in flow table
The EXFULL errno is only defined in Linux.  While this datapath is
Linux-specific, the userspace that interacts with it is not.
2009-08-25 13:17:26 -07:00
Ben Pfaff
f91f081139 netflow: Remove stray debug printf().
2009-08-21 13:46:47 -07:00
Ben Pfaff
eeb569bf8c xenserver: Compute correct physical PIFs for VLANs on bonds.
Otherwise the bond device is considered the physical PIF of a VLAN-on-bond
PIF, and various bad stuff happens.
2009-08-20 15:39:01 -07:00
Ben Pfaff
1d87357a13 Merge citrix into master.
2009-08-19 16:08:18 -07:00
Ben Pfaff
641a0a4ed0 xenserver: Renice netback process to priority 0 by default.
Under heavy VM network load, we have observed that ovs-vswitchd can be
starved for CPU time, which prevents flows from being set up.  This can
in turn cause connections to XAPI in Dom0 to time out (among other issues).

It is probably not necessary to renice netback all the way to priority 0
as done in this commit.  That is simply the value that we have tested.  QA
has not reported any ill side-effects of this choice of value (yet).  One
reasonable alternative, should any problems be noticed, would be to leave
netback at its default -5 priority and simply boost ovs-vswitchd's priority
to, say, -6 or -7.

Bug #1656.
2009-08-19 15:59:18 -07:00
Ben Pfaff
612f6d49c5 xenserver: Use = instead of == as operator for "test" in shell scripts.
The "test" program uses =, not ==, as the test for equality.  Fortunately
most implementations are tolerant but it's better to follow the spec.
2009-08-19 15:10:39 -07:00
Ben Pfaff
8521345b51 xenserver: Fix "brctl show" compatibility by introducing "brctl" wrapper.
Bug NIC-19, which reported that "brctl show" did not format its output in
the way expected by Citrix QA scripts, was believed fixed by commit
35c979bff4 "vswitchd: Support creating fake bond device interfaces."
Unfortunately, this commit was not tested on a XenServer before it was
committed.  Due to differences in the actual test environment and the
XenServer environment, which have different versions of the bridge-utils
package that contains brctl, that commit did not fix the problem observed
by Citrix QA.  In particular, the XenServer brctl uses sysfs to obtain
the information displayed by "brctl show", but the previous commit only
fixed up the information output by the bridge ioctls.

The natural way to fix this problem would be to fix up the sysfs support
as well.  I started out along that path, but became bogged down in all
the details of the kernel sysfs.

This commit takes an alternate approach, by introducing a wrapper around
the system brctl binary that implements "brctl show" itself and delegates
all other functionality to the original binary (in a different location).
This will not fix tools that do not call into brctl, but to the best of
my knowledge there are no such tools used in the Citrix QA process.

Thanks to Justin and Reid for much feedback.

Bug NIC-19.
2009-08-19 15:10:39 -07:00
Ben Pfaff
9d04e270a8 xenserver: Completely ignore datapath devices for renaming purposes.
Commit 2bb451b69 "xenserver: Rename network devices to match MAC addresses
of physical PIFs" started renaming network devices so that they match
the MAC address that we expect them to have.  This worked OK at the time.

Commit 35c979bff "vswitchd: Support creating fake bond device interfaces"
later started creating fake bond devices to make the Citrix QA scripts
happier.

Unfortunately these commits interact badly: the bond devices created by
the latter commit are sometimes chosen as the physical devices to be
renamed over the physical PIF device names.  This is because we do allow
datapath internal ports to be chosen as "physical devices" as a last
resort.  This commit reverses this decision, eliminating that possibility.
This probably won't become a problem unless somehow we encounter a physical
Ethernet card driver that lacks a queue, but that is unlikely since the
performance would be awful.
2009-08-19 13:44:05 -07:00
Ben Pfaff
b78a7336a3 datapath: Additional fixes for datapath device renaming.
Commit c874dc6d6b "secchan: Fix behavior when a network device is renamed."
fixed a crash in the datapath when network devices within a datapath were
renamed.  However, this missed the case where the device that was renamed
was a datapath's internal port: these devices have their br_port members
set to NULL, so we have to determine that they belong to a datapath another
way.  This commit does so.

This commit also changes the initialization order in dp_dev_create().
Otherwise, dp_device_event() will dereference null when it is called via
register_netdevice(), because the newly created device is a datapath device
but its members are not yet initialized.
2009-08-19 13:44:05 -07:00
Ben Pfaff
8fef8c7121 Merge citrix into master.
This was a somewhat difficult merge since there was a fair amount of
superficially divergent development on the two branches, especially in the
datapath.

This has been build-tested against XenServer 5.5.0 and XenServer 5.7.0
build 15122.  It has been booted and connected to XenCenter on 5.5.0.

The merge revealed a couple of outstanding bugs, which will be fixed on
citrix and then merged back into master.
2009-08-19 13:03:46 -07:00
Ian Campbell
6dd3fad481 xenserver: Store XAPI dbcache as XML in interface-reconfigure.
This allows the Citrix host installer to also write the dbcache on upgrade
which enables the management interface to come up on a slave after upgrade.

CP-1148.
2009-08-18 16:09:32 -07:00
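As a rough illustration of storing a record cache as XML rather than as a pickle, the following Python sketch round-trips a dictionary of records through `xml.etree.ElementTree`. The element and attribute names (`cache`, `record`, `field`, `ref`, `name`) are assumptions made for this example and are not the format that interface-reconfigure actually writes.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records):
    """Serialize {ref: {field: value}} to an XML string (hypothetical layout)."""
    root = ET.Element("cache")
    for ref, fields in sorted(records.items()):
        rec = ET.SubElement(root, "record", ref=ref)
        for name, value in sorted(fields.items()):
            ET.SubElement(rec, "field", name=name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_records(text):
    """Parse the XML produced by records_to_xml back into a dictionary."""
    root = ET.fromstring(text)
    return {
        rec.get("ref"): {f.get("name"): f.text or "" for f in rec.findall("field")}
        for rec in root.findall("record")
    }
```

Unlike a pickle, a file in a plain XML layout like this can be written by another program that does not share the Python object definitions, which is the motivation the commit message gives.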
Ian Campbell
171ed16859 xenserver: Whitelist specific XAPI fields to pickle in interface-reconfigure.
Only add certain fields to the database cache of database
objects. This constrains the cases we need to deal with when
pickling/unpickling.

CP-1148.
2009-08-18 16:09:32 -07:00
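Whitelisting can be sketched as a small Python filter applied to each record before it enters the cache. The field names in `PIF_FIELDS` are placeholders chosen for the example, not the list the commit actually uses.

```python
# Hypothetical whitelist; the real set of cached fields is not shown here.
PIF_FIELDS = ("uuid", "device", "MAC")


def whitelist_record(record, allowed):
    """Keep only the allowed fields of a record, ignoring any that are absent."""
    return {name: record[name] for name in allowed if name in record}
```

Because unknown fields never reach the cache, the code that saves and restores it only has to handle the value types of the listed fields.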
Ian Campbell
c798b21c6a xenserver: Only consider the host we are running on in interface-reconfigure.
Drop records for PIFs, bonds, VLANs, etc. for other hosts at the point at
which we fetch the records from xapi rather than filtering every time
we iterate through the lists.

CP-1148.
2009-08-18 16:09:32 -07:00
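Filtering once at fetch time might look like the following Python sketch. It assumes, purely for illustration, that each record is a dictionary carrying a `host` reference; the function name and record layout are hypothetical.

```python
def records_for_host(records, host_ref):
    """Return only the records that belong to host_ref.

    Intended to be called once, right after fetching, so that later loops
    over the result need no per-iteration host check.
    """
    return {ref: rec for ref, rec in records.items() if rec.get("host") == host_ref}
```

Later code can then iterate over the returned dictionary directly, which is the simplification the commit describes.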
Ian Campbell
d8ba4acf44 xenserver: Factor out XAPI interactions in interface-reconfigure.
Currently interface-reconfigure stores a copy of the XAPI database using
python's pickling functionality. Since the XenServer host installer also needs
to write this file (so it is present after slave upgrade) we would like to
switch to something more explicitly under our control.

Begin by factoring out XAPI interactions.
2009-08-18 16:09:32 -07:00
Ben Pfaff
081f138172 switch UI: Only build ovs-switchui if PCRE 7.2 or later is available.
The PCRE_INFO_OKPARTIAL feature used by ovs-switchui was only introduced
in PCRE 7.2, so we need to check for that version or later, instead of
just for PCRE.

Thanks to Ian Campbell <Ian.Campbell@citrix.com> for reporting the problem.
2009-08-18 16:09:32 -07:00
Ian Campbell
b2f460c72d datapath: Only call skb_checksum_setup on 2.6.18 && Xen.
For newer kernels the checksum setup is done at the point the skb is
received in netback or netfront so there is no more need to sprinkle
skb_checksum_setup calls throughout the kernel.
2009-08-18 16:09:32 -07:00
Ben Pfaff
af616f686b ovs-brcompatd: Don't include the local port in BRCTL_GET_PORT_LIST output.
The BRCTL_GET_PORT_LIST ioctl is not supposed to include the bridge port
itself in the list of ports, but ovs-brcompatd was doing that.
2009-08-18 12:36:47 -07:00
Ben Pfaff
694f2679ce ovs-brcompatd: Fix memory leak.
2009-08-18 12:36:47 -07:00
Ben Pfaff
2595cb8cea ovs-brcompatd: Fix use of uninitialized svec.
2009-08-18 12:36:47 -07:00