mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

663 Commits

Author SHA1 Message Date
Ben Pfaff
efdd908807 Simplify shash_find() followed by shash_add() into shash_add_once().
This is just a cleanup.
2010-06-30 16:48:55 -07:00
Ben Pfaff
fa05809b72 vswitch: Implement unixctl command to reconnect OpenFlow connections.
This feature may be useful for debugging.

Feature #2222.
2010-06-24 12:49:17 -07:00
Ben Pfaff
018f1525ed bridge: Implement basic periodic update of interface statistics.
2010-06-23 12:43:03 -07:00
Ben Pfaff
1e0b752d3d bridge: Make configuration database records valid all the time.
Before, it was possible for records in the configuration database to
disappear, so all of the ovsrec pointers inside bridge structures had
comments cautioning against their use except during reconfiguration.  But
now that the bridge has direct control over when ovsdb_idl_run() is called,
it can ensure that bridge_reconfigure() is always called immediately
whenever the IDL data structures change.  That means that we can use the
ovsrec configuration at any time after the reconfiguration process
initializes them, not just during reconfiguration.
2010-06-23 12:43:03 -07:00
Ben Pfaff
c5187f17b6 ovs-vswitchd: Allow bridge code to manage the database connection itself.
Until now, the ovs-vswitchd main loop has managed the connection to the
database.  This worked adequately until now, but upcoming patches will tie
the bridge code more tightly to the database, which means that the bridge
needs more control over interaction with the database connection and thus
that it is better for the bridge to handle that connection itself.  This
commit makes the latter change, moving the database interaction from the
ovs-vswitchd main loop into bridge.c.
2010-06-23 12:43:03 -07:00
Ben Pfaff
81857c785e bridge: Drop unused enum definition.
2010-06-23 12:43:02 -07:00
Ben Pfaff
cc020c766e bridge: Remove unused functions.
2010-06-23 12:43:02 -07:00
Ben Pfaff
506051fcb5 Use shash_destroy_free_data() to simplify a few scattered pieces of code.
2010-06-23 12:43:02 -07:00
Ben Pfaff
c1c9c9c4b6 Implement QoS framework.
ovs-vswitchd doesn't declare its QoS capabilities in the database yet,
so the controller has to know what they are.  We can add that later.

The linux-htb QoS class has been tested to the extent that I can see that
it sets up the queues I expect when I run "tc qdisc show" and "tc class
show".  I haven't tested that the effects on flows are what we expect them
to be.  I am sure that there will be problems in that area that we will
have to fix.
2010-06-17 15:04:12 -07:00
Ben Pfaff
fc09119a4a vswitch: Datapath IDs are now 16 hex digits.
OpenFlow 1.0 datapath IDs are 64 bits long, so the "datapath_id" column
should have 16 hex digits.  The documentation had this right, but the
code didn't implement it correctly.

Reported-by: Arthur van Kleef <arthur.vankleef@os3.nl>
2010-06-17 10:26:13 -07:00
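To illustrate the formatting rule described in this commit, here is a minimal sketch that prints a 64-bit datapath ID as exactly 16 hex digits. The helper name format_dpid() and the DPID_STRLEN macro are hypothetical and are not taken from the Open vSwitch source.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Number of hex digits needed for a 64-bit datapath ID. */
#define DPID_STRLEN 16

/* Hypothetical helper (not the OVS function): writes 'dpid' into 'buf' as
 * exactly 16 lowercase hex digits, zero-padded on the left. */
static void
format_dpid(uint64_t dpid, char buf[DPID_STRLEN + 1])
{
    assert(buf != NULL);
    snprintf(buf, DPID_STRLEN + 1, "%016" PRIx64, dpid);
}
```

The "%016" width with the PRIx64 macro keeps the leading zeros that a narrower format would drop.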
Ben Pfaff
5a24315000 bridge: Remove commented-out code for custom management and snoop listeners.
Before the transition from configuration file to OVSDB, it was possible to
override the defaults for OpenFlow management listeners and for OpenFlow
controller connection snooping.  The former can now effectively be done by
configuring a controller through the database.  Overriding the latter is
not very useful (no one has complained that it cannot be done any longer).
So this commit deletes the commented-out code.
2010-06-15 10:45:13 -07:00
Ben Pfaff
1fb7fcafa6 bridge: Remove commented-out code to set OpenFlow description strings.
This code has been commented out since the end of January, so it cannot
be very important.
2010-06-15 10:45:13 -07:00
Ben Pfaff
4f484bb224 bridge: Remove unused and write-only members of 'struct bridge'.
2010-06-15 10:45:13 -07:00
Jesse Gross
f4b6076aca netdev-vport: Use vport set_stats instead of internal dev.
In certain cases we require the ability to provide stats that are
added to the values collected by the kernel (currently only used
by bond fake devices).  Internal devices previously implemented
this directly but now that their stats are handled by the vport
layer the functionality has been moved there.  This removes the
userspace code to set the stats and replaces it with a mechanism
to access the equivalent functionality in the vport layer.
2010-06-10 14:30:51 -07:00
Jesse Gross
7febb9100b bridge: Filter some gratuitous ARPs on bond slaves.
Normally we filter out packets received on a bond if we have
learned the source MAC as belonging to another port to avoid packets
sent on one slave and reflected back on another.  The exception to
this is gratuitous ARPs because they indicate that the host
has moved to another port.  However, this can result in an additional
problem on the switch that the host moved to if the gratuitous ARP is
reflected back on a bond slave.  In this case, we incorrectly relearn
the slave as the source of the MAC address.  To solve this, we lock the
learning entry for 5 seconds after receiving a gratuitous ARP against
further updates caused by gratuitous ARPs on bond slaves.

Bug #2516

Reported-by: Ian Campbell <ian.campbell@citrix.com>
2010-06-03 19:46:44 -07:00
Jesse Gross
1e82e503c5 netdev: Remove may_create/may_open flags.
The most recent revision of the netdev library added may_create
and may_open flags to explicitly state the intent of the caller as
to whether the device should already be in use.  This was simply
a sanity check for users of the netdev library and the configuration.
At this point the netdev library and its users are well behaved and
should no longer need to be checked.  Additional checks have also
been added for incorrect configuration that mean the netdev library
is no longer the primary line of defense.

These flags themselves create problems because it is not always
easy for a library to know what the state of devices should be.
This is particularly a problem for ovs-openflowd, which expects
ports to be added by ovs-dpctl.  Fixing this either requires that
the checks are so permissive as to be useless or ugly hacks to get
around them.  Since they are no longer needed, just remove the
checks.

This commit restores the previous behavior of ovs-openflowd to
not require that ports be specified on the command line or
cleaned up after use.

Bug #2652

CC: Natasha Gude <natasha@nicira.com>
CC: Jean Tourrilhes <jt@hpl.hp.com>
CC: 蒲彦 <yan.p.bjtu@gmail.com>
2010-06-01 17:27:45 -07:00
Ben Pfaff
5d0ae1387c vswitchd: Treat gratuitous ARP requests like gratuitous ARP replies.
vswitchd has long used a gratuitous ARP reply as an indication that a VM
has migrated, because traditional xen.org Linux DomUs send such packets out
when they complete migration.  Relatively recently, however, we realized
that upstream Linux does not do this.  Ian Campbell tracked this down to
two separate issues:

        1. A bug prevented gratuitous ARPs from being sent.

        2. When this was fixed, the gratuitous ARPs that were sent were
           requests, not replies, although kernel documentation said that
           replies were to be sent.

Ian submitted patches to fix both bugs.  #1 is in process of revision for
acceptance.  #2 was rejected: according to Dave Miller, the documentation
is wrong, not the implementation, because ARP replies would unnecessarily
fill up the ARP tables of devices on the network.

OVS has not until now treated gratuitous ARP requests specially, only
replies.  Now that Linux will be using ARP requests to indicate migration,
OVS should also treat them as such.  This commit does so.

See http://marc.info/?l=linux-netdev&m=127367215620212&w=2 for Ian's
original patch and http://marc.info/?l=linux-netdev&m=127468303701361&w=2
for Dave Miller's response.

CC: Ian Campbell <Ian.Campbell@citrix.com>
NIC-74.
2010-05-27 10:07:06 -07:00
Ben Pfaff
9d82ec478d Always #include <sys/socket.h> before <net/if.h>.
FreeBSD 8.0's <net/if.h> requires <sys/socket.h> to be included first,
even though I don't see any such requirement in POSIX.
2010-05-26 15:27:01 -07:00
Ben Pfaff
7cf8b2660f poll-loop: New function poll_timer_wait_until().
Many of poll_timer_wait()'s callers actually want to wait until a specific
time, so it's convenient to offer them a function that does this.
2010-05-26 11:46:59 -07:00
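The core of a "wait until" helper is converting an absolute deadline into the relative delay that a "wait for N ms" function expects. The sketch below shows only that conversion; msec_until() is a hypothetical name, and the caller is assumed to pass the current time in milliseconds rather than read a clock.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns the number of milliseconds from 'now' until
 * 'deadline', clamped to the range [0, INT_MAX] so that the result fits a
 * poll()-style int timeout.  A deadline in the past yields 0. */
static int
msec_until(long long deadline, long long now)
{
    long long delta;

    if (deadline <= now) {
        return 0;
    }
    /* Compare before subtracting so that extreme inputs cannot overflow. */
    if (now < 0 && deadline > LLONG_MAX + now) {
        return INT_MAX;
    }
    delta = deadline - now;
    return delta > INT_MAX ? INT_MAX : (int) delta;
}
```

Clamping at both ends means callers can pass deadlines that have already expired or that lie far in the future without special-casing either.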
Jesse Gross
2457b24fc5 bridge: Add iface to hash table before calling iface_is_internal().
When creating an interface we need to check whether it is internal.
However, the function iface_is_internal() does a lookup on the
interface name but we haven't added it to the hash table yet.  This
adds the interface to the table early on in iface_create.

NIC-78
2010-05-10 13:38:21 -07:00
Ben Pfaff
d25b341aa8 bridge: Fix double-free bug in port_reconfigure().
Reported-by: Peter Balland <peter@nicira.com>
Bug #2794
2010-05-10 10:55:29 -07:00
Ben Pfaff
3e9c481c70 bridge: Optimize trunk port common case.
Profiling with qprof showed that bitmap_set_multiple() and bitmap_equal()
were eating up quite a bit of CPU time during bridge reconfiguration (up
to about 10% of total runtime).  This is completely avoidable in the common
case where a port trunks all VLANs, where we don't really need a bitmap at
all.  This commit implements that optimization.
2010-05-05 14:00:50 -07:00
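One way to realize the optimization described here is to let a null bitmap pointer stand for "trunks all VLANs", so the common case needs no bitmap work at all. The sketch below assumes that representation; the struct layout and helper names are hypothetical and may differ from the actual bridge.c change.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define N_VLANS 4096
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define VLAN_BITMAP_LONGS ((N_VLANS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical port structure for illustration. */
struct port {
    unsigned long *trunks;  /* Bitmap of trunked VLANs; NULL trunks all. */
};

/* Marks VLAN 'vid' as trunked in 'bitmap'. */
static void
vlan_bitmap_set(unsigned long *bitmap, int vid)
{
    assert(vid >= 0 && vid < N_VLANS);
    bitmap[vid / BITS_PER_LONG] |= 1UL << (vid % BITS_PER_LONG);
}

/* Returns true if 'port' trunks VLAN 'vid'.  The NULL check is the fast
 * path: a port that trunks every VLAN never touches a bitmap. */
static bool
port_trunks_vlan(const struct port *port, int vid)
{
    assert(vid >= 0 && vid < N_VLANS);
    if (!port->trunks) {
        return true;
    }
    return (port->trunks[vid / BITS_PER_LONG] >> (vid % BITS_PER_LONG)) & 1;
}
```

With this representation, comparing two trunk-all ports during reconfiguration reduces to comparing two null pointers instead of scanning 4096 bits.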
Ben Pfaff
836fad5e1a bridge: Optimize port_lookup() using a hash.
Before this commit and the preceding one, with 1000 interfaces strcmp()
took 36% and port_lookup() took 8% of total runtime when reconfiguring
bridges.  With these two commits the percentage is reduced to 3% and 0%,
respectively.
2010-05-05 14:00:49 -07:00
Ben Pfaff
4a1ee6ae82 bridge: Optimize iface_lookup() and port_lookup_iface() with a hash.
Before this commit and the following one, with 1000 interfaces strcmp()
took 36% and port_lookup() took 8% of total runtime when reconfiguring
bridges.  With these two commits the percentage is reduced to 3% and 0%,
respectively.
2010-05-05 14:00:47 -07:00
Ben Pfaff
7b99db051b bridge: Fix double-free in sFlow configuration.
2010-05-05 10:50:38 -07:00
Jesse Gross
ceb4559f66 bridge: Immediately drop interfaces that can't be opened.
Previously we would keep interfaces around that couldn't be opened
because they might be internal interfaces that are created later.
However, this leads to a race condition if the interface appears
after we try to create it and fails since some operations may
succeed.  Instead, give up on the interface immediately if it can't
be opened and isn't internal (which we control and so won't have
this issue).

Bug #2737
2010-04-29 15:23:10 -07:00
Ben Pfaff
11bbff1b79 vswitchd: Rename bridge_reconfigure_controller().
Suggested-by: Justin Pettit <jpettit@nicira.com>
2010-04-26 14:25:27 -07:00
Ben Pfaff
cd11000ba2 vswitchd: Enable in-band control to managers.
ovsdb-server must be able to connect to the OVSDB managers over in-band
control (because the manager may be what configures the OpenFlow
controllers).  This commit enables that.
2010-04-26 11:02:12 -07:00
Ben Pfaff
76ce943239 Add support for multiple OpenFlow controllers on a single bridge.
With this commit, Open vSwitch permits a bridge to have any number of
OpenFlow controllers.  When multiple controllers are configured, Open
vSwitch connects to all of them simultaneously.  Details of configuration
are in the vswitch schema documentation.

OpenFlow 1.0 does not specify how multiple controllers coordinate in
interacting with a single switch, so more than one controller should be
specified only if the controllers are themselves designed to coordinate
with each other.

An upcoming commit will provide a simple means for coordination between
multiple controllers.

Feature #2495.
2010-04-20 11:01:44 -07:00
Ben Pfaff
79c9f2ee78 ofproto: Bundle all controller-related settings into a struct.
Many ofproto settings are controller-related.  Upcoming commits will add
to ofproto the ability to support multiple controllers, so it is important
to be able to refer to controller settings as a group.  Hence, this commit
bundles them into a new "struct ofproto_controller".
2010-04-20 11:01:44 -07:00
Ben Pfaff
8722022c0c Update fake bond devices' statistics with the sum of bond slaves' stats.
Needed by XAPI to accurately report bond statistics.

Ugh.

Bug NIC-63.
2010-04-19 11:12:27 -07:00
Jesse Gross
659586efcf tunneling: Add support for tunnel ID.
Add a tun_id field which contains the ID of the encapsulating tunnel
on which a packet was received (0 if not received on a tunnel).  Also
add an action which allows the tunnel ID to be set for outgoing
packets.  At this point there aren't any tunnel implementations so
these fields don't have any effect.

The matching is exposed to OpenFlow by overloading the high 32 bits
of the cookie as the tunnel ID.  ovs-ofctl is capable of turning
on this special behavior using a new "tun-cookie" command but this
command is intentionally undocumented to avoid it being used without
a full understanding of the consequences.
2010-04-19 09:11:51 -04:00
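The cookie overloading described above amounts to simple bit manipulation on a 64-bit value. The following sketch assumes the cookie is already in host byte order; both function names are hypothetical and are not the ovs-ofctl or ofproto implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extracts the tunnel ID carried in the high 32 bits
 * of a 64-bit cookie (host byte order assumed). */
static uint32_t
cookie_to_tun_id(uint64_t cookie)
{
    return (uint32_t) (cookie >> 32);
}

/* Hypothetical helper: returns 'cookie' with its high 32 bits replaced by
 * 'tun_id', leaving the low 32 bits untouched. */
static uint64_t
cookie_set_tun_id(uint64_t cookie, uint32_t tun_id)
{
    return ((uint64_t) tun_id << 32) | (cookie & UINT64_C(0xffffffff));
}
```

Only the low 32 bits remain available as an ordinary cookie once this mode is on, which is one of the consequences the commit message warns about.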
Ben Pfaff
db0e2ad101 bridge: Don't learn from inadmissible flows when revising learning table.
Various kinds of flows are inadmissible and must be dropped.  Most notably,
OVS drops packets received on a bond whose destinations are ones that OVS
has already learned on a different port.  As the comment says:

         /* Drop all packets for which we have learned a different input
          * port, because we probably sent the packet on one slave and got
          * it back on the other.  Broadcast ARP replies are an exception
          * to this rule: the host has moved to another switch. */

As an important side effect of dropping these packets, OVS does not use
them for MAC learning when it sets up the corresponding flows.

However, OVS also periodically scans the datapath flow table and uses
information about flow activity to update its learning tables.  (Otherwise,
learning table entries could expire because no new flows were being set up,
even though active flows existed.)  This process, implemented in
bridge_account_flow_ofhook_cb(), did not check for admissibility, so
packets received on a bond could be used for learning even though another
port had already been learned.

This commit fixes the problem by making bridge_account_flow_ofhook_cb()
check for admissibility.

QA notes: Reproducing this problem requires some care and some luck.  One
way is to have two VMs with network interfaces on a single bonded network.
Both bonded interfaces must be up (otherwise packets sent out on one slave
will never be received on the other).  The problem will also not occur if
the physical switch that the bond slaves are plugged into has learned the
MAC address of the VMs involved (because the physical switch will then,
again, drop the packets without sending them back in on the other slave).
Finally, there needs to be some luck in timing and perhaps with the OVS
internal hash function also.

(One way to reproduce it reliably is to plug a pair of Ethernet ports into
each other with a cable, without an intermediate switch, and then use that
pair of ports as a bond.  Then every packet sent out on one will
immediately be received on the other, triggering the problem fairly often.
If this doesn't work at first, try changing the Ethernet address used on
one side or the other.)

To verify that the problem being observed is the one fixed by this commit,
turn on bridge debugging with "ovs-appctl vlog/set bridge:file" and look
for "bridge xapi2: learned that 00:01:02:03:04:06 is on port bond0 in VLAN
0" where 00:01:02:03:04:06 is a VM's Ethernet address and bond0 is the
name of the bond in the ovs-vswitchd log file.

Testing: I ran the "loopback bond" test above with and without this commit,
twice each in case I was just lucky.

CC: Henrik Amren <henrik@nicira.com>

Bug #2366.
Bug NIC-64.
Bug NIC-69.
2010-04-16 09:53:09 -07:00
Ben Pfaff
14a34d00ad bridge: Factor admissibility check out of process_flow().
The next commit will need to make the same tests as the first part of
process_flow(), so this commit breaks that out into a new function
is_admissible().

Should have no externally visible effect.
2010-04-16 09:53:09 -07:00
Ben Pfaff
1609aa0310 bridge: Minor cleanup in process_flow().
Should have no externally visible effect.
2010-04-16 09:53:08 -07:00
Ben Pfaff
f925807dc8 bridge: Don't log warnings when revalidating.
The rest of the warnings along this path follow this rule, but this one
didn't.  Make it consistent.
2010-04-16 09:53:08 -07:00
Justin Pettit
6501fda909 vswitch: Mark bridge_update_desc argument as unused
The implementation of bridge_update_desc() is empty, which causes a
compiler warning for the argument.  Mark the argument unused until we
get a chance to fix the function's implementation.
2010-04-08 17:53:58 -07:00
Justin Pettit
7128cd6b04 vswitchd: Fix small memory leak in bridge_init
2010-04-06 13:46:28 -07:00
Ben Pfaff
c8143c8879 vswitchd: Make the bond rebalancing interval user-configurable.
This may make some bond debugging problems easier.  It also seems
reasonable to expose this parameter to the user.

Related to bug #2366.
2010-04-05 16:41:50 -07:00
Ben Pfaff
415f6c0b1c stream-ssl: Make no-op reconfiguration cheap.
Until now, the stream_ssl functions for configuring private keys,
certificates, and CA certificates have always called into OpenSSL to read
a file.  This commit instead makes them do that only if the file name
changed (or it has been 60 seconds since we last tried, in case someone
installed the file behind our backs).

This allows us to factor some code out of vswitchd.  In an upcoming commit
we will want to do essentially the same thing from ovsdb-server, so this
avoids code redundancy.
2010-03-19 15:18:37 -07:00
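The caching rule in this commit (re-read only when the file name changed or 60 seconds have passed) can be sketched as below. This is an illustration under stated assumptions rather than the stream-ssl code: the struct and function names are hypothetical, time is a caller-supplied millisecond counter, and the cache borrows the file name pointer instead of copying it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define REREAD_INTERVAL_MS (60 * 1000)  /* 60 seconds, per the commit. */

/* Hypothetical cache record for one configured file. */
struct file_cache {
    const char *file_name;  /* Name last read (borrowed), or NULL if none. */
    long long last_attempt; /* Time of the last read attempt, in ms. */
};

/* Returns true if 'file_name' should be read again at time 'now': nothing
 * has been read yet, the name changed, or the interval has elapsed. */
static bool
file_needs_reread(const struct file_cache *cache, const char *file_name,
                  long long now)
{
    assert(file_name != NULL);
    return !cache->file_name
           || strcmp(cache->file_name, file_name) != 0
           || now - cache->last_attempt >= REREAD_INTERVAL_MS;
}

/* Records that 'file_name' was read (or that a read was attempted) at
 * time 'now'. */
static void
file_mark_read(struct file_cache *cache, const char *file_name, long long now)
{
    cache->file_name = file_name;
    cache->last_attempt = now;
}
```

The periodic retry covers the case the commit mentions, where someone replaces the file on disk without changing its name.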
Ben Pfaff
939ff2674c vswitch: Use weak references in Mirror table.
A port mirror seems sufficiently disconnected from the ports that it
mirrors that it seems counterproductive to forbid removing a port if
it is mirrored.  This commit therefore changes the references from
Mirror to Port from strong references to weak references, so that
removing a port automatically removes references to it from the Mirror
table.

Since this could cause the port and VLAN selection for the Mirror to
become empty, which would make the mirror select all packets, at the same
time this commit adds a new column "select_all" to Mirror, to explicitly
allow selecting all packets.
2010-03-17 14:24:56 -07:00
Jesse Gross
b9172b0078 bridge: Remove leftover from config file.
The 'pfx' variable is no longer used now that the config file is
gone and its only purpose in life is to be freed so get rid of it.
2010-03-15 15:43:27 -04:00
Justin Pettit
d4c2000bb2 bridge: Map an "internal" config device type to a "system" netdev type
ovs-vswitchd has a concept of an "internal" port, which is created
on-demand.  The netdev library doesn't understand an "internal" device
type, so we map it to a "system" one.

Bug #2431
2010-02-21 00:41:19 -08:00
Justin Pettit
23ff2821fd ovs-vswitchd: Remove inline OpenFlow descriptions
Replace inline OpenFlow descriptions with #define.  Also, start work to
support setting them dynamically.

(This was originally working with the config file version of vswitchd,
but needs to be updated to work with the config db.)
2010-02-20 02:22:29 -08:00
Ben Pfaff
c69ee87c10 Merge "master" into "next".
The main change here is the need to update all of the uses of UNUSED in
the next branch to OVS_UNUSED as it is now spelled on "master".
2010-02-11 11:11:23 -08:00
Ben Pfaff
67a4917b07 Rename UNUSED macro to OVS_UNUSED to avoid naming conflict.
Requested by Jean Tourrilhes <jt@hpl.hp.com>.
2010-02-11 10:59:47 -08:00
Ben Pfaff
5fbb4e187d vswitchd: Drop assignment whose value is never used in port_reconfigure().
Seems cleaner this way.

Found by Clang (http://clang-analyzer.llvm.org/).
2010-02-11 10:35:28 -08:00
Ben Pfaff
f446d59b36 vswitch: Fix uninitialized variable.
The 'ip' variable in this inner "if" statement shadows a variable with
the same name in the enclosing block.  The variable in the inner block
is never initialized.

Found by Clang (http://clang-analyzer.llvm.org).
2010-02-11 10:33:11 -08:00
Ben Pfaff
0b523ea7a8 vswitchd: Add missing * to use of sizeof.
Found using coccinelle and coccicheck (http://coccinelle.lip6.fr/).
2010-02-09 10:20:13 -08:00
Jesse Gross
70150daf2f vswitch: Consistently set Nicira OUI.
In places where a random Ethernet address needs to be generated we
are inconsistent about setting an OUI.  This sets an OUI everywhere
to allow the source of packets to be easily identified.
2010-02-08 15:01:20 -05:00