mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

47 Commits

Author SHA1 Message Date
Ben Pfaff
9fe3b9a2ee datapath: Drop datapath index and port number from Ethtool output.
I introduced this a long time ago as an efficient way for userspace to find
out whether and where an internal device was attached, but I've always
considered it an ugly kluge.  Now that ODP_VPORT_QUERY can fetch a vport's
info regardless of datapath, it is no longer necessary.  This commit
stops using Ethtool for this purpose and drops the feature.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
51d4d59822 datapath: Make it possible to query vports by name regardless of datapath.
Until now it has only been possible to query a vport if you know what
datapath it is on.  This doesn't really make sense, so this commit removes
that restriction.  It is a little bigger than one might naturally expect
because locking changes are required.

This also allows us to get rid of the ETHTOOL_GDRVINFO kluge that has
bothered me for a long time.  The next commit does that.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
b0ec0f279e datapath: Change listing ports to use an iterator concept.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software.  In
turn, that means that the odp_port structure must become variable-length.
This does not, however, fit in well with the ODP_PORT_LIST ioctl in its
current form, because that would require userspace to know how much space
to allocate for each port in advance, or to allocate as much space as
could possibly be needed.  Neither choice is very attractive.

This commit prepares for a different solution, by replacing ODP_PORT_LIST
by a new ioctl ODP_VPORT_DUMP that retrieves information about a single
vport from the datapath on each call.  It is much cleaner to allocate the
maximum amount of space for a single vport than to do so for possibly a
large number of vports.

It would be faster to retrieve a number of vports in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.

The Netlink version won't need to take the starting port number from
userspace, since Netlink sockets can keep track of that state as part
of their "dump" feature.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
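
The iterator idea in commit b0ec0f279e above can be sketched in a few lines. This is only an illustration under stated assumptions: `struct port_info`, `MAX_PORTS`, `port_dump_next()` and `count_ports()` are hypothetical names, and an in-memory table indexed by port number stands in for the kernel datapath. It is not the ODP_VPORT_DUMP interface itself. The caller supplies the starting port number and receives at most one port per call.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 8           /* Hypothetical table size, for illustration. */

struct port_info {            /* Hypothetical stand-in for a vport record. */
    uint16_t port_no;
    const char *name;         /* NULL marks an unused slot. */
};

/* Copies the first in-use port whose number is >= 'start' into '*out'.
 * Returns 1 if a port was found, 0 when the dump is complete.
 * Assumes table[i].port_no == i for every in-use slot. */
static int
port_dump_next(const struct port_info table[MAX_PORTS], uint16_t start,
               struct port_info *out)
{
    for (size_t i = start; i < MAX_PORTS; i++) {
        if (table[i].name) {
            *out = table[i];
            return 1;
        }
    }
    return 0;
}

/* Walks every port, one call per port, and returns how many were seen. */
static size_t
count_ports(const struct port_info table[MAX_PORTS])
{
    struct port_info p;
    uint16_t start = 0;
    size_t n = 0;

    while (port_dump_next(table, start, &p)) {
        n++;
        start = (uint16_t) (p.port_no + 1);   /* Resume past this port. */
    }
    return n;
}
```

The caller only ever needs room for one `struct port_info`, which is the property the commit message is after.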
Ben Pfaff
856081f683 datapath: Report kernel's flow key when passing packets up to userspace.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.

This commit takes one step in that direction by making the kernel report
its idea of the flow that a packet belongs to whenever it passes a packet
up to userspace.  This means that userspace can intelligently figure out
what to do:

   - If userspace's notion of the flow for the packet matches the kernel's,
     then nothing special is necessary.

   - If the kernel has a more specific notion for the flow than userspace,
     for example if the kernel decoded IPv6 headers but userspace stopped
     at the Ethernet type (because it does not understand IPv6), then again
     nothing special is necessary: userspace can still set up the flow in
     the usual way.

   - If userspace has a more specific notion for the flow than the kernel,
     for example if userspace decoded an IPv6 header but the kernel
     stopped at the Ethernet type, then userspace can forward the packet
     manually, without setting up a flow in the kernel.  (This case is
     bad from a performance point of view, but at least it is correct.)

This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, although userspace does now
have enough information to do that intelligently.  This will have to wait
for later commits.

This commit is bigger than it would otherwise be because it is rolled
together with changing "struct odp_msg" to a sequence of Netlink
attributes.  The alternative, to do each of those changes in a separate
patch, seemed like overkill because it meant that either we would have to
introduce and then kill off Netlink attributes for in_port and tun_id, if
Netlink conversion went first, or shove yet another variable-length header
into the stuff already after odp_msg, if adding the flow key to odp_msg
went first.

This commit will slow down performance of checksumming packets sent up to
userspace.  I'm not entirely pleased with how I did it.  I considered a
couple of alternatives, but none of them seemed that much better.
Suggestions welcome.  Not changing anything wasn't an option,
unfortunately.  At any rate some slowdown will become unavoidable when OVS
actually starts using Netlink instead of just Netlink framing.

(Actually, I thought of one option where we could avoid that: make
userspace do the checksum instead, by passing csum_start and csum_offset as
part of what goes to userspace.  But that's not perfect either.)

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
704a1e09e9 datapath: Change listing flows to use an iterator concept.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length.  This does
not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form,
because that would require userspace to know how much space to allocate
for each flow's key in advance, or to allocate as much space as could
possibly be needed.  Neither choice is very attractive.

This commit prepares for a different solution, by replacing ODP_FLOW_LIST
by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath
on each call.  It is much cleaner to allocate the maximum amount of space
for a single flow key than to do so for possibly a very large number of
flow keys.

As a side effect, this patch also fixes a race condition that sometimes
made "ovs-dpctl dump-flows" print an error: previously, flows were listed
and then their actions were retrieved, which left a window in which
ovs-vswitchd could delete the flow.  Now dumping a flow and its actions is
a single step, closing that window.

Dumping all of the flows in a datapath is no longer an atomic step, so now
it is possible to miss some flows or see a single flow twice during
iteration, if the flow table is modified by another process.  It doesn't
look like this should be a problem for ovs-vswitchd.

It would be faster to retrieve a number of flows in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:35 -08:00
Ethan Jackson
21d6e22eee rtnetlink: Remove LINK specific messages from rtnetlink
Abstracted rtnetlink so that it may be used for messages other than
RTM LINK messages.  Created a new rtnetlink-link module which
specifically deals with these kinds of messages and follows the old
rtnetlink API.
2011-01-04 10:34:55 -08:00
Justin Pettit
e16a28b585 vswitch: Use "ipsec_gre" vport instead of "gre" with "other_config"
Previously, a GRE-over-IPsec tunnel was created as an interface with a
"type" of "gre" and the "other_config" column with "ipsec_cert" or
"ipsec_psk" set.  This could lead to a potential security problem if a user
intended to create a GRE-over-IPsec tunnel, but misconfigured the
"ipsec_*" config and created an unencrypted GRE tunnel.

This commit defines an "ipsec_gre" tunnel type, which should prevent
users from inadvertently establishing insecure tunnels.
2010-12-28 14:30:36 -08:00
Jesse Gross
cf22f8cba3 vswitchd: Consistently use size_t for action lengths.
Currently the type of the datapath action length is a mixture of
size_t and unsigned int.  However, size_t is really defined as an
unsigned long, which causes the build to fail on 64-bit platforms.
This consistently uses size_t.
2010-12-13 11:07:15 -08:00
Ben Pfaff
cdee00fd63 datapath: Replace "struct odp_action" by Netlink attributes.
In the medium term, we plan to migrate the datapath to use Netlink as its
communication channel.  In the short term, we need to be able to have
actions with 64-bit arguments but "struct odp_action" only has room for
48 bits.  So this patch shifts to variable-length arguments using Netlink
attributes, which starts in on the Netlink transition and makes 64-bit
arguments possible at the same time.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-10 11:13:32 -08:00
Ben Pfaff
c3827f619a datapath: Make adding and attaching a vport a single step.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath.  Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath.  Similarly, a vport
could be detached and deleted separately.

After some study, I think I understand why this distinction exists.  It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them.  However, changing it to create them before it tries to
open them is not difficult, so this commit does this.

The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 14:41:38 -08:00
Ben Pfaff
4a38774146 dpif: Make dpif_class 'open' function take class instead of type name.
This makes it easier for dpif_provider implementations to share code but
distinguish the class actually in use, because comparing a pointer is
easier than comparing a string.
2010-11-18 10:08:05 -08:00
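
A minimal sketch of the point made in commit 4a38774146 above, assuming a hypothetical `struct dp_class` with two illustrative instances (none of these names are taken from the dpif code): code shared between providers can tell which class is in use with a pointer comparison instead of a string comparison.

```c
#include <stdbool.h>
#include <string.h>

struct dp_class {             /* Hypothetical stand-in for a provider class. */
    const char *type;
};

/* Illustrative instances; the type strings are assumptions. */
static const struct dp_class class_a = { "system" };
static const struct dp_class class_b = { "netdev" };

/* Given only a type name, shared code has to compare strings. */
static bool
is_class_a_by_name(const char *type)
{
    return !strcmp(type, class_a.type);
}

/* Given the class itself, an identity check on the pointer is enough. */
static bool
is_class_a(const struct dp_class *cls)
{
    return cls == &class_a;
}
```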
Ben Pfaff
d98e600755 vlog: Make client supply semicolon for VLOG_DEFINE_THIS_MODULE.
It's kind of odd for VLOG_DEFINE_THIS_MODULE to supply its own semicolon,
so this commit switches to the more common form.
2010-10-29 09:48:47 -07:00
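
To illustrate the convention adopted in commit d98e600755 above, here is a sketch using a hypothetical macro, `DEFINE_THIS_MODULE`, which is not the actual definition of VLOG_DEFINE_THIS_MODULE. The macro expands to a declaration without a trailing semicolon, so the use site supplies one and reads like any other declaration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical macro, not the real vlog definition: it expands to a
 * declaration with no trailing semicolon, so the caller writes one. */
#define DEFINE_THIS_MODULE(NAME) \
    static const char this_module_name[] = #NAME

DEFINE_THIS_MODULE(dpif_linux);   /* The client supplies the semicolon. */

/* Returns true if 'name' is the module defined in this file. */
static bool
module_is(const char *name)
{
    return !strcmp(this_module_name, name);
}
```

Had the macro supplied its own semicolon, writing `DEFINE_THIS_MODULE(dpif_linux);` would leave a stray empty declaration at file scope, which some compilers warn about in pedantic mode.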
Ben Pfaff
f1588b1fa1 datapath: Remove implementation of port groups.
The "port group" concept seems like a good one, but it has not been
used very much in userspace so far, so before we commit ourselves to
a frozen API that we must maintain forever, remove it.  We can always
add it back in later as a new kind of vport.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2010-10-11 12:40:11 -07:00
Ben Pfaff
4f2226487d shash: New function shash_steal(). 2010-09-23 11:45:34 -07:00
Ben Pfaff
5136364f41 vlog: Add VLOG_WARN_ONCE() and similar macros. 2010-09-23 11:45:34 -07:00
Ben Pfaff
68efcbec41 ofpbuf: Add ofpbuf_new_with_headroom(), ofpbuf_clone_with_headroom().
These new functions simplify an increasingly common usage pattern.

Suggested-by: Jesse Gross <jesse@nicira.com>
2010-09-01 12:55:50 -07:00
Ben Pfaff
5136ce492c vlog: Introduce VLOG_DEFINE_THIS_MODULE for declaring vlog module in use.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined.  A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
2010-07-21 15:47:09 -07:00
Ben Pfaff
17ee3c1ffd netdev-linux: Avoid minor number 0 in traffic control.
Linux traffic control handles with minor number 0 refer to qdiscs, not
to classes.  This commit deals with this by using a conversion function:
OpenFlow queue 0 maps to minor 1, queue 1 to minor 2, and so on.
2010-07-20 11:26:58 -07:00
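
The conversion described in commit 17ee3c1ffd above amounts to an offset of one. A minimal sketch, with hypothetical helper names (the mapping itself, queue N to minor N + 1, is the one stated in the commit message):

```c
#include <stdint.h>

/* OpenFlow queue 0 maps to minor 1, queue 1 to minor 2, and so on, so
 * that minor 0 (which refers to a qdisc) is never used for a class. */
static uint32_t
queue_id_to_minor(uint32_t queue_id)
{
    return queue_id + 1;
}

/* Inverse mapping; 'minor' must be at least 1. */
static uint32_t
minor_to_queue_id(uint32_t minor)
{
    return minor - 1;
}
```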
Ben Pfaff
5b3941ee17 dpif-linux: Translate queues to priorities correctly.
The TC_H_MAKE macro does not shift the major number into position.
2010-07-20 11:26:58 -07:00
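
The pitfall behind commit 5b3941ee17 above can be shown with local copies of the Linux macros. The `EX_` names below are assumptions made so the sketch compiles without kernel headers; they mirror `TC_H_MAJ_MASK`, `TC_H_MIN_MASK` and `TC_H_MAKE` from `<linux/pkt_sched.h>`. `make_handle()` is a hypothetical helper. The macro only masks its arguments, so the caller must shift the major number into the upper 16 bits first.

```c
#include <stdint.h>

/* Local copies of the Linux definitions, renamed for illustration. */
#define EX_TC_H_MAJ_MASK 0xFFFF0000U
#define EX_TC_H_MIN_MASK 0x0000FFFFU
#define EX_TC_H_MAKE(maj, min) \
    (((maj) & EX_TC_H_MAJ_MASK) | ((min) & EX_TC_H_MIN_MASK))

/* Builds a traffic control handle from an unshifted major and a minor.
 * The explicit shift is required: the macro masks but does not shift. */
static uint32_t
make_handle(uint32_t major, uint32_t minor)
{
    return EX_TC_H_MAKE(major << 16, minor);
}
```

Passing an unshifted major such as 1 directly to the macro silently drops it, because `1 & 0xFFFF0000` is 0.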
Ben Pfaff
aae51f5335 dpif: Abstract translation from OpenFlow queue ID into ODP priority value.
When the QoS code was integrated, I didn't yet know how to abstract the
translation from a queue ID in an OpenFlow OFPAT_ENQUEUE action into a
priority value for an ODP ODPAT_SET_PRIORITY action.  This commit is a
first attempt that works OK for Linux, so far.  It's possible that in fact
this translation needs the 'netdev' as an argument too, but it's not needed
yet.
2010-07-20 11:23:21 -07:00
Ben Pfaff
b90fa799b9 datapath: Make datapath-protocol.h portable to non-Linux systems.
datapath-protocol.h is not a very clean interface.  I originally intended
it to be solely a Linux-kernel specific interface.  Over time it became
a general-purpose interface to dpifs.  This is not a good situation,
because clearly the header is still Linux-specific.

In the long run, the correct solution is to separate the generic and
Linux-specific bits.  This is not that patch.  Instead, this patch modifies
datapath-protocol.h enough that it can be used on non-Linux hosts.  In
particular I tested that it works OK with FreeBSD 8.0.
2010-05-26 15:32:34 -07:00
Justin Pettit
10dcf8deec dpif: Include stat.h header 2010-05-20 13:27:55 -07:00
Ben Pfaff
54825e09b3 dpif-linux: Use hash instead of sorted array.
With 1000 network devices being added or removed, sorting the array was a
profiling hot spot.  Using a hash makes it drop off the profile.
2010-05-05 14:00:50 -07:00
Ben Pfaff
4325359529 ofproto: Avoid buffer copy in OFPT_PACKET_IN path.
When a dpif passes an odp_msg down to ofproto, and ofproto transforms it
into an ofp_packet_in to send to the controller, until now this always
involved a full copy of the packet inside ofproto.  This commit eliminates
this copy by ensuring that there is always enough headroom in the ofpbuf
that holds the odp_msg to replace it by an ofp_packet_in in-place.

From Jean Tourrilhes <jt@hpl.hp.com>, with some revisions.
2010-04-27 09:40:46 -07:00
Jesse Gross
3abc4a1a6c dpif-linux: Clean up vports that are no longer in config.
If the config changes while ovs-vswitchd is not running it is possible
that there could be some vports which are no longer needed but won't
be destroyed when closed because they aren't open.  This deletes
unneeded vports at the same time that we clean up unneeded datapaths.
2010-04-19 09:11:57 -04:00
Jesse Gross
f2459fe7d9 datapath: Add generic virtual port layer.
Currently the datapath directly accesses devices through their
Linux functions.  Obviously this doesn't work for virtual devices
that are not backed by an actual Linux device.  This creates a
new virtual port layer which handles all interaction with devices.

The existing support for Linux devices was then implemented on top
of this layer as two device types.  It splits out and renames dp_dev
to internal_dev.  There were several places where datapath devices
had to be handled in a special manner and this cleans that up by putting
all the special casing in a single location.
2010-04-19 09:11:57 -04:00
Tetsuo NAKAGAWA
ed30fb10e1 dpif-linux: Fix file descriptor leak.
get_major() opens /proc/devices to get the openvswitch major number
but never closes the FD.
2010-03-25 10:54:52 -04:00
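
The kind of fix described in commit ed30fb10e1 above can be sketched as follows. This is not the actual get_major() code: `parse_major()` and `lookup_major()` are hypothetical names, and the sketch assumes the file lists one "major name" pair per line, as /proc/devices does. The point is that the stream is closed on every path out of the function.

```c
#include <stdio.h>
#include <string.h>

/* Scans a /proc/devices-style stream for 'target'.  Returns its major
 * number, or -1 if it is not listed.  Lines that do not look like
 * "<number> <name>", such as section headers, are skipped. */
static int
parse_major(FILE *stream, const char *target)
{
    char line[128];

    while (fgets(line, sizeof line, stream)) {
        char name[64];
        int major;

        if (sscanf(line, "%d %63s", &major, name) == 2
            && !strcmp(name, target)) {
            return major;
        }
    }
    return -1;
}

/* Opens 'path', looks up 'target', and closes the stream before
 * returning, whether or not the device was found. */
static int
lookup_major(const char *path, const char *target)
{
    FILE *stream = fopen(path, "r");
    int major;

    if (!stream) {
        return -1;
    }
    major = parse_major(stream, target);
    fclose(stream);             /* Without this, each call leaks an FD. */
    return major;
}
```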
Ben Pfaff
c69ee87c10 Merge "master" into "next".
The main change here is the need to update all of the uses of UNUSED in
the next branch to OVS_UNUSED as it is now spelled on "master".
2010-02-11 11:11:23 -08:00
Ben Pfaff
67a4917b07 Rename UNUSED macro to OVS_UNUSED to avoid naming conflict.
Requested by Jean Tourrilhes <jt@hpl.hp.com>.
2010-02-11 10:59:47 -08:00
Jesse Gross
7dab847a19 Fix some regressions from the merge from master. 2010-02-08 13:31:33 -05:00
Justin Pettit
a4af00400a Merge branch 'master' into next
Conflicts:
	COPYING
	datapath/datapath.h
	lib/automake.mk
	lib/dpif-provider.h
	lib/dpif.c
	lib/hmap.h
	lib/netdev-provider.h
	lib/netdev.c
	lib/stream-ssl.h
	ofproto/executer.c
	ofproto/ofproto.c
	ofproto/ofproto.h
	tests/automake.mk
	utilities/ovs-ofctl.c
	utilities/ovs-vsctl.in
	vswitchd/ovs-vswitchd.conf.5.in
	xenserver/etc_init.d_vswitch
	xenserver/etc_xensource_scripts_vif
	xenserver/opt_xensource_libexec_interface-reconfigure
2010-02-05 17:14:55 -08:00
Jesse Gross
999401aa9c dpif: Allow providers to be managed at runtime.
The list of datapath providers was previously statically defined at
compile time.  This allows new providers to be added and removed
at runtime.
2010-02-01 12:00:53 -05:00
Jesse Gross
1a6f1e2a6d dpif: Update dpif interface to match netdev.
This brings over some features that were added to the netdev interface,
most notably the separation between the name and the type.  In addition
to being cleaner, this also avoids problems where it is expected that
the local port has the same name as the datapath.
2010-01-27 20:03:38 -05:00
Ben Pfaff
8334b477a4 dpif-linux: Always set *fnp in make_openvswitch_device().
Some versions of GCC warn about this.  Always initializing it seems like
the right thing to do, since we "almost always" initialized it before.

Reported-by: Neil McKee <neil.mckee@inmon.com>
2010-01-19 10:10:52 -08:00
Ben Pfaff
72b0630028 Initial implementation of sFlow.
Tested very slightly with "ping" and "sflowtool -t | tcpdump -r -".
2010-01-04 13:08:37 -08:00
Jesse Gross
d65349ea28 Merge citrix branch into master. 2009-11-10 15:12:01 -08:00
Justin Pettit
f17d7bd838 dpif-linux: Clarify bad device warning message
The message warning that the device number is wrong for the Open vSwitch
devices could have been clearer.

Thanks to Ben Pfaff for the suggested wording.
2009-10-02 16:59:28 -07:00
Justin Pettit
57aaff8a99 dpif-linux: Fail earlier if OVS kernel module isn't loaded
When the kernel module isn't loaded, the bridge tries to open all the
possible minor devices, regardless.  This change first checks that there
is a major device number for Open vSwitch and only then tries to open the
minor devices.

This change also removes the assumption that there's a default Open vSwitch
major device number, since the kernel module always attempts to get a
dynamic one.  Maybe one day we'll have one...

Bug #1179
2009-10-02 16:07:32 -07:00
Justin Pettit
be2c418b73 Cleanup isdigit() warnings.
NetBSD's gcc complains if isdigit()'s argument is an unadorned char.  This
provides an appropriate cast.
2009-08-25 14:11:44 -07:00
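
The cast mentioned in commit be2c418b73 above looks like the following in practice. `count_leading_digits()` is a hypothetical helper written for this sketch. Converting to `unsigned char` first keeps the argument within the range that `isdigit()` is defined for, even where plain `char` is signed.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the number of decimal digits at the start of 's'. */
static size_t
count_leading_digits(const char *s)
{
    size_t n = 0;

    /* Cast to unsigned char: passing a negative char value to isdigit()
     * is undefined behavior. */
    while (isdigit((unsigned char) s[n])) {
        n++;
    }
    return n;
}
```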
Ben Pfaff
559843ed53 rtnetlink: Move into separate source and header file.
Now that rtnetlink isn't named similarly to netdev_linux, it might as well
have its own source and header files to avoid confusing everyone.
2009-07-30 16:07:15 -07:00
Ben Pfaff
46097491e4 netdev-linux: Rename "linux_netdev_*" to "rtnetlink_*".
It was getting to be too confusing to have both netdev_linux_* functions
and linux_netdev_* functions.  Rename the latter to make the distinction
more obvious.  "rtnetlink" seems to be a fairly good name because that's
what the kernel calls it, so the name will be familiar at least to people
who know about rtnetlink.
2009-07-30 16:07:14 -07:00
Ben Pfaff
8b61709d5e netdev: Implement an abstract interface to network devices.
This new abstraction layer allows multiple implementations of network
devices in a single running process.  This will be useful, for example, to
support network devices that are simulated entirely in the running process
or that communicate with other processes over Unix domain sockets, etc.

The reimplemented tap device support in this commit has not been tested.
2009-07-30 16:07:14 -07:00
Ben Pfaff
d3d22744a7 vswitch: Avoid knowledge of details specific to Linux datapaths.
At startup, the vswitch needs to delete datapaths that are not configured
by the administrator.  Until now this was done by knowing the possible
names of Linux datapaths.  This commit cleans up by allowing each
datapath class to enumerate its existing datapaths and their names.
2009-07-06 11:06:36 -07:00
Ben Pfaff
a165b67e53 dpif-linux: Don't allow arbitrary internal ports to identify a datapath.
The userspace tools were allowing the name of any internal port to be used
to identify a datapath.  This, however, makes it hard to enumerate all the
names by which a datapath can be known, and it was never documented or
intentional behavior, so this commit disables it.
2009-07-06 11:02:57 -07:00
Ben Pfaff
e9e28be359 Introduce general-purpose ways to wait for dpif and netdev changes.
The dpif and netdev code has had various ways to check for changes to
dpifs and netdevs over the course of Open vSwitch development.  All of
these have been thus far fairly specific to the Linux implementation.  This
commit is the start of a more general API for watching for such changes.
The dpif-related parts seem fairly mature and so they are documented;
the netdev parts will probably need to change somewhat and so they are
not documented yet.
2009-07-06 09:07:24 -07:00
Ben Pfaff
5792c5c64a dpif: Add new functions dp_run() and dp_wait().
The upcoming netdev-based dpif needs a hook where it can process packets
and throw them against the flow table, and this provides a suitable place.
2009-07-06 09:07:24 -07:00
Ben Pfaff
96fba48f52 dpif: Make dpifs abstract, to allow multiple datapath implementations.
This commit initially introduces only a single datapath implementation,
which is the same as the original one, but it paves the way for
additional implementations, such as the upcoming userspace datapath.
2009-07-06 09:07:24 -07:00