2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-29 05:18:13 +00:00

84 Commits

Author SHA1 Message Date
Ben Pfaff
feebdea2e5 dpif: Eliminate "struct odp_flow" from client-visible interface.
Following this commit, "struct odp_flow" and related data structures are
only used in Linux-specific parts of OVS userspace code.  This allows the
actual Linux datapath interface to evolve more freely.

Reviewed by Justin Pettit.
2011-01-27 21:08:38 -08:00
Ben Pfaff
bc4a05c639 datapath: Change ODP_FLOW_GET to retrieve only a single flow at a time.
This brings the code closer to what the Netlink interface will need to
implement.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:38 -08:00
Ben Pfaff
996c1b3d7a datapath: Drop port information from odp_stats.
As with n_flows, n_ports was used regularly by userspace to determine how
much memory to allocate when listing ports, but it is no longer needed for
that.  max_ports, on the other hand, is necessary but it is also a fixed
value for the kernel datapath right now and if we expand it we can also
come up with a way to report the expanded value.

The remaining members of odp_stats are actually real statistics that I
intend to keep.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:38 -08:00
Ben Pfaff
1ba530f4b2 datapath: Drop queue information from odp_stats.
This queue information will be available through the kernel socket layer
once we move over to Netlink socket as transports, so we might as well get
rid of the redundancy.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:38 -08:00
Ben Pfaff
4c738a8da5 dpif: Eliminate "struct odp_port" from client-visible interface.
Following this commit, "struct odp_port" is only used in Linux-specific
parts of OVS userspace code.  This allows the actual Linux datapath
interface to evolve more freely.

Reviewed by Justin Pettit.
2011-01-27 21:08:37 -08:00
Ben Pfaff
b0ec0f279e datapath: Change listing ports to use an iterator concept.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software.  In
turn, that means that the odp_port structure must become variable-length.
This does not, however, fit in well with the ODP_PORT_LIST ioctl in its
current form, because that would require userspace to know how much space
to allocate for each port in advance, or to allocate as much space as
could possibly be needed.  Neither choice is very attractive.

This commit prepares for a different solution, by replacing ODP_PORT_LIST
by a new ioctl ODP_VPORT_DUMP that retrieves information about a single
vport from the datapath on each call.  It is much cleaner to allocate the
maximum amount of space for a single vport than to do so for possibly a
large number of vports.

It would be faster to retrieve a number of vports in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.

The Netlink version won't need to take the starting port number from
userspace, since Netlink sockets can keep track of that state as part
of their "dump" feature.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
856081f683 datapath: Report kernel's flow key when passing packets up to userspace.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.

This commit takes one step in that direction by making the kernel report
its idea of the flow that a packet belongs to whenever it passes a packet
up to userspace.  This means that userspace can intelligently figure out
what to do:

   - If userspace's notion of the flow for the packet matches the kernel's,
     then nothing special is necessary.

   - If the kernel has a more specific notion for the flow than userspace,
     for example if the kernel decoded IPv6 headers but userspace stopped
     at the Ethernet type (because it does not understand IPv6), then again
     nothing special is necessary: userspace can still set up the flow in
     the usual way.

   - If userspace has a more specific notion for the flow than the kernel,
     for example if userspace decoded an IPv6 header but the kernel
     stopped at the Ethernet type, then userspace can forward the packet
     manually, without setting up a flow in the kernel.  (This case is
     bad from a performance point of view, but at least it is correct.)

This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, although userspace does now
have enough information to do that intelligently.  This will have to wait
for later commits.

This commit is bigger than it would otherwise be because it is rolled
together with changing "struct odp_msg" to a sequence of Netlink
attributes.  The alternative, to do each of those changes in a separate
patch, seemed like overkill because it meant that either we would have to
introduce and then kill off Netlink attributes for in_port and tun_id, if
Netlink conversion went first, or shove yet another variable-length header
into the stuff already after odp_msg, if adding the flow key to odp_msg
went first.

This commit will slow down performance of checksumming packets sent up to
userspace.  I'm not entirely pleased with how I did it.  I considered a
couple of alternatives, but none of them seemed that much better.
Suggestions welcome.  Not changing anything wasn't an option,
unfortunately.  At any rate some slowdown will become unavoidable when OVS
actually starts using Netlink instead of just Netlink framing.

(Actually, I thought of one option where we could avoid that: make
userspace do the checksum instead, by passing csum_start and csum_offset as
part of what goes to userspace.  But that's not perfect either.)

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
704a1e09e9 datapath: Change listing flows to use an iterator concept.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length.  This does
not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form,
because that would require userspace to know how much space to allocate
for each flow's key in advance, or to allocate as much space as could
possibly be needed.  Neither choice is very attractive.

This commit prepares for a different solution, by replacing ODP_FLOW_LIST
by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath
on each call.  It is much cleaner to allocate the maximum amount of space
for a single flow key than to do so for possibly a very large number of
flow keys.

As a side effect, this patch also fixes a race condition that sometimes
made "ovs-dpctl dump-flows" print an error: previously, flows were listed
and then their actions were retrieved, which left a window in which
ovs-vswitchd could delete the flow.  Now dumping a flow and its actions is
a single step, closing that window.

Dumping all of the flows in a datapath is no longer an atomic step, so now
it is possible to miss some flows or see a single flow twice during
iteration, if the flow table is modified by another process.  It doesn't
look like this should be a problem for ovs-vswitchd.

It would be faster to retrieve a number of flows in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:35 -08:00
Ben Pfaff
b9298d3f82 Expand tunnel IDs from 32 to 64 bits.
We have a need to identify tunnels with keys longer than 32 bits.  This
commit adds basic datapath and OpenFlow support for such keys.  It doesn't
actually add any tunnel protocols that support 64-bit keys, so this is not
very useful yet.

The 'arg' member of struct odp_msg had to be expanded to 64-bits also,
because it sometimes contains a tunnel ID.  This member also contains the
argument passed to ODPAT_CONTROLLER, so I expanded that action's argument
to 64 bits also so that it can use the full width of the expanded 'arg'.
Userspace doesn't take advantage of the new space though (it was only
using 16 bits anyhow).

This commit has been tested only to the extent that it doesn't disrupt
basic Open vSwitch operation.  I have not tested it with tunnel traffic.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Feature #3976.
2010-12-10 11:14:13 -08:00
Ben Pfaff
cdee00fd63 datapath: Replace "struct odp_action" by Netlink attributes.
In the medium term, we plan to migrate the datapath to use Netlink as its
communication channel.  In the short term, we need to be able to have
actions with 64-bit arguments but "struct odp_action" only has room for
48 bits.  So this patch shifts to variable-length arguments using Netlink
attributes, which starts in on the Netlink transition and makes 64-bit
arguments possible at the same time.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-10 11:13:32 -08:00
Ben Pfaff
c3827f619a datapath: Make adding and attaching a vport a single step.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath.  Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath.  Similarly, a vport
could be detached and deleted separately.

After some study, I think I understand why this distinction exists.  It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them.  However, changing it to create them before it tries to
open them is not difficult, so this commit does this.

The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 14:41:38 -08:00
Ben Pfaff
f1588b1fa1 datapath: Remove implementation of port groups.
The "port group" concept seems like a good one, but it has not been
used very much in userspace so far, so before we commit ourselves to
a frozen API that we must maintain forever, remove it.  We can always
add it back in later as a new kind of vport.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2010-10-11 12:40:11 -07:00
Ben Pfaff
9dbb9d5e94 ofproto: Avoid user->kernel->user round-trip for many controller actions.
When an OpenFlow flow says to send packets to the controller, until now
ofproto has executed that using dpif_execute(), which passes the packet up
to the kernel.  The kernel queues the packet into its "action" queue, and
then later ofproto pulls the packet back down from the kernel and sends it
to the controller.

However, this is unnecessary.  Open vSwitch can just recognize in advance
that it will get the packet back and handle it directly, skipping the round
trip.  This commit implements this optimization.

This generally affects only the first packet in a flow, since generally the
rest come directly down from the kernel.  It only optimizes the "easy" case
where the first action in a flow is to send the packet to the controller,
since this seems to be the common case in the flows that I'm looking at
now.
2010-08-27 10:10:23 -07:00
Ben Pfaff
aae51f5335 dpif: Abstract translation from OpenFlow queue ID into ODP priority value.
When the QoS code was integrated, I didn't yet know how to abstract the
translation from a queue ID in an OpenFlow OFPAT_ENQUEUE action into a
priority value for an ODP ODPAT_SET_PRIORITY action.  This commit is a
first attempt that works OK for Linux, so far.  It's possible that in fact
this translation needs the 'netdev' as an argument too, but it's not needed
yet.
2010-07-20 11:23:21 -07:00
Ben Pfaff
02dd3123a0 Merge "master" into "next". 2010-02-24 13:47:09 -08:00
Jesse Gross
03292c4654 Add extern "C" to more header files.
From partner.
2010-02-17 10:38:54 -05:00
Justin Pettit
a4af00400a Merge branch 'master' into next
Conflicts:
	COPYING
	datapath/datapath.h
	lib/automake.mk
	lib/dpif-provider.h
	lib/dpif.c
	lib/hmap.h
	lib/netdev-provider.h
	lib/netdev.c
	lib/stream-ssl.h
	ofproto/executer.c
	ofproto/ofproto.c
	ofproto/ofproto.h
	tests/automake.mk
	utilities/ovs-ofctl.c
	utilities/ovs-vsctl.in
	vswitchd/ovs-vswitchd.conf.5.in
	xenserver/etc_init.d_vswitch
	xenserver/etc_xensource_scripts_vif
	xenserver/opt_xensource_libexec_interface-reconfigure
2010-02-05 17:14:55 -08:00
Jesse Gross
999401aa9c dpif: Allow providers to be managed at runtime.
The list of datapath providers was previously staticly defined at
compile time.  This allows new providers to be added and removed
at runtime.
2010-02-01 12:00:53 -05:00
Jesse Gross
1a6f1e2a6d dpif: Update dpif interface to match netdev.
This brings over some features that were added to the netdev interface,
most notably the separation between the name and the type.  In addition
to being cleaner, this also avoids problems where it is expected that
the local port has the same name as the datapath.
2010-01-27 20:03:38 -05:00
Ben Pfaff
72b0630028 Initial implementation of sFlow.
Tested very slightly with "ping" and "sflowtool -t | tcpdump -r -".
2010-01-04 13:08:37 -08:00
Ben Pfaff
efacbce62f dpif: New function dpif_create_and_open().
This function combines what dpif_create() and dpif_open() do.  It allows
us to factor a tiny amount of code out of the vswitch, but more importantly
this function is also useful in the following commit.
2009-11-23 15:58:48 -08:00
Ben Pfaff
d3d22744a7 vswitch: Avoid knowledge of details specific to Linux datapaths.
At startup, the vswitch needs to delete datapaths that are not configured
by the administrator.  Until now this was done by knowing the possible
names of Linux datapaths.  This commit cleans up by allowing each
datapath class to enumerate its existing datapaths and their names.
2009-07-06 11:06:36 -07:00
Ben Pfaff
e9e28be359 Introduce general-purpose ways to wait for dpif and netdev changes.
The dpif and netdev code has had various ways to check for changes to
dpifs and netdevs over the course of Open vSwitch development.  All of
these have been thus far fairly specific to the Linux implementation.  This
commit is the start of a more general API for watching for such changes.
The dpif-related parts seem fairly mature and so they are documented,
the netdev parts will probably need to change somewhat and so they are
not documented yet.
2009-07-06 09:07:24 -07:00
Ben Pfaff
5792c5c64a dpif: Add new functions dp_run() and dp_wait().
The upcoming netdev-based dpif needs a hook where it can process packets
and throw them against the flow table, and this provides a suitable place.
2009-07-06 09:07:24 -07:00
Ben Pfaff
96fba48f52 dpif: Make dpifs abstract, to allow multiple datapath implementations.
This commit initially introduces only a single datapath implementation,
which is the same as the original one, but it paves the way for
additional implementations, such as the upcoming userspace datapath.
2009-07-06 09:07:24 -07:00
Ben Pfaff
9ee3ae3e0d datapath: Make the datapath responsible for choosing port numbers.
Soon we will allow for multiple datapath implementations.  By allowing
the datapath to choose the port numbers, we possibly simplify some datapath
implementations, and the datapath's clients don't have to guess (or to
check) what port numbers are free, so this seems like a better way to go.
2009-07-06 09:07:24 -07:00
Ben Pfaff
8f24562aed dpif: Rename odp_msg related functions for more consistency.
This seems like a more consistent naming scheme, since all of these
functions are related but none of them were named similarly or grouped
together.
2009-07-06 09:07:24 -07:00
Ben Pfaff
bb8a9a2b0e dpif: Change dpif_port_group_get() semantics.
This function is easier for callers to use if they do not have to guess
how many ports are in the group.  Since it's not performance critical at
all, introduce these easier semantics.
2009-07-06 09:07:24 -07:00
Ben Pfaff
c228a3649a dpif: Hide the contents of struct dpif.
This helps prepare for multiple dpif implementations, and ensures that
code outside dpif.c does not depend on its internals.
2009-07-06 09:07:23 -07:00
Ben Pfaff
b29ba12809 dpif: Replace dpif_id() by dpif_name().
dpif_id() is often used in error messages, e.g. "dp%u: screwed up".  But
soon we will be generalizing the concept of a datapath, so it is better
to have a function that returns a full name, e.g. "%s: screwed up".
Accordingly, this commit replaces dpif_id() by a new function dpif_name()
that does so.
2009-07-06 09:07:23 -07:00
Ben Pfaff
53a4218dca dpif: New function dpif_get_netflow_ids().
The 'minor' member of struct dpif is used for two different purposes:
for printing in log messages and for encapsulating in NetFlow messages.
The needs in each case are different, so we should break up these uses.
This commit does half of that, by introducing a new function to retrieve
NetFlow ids and using it where appropriate.
2009-07-06 09:07:23 -07:00
Ben Pfaff
335562c0b9 dpif: Rename dpif_get_name() to dpif_port_get_name(), update interface.
With multiple kinds of datapaths, code should not just use
"dp%u" along with dpif_minor() to print a datapath name, because not all
datapaths can sensibly be named that way.  We want to use a function
with a name like dpif_get_name() to retrieve a datapath name for printing
to the user, in which case the existing dpif_get_name() function would be
confusing.  So rename the existing one to something more explicit.
2009-07-06 09:07:23 -07:00
Ben Pfaff
a14bc59fb8 Update primary code license to Apache 2.0. 2009-06-15 15:11:30 -07:00
Ben Pfaff
064af42167 Import from old repository commit 61ef2b42a9c4ba8e1600f15bb0236765edc2ad45. 2009-07-08 13:19:16 -07:00