2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-29 13:27:59 +00:00

247 Commits

Author SHA1 Message Date
Raju Subramanian
e0edde6fee Global replace of Nicira Networks.
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.

Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-02 17:08:02 -07:00
Ben Pfaff
112bc5f46f ofproto-dpif: Make it easier to credit statistics for resubmits.
Until now, crediting statistics to OpenFlow rules due to "resubmit" actions
has required setting up a "resubmit hook" with a callback function and
auxiliary data.  This commit makes it easier to do, by adding a member to
struct action_xlate_ctx that specifies statistics to credit to each
resubmitted rule.

This commit includes one small behavioral change as an optimization.
Previously, rule_execute() translated the rule twice: once to get the ODP
actions, then a second time after executing the ODP actions to credit
statistics to the rules.  After this commit, rule_execute() translates the
rule only once, crediting statistics as a side effect.  The difference only
becomes visible when executing the actions fails: previously the statistics
would not be incremented, after this commit they will be.  It is very
unusual for executing actions to fail (generally this indicates a bug) so
I'm not concerned about it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-04-18 20:37:57 -07:00
Ben Pfaff
90a7c55e56 dpif: Make caller of dpif_recv() provide buffer space.
This improves performance under heavy flow setup loads.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-04-18 20:28:51 -07:00
Ben Pfaff
7393104d1a dpif: Include TCP flags in "ovs-dpctl dump-flows" output.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-04-18 20:28:40 -07:00
Ben Pfaff
b99d3ceeed ofproto-dpif: Batch flow uninstallations due to expiration.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-04-18 20:28:12 -07:00
Ben Pfaff
8017806940 ofproto-dpif: Keep subfacet "used" times more up-to-date.
handle_flow_miss() didn't update subfacet "used" times for packets
processed by userspace.  This commit fixes the problem.

Found by inspection.  I didn't verify the bug in testing.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-04-18 20:28:09 -07:00
Ben Pfaff
12113c394a packets: New function packet_get_tcp_flags(), factored out of dpif.
This will acquire a new user in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-02-15 10:18:10 -08:00
Ben Pfaff
a39edbd4a4 Add a few 'const's.
These are useful hints, in these cases, that the caller retains ownership
of the passed-in packets.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-02-01 10:16:28 -08:00
Ben Pfaff
f23d28456f dpif: Log each operation in dpif_operate().
Without logging of operation groups, it becomes more difficult to debug
problems related to flow setups, since those go through operation groups.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-01-16 13:37:27 -08:00
Ben Pfaff
89625d1efb dpif: Change provider interface to consistently use operation structs.
Until now, a "flow put" has represented its parameters in two different
ways, depending on whether it was coming from dpif_flow_put() or from
dpif_operate(), and similarly for an "execute" operation.  This commit
adopts the operation struct consistently within the dpif provider
interface, which seems cleaner.

This commit also factors out logging for flow puts and executes, which
is useful in the following commit.

This doesn't change the dpif client interface, since the two forms are
more convenient for clients than always filling out an operation struct.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-01-16 13:37:27 -08:00
Ben Pfaff
c2b565b54e dpif: Factor 'type' and 'error' out of individual dpif_op members.
I'd like to change ->dpif_flow_put() and ->dpif_execute() in the dpif
provider to take the structures of the same names as parameters, instead of
passing them discrete parameters, because this seems like a more sensible
way to do things internally than to have two different ways to pass the
parameters.  It might even simplify code slightly.  But ->flow_put() and
->execute() wouldn't want the 'type' (because it's implied by the function
being called) or 'error' (because it would be the same as the return
value).  Although of course they could just ignore those members, it seems
slightly cleaner to omit them entirely, as this change allows.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-01-16 13:35:21 -08:00
Ben Pfaff
a12b3eadc6 dpif: Simplify the "listen mask" concept.
At one point in the past, there were three separate queues between the
kernel module and OVS userspace, each of which corresponded to a Netlink
socket (or, before that, to a character device).  It made sense to allow
each of these to be enabled or disabled separately, hence the "listen mask"
concept in the dpif layer.

These days, the concept is much less clear-cut.  Queuing is no longer on
the basis of different classes of packets but instead striped across a
collection of sockets based on input port.  It doesn't really make sense
to enable receiving packets on the basis of the kind of packet anymore.
Accordingly, this commit simplifies the "listen_mask" to just a bool that
either enables or disables receiving packets.

It could be useful to enable or disable receiving packets on a per-vport
basis, but the rest of the code isn't ready to make use of that so this
commit doesn't generalize this much.

Based on this discussion on ovs-dev:
http://openvswitch.org/pipermail/dev/2011-October/012044.html

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-01-12 17:09:22 -08:00
Ben Pfaff
90bf1e0732 Better abstract OpenFlow error codes.
This commit switches from using the actual protocol values of error codes
internally in Open vSwitch, to using abstract values that are translated to
and from protocol values at message parsing and serialization time.  I
believe that this makes the code easier to read and to write.

This is also one step along the way toward OpenFlow 1.1 support because
OpenFlow 1.1 renumbered a bunch of error codes.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-01-12 15:54:25 -08:00
Ethan Jackson
c499c75db6 ofp-print: Remove vestigial 'total_len' argument.
ofp_print_packet() and ofp_packet_to_string() don't use the
'total_len' argument which they require callers to supply.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-01-10 14:29:17 -08:00
Ben Pfaff
5fcc0d004a ofproto: Add "fast path".
The key to getting good performance on the netperf CRR test seems to be to
handle the first packet of each new flow as quickly as possible.  Until
now, we've only had one opportunity to do that on each trip through the
main poll loop.  One way to improve would be to make that poll loop
circulate more quickly.  My experiments show, however, that even just
commenting out the slower parts of the poll loop yield minimal improvement.

This commit takes another approach.  Instead of making the poll loop
overall faster, it invokes the performance-critical parts of it more than
once during each poll loop.

My measurements show that this commit improves netperf CRR performance by
24% versus the previous commit, for an overall improvement of 87% versus
the baseline just before the commit that removed the poll_fd_woke().  With
this commit, ovs-benchmark performance has also improved by 13% overall
since that baseline.
2011-11-28 10:35:15 -08:00
Ethan Jackson
579a77e024 tests: Allow unit tests to run as root.
The unit tests did not allow users to run them as root because
ovs-vswitchd would destroy all of the existing 'system' datapaths.
This patch prevents ovs-vswitchd from registering 'system'
datapaths when running unit tests preventing the issue.
2011-11-18 13:48:57 -08:00
Pravin B Shelar
abff858b5a datapath: Convert kernel priority actions into match/set.
Following patch adds skb-priority to flow key. So userspace will know
what was priority when packet arrived and we can remove the pop/reset
priority action. It's no longer necessary to have a special action for
pop that is based on the kernel remembering original skb->priority.
Userspace can just emit a set priority action with the original value.

Since the priority field is a match field with just a normal set action,
we can convert it into the new model for actions that are based on
matches.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>

Bug #7715
2011-11-01 10:13:16 -07:00
Ben Pfaff
7257b535ab Implement new fragment handling policy.
Until now, OVS has handled IP fragments more awkwardly than necessary.  It
has not been possible to match on L4 headers, even in fragments with offset
0 where they are actually present.  This means that there was no way to
implement ACLs that treat, say, different TCP ports differently, on
fragmented traffic; instead, all decisions for fragment forwarding had to
be made on the basis of L2 and L3 headers alone.

This commit improves the situation significantly.  It is still not possible
to match on L4 headers in fragments with nonzero offset, because that
information is simply not present in such fragments, but this commit adds
the ability to match on L4 headers for fragments with zero offset.  This
means that it becomes possible to implement ACLs that drop such "first
fragments" on the basis of L4 headers.  In practice, that effectively
blocks even fragmented traffic on an L4 basis, because the receiving IP
stack cannot reassemble a full packet when the first fragment is missing.

This commit works by adding a new "fragment type" to the kernel flow match
and making it available through OpenFlow as a new NXM field named
NXM_NX_IP_FRAG.  Because OpenFlow 1.0 explicitly says that the L4 fields
are always 0 for IP fragments, it adds a new OpenFlow fragment handling
mode that fills in the L4 fields for "first fragments".  It also enhances
ovs-ofctl to allow users to configure this new fragment handling mode and
to parse the new field.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Bug #7557.
2011-10-21 15:07:36 -07:00
Ben Pfaff
6bc6002490 dpif: New function dpif_operate() and dpif-linux implementation.
This will be used in an upcoming commit.
2011-10-14 14:08:44 -07:00
Ben Pfaff
98403001ec datapath: Move Netlink PID for userspace actions from flows to actions.
Commit b063d9f06 "datapath: Use unicast Netlink sockets for upcalls" that
switched from multicast to unicast Netlink for sending upcalls added a
Netlink PID to each kernel flow, used by OVS_ACTION_ATTR_USERSPACE actions
within the flow as target.

This commit drops this per-flow PID in favor of a per-action PID, because
that is more flexible.  It does not yet make use of this additional
flexibility, so behavior should not change.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7559.
2011-10-12 16:27:00 -07:00
Ben Pfaff
a8d9304d12 dpif: Avoid use of "struct ovs_dp_stats" in platform-independent modules.
Over time we wish to reduce the number of datapath-protocol.h definitions
used directly outside of Linux-specific code.  This commit removes use of
"struct ovs_dp_stats" from platform-independent code.

Bug #7559.
2011-10-05 11:18:13 -07:00
Ben Pfaff
572b70687b flow: Move flow_extract_stats() to dpif.c, as dpif_flow_stats_extract().
The "flow" module is concerned only with OpenFlow flows these days.  It
shouldn't have anything to do with ODP or dpifs.  However, it included
dpif.h just to implement flow_extract_stats().  This function is a better
fit for dpif.c, so this commit moves it there and removes the dpif.h
#include from flow.h and flow.c

This commit also removes a few more dpif.h #includes that weren't needed.
2011-09-30 08:58:10 -07:00
Pravin Shelar
6ff686f2bc sFlow: Genericize/simplify kernel sFlow implementation
Following patch adds sampling action which takes probability and set
of actions as arguments. When probability is hit, actions are executed for
given packet.
USERSPACE action's userdata (u64) is used to store struct
user_action_cookie as cookie. CONTROLLER action is fixed accordingly.

Now we can remove sFlow code from kernel and implement sFlow generically
as SAMPLE action. sFlow is defined as SAMPLE Action with probability (sFlow
sampling rate) and USERSPACE action as argument. USERSPACE action's data
is used as cookie. sFlow uses this cookie to store output-port, number of
output ports and vlan-id. sample-pool is calculated by using vport
stats.

Signed-off-by: Pravin Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2011-09-28 10:43:07 -07:00
Pravin Shelar
f613a0d72c datapath: Always use generic stats for devices (vports)
Currently ovs is using device stats for Linux devices and count them
itself in other situations. This leads to overlap with hardware stats,
inconsistencies, etc. It's much better to just always count the packets
flowing through the switch and let userspace do any merging that it wants.

Following patch removes vport->get_stats() interface. vport-stat is changed
to use new `struct ovs_vport_stat` rather than rtnl_link_stats64.
Definitions of rtnl_link_stats64 is removed from OVS.  dipf_port->stat is also
removed as aggregate stats are only available at netdev layer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-15 19:36:17 -07:00
Justin Pettit
df2c07f433 datapath: Use "OVS_*" as opposed to "ODP_*" for user<->kernel interactions.
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree.  This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions.  The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.

Feature #6904

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-08-19 22:48:23 -07:00
Ben Pfaff
01545c1a3c dpif: Improve logging of upcalls.
The kernel now provides the entire flow key for a packet sent up to
userspace, but dpif_recv() would only log the in_port.  This change makes
userspace log the entire flow key.

This would have made a bug that I recently looked at a bit easier to
investigate.
2011-06-09 09:21:24 -07:00
Ben Pfaff
80e5eed9c2 datapath: Get packet metadata from userspace in odp_packet_cmd_execute().
Until now, the tun_id and in_port have been lost when a packet is sent from
the kernel to userspace and then back to the kernel.  I didn't think that
this was a problem, but recent behavior made me look closer and see that
it makes a difference if sFlow is turned on or if an
ODP_ATTR_ACTION_CONTROLLER action is present.  We could possibly kluge
around those, but for future-proofing it seems better to pass the packet
metadata from userspace to the kernel.  That is what this commit does.

This commit introduces a user-kernel protocol break.  We could avoid that,
if it is desirable, by making ODP_PACKET_ATTR_KEY optional for
ODP_PACKET_CMD_EXECUTE commands.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-06-01 13:39:51 -07:00
Ben Pfaff
0079481775 Merge 'master' into 'next'. 2011-05-12 12:05:42 -07:00
Ben Pfaff
54ed8a5d74 dpif: Make dp_parse_name() normalize its returned type.
This means that callers don't have to be concerned with a NULL return value
or unnormalized type.
2011-05-11 12:26:07 -07:00
Ben Pfaff
640e1b2077 dpif: Improve abstraction by making 'run' and 'wait' functions per-dpif.
Until now, the dp_run() and dp_wait() functions had to be called at the top
level of the program because they applied to every open dpif.  By replacing
them by functions that take a specific dpif as an argument, we can call
them only from ofproto, which is currently the correct layer to deal with
dpifs.
2011-05-11 12:26:07 -07:00
Ben Pfaff
d647f0a744 dpif: Better log unusual errors in dpif_port_query_by_name().
Logging these unusual errors at a low level means that we can remove a
bit of higher-level code from ofproto.

The ofproto change also changes behavior for these error cases, from doing
nothing to removing the port, but I think that's OK.  I've never noticed
this log message.
2011-05-04 10:27:21 -07:00
Ben Pfaff
3a225db707 dpif: New function dpif_normalize_type().
This allows dpif types to be compared.
2011-05-04 10:20:06 -07:00
Ben Pfaff
032aa6a354 ovs-dpctl: Add -s option to print packet and byte counters. 2011-05-02 09:33:12 -07:00
Ben Pfaff
d0c23a1a57 dpif: Use sset instead of svec in dpif interface. 2011-03-31 16:42:01 -07:00
Ben Pfaff
7aec165dbc datapath: s/ODPAT_/ODP_ACTION_ATTR_/ to fit new naming scheme.
Jesse suggested this naming scheme, so I'm adjusting existing names to
fit it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-28 15:34:40 -08:00
Ben Pfaff
3d8c95357f dpif: Remove dpif_get_all_names().
None of the remaining dpif implementations have more than one name per
dpif, so there's no need for this function anymore.

Suggested-by: Jesse Gross <jesse@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-28 15:34:40 -08:00
Ben Pfaff
82272eded1 Eliminate ODPL_* from userspace-facing interface.
Reviewed by Justin Pettit.
2011-01-27 21:08:41 -08:00
Ben Pfaff
693c4a0112 datapath: Eliminate 'flags' member from odp_flow.
Nothing was productively using the 'flags' member of odp_flow, so this
commit removes it.

ODPFF_ZERO_TCP_FLAGS isn't used at all (as of the previous commit).

ODPFF_EOF has been replaced by a special case of the 'key_len' member.
This will go away, too, once AF_NETLINK starts being used.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:39 -08:00
Ben Pfaff
ba25b8f41f dpif: Eliminate ODPPF_* constants from client-visible interface.
Following this commit, the ODPPF_* constants are only used in
Linux-specific parts of OVS userspace code.  This allows the actual Linux
datapath interface to evolve more freely.

Reviewed by Justin Pettit.
2011-01-27 21:08:39 -08:00
Ben Pfaff
c97fb13280 dpif: Eliminate "struct odp_flow_stats" from client-visible interface.
Following this commit, "struct odp_flow_stats" is only used in
Linux-specific parts of OVS userspace code.  This allows the actual Linux
datapath interface to evolve more freely.

Reviewed by Justin Pettit.
2011-01-27 21:08:38 -08:00
Ben Pfaff
feebdea2e5 dpif: Eliminate "struct odp_flow" from client-visible interface.
Following this commit, "struct odp_flow" and related data structures are
only used in Linux-specific parts of OVS userspace code.  This allows the
actual Linux datapath interface to evolve more freely.

Reviewed by Justin Pettit.
2011-01-27 21:08:38 -08:00
Ben Pfaff
bc4a05c639 datapath: Change ODP_FLOW_GET to retrieve only a single flow at a time.
This brings the code closer to what the Netlink interface will need to
implement.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:38 -08:00
Ben Pfaff
996c1b3d7a datapath: Drop port information from odp_stats.
As with n_flows, n_ports was used regularly by userspace to determine how
much memory to allocate when listing ports, but it is no longer needed for
that.  max_ports, on the other hand, is necessary but it is also a fixed
value for the kernel datapath right now and if we expand it we can also
come up with a way to report the expanded value.

The remaining members of odp_stats are actually real statistics that I
intend to keep.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:38 -08:00
Ben Pfaff
1ba530f4b2 datapath: Drop queue information from odp_stats.
This queue information will be available through the kernel socket layer
once we move over to Netlink socket as transports, so we might as well get
rid of the redundancy.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:38 -08:00
Ben Pfaff
4c738a8da5 dpif: Eliminate "struct odp_port" from client-visible interface.
Following this commit, "struct odp_port" is only used in Linux-specific
parts of OVS userspace code.  This allows the actual Linux datapath
interface to evolve more freely.

Reviewed by Justin Pettit.
2011-01-27 21:08:37 -08:00
Ben Pfaff
b0ec0f279e datapath: Change listing ports to use an iterator concept.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software.  In
turn, that means that the odp_port structure must become variable-length.
This does not, however, fit in well with the ODP_PORT_LIST ioctl in its
current form, because that would require userspace to know how much space
to allocate for each port in advance, or to allocate as much space as
could possibly be needed.  Neither choice is very attractive.

This commit prepares for a different solution, by replacing ODP_PORT_LIST
by a new ioctl ODP_VPORT_DUMP that retrieves information about a single
vport from the datapath on each call.  It is much cleaner to allocate the
maximum amount of space for a single vport than to do so for possibly a
large number of vports.

It would be faster to retrieve a number of vports in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.

The Netlink version won't need to take the starting port number from
userspace, since Netlink sockets can keep track of that state as part
of their "dump" feature.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
856081f683 datapath: Report kernel's flow key when passing packets up to userspace.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.

This commit takes one step in that direction by making the kernel report
its idea of the flow that a packet belongs to whenever it passes a packet
up to userspace.  This means that userspace can intelligently figure out
what to do:

   - If userspace's notion of the flow for the packet matches the kernel's,
     then nothing special is necessary.

   - If the kernel has a more specific notion for the flow than userspace,
     for example if the kernel decoded IPv6 headers but userspace stopped
     at the Ethernet type (because it does not understand IPv6), then again
     nothing special is necessary: userspace can still set up the flow in
     the usual way.

   - If userspace has a more specific notion for the flow than the kernel,
     for example if userspace decoded an IPv6 header but the kernel
     stopped at the Ethernet type, then userspace can forward the packet
     manually, without setting up a flow in the kernel.  (This case is
     bad from a performance point of view, but at least it is correct.)

This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, although userspace does now
have enough information to do that intelligently.  This will have to wait
for later commits.

This commit is bigger than it would otherwise be because it is rolled
together with changing "struct odp_msg" to a sequence of Netlink
attributes.  The alternative, to do each of those changes in a separate
patch, seemed like overkill because it meant that either we would have to
introduce and then kill off Netlink attributes for in_port and tun_id, if
Netlink conversion went first, or shove yet another variable-length header
into the stuff already after odp_msg, if adding the flow key to odp_msg
went first.

This commit will slow down performance of checksumming packets sent up to
userspace.  I'm not entirely pleased with how I did it.  I considered a
couple of alternatives, but none of them seemed that much better.
Suggestions welcome.  Not changing anything wasn't an option,
unfortunately.  At any rate some slowdown will become unavoidable when OVS
actually starts using Netlink instead of just Netlink framing.

(Actually, I thought of one option where we could avoid that: make
userspace do the checksum instead, by passing csum_start and csum_offset as
part of what goes to userspace.  But that's not perfect either.)

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
36956a7d33 datapath: Convert odp_flow_key to use Netlink attributes instead.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length.  This
commit makes that change using Netlink attribute sequences.

This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, because userspace doesn't yet
have enough information to do that intelligently.  Upcoming commits will
fix that.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:35 -08:00
Ben Pfaff
704a1e09e9 datapath: Change listing flows to use an iterator concept.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length.  This does
not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form,
because that would require userspace to know how much space to allocate
for each flow's key in advance, or to allocate as much space as could
possibly be needed.  Neither choice is very attractive.

This commit prepares for a different solution, by replacing ODP_FLOW_LIST
by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath
on each call.  It is much cleaner to allocate the maximum amount of space
for a single flow key than to do so for possibly a very large number of
flow keys.

As a side effect, this patch also fixes a race condition that sometimes
made "ovs-dpctl dump-flows" print an error: previously, flows were listed
and then their actions were retrieved, which left a window in which
ovs-vswitchd could delete the flow.  Now dumping a flow and its actions is
a single step, closing that window.

Dumping all of the flows in a datapath is no longer an atomic step, so now
it is possible to miss some flows or see a single flow twice during
iteration, if the flow table is modified by another process.  It doesn't
look like this should be a problem for ovs-vswitchd.

It would be faster to retrieve a number of flows in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:35 -08:00
Jesse Gross
cf22f8cba3 vswitchd: Consistently use size_t for action lengths.
Currently the type of the datapath action length is mixture of
size_t and unsigned int.  However, size_t is really defined as an
unsigned long, which causes the build to fail on 64-bit platforms.
This consistently uses size_t.
2010-12-13 11:07:15 -08:00