2
0
mirror of https://github.com/openvswitch/ovs synced 2025-10-29 15:28:56 +00:00
Commit Graph

92 Commits

Author SHA1 Message Date
Alex Wang
1954e6bbcb dpif: Change dpif API to allow multiple handler threads read upcall.
This commit changes the API in 'dpif-provider.h' to allow multiple
handler threads call dpif_recv() simultaneously.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-03-20 10:27:10 -07:00
Joe Stringer
bdeadfdd95 dpif: New function flow_dump_next_may_destroy_keys().
This new function allows callers to determine whether previously
returned keys will be modified or reallocated on the next call to
dpif_flow_dump_next(). This will be used in a future commit to allow
batched flow deletion by revalidator threads.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:39:21 -08:00
Joe Stringer
938eaa50a6 dpif: Don't synchronize flow_dump_next() status.
Recent changes to the flow_dump_next() interface have made it the
responsibility of the dpif implementation to track error status over a
flow dump operation.

This patch removes status tracking from 'struct dpif_flow_dump', allowing
multiple threads to call dpif_flow_dump_next() and track their status
independently. Even if one thread finishes processing flows for a given
iterator and state, it will not prevent other callers from processing
the remaining flows in their buffers.

After this patch, the error code that dpif_flow_dump_next() returns is
only significant for the current state and buffer. As before, the status
of the entire flow dump operation can be obtained by calling
dpif_flow_dump_done().

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:38:58 -08:00
Joe Stringer
d2ad7ef178 dpif: Make dpif_flow_dump_next() thread-safe.
This patch makes it the caller's responsibility to initialize a
per-thread 'state' object and pass it down to the dpif_flow_dump_next()
implementation. The implementation can expect to be called from multiple
threads with the same 'iter' and different 'state' objects.

When flow_dump_next() returns non-zero, the implementation must ensure
that subsequent calls with the same arguments also return non-zero.
Subsequent calls with the same 'iter' and different 'state' may return
zero, but should make progress towards returning non-zero.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:30:25 -08:00
Joe Stringer
e723fd32d5 dpif: Separate local and shared flow dump state.
This patch separates the structures for thread-local flow dump state
("state") from the shared flow dump state ("iter") in dpif-linux and
dpif-netdev. Future patches will make use of this to allow multiple
threads to dump flows from the same flow dump operation.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:27:32 -08:00
Jarno Rajahalme
758c456df5 dpif: Use explicit packet metadata.
This helps reduce confusion about when a flow is a flow and when it is
just metadata.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-12-30 16:52:43 -08:00
Francesco Fusco
1ce3fa06fc dpif-linux: fix the size of n_masks
The command ovs-dpctl can wrongly output the masks even if the
datapath does not implement mega flows. In this case the output
will be similar to the following:

system@ovs-system:
	lookups: hit:14 missed:41 lost:0
	flows: 0
	masks: hit:18446744073709551615 total:4294967295
		hit/pkt:335395346794719104.00
	port 0: ovs-system (internal)
	port 1: gre_system (gre: df_default=false, ttl=0)
	port 2: ots-br0 (internal)
	port 3: int0 (internal)
	port 4: vnet0
	port 5: vnet1

The problem depends on the fact that n_masks stats is stored as a
uint32 in the struct ovs_dp_megaflow_stats and as a uint64 in the
struct dpif_dp_stats. UINT32_MAX instead of UINT64_MAX should be
used to detect if the datapath supports megaflows or not.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-12-17 13:20:28 -08:00
Jarno Rajahalme
da546e0764 dpif: Allow execute to modify the packet.
Allowing the packet to be modified by execution allows less data
copying for userspace action execution.  Some users of the
dpif_execute already expect that the packet may be modified.  This
patch makes this behavior uniform and makes the userspace datapath and
the execution helpers modify the packet as it is being executed.
Userspace action now steals the packet if given permission, as the
packet is normally not needed after it.  The only exception is the
sample action, and this is accounted for my keeping track of any
actions that could be following the userspace action.

The packet in dpif_upcall is changed from a pointer to a struct,
allowing the packet to be honest about it's headroom.  After this
change the packet can safely be pushed on over the precarious 4 byte
limit earlier allowed by the netlink data preceding the packet.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-12-16 08:14:52 -08:00
Ben Pfaff
ee75c5460e dpif: Document datapath masking.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-11-12 17:11:17 -08:00
Gurucharan Shetty
9f5bbb005c ofproto-dpif: Disassociate datapath max_ports with openflow port numbers.
With single datapath, multiple userspace bridges share the same datapath.
As such it does not look beneficial that we decide a valid open flow port
number based on the number of ports in the datapath specially now that
we have the ofport_request column in OVSDB.

This commit does not remove ofproto_init_max_ports() interface as defined
in ofproto-provider.h as there may be other implementations that still use it.
But ofproto-dpif should not need it.

Bug #20163.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-11-06 07:25:06 -08:00
Jarno Rajahalme
a66733a8bc Widen TCP flags handling.
Widen TCP flags handling from 7 bits (uint8_t) to 12 bits (uint16_t).
The kernel interface remains at 8 bits, which makes no functional
difference now, as none of the higher bits is currently of interest
to the userspace.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2013-10-29 09:40:19 -07:00
Andy Zhou
847108dc29 dpif-linux: collect and display mega flow mask stats
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-10-22 10:23:22 -07:00
Ben Pfaff
7fd9102566 dpif: Support working around actions that a datapath does not support.
Until now, OVS has expected that the datapath supports all the actions
required by any flow to be installed.  There are at least two reasons why
a datapath might not support a given action:

    - The datapath version is older than the userspace version, and the
      action was introduced after the version of the datapath in use.

    - The action is not considered important enough to implement as part of
      an ABI that must be maintained forever.

This commit adds infrastructure to handle these cases.  It doesn't actually
add any uses; that will come in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-10-09 17:28:05 -07:00
Alex Wang
1dd16b9a27 dpif: Change get_max_ports() to return uint32_t.
The declaration of 'get_max_ports()' to return odp_port_t adds
unwanted complexity to coding.  This commit changes it back to
return uint32_t type.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-23 11:33:17 -07:00
Ben Pfaff
5703b15fd6 dpif: Make dpifs thread-safe, and document it.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-07-25 10:31:42 -07:00
Alex Wang
4e022ec09e Create specific types for ofp and odp port
Until now, datapath ports and openflow ports were both represented by
unsigned integers of various sizes. With implicit conversions, etc., it is
easy to mix them up and use one where the other is expected.  This commit
creates two typedefs, ofp_port_t and odp_port_t.  Both of these two types
are marked by "__attribute__((bitwise))" so that sparse can be used to
detect any misuse.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-20 10:42:37 -07:00
Andy Zhou
e6cc0babc2 ovs-dpctl: Add mega flow support
Added support to allow mega flow specified and displayed. ovs-dpctl tool
is mainly used as debugging tool.

This patch also implements the low level user space routines to send
and receive mega flow netlink messages. Those netlink suppor
routines are required for forthcoming user space mega flow patches.

Added a unit test to test parsing and display of mega flows.

Ethan contributed the ovs-dpctl mega flow output function.

Co-authored-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-20 10:33:51 -07:00
Ben Pfaff
e995e3df57 Allow OVS_USERSPACE_ATTR_USERDATA to be variable length.
Until now, the optional OVS_USERSPACE_ATTR_USERDATA attribute had to be
exactly 64 bits long, if it was present.  However, 64 bits is not enough
space to associate as much information with a flow as would be convenient
for some userspace features now under development.  This commit generalizes
the attribute, allowing it to be any length.

This generalization is backward-compatible: if userspace only uses 64-bit
attributes, then it will not see any change in behavior.

CC: Romain Lenglet <rlenglet@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2013-02-15 16:48:32 -08:00
Ethan Jackson
c060c4cf83 netdev-vport: Build on all platforms.
This patch removes the final bit of linux specific code which
prevents building netdev-vport everywhere.  With this, other
platforms automatically get access to patch ports, and (if their
datapath supports it), flow based tunneling.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-28 19:09:58 -08:00
Ben Pfaff
ffcb9f6ebc dpif: Document.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-01-09 14:39:53 -08:00
Justin Pettit
0aeaabc8db Add functions to determine how port should be opened based on type.
Depending on the port and type of datapath, a port may need to be opened
as a different type of device than it's configured.  For example, an
"internal" port on a "dummy" datapath should opened as a "dummy" port.
This commit adds the ability for a dpif to provide this information to a
caller.  It will be used in a future commit.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
2012-11-16 12:35:55 -08:00
Justin Pettit
c7a262159e dpif: Add function to get the dpif type.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
2012-11-01 22:54:27 -07:00
Justin Pettit
4afba28d55 dpif: Add new dpif_port_exists() function.
Provide the ability to determine whether a port exists in a datapath
without having to deal with a "dpif_port" structure as with
dpif_port_query_by_name().  A future patch will use this function.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
2012-11-01 22:54:27 -07:00
Justin Pettit
9b56fe137d Always treat datapath ports as 32 bits.
Most of the code referred to datapath ports as 32-bit values, but a few
places still used 16-bit references.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
2012-11-01 22:54:27 -07:00
Justin Pettit
3b68500bde dpif: Fix minor typo in comment.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
2012-11-01 22:54:27 -07:00
Ben Pfaff
a7752d4ab7 dpif: Add 'used' argument to dpif_flow_stats_extract().
The following commit will need to use a value other than a literal
time_msec() in one case.  This commit is just preparation.

Factoring the time_msec() call out of the loop in
handle_flow_miss_without_facet() is a really minor optimization.  It isn't
the main point here.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-08-21 14:13:15 -07:00
Raju Subramanian
e0edde6fee Global replace of Nicira Networks.
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.

Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-02 17:08:02 -07:00
Ben Pfaff
90a7c55e56 dpif: Make caller of dpif_recv() provide buffer space.
This improves performance under heavy flow setup loads.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-04-18 20:28:51 -07:00
Ben Pfaff
b99d3ceeed ofproto-dpif: Batch flow uninstallations due to expiration.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-04-18 20:28:12 -07:00
Ben Pfaff
a39edbd4a4 Add a few 'const's.
These are useful hints, in these cases, that the caller retains ownership
of the passed-in packets.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-02-01 10:16:28 -08:00
Ben Pfaff
c2b565b54e dpif: Factor 'type' and 'error' out of individual dpif_op members.
I'd like to change ->dpif_flow_put() and ->dpif_execute() in the dpif
provider to take the structures of the same names as parameters, instead of
passing them discrete parameters, because this seems like a more sensible
way to do things internally than to have two different ways to pass the
parameters.  It might even simplify code slightly.  But ->flow_put() and
->execute() wouldn't want the 'type' (because it's implied by the function
being called) or 'error' (because it would be the same as the return
value).  Although of course they could just ignore those members, it seems
slightly cleaner to omit them entirely, as this change allows.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-01-16 13:35:21 -08:00
Ben Pfaff
a12b3eadc6 dpif: Simplify the "listen mask" concept.
At one point in the past, there were three separate queues between the
kernel module and OVS userspace, each of which corresponded to a Netlink
socket (or, before that, to a character device).  It made sense to allow
each of these to be enabled or disabled separately, hence the "listen mask"
concept in the dpif layer.

These days, the concept is much less clear-cut.  Queuing is no longer on
the basis of different classes of packets but instead striped across a
collection of sockets based on input port.  It doesn't really make sense
to enable receiving packets on the basis of the kind of packet anymore.
Accordingly, this commit simplifies the "listen_mask" to just a bool that
either enables or disables receiving packets.

It could be useful to enable or disable receiving packets on a per-vport
basis, but the rest of the code isn't ready to make use of that so this
commit doesn't generalize this much.

Based on this discussion on ovs-dev:
http://openvswitch.org/pipermail/dev/2011-October/012044.html

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-01-12 17:09:22 -08:00
Ethan Jackson
579a77e024 tests: Allow unit tests to run as root.
The unit tests did not allow users to run them as root because
ovs-vswitchd would destroy all of the existing 'system' datapaths.
This patch prevents ovs-vswitchd from registering 'system'
datapaths when running unit tests preventing the issue.
2011-11-18 13:48:57 -08:00
Ben Pfaff
7257b535ab Implement new fragment handling policy.
Until now, OVS has handled IP fragments more awkwardly than necessary.  It
has not been possible to match on L4 headers, even in fragments with offset
0 where they are actually present.  This means that there was no way to
implement ACLs that treat, say, different TCP ports differently, on
fragmented traffic; instead, all decisions for fragment forwarding had to
be made on the basis of L2 and L3 headers alone.

This commit improves the situation significantly.  It is still not possible
to match on L4 headers in fragments with nonzero offset, because that
information is simply not present in such fragments, but this commit adds
the ability to match on L4 headers for fragments with zero offset.  This
means that it becomes possible to implement ACLs that drop such "first
fragments" on the basis of L4 headers.  In practice, that effectively
blocks even fragmented traffic on an L4 basis, because the receiving IP
stack cannot reassemble a full packet when the first fragment is missing.

This commit works by adding a new "fragment type" to the kernel flow match
and making it available through OpenFlow as a new NXM field named
NXM_NX_IP_FRAG.  Because OpenFlow 1.0 explicitly says that the L4 fields
are always 0 for IP fragments, it adds a new OpenFlow fragment handling
mode that fills in the L4 fields for "first fragments".  It also enhances
ovs-ofctl to allow users to configure this new fragment handling mode and
to parse the new field.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Bug #7557.
2011-10-21 15:07:36 -07:00
Ben Pfaff
6bc6002490 dpif: New function dpif_operate() and dpif-linux implementation.
This will be used in an upcoming commit.
2011-10-14 14:08:44 -07:00
Ben Pfaff
077257b83c datapath-protocol: Rename to <linux/openvswitch.h>.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7559.
2011-10-12 16:27:09 -07:00
Ben Pfaff
98403001ec datapath: Move Netlink PID for userspace actions from flows to actions.
Commit b063d9f06 "datapath: Use unicast Netlink sockets for upcalls" that
switched from multicast to unicast Netlink for sending upcalls added a
Netlink PID to each kernel flow, used by OVS_ACTION_ATTR_USERSPACE actions
within the flow as target.

This commit drops this per-flow PID in favor of a per-action PID, because
that is more flexible.  It does not yet make use of this additional
flexibility, so behavior should not change.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7559.
2011-10-12 16:27:00 -07:00
Ben Pfaff
a8d9304d12 dpif: Avoid use of "struct ovs_dp_stats" in platform-independent modules.
Over time we wish to reduce the number of datapath-protocol.h definitions
used directly outside of Linux-specific code.  This commit removes use of
"struct ovs_dp_stats" from platform-independent code.

Bug #7559.
2011-10-05 11:18:13 -07:00
Ben Pfaff
572b70687b flow: Move flow_extract_stats() to dpif.c, as dpif_flow_stats_extract().
The "flow" module is concerned only with OpenFlow flows these days.  It
shouldn't have anything to do with ODP or dpifs.  However, it included
dpif.h just to implement flow_extract_stats().  This function is a better
fit for dpif.c, so this commit moves it there and removes the dpif.h
#include from flow.h and flow.c

This commit also removes a few more dpif.h #includes that weren't needed.
2011-09-30 08:58:10 -07:00
Pravin Shelar
6ff686f2bc sFlow: Genericize/simplify kernel sFlow implementation
Following patch adds sampling action which takes probability and set
of actions as arguments. When probability is hit, actions are executed for
given packet.
USERSPACE action's userdata (u64) is used to store struct
user_action_cookie as cookie. CONTROLLER action is fixed accordingly.

Now we can remove sFlow code from kernel and implement sFlow generically
as SAMPLE action. sFlow is defined as SAMPLE Action with probability (sFlow
sampling rate) and USERSPACE action as argument. USERSPACE action's data
is used as cookie. sFlow uses this cookie to store output-port, number of
output ports and vlan-id. sample-pool is calculated by using vport
stats.

Signed-off-by: Pravin Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2011-09-28 10:43:07 -07:00
Pravin Shelar
f613a0d72c datapath: Always use generic stats for devices (vports)
Currently ovs is using device stats for Linux devices and count them
itself in other situations. This leads to overlap with hardware stats,
inconsistencies, etc. It's much better to just always count the packets
flowing through the switch and let userspace do any merging that it wants.

Following patch removes vport->get_stats() interface. vport-stat is changed
to use new `struct ovs_vport_stat` rather than rtnl_link_stats64.
Definitions of rtnl_link_stats64 is removed from OVS.  dipf_port->stat is also
removed as aggregate stats are only available at netdev layer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-15 19:36:17 -07:00
Justin Pettit
df2c07f433 datapath: Use "OVS_*" as opposed to "ODP_*" for user<->kernel interactions.
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree.  This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions.  The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.

Feature #6904

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-08-19 22:48:23 -07:00
pravin shelar
b85d8d61a6 Datapath action should not refer to controller
ODP_ACTION_ATTR_CONTROLLER in the kernel actually sends packets to
userspace, not the controller. To make it generic rename this action
to ODP_ACTION_ATTR_USERSPACE.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2011-07-28 09:05:25 -07:00
Ben Pfaff
01545c1a3c dpif: Improve logging of upcalls.
The kernel now provides the entire flow key for a packet sent up to
userspace, but dpif_recv() would only log the in_port.  This change makes
userspace log the entire flow key.

This would have made a bug that I recently looked at a bit easier to
investigate.
2011-06-09 09:21:24 -07:00
Ben Pfaff
80e5eed9c2 datapath: Get packet metadata from userspace in odp_packet_cmd_execute().
Until now, the tun_id and in_port have been lost when a packet is sent from
the kernel to userspace and then back to the kernel.  I didn't think that
this was a problem, but recent behavior made me look closer and see that
it makes a difference if sFlow is turned on or if an
ODP_ATTR_ACTION_CONTROLLER action is present.  We could possibly kluge
around those, but for future-proofing it seems better to pass the packet
metadata from userspace to the kernel.  That is what this commit does.

This commit introduces a user-kernel protocol break.  We could avoid that,
if it is desirable, by making ODP_PACKET_ATTR_KEY optional for
ODP_PACKET_CMD_EXECUTE commands.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-06-01 13:39:51 -07:00
Ben Pfaff
0079481775 Merge 'master' into 'next'. 2011-05-12 12:05:42 -07:00
Ben Pfaff
f79e673f3d ofproto: Complete abstraction by adding enumeration and deletion functions.
This eliminates the final reference from bridge.c directly into the dpif
layer, which will make it easier to change the implementation of ofproto
to support other lower layers.
2011-05-11 12:26:08 -07:00
Ben Pfaff
640e1b2077 dpif: Improve abstraction by making 'run' and 'wait' functions per-dpif.
Until now, the dp_run() and dp_wait() functions had to be called at the top
level of the program because they applied to every open dpif.  By replacing
them by functions that take a specific dpif as an argument, we can call
them only from ofproto, which is currently the correct layer to deal with
dpifs.
2011-05-11 12:26:07 -07:00
Ben Pfaff
3a225db707 dpif: New function dpif_normalize_type().
This allows dpif types to be compared.
2011-05-04 10:20:06 -07:00
Ben Pfaff
032aa6a354 ovs-dpctl: Add -s option to print packet and byte counters. 2011-05-02 09:33:12 -07:00