This patch makes it the caller's responsibility to initialize a
per-thread 'state' object and pass it down to the dpif_flow_dump_next()
implementation. The implementation can expect to be called from multiple
threads with the same 'iter' and different 'state' objects.
When flow_dump_next() returns non-zero, the implementation must ensure
that subsequent calls with the same arguments also return non-zero.
Subsequent calls with the same 'iter' and different 'state' may return
zero, but should make progress towards returning non-zero.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This patch separates the structures for thread-local flow dump state
("state") from the shared flow dump state ("iter") in dpif-linux and
dpif-netdev. Future patches will make use of this to allow multiple
threads to dump flows from the same flow dump operation.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This helps reduce confusion about when a flow is a flow and when it is
just metadata.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Use one callback instead of many, helps in adding new functionality
later on.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Commit da546e0 (dpif: Allow execute to modify the packet.) uninitializes
the "dpif_upcall.packet" of "struct upcall" when dpif_recv() returns error.
The packet ofpbuf is likely uninitialized in this case, hence calling
ofpbuf_uninit() on it will likely cause a SEGFAULT.
This commit fixes this bug by only uninitializing packet's ofpbuf on
successfully received upcalls.
A note warning about this is added on the comment of dpif_recv() in
dpif.c and dpif-provider.h.
Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This allows other libraries to use util.h that has already
defined NOT_REACHED.
Signed-off-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Allowing the packet to be modified by execution allows less data
copying for userspace action execution. Some users of the
dpif_execute already expect that the packet may be modified. This
patch makes this behavior uniform and makes the userspace datapath and
the execution helpers modify the packet as it is being executed.
Userspace action now steals the packet if given permission, as the
packet is normally not needed after it. The only exception is the
sample action, and this is accounted for my keeping track of any
actions that could be following the userspace action.
The packet in dpif_upcall is changed from a pointer to a struct,
allowing the packet to be honest about it's headroom. After this
change the packet can safely be pushed on over the precarious 4 byte
limit earlier allowed by the netlink data preceding the packet.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
With single datapath, multiple userspace bridges share the same datapath.
As such it does not look beneficial that we decide a valid open flow port
number based on the number of ports in the datapath specially now that
we have the ofport_request column in OVSDB.
This commit does not remove ofproto_init_max_ports() interface as defined
in ofproto-provider.h as there may be other implementations that still use it.
But ofproto-dpif should not need it.
Bug #20163.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
If the datapath actions exceed the maximum size of a Netlink attribute
(about 64 kB), then previously we would assert-fail (before commit
542024c4c3 "ofproto-dpif-xlate: Suppress oversize datapath actions.")
or just drop all of them (after that commit). This commit makes OVS cope
by slow-pathing the flow and executing all of its actions in userspace.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, OVS has expected that the datapath supports all the actions
required by any flow to be installed. There are at least two reasons why
a datapath might not support a given action:
- The datapath version is older than the userspace version, and the
action was introduced after the version of the datapath in use.
- The action is not considered important enough to implement as part of
an ABI that must be maintained forever.
This commit adds infrastructure to handle these cases. It doesn't actually
add any uses; that will come in an upcoming commit.
Signed-off-by: Ben Pfaff <blp@nicira.com>
With this commit, whenever the verbosity is enabled with '-m'
option, the ovs-dpctl dump-flows command will display the flows with
in_port field showing the name instead of a port number.
Conversely, one can also use a name in the in_port field with del-flow,
add-flow and mod-flow commands of ovs-dpctl. One should also be able
to use the port name when supplying the datapath flow as an input
to ofproto/trace command.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The declaration of 'get_max_ports()' to return odp_port_t adds
unwanted complexity to coding. This commit changes it back to
return uint32_t type.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
When verbose mode tuned on, all dp flow fields described by the netlink
attributes are displayed, including fully wildcarded attributes.
Otherwise, the fully wildcarded attributes are omitted for brevity.
Added -m option to "ovs-dpctl dump-flows" to enable verbose mode. It is
off by default.
Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com added documentation]
Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit adds annotations for thread safety check. And the
check can be conducted by using -Wthread-safety flag in clang.
Co-authored-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
When debugging the system, it's useful to not just see the key but
also the mask.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
If a flow cannot be installed in the datapath, we should notice
this and not treat it as installed. This becomes an issue with
megaflows, since a batch of unique flows may come in that generate
a single new datapath megaflow that covers them. Since userspace
doesn't know whether the datapath supports megaflows, each unique
flow will get a separate flow entry (which overlap when masks are
applied) and all except the first will get rejected by a megaflow-
supporting datapath as duplicates.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Until now, datapath ports and openflow ports were both represented by
unsigned integers of various sizes. With implicit conversions, etc., it is
easy to mix them up and use one where the other is expected. This commit
creates two typedefs, ofp_port_t and odp_port_t. Both of these two types
are marked by "__attribute__((bitwise))" so that sparse can be used to
detect any misuse.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Added support to allow mega flow specified and displayed. ovs-dpctl tool
is mainly used as debugging tool.
This patch also implements the low level user space routines to send
and receive mega flow netlink messages. Those netlink suppor
routines are required for forthcoming user space mega flow patches.
Added a unit test to test parsing and display of mega flows.
Ethan contributed the ovs-dpctl mega flow output function.
Co-authored-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
The caller wants to know whether 'devname' is attached to 'dpif', and
ENOENT is a legitimate response to that not being the case.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Depending on the port and type of datapath, a port may need to be opened
as a different type of device than it's configured. For example, an
"internal" port on a "dummy" datapath should opened as a "dummy" port.
This commit adds the ability for a dpif to provide this information to a
caller. It will be used in a future commit.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Provide the ability to determine whether a port exists in a datapath
without having to deal with a "dpif_port" structure as with
dpif_port_query_by_name(). A future patch will use this function.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Most of the code referred to datapath ports as 32-bit values, but a few
places still used 16-bit references.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
The ESX userspace looks quite a bit like linux, but has some key
differences which need to be specially handled in the build. To
distinguish between ESX and systems which use the linux datapath
module, this patch adds two new macros "ESX" and "LINUX_DATAPATH".
It uses these macros to disable building code on ESX which only
applies to a true Linux environment. In addition, it adds a new
route-table-stub implementation which is required for the build to
complete successfully on ESX.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Soon the kernel will begin supplying the information about the outer
IP header for tunneled packets and userspace will need to be able to
track it as part of the flow. For the time being this is only used
internally by OVS and not exposed outwards to OpenFlow. As a result,
this threads the information throughout userspace but simply stores
the existing tun_id in it.
Signed-off-by: Jesse Gross <jesse@nicira.com>
The following commit will need to use a value other than a literal
time_msec() in one case. This commit is just preparation.
Factoring the time_msec() call out of the loop in
handle_flow_miss_without_facet() is a really minor optimization. It isn't
the main point here.
Signed-off-by: Ben Pfaff <blp@nicira.com>
The datapath allows requesting a specific port number for a port, but
the dpif interface didn't expose it. This commit adds that support.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Until now, packets for these special protocols have been mixed with general
traffic in the kernel-to-userspace queues. This means that a big-enough
storm of new flows in these queues can cause packets for these special
protocols to be dropped at this interface, fooling userspace into believing
that, say, no CFM packets have been received even though they are arriving
at the expected rate.
This commit moves special protocols to a dedicated kernel-to-userspace
queue to avoid the problem.
Bug #7550.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, crediting statistics to OpenFlow rules due to "resubmit" actions
has required setting up a "resubmit hook" with a callback function and
auxiliary data. This commit makes it easier to do, by adding a member to
struct action_xlate_ctx that specifies statistics to credit to each
resubmitted rule.
This commit includes one small behavioral change as an optimization.
Previously, rule_execute() translated the rule twice: once to get the ODP
actions, then a second time after executing the ODP actions to credit
statistics to the rules. After this commit, rule_execute() translates the
rule only once, crediting statistics as a side effect. The difference only
becomes visible when executing the actions fails: previously the statistics
would not be incremented, after this commit they will be. It is very
unusual for executing actions to fail (generally this indicates a bug) so
I'm not concerned about it.
Signed-off-by: Ben Pfaff <blp@nicira.com>
handle_flow_miss() didn't update subfacet "used" times for packets
processed by userspace. This commit fixes the problem.
Found by inspection. I didn't verify the bug in testing.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Without logging of operation groups, it becomes more difficult to debug
problems related to flow setups, since those go through operation groups.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, a "flow put" has represented its parameters in two different
ways, depending on whether it was coming from dpif_flow_put() or from
dpif_operate(), and similarly for an "execute" operation. This commit
adopts the operation struct consistently within the dpif provider
interface, which seems cleaner.
This commit also factors out logging for flow puts and executes, which
is useful in the following commit.
This doesn't change the dpif client interface, since the two forms are
more convenient for clients than always filling out an operation struct.
Signed-off-by: Ben Pfaff <blp@nicira.com>
I'd like to change ->dpif_flow_put() and ->dpif_execute() in the dpif
provider to take the structures of the same names as parameters, instead of
passing them discrete parameters, because this seems like a more sensible
way to do things internally than to have two different ways to pass the
parameters. It might even simplify code slightly. But ->flow_put() and
->execute() wouldn't want the 'type' (because it's implied by the function
being called) or 'error' (because it would be the same as the return
value). Although of course they could just ignore those members, it seems
slightly cleaner to omit them entirely, as this change allows.
Signed-off-by: Ben Pfaff <blp@nicira.com>
At one point in the past, there were three separate queues between the
kernel module and OVS userspace, each of which corresponded to a Netlink
socket (or, before that, to a character device). It made sense to allow
each of these to be enabled or disabled separately, hence the "listen mask"
concept in the dpif layer.
These days, the concept is much less clear-cut. Queuing is no longer on
the basis of different classes of packets but instead striped across a
collection of sockets based on input port. It doesn't really make sense
to enable receiving packets on the basis of the kind of packet anymore.
Accordingly, this commit simplifies the "listen_mask" to just a bool that
either enables or disables receiving packets.
It could be useful to enable or disable receiving packets on a per-vport
basis, but the rest of the code isn't ready to make use of that so this
commit doesn't generalize this much.
Based on this discussion on ovs-dev:
http://openvswitch.org/pipermail/dev/2011-October/012044.html
Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit switches from using the actual protocol values of error codes
internally in Open vSwitch, to using abstract values that are translated to
and from protocol values at message parsing and serialization time. I
believe that this makes the code easier to read and to write.
This is also one step along the way toward OpenFlow 1.1 support because
OpenFlow 1.1 renumbered a bunch of error codes.
Signed-off-by: Ben Pfaff <blp@nicira.com>
ofp_print_packet() and ofp_packet_to_string() don't use the
'total_len' argument which they require callers to supply.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
The key to getting good performance on the netperf CRR test seems to be to
handle the first packet of each new flow as quickly as possible. Until
now, we've only had one opportunity to do that on each trip through the
main poll loop. One way to improve would be to make that poll loop
circulate more quickly. My experiments show, however, that even just
commenting out the slower parts of the poll loop yield minimal improvement.
This commit takes another approach. Instead of making the poll loop
overall faster, it invokes the performance-critical parts of it more than
once during each poll loop.
My measurements show that this commit improves netperf CRR performance by
24% versus the previous commit, for an overall improvement of 87% versus
the baseline just before the commit that removed the poll_fd_woke(). With
this commit, ovs-benchmark performance has also improved by 13% overall
since that baseline.
The unit tests did not allow users to run them as root because
ovs-vswitchd would destroy all of the existing 'system' datapaths.
This patch prevents ovs-vswitchd from registering 'system'
datapaths when running unit tests preventing the issue.