This commit also adds several #include directives in source files in
order to make the 'ofp-util.h' move possible
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Most vlog calls are for the log module owned by the translation unit being
compiled, but this module was referenced indirectly through a pointer
variable. That seems silly, so this commit changes the code so that the
local vlog module is referred to directly, as &this_module.
We could get rid of the global variables for vlog modules entirely, but
I like getting linker errors when there's a duplicate module name.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
Currently, all of the PMD netdevs can only have the same number of
rx queues, which is specified in other_config:n-dpdk-rxqs.
Fix that by introducing of new option for PMD interfaces: 'n_rxq', which
specifies the maximum number of rx queues to be created for this
interface.
Example:
ovs-vsctl set Interface dpdk0 options:n_rxq=8
Old 'other_config:n-dpdk-rxqs' deleted.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Note that because there's been no prerequisite on the outer protocol,
we cannot add it now. Instead, treat the ipv4 and ipv6 dst fields in the way
that either both are null, or at most one of them is non-null.
[cascardo: abstract testing either dst with flow_tnl_dst_is_set]
cascardo: using IPv4-mapped address is an exercise for the future, since this
would require special handling of MFF_TUN_SRC and MFF_TUN_DST and OpenFlow
messages.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Co-authored-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Since we don't distinguish between IPv4 and IPv6 lookups, consolidate ARP
and ND cache into neighbor cache. Other references to ARP related to the
ARP cache but that are not really about ARP have been renamed as well.
tnl_arp_lookup is kept for lookups using IPv4 instead of IPv4-mapped
addresses, but that is going to be removed in a later patch.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
User space now may receive re-assembled IP fragments. The user space
netlink handler can now accept packets with the new OVS_PACKET_ATTR_MRU
attribute. This allows the kernel to assemble fragmented packets for the
duration of OpenFlow processing, then re-fragment at output time. Most
notably this occurs for packets that are sent through the connection
tracker.
Note that the MRU attribute is not exported at the OpenFlow layer. As
such, if packets are reassembled by conntrack and subsequently sent to
the controller, then OVS has no way to re-serialize the packets to their
original size.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch adds a new action and fields to OVS that allow connection
tracking to be performed. This support works in conjunction with the
Linux kernel support merged into the Linux-4.3 development cycle.
Packets have two possible states with respect to connection tracking:
Untracked packets have not previously passed through the connection
tracker, while tracked packets have previously been through the
connection tracker. For OpenFlow pipeline processing, untracked packets
can become tracked, and they will remain tracked until the end of the
pipeline. Tracked packets cannot become untracked.
Connections can be unknown, uncommitted, or committed. Packets which are
untracked have unknown connection state. To know the connection state,
the packet must become tracked. Uncommitted connections have no
connection state stored about them, so it is only possible for the
connection tracker to identify whether they are a new connection or
whether they are invalid. Committed connections have connection state
stored beyond the lifetime of the packet, which allows later packets in
the same connection to be identified as part of the same established
connection, or related to an existing connection - for instance ICMP
error responses.
The new 'ct' action transitions the packet from "untracked" to
"tracked" by sending this flow through the connection tracker.
The following parameters are supported initally:
- "commit": When commit is executed, the connection moves from
uncommitted state to committed state. This signals that information
about the connection should be stored beyond the lifetime of the
packet within the pipeline. This allows future packets in the same
connection to be recognized as part of the same "established" (est)
connection, as well as identifying packets in the reply (rpl)
direction, or packets related to an existing connection (rel).
- "zone=[u16|NXM]": Perform connection tracking in the zone specified.
Each zone is an independent connection tracking context. When the
"commit" parameter is used, the connection will only be committed in
the specified zone, and not in other zones. This is 0 by default.
- "table=NUMBER": Fork pipeline processing in two. The original instance
of the packet will continue processing the current actions list as an
untracked packet. An additional instance of the packet will be sent to
the connection tracker, which will be re-injected into the OpenFlow
pipeline to resume processing in the specified table, with the
ct_state and other ct match fields set. If the table is not specified,
then the packet is submitted to the connection tracker, but the
pipeline does not fork and the ct match fields are not populated. It
is strongly recommended to specify a table later than the current
table to prevent loops.
When the "table" option is used, the packet that continues processing in
the specified table will have the ct_state populated. The ct_state may
have any of the following flags set:
- Tracked (trk): Connection tracking has occurred.
- Reply (rpl): The flow is in the reply direction.
- Invalid (inv): The connection tracker couldn't identify the connection.
- New (new): This is the beginning of a new connection.
- Established (est): This is part of an already existing connection.
- Related (rel): This connection is related to an existing connection.
For more information, consult the ovs-ofctl(8) man pages.
Below is a simple example flow table to allow outbound TCP traffic from
port 1 and drop traffic from port 2 that was not initiated by port 1:
table=0,priority=1,action=drop
table=0,arp,action=normal
table=0,in_port=1,tcp,ct_state=-trk,action=ct(commit,zone=9),2
table=0,in_port=2,tcp,ct_state=-trk,action=ct(zone=9,table=1)
table=1,in_port=2,ct_state=+trk+est,tcp,action=1
table=1,in_port=2,ct_state=+trk+new,tcp,action=drop
Based on original design by Justin Pettit, contributions from Thomas
Graf and Daniele Di Proietto.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
When dpdk configuration changes, all pmd threads are recreated
and rx queues of each port are reloaded. After this process,
rx queue could be mapped to a different pmd thread other than
the one before reconfiguration. However, this is totally
transparent to ofproto layer modules. So, if the ofproto-dpif-upcall
module still holds ukeys generated before pmd thread recreation,
this old ukey will collide with the ukey for the new upcalls
from same traffic flow, causing flow installation failure.
To fix the bug, this commit adds a new call-back function
in dpif layer for notifying upper layer the purging of datapath
(e.g. pmd thread deletion in dpif-netdev). So, the
ofproto-dpif-upcall module can react properly with deleting
the ukeys and with collecting flows' last stats.
Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Alex Wang <ee07b291@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
When --enable-dummy=system or --enable-dummy=override is in use, dpifs
other than "dummy" are actually dummy dpifs, so use a more reliable test.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
This function doesn't need to be exported in the public OVS headers, and
it had an inconsistent name compared to uuid_equals(). Rename and move.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch sacrifices a few bits of hash quality from the 128-bit unique
flow identifiers to make the UFIDs RFC4122-conformant as per the version 4
(random) UUID spec. Given that the 128-bit space is already quite large,
this should not affect the spread of UFIDs in any meaningful way for
hashing.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
DPDK lcore_id is unsigned. We need to support big values like
LCORE_ID_ANY (=UINT32_MAX). Therefore I am changing the type everywhere
in OVS.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
This init function is called when the dpif class is registered. It will
be used by following commits
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
ofpbuf was complicated due to its wide usage across all
layers of OVS, Now we have introduced independent dp_packet
which can be used for datapath packet, we can simplify ofpbuf.
Following patch removes DPDK mbuf and access API of ofpbuf
members.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Currently dp-packet make use of ofpbuf for managing packet
buffers. That complicates ofpbuf, by making dp-packet
independent of ofpbuf both libraries can be optimized for
their own use case.
This avoids mapping operation between ofpbuf and dp_packet
in datapath upcalls.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
dp_packet is short and better name for datapath packet
structure.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
This commit changes the per dpif-netdev datapath flow-table/
classifier to per pmd-thread. As direct benefit, datapath
and flow statistics no longer need to be protected by mutex
or be declared as per-thread variable, since they are only
written by the owning pmd thread.
As side effects, the flow-dump output of userspace datapath
can contain overlapping flows. To reduce confusion, the dump
from different pmd thread will be separated by a title line.
In addition, the flow operations via 'ovs-appctl dpctl/*'
are modified so that if the given flow in_port corresponds
to a dpdk interface, the operation will be conducted to all
pmd threads recv from that interface (expect for flow-get
which will always be applied to non-pmd threads).
Signed-off-by: Alex Wang <alexw@nicira.com>
Tested-by: Mark D. Gray <mark.d.gray@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Previously, the dpif layer was responsible for determining datapath
support for UFIDs, which resulted in all ovs-dpctl utilities
inserting/deleting flows from the datapath each time they are run.
Shift this responsibility up to the dpif_backer.
There are two users of this functionality: Revalidators check for UFID
support to request a terser dump using UFIDs, and dpif-netlink uses this
to request flow_del operations to only return the UFID/stats. The latter
case was previously hidden from revalidators, but this change makes them
aware of it, and reuses the same "udpif->enable_ufid" flag for reducing
overhead of both flow dump and flow delete.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
If ovs-vswitchd is killed at the right time, then a probe flow may be
left in the datapath. This commit adds the DPIF_FP_MODIFY flag to
feature probes so that re-creating the same probe flow will not cause an
error.
These flow_put flags were previously changed in commit a7d1bbdcfe49e8
("ofproto-dpif: Use DPIF_FP_CREATE but not DPIF_FP_MODIFY."), despite
the commit message addressing a different case.
Suggested-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Various functions in ofproto-dpif and dpif-netlink detect support for
features in very similar ways. Refactor their common code to a single
function.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The UFID parameter to dpif_flow_get() is optional, but the current
implementation dereferences it to initialize part of the output flow.
This field is filled in by the dpif implementation, so don't initialize
it here.
This does not fix any existing bug because every caller currently passes
in a UFID. The next patch will introduce the first call to
dpif_flow_get() that doesn't provide a UFID, which would break without
this change.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
One of the limiting factors on the number of flows that can be supported
in the datapath is the overhead of assembling flow dump messages in the
datapath. This patch modifies the dpif to allow revalidators to skip
dumping the key, mask and actions from the datapath, by making use of
the unique flow identifiers introduced in earlier patches.
For each flow dump, the dpif user specifies whether to skip these
attributes, allowing the common case to only dump a pair of 128-bit ID
and flow stats. With datapath support, this increases the number of
flows that a revalidator can handle per second by 50% or more. Support
in dpif-netdev and dpif-netlink is added in this patch; kernel support
is left for future patches.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch modifies the dpif interface to allow flows to be manipulated
using a 128-bit identifier. This allows revalidator threads to perform
datapath operations faster, as they do not need to serialise the entire
flow key for operations like flow_get and flow_delete. In conjunction
with a future patch to simplify the dump interface, this provides a
significant performance benefit for revalidation.
When handlers assemble flow_put operations, they specify a unique
identifier (UFID) for each flow as it is passed down to the datapath to
be stored with the flow. The UFID is currently provided to handlers
by the dpif during upcall processing.
When revalidators assemble flow_get or flow_del operations, they may
specify the UFID for the flow along with the key. The dpif will decide
whether to send only the UFID to the datapath, or both the UFID and flow
key. The former is preferred for newer datapaths that support UFID,
while the latter is used for backwards compatibility.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
All but one of the callers of dp_enumerate_types() hands it an sset that is
known to be empty. The one exception is the implementation of the
ofproto-provider 'enumerate_types' function in ofproto-dpif. That function
isn't supposed to clear the existing contents of the sset passed in.
Therefore, this commit changes dp_enumerate_types() to not clear the sset
before it adds new members.
The ofproto-provider comment wasn't clear about the desired behavior so
this commit also clarifies that.
Reported-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Since dpif registering for routing table at initialization
there is no need to unregister it. Following patch removes
support for turning routing table notifications on and off.
Due to this change OVS always listens for these
notifications.
Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
This patch shifts the responsibility for determining the hash for a flow
from the revalidation logic down to the dpif layer. This assists in
handling backward-compatibility for revalidation with the upcoming
unique flow identifier "UFID" patches.
A 128-bit UFID was selected to minimize the likelihood of hash conflicts.
Handler threads will not install a flow that has an identical UFID as
another flow, to prevent misattribution of stats and to ensure that the
correct flow key cache is used for revalidation.
For datapaths that do not support UFID, which is currently all
datapaths, the dpif will generate the UFID and pass it up during upcall
and flow_dump. This is generated based on the datapath flow key.
Later patches will add support for datapaths to store and interpret this
UFID, in which case the dpif has a responsibility to pass it through
transparently.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
OVS router depends on tnl_conf_seq and all tunnel related
components should be initialized before registering dpif
implementations.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.
Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
OVS userspace are backward compatible with older Linux kernel modules.
However, not having the most up-to-date datapath kernel modules can
some times lead to user confusion. Storing the datapath version in
OVSDB allows management software to check and optionally provide
notifications to users.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
When flow_get fails (in this case flow does not exist) simply log
the key part of the get and erase the rest of the flow because it
is invalid.
Verified the fix by doing ovs-ofctl del-flows when traffic is running.
2014-10-18T20:12:13.785Z|00011|dpif(revalidator20)|WARN|system@ovs-system: failed to flow_get (No such file or directory) dp_hash(0),recirc_id(0),skb_priority(0),in_port(2),skb_mark(0),eth(src=00:13:72:0b:52:fa,dst=00:14:72:0b:52:fa),eth_type(0x0800),ipv4(src=10.0.0.164,dst=11.0.0.164,proto=6,tos=0,ttl=4,frag=no),tcp(src=1651,dst=6095),tcp_flags(ack), packets:0, bytes:0, used:never
Signed-off-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Today dpif-netdev has single metadat for given batch, since one
batch belongs to one port, but soon packets fro single tunnel ports
can belong to different ports, so we need to have per packet metadata.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Use the new OVS_FLOW_ATTR_PROBE flag when probing for datapath feature
support. Suppress also dpif error logging when requested, as probe
failures are already logged at ofproto-dpif.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This commits adds the multithreading functionality to OVS dpdk
module. Users are able to create multiple pmd threads and set
their cpu affinity via specifying the cpu mask string similar
to the EAL '-c COREMASK' option.
Also, the number of rx queues for each dpdk interface is made
configurable to help distribution of rx packets among multiple
pmd threads.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
The patch contains the necessary modifications to compile and also to run
under MSVC.
Added the files to the build system and also changed dpif_linux to be under
a more generic name dpif_windows.
Added a TODO under the windows part in case we want to implement another
counterpart for epoll functions.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Add a new action type OVS_ACTION_ATTR_SET_MASKED, and support for
parsing, printing, and committing them.
Masked set actions add a mask, immediately following the netlink
attribute data, within the netlink attribute itself. Thus the key
attribute size for a masked set action is exactly double of the
non-masked set action.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch avoids the relatively inefficient miss handling processes
dictated by the dpif process, by calling into ofproto-dpif directly
through a callback.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This cleans up the dpif interface to make it more consistent with the
other dpif operations, and allows flows to be fetched in batches.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The dpif provider 'operate' call duplicates all of the features available
from the 'flow_put', 'flow_del', and 'execute' calls, yielding redundant
code in providers that support both mechanisms. This change drops the
latter calls in favor of making every dpif provider support 'operate'.
The result is code that is overall less duplicative.
It might make sense to do the same with flow_get but so far 'operate'
doesn't support flow_get.
Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit introduces multiple appctl commands (dpctl/*)
They are needed to interact with userspace datapaths (dpif-netdev), because the
ovs-dpctl command runs in a separate process and cannot see the userspace
datapaths inside vswitchd.
This change moves most of the code of utilities/ovs-dpctl.c in lib/dpctl.c.
Both the ovs-dpctl command and the ovs-appctl dpctl/* commands make calls to
lib/dpctl.c functions, to interact with datapaths.
The code from utilities/ovs-dpctl.c has been moved to lib/dpctl.c and has been
changed for different reasons:
- An exit() call in the old code made perfectly sense. Now (since the code
can be run inside vswitchd) it would terminate the daemon. Same reasoning
can be applied to ovs_fatal_*() calls.
- The lib/dpctl.c code _should_ not leak memory.
- All the print* have been replaced with a function pointer provided by the
caller, since this code can be run in the ovs-dpctl process (in which
case we need to print to stdout) or in response to a unixctl request (and
in this case we need to send everything through a socket, using JSON
encapsulation).
The syntax is
ovs-appctl dpctl/(COMMAND) [OPTIONS] [PARAMETERS]
while the ovs-dpctl syntax (which _should_ remain the same after this change)
is
ovs-dpctl [OPTIONS] (COMMAND) [PARAMETERS]
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
[blp@nicira.com made stylistic and documentation changes]
Signed-off-by: Ben Pfaff <blp@nicira.com>
Typically, kernel datapath threads send upcalls to userspace where
handler threads process the upcalls. For TAP and DPDK devices, the
datapath threads operate in userspace, so there is no need for
separate handler threads.
This patch allows userspace datapath threads to directly call the
ofproto upcall functions, eliminating the need for handler threads
for datapaths of type 'netdev'.
Signed-off-by: Ryan Wilson <wryan@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
The userspace datapath returns RCU-protected actions from flow_get() and
flow_dump_next(). This doesn't cause any trouble for current users of
these functions, but it imposes additional constraints on their use.
This patch makes the dpif documentation more explicit about how the
results of these functions can be used.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Change the interface to allow implementations to pass back a buffer, and
allow callers to specify which of actions, mask, and stats they wish to
receive. This will be used in the next commit.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This should make it easier to identify problems that result from bugs in
the execution helper.
Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>