This patch adds set-field operations for nd_target, nd_sll, and nd_tll
fields, with and without masks, using Nicira extensions and OpenFlow 1.2
protocol.
Signed-off-by: Randall A Sharo <randall.sharo at navy.mil>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit fixes an include dependency for header ip6.h, on
FreeBSD. Without this commit, the gmake of ovs master on
FreeBSD will result in the following error.
/usr/include/netinet/ip6.h:82: error: field 'ip6_src' has incomplete type
/usr/include/netinet/ip6.h:83: error: field 'ip6_dst' has incomplete type
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.
Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Packets with 'LATER' fragment do not have a transport header, so it is
not possible to either match on or set transport ports on such
packets. Matching is prevented by augmenting mf_are_prereqs_ok() with
a nw_frag 'LATER' bit check. Setting the transport headers on such
packets is prevented in three ways:
1. Flows with an explicit match on nw_frag, where the LATER bit is 1:
existing calls to the modified mf_are_prereqs_ok() prohibit using
transport header fields (port numbers) in OXM/NXM actions
(set_field, move). SET_TP_* actions need a new check on the LATER
bit.
2. Flows that wildcard the nw_frag LATER bit: At flow translation
time, add calls to mf_are_prereqs_ok() to make sure that we do not
use transport ports in flows that do not have them.
3. At action execution time, do not set transport ports, if the packet
does not have a full transport header. This ensures that we never
call the packet_set functions, that require a valid transport
header, with packets that do not have them. For example, if the
flow was created with a IPv6 first fragment that had the full TCP
header, but the next packet's first fragment is missing them.
3 alone would suffice for correct behavior, but 1 and 2 seem like a
right thing to do, anyway.
Currently, if we are setting port numbers, we will also match them,
due to us tracking the set fields with the same flow_wildcards as the
matched fields. Hence, if the incoming port number was not zero, the
flow would not match any packets with missing or truncated transport
headers. However, relying on no packets having zero port numbers
would not be very robust. Also, we may separate the tracking of set
and matched fields in the future, which would allow some flows that
blindly set port numbers to not match on them at all.
For TCP in case 3 we use ofpbuf_get_tcp_payload() that requires the
whole (potentially variable size) TCP header to be present. However,
when parsing a flow, we only require the fixed size portion of the TCP
header to be present, which would be enough to set the port numbers
and fix the TCP checksum.
Finally, we add tests testing the new behavior.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Today dpif-netdev has single metadat for given batch, since one
batch belongs to one port, but soon packets fro single tunnel ports
can belong to different ports, so we need to have per packet metadata.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Firstly, with this change, the 'more_actions' parameter is removed and
is integrated into 'steal'. Then, every function that receives a batch
of packets with 'steal' set to true is responsible for freeing the
packets. Finally, odp_execute_actions() and odp_execute_actions__()
can be be merged.
This also fixes a memory leak in odp_execute_sample(), when the
subactions are not executed
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
If odp_execute_actions() has been called with 'steal' set to true and
OVS_ACTION_ATTR_RECIRC as last action, it should allow dp_execute_cb()
to steal the packet.
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
When building with DPDK support, 'struct dpif_packet' won't have 'dp_hash'
member. dpif_packet_set_dp_hash() and dpif_packet_get_dp_hash() should be used.
Furthermore, the masked set action shouldn't read 'md->dp_hash' (which is
shared in a batch), but should use dpif_packet_get_dp_hash() to get each packet
private hash.
This commit fixes the build with DPDK.
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Add a new action type OVS_ACTION_ATTR_SET_MASKED, and support for
parsing, printing, and committing them.
Masked set actions add a mask, immediately following the netlink
attribute data, within the netlink attribute itself. Thus the key
attribute size for a masked set action is exactly double of the
non-masked set action.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
These function are used to stored the packet hash. 'netdev-dpdk'
automatically set this value to the RSS hash returned by the
NIC. Other 'netdev's set it to 0 (which is an invalid hash
value), so that callers can compute the hash on their own.
If DPDK support is enabled, struct dpif_packet's member
'dp_hash' is removed and 'pkt.hash.rss' from DPDK mbuf is used
This commit also configure DPDK devices to compute RSS hash
for UDP and IPv6 packets
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Until now, the OVS source tree has had a whole maze of header files that
make "#include <linux/openvswitch.h>" work OK regardless of platform, but
this confuses everyone new to the tree, at first glance, and is difficult
to understand at second glance too.
This commit renames include/linux/openvswitch.h to
datapath/linux/compat/include/linux/openvswitch.h without other change,
then modifies the userspace build to generate a header that makes sense in
portable Open vSwitch userspace from that header.
It then removes all the remaining include/linux/* files since they are now
unused.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
dpif-packet contains ofpbuf which points to packet data. Here buf
is better name rather than ofp.
Following patch renames all remaining instances of ofp variable.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
This change in dpif-netdev allows faster packet processing for devices which
implement batching (netdev-dpdk currently).
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This commit introduces a new data structure used for receiving packets from
netdevs and passing them to dpifs.
The purpose of this change is to allow storing some private data for each
packet. The subsequent commits make use of it.
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
In case DP_HASH and RECIRC actions need to be executed in slow path,
current implementation simply don't handle them -- vswitchd simply
crashes. This patch fixes them by supply an implementation for them.
RECIRC will be handled by the datapath, same as the output action.
DP_HASH, on the other hand, is handled in the user space. Although the
resulting hash values may not match those computed by the datapath, it
is less expensive; current use case (bonding) does not require a strict
match to work properly.
Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Currently recirculation action can optionally compute hash. This patch
adds a hash action that is independent of the recirc action, which
no longer computes hash. For megaflow bond with recirc, the output
to a bond port action will look like:
hash(hash_l4(0)), recirc(<recirc_id>)
Obviously, when a recirculation application that does not depend on
hash value can just use the recirc action alone.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Reviewed-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Pravin B Shelar <pshelar@nicira.com
Infrastructure to enable megaflow support for bond ports using
recirculation. This patch adds the following features:
* Generate RECIRC action when bond can benefit from recirculation.
* Populate post recirculation rules in a hidden table. Currently table 254.
* Uses post recirculation rules for bond rebalancing
* A recirculation implementation in dpif-netdev.
The goal of this patch is to be able to megaflow bond outputs and
thus greatly improve performance. However, this patch does not
actually improve the megaflow generation. It is left for a later commit.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Rename 'l2' to 'frame' and add new ofpbuf_set_frame() and ofpbuf_l2().
ofpbuf_set_frame() alse resets all the layer offsets. ofpbuf_l2()
returns NULL if the packet has no Ethernet header, as indicated either
by unset l3 offset or NULL frame pointer. Callers of ofpbuf_l2() are
supposed to check the return value, unless they can otherwise be sure
that the packet has a valid Ethernet header.
The recent commit 437d0d22 made some assumptions that were not valid
regarding the use of the 'l2' pointer in rconn module and by
compose_rarp(). This is now fixed as follows: rconn now relies on the
fact that once OpenFlow messages are given to rconn for transport, the
frame pointer is no longer needed to refer to the OpenFlow header; and
compose_rarp() now sets the frame pointer and offsets as expected.
In addition to storing network frames, ofpbufs are also used for
handling OpenFlow messages and action lists. lib/ofpbuf.h now has a
comment documenting the current usage conventions and invariants.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Code reads better without the "get", for example "ofpbuf_l3()"
v.s. "ofpbuf_get_l3()". L4 payoad access functions still use the
"get" (e.g., "ofpbuf_get_tcp_payload()").
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
This patch shrinks the struct ofpbuf from 104 to 48 bytes on 64-bit
systems, or from 52 to 36 bytes on 32-bit systems (counting in the
'l7' removal from an earlier patch). This may help contribute to
cache efficiency, and will speed up initializing, copying and
manipulating ofpbufs. This is potentially important for the DPDK
datapath, but the rest of the code base may also see a little benefit.
Changes are:
- Remove 'l7' pointer (previous patch).
- Use offsets instead of layer pointers for l2_5, l3, and l4 using
'l2' as basis. Usually 'data' is the same as 'l2', but this is not
always the case (e.g., when parsing or constructing a packet), so it
can not be easily used as the offset basis. Also, packet parsing is
faster if we do not need to maintain the offsets each time we pull
data from the ofpbuf.
- Use uint32_t for 'allocated' and 'size', as 2^32 is enough even for
largest possible messages/packets.
- Use packed enum for 'source'.
- Rearrange to avoid unnecessary padding.
- Remove 'private_p', which was used only in two cases, both of which
had the invariant ('l2' == 'data'), so we can temporarily use 'l2'
as a private pointer.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Add basic recirculation infrastructure and user space
data path support for it. The following bond mega flow patch will
make use of this infrastructure.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
DPDK can receive multiple packets but current netdev API does
not allow that. Following patch allows dpif-netdev receive batch
of packet in a rx_recv() call for any netdev port. This will be
used by dpdk-netdev.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
This is in preparation for pushing vlan tags
using the TPID provided by the kernel via auxdata.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This helps reduce confusion about when a flow is a flow and when it is
just metadata.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Use one callback instead of many, helps in adding new functionality
later on.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This allows other libraries to use util.h that has already
defined NOT_REACHED.
Signed-off-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Allowing the packet to be modified by execution allows less data
copying for userspace action execution. Some users of the
dpif_execute already expect that the packet may be modified. This
patch makes this behavior uniform and makes the userspace datapath and
the execution helpers modify the packet as it is being executed.
Userspace action now steals the packet if given permission, as the
packet is normally not needed after it. The only exception is the
sample action, and this is accounted for my keeping track of any
actions that could be following the userspace action.
The packet in dpif_upcall is changed from a pointer to a struct,
allowing the packet to be honest about it's headroom. After this
change the packet can safely be pushed on over the precarious 4 byte
limit earlier allowed by the netlink data preceding the packet.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
tcp_flags=flags/mask
Bitwise match on TCP flags. The flags and mask are 16-bit num‐
bers written in decimal or in hexadecimal prefixed by 0x. Each
1-bit in mask requires that the corresponding bit in port must
match. Each 0-bit in mask causes the corresponding bit to be
ignored.
TCP protocol currently defines 9 flag bits, and additional 3
bits are reserved (must be transmitted as zero), see RFCs 793,
3168, and 3540. The flag bits are, numbering from the least
significant bit:
0: FIN No more data from sender.
1: SYN Synchronize sequence numbers.
2: RST Reset the connection.
3: PSH Push function.
4: ACK Acknowledgement field significant.
5: URG Urgent pointer field significant.
6: ECE ECN Echo.
7: CWR Congestion Windows Reduced.
8: NS Nonce Sum.
9-11: Reserved.
12-15: Not matchable, must be zero.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
In current code, the odp_execute_actions() function does not check
the value of "userspace" function pointer before invoking it. This
patch adds a check for it.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This support is added through the userspace slow path, because we don't
judge that this is important enough to require permanent support in the
Linux kernel ABI.
Bug #19259.
CC: Teemu Koponen <koponen@nicira.com>
CC: Pankaj Thakkar <thakkar@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The skb_mark field is currently only available with the Linux datapath
and is only used internally. However, it is desirable to expose this
through OpenFlow and when it is exposed ideally it would not be system-
specific. In preparation for this, skb_mark is rename to pkt_mark in
internal data structures for consistency.
This does not rename the Linux interfaces because doing so would break
the API. It would not necessarily be desirable to do anyways since in
Linux-specific code it is clearer to use the actual name rather than a
generic one. This can lead to confusion in some places, however, because
we do not always strictly separate generic and platform dependent code
(one example is actions). This seems inevitable though at this point if
the lower and upper layers have different names (as they must given the
above requirements).
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Now that execute_actions() is available it can be used as a generic
replacement for special-case action execution in
execute_controller_action().
As suggested by Jesse Gross.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The motivation for this is to allow such actions to be honoured
if they are encountered; by the user-space datapath before recirculation;
or by internal processing of actions by ovs-vswitchd before recirculation.
Recirculation will be added by a subsequent patch.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This moves generic action execution code out of lib/dpif-netedev.c
and into a new file, lib/odp-execute.c.
This is in preparation for using odp_execute_actions()
in lib/odp-util.c to handle recirculation/
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>