This patch adds support for rewriting SCTP src,dst ports similar to the
functionality already available for TCP/UDP.
Rewriting SCTP ports is expensive due to double-recalculation of the
SCTP checksums; this is performed to ensure that packets traversing OVS
with invalid checksums will continue to the destination with any
checksum corruption intact.
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
OVS_ACTION_ATTR_USERSPACE action was sending the key from the matching
flow. This works for exact match flows because flow keys are the
same as packet keys. However, it does not work with wildcarded flows as
the packet keys may be different than the flow keys. This patch uses
the packet keys carried in OVS_CB(skb) when calling output_userspace().
Bug #18163
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
As suggested by Jesse in the comment for patch "gre: Restructure
tunneling", following patch keeps skb->csum correct across ovs.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
We currently allow five trips through the kernel datapath
before dropping the packet to protect the stack. However, there
have been a few reports recently involving tunneling that this is
still too much. Although it's not a complete solution, this reduces
the limit by one to balance safety in common situations with
flexibility.
Bug #15477
Reported-by: Paul Ingram <paul@nicira.com>
Reported-by: 謝秉融 <faithfulman@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Since userspace flow based tunneling code is checked in, the kernel
port based tunneling code can be removed.
Patch removes following components:
- tunnel ports hash table and moved tunnel ports list to individual
vports.
- Cleaned per tnl-port config.
- OVS_KEY_ATTR_TUN_ID action is removed.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #15078
In vlan_insert_tag(), we insert a 4-byte VLAN header _after_
mac header:
memmove(skb->data, skb->data + VLAN_HLEN, 2 * ETH_ALEN);
...
veth->h_vlan_proto = htons(ETH_P_8021Q);
...
veth->h_vlan_TCI = htons(vlan_tci);
so after it, we should recompute the checksum to include these 4 bytes.
skb->data still points to the mac header, therefore VLAN header is at
(2 * ETH_ALEN = 12) bytes after it, not (ETH_HLEN = 14) bytes.
This can also be observed via tcpdump:
0x0000: ffff ffff ffff 5254 005d 6f6e 8100 000a
0x0010: 0806 0001 0800 0604 0001 5254 005d 6f6e
0x0020: c0a8 026e 0000 0000 0000 c0a8 0282
Similar for __pop_vlan_tci(), the vlan header we remove is the one
overwritten in:
memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
Therefore the VLAN_HLEN = 4 bytes after 2 * ETH_ALEN is the part
we want to sub from checksum.
Cc: David S. Miller <davem@davemloft.net>
Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
This patch adds support for skb mark matching and set action.
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ansis Atteka <aatteka@nicira.com>
When the IPV4_TUNNEL action is executed, a pointer in the skb is
directly assigned the address of the action, which is protected by
RCU. If a TUN_ID action is later executed it will write into the
action, which is not allowed by RCU. This avoids the problem by
making a copy of the data and writing into the copy.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
Datapath tunneling check for flag OVS_FLOW_TNL_F_KEY is failing,
causing it to drop packet. This only happens on tunnels with
zero key as vswitchd does not generate set-tunnel action. Set
tunnel action sets this flags for given action. To fix this issue
the check is dropped.
Bug #13666
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This is a first pass at providing a tun_key which can be
used as the basis for flow-based tunnelling. The
tun_key includes and replaces the tun_id in both struct
ovs_skb_cb and struct sw_tun_key.
This patch allows all existing tun_id behaviour to still work. Existing
users of tun_id are redirected to tun_key->tun_id to retain compatibility.
However, when the userspace code is updated to make use of the new
tun_key, the old behaviour will be deprecated and removed.
NOTE: With these changes, the tunneling code no longer assumes input and
output keys are symmetric. If they are not, PMTUD needs to be disabled
for tunneling to work.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
When modifying IP addresses or ports on a UDP packet we don't
correctly follow the rules for unchecksummed packets. This meant
that packets without a checksum can be given a incorrect new checksum
and packets with a checksum can become marked as being unchecksummed.
This fixes it to handle those requirements.
Bug #8937
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Use hash table to store ports of datapath. Allow 64K ports per switch.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #2462
OVS has quite a few global symbols that should be scoped with a
prefix to prevent collisions with other modules in the kernel.
Suggested-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Many of our kernel copyright messages make reference to code being
copied from the Linux kernel, which is a bit odd for code in the
kernel. This changes them to use the standard GNU GPL boilerplate
instead. It does not change the actual license, which continues to
be GPLv2.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
In the future it is likely that our vlan support will expand to
include multiply tagged packets. When this happens, we would
ideally like for it to be consistent with our current tagging.
Currently, if we receive a packet with a partial VLAN tag we will
automatically drop it in the kernel, which is unique among the
protocols we support. The only other reason to drop a packet is
a memory allocation error. For a doubly tagged packet, we will
parse the first tag and indicate that another tag was present but
do not drop if the second tag is incorrect as we do not parse it.
This changes the behavior of the vlan parser to match other protocols
and also deeper tags by indicating the presence of a broken tag with
the 802.1Q EtherType but no vlan information. This shifts the policy
decision to userspace on whether to drop broken tags and allows us to
uniformly add new levels of tag parsing.
Although additional levels of control are provided to userspace, this
maintains the current behavior of dropping packets with a broken
tag when using the NORMAL action because that is the correct behavior
for an 802.1Q-aware switch. The userspace flow parser actually
already had the new behavior so this corrects an inconsistency.
Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
When the datapath was converted to use Netlink attributes for describing
flow keys, I had a vague idea of how it could be smoothly extensible, but
I didn't actually implement extensibility or carefully think it through.
This commit adds a document that describes how flow keys can be extended
in a compatible fashion and adapts the existing interface to match what
it says.
This commit doesn't actually implement extensibility. I already have a
separate patch series out for that. This patch series borrows from that
one heavily, but the extensibility series will need to be reworked
somewhat once this one is in.
This commit is only lightly tested because I don't have a good test setup
for VLANs.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
There are only two symbols in actions.h. Compatibility function
is moved to compat.h and execute_actions() declaration is moved
to datapath.h
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Add support matching the IPv4 TTL and IPv6 hop limit fields. This
commit also adds support for modifying the IPv4 TTL. Modifying the IPv6
hop limit isn't currently supported, since we don't support modifying
IPv6 headers.
We will likely want to change the user-space interface, since basic
matching and setting the TTL are not generally useful. We will probably
want the ability to match on extraordinary events (such as TTL of 0 or 1)
and a decrement action.
Feature #8024
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
When updating the IP TOS field, the checksum was not properly calculated
on little endian systems. This commit fixes the issue.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Following patch adds skb-priority to flow key. So userspace will know
what was priority when packet arrived and we can remove the pop/reset
priority action. It's no longer necessary to have a special action for
pop that is based on the kernel remembering original skb->priority.
Userspace can just emit a set priority action with the original value.
Since the priority field is a match field with just a normal set action,
we can convert it into the new model for actions that are based on
matches.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7715
Commit 4edb9ae90e "datapath: Refactor
actions in terms of match fields." introduced a spurious warning
because the compiler thinks a value might not have been assigned to
'err'. In practice this can't happen because we've already validated
the actions.
CC: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Almost all current actions can be expressed in the form of
push/pop/set <field>, where field is one of the match fields. We can
create three base actions and take a field. This has both a nice
symmetry and avoids inconsistencies where we can match on the vlan
TPID but not set it.
Following patch converts all actions to this new format.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7115
Commit b063d9f06 "datapath: Use unicast Netlink sockets for upcalls" that
switched from multicast to unicast Netlink for sending upcalls added a
Netlink PID to each kernel flow, used by OVS_ACTION_ATTR_USERSPACE actions
within the flow as target.
This commit drops this per-flow PID in favor of a per-action PID, because
that is more flexible. It does not yet make use of this additional
flexibility, so behavior should not change.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7559.
Following patch removes RT kernel support. This allows us to cleanup
the loop detection.
Along with this BH is now disabled while running execute_actions()
for packet from user-space.
As a result we can simplify the stats code as entire send and receive
path runs in BH context on all supported platforms.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7621
There is not need to clone skb while sending packet to user-space.
Since data is only read from packet skb.
Signed-off-by: Pravin Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Following patch adds sampling action which takes probability and set
of actions as arguments. When probability is hit, actions are executed for
given packet.
USERSPACE action's userdata (u64) is used to store struct
user_action_cookie as cookie. CONTROLLER action is fixed accordingly.
Now we can remove sFlow code from kernel and implement sFlow generically
as SAMPLE action. sFlow is defined as SAMPLE Action with probability (sFlow
sampling rate) and USERSPACE action as argument. USERSPACE action's data
is used as cookie. sFlow uses this cookie to store output-port, number of
output ports and vlan-id. sample-pool is calculated by using vport
stats.
Signed-off-by: Pravin Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The code for outputting a packet can be simplified a little and
also modernized. There is no functional change.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
If we encounter an error when sending a packet to userspace due to
an explicit action we stop processing further actions. This makes
sense for things like push vlan, where to continue means outputting
an incorrect packet. However, sending to userspace is more akin
to outputting to a port, which does not halt further processing.
For consistency, ignore errors in this case as well.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Currently the kernel vlan actions mirror those used by OpenFlow 1.0.
i.e. MODIFY and STRIP. More flexible approach is to have an action to
push a tag and pop a tag off, so that it can handle multiple levels of vlan
tags. Plus it aligns with newer version of OpenFlow.
As this patch replaces MODIFY with PUSH semantic, action
mapping done in userpace is fixed accordingly.
GSO handling for multiple levels of vlan tags is also added as
Jesse suggested before.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree. This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions. The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.
Feature #6904
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
ODP_ACTION_ATTR_CONTROLLER in the kernel actually sends packets to
userspace, not the controller. To make it generic rename this action
to ODP_ACTION_ATTR_USERSPACE.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
It's possible to trace kfree_skb() call sites to find out where
packets are getting dropped. Situations where kfree_skb() does
not actually indicate an error adds additional noise, so use
consume_skb() instead to avoid tracing non-errors.
Suggested-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The current implementation of make_writable() is both overly complex
and unnecessarily aggressive about copying data. We can improve
performance by only making a copy of the data if someone else holds
a reference to the portion of the data that we want to modify. This
means that if a clone is held by the TCP stack for retransmission then
we do not need to make a copy if we are changing the IP header because
it will get regenerated on retransmit anyways. Even when it is necessary
to copy we avoid duplicating struct sk_buff.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The sematics for setting a vlan tag are to modify the existing tag
if one exists. This can be expressed as removing the existing tag
first and then adding a new one. This simplifies the code by not
requiring two copies of the logic that manipulates non-accelerated
vlans and should not make a performance difference because the vlan
tag is contained in a single cache line.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The NXAST_DROP_SPOOFED_ARP action has been deprecated in favor of
defining flows using the NXM_NX_ARP_SHA flow match for a while. This
commit removes it.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
The fields of the kernel flow key are now grouped by protocol rather
than using generic names. The containing structures describe the
category, so it is no longer necessary to use prefixes. Most of
these prefixes have been removed but nw_proto and nw_tos have
retained them. This renames the fields for consistency and brevity.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Currently the whole flow key struct is hashed on every packet received from the
network or userspace. The whole struct is also compared byte-for-byte when
doing flow table lookups. This consumes a fair percentage of CPU time, and most
of the time part of the structure is unused (e.g. the IPv6 fields when handling
IPv4 traffic; the IPv4 fields when handling Ethernet frames).
This commit reorders the fields in the flow key struct to put the least
commonly used elements at the end and changes the hash and comparison functions
to look only at the part that contains data.
Signed-off-by: Andrew Evans <aevans@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
It's (almost) always easier to understand a function with fewer parameters,
so this removes the now-redundant sw_flow_key and actions parameters from
execute_actions(), since they can be found through OVS_CB(skb)->flow now.
This also necessarily moves loop detection into execute_actions().
Otherwise, the flow's actions could have changed between the time that the
loop was detected and the time that it was suppressed, which would mean
that the wrong (version of the) flow would get suppressed.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>