In the future it is likely that our vlan support will expand to
include multiply tagged packets. When this happens, we would
ideally like for it to be consistent with our current tagging.
Currently, if we receive a packet with a partial VLAN tag we will
automatically drop it in the kernel, which is unique among the
protocols we support. The only other reason to drop a packet is
a memory allocation error. For a doubly tagged packet, we will
parse the first tag and indicate that another tag was present but
do not drop if the second tag is incorrect as we do not parse it.
This changes the behavior of the vlan parser to match other protocols
and also deeper tags by indicating the presence of a broken tag with
the 802.1Q EtherType but no vlan information. This shifts the policy
decision to userspace on whether to drop broken tags and allows us to
uniformly add new levels of tag parsing.
Although additional levels of control are provided to userspace, this
maintains the current behavior of dropping packets with a broken
tag when using the NORMAL action because that is the correct behavior
for an 802.1Q-aware switch. The userspace flow parser actually
already had the new behavior so this corrects an inconsistency.
Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
When the datapath was converted to use Netlink attributes for describing
flow keys, I had a vague idea of how it could be smoothly extensible, but
I didn't actually implement extensibility or carefully think it through.
This commit adds a document that describes how flow keys can be extended
in a compatible fashion and adapts the existing interface to match what
it says.
This commit doesn't actually implement extensibility. I already have a
separate patch series out for that. This patch series borrows from that
one heavily, but the extensibility series will need to be reworked
somewhat once this one is in.
This commit is only lightly tested because I don't have a good test setup
for VLANs.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This is more conventional use of Netlink.
For upstreaming, 'u64 attrs' can be changed to u32 and the uses of 1ULL
can be changed to 1.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Most of the members in structures referring to network elements indicate
the layer (e.g., "tl_", "nw_", "tp_"). The "frag" and "tos" members
didn't, so this commit add them.
IPv6 uses the term "traffic class" for what IPv4 calls
"type-of-service". This commit renames the the "ipv6_tos" field to
"ipv6_tclass" in the "ovs-key_ipv6" struct to be more consistent with
the IPv6 terminology.
Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Add support matching the IPv4 TTL and IPv6 hop limit fields. This
commit also adds support for modifying the IPv4 TTL. Modifying the IPv6
hop limit isn't currently supported, since we don't support modifying
IPv6 headers.
We will likely want to change the user-space interface, since basic
matching and setting the TTL are not generally useful. We will probably
want the ability to match on extraordinary events (such as TTL of 0 or 1)
and a decrement action.
Feature #8024
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This will be useful later when we add support for matching the ECN bits
within the TOS field.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Following patch adds skb-priority to flow key. So userspace will know
what was priority when packet arrived and we can remove the pop/reset
priority action. It's no longer necessary to have a special action for
pop that is based on the kernel remembering original skb->priority.
Userspace can just emit a set priority action with the original value.
Since the priority field is a match field with just a normal set action,
we can convert it into the new model for actions that are based on
matches.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7715
Patch below fixes build on FreeBSD; tested on 10.0-CURRENT.
Signed-off-by: Edward Tomasz Napierala <trasz@FreeBSD.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, OVS has handled IP fragments more awkwardly than necessary. It
has not been possible to match on L4 headers, even in fragments with offset
0 where they are actually present. This means that there was no way to
implement ACLs that treat, say, different TCP ports differently, on
fragmented traffic; instead, all decisions for fragment forwarding had to
be made on the basis of L2 and L3 headers alone.
This commit improves the situation significantly. It is still not possible
to match on L4 headers in fragments with nonzero offset, because that
information is simply not present in such fragments, but this commit adds
the ability to match on L4 headers for fragments with zero offset. This
means that it becomes possible to implement ACLs that drop such "first
fragments" on the basis of L4 headers. In practice, that effectively
blocks even fragmented traffic on an L4 basis, because the receiving IP
stack cannot reassemble a full packet when the first fragment is missing.
This commit works by adding a new "fragment type" to the kernel flow match
and making it available through OpenFlow as a new NXM field named
NXM_NX_IP_FRAG. Because OpenFlow 1.0 explicitly says that the L4 fields
are always 0 for IP fragments, it adds a new OpenFlow fragment handling
mode that fills in the L4 fields for "first fragments". It also enhances
ovs-ofctl to allow users to configure this new fragment handling mode and
to parse the new field.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Bug #7557.
Almost all current actions can be expressed in the form of
push/pop/set <field>, where field is one of the match fields. We can
create three base actions and take a field. This has both a nice
symmetry and avoids inconsistencies where we can match on the vlan
TPID but not set it.
Following patch converts all actions to this new format.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7115
Most of the enum tags in this file are lowercased versions of the uppercase
enum prefixes (or slightly less abbreviated versions, e.g. "dp" becomes
"datapath"). This commit fixes up the others for consistency.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Commit b063d9f06 "datapath: Use unicast Netlink sockets for upcalls" that
switched from multicast to unicast Netlink for sending upcalls added a
Netlink PID to each kernel flow, used by OVS_ACTION_ATTR_USERSPACE actions
within the flow as target.
This commit drops this per-flow PID in favor of a per-action PID, because
that is more flexible. It does not yet make use of this additional
flexibility, so behavior should not change.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7559.
These macros caused trouble if datapath-protocol.h was included before
openflow.h. Later references to the icmp_type and icmp_code members of
struct ovs_key_icmp caused compiler errors, because the macros caused them
to try to refer to nonexistent tp_src and tp_dst members in those
structures.
Following patch adds sampling action which takes probability and set
of actions as arguments. When probability is hit, actions are executed for
given packet.
USERSPACE action's userdata (u64) is used to store struct
user_action_cookie as cookie. CONTROLLER action is fixed accordingly.
Now we can remove sFlow code from kernel and implement sFlow generically
as SAMPLE action. sFlow is defined as SAMPLE Action with probability (sFlow
sampling rate) and USERSPACE action as argument. USERSPACE action's data
is used as cookie. sFlow uses this cookie to store output-port, number of
output ports and vlan-id. sample-pool is calculated by using vport
stats.
Signed-off-by: Pravin Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Currently the kernel vlan actions mirror those used by OpenFlow 1.0.
i.e. MODIFY and STRIP. More flexible approach is to have an action to
push a tag and pop a tag off, so that it can handle multiple levels of vlan
tags. Plus it aligns with newer version of OpenFlow.
As this patch replaces MODIFY with PUSH semantic, action
mapping done in userpace is fixed accordingly.
GSO handling for multiple levels of vlan tags is also added as
Jesse suggested before.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
When ovs-vswitchd executes actions on a synthesized packet, that is, on a
packet that is not being forwarded from any particular port but is being
generated by ovs-vswitchd itself or by an OpenFlow controller (using a
OFPT_PACKET_OUT message with an in_port of OFPP_NONE), there is no good
choice for the in_port to pass to the kernel in the flow in the
OVS_PACKET_CMD_EXECUTE message. This commit allows ovs-vswitchd to omit
the in_port entirely in this case.
This fixes a bug in OFPT_PACKET_OUT: using an in_port of OFPP_NONE would
cause the packet to be dropped by the kernel, since that's an invalid
input port.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Reported-by: Aaron Rosen <arosen@clemson.edu>
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree. This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions. The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.
Feature #6904
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
An existing comment in the function being updated explains the problem:
* Many of the sscanf calls in this function use oversized destination
* fields because some sscanf() implementations truncate the range of %i
* directives, so that e.g. "%"SCNi16 interprets input of "0xfedc" as a
* value of 0x7fff. The other alternatives are to allow only a single
* radix (e.g. decimal or hexadecimal) or to write more sophisticated
* parsers.
The rest of the headers all follow the form "header(value)" or
"header(key1=value1,key2=value2,...)" but VLAN headers left out the "="
characters. This adds them in for consistency.
ODP_ACTION_ATTR_CONTROLLER in the kernel actually sends packets to
userspace, not the controller. To make it generic rename this action
to ODP_ACTION_ATTR_USERSPACE.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
The NXAST_DROP_SPOOFED_ARP action has been deprecated in favor of
defining flows using the NXM_NX_ARP_SHA flow match for a while. This
commit removes it.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
In addition to the changes to ofproto, this commit changes all of the
instances of "struct flow" in the tree so that the "in_port" member is an
OpenFlow port number. Previously, this member was an OpenFlow port number
in some cases and an ODP port number in other cases.
This is a potential security issue for the kernel. In userspace it just
provokes false-positive valgrind warnings (which is how I found it).
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
IPv6 uses Neighbor Discovery messages in a similar manner to how IPv4
uses ARP. This commit adds support for matching deeper into the
payloads of Neighbor Solicitation (NS) and Neighbor Advertisement (NA)
messages. Currently, the matching fields include:
- NS and NA Target (nd_target)
- NS Source Link Layer Address (nd_sll)
- NA Target Link Layer Address (nd_tll)
When defining IPv6 Neighbor Discovery rules, the Nicira Extensible Match
(NXM) extension to OVS must be used.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Provides ability to match over IPv6 traffic in the same manner as IPv4.
Currently, the matching fields include:
- IPv6 source and destination addresses (ipv6_src and ipv6_dst)
- Traffic Class (nw_tos)
- Next Header (nw_proto)
- ICMPv6 Type and Code (icmp_type and icmp_code)
- TCP and UDP Ports over IPv6 (tp_src and tp_dst)
When defining IPv6 rules, the Nicira Extensible Match (NXM) extension to
OVS must be used.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
OpenFlow 1.0 doesn't allow matching on the ARP source and target
hardware address. This has caused us to introduce hacks such as the
Drop Spoofed ARP action. Now that we have extensible match, we can
match on more fields within ARP:
- Source Hardware Address (arp_sha)
- Target Hardware Address (arp_tha)
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
A few common IP protocol types were defined in "lib/packets.h". However,
we already assume the existence of <netinet/in.h> which contains a more
exhaustive list and should be available on POSIX systems.
The '#' format specifier doesn't respect the field width modifier,
so EtherTypes are printed with variable length. Zero is not a valid
EtherType so there isn't a need for the logic to dynamically insert
the 0x prefix (if the EtherType isn't specified it won't be printed
at all). This fixes the EtherType to have the intended format and
also changes the vlan TPID to match.
Jesse suggested this naming scheme, so I'm adjusting existing names to
fit it.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Following this commit, "struct odp_flow_stats" is only used in
Linux-specific parts of OVS userspace code. This allows the actual Linux
datapath interface to evolve more freely.
Reviewed by Justin Pettit.
Following this commit, "struct odp_flow" and related data structures are
only used in Linux-specific parts of OVS userspace code. This allows the
actual Linux datapath interface to evolve more freely.
Reviewed by Justin Pettit.
This is cleaner than parsing "odp_port"s directly. It takes one step
toward eliminating use of odp_port from any userspace code outside of
lib/netdev-vport.c and lib/dpif-linux.c.
Reviewed by Justin Pettit.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length. This
commit makes that change using Netlink attribute sequences.
This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, because userspace doesn't yet
have enough information to do that intelligently. Upcoming commits will
fix that.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Before this change, these were formatted as:
,***%u leftover bytes***
After this change, they are formatted as:
<empty>,***%u leftover bytes***
Reviewed by Ethan Jackson <ethan@nicira.com>.
Previously, a GRE-over-IPsec tunnel was created as an interface with a
"type" of "gre" and the "other_config" column with "ipsec_cert" or
"ipsec_psk" set. This could lead to a potential security problem if a user
intended to create a GRE-over-IPsec tunnel, but misconfigured the
"ipsec_*" config and created an unencrypted GRE tunnel.
This commit defines an "ipsec_gre" tunnel type, which should prevent
users from inadvertently establishing insecure tunnels.
When "ovs-dpctl show" is run, return additional information about the
port. For example, tunnel ports will print the remote_ip, local_ip, and
in_key when defined.