Before commit e995e3df57ea (Allow OVS_USERSPACE_ATTR_USERDATA to be
variable length.) userdata attributes in userspace actions were expected
to be exactly 64 bits long. The kernel only actually enforced that they
were at least 64 bits long (the previously referenced commit's log message
contains misinformation on this account).
Initially this was no problem, because all of the userdata that userspace
actually used was exactly 8 bytes long. Commit 29089a540c (Implement IPFIX
export), however, exposed a problem by reducing the length of userdata for
IPFIX support to just 4 bytes. This meant that IPFIX no longer worked on
older datapaths, because the userdata was no longer at least 8 bytes long.
This commit fixes the problem by padding out userdata attributes less than
8 bytes long to 8 bytes.
CC: Romain Lenglet <rlenglet@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Romain Lenglet <rlenglet at vmware.com>
This function already had a few potential users, which this commit
converts. An upcoming commit adds more users.
Signed-off-by: Ben Pfaff <blp@nicira.com>
We should be looking at 'src_flow' instead of 'flow'. Otherwise,
parsing SCTP through odp_flow_key_to_mask will fail.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
tcp_flags=flags/mask
Bitwise match on TCP flags. The flags and mask are 16-bit num‐
bers written in decimal or in hexadecimal prefixed by 0x. Each
1-bit in mask requires that the corresponding bit in port must
match. Each 0-bit in mask causes the corresponding bit to be
ignored.
TCP protocol currently defines 9 flag bits, and additional 3
bits are reserved (must be transmitted as zero), see RFCs 793,
3168, and 3540. The flag bits are, numbering from the least
significant bit:
0: FIN No more data from sender.
1: SYN Synchronize sequence numbers.
2: RST Reset the connection.
3: PSH Push function.
4: ACK Acknowledgement field significant.
5: URG Urgent pointer field significant.
6: ECE ECN Echo.
7: CWR Congestion Windows Reduced.
8: NS Nonce Sum.
9-11: Reserved.
12-15: Not matchable, must be zero.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Tabs and spaces got mixed up, making the code harder to read.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This support is added through the userspace slow path, because we don't
judge that this is important enough to require permanent support in the
Linux kernel ABI.
Bug #19259.
CC: Teemu Koponen <koponen@nicira.com>
CC: Pankaj Thakkar <thakkar@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, OVS has expected that the datapath supports all the actions
required by any flow to be installed. There are at least two reasons why
a datapath might not support a given action:
- The datapath version is older than the userspace version, and the
action was introduced after the version of the datapath in use.
- The action is not considered important enough to implement as part of
an ABI that must be maintained forever.
This commit adds infrastructure to handle these cases. It doesn't actually
add any uses; that will come in an upcoming commit.
Signed-off-by: Ben Pfaff <blp@nicira.com>
It will soon be possible for a single flow to be slow pathed for multiple
reasons. This commit makes it possible to indicate more than one reason
to slow path a flow.
This commit is logically a revert of commit 98f0520fb2 (odp-util: Make
slow_path_reasons mutually exclusive.) but details have changed.
Signed-off-by: Ben Pfaff <blp@nicira.com>
With this commit, whenever the verbosity is enabled with '-m'
option, the ovs-dpctl dump-flows command will display the flows with
in_port field showing the name instead of a port number.
Conversely, one can also use a name in the in_port field with del-flow,
add-flow and mod-flow commands of ovs-dpctl. One should also be able
to use the port name when supplying the datapath flow as an input
to ofproto/trace command.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This allows for future patches to pass different tci values to
commit_vlan_action() without passing an entire flow structure.
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Rather than tracking the MPLS depth as a field in the
flow, which is an entirely poor place for it, just track
the delta to the MPLS depth during translation.
This logic was developed while implementing recirculation
and intended to be used to detect when recirculation should
occur. This variant of the patch uses the logic to determine
if processing of actions should stop due to an MPLS
action which cannot be translated (without recirculation).
A side-effect of this patch is that it resolves a bug
whereby ovs-vswitchd will abort due to to an assertion
on eth_type_mpls(ctx->xin->flow.dl_type) in compose_mpls_pop_action(()
if the actions of a flow include pop_mpls twice without
a push_mpls in between.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
In odp_flow_key_from_flow__(), when converting ICMPv6 flow/mask
to flow/mask key, we should always use flow to check for whether
ND informaition is present or not. In mask case, both type and code
field should be masked, otherwise ND fields can be masked.
Similarly in reverse conversion (parse_l2_5_onward()), we should
have same check.
Signed-off-by: Guolin Yang <gyang@nicira.com>
[blp@nicira.com changed && to || in parse_l2_5_onward()
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
With megaflow support, there is API to convert mask to nlattr key based
format. This change introduces API to do the reverse conversion. We
leverage the existing odp_flow_key_to_flow() API to reuse the code.
Signed-off-by: Guolin Yang <gyang@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The skb_mark field is currently only available with the Linux datapath
and is only used internally. However, it is desirable to expose this
through OpenFlow and when it is exposed ideally it would not be system-
specific. In preparation for this, skb_mark is rename to pkt_mark in
internal data structures for consistency.
This does not rename the Linux interfaces because doing so would break
the API. It would not necessarily be desirable to do anyways since in
Linux-specific code it is clearer to use the actual name rather than a
generic one. This can lead to confusion in some places, however, because
we do not always strictly separate generic and platform dependent code
(one example is actions). This seems inevitable though at this point if
the lower and upper layers have different names (as they must given the
above requirements).
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
The current Netlink protocol allows a default value of zero if either mark
or priority is not specified (this is part of the ABI). Until now, when
userspace serializes either the value or mask, it looked at the value and
omitted the netlink attribute if it is zero. This is a bug because an
exact match on zero turns into a wildcard of the field.
These two fields (plus input port and EtherType) are special because they
can be omitted whereas most other values are required to be fully
specified. These protocol variations tend to cause bugs (as above) when we
evolve the protocol because an exception that makes sense in one context
might not be logical in another. Since the default value for mark and
priority are merely shorthands, we can push the protocol in a more
consistent direction by ignoring the shortcut and always serializing the
values. This is what this commits does.
Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com added Jesse's text to the commit message]
Signed-off-by: Ben Pfaff <blp@nicira.com>
When verbose mode tuned on, all dp flow fields described by the netlink
attributes are displayed, including fully wildcarded attributes.
Otherwise, the fully wildcarded attributes are omitted for brevity.
Added -m option to "ovs-dpctl dump-flows" to enable verbose mode. It is
off by default.
Signed-off-by: Andy Zhou <azhou@nicira.com>
[blp@nicira.com added documentation]
Signed-off-by: Ben Pfaff <blp@nicira.com>
A tunnel value attribute is not allowed to have an empty IP destination
address but this is legal for masks. This drops both the checks for
serializing masks and also the sanity checks on them.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
This bug causes the flag mask to always mask only 1 bit, not the 2 bits
possible. While at it, make the top 6 bits exact match.
Bug #18834.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
This commit fixes the warning issued by 'clang' when pointer is casted
to one with greater alignment.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
For non-Ethernet II packets, we don't set an EtherType netlink attribute
and set the Ethertype mask attribute to 0xffff. The code was encoding
whatever mask was passed in, which could lead to bugs if the caller
didn't know the userspace-kernel interface.
Found by inspection.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
ovs-dpctl sometimes displays wildcarded fields as exact match. This
patch fixes those cases.
This patch implements the following logic. When OVS_FLOW_ATTR_MASK is
missing, the entire key attributes will be displayed as exact match fields.
When OVS_FLOW_ATTR_MASK is present, but some individual key attributes do
not have matching attributes in the mask, those key attributes will be
displayed as wildcarded fields.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
When converting the VLAN from a flow to an ODP key, the processing logic
would always store the VLAN ethertype. However, when handling a mask,
it should be a mask, not an ethertype. And since we don't support
bit-wise masking of the ethertype, just make it an exact-match mask.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Add a new function for converting a mask into a set of
OVS_KEY_ATTR* attributes. This will be useful in a future commit.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Until now, datapath ports and openflow ports were both represented by
unsigned integers of various sizes. With implicit conversions, etc., it is
easy to mix them up and use one where the other is expected. This commit
creates two typedefs, ofp_port_t and odp_port_t. Both of these two types
are marked by "__attribute__((bitwise))" so that sparse can be used to
detect any misuse.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Added support to allow mega flow specified and displayed. ovs-dpctl tool
is mainly used as debugging tool.
This patch also implements the low level user space routines to send
and receive mega flow netlink messages. Those netlink suppor
routines are required for forthcoming user space mega flow patches.
Added a unit test to test parsing and display of mega flows.
Ethan contributed the ovs-dpctl mega flow output function.
Co-authored-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
A number of use-cases weren't handled properly when determining what can
be wildcarded for megaflows. This commit both catches additional fields
that cannot be wildcarded and loosens a few other cases.
Bug #17979
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Returning a static data buffer makes code more brittle and definitely
not thread-safe, so this commit switches to using a caller-provided
buffer instead.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ansis Atteka <aatteka@nicira.com>
Rename tun_key_from_attr() as odp_tun_key_from_attr() and export it.
This is in preparation for calling this function outside of odp-util.c.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Flagged with: https://github.com/lyda/misspell-check
Run with: git ls-files | misspellings -f -
Signed-off-by: Andy Hill <hillad@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
It's no longer possible for a single datapath flow to be slow
pathed for two different reasons. This patch updates the code to
reflect this fact (marginally simplifying it).
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Before this patch, when in band control was enabled, every DHCP
packet had to be sent to userspace to calculate it's actions.
Those DHCP packets intended for the local port would have a special
action added to ensure they actually make it there. This
unnecessarily complicates the code, so this patch takes a slightly
different approach. When in-band is enabled, *all* DHCP packets
must be sent to the local port. This guarantees that
xlate_actions() returns the same result every time for a given
flow.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Before this patch, datapath keys with ODP_FIT_TO_LITTLE, would be
assigned subfacets and installed in the kernel with a SLOW_MATCH
slow path reason. This is problematic, because these flow keys
can't be reliable converted into a 'struct flow' thus breaking a
fundamental assumption of ofproto-dpif. This patch circumvents the
issue by skipping facet creation for these flows altogether. This
approach has the added benefit of simplifying the code for future
patches.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Traditionally, Open vSwitch has used a variant of 802.1ag "CFM" for
interface liveness detection. This has served us well until now,
but has several serious drawbacks which have steadily become more
inconvenient. First, the 802.1ag standard does not implement
several useful features forcing us to (optionally) break
compatibility. Second, 802.1.ag is not particularly popular
outside of carrier grade networking equipment. Third, 802.1ag is
simply quite awkward.
In an effort to solve the aforementioned problems, this patch
implements BFD which is ubiquitous, well designed, straight
forward, and implements required features in a standard way. The
initial cut of the protocol focuses on getting the basics of the
specification correct, leaving performance optimizations, and
advanced features as future work. The protocol should be
considered experimental pending future testing.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Note that OVS_KEY_ATTR_MPLS may be an array of ovs_key_mpls
and that the acceptable length may be restricted by the implementation.
Currently the user-space datapath and proposed kernel datapath
implementation restrict the length to a single element.
Also update the mpls_top_lse name of the element of struct ovs_key_mpls,
as it is an array of LSEs and thus not necessarily just the top LSE.
As requested by Jesse Gross
Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Define a new NXAST_SAMPLE OpenFlow vendor action and the corresponding
OFPACT_SAMPLE OVS action, to do per-flow packet sampling, translated
into a new SAMPLE "flow_sample" dp action.
Make the userspace action's userdata size vary depending on the union
member used. Add a new "flow_sample" upcall to do per-flow packet
sampling. Add a new "ipfix" upcall to do per-bridge packet sampling
to IPFIX collectors.
Extend the OVSDB schema to support configuring IPFIX collector sets.
Add support for configuring multiple IPFIX collectors for per-flow
packet sampling. Add support for configuring per-bridge IPFIX
sampling.
Automatically generate standard IPFIX entity definitions from the IANA
specs. Send one IPFIX data record message for every packet sampled by
an OpenFlow sample action or received by a bridge configured with
IPFIX sampling, and periodically send IPFIX template set messages.
Signed-off-by: Romain Lenglet <rlenglet@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Change the base flow only if a corresponding kernel action is generated
in commit_odp_tunnel_action().
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Currently, when flow attribute type is greater than OVS_KEY_ATTR_MAX,
we can write into a random memory address causing corruption. Fix it.
Bug #15702.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
* Update the order in which actions are committed and thus
appear in the datapath such that MPLS actions appear after
l3 and l4 (nw and port) actions.
In the case where an mpls_push action is present it should ensure
that l3 and l4 actions occur first, which seems logical as
once a mpls_push has occur the frame will be MPLS rather
than IPv4 or IPv6.
In the case where there is an mpls_pop action is present this should
not make any difference as the frame will have been MPLS to start with
and thus not satisfy the pre-requisites for l3 or l4 actions.
* Update commit_set_nw_action() to use the base ethertype when considering
eligibility to commit l3 (nw) actions. This allows l3 actions to be
applied so long as the frame was originally IPv4 or IPv6, even if
an mpls_push action will be applied and thus flow indicates the
frame will be MPLS.
* Make actions that may modify port or network information conditional on
the flow's ethernet type being an IP ethernet type. This is to ensure
that actions that modify network and port information do not occur
on non IP packets, for example if an mpls_push action has changed a
packet from IP to MPLS.
Note that modification of network data is already prevented by
virtue of commit_set_nw_action() only having cases for when the
ethernet type of the flow is IPV4 or IPV6. The conditionality
of network actions on the ethernet type has been added to
do_xlate_actions() to make it explicit.
* Add a check to commit_set_port_action() to ensure that the base
flow is IP. This protects against the case where move_reg is used
to change transport ports after an MPLS header is pushed.
Signed-off-by: Simon Horman <horms@verge.net.au>
[jesse: Add check for an IP protocol when committing L4 actions.]
Signed-off-by: Jesse Gross <jesse@nicira.com>
There were plans to use this in conjunction with inner/outer flows,
however that plan has been changed in favour of using recirculation.
This leaves us with the current usage.
encal_dl_type is currently only used to allow decoding of packets used in
the test suite. However, this is a bit of a fudge and the packets may be
provided as hexadecimal instead.
Also remove comments from parse_l2_5_onward() relating to MPLS which are
not in keeping with the commenting throughout the rest of the function.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Since userspace flow based tunneling code is checked in, the kernel
port based tunneling code can be removed.
Patch removes following components:
- tunnel ports hash table and moved tunnel ports list to individual
vports.
- Cleaned per tnl-port config.
- OVS_KEY_ATTR_TUN_ID action is removed.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #15078