When we format most netlink attributes we do so from the netlink
itself, iterating through each one and printing the contents out.
However, for tunnels we don't do this - we first convert to the
OVS userspace representation and then format that. While convienient,
this isn't really ideal as the primary use of printing netlink
attributes is debugging and this conversion is lossy, particularly
when the attributes aren't as expected. The result is that unexpected
keys are silently ignored and the level of detail on errors is
minimal.
This situation becomes worse when we introduce support for Geneve.
The conversion to userspace format requires additional information
which we might not have (ovs-dpctl) and is more complicated than
other attributes so it is likely to be confusing in the event of a
bug. The information from the kernel is self-describing so it's
much more reliable to display it directly from the netlink.
This converts tunnel attribute formatting to be more similar to
other types of attributes. As a nice bonus the output becomes
more compact because it doesn't print zeroed out attributes in
cases where they aren't relevant and therefore not present.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
When formatting netlink attributes if no mask is present a wildcarded
attribute is synthesized for the purposes of later processing. In
the case of nested attributes this must be done recursively, filling
in the correct attributes at each level rather than just generating
a set of zeros of the correct size. This is done already but it
always uses the attribute type for the top level keys - this corresponds
to nested ENCAP attributes. However, we have several levels of potentially
nested attributes for tunnels that each have their own types.
This uses an approach similar to the kernel where we have sets of
tables for the type of each attribute linked together by pointers.
This allows the mask generation function to automatically traverse
the nested attributes and always get the right types.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
This commit fixes a bug in the parse_flag() function which causes
failure of parsing tunnel flags like:
tunnel(tun_id=0x0,src=1.2.3.4,dst=1.2.3.5,tos=0,ttl=64,flags(-df+csum+key))
Reported-by: Jacob Cherkas <jcherkas@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The flags field is 16 bits so use network byte order in the
test case and use the proper conversion methods when parsing
and dumping.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
When we parse the text representation of the Geneve action the
header is not fully initialized. Besides the obvious potential
to generate an action that the user did not actually specify, this
also causes intermittent unit test failures when an action is
read in and printed out and the result is different.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Kernel based OVS recently added the ability to support checksums
for UDP based tunnels (Geneve and VXLAN). This adds similar support
for the userspace datapath to bring feature parity.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This adds basic userspace dataplane support for the Geneve
tunneling protocol. The rest of userspace only has the ability
to handle Geneve without options and this follows that pattern
for the time being. However, when the rest of userspace is updated
it should be easy to extend the dataplane as well.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Currently, the userspace VXLAN implementation contains the code
for generating and parsing both the UDP and VXLAN headers. This
pulls out the UDP portion for better layering and to make it
easier to support additional UDP based tunnels and features.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Currently when printing a userspace tunnel action for VXLAN, the
VNI is treated as a 32 bit field rather than 24 bit. Even if this
is the representation that we use internally, we should still show
the right VNI to avoid confusing people.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
The GRE checksum is a 16 bit field stored in a 32 bit option (the
rest is reserved). The current code treats the checksum as a 32-bit
field and places it in the right place for little endian systems but
not big endian. This fixes the problem by storing the 16 bit field
directly.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Recirculation id was scanned without a mask, which led to it being
ignored.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
ofpbuf was complicated due to its wide usage across all
layers of OVS, Now we have introduced independent dp_packet
which can be used for datapath packet, we can simplify ofpbuf.
Following patch removes DPDK mbuf and access API of ofpbuf
members.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Introduces two new NXMs to represent VXLAN-GBP [0] fields.
actions=load:0x10->NXM_NX_TUN_GBP_ID[],NORMAL
tun_gbp_id=0x10,actions=drop
This enables existing VXLAN tunnels to carry security label
information such as a SELinux context to other network peers.
The values are carried to/from the datapath using the attribute
OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS.
[0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy-00
Signed-off-by: Madhu Challa <challa@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
This patch adds set-field operations for nd_target, nd_sll, and nd_tll
fields, with and without masks, using Nicira extensions and OpenFlow 1.2
protocol.
Signed-off-by: Randall A Sharo <randall.sharo at navy.mil>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Commit 534a19b (dpctl: Add support for using UFID to add/del flows.)
introduced string parsing functions for UFIDs, but provided a broken
implementation where the upper 64 bits would be ignored, then the lower
64 bits would be read into both the lower and upper UFID positions. Fix
the implementation to read the upper bits properly.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Parse "ufid:<foo>" at the beginning of a flow specification and use it
for flow manipulation if present.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch modifies the dpif interface to allow flows to be manipulated
using a 128-bit identifier. This allows revalidator threads to perform
datapath operations faster, as they do not need to serialise the entire
flow key for operations like flow_get and flow_delete. In conjunction
with a future patch to simplify the dump interface, this provides a
significant performance benefit for revalidation.
When handlers assemble flow_put operations, they specify a unique
identifier (UFID) for each flow as it is passed down to the datapath to
be stored with the flow. The UFID is currently provided to handlers
by the dpif during upcall processing.
When revalidators assemble flow_get or flow_del operations, they may
specify the UFID for the flow along with the key. The dpif will decide
whether to send only the UFID to the datapath, or both the UFID and flow
key. The former is preferred for newer datapaths that support UFID,
while the latter is used for backwards compatibility.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.
Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Just because the ethertype is MPLS, this doesn't mean that the datapath
understands and provides OVS_KEY_ATTR_MPLS attributes for the flow.
Previously we would check the size of the OVS_KEY_ATTR_MPLS attribute
before checking whether the attribute is present. This would cause a
segfault in nl_attr_get_size(), usually triggered from a handler thread.
This patch brings the MPLS parsing code more in line with the rest of
the parse_l2_5_onward() function, by only processing MPLS if the
attribute is present.
Reported-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Previously we masked labels not present in the incoming packet.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This may be useful for debugging (with dpctl)
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
The frag member in the Netlink interface is an uint8_t enumeration
type, not a bitfield, so it should always be either fully masked or
not masked at all.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
When a whole field of a key value is ignored, skip it when formatting
the key, and allow it to be left out when parsing the key from a
string. However, when the 'verbose' formatting is requested those are
still formatted, as it may help in debugging.
Now the named key fields can also be given in arbitrary order.
Duplicate field values are not checked for, so the last one will
remain in effect.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
tp_src and tp_dst fields were recently added to struct flow_tnl, but
parsing and printing was missing.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Use the "+-" syntax more uniformly when printing masked flags, and use
the syntax of delimited 1-flags also for formatting fully masked TCP
flags.
The "+-" syntax only deals with masked flags, but if there are many of
those, the printout becomes long and confusing. Typically there are
many flags only when flags are fully masked, but even then most of
them are zeros, so it makes sense to print the flags that are set
(ones) and omit the zero flags.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Some attributes are exact matches even when all bits are not ones.
Make odp_mask_attr_is_exact() to return true if the mask is set for
all the bits we actually care about.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
is_all_zeros() and is_all_ones() operate on bytes, but just like with
memset, it is easier to use if the first argument is a void *.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Add a new action type OVS_ACTION_ATTR_SET_MASKED, and support for
parsing, printing, and committing them.
Masked set actions add a mask, immediately following the netlink
attribute data, within the netlink attribute itself. Thus the key
attribute size for a masked set action is exactly double of the
non-masked set action.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Current unconditional call may result in NULL being passed to
nl_msg_put_u32().
Cc: Andy Zhou <azhou@nicira.com>
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
tun_key_to_attr() accesses tp_src and tp_dst which are currently
uinitialized.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Extend IPFIX exporter to export tunnel headers when both input and output
of the port.
Add three other_config options in IPFIX table: enable-input-sampling,
enable-output-sampling and enable-tunnel-sampling, to control whether
sampling tunnel info, on which direction (input or output).
Insert sampling action before output action and the output tunnel port
is sent to datapath in the sampling action.
Make datapath collect output tunnel info and send it back to userpace
in upcall message with a new additional optional attribute.
Add a tunnel ports map to make the tunnel port lookup faster in sampling
upcalls in IPFIX exporter. Make the IPFIX exporter generate IPFIX template
sets with enterprise elements for the tunnel info, save the tunnel info
in IPFIX cache entries, and send IPFIX DATA with tunnel info.
Add flowDirection element in IPFIX templates.
Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
Acked-by: Romain Lenglet <rlenglet@vmware.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This adds support for Geneve - Generic Network Virtualization
Encapsulation. The protocol is documented at
http://tools.ietf.org/html/draft-gross-geneve-00
The kernel implementation is completely agnostic to the options
that are in use and can handle newly defined options without
further work. It does this by simply matching on a byte array
of options and allowing userspace to setup flows on this array.
Userspace currently implements only support for basic version of
Geneve. It can work with the base header (including the VNI) and
is capable of parsing options but does not currently support any
particular option definitions. Over time, the intention is to
allow options to be matched through OpenFlow without requiring
explicit support in OVS userspace.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Some tunnel formats have mechanisms for indicating that packets are
OAM frames that should be handled specially (either as high priority or
not forwarded beyond an endpoint). This provides support for allowing
those types of packets to be matched.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
In the case that an flow for an IP packet has an mpls_push action applied
the L3 and L4 portions of the flow will be cleared in flow_push_mpls().
Without this change commit_set_port_action() will set the tp_src and tp_dst
mask for the flow to all-ones because the base and flow port values no
longer match. Even though there will be no corresponding set action for the
ports; because the flow is no longer IP.
In this case where nw_proto is not part of the match this manifests
in a problem because the kernel datapath rejects flows whose masks
have non-zero values for tp_src or dp_dst if the nw_proto mask is
not all-ones.
This patch resolves this problem by having commit_set_port_action() return
without doing anything if flow->nw_proto is zero. The same logic is present
in commit_set_nw_action().
Also enhance one of the MPLS tests to exercise this logic. The enhanced
tests inputs a UDP packet with non-zero ports rather than an IP packet with
zeroed ports: zeroed ports cause commit_set_port_action() always return
without doing anything..
Commit 691d39b ("upcall: Remove redundant xlate_actions_for_side_effects().")
causes xlate_in_init() to be called for every packet that has an upcall.
This has the effect of indirectly calling commit_set_port_action() when
translating a controller action which may not have previously been the case
depending on the flow.
The result is that the behaviour described in the changelog above can be
exercised via a minor enhancement to one of the existing MPLS tests. This
illustrates that the problem exists for the user-space datapath whereas I
had previously incorrectly assumed it only manifested when using the kernel
datapath because I had only observed it there.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Jarno Rajahalme <jrajahame@nicira.com>
The userspace and kernel datapaths previously differed on their
treatment of the recirc_id and dp_hash fields when sending upcalls.
While the kernel datapath would always serialise these fields, the
userspace would not. When using the userspace datapath, this would cause
a mismatch between the odp flow key in an upcall compared to the one
that is serialised upon flow_dump.
This patch brings the userspace datapath behaviour back in line with the
kernel datapath by always serialising recirc_id and dp_hash to odp.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
The comment was very specific to one user of the function, and had a
typo. This change reflects the wider effect of the case.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Rename hash_bias to hash_basis to make it consistent with similar
usages.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Currently netlink flow (and mask) recirc_id attribute is only
serialized when the recirc_id value is non-zero. For this logic
to work correctly, the interpretation of the missing recirc_id
depends on whether the datapath supports recirculation.
This patch remove the ambiguity of the meaning of missing recirc_id
attribute in netlink message. When recirc_id is non-zero, or when
it is not a wildcard match, both key and mask attributes are
serialized. On the other hand, when recirc_id is zero, and being
wildcarded, they are not serialized. A missing recirc_id key and
mask attribute thus should always be interpreted as wildcard,
same as other flow fields.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Currently recirculation action can optionally compute hash. This patch
adds a hash action that is independent of the recirc action, which
no longer computes hash. For megaflow bond with recirc, the output
to a bond port action will look like:
hash(hash_l4(0)), recirc(<recirc_id>)
Obviously, when a recirculation application that does not depend on
hash value can just use the recirc action alone.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Reviewed-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Pravin B Shelar <pshelar@nicira.com