Cache the 'conn' context and use it when it is valid. The cached 'conn'
context will get reset if it is not expected to be valid; the cost to do
this is negligible. Besides being most optimal, this also handles corner
cases, such as decapsulation leading to the same tuple, as in tunnel VPN
cases. A negative test is added to check the resetting of the cached
'conn'.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When a packet is received, the RSS hash is calculated if it is not
already available. The Exact Match Cache (EMC) entry is then looked up
using this RSS hash.
When a MPLS encapsulated packet is received, the MPLS header is popped
and the packet is recirculated. Since the RSS hash has not been
invalidated here, the EMC lookup for all decapsulated packets will hit
the same entry even though these packets will have different tuple
values. This degrades performance severely as different inner packets
from the same MPLS tunnel would hit the same EMC entry.
This patch invalidates RSS hash (by resetting offload flags) after MPLS
header is popped.
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Currently OVS supports all ARP protocol fields as OXM match fields to
implement the relevant ARP procedures for IPv4. This includes support
for matching copying and setting ARP fields. In IPv6 ARP has been
replaced by ICMPv6 neighbor discovery (ND) procedures, neighbor
advertisement and neighbor solicitation.
The support for ICMPv6 fields in OVS is not complete for the use cases
equivalent to ARP in IPv4. OVS lacks support for matching, copying and
setting the “ND option type” and “ND reserved” fields. Without these user
cannot implement all ICMPv6 ND procedures for IPv6 support.
This commit adds additional OXM fields to OVS for ICMPv6 “ND option type“
and ICMPv6 “ND reserved” using the OXM extension mechanism. This allows
support for parsing these fields from an ICMPv6 packet header and extending
the OpenFlow protocol with specifications for these new OXM fields for
matching, copying and setting.
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
dp_packet_put_uninit() can reallocate the data buffer, so find the L4
header pointer afterward instead of before.
Found by Address Sanitizer.
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
I didn't realize until now that the tree had two different ways of parsing
strings in the form <host>:<port> and <port>:<host>. There are the
long-standing inet_parse_active() and inet_parse_passive() functions, and
more recently the ipv46_parse() function. This commit eliminates the
latter and changes the code to use the former.
The two implementations interpreted some input differently. In particular,
the older functions required IPv6 addresses to be [bracketed], but the
newer ones do not. For compatibility this patch changes the merged code to
use the more liberal interpretation.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
This patch changes OVS_KEY_ATTR_NSH
to nested attribute and adds three new NSH sub attribute keys:
OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
OVS_NSH_KEY_ATTR_MD1: for length-fixed MD type 1 context
OVS_NSH_KEY_ATTR_MD2: for length-variable MD type 2 metadata
Its intention is to align to NSH kernel implementation.
NSH match fields, set and PUSH_NSH action all use the below
nested attribute format:
OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD1
OVS_KEY_ATTR_NSH end
or
OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD2
OVS_KEY_ATTR_NSH end
In addition, NSH encap and decap actions are renamed as push_nsh
and pop_nsh to meet action naming convention.
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This change adds three new options to the Northbound
Logical_Router_Port's ipv6_ra_configs option:
* send_periodic: If set to "true", then OVN will send periodic router
advertisements out of this router port.
* max_interval: The maximum amount of time to wait between sending
periodic router advertisements.
* min_interval: The minimum amount of time to wait between sending
periodic router advertisements.
When send_periodic is true, then IPv6 RA configs, as well as some layer
2 and layer 3 information about the router port, are copied to the
southbound database. From there, ovn-controller can use this information
to know when to send periodic RAs and what to send in them.
Because periodic RAs originate from each ovn-controller, the new
keep-local flag is set on the packet so that ports don't receive an
overabundance of RAs.
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
FreeBSD insists that <sys/types.h> be included before <netinet/in.h> and
that <netinet/in.h> be included before <arpa/inet.h>. This adds guards to
the "sparse" headers to yield a warning if this order is violated. This
commit also adjusts the order of many #includes to suit this requirement.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Until now, ovs-vswitchd has insisted that other-config:datapath-id be
exactly 16 hex digits. This commit allows shorter forms prefixed by 0x.
This was prompted by Faucet controller examples such as this one:
https://github.com/faucetsdn/faucet/blob/master/docs/README_config.rst
which tend to suggest datapath IDs like 0x1.
CC: Josh Bailey <joshb@google.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit adjusts the NSH user space implementation in OVS to
the latest wire format defined in draft-ietf-sfc-nsh-28 (November 3
2017). The NSH_MDTYPE field was reduced from 8 to 4 bits. The FLAGS
field is reduced from 8 to 2 bits. A new 6 bit TTL header field is
added. The TTL field is set to 63 at encap(nsh).
Match and set_field support for the newly introduced TTL header field
and a corresponding dec_nsh_ttl action is not yet included and will be
implemented in a future patch.
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
- Fix 2 incorrect length checks
- Remove unnecessary limit of MD length to 16 bytes
- Remove incorrect comments stating MD2 was not supported
- Pad metadata in encap_nsh with zeroes if not multiple of 4 bytes
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
OVS has functions for parsing IPv4 addresses, parsing IPv4 addresses
with a port, and parsing IPv6 addresses. What is lacking though is a
function that can take an IPv4 or IPv6 address, with or without a port.
This commit adds ipv46_parse(), which breaks the given input string into
its component parts and stores them in a sockaddr_storage structure. The
function accepts flags that determine how it should behave if a port is
present in the input string.
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit adds translation and netdev datapath support for generic
encap and decap actions for the NSH MD1 header. The generic encap and
decap actions are mapped to specific encap_nsh and decap_nsh actions
in the datapath.
The translation follows that general scheme that decap() of an NSH
packet triggers recirculation after decapsulation, while encap(nsh)
just modifies struct flow and sets the ctx->pending_encap flag to
generate the encap_nsh action at the next commit to be able to include
subsequent set_field actions for NSH headers.
Support for the flexible MD2 format using TLV properties is foreseen
in encap(nsh), but not yet fully implemented.
The CLI syntax for encap of NSH is
encap(nsh(md_type=1))
encap(nsh(md_type=2[,tlv(<tlv_class>,<tlv_type>,<hex_string>),...]))
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Since ovs_nd_mtu_opt and ovs_nd_prefix_opt is introducted, rename
ovs_nd_opt to ovs_nd_lla_opt to specify it's Source/Target Link-layer
Address Option.
Signed-off-by: Zongkai LI <zealokii@gmail.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch introduces methods to compose a Router Advertisement (RA) packet,
introduces flags for RA. RA packet composed structures against specification
in RFC4861.
Caller can use compse_nd_ra_with_sll_mtu_opts to compose a RA packet with
Source Link-layer Address Option and MTU Option.
Caller can use packet_put_ra_prefix_opt to append a Prefix Information Option
to a RA packet.
Signed-off-by: Zongkai LI <zealokii@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as prepration for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.
The packet_type is a big-endian 32 bit integer with the encoding as
specified in OpenFlow verion 1.5.
The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:
enum ofp_header_type_namespaces {
OFPHTN_ONF = 0, /* ONF namespace. */
OFPHTN_ETHERTYPE = 1, /* ns_type is an Ethertype. */
OFPHTN_IP_PROTO = 2, /* ns_type is a IP protocol number. */
OFPHTN_UDP_TCP_PORT = 3, /* ns_type is a TCP or UDP port. */
OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */
};
The lower 16 bits specify the actual type in the context of the name space.
Only name spaces 0 and 1 will be supported for now.
For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.
In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet
whith the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.
The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the the dl_type field of struct flow.
This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.
New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The NAT changes in this series need both packet_set_ipv4_addr()
and packet_set_ipv6_addr() exporting, however, the ipv4 api was
exported with an unrelated patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Code is simplified when the ODP keys use the same type as the struct
flow for the IPv6 addresses. As the change is facilitated by
extract-odp-netlink-h, this change only affects the userspace. We
already do the same for the ethernet addresses.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
I measured the packet processing cost of OVS DPDK datapath for different
OpenFlow actions. I configured OVS to use a single pmd thread and
measured the packet throughput in a phy-to-phy setup. I used 10G
interfaces bounded to DPDK driver and overloaded the vSwitch with 64
byte packets through one of the 10G interfaces.
The processing cost of the dec_ttl action seemed to be gratuitously high
compared with other actions.
I looked into the code and saw that dec_ttl is encoded as a masked
nested attribute in OVS_ACTION_ATTR_SET_MASKED(OVS_KEY_ATTR_IPV4).
That way, OVS datapath can modify several IP header fields (TTL, TOS,
source and destination IP addresses) by a single invocation of
packet_set_ipv4() in the odp_set_ipv4() function in the
lib/odp-execute.c file. The packet_set_ipv4() function takes the new
TOS, TTL and IP addresses as arguments, compares them with the actual
ones and updates the fields if needed. This means, that even if only TTL
needs to be updated, each of the four IP header fields is passed to the
callee and is compared to the actual field for each packet.
The odp_set_ipv4() caller function possesses information about the
fields that need to be updated in the 'mask' structure. The idea is to
spare invocation of the packet_set_ipv4() function but use its code
parts directly. So the 'mask' can be used to decide which IP header
fields need to be updated. In addition, a faster packet processing can
be achieved if the values of local variables are
calculated right before their usage.
| T | T | I | I |
| T | O | P | P | Vanilla OVS || + new patch
| L | S | s | d | (nsec/packet) || (nsec/packet)
-------+---+---+---+---+---------------++---------------
output | | | | | 67.19 || 67.19
| X | | | | 74.48 || 68.78
| | X | | | 74.42 || 70.07
| | | X | | 84.62 || 78.03
| | | | X | 84.25 || 77.94
| | | X | X | 97.46 || 91.86
| X | | X | X | 100.42 || 96.00
| X | X | X | X | 102.80 || 100.73
The table shows the average processing cost of packets in nanoseconds
for the following actions:
output; output + dec_ttl; output + mod_nw_tos; output + mod_nw_src;
output + mod_nw_dst and some of their combinations.
I ran each test five times. The values are the mean of the readings
obtained.
I added OVS_LIKELY to the 'if' condition for the TTL field, since as far
as I know, this field will typically be decremented when any field of
the IP header is modified.
Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
When IGMP or MLD packets arrive their content is used without the checksum
being verified. With this change the checksum is verified, and the packet
is not used for multicast snooping on failure.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The command "ovn-nbctl lrp-add" should not set the MAC address
which length is invalid to logical router port. This patch
updates the eth_addr_from_string() to check trailing characters.
We should use the ovs_scan() to check the "addresses" owned by
the logical port, instead of eth_addr_from_string(). This patch
also updates the ovn-nbctl tests.
Signed-off-by: nickcooper-zhangtonghao <nickcooper-zhangtonghao@opencloud.tech>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch provides the command line to create a load balancer.
You can create a load balancer independently and add it to multiple
switches or routers. A single load balancer can have multiple vips.
Add a name column for the load balancer. With --add-duplicate,
the command really creates a new load balancer with a duplicate name.
This name has no special meaning or purpose other than to provide
convenience for human interaction with the ovn-nb database.
This patch also provides the unit tests and the documentation.
Signed-off-by: nickcooper-zhangtonghao <nickcooper-zhangtonghao@opencloud.tech>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Rename "compose_nd" and "compose_na" to "compose_nd_ns" and
"compose_nd_na", respecively, to be clearer about their functionality.
This will also make it more consistent when we add Neighbor Discover
Router Solicitation/Advertisement compose functions.
Also change the source and destination IPv6 addresses to take
"struct in6_addr" arguments, which are more common in the code base.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
To easily allow both in- and out-of-tree building of the Python
wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to
include/openvswitch. This also requires moving lib/{hmap,shash}.h.
Both hmap.h and shash.h were #include-ing "util.h" even though the
headers themselves did not use anything from there, but rather from
include/openvswitch/util.h. Fixing that required including util.h
in several C files mostly due to OVS_NOT_REACHED and things like
xmalloc.
Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
I presume the flags are supposed to map to neighbor discovery
advertisement "Router", "Solicited", and "Override" flags, which would
be "rso" instead of "rco".
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
These will have callers later.
This also rewrites ipv6_addr_bitand() to use newly defined macros.
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
This patch tries to support ND versus ARP for OVN.
It adds a new OVN action 'na' in ovn-controller side, and modify lflows
for 'na' action and relevant packets in ovn-northd.
First, for ovn-northd, it will generate lflows per each lport with its
IPv6 addresses and mac addresss, with 'na' action, such as:
match=(icmp6 && icmp6.type == 135 &&
(nd.target == fd81:ce49:a948:0:f816:3eff:fe46:8a42 ||
nd.target == fd81:ce49:b123:0:f816:3eff:fe46:8a42)),
action=(na { eth.src = fa:16:3e:46:8a:42; nd.tll = fa:16:3e:46:8a:42;
outport = inport;
inport = ""; /* Allow sending out inport. */ output; };)
and new lflows will be set in tabel ls_in_arp_nd_rsp, which is renamed
from previous ls_in_arp_rsp.
Later, for ovn-controller, when it received a ND packet, it frames a
template NA packet for reply. The NA packet will be initialized based on
ND packet, such as NA packet will use:
- ND packet eth.src as eth.dst,
- ND packet eth.dst as eth.src,
- ND packet ip6.src as ip6.dst,
- ND packet nd.target as ip6.src,
- ND packet eth.dst as nd.tll.
Finally, nested actions in 'na' action will update necessary fileds
for NA packet, such as:
- eth.src, nd.tll
- inport, outport
Since patch port for IPv6 router interface is not ready yet, this
patch will only try to deal with ND from VM. This patch will set
RSO flags to 011 for NA packets.
This patch also modified current ACL lflows for ND, not to do conntrack
on ND and NA packets in following tables:
- S_SWITCH_IN_PRE_ACL
- S_SWITCH_OUT_PRE_ACL
- S_SWITCH_IN_ACL
- S_SWITCH_OUT_ACL
Signed-off-by: Zong Kai LI <zealokii@gmail.com>
[blp@ovn.org made several minor simplifications and improvements]
Signed-off-by: Ben Pfaff <blp@ovn.org>
A zero prefix length is used to match any IP address, which is useful
for defining default routes.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Set and get functions for IP explicit congestion notification flag.
These function would be used by STT reassembly code.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
When using masked actions the ipv6_proto field of an action
to set IPv6 fields may be zero rather than the prevailing protocol
which will result in skipping checksum recalculation.
This patch resolves the problem by relying on the protocol
in the packet rather than that in the set field action.
A similar fix for the kernel datapath has been accepted into David Miller's
'net' tree as b4f70527f052 ("openvswitch: use flow protocol when
recalculating ipv6 checksums").
Cc: Jarno Rajahalme <jrajahalme@nicira.com>
Fixes: 6d670e7f0d ("lib/odp: Masked set action execution and printing.")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Following patch fixes number of issues with compose nd, like
setting ip packet header, set ICMP opt-len, checksum.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
An upcoming commit will use this as a building block in adding ARP support
to the OVN L3 logical router implementation.
Signed-off-by: Ben Pfaff <blp@ovn.org>
If a logical port has two ipv4 addresses and one ipv6 address
it will be stored as ["MAC IPv41 IPv42 IPv61"] instead of
["MAC IPv41", "MAC IPv42", "MAC IPv61"].
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
[blp@ovn.org made changes to comments and ovn.at]
Signed-off-by: Ben Pfaff <blp@ovn.org>
This saves some code and improves clarity, in my opinion.
Some of these changes just change an inet_pton() call into a similar
ip_parse() or ipv6_parse() call. In those cases the benefit is better
type safety, since inet_pton()'s output parameter is type "void *".
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
When doing push/pop and building tunnel header, do IPv6 route lookups and send
Neighbor Solicitations if needed.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Cc: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This allows code to be written more naturally in some cases.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Note that because there's been no prerequisite on the outer protocol,
we cannot add it now. Instead, treat the ipv4 and ipv6 dst fields in the way
that either both are null, or at most one of them is non-null.
[cascardo: abstract testing either dst with flow_tnl_dst_is_set]
cascardo: using IPv4-mapped address is an exercise for the future, since this
would require special handling of MFF_TUN_SRC and MFF_TUN_DST and OpenFlow
messages.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Co-authored-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
ipv6_string_mapped stores an IPv6 or IPv4 representation of an IPv6 address
into a string. If the address is IPv4-mapped, it's represented in IPv4
dotted-decimal format.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Extend OVS conntrack interface to cover NAT. New nested NAT action
may be included with a CT action. A bare NAT action only mangles
existing connections. If a NAT action with src or dst range attribute
is included, new (non-committed) connections are mangled according to
the NAT attributes.
This work extends on a branch by Thomas Graf at
https://github.com/tgraf/ovs/tree/nat.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>