2013-05-29 15:06:38 +09:00
|
|
|
/*
|
2014-12-23 23:42:05 +00:00
|
|
|
* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc.
|
2013-05-29 15:06:38 +09:00
|
|
|
* Copyright (c) 2013 Simon Horman
|
|
|
|
*
|
|
|
|
* Licensed under the Apache License, Version 2.0 (the "License");
|
|
|
|
* you may not use this file except in compliance with the License.
|
|
|
|
* You may obtain a copy of the License at:
|
|
|
|
*
|
|
|
|
* http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
*
|
|
|
|
* Unless required by applicable law or agreed to in writing, software
|
|
|
|
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
|
|
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
|
|
* See the License for the specific language governing permissions and
|
|
|
|
* limitations under the License.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include <config.h>
|
|
|
|
#include "odp-execute.h"
|
2017-11-06 14:42:32 -08:00
|
|
|
#include <sys/types.h>
|
2015-01-03 16:09:07 -08:00
|
|
|
#include <netinet/in.h>
|
2017-11-06 14:42:32 -08:00
|
|
|
#include <arpa/inet.h>
|
2014-12-23 23:42:05 +00:00
|
|
|
#include <netinet/icmp6.h>
|
2014-09-05 15:44:19 -07:00
|
|
|
#include <netinet/ip6.h>
|
2013-05-29 15:06:38 +09:00
|
|
|
#include <stdlib.h>
|
|
|
|
#include <string.h>
|
|
|
|
|
2015-02-25 12:01:53 -08:00
|
|
|
#include "dp-packet.h"
|
2013-12-30 15:58:58 -08:00
|
|
|
#include "dpif.h"
|
2013-05-29 15:06:38 +09:00
|
|
|
#include "netlink.h"
|
2014-08-04 11:11:40 -07:00
|
|
|
#include "odp-netlink.h"
|
2013-06-05 14:28:49 +09:00
|
|
|
#include "odp-util.h"
|
2013-05-29 15:06:38 +09:00
|
|
|
#include "packets.h"
|
2014-05-28 17:00:48 -07:00
|
|
|
#include "flow.h"
|
2013-10-09 17:37:30 -07:00
|
|
|
#include "unaligned.h"
|
2013-05-29 15:06:38 +09:00
|
|
|
#include "util.h"
|
odp-execute: Optimize IP header modification in OVS datapath
I measured the packet processing cost of OVS DPDK datapath for different
OpenFlow actions. I configured OVS to use a single pmd thread and
measured the packet throughput in a phy-to-phy setup. I used 10G
interfaces bound to the DPDK driver and overloaded the vSwitch with 64
byte packets through one of the 10G interfaces.
The processing cost of the dec_ttl action seemed to be gratuitously high
compared with other actions.
I looked into the code and saw that dec_ttl is encoded as a masked
nested attribute in OVS_ACTION_ATTR_SET_MASKED(OVS_KEY_ATTR_IPV4).
That way, OVS datapath can modify several IP header fields (TTL, TOS,
source and destination IP addresses) by a single invocation of
packet_set_ipv4() in the odp_set_ipv4() function in the
lib/odp-execute.c file. The packet_set_ipv4() function takes the new
TOS, TTL and IP addresses as arguments, compares them with the actual
ones and updates the fields if needed. This means that even if only TTL
needs to be updated, each of the four IP header fields is passed to the
callee and is compared to the actual field for each packet.
The odp_set_ipv4() caller function possesses information about the
fields that need to be updated in the 'mask' structure. The idea is to
spare invocation of the packet_set_ipv4() function but use its code
parts directly. So the 'mask' can be used to decide which IP header
fields need to be updated. In addition, packet processing can be made
faster if the values of local variables are calculated right before
their use.
       | T | T | I | I |
       | T | O | P | P |  Vanilla OVS  ||  + new patch
       | L | S | s | d | (nsec/packet) || (nsec/packet)
-------+---+---+---+---+---------------++---------------
output |   |   |   |   |     67.19     ||     67.19
       | X |   |   |   |     74.48     ||     68.78
       |   | X |   |   |     74.42     ||     70.07
       |   |   | X |   |     84.62     ||     78.03
       |   |   |   | X |     84.25     ||     77.94
       |   |   | X | X |     97.46     ||     91.86
       | X |   | X | X |    100.42     ||     96.00
       | X | X | X | X |    102.80     ||    100.73
The table shows the average processing cost of packets in nanoseconds
for the following actions:
output; output + dec_ttl; output + mod_nw_tos; output + mod_nw_src;
output + mod_nw_dst and some of their combinations.
I ran each test five times. The values are the mean of the readings
obtained.
I added OVS_LIKELY to the 'if' condition for the TTL field, since as far
as I know, this field will typically be decremented when any field of
the IP header is modified.
Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
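The idea described in this commit message (apply only the masked field, and patch the IPv4 header checksum incrementally instead of recomputing it) can be sketched in isolation. The following is a minimal illustration, not the OVS implementation: it works on a raw 20-byte header, and every name in it (fold16, sum16, ip_csum_fill, ip_csum_ok, csum_update16, set_ttl_masked) is hypothetical. The checksum update follows RFC 1624, eqn. 3, and uses the full 16-bit TTL/protocol word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets inside an option-less IPv4 header. */
enum { IP_HDR_LEN = 20, IP_OFF_TTL = 8, IP_OFF_CSUM = 10 };

/* Folds a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t
fold16(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Ones' complement sum of 'len' bytes ('len' must be even), read as
 * big-endian 16-bit words. */
static uint16_t
sum16(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2) {
        sum += ((uint32_t) p[i] << 8) | p[i + 1];
    }
    return fold16(sum);
}

/* Recomputes the header checksum from scratch and stores it in 'hdr'. */
static void
ip_csum_fill(uint8_t hdr[IP_HDR_LEN])
{
    uint16_t csum;

    hdr[IP_OFF_CSUM] = 0;
    hdr[IP_OFF_CSUM + 1] = 0;
    csum = (uint16_t) ~sum16(hdr, IP_HDR_LEN);
    hdr[IP_OFF_CSUM] = (uint8_t) (csum >> 8);
    hdr[IP_OFF_CSUM + 1] = (uint8_t) (csum & 0xff);
}

/* A header is valid when all words, checksum included, sum to 0xffff. */
static int
ip_csum_ok(const uint8_t hdr[IP_HDR_LEN])
{
    return sum16(hdr, IP_HDR_LEN) == 0xffff;
}

/* RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m'). */
static uint16_t
csum_update16(uint16_t csum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t) ~csum;

    sum += (uint16_t) ~old_word;
    sum += new_word;
    return (uint16_t) ~fold16(sum);
}

/* Masked TTL write: bits set in 'mask' come from 'key' (assumed to be
 * already masked), all other bits keep their old value.  The checksum is
 * touched only when the TTL really changes.  Returns 1 if 'hdr' changed. */
static int
set_ttl_masked(uint8_t hdr[IP_HDR_LEN], uint8_t key, uint8_t mask)
{
    uint8_t old_ttl = hdr[IP_OFF_TTL];
    uint8_t new_ttl = (uint8_t) (key | (old_ttl & (uint8_t) ~mask));
    uint16_t proto = hdr[IP_OFF_TTL + 1];
    uint16_t csum;

    if (!mask || new_ttl == old_ttl) {
        return 0;
    }
    csum = (uint16_t) ((hdr[IP_OFF_CSUM] << 8) | hdr[IP_OFF_CSUM + 1]);
    csum = csum_update16(csum, (uint16_t) ((old_ttl << 8) | proto),
                         (uint16_t) ((new_ttl << 8) | proto));
    hdr[IP_OFF_TTL] = new_ttl;
    hdr[IP_OFF_CSUM] = (uint8_t) (csum >> 8);
    hdr[IP_OFF_CSUM + 1] = (uint8_t) (csum & 0xff);
    return 1;
}
```

A caller would fill the checksum once with ip_csum_fill() and then apply set_ttl_masked() per packet; ip_csum_ok() should still hold afterwards, which is a cheap way to check that the incremental update agrees with a full recomputation.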
2016-12-13 17:27:37 +00:00
|
|
|
#include "csum.h"
|
2018-01-19 14:21:51 -05:00
|
|
|
#include "conntrack.h"
|
2013-05-29 15:06:38 +09:00
|
|
|
|
2014-09-05 15:44:19 -07:00
|
|
|
/* Masked copy of an ethernet address. 'src' is already properly masked. */
|
2013-05-29 15:06:38 +09:00
|
|
|
static void
|
2015-08-28 14:55:11 -07:00
|
|
|
ether_addr_copy_masked(struct eth_addr *dst, const struct eth_addr src,
|
|
|
|
const struct eth_addr mask)
|
2014-09-05 15:44:19 -07:00
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
2015-08-28 14:55:11 -07:00
|
|
|
for (i = 0; i < ARRAY_SIZE(dst->be16); i++) {
|
|
|
|
dst->be16[i] = src.be16[i] | (dst->be16[i] & ~mask.be16[i]);
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_eth_set_addrs(struct dp_packet *packet, const struct ovs_key_ethernet *key,
|
2014-09-05 15:44:19 -07:00
|
|
|
const struct ovs_key_ethernet *mask)
|
2013-05-29 15:06:38 +09:00
|
|
|
{
|
2017-04-25 16:29:59 +00:00
|
|
|
struct eth_header *eh = dp_packet_eth(packet);
|
2013-05-29 15:06:38 +09:00
|
|
|
|
2014-04-02 15:44:21 -07:00
|
|
|
if (eh) {
|
2014-09-05 15:44:19 -07:00
|
|
|
if (!mask) {
|
2015-08-28 14:55:11 -07:00
|
|
|
eh->eth_src = key->eth_src;
|
|
|
|
eh->eth_dst = key->eth_dst;
|
2014-09-05 15:44:19 -07:00
|
|
|
} else {
|
2015-08-28 14:55:11 -07:00
|
|
|
ether_addr_copy_masked(&eh->eth_src, key->eth_src, mask->eth_src);
|
|
|
|
ether_addr_copy_masked(&eh->eth_dst, key->eth_dst, mask->eth_dst);
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
2014-04-02 15:44:21 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
}
|
|
|
|
|
2014-09-05 15:44:19 -07:00
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_ipv4(struct dp_packet *packet, const struct ovs_key_ipv4 *key,
|
2014-09-05 15:44:19 -07:00
|
|
|
const struct ovs_key_ipv4 *mask)
|
|
|
|
{
|
2015-02-22 03:21:09 -08:00
|
|
|
struct ip_header *nh = dp_packet_l3(packet);
|
2016-12-13 17:27:37 +00:00
|
|
|
ovs_be32 ip_src_nh;
|
|
|
|
ovs_be32 ip_dst_nh;
|
|
|
|
ovs_be32 new_ip_src;
|
|
|
|
ovs_be32 new_ip_dst;
|
|
|
|
uint8_t new_tos;
|
|
|
|
uint8_t new_ttl;
|
2014-09-05 15:44:19 -07:00
|
|
|
|
2016-12-13 17:27:37 +00:00
|
|
|
if (mask->ipv4_src) {
|
|
|
|
ip_src_nh = get_16aligned_be32(&nh->ip_src);
|
|
|
|
new_ip_src = key->ipv4_src | (ip_src_nh & ~mask->ipv4_src);
|
|
|
|
|
|
|
|
if (ip_src_nh != new_ip_src) {
|
|
|
|
packet_set_ipv4_addr(packet, &nh->ip_src, new_ip_src);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (mask->ipv4_dst) {
|
|
|
|
ip_dst_nh = get_16aligned_be32(&nh->ip_dst);
|
|
|
|
new_ip_dst = key->ipv4_dst | (ip_dst_nh & ~mask->ipv4_dst);
|
|
|
|
|
|
|
|
if (ip_dst_nh != new_ip_dst) {
|
|
|
|
packet_set_ipv4_addr(packet, &nh->ip_dst, new_ip_dst);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (mask->ipv4_tos) {
|
|
|
|
new_tos = key->ipv4_tos | (nh->ip_tos & ~mask->ipv4_tos);
|
|
|
|
|
|
|
|
if (nh->ip_tos != new_tos) {
|
|
|
|
nh->ip_csum = recalc_csum16(nh->ip_csum,
|
|
|
|
htons((uint16_t) nh->ip_tos),
|
|
|
|
htons((uint16_t) new_tos));
|
|
|
|
nh->ip_tos = new_tos;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (OVS_LIKELY(mask->ipv4_ttl)) {
|
|
|
|
new_ttl = key->ipv4_ttl | (nh->ip_ttl & ~mask->ipv4_ttl);
|
|
|
|
|
|
|
|
if (OVS_LIKELY(nh->ip_ttl != new_ttl)) {
|
|
|
|
nh->ip_csum = recalc_csum16(nh->ip_csum, htons(nh->ip_ttl << 8),
|
|
|
|
htons(new_ttl << 8));
|
|
|
|
nh->ip_ttl = new_ttl;
|
|
|
|
}
|
|
|
|
}
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
|
|
|
|
2017-01-04 16:10:56 -08:00
|
|
|
static struct in6_addr *
|
|
|
|
mask_ipv6_addr(const ovs_16aligned_be32 *old, const struct in6_addr *addr,
|
|
|
|
const struct in6_addr *mask, struct in6_addr *masked)
|
2014-09-05 15:44:19 -07:00
|
|
|
{
|
2017-01-04 16:10:56 -08:00
|
|
|
#ifdef s6_addr32
|
2014-09-05 15:44:19 -07:00
|
|
|
for (int i = 0; i < 4; i++) {
|
2017-01-04 16:10:56 -08:00
|
|
|
masked->s6_addr32[i] = addr->s6_addr32[i]
|
|
|
|
| (get_16aligned_be32(&old[i]) & ~mask->s6_addr32[i]);
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
2017-01-04 16:10:56 -08:00
|
|
|
#else
|
|
|
|
const uint8_t *old8 = (const uint8_t *)old;
|
|
|
|
for (int i = 0; i < 16; i++) {
|
|
|
|
masked->s6_addr[i] = addr->s6_addr[i] | (old8[i] & ~mask->s6_addr[i]);
|
|
|
|
}
|
|
|
|
#endif
|
2014-09-05 15:44:19 -07:00
|
|
|
return masked;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_ipv6(struct dp_packet *packet, const struct ovs_key_ipv6 *key,
|
2014-09-05 15:44:19 -07:00
|
|
|
const struct ovs_key_ipv6 *mask)
|
|
|
|
{
|
2015-02-22 03:21:09 -08:00
|
|
|
struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(packet);
|
2017-01-04 16:10:56 -08:00
|
|
|
struct in6_addr sbuf, dbuf;
|
2014-09-05 15:44:19 -07:00
|
|
|
uint8_t old_tc = ntohl(get_16aligned_be32(&nh->ip6_flow)) >> 20;
|
|
|
|
ovs_be32 old_fl = get_16aligned_be32(&nh->ip6_flow) & htonl(0xfffff);
|
|
|
|
|
|
|
|
packet_set_ipv6(
|
|
|
|
packet,
|
2017-01-04 16:10:56 -08:00
|
|
|
mask_ipv6_addr(nh->ip6_src.be32, &key->ipv6_src, &mask->ipv6_src,
|
|
|
|
&sbuf),
|
|
|
|
mask_ipv6_addr(nh->ip6_dst.be32, &key->ipv6_dst, &mask->ipv6_dst,
|
|
|
|
&dbuf),
|
2014-09-05 15:44:19 -07:00
|
|
|
key->ipv6_tclass | (old_tc & ~mask->ipv6_tclass),
|
|
|
|
key->ipv6_label | (old_fl & ~mask->ipv6_label),
|
|
|
|
key->ipv6_hlimit | (nh->ip6_hlim & ~mask->ipv6_hlimit));
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_tcp(struct dp_packet *packet, const struct ovs_key_tcp *key,
|
2014-09-05 15:44:19 -07:00
|
|
|
const struct ovs_key_tcp *mask)
|
|
|
|
{
|
2015-02-22 03:21:09 -08:00
|
|
|
struct tcp_header *th = dp_packet_l4(packet);
|
2014-09-05 15:44:19 -07:00
|
|
|
|
2015-02-22 03:21:09 -08:00
|
|
|
if (OVS_LIKELY(th && dp_packet_get_tcp_payload(packet))) {
|
Fix setting transport ports with frags.
Packets with 'LATER' fragment do not have a transport header, so it is
not possible to either match on or set transport ports on such
packets. Matching is prevented by augmenting mf_are_prereqs_ok() with
a nw_frag 'LATER' bit check. Setting the transport headers on such
packets is prevented in three ways:
1. Flows with an explicit match on nw_frag, where the LATER bit is 1:
existing calls to the modified mf_are_prereqs_ok() prohibit using
transport header fields (port numbers) in OXM/NXM actions
(set_field, move). SET_TP_* actions need a new check on the LATER
bit.
2. Flows that wildcard the nw_frag LATER bit: At flow translation
time, add calls to mf_are_prereqs_ok() to make sure that we do not
use transport ports in flows that do not have them.
3. At action execution time, do not set transport ports, if the packet
does not have a full transport header. This ensures that we never
call the packet_set functions, that require a valid transport
header, with packets that do not have them. For example, if the
flow was created with an IPv6 first fragment that had the full TCP
header, but the next packet's first fragment is missing it.
3 alone would suffice for correct behavior, but 1 and 2 seem like the
right thing to do anyway.
Currently, if we are setting port numbers, we will also match them,
due to us tracking the set fields with the same flow_wildcards as the
matched fields. Hence, if the incoming port number was not zero, the
flow would not match any packets with missing or truncated transport
headers. However, relying on no packets having zero port numbers
would not be very robust. Also, we may separate the tracking of set
and matched fields in the future, which would allow some flows that
blindly set port numbers to not match on them at all.
For TCP in case 3 we use ofpbuf_get_tcp_payload() that requires the
whole (potentially variable size) TCP header to be present. However,
when parsing a flow, we only require the fixed size portion of the TCP
header to be present, which would be enough to set the port numbers
and fix the TCP checksum.
Finally, we add tests testing the new behavior.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
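Point 3 of this commit message (do not touch the ports unless the whole transport header is present) can be illustrated with a small standalone sketch. It is not the OVS code: the helper names (tcp_payload, put_be16_masked, set_tcp_ports_masked) and the raw byte-buffer interface are assumptions made for the example, and the TCP checksum update that a real datapath needs is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_MIN_HDR_LEN 20u     /* Fixed part of the TCP header. */
#define TCP_OFF_DATA_OFFSET 12u /* Byte holding the 4-bit data offset. */

/* Returns a pointer to the TCP payload, or NULL when the 'l4_len' bytes at
 * 'l4' cannot hold the complete header, options included. */
static const uint8_t *
tcp_payload(const uint8_t *l4, size_t l4_len)
{
    size_t hdr_len;

    if (!l4 || l4_len < TCP_MIN_HDR_LEN) {
        return NULL;
    }
    hdr_len = (size_t) (l4[TCP_OFF_DATA_OFFSET] >> 4) * 4;
    if (hdr_len < TCP_MIN_HDR_LEN || hdr_len > l4_len) {
        return NULL;
    }
    return l4 + hdr_len;
}

/* Masked write of a big-endian 16-bit field: bits set in 'mask' come from
 * 'key' (assumed to be already masked), the rest are preserved. */
static void
put_be16_masked(uint8_t *field, uint16_t key, uint16_t mask)
{
    uint16_t old = (uint16_t) ((field[0] << 8) | field[1]);
    uint16_t val = (uint16_t) (key | (old & (uint16_t) ~mask));

    field[0] = (uint8_t) (val >> 8);
    field[1] = (uint8_t) (val & 0xff);
}

/* Sets the source and destination ports only if the full TCP header is
 * present.  Returns 1 if the ports were written, 0 if the segment was left
 * untouched.  Checksum maintenance is omitted from this sketch. */
static int
set_tcp_ports_masked(uint8_t *l4, size_t l4_len,
                     uint16_t key_src, uint16_t mask_src,
                     uint16_t key_dst, uint16_t mask_dst)
{
    if (!tcp_payload(l4, l4_len)) {
        return 0;
    }
    put_be16_masked(l4, key_src, mask_src);
    put_be16_masked(l4 + 2, key_dst, mask_dst);
    return 1;
}
```

The guard comes first on purpose: a later fragment or a truncated segment is skipped entirely, so the masked writes never run on bytes that are not a transport header.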
2014-11-05 10:10:13 -08:00
|
|
|
packet_set_tcp_port(packet,
|
|
|
|
key->tcp_src | (th->tcp_src & ~mask->tcp_src),
|
|
|
|
key->tcp_dst | (th->tcp_dst & ~mask->tcp_dst));
|
|
|
|
}
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_udp(struct dp_packet *packet, const struct ovs_key_udp *key,
|
2014-09-05 15:44:19 -07:00
|
|
|
const struct ovs_key_udp *mask)
|
|
|
|
{
|
2015-02-22 03:21:09 -08:00
|
|
|
struct udp_header *uh = dp_packet_l4(packet);
|
2014-09-05 15:44:19 -07:00
|
|
|
|
2015-02-22 03:21:09 -08:00
|
|
|
if (OVS_LIKELY(uh && dp_packet_get_udp_payload(packet))) {
|
2014-11-05 10:10:13 -08:00
|
|
|
packet_set_udp_port(packet,
|
|
|
|
key->udp_src | (uh->udp_src & ~mask->udp_src),
|
|
|
|
key->udp_dst | (uh->udp_dst & ~mask->udp_dst));
|
|
|
|
}
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_sctp(struct dp_packet *packet, const struct ovs_key_sctp *key,
|
2014-09-05 15:44:19 -07:00
|
|
|
const struct ovs_key_sctp *mask)
|
|
|
|
{
|
2015-02-22 03:21:09 -08:00
|
|
|
struct sctp_header *sh = dp_packet_l4(packet);
|
2014-09-05 15:44:19 -07:00
|
|
|
|
2015-02-22 03:21:09 -08:00
|
|
|
if (OVS_LIKELY(sh && dp_packet_get_sctp_payload(packet))) {
|
2014-11-05 10:10:13 -08:00
|
|
|
packet_set_sctp_port(packet,
|
|
|
|
key->sctp_src | (sh->sctp_src & ~mask->sctp_src),
|
|
|
|
key->sctp_dst | (sh->sctp_dst & ~mask->sctp_dst));
|
|
|
|
}
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
|
|
|
|
2013-05-29 15:06:38 +09:00
|
|
|
static void
|
2013-06-05 14:28:49 +09:00
|
|
|
odp_set_tunnel_action(const struct nlattr *a, struct flow_tnl *tun_key)
|
|
|
|
{
|
2018-01-31 11:23:24 -08:00
|
|
|
ovs_assert(odp_tun_key_from_attr(a, tun_key) != ODP_FIT_ERROR);
|
2013-06-05 14:28:49 +09:00
|
|
|
}
|
|
|
|
|
2013-10-09 17:37:30 -07:00
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
set_arp(struct dp_packet *packet, const struct ovs_key_arp *key,
|
2014-09-05 15:44:19 -07:00
|
|
|
const struct ovs_key_arp *mask)
|
2013-10-09 17:37:30 -07:00
|
|
|
{
|
2015-02-22 03:21:09 -08:00
|
|
|
struct arp_eth_header *arp = dp_packet_l3(packet);
|
2013-10-09 17:37:30 -07:00
|
|
|
|
2014-09-05 15:44:19 -07:00
|
|
|
if (!mask) {
|
|
|
|
arp->ar_op = key->arp_op;
|
2015-08-28 14:55:11 -07:00
|
|
|
arp->ar_sha = key->arp_sha;
|
2014-09-05 15:44:19 -07:00
|
|
|
put_16aligned_be32(&arp->ar_spa, key->arp_sip);
|
2015-08-28 14:55:11 -07:00
|
|
|
arp->ar_tha = key->arp_tha;
|
2014-09-05 15:44:19 -07:00
|
|
|
put_16aligned_be32(&arp->ar_tpa, key->arp_tip);
|
|
|
|
} else {
|
|
|
|
ovs_be32 ar_spa = get_16aligned_be32(&arp->ar_spa);
|
|
|
|
ovs_be32 ar_tpa = get_16aligned_be32(&arp->ar_tpa);
|
|
|
|
|
|
|
|
arp->ar_op = key->arp_op | (arp->ar_op & ~mask->arp_op);
|
2015-08-28 14:55:11 -07:00
|
|
|
ether_addr_copy_masked(&arp->ar_sha, key->arp_sha, mask->arp_sha);
|
2014-09-05 15:44:19 -07:00
|
|
|
put_16aligned_be32(&arp->ar_spa,
|
|
|
|
key->arp_sip | (ar_spa & ~mask->arp_sip));
|
2015-08-28 14:55:11 -07:00
|
|
|
ether_addr_copy_masked(&arp->ar_tha, key->arp_tha, mask->arp_tha);
|
2014-09-05 15:44:19 -07:00
|
|
|
put_16aligned_be32(&arp->ar_tpa,
|
|
|
|
key->arp_tip | (ar_tpa & ~mask->arp_tip));
|
|
|
|
}
|
2013-10-09 17:37:30 -07:00
|
|
|
}
|
|
|
|
|
2014-12-23 23:42:05 +00:00
|
|
|
static void
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_nd(struct dp_packet *packet, const struct ovs_key_nd *key,
|
2014-12-23 23:42:05 +00:00
|
|
|
const struct ovs_key_nd *mask)
|
|
|
|
{
|
2015-02-22 03:21:09 -08:00
|
|
|
const struct ovs_nd_msg *ns = dp_packet_l4(packet);
|
2017-05-04 20:42:54 +05:30
|
|
|
const struct ovs_nd_lla_opt *lla_opt = dp_packet_get_nd_payload(packet);
|
2014-12-23 23:42:05 +00:00
|
|
|
|
2017-05-04 20:42:54 +05:30
|
|
|
if (OVS_LIKELY(ns && lla_opt)) {
|
2015-02-22 03:21:09 -08:00
|
|
|
int bytes_remain = dp_packet_l4_size(packet) - sizeof(*ns);
|
2017-01-04 16:10:56 -08:00
|
|
|
struct in6_addr tgt_buf;
|
2015-08-28 14:55:11 -07:00
|
|
|
struct eth_addr sll_buf = eth_addr_zero;
|
|
|
|
struct eth_addr tll_buf = eth_addr_zero;
|
2014-12-23 23:42:05 +00:00
|
|
|
|
2017-05-04 20:42:54 +05:30
|
|
|
while (bytes_remain >= ND_LLA_OPT_LEN && lla_opt->len != 0) {
|
|
|
|
if (lla_opt->type == ND_OPT_SOURCE_LINKADDR
|
|
|
|
&& lla_opt->len == 1) {
|
|
|
|
sll_buf = lla_opt->mac;
|
2015-08-28 14:55:11 -07:00
|
|
|
ether_addr_copy_masked(&sll_buf, key->nd_sll, mask->nd_sll);
|
2014-12-23 23:42:05 +00:00
|
|
|
|
|
|
|
/* A packet can only contain one SLL or TLL option */
|
|
|
|
break;
|
2017-05-04 20:42:54 +05:30
|
|
|
} else if (lla_opt->type == ND_OPT_TARGET_LINKADDR
|
|
|
|
&& lla_opt->len == 1) {
|
|
|
|
tll_buf = lla_opt->mac;
|
2015-08-28 14:55:11 -07:00
|
|
|
ether_addr_copy_masked(&tll_buf, key->nd_tll, mask->nd_tll);
|
2014-12-23 23:42:05 +00:00
|
|
|
|
|
|
|
/* A packet can only contain one SLL or TLL option */
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2017-05-04 20:42:54 +05:30
|
|
|
bytes_remain -= lla_opt->len * ND_LLA_OPT_LEN;
|
|
|
|
lla_opt += lla_opt->len;
|
2014-12-23 23:42:05 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
packet_set_nd(packet,
|
2017-01-04 16:10:56 -08:00
|
|
|
mask_ipv6_addr(ns->target.be32, &key->nd_target,
|
|
|
|
&mask->nd_target, &tgt_buf),
|
2014-12-23 23:42:05 +00:00
|
|
|
sll_buf,
|
|
|
|
tll_buf);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
userspace: Add support for NSH MD1 match fields
This patch adds support for NSH packet header fields to the OVS
control plane and the userspace datapath. Initially we support the
fields of the NSH base header as defined in
https://www.ietf.org/id/draft-ietf-sfc-nsh-13.txt
and the fixed context headers specified for metadata format MD1.
The variable length MD2 format is parsed but the TLV context headers
are not yet available for matching.
The NSH fields are modelled as experimenter fields with the dedicated
experimenter class 0x005ad650 proposed for NSH in ONF. The following
fields are defined:
NXOXM code            ofctl name    Size    Comment
=====================================================================
NXOXM_NSH_FLAGS       nsh_flags       8     Bits 2-9 of 1st NSH word
(0x005ad650,1)
NXOXM_NSH_MDTYPE      nsh_mdtype      8     Bits 16-23
(0x005ad650,2)
NXOXM_NSH_NEXTPROTO   nsh_np          8     Bits 24-31
(0x005ad650,3)
NXOXM_NSH_SPI         nsh_spi        24     Bits 0-23 of 2nd NSH word
(0x005ad650,4)
NXOXM_NSH_SI          nsh_si          8     Bits 24-31
(0x005ad650,5)
NXOXM_NSH_C1          nsh_c1         32     Maskable, nsh_mdtype==1
(0x005ad650,6)
NXOXM_NSH_C2          nsh_c2         32     Maskable, nsh_mdtype==1
(0x005ad650,7)
NXOXM_NSH_C3          nsh_c3         32     Maskable, nsh_mdtype==1
(0x005ad650,8)
NXOXM_NSH_C4          nsh_c4         32     Maskable, nsh_mdtype==1
(0x005ad650,9)
Co-authored-by: Johnson Li <johnson.li@intel.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
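The bit positions listed in the table above can be turned into a small field extractor. This is only a sketch under stated assumptions: it follows the layout quoted in this commit message (bit 0 is the most significant bit of each 32-bit word), the struct and function names (nsh_base_fields, get_be32, nsh_parse_base, nsh_mask_path_hdr) are hypothetical, and the header helpers actually used by this file may differ, since the code also handles a TTL. The masked path-header update reuses the "key | (old & ~mask)" pattern that the other setters in this file apply, with values in host byte order for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the NSH base and service path fields. */
struct nsh_base_fields {
    uint8_t flags;   /* Bits 2-9 of the 1st word. */
    uint8_t mdtype;  /* Bits 16-23 of the 1st word. */
    uint8_t np;      /* Bits 24-31 of the 1st word. */
    uint32_t spi;    /* Bits 0-23 of the 2nd word. */
    uint8_t si;      /* Bits 24-31 of the 2nd word. */
};

/* Reads a big-endian 32-bit word without alignment requirements. */
static uint32_t
get_be32(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
           | ((uint32_t) p[2] << 8) | p[3];
}

/* Extracts the fields from the first two 32-bit words of an NSH header. */
static struct nsh_base_fields
nsh_parse_base(const uint8_t hdr[8])
{
    uint32_t w0 = get_be32(hdr);
    uint32_t w1 = get_be32(hdr + 4);
    struct nsh_base_fields f;

    f.flags = (uint8_t) ((w0 >> 22) & 0xff);  /* 31 - 9 = 22. */
    f.mdtype = (uint8_t) ((w0 >> 8) & 0xff);  /* 31 - 23 = 8. */
    f.np = (uint8_t) (w0 & 0xff);
    f.spi = w1 >> 8;
    f.si = (uint8_t) (w1 & 0xff);
    return f;
}

/* Masked update of the path header word (SPI and SI together): bits set in
 * 'mask' come from 'key' (assumed to be already masked). */
static uint32_t
nsh_mask_path_hdr(uint32_t old, uint32_t key, uint32_t mask)
{
    return key | (old & ~mask);
}
```

With a mask of 0x000000ff, nsh_mask_path_hdr() rewrites only the service index and leaves the 24-bit service path identifier unchanged.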
2017-08-05 13:41:08 +08:00
|
|
|
/* Set the NSH header. Assumes the NSH header is present and matches the
|
|
|
|
* MD format of the key. The slow path must take care of that. */
|
|
|
|
static void
|
2018-01-11 13:24:02 +08:00
|
|
|
odp_set_nsh(struct dp_packet *packet, const struct nlattr *a, bool has_mask)
|
2017-08-05 13:41:08 +08:00
|
|
|
{
|
2018-01-11 13:24:02 +08:00
|
|
|
struct ovs_key_nsh key, mask;
|
2017-08-05 13:41:08 +08:00
|
|
|
struct nsh_hdr *nsh = dp_packet_l3(packet);
|
2017-11-07 00:40:21 +01:00
|
|
|
uint8_t mdtype = nsh_md_type(nsh);
|
2018-01-06 13:47:51 +08:00
|
|
|
ovs_be32 path_hdr;
|
2017-08-05 13:41:08 +08:00
|
|
|
|
2018-01-11 13:24:02 +08:00
|
|
|
if (has_mask) {
|
|
|
|
odp_nsh_key_from_attr(a, &key, &mask);
|
|
|
|
} else {
|
|
|
|
odp_nsh_key_from_attr(a, &key, NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!has_mask) {
|
|
|
|
nsh_set_flags_and_ttl(nsh, key.flags, key.ttl);
|
|
|
|
put_16aligned_be32(&nsh->path_hdr, key.path_hdr);
|
2017-11-07 00:40:21 +01:00
|
|
|
switch (mdtype) {
|
2017-08-05 13:41:08 +08:00
|
|
|
case NSH_M_TYPE1:
|
|
|
|
for (int i = 0; i < 4; i++) {
|
2018-01-11 13:24:02 +08:00
|
|
|
put_16aligned_be32(&nsh->md1.context[i], key.context[i]);
|
2017-08-05 13:41:08 +08:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
case NSH_M_TYPE2:
|
|
|
|
default:
|
2017-08-05 13:41:11 +08:00
|
|
|
/* No support for setting any other metadata format yet. */
|
|
|
|
break;
|
2017-08-05 13:41:08 +08:00
|
|
|
}
|
|
|
|
} else {
|
2018-01-11 13:24:01 +08:00
|
|
|
uint8_t flags = nsh_get_flags(nsh);
|
|
|
|
uint8_t ttl = nsh_get_ttl(nsh);
|
|
|
|
|
2018-01-11 13:24:02 +08:00
|
|
|
flags = key.flags | (flags & ~mask.flags);
|
|
|
|
ttl = key.ttl | (ttl & ~mask.ttl);
|
2018-01-11 13:24:01 +08:00
|
|
|
nsh_set_flags_and_ttl(nsh, flags, ttl);
|
2017-08-05 13:41:08 +08:00
|
|
|
|
2018-01-11 13:24:01 +08:00
|
|
|
uint32_t spi = ntohl(nsh_get_spi(nsh));
|
|
|
|
uint8_t si = nsh_get_si(nsh);
|
2018-01-11 13:24:02 +08:00
|
|
|
uint32_t spi_mask = nsh_path_hdr_to_spi_uint32(mask.path_hdr);
|
|
|
|
uint8_t si_mask = nsh_path_hdr_to_si(mask.path_hdr);
|
2018-01-06 13:47:51 +08:00
|
|
|
if (spi_mask == 0x00ffffff) {
|
|
|
|
spi_mask = UINT32_MAX;
|
|
|
|
}
|
2018-01-11 13:24:02 +08:00
|
|
|
spi = nsh_path_hdr_to_spi_uint32(key.path_hdr) | (spi & ~spi_mask);
|
|
|
|
si = nsh_path_hdr_to_si(key.path_hdr) | (si & ~si_mask);
|
2018-01-11 13:24:01 +08:00
|
|
|
path_hdr = nsh_get_path_hdr(nsh);
|
|
|
|
nsh_path_hdr_set_spi(&path_hdr, htonl(spi));
|
|
|
|
nsh_path_hdr_set_si(&path_hdr, si);
|
2017-08-05 13:41:08 +08:00
|
|
|
put_16aligned_be32(&nsh->path_hdr, path_hdr);
|
2017-11-07 00:40:21 +01:00
|
|
|
switch (mdtype) {
|
2017-08-05 13:41:08 +08:00
|
|
|
case NSH_M_TYPE1:
|
|
|
|
for (int i = 0; i < 4; i++) {
|
2018-01-06 13:47:51 +08:00
|
|
|
ovs_be32 p = get_16aligned_be32(&nsh->md1.context[i]);
|
2018-01-11 13:24:02 +08:00
|
|
|
ovs_be32 k = key.context[i];
|
|
|
|
ovs_be32 m = mask.context[i];
|
2018-01-06 13:47:51 +08:00
|
|
|
put_16aligned_be32(&nsh->md1.context[i], k | (p & ~m));
|
2017-08-05 13:41:08 +08:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
case NSH_M_TYPE2:
|
|
|
|
default:
|
2017-08-05 13:41:11 +08:00
|
|
|
/* No support for setting any other metadata format yet. */
|
|
|
|
break;
|
2017-08-05 13:41:08 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
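The masked branch of the function above updates each field with the pattern `key | (old & ~mask)`: bits covered by the mask come from the key, and the remaining bits keep the value already in the packet. The following is a minimal standalone sketch of that pattern applied to a service path header, not the OVS implementation. It assumes a host-byte-order 32-bit word with the SPI in the upper 24 bits and the SI in the lower 8 bits; the helper names (`masked_set_u32`, `path_hdr_pack`, `path_hdr_spi`, `path_hdr_si`, `path_hdr_masked_set`) are hypothetical and do not exist in OVS. Where the real code widens a full 24-bit SPI mask to `UINT32_MAX`, this sketch simply truncates the merged SPI to 24 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Bits selected by 'mask' are taken from 'key'; all other bits keep their
 * value from 'old'.  Assumes 'key' has no bits set outside 'mask'. */
static inline uint32_t
masked_set_u32(uint32_t old, uint32_t key, uint32_t mask)
{
    return key | (old & ~mask);
}

/* Hypothetical host-byte-order path header layout (an assumption of this
 * sketch): SPI in the upper 24 bits, SI in the lower 8 bits. */
static inline uint32_t
path_hdr_pack(uint32_t spi, uint8_t si)
{
    assert(spi <= 0x00ffffff);      /* SPI is a 24-bit field. */
    return (spi << 8) | si;
}

static inline uint32_t
path_hdr_spi(uint32_t path_hdr)
{
    return path_hdr >> 8;
}

static inline uint8_t
path_hdr_si(uint32_t path_hdr)
{
    return path_hdr & 0xff;
}

/* Applies the masked update to SPI and SI separately, then repacks them. */
static inline uint32_t
path_hdr_masked_set(uint32_t old, uint32_t key, uint32_t mask)
{
    uint32_t spi = masked_set_u32(path_hdr_spi(old), path_hdr_spi(key),
                                  path_hdr_spi(mask));
    uint8_t si = (uint8_t) masked_set_u32(path_hdr_si(old), path_hdr_si(key),
                                          path_hdr_si(mask));

    return path_hdr_pack(spi & 0x00ffffff, si);
}
```

With a mask that covers only the SI byte, `path_hdr_masked_set()` rewrites the service index and leaves the SPI untouched; with a mask that covers only the SPI, the reverse holds.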
|
|
|
|
|
2013-06-05 14:28:49 +09:00
|
|
|
static void
|
2015-02-25 12:01:53 -08:00
|
|
|
odp_execute_set_action(struct dp_packet *packet, const struct nlattr *a)
|
2013-05-29 15:06:38 +09:00
|
|
|
{
|
|
|
|
enum ovs_key_attr type = nl_attr_type(a);
|
|
|
|
const struct ovs_key_ipv4 *ipv4_key;
|
|
|
|
const struct ovs_key_ipv6 *ipv6_key;
|
2014-10-03 20:23:58 -07:00
|
|
|
struct pkt_metadata *md = &packet->md;
|
2013-05-29 15:06:38 +09:00
|
|
|
|
|
|
|
switch (type) {
|
|
|
|
case OVS_KEY_ATTR_PRIORITY:
|
2013-12-30 15:58:58 -08:00
|
|
|
md->skb_priority = nl_attr_get_u32(a);
|
2013-06-05 14:28:49 +09:00
|
|
|
break;
|
|
|
|
|
2013-05-29 15:06:38 +09:00
|
|
|
case OVS_KEY_ATTR_TUNNEL:
|
2013-12-30 15:58:58 -08:00
|
|
|
odp_set_tunnel_action(a, &md->tunnel);
|
2013-06-05 14:28:49 +09:00
|
|
|
break;
|
|
|
|
|
2013-05-29 15:06:38 +09:00
|
|
|
case OVS_KEY_ATTR_SKB_MARK:
|
2013-12-30 15:58:58 -08:00
|
|
|
md->pkt_mark = nl_attr_get_u32(a);
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_ETHERNET:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_eth_set_addrs(packet, nl_attr_get(a), NULL);
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
2018-01-06 13:47:51 +08:00
|
|
|
case OVS_KEY_ATTR_NSH: {
|
2018-01-11 13:24:02 +08:00
|
|
|
odp_set_nsh(packet, a, false);
|
2017-08-05 13:41:08 +08:00
|
|
|
break;
|
2018-01-06 13:47:51 +08:00
|
|
|
}
|
2017-08-05 13:41:08 +08:00
|
|
|
|
2013-05-29 15:06:38 +09:00
|
|
|
case OVS_KEY_ATTR_IPV4:
|
|
|
|
ipv4_key = nl_attr_get_unspec(a, sizeof(struct ovs_key_ipv4));
|
2015-02-22 03:21:09 -08:00
|
|
|
packet_set_ipv4(packet, ipv4_key->ipv4_src,
|
2014-06-23 11:43:57 -07:00
|
|
|
ipv4_key->ipv4_dst, ipv4_key->ipv4_tos,
|
|
|
|
ipv4_key->ipv4_ttl);
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_IPV6:
|
|
|
|
ipv6_key = nl_attr_get_unspec(a, sizeof(struct ovs_key_ipv6));
|
2017-01-04 16:10:56 -08:00
|
|
|
packet_set_ipv6(packet, &ipv6_key->ipv6_src, &ipv6_key->ipv6_dst,
|
2014-06-23 11:43:57 -07:00
|
|
|
ipv6_key->ipv6_tclass, ipv6_key->ipv6_label,
|
|
|
|
ipv6_key->ipv6_hlimit);
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_TCP:
|
2015-02-22 03:21:09 -08:00
|
|
|
if (OVS_LIKELY(dp_packet_get_tcp_payload(packet))) {
|
Fix setting transport ports with frags.
Packets with 'LATER' fragment do not have a transport header, so it is
not possible to either match on or set transport ports on such
packets. Matching is prevented by augmenting mf_are_prereqs_ok() with
a nw_frag 'LATER' bit check. Setting the transport headers on such
packets is prevented in three ways:
1. Flows with an explicit match on nw_frag, where the LATER bit is 1:
   existing calls to the modified mf_are_prereqs_ok() prohibit using
   transport header fields (port numbers) in OXM/NXM actions
   (set_field, move). SET_TP_* actions need a new check on the LATER
   bit.
2. Flows that wildcard the nw_frag LATER bit: at flow translation
   time, add calls to mf_are_prereqs_ok() to make sure that we do not
   use transport ports in flows that do not have them.
3. At action execution time, do not set transport ports if the packet
   does not have a full transport header. This ensures that we never
   call the packet_set functions, which require a valid transport
   header, on packets that do not have one. For example, the
   flow may have been created from an IPv6 first fragment that had the
   full TCP header, while the next packet's first fragment is missing it.
Step 3 alone would suffice for correct behavior, but steps 1 and 2 seem
like the right thing to do anyway.
Currently, if we are setting port numbers, we will also match them,
due to us tracking the set fields with the same flow_wildcards as the
matched fields. Hence, if the incoming port number was not zero, the
flow would not match any packets with missing or truncated transport
headers. However, relying on no packets having zero port numbers
would not be very robust. Also, we may separate the tracking of set
and matched fields in the future, which would allow some flows that
blindly set port numbers to not match on them at all.
For TCP in case 3 we use ofpbuf_get_tcp_payload() that requires the
whole (potentially variable size) TCP header to be present. However,
when parsing a flow, we only require the fixed size portion of the TCP
header to be present, which would be enough to set the port numbers
and fix the TCP checksum.
Finally, we add tests testing the new behavior.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-11-05 10:10:13 -08:00
|
|
|
const struct ovs_key_tcp *tcp_key
|
|
|
|
= nl_attr_get_unspec(a, sizeof(struct ovs_key_tcp));
|
|
|
|
|
2015-02-22 03:21:09 -08:00
|
|
|
packet_set_tcp_port(packet, tcp_key->tcp_src,
|
2014-11-05 10:10:13 -08:00
|
|
|
tcp_key->tcp_dst);
|
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
2013-06-11 11:05:55 +09:00
|
|
|
case OVS_KEY_ATTR_UDP:
|
2015-02-22 03:21:09 -08:00
|
|
|
if (OVS_LIKELY(dp_packet_get_udp_payload(packet))) {
|
2014-11-05 10:10:13 -08:00
|
|
|
const struct ovs_key_udp *udp_key
|
|
|
|
= nl_attr_get_unspec(a, sizeof(struct ovs_key_udp));
|
|
|
|
|
2015-02-22 03:21:09 -08:00
|
|
|
packet_set_udp_port(packet, udp_key->udp_src,
|
2014-11-05 10:10:13 -08:00
|
|
|
udp_key->udp_dst);
|
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
2013-08-22 20:24:44 +12:00
|
|
|
case OVS_KEY_ATTR_SCTP:
|
2015-02-22 03:21:09 -08:00
|
|
|
if (OVS_LIKELY(dp_packet_get_sctp_payload(packet))) {
|
Fix setting transport ports with frags.
Packets with 'LATER' fragment do not have a transport header, so it is
not possible to either match on or set transport ports on such
packets. Matching is prevented by augmenting mf_are_prereqs_ok() with
a nw_frag 'LATER' bit check. Setting the transport headers on such
packets is prevented in three ways:
1. Flows with an explicit match on nw_frag, where the LATER bit is 1:
existing calls to the modified mf_are_prereqs_ok() prohibit using
transport header fields (port numbers) in OXM/NXM actions
(set_field, move). SET_TP_* actions need a new check on the LATER
bit.
2. Flows that wildcard the nw_frag LATER bit: At flow translation
time, add calls to mf_are_prereqs_ok() to make sure that we do not
use transport ports in flows that do not have them.
3. At action execution time, do not set transport ports if the packet
does not have a full transport header. This ensures that we never
call the packet_set functions, which require a valid transport
header, on packets that do not have one. This can happen, for
example, if the flow was created from an IPv6 first fragment that
had the full TCP header, but the next packet's first fragment is
missing it.
Case 3 alone would suffice for correct behavior, but 1 and 2 seem like
the right thing to do anyway.
Currently, if we are setting port numbers, we will also match them,
due to us tracking the set fields with the same flow_wildcards as the
matched fields. Hence, if the incoming port number was not zero, the
flow would not match any packets with missing or truncated transport
headers. However, relying on no packets having zero port numbers
would not be very robust. Also, we may separate the tracking of set
and matched fields in the future, which would allow some flows that
blindly set port numbers to not match on them at all.
For TCP in case 3 we use ofpbuf_get_tcp_payload() that requires the
whole (potentially variable size) TCP header to be present. However,
when parsing a flow, we only require the fixed size portion of the TCP
header to be present, which would be enough to set the port numbers
and fix the TCP checksum.
Finally, we add tests testing the new behavior.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
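Case 3 in the commit message above (check for a complete transport header at execution time, and skip the set otherwise) can be sketched in isolation. This is a minimal illustration, not the OVS implementation: `struct toy_packet`, `toy_get_udp_header()` and `toy_set_udp_ports()` are hypothetical names invented for the sketch, and checksum updates are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified packet: a byte buffer plus the offset of its
 * transport header.  This is not OVS's struct dp_packet. */
struct toy_packet {
    uint8_t data[64];
    size_t size;        /* Number of valid bytes in 'data'. */
    size_t l4_ofs;      /* Offset of the transport header, or SIZE_MAX. */
};

#define TOY_UDP_HEADER_LEN 8

/* Returns a pointer to the UDP header if all 8 bytes of it are present,
 * otherwise NULL (for example, for a later fragment or a truncated packet). */
static uint8_t *
toy_get_udp_header(struct toy_packet *p)
{
    if (p->l4_ofs == SIZE_MAX || p->l4_ofs > p->size
        || p->size - p->l4_ofs < TOY_UDP_HEADER_LEN) {
        return NULL;
    }
    return p->data + p->l4_ofs;
}

/* Sets the UDP ports (given in network byte order) only when the header is
 * fully present.  Returns 1 if the packet was modified, 0 if it was left
 * untouched.  Checksum updates are omitted from this sketch. */
static int
toy_set_udp_ports(struct toy_packet *p, const uint8_t src[2],
                  const uint8_t dst[2])
{
    uint8_t *udp = toy_get_udp_header(p);

    if (!udp) {
        return 0;
    }
    memcpy(udp, src, 2);
    memcpy(udp + 2, dst, 2);
    return 1;
}
```

The guard plays the same role as the `OVS_LIKELY(dp_packet_get_..._payload(packet))` checks in the surrounding switch: the set is silently skipped rather than treated as an error, so a flow installed from a complete packet cannot corrupt a later truncated one.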
2014-11-05 10:10:13 -08:00
|
|
|
const struct ovs_key_sctp *sctp_key
|
|
|
|
= nl_attr_get_unspec(a, sizeof(struct ovs_key_sctp));
|
|
|
|
|
2015-02-22 03:21:09 -08:00
|
|
|
packet_set_sctp_port(packet, sctp_key->sctp_src,
|
2014-11-05 10:10:13 -08:00
|
|
|
sctp_key->sctp_dst);
|
|
|
|
}
|
2013-08-22 20:24:44 +12:00
|
|
|
break;
|
|
|
|
|
2013-06-11 11:05:55 +09:00
|
|
|
case OVS_KEY_ATTR_MPLS:
|
2015-02-22 03:21:09 -08:00
|
|
|
set_mpls_lse(packet, nl_attr_get_be32(a));
|
2014-09-05 15:44:19 -07:00
|
|
|
break;
|
2013-05-29 15:06:38 +09:00
|
|
|
|
2013-10-09 17:37:30 -07:00
|
|
|
case OVS_KEY_ATTR_ARP:
|
2015-02-22 03:21:09 -08:00
|
|
|
set_arp(packet, nl_attr_get(a), NULL);
|
2013-10-09 17:37:30 -07:00
|
|
|
break;
|
|
|
|
|
2015-10-20 22:03:02 -07:00
|
|
|
case OVS_KEY_ATTR_ICMP:
|
|
|
|
case OVS_KEY_ATTR_ICMPV6:
|
|
|
|
if (OVS_LIKELY(dp_packet_get_icmp_payload(packet))) {
|
|
|
|
const struct ovs_key_icmp *icmp_key
|
|
|
|
= nl_attr_get_unspec(a, sizeof(struct ovs_key_icmp));
|
|
|
|
|
|
|
|
packet_set_icmp(packet, icmp_key->icmp_type, icmp_key->icmp_code);
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
2014-12-23 23:42:05 +00:00
|
|
|
case OVS_KEY_ATTR_ND:
|
2015-02-22 03:21:09 -08:00
|
|
|
if (OVS_LIKELY(dp_packet_get_nd_payload(packet))) {
|
2014-12-23 23:42:05 +00:00
|
|
|
const struct ovs_key_nd *nd_key
|
|
|
|
= nl_attr_get_unspec(a, sizeof(struct ovs_key_nd));
|
2017-01-04 16:10:56 -08:00
|
|
|
packet_set_nd(packet, &nd_key->nd_target, nd_key->nd_sll,
|
2015-08-28 14:55:11 -07:00
|
|
|
nd_key->nd_tll);
|
2014-12-23 23:42:05 +00:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
2014-03-04 15:36:03 -08:00
|
|
|
case OVS_KEY_ATTR_DP_HASH:
|
2014-08-29 16:06:42 -07:00
|
|
|
md->dp_hash = nl_attr_get_u32(a);
|
2014-03-04 15:36:03 -08:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_RECIRC_ID:
|
|
|
|
md->recirc_id = nl_attr_get_u32(a);
|
|
|
|
break;
|
|
|
|
|
2013-06-11 11:05:55 +09:00
|
|
|
case OVS_KEY_ATTR_UNSPEC:
|
2017-06-02 16:16:17 +00:00
|
|
|
case OVS_KEY_ATTR_PACKET_TYPE:
|
2013-06-11 11:05:55 +09:00
|
|
|
case OVS_KEY_ATTR_ENCAP:
|
|
|
|
case OVS_KEY_ATTR_ETHERTYPE:
|
|
|
|
case OVS_KEY_ATTR_IN_PORT:
|
|
|
|
case OVS_KEY_ATTR_VLAN:
|
2013-10-28 13:54:40 -07:00
|
|
|
case OVS_KEY_ATTR_TCP_FLAGS:
|
Add support for connection tracking.
This patch adds a new action and fields to OVS that allow connection
tracking to be performed. This support works in conjunction with the
Linux kernel support merged into the Linux-4.3 development cycle.
Packets have two possible states with respect to connection tracking:
Untracked packets have not previously passed through the connection
tracker, while tracked packets have previously been through the
connection tracker. For OpenFlow pipeline processing, untracked packets
can become tracked, and they will remain tracked until the end of the
pipeline. Tracked packets cannot become untracked.
Connections can be unknown, uncommitted, or committed. Packets which are
untracked have unknown connection state. To know the connection state,
the packet must become tracked. Uncommitted connections have no
connection state stored about them, so it is only possible for the
connection tracker to identify whether they are a new connection or
whether they are invalid. Committed connections have connection state
stored beyond the lifetime of the packet, which allows later packets in
the same connection to be identified as part of the same established
connection, or related to an existing connection - for instance ICMP
error responses.
The new 'ct' action transitions the packet from "untracked" to
"tracked" by sending this flow through the connection tracker.
The following parameters are supported initially:
- "commit": When commit is executed, the connection moves from
uncommitted state to committed state. This signals that information
about the connection should be stored beyond the lifetime of the
packet within the pipeline. This allows future packets in the same
connection to be recognized as part of the same "established" (est)
connection, as well as identifying packets in the reply (rpl)
direction, or packets related to an existing connection (rel).
- "zone=[u16|NXM]": Perform connection tracking in the zone specified.
Each zone is an independent connection tracking context. When the
"commit" parameter is used, the connection will only be committed in
the specified zone, and not in other zones. This is 0 by default.
- "table=NUMBER": Fork pipeline processing in two. The original instance
of the packet will continue processing the current actions list as an
untracked packet. An additional instance of the packet will be sent to
the connection tracker, which will be re-injected into the OpenFlow
pipeline to resume processing in the specified table, with the
ct_state and other ct match fields set. If the table is not specified,
then the packet is submitted to the connection tracker, but the
pipeline does not fork and the ct match fields are not populated. It
is strongly recommended to specify a table later than the current
table to prevent loops.
When the "table" option is used, the packet that continues processing in
the specified table will have the ct_state populated. The ct_state may
have any of the following flags set:
- Tracked (trk): Connection tracking has occurred.
- Reply (rpl): The flow is in the reply direction.
- Invalid (inv): The connection tracker couldn't identify the connection.
- New (new): This is the beginning of a new connection.
- Established (est): This is part of an already existing connection.
- Related (rel): This connection is related to an existing connection.
For more information, consult the ovs-ofctl(8) man pages.
Below is a simple example flow table to allow outbound TCP traffic from
port 1 and drop traffic from port 2 that was not initiated by port 1:
table=0,priority=1,action=drop
table=0,arp,action=normal
table=0,in_port=1,tcp,ct_state=-trk,action=ct(commit,zone=9),2
table=0,in_port=2,tcp,ct_state=-trk,action=ct(zone=9,table=1)
table=1,in_port=2,ct_state=+trk+est,tcp,action=1
table=1,in_port=2,ct_state=+trk+new,tcp,action=drop
Based on original design by Justin Pettit, contributions from Thomas
Graf and Daniele Di Proietto.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
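The `ct_state=+trk+est` and `ct_state=-trk` matches in the example flow table above can be read as a value/mask comparison over a set of flag bits. The sketch below shows that reading under stated assumptions: the `TOY_CS_*` names and their numeric values are made up for illustration and are not taken from the commit message, and `toy_ct_state_matches()` is a hypothetical helper.

```c
#include <stdint.h>

/* Illustrative flag bits.  The numeric values are assumptions made for
 * this sketch only. */
enum toy_ct_state {
    TOY_CS_NEW = 1 << 0,    /* new */
    TOY_CS_EST = 1 << 1,    /* est */
    TOY_CS_REL = 1 << 2,    /* rel */
    TOY_CS_RPL = 1 << 3,    /* rpl */
    TOY_CS_INV = 1 << 4,    /* inv */
    TOY_CS_TRK = 1 << 5     /* trk */
};

/* A match such as "ct_state=+trk+est" lists flags that must be set ("+")
 * and flags that must be clear ("-").  Flags that are not listed are
 * wildcarded. */
struct toy_ct_match {
    uint32_t must_set;
    uint32_t must_clear;
};

/* Returns nonzero if 'state' satisfies 'm'. */
static int
toy_ct_state_matches(uint32_t state, struct toy_ct_match m)
{
    uint32_t mask = m.must_set | m.must_clear;

    return (state & mask) == m.must_set;
}
```

With this reading, an untracked packet (no bits set) satisfies `-trk`, and a tracked reply packet of an established connection satisfies `+trk+est` because the `rpl` bit is not part of the mask.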
2015-08-11 10:56:09 -07:00
|
|
|
case OVS_KEY_ATTR_CT_STATE:
|
datapath: Add original direction conntrack tuple to sw_flow_key.
Upstream commit:
commit 9dd7f8907c3705dc7a7a375d1c6e30b06e6daffc
Author: Jarno Rajahalme <jarno@ovn.org>
Date: Thu Feb 9 11:21:59 2017 -0800
openvswitch: Add original direction conntrack tuple to sw_flow_key.
Add the fields of the conntrack original direction 5-tuple to struct
sw_flow_key. The new fields are initially marked as non-existent, and
are populated whenever a conntrack action is executed and either finds
or generates a conntrack entry. This means that these fields exist
for all packets that were not rejected by conntrack as untrackable.
The original tuple fields in the sw_flow_key are filled from the
original direction tuple of the conntrack entry relating to the
current packet, or from the original direction tuple of the master
conntrack entry, if the current conntrack entry has a master.
Generally, expected connections of connections having an assigned
helper (e.g., FTP), have a master conntrack entry.
The main purpose of the new conntrack original tuple fields is to
allow matching on them for policy decision purposes, with the premise
that the admissibility of tracked connections' reply packets (as well
as original direction packets), and both direction packets of any
related connections may be based on ACL rules applying to the master
connection's original direction 5-tuple. This also makes it easier to
make policy decisions when the actual packet headers might have been
transformed by NAT, as the original direction 5-tuple represents the
packet headers before any such transformation.
When using the original direction 5-tuple the admissibility of return
and/or related packets need not be based on the mere existence of a
conntrack entry, allowing separation of admission policy from the
established conntrack state. While existence of a conntrack entry is
required for admission of the return or related packets, policy
changes can render connections that were initially admitted to be
rejected or dropped afterwards. If the admission of the return and
related packets was based on mere conntrack state (e.g., connection
being in an established state), a policy change that would make the
connection rejected or dropped would need to find and delete all
conntrack entries affected by such a change. When using the original
direction 5-tuple matching the affected conntrack entries can be
allowed to time out instead, as the established state of the
connection would not need to be the basis for packet admission any
more.
It should be noted that the directionality of related connections may
be the same or different than that of the master connection, and
neither the original direction 5-tuple nor the conntrack state bits
carry this information. If needed, the directionality of the master
connection can be stored in master's conntrack mark or labels, which
are automatically inherited by the expected related connections.
The fact that neither ARP nor ND packets are trackable by conntrack
allows mutual exclusion between ARP/ND and the new conntrack original
tuple fields. Hence, the IP addresses are overlaid in union with ARP
and ND fields. This allows the sw_flow_key to not grow much due to
this patch, but it also means that we must be careful to never use the
new key fields with ARP or ND packets. ARP is easy to distinguish and
keep mutually exclusive based on the ethernet type, but ND being an
ICMPv6 protocol requires a bit more attention.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch squashes in a minimal amount of OVS userspace code to not
break the build. Later patches contain the full userspace support.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
2017-03-08 17:18:22 -08:00
|
|
|
case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4:
|
|
|
|
case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6:
|
2015-08-11 10:56:09 -07:00
|
|
|
case OVS_KEY_ATTR_CT_ZONE:
|
Add connection tracking mark support.
This patch adds a new 32-bit metadata field to the connection tracking
interface. When a mark is specified as part of the ct action and the
connection is committed, the value is saved with the current connection.
Subsequent ct lookups with the table specified will expose this metadata
as the "ct_mark" field in the flow.
For example, to allow new TCP connections from port 1->2 and only allow
established connections from port 2->1, and to associate a mark with those
connections:
table=0,priority=1,action=drop
table=0,arp,action=normal
table=0,in_port=1,tcp,action=ct(commit,exec(set_field:1->ct_mark)),2
table=0,in_port=2,ct_state=-trk,tcp,action=ct(table=1)
table=1,in_port=2,ct_state=+trk,ct_mark=1,tcp,action=1
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-09-18 13:58:00 -07:00
|
|
|
case OVS_KEY_ATTR_CT_MARK:
|
Add connection tracking label support.
This patch adds a new 128-bit metadata field to the connection tracking
interface. When a label is specified as part of the ct action and the
connection is committed, the value is saved with the current connection.
Subsequent ct lookups with the table specified will expose this metadata
as the "ct_label" field in the flow.
For example, to allow new TCP connections from port 1->2 and only allow
established connections from port 2->1, and to associate a label with
those connections:
table=0,priority=1,action=drop
table=0,arp,action=normal
table=0,in_port=1,tcp,action=ct(commit,exec(set_field:1->ct_label)),2
table=0,in_port=2,ct_state=-trk,tcp,action=ct(table=1)
table=1,in_port=2,ct_state=+trk,ct_label=1,tcp,action=1
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-10-13 11:13:10 -07:00
|
|
|
case OVS_KEY_ATTR_CT_LABELS:
|
2013-06-11 11:05:55 +09:00
|
|
|
case __OVS_KEY_ATTR_MAX:
|
|
|
|
default:
|
2013-12-17 10:32:12 -08:00
|
|
|
OVS_NOT_REACHED();
|
2013-05-29 15:06:38 +09:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-09-05 15:44:19 -07:00
|
|
|
#define get_mask(a, type) ((const type *)(const void *)(a + 1) + 1)
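The `get_mask()` macro above steps past the attribute header (`a + 1`) and then past one object of `type`, so it relies on the payload holding a value immediately followed by a mask of the same type. The masked set cases then combine the two as `value | (old & ~mask)`. The following is a small standalone sketch of that arithmetic; `toy_masked_set_u32()` and `toy_apply_payload_u32()` are hypothetical helpers, and the sketch assumes, as those expressions do, that the value has no bits set outside the mask.

```c
#include <stdint.h>
#include <string.h>

/* Masked update: keeps the bits of 'old' that lie outside 'mask' and takes
 * the bits inside 'mask' from 'value'.  Assumes 'value' has no bits set
 * outside 'mask'. */
static uint32_t
toy_masked_set_u32(uint32_t old, uint32_t value, uint32_t mask)
{
    return value | (old & ~mask);
}

/* Hypothetical payload layout for this sketch: a 32-bit value immediately
 * followed by a 32-bit mask.  memcpy() avoids alignment assumptions. */
static uint32_t
toy_apply_payload_u32(uint32_t old, const void *payload)
{
    uint32_t value, mask;

    memcpy(&value, payload, sizeof value);
    memcpy(&mask, (const uint8_t *) payload + sizeof value, sizeof mask);
    return toy_masked_set_u32(old, value, mask);
}
```

An all-zero mask leaves the old field unchanged, and an all-ones mask overwrites it completely, which is why a single expression can serve both partial and full sets.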
|
|
|
|
|
|
|
|
static void
|
2015-02-25 12:01:53 -08:00
|
|
|
odp_execute_masked_set_action(struct dp_packet *packet,
|
2014-10-03 20:23:58 -07:00
|
|
|
const struct nlattr *a)
|
2014-09-05 15:44:19 -07:00
|
|
|
{
|
2014-10-03 20:23:58 -07:00
|
|
|
struct pkt_metadata *md = &packet->md;
|
2014-09-05 15:44:19 -07:00
|
|
|
enum ovs_key_attr type = nl_attr_type(a);
|
|
|
|
struct mpls_hdr *mh;
|
|
|
|
|
|
|
|
switch (type) {
|
|
|
|
case OVS_KEY_ATTR_PRIORITY:
|
|
|
|
md->skb_priority = nl_attr_get_u32(a)
|
|
|
|
| (md->skb_priority & ~*get_mask(a, uint32_t));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_SKB_MARK:
|
|
|
|
md->pkt_mark = nl_attr_get_u32(a)
|
|
|
|
| (md->pkt_mark & ~*get_mask(a, uint32_t));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_ETHERNET:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_eth_set_addrs(packet, nl_attr_get(a),
|
2014-09-05 15:44:19 -07:00
|
|
|
get_mask(a, struct ovs_key_ethernet));
|
|
|
|
break;
|
|
|
|
|
2018-01-06 13:47:51 +08:00
|
|
|
case OVS_KEY_ATTR_NSH: {
|
2018-01-11 13:24:02 +08:00
|
|
|
odp_set_nsh(packet, a, true);
|
userspace: Add support for NSH MD1 match fields
This patch adds support for NSH packet header fields to the OVS
control plane and the userspace datapath. Initially we support the
fields of the NSH base header as defined in
https://www.ietf.org/id/draft-ietf-sfc-nsh-13.txt
and the fixed context headers specified for metadata format MD1.
The variable length MD2 format is parsed but the TLV context headers
are not yet available for matching.
The NSH fields are modelled as experimenter fields with the dedicated
experimenter class 0x005ad650 proposed for NSH in ONF. The following
fields are defined:
NXOXM code            ofctl name    Size    Comment
=====================================================================
NXOXM_NSH_FLAGS       nsh_flags     8       Bits 2-9 of 1st NSH word
(0x005ad650,1)
NXOXM_NSH_MDTYPE      nsh_mdtype    8       Bits 16-23
(0x005ad650,2)
NXOXM_NSH_NEXTPROTO   nsh_np        8       Bits 24-31
(0x005ad650,3)
NXOXM_NSH_SPI         nsh_spi       24      Bits 0-23 of 2nd NSH word
(0x005ad650,4)
NXOXM_NSH_SI          nsh_si        8       Bits 24-31
(0x005ad650,5)
NXOXM_NSH_C1          nsh_c1        32      Maskable, nsh_mdtype==1
(0x005ad650,6)
NXOXM_NSH_C2          nsh_c2        32      Maskable, nsh_mdtype==1
(0x005ad650,7)
NXOXM_NSH_C3          nsh_c3        32      Maskable, nsh_mdtype==1
(0x005ad650,8)
NXOXM_NSH_C4          nsh_c4        32      Maskable, nsh_mdtype==1
(0x005ad650,9)
Co-authored-by: Johnson Li <johnson.li@intel.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
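The field table in the commit message above places nsh_spi in bits 0-23 and nsh_si in bits 24-31 of the second NSH word. Assuming the usual network-diagram convention in which bit 0 is the most significant bit, and assuming the word has already been converted to host byte order, the split can be sketched as follows. The helper names are hypothetical and do not come from OVS.

```c
#include <stdint.h>

/* Extracts the 24-bit service path identifier from the second NSH word.
 * 'path_hdr' is assumed to be in host byte order. */
static uint32_t
toy_nsh_spi(uint32_t path_hdr)
{
    return path_hdr >> 8;
}

/* Extracts the 8-bit service index from the second NSH word. */
static uint8_t
toy_nsh_si(uint32_t path_hdr)
{
    return path_hdr & 0xff;
}

/* Builds the second NSH word from an SPI and an SI.  Bits of 'spi' above
 * the low 24 are discarded. */
static uint32_t
toy_nsh_path_hdr(uint32_t spi, uint8_t si)
{
    return ((spi & 0xffffff) << 8) | si;
}
```

A caller working on a packet buffer would convert the on-wire word with `ntohl()` before using these helpers and with `htonl()` before writing it back.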
2017-08-05 13:41:08 +08:00
|
|
|
break;
|
2018-01-06 13:47:51 +08:00
|
|
|
}
|
2017-08-05 13:41:08 +08:00
|
|
|
|
2014-09-05 15:44:19 -07:00
|
|
|
case OVS_KEY_ATTR_IPV4:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_ipv4(packet, nl_attr_get(a),
|
2014-09-05 15:44:19 -07:00
|
|
|
get_mask(a, struct ovs_key_ipv4));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_IPV6:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_ipv6(packet, nl_attr_get(a),
|
2014-09-05 15:44:19 -07:00
|
|
|
get_mask(a, struct ovs_key_ipv6));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_TCP:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_tcp(packet, nl_attr_get(a),
|
2014-09-05 15:44:19 -07:00
|
|
|
get_mask(a, struct ovs_key_tcp));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_UDP:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_udp(packet, nl_attr_get(a),
|
2014-09-05 15:44:19 -07:00
|
|
|
get_mask(a, struct ovs_key_udp));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_SCTP:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_sctp(packet, nl_attr_get(a),
|
2014-09-05 15:44:19 -07:00
|
|
|
get_mask(a, struct ovs_key_sctp));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_MPLS:
|
2015-02-22 03:21:09 -08:00
|
|
|
mh = dp_packet_l2_5(packet);
|
2014-09-05 15:44:19 -07:00
|
|
|
if (mh) {
|
|
|
|
put_16aligned_be32(&mh->mpls_lse, nl_attr_get_be32(a)
|
|
|
|
| (get_16aligned_be32(&mh->mpls_lse)
|
|
|
|
& ~*get_mask(a, ovs_be32)));
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_ARP:
|
2015-02-22 03:21:09 -08:00
|
|
|
set_arp(packet, nl_attr_get(a),
|
2014-09-05 15:44:19 -07:00
|
|
|
get_mask(a, struct ovs_key_arp));
|
|
|
|
break;
|
|
|
|
|
2014-12-23 23:42:05 +00:00
|
|
|
case OVS_KEY_ATTR_ND:
|
2015-02-22 03:21:09 -08:00
|
|
|
odp_set_nd(packet, nl_attr_get(a),
|
2014-12-23 23:42:05 +00:00
|
|
|
get_mask(a, struct ovs_key_nd));
|
|
|
|
break;
|
|
|
|
|
2014-09-05 15:44:19 -07:00
|
|
|
case OVS_KEY_ATTR_DP_HASH:
|
2014-09-09 14:21:41 -07:00
|
|
|
md->dp_hash = nl_attr_get_u32(a)
|
2015-04-15 19:11:48 +01:00
|
|
|
| (md->dp_hash & ~*get_mask(a, uint32_t));
|
2014-09-05 15:44:19 -07:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_RECIRC_ID:
|
|
|
|
md->recirc_id = nl_attr_get_u32(a)
|
|
|
|
| (md->recirc_id & ~*get_mask(a, uint32_t));
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_KEY_ATTR_TUNNEL: /* Masked data not supported for tunnel. */
|
2017-06-02 16:16:17 +00:00
|
|
|
case OVS_KEY_ATTR_PACKET_TYPE:
|
2014-09-05 15:44:19 -07:00
|
|
|
case OVS_KEY_ATTR_UNSPEC:
|
2015-08-11 10:56:09 -07:00
|
|
|
case OVS_KEY_ATTR_CT_STATE:
|
|
|
|
case OVS_KEY_ATTR_CT_ZONE:
|
2015-09-18 13:58:00 -07:00
|
|
|
case OVS_KEY_ATTR_CT_MARK:
|
2015-10-13 11:13:10 -07:00
|
|
|
case OVS_KEY_ATTR_CT_LABELS:
|
datapath: Add original direction conntrack tuple to sw_flow_key.
Upstream commit:
commit 9dd7f8907c3705dc7a7a375d1c6e30b06e6daffc
Author: Jarno Rajahalme <jarno@ovn.org>
Date: Thu Feb 9 11:21:59 2017 -0800
openvswitch: Add original direction conntrack tuple to sw_flow_key.
Add the fields of the conntrack original direction 5-tuple to struct
sw_flow_key. The new fields are initially marked as non-existent, and
are populated whenever a conntrack action is executed and either finds
or generates a conntrack entry. This means that these fields exist
for all packets that were not rejected by conntrack as untrackable.
The original tuple fields in the sw_flow_key are filled from the
original direction tuple of the conntrack entry relating to the
current packet, or from the original direction tuple of the master
conntrack entry, if the current conntrack entry has a master.
Generally, expected connections of connections having an assigned
helper (e.g., FTP), have a master conntrack entry.
The main purpose of the new conntrack original tuple fields is to
allow matching on them for policy decision purposes, with the premise
that the admissibility of tracked connections' reply packets (as well
as original direction packets), and both direction packets of any
related connections may be based on ACL rules applying to the master
connection's original direction 5-tuple. This also makes it easier to
make policy decisions when the actual packet headers might have been
transformed by NAT, as the original direction 5-tuple represents the
packet headers before any such transformation.
When using the original direction 5-tuple the admissibility of return
and/or related packets need not be based on the mere existence of a
conntrack entry, allowing separation of admission policy from the
established conntrack state. While existence of a conntrack entry is
required for admission of the return or related packets, policy
changes can render connections that were initially admitted to be
rejected or dropped afterwards. If the admission of the return and
related packets was based on mere conntrack state (e.g., connection
being in an established state), a policy change that would make the
connection rejected or dropped would need to find and delete all
conntrack entries affected by such a change. When using the original
direction 5-tuple matching, the affected conntrack entries can be
allowed to time out instead, as the established state of the
connection would not need to be the basis for packet admission any
more.
It should be noted that the directionality of related connections may
be the same or different than that of the master connection, and
neither the original direction 5-tuple nor the conntrack state bits
carry this information. If needed, the directionality of the master
connection can be stored in master's conntrack mark or labels, which
are automatically inherited by the expected related connections.
The fact that neither ARP nor ND packets are trackable by conntrack
allows mutual exclusion between ARP/ND and the new conntrack original
tuple fields. Hence, the IP addresses are overlaid in union with ARP
and ND fields. This allows the sw_flow_key to not grow much due to
this patch, but it also means that we must be careful to never use the
new key fields with ARP or ND packets. ARP is easy to distinguish and
keep mutually exclusive based on the ethernet type, but ND being an
ICMPv6 protocol requires a bit more attention.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch squashes in a minimal amount of OVS userspace code to not
break the build. Later patches contain the full userspace support.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
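The union overlay described in the commit message above can be sketched with a simplified key. This is an illustration under stated assumptions: `toy_key` and its members are hypothetical and much smaller than the kernel's real 'struct sw_flow_key', and only the ARP case is modeled (ND, which needs an ICMPv6 check, is left out).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ETH_TYPE_ARP 0x0806

/* Hypothetical, simplified fields; not the kernel's real layout. */
struct toy_arp_fields {
    uint8_t sha[6];   /* ARP source hardware address. */
    uint8_t tha[6];   /* ARP target hardware address. */
};

struct toy_ct_orig_fields {
    uint32_t src;     /* Original-direction IPv4 source address. */
    uint32_t dst;     /* Original-direction IPv4 destination address. */
};

struct toy_key {
    uint16_t eth_type;
    union {
        struct toy_arp_fields arp;          /* Valid only for ARP packets. */
        struct toy_ct_orig_fields ct_orig;  /* Valid only for non-ARP packets. */
    } u;
};

/* Returns nonzero when the conntrack overlay may be read for 'key'.  ARP is
 * never tracked, so its fields can safely share storage with 'ct_orig'. */
static int
toy_key_has_ct_orig(const struct toy_key *key)
{
    return key->eth_type != TOY_ETH_TYPE_ARP;
}
```

Because the two members share storage, the union is no larger than its biggest member, which is why the key grows little; the cost is that readers must check the packet type before touching either member.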
2017-03-08 17:18:22 -08:00
|
|
|
case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4:
|
|
|
|
case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6:
|
2014-09-05 15:44:19 -07:00
|
|
|
case OVS_KEY_ATTR_ENCAP:
|
|
|
|
case OVS_KEY_ATTR_ETHERTYPE:
|
|
|
|
case OVS_KEY_ATTR_IN_PORT:
|
|
|
|
case OVS_KEY_ATTR_VLAN:
|
|
|
|
case OVS_KEY_ATTR_ICMP:
|
|
|
|
case OVS_KEY_ATTR_ICMPV6:
|
|
|
|
case OVS_KEY_ATTR_TCP_FLAGS:
|
|
|
|
case __OVS_KEY_ATTR_MAX:
|
|
|
|
default:
|
|
|
|
OVS_NOT_REACHED();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-05-29 15:06:38 +09:00
|
|
|
static void
|
2015-02-25 12:01:53 -08:00
|
|
|
odp_execute_sample(void *dp, struct dp_packet *packet, bool steal,
|
2014-10-03 20:23:58 -07:00
|
|
|
const struct nlattr *action,
|
2014-10-03 15:04:15 -07:00
|
|
|
odp_execute_cb dp_execute_action)
|
2013-05-29 15:06:38 +09:00
|
|
|
{
|
|
|
|
const struct nlattr *subactions = NULL;
|
|
|
|
const struct nlattr *a;
|
2016-05-17 17:32:33 -07:00
|
|
|
struct dp_packet_batch pb;
|
2013-05-29 15:06:38 +09:00
|
|
|
size_t left;
|
|
|
|
|
|
|
|
NL_NESTED_FOR_EACH_UNSAFE (a, left, action) {
|
|
|
|
int type = nl_attr_type(a);
|
|
|
|
|
|
|
|
switch ((enum ovs_sample_attr) type) {
|
|
|
|
case OVS_SAMPLE_ATTR_PROBABILITY:
|
|
|
|
if (random_uint32() >= nl_attr_get_u32(a)) {
|
2014-10-03 15:04:15 -07:00
|
|
|
if (steal) {
|
2015-02-25 12:01:53 -08:00
|
|
|
dp_packet_delete(packet);
|
2014-10-03 15:04:15 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_SAMPLE_ATTR_ACTIONS:
|
|
|
|
subactions = a;
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_SAMPLE_ATTR_UNSPEC:
|
|
|
|
case __OVS_SAMPLE_ATTR_MAX:
|
|
|
|
default:
|
2013-12-17 10:32:12 -08:00
|
|
|
OVS_NOT_REACHED();
|
2013-05-29 15:06:38 +09:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2017-01-11 16:00:04 -08:00
|
|
|
if (!steal) {
|
|
|
|
/* The 'subactions' may modify the packet, but the modification
|
|
|
|
* should not propagate beyond this sample action. Make a copy of
|
|
|
|
* the packet in case we don't own the packet, so that the
|
|
|
|
* 'subactions' are only applied to the clone. 'odp_execute_actions'
|
|
|
|
* will free the clone. */
|
|
|
|
packet = dp_packet_clone(packet);
|
|
|
|
}
|
2017-01-17 15:56:58 -08:00
|
|
|
dp_packet_batch_init_packet(&pb, packet);
|
2017-01-11 16:00:04 -08:00
|
|
|
odp_execute_actions(dp, &pb, true, nl_attr_get(subactions),
|
2014-10-03 15:04:15 -07:00
|
|
|
nl_attr_get_size(subactions), dp_execute_action);
|
2013-05-29 15:06:38 +09:00
|
|
|
}
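The probability check in odp_execute_sample() above compares a random 32-bit value against the OVS_SAMPLE_ATTR_PROBABILITY attribute, so the attribute acts as a threshold on the uint32 range. A minimal sketch of that comparison follows, with the random draw passed in explicitly so the behavior is deterministic. The helper names `probability_from_fraction` and `should_sample` are hypothetical and are not OVS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: converts a fraction in [0.0, 1.0] to a 32-bit
 * threshold, where UINT32_MAX means "sample (almost) every packet". */
static uint32_t
probability_from_fraction(double fraction)
{
    assert(fraction >= 0.0 && fraction <= 1.0);
    return (uint32_t) (fraction * UINT32_MAX);
}

/* Mirrors the comparison used above: the packet is skipped when the random
 * draw is greater than or equal to the threshold, and sampled otherwise. */
static bool
should_sample(uint32_t random_draw, uint32_t probability)
{
    return random_draw < probability;
}
```

One consequence of the ">=" comparison is that a threshold of 0 never samples, and a draw equal to the threshold is always skipped, even when the threshold is UINT32_MAX.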
|
|
|
|
|
dpif-netdev: Add clone action
Add support for userspace datapath clone action. The clone action
provides an action envelope to enclose an action list.
For example, with actions A, B, C and D, and an action list:
A, clone(B, C), D
The clone action will ensure that:
- D will see the same packet, and any meta states, such as flow, as
action B.
- D will be executed regardless of whether B or C drops a packet. They
can only drop a clone.
- When B drops a packet, clone will skip all remaining actions
within the clone envelope. This feature is useful when we add
meter action later: The meter action can be implemented as a
simple action without its own envelope (unlike the sample action).
When necessary, the flow translation layer can enclose a meter action
in clone.
The clone action is very similar to the OpenFlow clone action.
This is by design to simplify vswitchd flow translation logic.
Without datapath clone, vswitchd simulates the effect by inserting
datapath actions to "undo" clone actions. The above flow will be
translated into A, B, C, -C, -B, D.
However, there are two issues:
- The resulting datapath action list may be longer without using
clone.
- Some actions, such as NAT, may not be possible to reverse.
This patch implements clone() simply with packet copy. The performance
can be improved with later patches, for example, to delay or avoid
packet copy if possible. It seems datapath should have enough context
to carry out such optimization without the userspace context.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
2017-01-10 18:13:47 -08:00
|
|
|
static void
|
2017-02-09 15:41:53 +00:00
|
|
|
odp_execute_clone(void *dp, struct dp_packet_batch *batch, bool steal,
|
2017-01-10 18:13:47 -08:00
|
|
|
const struct nlattr *actions,
|
|
|
|
odp_execute_cb dp_execute_action)
|
|
|
|
{
|
|
|
|
if (!steal) {
|
|
|
|
/* The 'actions' may modify the packet, but the modification
|
|
|
|
* should not propagate beyond this clone action. Make a copy of
|
|
|
|
* the packet in case we don't own the packet, so that the
|
|
|
|
* 'actions' are only applied to the clone. 'odp_execute_actions'
|
|
|
|
* will free the clone. */
|
2017-02-09 15:41:53 +00:00
|
|
|
struct dp_packet_batch clone_pkt_batch;
|
|
|
|
dp_packet_batch_clone(&clone_pkt_batch, batch);
|
|
|
|
dp_packet_batch_reset_cutlen(batch);
|
|
|
|
odp_execute_actions(dp, &clone_pkt_batch, true, nl_attr_get(actions),
|
2017-01-10 18:13:47 -08:00
|
|
|
nl_attr_get_size(actions), dp_execute_action);
|
2017-02-09 15:41:53 +00:00
|
|
|
}
|
|
|
|
else {
|
|
|
|
odp_execute_actions(dp, batch, true, nl_attr_get(actions),
|
|
|
|
nl_attr_get_size(actions), dp_execute_action);
|
|
|
|
}
|
2017-01-10 18:13:47 -08:00
|
|
|
}
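The ownership rule in odp_execute_clone() above (copy when the caller keeps the packet, consume it directly when the caller gives it up) can be illustrated with a toy model. This is a sketch under stated assumptions: `toy_packet`, `toy_action`, `toy_overwrite_first_byte` and `toy_execute_clone` are hypothetical stand-ins and do not use the real dp_packet or batch APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a packet; not the real 'struct dp_packet'. */
struct toy_packet {
    size_t len;
    unsigned char data[64];
};

typedef void toy_action(struct toy_packet *);

static int toy_actions_run;   /* Counts executed actions, for inspection. */

/* Example action: modifies the packet it is given. */
static void
toy_overwrite_first_byte(struct toy_packet *packet)
{
    packet->data[0] = 0xff;
    toy_actions_run++;
}

/* Runs 'actions' inside a clone envelope.  When 'steal' is false the caller
 * still needs 'packet', so the actions run on a heap copy.  When 'steal' is
 * true the actions consume 'packet' itself, which must be heap-allocated. */
static void
toy_execute_clone(struct toy_packet *packet, bool steal,
                  toy_action *const actions[], size_t n_actions)
{
    struct toy_packet *target = packet;

    if (!steal) {
        target = malloc(sizeof *target);
        assert(target != NULL);
        memcpy(target, packet, sizeof *target);
    }
    for (size_t i = 0; i < n_actions; i++) {
        actions[i](target);
    }
    free(target);   /* The envelope always owns, and frees, 'target'. */
}
```

With 'steal' false, the caller's packet is untouched after the call even though the action ran; with 'steal' true, no copy is made and the caller must not use the packet again.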
|
|
|
|
|
2015-03-06 10:09:10 -08:00
|
|
|
static bool
|
|
|
|
requires_datapath_assistance(const struct nlattr *a)
|
|
|
|
{
|
|
|
|
enum ovs_action_attr type = nl_attr_type(a);
|
|
|
|
|
|
|
|
switch (type) {
|
|
|
|
/* These only make sense in the context of a datapath. */
|
|
|
|
case OVS_ACTION_ATTR_OUTPUT:
|
|
|
|
case OVS_ACTION_ATTR_TUNNEL_PUSH:
|
|
|
|
case OVS_ACTION_ATTR_TUNNEL_POP:
|
|
|
|
case OVS_ACTION_ATTR_USERSPACE:
|
|
|
|
case OVS_ACTION_ATTR_RECIRC:
|
Add support for connection tracking.
This patch adds a new action and fields to OVS that allow connection
tracking to be performed. This support works in conjunction with the
Linux kernel support merged into the Linux-4.3 development cycle.
Packets have two possible states with respect to connection tracking:
Untracked packets have not previously passed through the connection
tracker, while tracked packets have previously been through the
connection tracker. For OpenFlow pipeline processing, untracked packets
can become tracked, and they will remain tracked until the end of the
pipeline. Tracked packets cannot become untracked.
Connections can be unknown, uncommitted, or committed. Packets which are
untracked have unknown connection state. To know the connection state,
the packet must become tracked. Uncommitted connections have no
connection state stored about them, so it is only possible for the
connection tracker to identify whether they are a new connection or
whether they are invalid. Committed connections have connection state
stored beyond the lifetime of the packet, which allows later packets in
the same connection to be identified as part of the same established
connection, or related to an existing connection - for instance ICMP
error responses.
The new 'ct' action transitions the packet from "untracked" to
"tracked" by sending this flow through the connection tracker.
The following parameters are supported initially:
- "commit": When commit is executed, the connection moves from
uncommitted state to committed state. This signals that information
about the connection should be stored beyond the lifetime of the
packet within the pipeline. This allows future packets in the same
connection to be recognized as part of the same "established" (est)
connection, as well as identifying packets in the reply (rpl)
direction, or packets related to an existing connection (rel).
- "zone=[u16|NXM]": Perform connection tracking in the zone specified.
Each zone is an independent connection tracking context. When the
"commit" parameter is used, the connection will only be committed in
the specified zone, and not in other zones. This is 0 by default.
- "table=NUMBER": Fork pipeline processing in two. The original instance
of the packet will continue processing the current actions list as an
untracked packet. An additional instance of the packet will be sent to
the connection tracker, which will be re-injected into the OpenFlow
pipeline to resume processing in the specified table, with the
ct_state and other ct match fields set. If the table is not specified,
then the packet is submitted to the connection tracker, but the
pipeline does not fork and the ct match fields are not populated. It
is strongly recommended to specify a table later than the current
table to prevent loops.
When the "table" option is used, the packet that continues processing in
the specified table will have the ct_state populated. The ct_state may
have any of the following flags set:
- Tracked (trk): Connection tracking has occurred.
- Reply (rpl): The flow is in the reply direction.
- Invalid (inv): The connection tracker couldn't identify the connection.
- New (new): This is the beginning of a new connection.
- Established (est): This is part of an already existing connection.
- Related (rel): This connection is related to an existing connection.
For more information, consult the ovs-ofctl(8) man pages.
Below is a simple example flow table to allow outbound TCP traffic from
port 1 and drop traffic from port 2 that was not initiated by port 1:
table=0,priority=1,action=drop
table=0,arp,action=normal
table=0,in_port=1,tcp,ct_state=-trk,action=ct(commit,zone=9),2
table=0,in_port=2,tcp,ct_state=-trk,action=ct(zone=9,table=1)
table=1,in_port=2,ct_state=+trk+est,tcp,action=1
table=1,in_port=2,ct_state=+trk+new,tcp,action=drop
Based on original design by Justin Pettit, contributions from Thomas
Graf and Daniele Di Proietto.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-08-11 10:56:09 -07:00
|
|
|
case OVS_ACTION_ATTR_CT:
|
2017-02-23 11:27:54 -08:00
|
|
|
case OVS_ACTION_ATTR_METER:
|
2015-03-06 10:09:10 -08:00
|
|
|
return true;
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_SET:
|
|
|
|
case OVS_ACTION_ATTR_SET_MASKED:
|
|
|
|
case OVS_ACTION_ATTR_PUSH_VLAN:
|
|
|
|
case OVS_ACTION_ATTR_POP_VLAN:
|
|
|
|
case OVS_ACTION_ATTR_SAMPLE:
|
|
|
|
case OVS_ACTION_ATTR_HASH:
|
|
|
|
case OVS_ACTION_ATTR_PUSH_MPLS:
|
|
|
|
case OVS_ACTION_ATTR_POP_MPLS:
|
ofp-actions: Add truncate action.
The patch adds a new action to support packet truncation. The new action
is formatted as 'output(port=n,max_len=m)', meaning output to port n with
the packet size being MIN(original_size, m).
One use case is to enable port mirroring to send smaller packets to the
destination port so that only useful packet information is mirrored/copied,
saving the overhead of copying the entire packet payload. An example
use case is below and is also shown in the test cases:
- Output to port 1 with max_len 100 bytes.
- The output packet size on port 1 will be MIN(original_packet_size, 100).
# ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)'
- The scope of max_len is limited to the output action itself. The packet
sizes of the subsequent output:1 and output:2 remain intact.
# ovs-ofctl add-flow br0 \
'actions=output(port=1,max_len=100),output:1,output:2'
- The datapath actions show:
# Datapath actions: trunc(100),1,1,2
Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
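The "trunc(100),1,1,2" sequence described above can be modeled as a pending cut length that the next output consumes. The sketch below is a toy model under stated assumptions: `trunc_packet`, `toy_trunc` and `toy_output` are hypothetical names, and the real dp_packet helpers may apply additional bounds that are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Toy packet: only the fields the sketch needs. */
struct trunc_packet {
    uint32_t size;     /* Current packet size in bytes. */
    uint32_t cutlen;   /* Bytes to cut off at the next output; 0 = none. */
};

/* Records that the next output should emit at most 'max_len' bytes. */
static void
toy_trunc(struct trunc_packet *p, uint32_t max_len)
{
    p->cutlen = p->size > max_len ? p->size - max_len : 0;
}

/* Returns the number of bytes emitted, then clears the pending cut so that
 * later outputs see the full packet again. */
static uint32_t
toy_output(struct trunc_packet *p)
{
    uint32_t sent = p->size - p->cutlen;

    p->cutlen = 0;
    return sent;
}
```

For a 150-byte packet, toy_trunc(&p, 100) followed by three toy_output(&p) calls emits 100, 150 and 150 bytes, matching the statement that max_len applies only to the output action it belongs to.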
2016-06-24 07:42:30 -07:00
|
|
|
case OVS_ACTION_ATTR_TRUNC:
|
2017-02-06 21:04:41 +08:00
|
|
|
case OVS_ACTION_ATTR_PUSH_ETH:
|
|
|
|
case OVS_ACTION_ATTR_POP_ETH:
|
2017-01-10 18:13:47 -08:00
|
|
|
case OVS_ACTION_ATTR_CLONE:
|
2018-01-06 13:47:51 +08:00
|
|
|
case OVS_ACTION_ATTR_PUSH_NSH:
|
|
|
|
case OVS_ACTION_ATTR_POP_NSH:
|
2018-01-19 14:21:51 -05:00
|
|
|
case OVS_ACTION_ATTR_CT_CLEAR:
|
2015-03-06 10:09:10 -08:00
|
|
|
return false;
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_UNSPEC:
|
|
|
|
case __OVS_ACTION_ATTR_MAX:
|
|
|
|
OVS_NOT_REACHED();
|
|
|
|
}
|
|
|
|
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2017-11-29 14:14:50 -08:00
|
|
|
/* Executes all of the 'actions_len' bytes of datapath actions in 'actions' on
|
|
|
|
* the packets in 'batch'. If 'steal' is true, possibly modifies and
|
|
|
|
* definitely frees the packets in 'batch', otherwise leaves 'batch' unchanged.
|
|
|
|
*
|
|
|
|
* Some actions (e.g. output actions) can only be executed by a datapath. This
|
|
|
|
* function implements those actions by passing the action and the packets to
|
|
|
|
* 'dp_execute_action' (along with 'dp'). If 'dp_execute_action' is passed a
|
2018-05-16 23:08:48 -07:00
|
|
|
* true 'steal' parameter then it must definitely free the packets passed into
|
|
|
|
* it. The packet can be modified whether 'steal' is false or true. If a
|
|
|
|
* packet is removed from the batch, then the fate of the packet is determined
|
|
|
|
* by the code that does this removal, irrespective of the value of 'steal'.
|
|
|
|
* Otherwise, if the packet is not removed from the batch and 'steal' is false
|
|
|
|
* then the packet could either be cloned or not. */
|
2014-10-03 15:04:15 -07:00
|
|
|
void
|
2016-05-17 17:32:33 -07:00
|
|
|
odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
|
2014-10-03 15:04:15 -07:00
|
|
|
const struct nlattr *actions, size_t actions_len,
|
|
|
|
odp_execute_cb dp_execute_action)
|
2013-05-29 15:06:38 +09:00
|
|
|
{
|
2017-01-17 15:56:58 -08:00
|
|
|
struct dp_packet *packet;
|
2013-05-29 15:06:38 +09:00
|
|
|
const struct nlattr *a;
|
|
|
|
unsigned int left;
|
2014-06-23 11:43:59 -07:00
|
|
|
|
2013-05-29 15:06:38 +09:00
|
|
|
NL_ATTR_FOR_EACH_UNSAFE (a, left, actions, actions_len) {
|
|
|
|
int type = nl_attr_type(a);
|
2014-10-03 15:04:15 -07:00
|
|
|
bool last_action = (left <= NLA_ALIGN(a->nla_len));
|
2013-05-29 15:06:38 +09:00
|
|
|
|
2015-03-06 10:09:10 -08:00
|
|
|
if (requires_datapath_assistance(a)) {
|
2013-12-30 15:58:58 -08:00
|
|
|
if (dp_execute_action) {
|
|
|
|
/* Allow 'dp_execute_action' to steal the packet data if we do
|
|
|
|
* not need it any more. */
|
2014-10-03 15:04:15 -07:00
|
|
|
bool may_steal = steal && last_action;
|
|
|
|
|
2016-05-17 17:32:33 -07:00
|
|
|
dp_execute_action(dp, batch, a, may_steal);
|
2014-10-03 15:04:15 -07:00
|
|
|
|
2017-11-30 05:37:57 +00:00
|
|
|
if (last_action || batch->count == 0) {
|
|
|
|
/* We do not need to free the packets.
|
|
|
|
* Either 'dp_execute_action' has stolen them
|
|
|
|
* or the batch is freed due to errors. In either
|
|
|
|
* case we do not need to execute further actions.
|
|
|
|
*/
|
2014-10-03 15:04:15 -07:00
|
|
|
return;
|
|
|
|
}
|
2013-10-18 17:27:51 -07:00
|
|
|
}
|
2015-03-06 10:09:10 -08:00
|
|
|
continue;
|
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
|
2015-03-06 10:09:10 -08:00
|
|
|
switch ((enum ovs_action_attr) type) {
|
2014-05-28 17:00:48 -07:00
|
|
|
case OVS_ACTION_ATTR_HASH: {
|
|
|
|
const struct ovs_action_hash *hash_act = nl_attr_get(a);
|
|
|
|
|
|
|
|
/* Calculate a hash value directly. This might not match the
|
|
|
|
* value computed by the datapath, but it is much less expensive,
|
|
|
|
* and the current use case (bonding) does not require a strict
|
|
|
|
* match to work properly. */
|
|
|
|
if (hash_act->hash_alg == OVS_HASH_ALG_L4) {
|
|
|
|
struct flow flow;
|
|
|
|
uint32_t hash;
|
|
|
|
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-07-13 18:07:03 +03:00
|
|
|
/* RSS hash can be used here instead of 5tuple for
|
|
|
|
* performance reasons. */
|
|
|
|
if (dp_packet_rss_valid(packet)) {
|
|
|
|
hash = dp_packet_get_rss_hash(packet);
|
|
|
|
hash = hash_int(hash, hash_act->hash_basis);
|
|
|
|
} else {
|
|
|
|
flow_extract(packet, &flow);
|
|
|
|
hash = flow_hash_5tuple(&flow, hash_act->hash_basis);
|
|
|
|
}
|
2017-01-17 15:56:58 -08:00
|
|
|
packet->md.dp_hash = hash;
|
2014-06-23 11:43:59 -07:00
|
|
|
}
|
2014-05-28 17:00:48 -07:00
|
|
|
} else {
|
|
|
|
/* Assert on unknown hash algorithm. */
|
|
|
|
OVS_NOT_REACHED();
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2013-05-29 15:06:38 +09:00
|
|
|
case OVS_ACTION_ATTR_PUSH_VLAN: {
|
|
|
|
const struct ovs_action_push_vlan *vlan = nl_attr_get(a);
|
2014-06-23 11:43:59 -07:00
|
|
|
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
eth_push_vlan(packet, vlan->vlan_tpid, vlan->vlan_tci);
|
2014-06-23 11:43:59 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_POP_VLAN:
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
eth_pop_vlan(packet);
|
2014-06-23 11:43:59 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_PUSH_MPLS: {
|
|
|
|
const struct ovs_action_push_mpls *mpls = nl_attr_get(a);
|
2014-06-23 11:43:59 -07:00
|
|
|
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
push_mpls(packet, mpls->mpls_ethertype, mpls->mpls_lse);
|
2014-06-23 11:43:59 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_POP_MPLS:
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
pop_mpls(packet, nl_attr_get_be16(a));
|
2014-06-23 11:43:59 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_SET:
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
odp_execute_set_action(packet, nl_attr_get(a));
|
2014-09-05 15:44:19 -07:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_SET_MASKED:
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
odp_execute_masked_set_action(packet, nl_attr_get(a));
|
2014-06-23 11:43:59 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_SAMPLE:
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
odp_execute_sample(dp, packet, steal && last_action, a,
|
2014-10-03 15:04:15 -07:00
|
|
|
dp_execute_action);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (last_action) {
|
|
|
|
/* We do not need to free the packets. odp_execute_sample() has
|
|
|
|
* stolen them. */
|
|
|
|
return;
|
2014-06-23 11:43:59 -07:00
|
|
|
}
|
2013-05-29 15:06:38 +09:00
|
|
|
break;
|
|
|
|
|
2016-06-24 07:42:30 -07:00
|
|
|
case OVS_ACTION_ATTR_TRUNC: {
|
|
|
|
const struct ovs_action_trunc *trunc =
|
|
|
|
nl_attr_get_unspec(a, sizeof *trunc);
|
|
|
|
|
|
|
|
batch->trunc = true;
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-01-17 15:56:58 -08:00
|
|
|
dp_packet_set_cutlen(packet, trunc->max_len);
|
2016-06-24 07:42:30 -07:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
dpif-netdev: Add clone action
Add support for userspace datapath clone action. The clone action
provides an action envelope to enclose an action list.
For example, with actions A, B, C and D, and an action list:
A, clone(B, C), D
The clone action will ensure that:
- D will see the same packet, and any meta states, such as flow, as
action B.
- D will be executed regardless of whether B or C drops a packet. They
can only drop a clone.
- When B drops a packet, clone will skip all remaining actions
within the clone envelope. This feature is useful when we add
meter action later: The meter action can be implemented as a
simple action without its own envelope (unlike the sample action).
When necessary, the flow translation layer can enclose a meter action
in clone.
The clone action is very similar to the OpenFlow clone action.
This is by design to simplify vswitchd flow translation logic.
Without datapath clone, vswitchd simulates the effect by inserting
datapath actions to "undo" clone actions. The above flow will be
translated into A, B, C, -C, -B, D.
However, there are two issues:
- The resulting datapath action list may be longer without using
clone.
- Some actions, such as NAT, may not be possible to reverse.
This patch implements clone() simply with packet copy. The performance
can be improved with later patches, for example, to delay or avoid
packet copy if possible. It seems datapath should have enough context
to carry out such optimization without the userspace context.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
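The "clone with packet copy" behaviour described above can be illustrated with a minimal sketch. All names here (struct sketch_pkt, sketch_clone(), sketch_dec_ttl(), sketch_drop()) are hypothetical and are not part of OVS. The sketch only assumes what the message states: actions inside the envelope run on a copy, and a drop ends the envelope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet with just enough state to observe side effects. */
struct sketch_pkt {
    int ttl;
    bool dropped;
};

typedef void sketch_action(struct sketch_pkt *);

/* Two toy actions: one modifies the packet, one drops it. */
void sketch_dec_ttl(struct sketch_pkt *p) { p->ttl--; }
void sketch_drop(struct sketch_pkt *p) { p->dropped = true; }

/* Runs 'actions' on a private copy of '*orig' and stops at the first drop.
 * The original is never modified, so whatever follows the clone sees the
 * packet exactly as it was before the envelope.  Returns the final state
 * of the copy for inspection. */
struct sketch_pkt
sketch_clone(const struct sketch_pkt *orig,
             sketch_action *const actions[], size_t n)
{
    assert(orig != NULL);
    struct sketch_pkt copy = *orig;

    for (size_t i = 0; i < n && !copy.dropped; i++) {
        actions[i](&copy);
    }
    return copy;
}
```

With the inner list { sketch_dec_ttl, sketch_drop, sketch_dec_ttl } and a packet whose ttl is 64, the returned copy has ttl 63 and is marked dropped (the third action is skipped), while the original still has ttl 64 and is not dropped.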
2017-01-10 18:13:47 -08:00
|
|
|
case OVS_ACTION_ATTR_CLONE:
|
2017-02-09 15:41:53 +00:00
|
|
|
odp_execute_clone(dp, batch, steal && last_action, a,
|
|
|
|
dp_execute_action);
|
2017-01-10 18:13:47 -08:00
|
|
|
if (last_action) {
|
|
|
|
/* We do not need to free the packets. odp_execute_clone() has
|
|
|
|
* stolen them. */
|
|
|
|
return;
|
|
|
|
}
|
2017-11-09 10:15:31 +00:00
|
|
|
break;
|
2017-02-23 11:27:54 -08:00
|
|
|
case OVS_ACTION_ATTR_METER:
|
|
|
|
/* Not implemented yet. */
|
2017-01-10 18:13:47 -08:00
|
|
|
break;
|
2017-05-06 15:49:43 +00:00
|
|
|
case OVS_ACTION_ATTR_PUSH_ETH: {
|
|
|
|
const struct ovs_action_push_eth *eth = nl_attr_get(a);
|
|
|
|
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-05-06 15:49:43 +00:00
|
|
|
push_eth(packet, &eth->addresses.eth_dst,
|
|
|
|
&eth->addresses.eth_src);
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
case OVS_ACTION_ATTR_POP_ETH:
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2017-05-06 15:49:43 +00:00
|
|
|
pop_eth(packet);
|
|
|
|
}
|
|
|
|
break;
|
2017-01-10 18:13:47 -08:00
|
|
|
|
2018-01-06 13:47:51 +08:00
|
|
|
case OVS_ACTION_ATTR_PUSH_NSH: {
|
|
|
|
uint32_t buffer[NSH_HDR_MAX_LEN / 4];
|
|
|
|
struct nsh_hdr *nsh_hdr = ALIGNED_CAST(struct nsh_hdr *, buffer);
|
|
|
|
nsh_reset_ver_flags_ttl_len(nsh_hdr);
|
|
|
|
odp_nsh_hdr_from_attr(nl_attr_get(a), nsh_hdr, NSH_HDR_MAX_LEN);
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2018-01-06 13:47:51 +08:00
|
|
|
push_nsh(packet, nsh_hdr);
|
2017-08-05 13:41:11 +08:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
}
|
2018-01-06 13:47:51 +08:00
|
|
|
case OVS_ACTION_ATTR_POP_NSH: {
|
2017-09-20 14:12:59 +01:00
|
|
|
size_t i;
|
|
|
|
const size_t num = dp_packet_batch_size(batch);
|
2017-08-05 13:41:11 +08:00
|
|
|
|
|
|
|
DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
|
2018-01-06 13:47:51 +08:00
|
|
|
if (pop_nsh(packet)) {
|
2017-08-05 13:41:11 +08:00
|
|
|
dp_packet_batch_refill(batch, packet, i);
|
|
|
|
} else {
|
|
|
|
dp_packet_delete(packet);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
}
|
2018-01-19 14:21:51 -05:00
|
|
|
case OVS_ACTION_ATTR_CT_CLEAR:
|
2018-02-27 10:41:30 -08:00
|
|
|
DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
|
2018-01-19 14:21:51 -05:00
|
|
|
conntrack_clear(packet);
|
|
|
|
}
|
|
|
|
break;
|
2017-08-05 13:41:11 +08:00
|
|
|
|
2015-03-06 10:09:10 -08:00
|
|
|
case OVS_ACTION_ATTR_OUTPUT:
|
|
|
|
case OVS_ACTION_ATTR_TUNNEL_PUSH:
|
|
|
|
case OVS_ACTION_ATTR_TUNNEL_POP:
|
|
|
|
case OVS_ACTION_ATTR_USERSPACE:
|
|
|
|
case OVS_ACTION_ATTR_RECIRC:
|
Add support for connection tracking.
This patch adds a new action and fields to OVS that allow connection
tracking to be performed. This support works in conjunction with the
Linux kernel support merged into the Linux-4.3 development cycle.
Packets have two possible states with respect to connection tracking:
Untracked packets have not previously passed through the connection
tracker, while tracked packets have previously been through the
connection tracker. For OpenFlow pipeline processing, untracked packets
can become tracked, and they will remain tracked until the end of the
pipeline. Tracked packets cannot become untracked.
Connections can be unknown, uncommitted, or committed. Packets which are
untracked have unknown connection state. To know the connection state,
the packet must become tracked. Uncommitted connections have no
connection state stored about them, so it is only possible for the
connection tracker to identify whether they are a new connection or
whether they are invalid. Committed connections have connection state
stored beyond the lifetime of the packet, which allows later packets in
the same connection to be identified as part of the same established
connection, or related to an existing connection - for instance ICMP
error responses.
The new 'ct' action transitions the packet from "untracked" to
"tracked" by sending this flow through the connection tracker.
The following parameters are supported initially:
- "commit": When commit is executed, the connection moves from
uncommitted state to committed state. This signals that information
about the connection should be stored beyond the lifetime of the
packet within the pipeline. This allows future packets in the same
connection to be recognized as part of the same "established" (est)
connection, as well as identifying packets in the reply (rpl)
direction, or packets related to an existing connection (rel).
- "zone=[u16|NXM]": Perform connection tracking in the zone specified.
Each zone is an independent connection tracking context. When the
"commit" parameter is used, the connection will only be committed in
the specified zone, and not in other zones. This is 0 by default.
- "table=NUMBER": Fork pipeline processing in two. The original instance
of the packet will continue processing the current actions list as an
untracked packet. An additional instance of the packet will be sent to
the connection tracker, which will be re-injected into the OpenFlow
pipeline to resume processing in the specified table, with the
ct_state and other ct match fields set. If the table is not specified,
then the packet is submitted to the connection tracker, but the
pipeline does not fork and the ct match fields are not populated. It
is strongly recommended to specify a table later than the current
table to prevent loops.
When the "table" option is used, the packet that continues processing in
the specified table will have the ct_state populated. The ct_state may
have any of the following flags set:
- Tracked (trk): Connection tracking has occurred.
- Reply (rpl): The flow is in the reply direction.
- Invalid (inv): The connection tracker couldn't identify the connection.
- New (new): This is the beginning of a new connection.
- Established (est): This is part of an already existing connection.
- Related (rel): This connection is related to an existing connection.
For more information, consult the ovs-ofctl(8) man pages.
Below is a simple example flow table to allow outbound TCP traffic from
port 1 and drop traffic from port 2 that was not initiated by port 1:
table=0,priority=1,action=drop
table=0,arp,action=normal
table=0,in_port=1,tcp,ct_state=-trk,action=ct(commit,zone=9),2
table=0,in_port=2,tcp,ct_state=-trk,action=ct(zone=9,table=1)
table=1,in_port=2,ct_state=+trk+est,tcp,action=1
table=1,in_port=2,ct_state=+trk+new,tcp,action=drop
Based on original design by Justin Pettit, contributions from Thomas
Graf and Daniele Di Proietto.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
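The ct_state matches used in the example flow table above (for instance ct_state=-trk or ct_state=+trk+est) can be read as "these flags must be set, those must be clear". The sketch below illustrates that reading only. The enum values and sketch_ct_state_matches() are assumptions made for this example and do not reflect the bit encodings OVS actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the values are arbitrary, not OVS's encodings. */
enum sketch_ct_state {
    SK_CS_NEW = 1 << 0,   /* new: beginning of a new connection. */
    SK_CS_EST = 1 << 1,   /* est: part of an existing connection. */
    SK_CS_REL = 1 << 2,   /* rel: related to an existing connection. */
    SK_CS_RPL = 1 << 3,   /* rpl: reply direction. */
    SK_CS_INV = 1 << 4,   /* inv: connection could not be identified. */
    SK_CS_TRK = 1 << 5,   /* trk: connection tracking has occurred. */
};

/* Returns true if every bit in 'must_set' is set in 'state' and every bit
 * in 'must_clear' is clear.  "+trk+est" becomes must_set = TRK | EST, and
 * "-trk" becomes must_clear = TRK. */
bool
sketch_ct_state_matches(uint32_t state, uint32_t must_set,
                        uint32_t must_clear)
{
    assert((must_set & must_clear) == 0);
    return (state & must_set) == must_set && (state & must_clear) == 0;
}
```

Under this model an untracked packet (state 0) matches -trk, a packet with trk and est set matches +trk+est, and a packet with trk and new set does not match +trk+est.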
2015-08-11 10:56:09 -07:00
|
|
|
case OVS_ACTION_ATTR_CT:
|
2013-05-29 15:06:38 +09:00
|
|
|
case OVS_ACTION_ATTR_UNSPEC:
|
|
|
|
case __OVS_ACTION_ATTR_MAX:
|
2013-12-17 10:32:12 -08:00
|
|
|
OVS_NOT_REACHED();
|
2013-05-29 15:06:38 +09:00
|
|
|
}
|
|
|
|
}
|
2014-06-23 11:43:59 -07:00
|
|
|
|
2017-01-17 15:56:58 -08:00
|
|
|
dp_packet_delete_batch(batch, steal);
|
2013-12-16 08:14:52 -08:00
|
|
|
}
|