Ports have a new layer3 attribute if they send/receive L3 packets.
The packet_type included in structs dp_packet and flow is considered in
ofproto-dpif. The classical L2 match fields (dl_src, dl_dst, dl_type, and
vlan_tci, vlan_vid, vlan_pcp) now have Ethernet as pre-requisite.
A dummy ethernet header is pushed to L3 packets received from L3 ports
before the the pipeline processing starts. The ethernet header is popped
before sending a packet to a L3 port.
For datapath ports that can receive L2 or L3 packets, the packet_type
becomes part of the flow key for datapath flows and is handled
appropriately in dpif-netdev.
In the 'else' branch in flow_put_on_pmd() function, the additional check
flow_equal(&match.flow, &netdev_flow->flow) was removed, as a) the dpcls
lookup is sufficient to uniquely identify a flow and b) it caused false
negatives because the flow in netdev->flow may not properly masked.
In dpif_netdev_flow_put() we now use the same method for constructing the
netdev_flow_key as the one used when adding the flow to the dplcs to make sure
these always match. The function netdev_flow_key_from_flow() used so far was
not only inefficient but sometimes caused mismatches and subsequent flow
update failures.
The kernel datapath does not support the packet_type match field.
Instead it encodes the packet type implictly by the presence or absence of
the Ethernet attribute in the flow key and mask.
This patch filters the PACKET_TYPE attribute out of netlink flow key and
mask to be sent to the kernel datapath.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
I know of two reasons to mark a structure as "packed". The first is
because the structure must match some defined interface and therefore
compiler-inserted padding between or after members would cause its layout
to diverge from that interface. This is not a problem in a structure that
follows the general alignment rules that are seen in ABIs for all the
architectures that OVS cares about: basically, that a struct member needs
to be aligned on a boundary that is a multiple of the member's size.
The second reason is because instances of the struct tend to be at
misaligned addresses.
struct eth_header and struct vlan_eth_header are normally aligned on
16-bit boundaries (at least), and they contain only 16-bit members, so
there's no need to pack them. This commit removes the packed annotation.
This commit also removes the packed annotation from struct llc_header.
Since that struct only contains 8-bit members, I don't know of any benefit
to packing it, period.
This commit also removes a few more packed annotations that are much less
important.
When these packed annotations were removed, it caused a few warnings
related to casts from 'uint8_t *' to more strictly aligned pointer types,
related to struct ovs_action_push_tnl. That's because that struct had a
trailing member used to store packet headers, that was declared as
a uint8_t[]. Before, when this was cast to 'struct eth_header *', there
was no change in alignment since eth_header was packed; now that
eth_header is not packed, the compiler considers it suspicious. This
commit avoids that problem by changing the member from uint8_t[] to
uint32_t[], which assures the compiler that it is properly aligned.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
I was about to add another complete list of all the connection states but
this eliminates the need.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Miguel Angel Ajo <majopela@redhat.com>
Various printf() format specifiers in the tree had minor technical issues
which the Mac OS build reported, e.g. here:
https://s3.amazonaws.com/archive.travis-ci.org/jobs/208718342/log.txt
These tend to fall into two categories of harmless warnings:
1. Wrong width for types that are all promoted to 'int'. For example,
both uint8_t and uint16_t are both promoted to 'int' as part of a call
to printf(), but using PRIu8 for a uint16_t causes a warning.
2. Wrong format specifier for type promoted to 'int' due to arithmetic.
For example, if 'x' is a uint8_t, then x >> 1 has type 'int' due to
C's promotion rules, so the correct format specifier is %d and using
PRIu8 will cause a warning.
This commit fixes the warnings. I didn't see anything that rose to the
level of a bug.
These warnings only showed up on Mac OS X because of differences in the
format specifiers that Mac OS uses for PRI*.
Reported-by: Shu Shen <shu.shen@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Flow key handling changes:
- Add VLAN header array in struct flow, to record multiple 802.1q VLAN
headers.
- Add dpif multi-VLAN capability probing. If datapath supports
multi-VLAN, increase the maximum depth of nested OVS_KEY_ATTR_ENCAP.
Refactor VLAN handling in dpif-xlate:
- Introduce 'xvlan' to track VLAN stack during flow processing.
- Input and output VLAN translation according to the xbundle type.
Push VLAN action support:
- Allow ethertype 0x88a8 in VLAN headers and push_vlan action.
- Support push_vlan on dot1q packets.
Use other_config:vlan-limit in table Open_vSwitch to limit maximum VLANs
that can be matched. This allows us to preserve backwards compatibility.
Add test cases for VLAN depth limit, Multi-VLAN actions and QinQ VLAN
handling
Co-authored-by: Thomas F Herbert <thomasfherbert@gmail.com>
Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com>
Co-authored-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Eric Garver <e@erig.me>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Upstream commit:
commit 9dd7f8907c3705dc7a7a375d1c6e30b06e6daffc
Author: Jarno Rajahalme <jarno@ovn.org>
Date: Thu Feb 9 11:21:59 2017 -0800
openvswitch: Add original direction conntrack tuple to sw_flow_key.
Add the fields of the conntrack original direction 5-tuple to struct
sw_flow_key. The new fields are initially marked as non-existent, and
are populated whenever a conntrack action is executed and either finds
or generates a conntrack entry. This means that these fields exist
for all packets that were not rejected by conntrack as untrackable.
The original tuple fields in the sw_flow_key are filled from the
original direction tuple of the conntrack entry relating to the
current packet, or from the original direction tuple of the master
conntrack entry, if the current conntrack entry has a master.
Generally, expected connections of connections having an assigned
helper (e.g., FTP), have a master conntrack entry.
The main purpose of the new conntrack original tuple fields is to
allow matching on them for policy decision purposes, with the premise
that the admissibility of tracked connections reply packets (as well
as original direction packets), and both direction packets of any
related connections may be based on ACL rules applying to the master
connection's original direction 5-tuple. This also makes it easier to
make policy decisions when the actual packet headers might have been
transformed by NAT, as the original direction 5-tuple represents the
packet headers before any such transformation.
When using the original direction 5-tuple the admissibility of return
and/or related packets need not be based on the mere existence of a
conntrack entry, allowing separation of admission policy from the
established conntrack state. While existence of a conntrack entry is
required for admission of the return or related packets, policy
changes can render connections that were initially admitted to be
rejected or dropped afterwards. If the admission of the return and
related packets was based on mere conntrack state (e.g., connection
being in an established state), a policy change that would make the
connection rejected or dropped would need to find and delete all
conntrack entries affected by such a change. When using the original
direction 5-tuple matching the affected conntrack entries can be
allowed to time out instead, as the established state of the
connection would not need to be the basis for packet admission any
more.
It should be noted that the directionality of related connections may
be the same or different than that of the master connection, and
neither the original direction 5-tuple nor the conntrack state bits
carry this information. If needed, the directionality of the master
connection can be stored in master's conntrack mark or labels, which
are automatically inherited by the expected related connections.
The fact that neither ARP nor ND packets are trackable by conntrack
allows mutual exclusion between ARP/ND and the new conntrack original
tuple fields. Hence, the IP addresses are overlaid in union with ARP
and ND fields. This allows the sw_flow_key to not grow much due to
this patch, but it also means that we must be careful to never use the
new key fields with ARP or ND packets. ARP is easy to distinguish and
keep mutually exclusive based on the ethernet type, but ND being an
ICMPv6 protocol requires a bit more attention.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch squashes in minimal amount of OVS userspace code to not
break the build. Later patches contain the full userspace support.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Add DPIF-level infrastructure for meters. Allow meter_set to modify
the meter configuration (e.g. set the burst size if unspecified).
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Andy Zhou <azhou@ovn.org>
Upstream commit:
commit 91820da6ae85904d95ed53bf3a83f9ec44a6b80a
Author: Jiri Benc <jbenc@redhat.com>
Date: Thu Nov 10 16:28:23 2016 +0100
openvswitch: add Ethernet push and pop actions
It's not allowed to push Ethernet header in front of another Ethernet
header.
It's not allowed to pop Ethernet header if there's a vlan tag. This
preserves the invariant that L3 packet never has a vlan tag.
Based on previous versions by Lorand Jakab and Simon Horman.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Committer notes]
Fix build with the upstream commit by folding in the required switch
case enum handlers.
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
When adding userspace datapath clone action, the corresponding odp
actions parser and unit tests were missing. This patch adds them.
Reported-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Add support for userspace datapath clone action. The clone action
provides an action envelope to enclose an action list.
For example, with actions A, B, C and D, and an action list:
A, clone(B, C), D
The clone action will ensure that:
- D will see the same packet, and any meta states, such as flow, as
action B.
- D will be executed regardless whether B, or C drops a packet. They
can only drop a clone.
- When B drops a packet, clone will skip all remaining actions
within the clone envelope. This feature is useful when we add
meter action later: The meter action can be implemented as a
simple action without its own envolop (unlike the sample action).
When necessary, the flow translation layer can enclose a meter action
in clone.
The clone action is very similar with the OpenFlow clone action.
This is by design to simplify vswitchd flow translation logic.
Without datapath clone, vswitchd simulate the effect by inserting
datapath actions to "undo" clone actions. The above flow will be
translated into A, B, C, -C, -B, D.
However, there are two issues:
- The resulting datapath action list may be longer without using
clone.
- Some actions, such as NAT may not be possible to reverse.
This patch implements clone() simply with packet copy. The performance
can be improved with later patches, for example, to delay or avoid
packet copy if possible. It seems datapath should have enough context
to carry out such optimization without the userspace context.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
Code is simplified when the ODP keys use the same type as the struct
flow for the IPv6 addresses. As the change is facilitated by
extract-odp-netlink-h, this change only affects the userspace. We
already do the same for the ethernet addresses.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
This patch fixes problems with MPLS handling related to patch ports
and group buckets.
If a group bucket or a peer bridge across a patch port pushes MPLS
headers to a non-MPLS packet and outputs, the flow translation after
returning from the group bucket or patch port would undo the packet
transformations so that the processing could continue with the packet
as it was before entering the patch port. There were two problems
with this:
1. As part of the first MPLS push on a non-MPLS packet, the flow
translation would first clear the L3/4 headers of the 'flow' to mark
those fields invalid. Later, when committing 'flow' changes to
datapath actions before output, the necessary datapath MPLS actions
are created and the corresponding changes updated to the 'base flow'.
This was done using the same flow_push_mpls() function that clears
the L2/3 headers, so also the 'base flow' L2/3 headers were cleared.
Then, when translation returns from a patch port or group bucket, the
original 'flow' is restored, now showing no sign of the MPLS labels.
Since the 'base flow' now has the MPLS labels, following translations
know to issue MPLS POP actions before any output actions. However, as
part of checking for changes to IP headers we test that the IP
protocol type was not changed. But now the 'base flow's 'nw_proto'
field is zero and an assert fail crashes OVS.
This is solved by not clearing the L3/4 fields of the 'base
flow'. This allows the processing after the patch port to continue
with L3/4 fields as if no MPLS was done, after first issuing the
necessary MPLS POP actions.
2. IP header updates were done before the MPLS POP actions were
issued. This caused incorrect packet output after, e.g., group action
or patch port. For example, with actions:
group 1234: all bucket=push_mpls,output:LOCAL
ip actions=group:1234,dec_ttl,output:LOCAL,output:LOCAL
the dec_ttl would only be executed before the last output to LOCAL,
since at the time of committing IP changes after the group action the
packet was still an MPLS packet.
This is solved by checking the dl_type of both 'flow' and 'base flow'
and issuing MPLS actions if they can transform the packet from an MPLS
packet to a non-MPLS packet. For an IP packet the change in ttl can
then be correctly committed before the last two output actions.
Two test cases are added to prevent future regressions.
Reported-by: Thomas Morin <thomas.morin@orange.com>
Suggested-by: Takashi YAMAMOTO <yamamoto@ovn.org>
Fixes: 8bfd0fdac ("Enhance userspace support for MPLS, for up to 3 labels.")
Fixes: 1b035ef20 ("mpls: Allow l3 and l4 actions to prior to a push_mpls action")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Before Open vSwitch 2.5.90, IPFIX reports from Open vSwitch didn't include
whether the packet was ingressing or egressing the switch. Starting in
OVS 2.5.90, this information was available but only accurate if the action
included a port number that indicated a tunnel. Conflating these two does
not always make sense (not every packet involves a tunnel!), so this patch
makes it possible for the sample action to simply say whether it's for
ingress or egress.
This is difficult to test, since the "tests" directory of OVS does not have
a proper IPFIX listener. This passes those tests, plus a couple that just
verify that the actions are properly parsed and formatted. Benli did test
it end-to-end in a VMware use case.
Requested-by: Benli Ye <daniely@vmware.com>
Tested-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Simon Horman <simon.horman@netronome.com>
This helper is a little tidier than the alternative. Use it treewide.
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Simon Horman <simon.horman@netronome.com>
When using tunnel TLVs (at the moment, this means Geneve options), a
controller must first map the class and type onto an appropriate OXM
field so that it can be used in OVS flow operations. This table is
managed using OpenFlow extensions.
The original code that added support for TLVs made the mapping table
global as a simplification. However, this is not really logically
correct as the OpenFlow management commands are operating on a per-bridge
basis. This removes the original limitation to make the table per-bridge.
One nice result of this change is that it is generally clearer whether
the tunnel metadata is in datapath or OpenFlow format. Rather than
allowing ad-hoc format changes and trying to handle both formats in the
tunnel metadata functions, the format is more clearly separated by function.
Datapaths (both kernel and userspace) use datapath format and it is not
changed during the upcall process. At the beginning of action translation,
tunnel metadata is converted to OpenFlow format and flows and wildcards
are translated back at the end of the process.
As an additional benefit, this change improves performance in some flow
setup situations by keeping the tunnel metadata in the original packet
format in more cases. This helps when copies need to be made as the amount
of data touched is only what is present in the packet rather than the
maximum amount of metadata supported.
Co-authored-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Upstream commit:
commit b46f6ded906ef0be52a4881ba50a084aeca64d7e
Author: Nicolas Dichtel <nicolas.dichtel@6wind.com>
libnl: nla_put_be64(): align on a 64-bit area
nla_data() is now aligned on a 64-bit area.
A temporary version (nla_put_be64_32bit()) is added for nla_put_net64().
This function is removed in the next patch.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
The patch adds a new action to support packet truncation. The new action
is formatted as 'output(port=n,max_len=m)', as output to port n, with
packet size being MIN(original_size, m).
One use case is to enable port mirroring to send smaller packets to the
destination port so that only useful packet information is mirrored/copied,
saving some performance overhead of copying entire packet payload. Example
use case is below as well as shown in the testcases:
- Output to port 1 with max_len 100 bytes.
- The output packet size on port 1 will be MIN(original_packet_size, 100).
# ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)'
- The scope of max_len is limited to output action itself. The following
packet size of output:1 and output:2 will be intact.
# ovs-ofctl add-flow br0 \
'actions=output(port=1,max_len=100),output:1,output:2'
- The Datapath actions shows:
# Datapath actions: trunc(100),1,1,2
Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Add support to export tunnel information for flow-based IPFIX.
The original steps to configure flow level IPFIX:
1) Create a new record in Flow_Sample_Collector_Set table:
'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
2) Add IPFIX configuration which is referred by corresponding
row in Flow_Sample_Collector_Set table:
'ovs-vsctl -- set Flow_Sample_Collector_Set
"Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX
targets=\"IP:4739\" obs_domain_id=123 obs_point_id=456
cache_active_timeout=60 cache_max_flows=13'
3) Add sample action to the flows:
'ovs-ofctl add-flow mybridge in_port=1,
actions=sample'('probability=65535,collector_set_id=1,
obs_domain_id=123,obs_point_id=456')',output:3'
NXAST_SAMPLE action was used in step 3. In order to support exporting tunnel
information, the NXAST_SAMPLE2 action was added and with NXAST_SAMPLE2 action
in this patch, the step 3 should be configured like below:
'ovs-ofctl add-flow mybridge in_port=1,
actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
obs_point_id=456,sampling_port=3')',output:3'
'sampling_port' can be equal to ingress port or one of egress ports. If sampling
port is equal to output port and the output port is a tunnel port,
OVS_USERSPACE_ATTR_EGRESS_TUN_PORT will be set in the datapath flow sample action.
When flow sample action upcall happens, tunnel information will be retrieved from
the datapath and then IPFIX can export egress tunnel port information. If
samping_port=65535 (OFPP_NONE), flow-based IPFIX will keep the same behavior
as before.
This patch mainly do three tasks:
1) Add a new flow sample action NXAST_SAMPLE2 to support exporting
tunnel information. NXAST_SAMPLE2 action has a new added field
'sampling_port'.
2) Use 'other_configure: enable-tunnel-sampling' to enable or disable
exporting tunnel information.
3) If 'sampling_port' is equal to output port and output port is a tunnel
port, the translation of OpenFlow "sample" action should first emit
set(tunnel(...)), then the sample action itself. It makes sure the
egress tunnel information can be sampled.
4) Add a test of flow-based IPFIX for tunnel set.
How to test flow-based IPFIX:
1) Setup a test environment with two Linux host with Docker supported
2) Create a Docker container and a GRE tunnel port on each host
3) Use ovs-docker to add the container on the bridge
4) Listen on port 4739 on the collector machine and use wireshark to filter
'cflow' packets.
5) Configure flow-based IPFIX:
- 'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
- 'ovs-vsctl -- set Flow_Sample_Collector_Set
"Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX \
targets=\"IP:4739\" cache_active_timeout=60 cache_max_flows=13 \
other_config:enable-tunnel-sampling=true'
- 'ovs-ofctl add-flow mybridge in_port=1,
actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
obs_point_id=456,sampling_port=3')',output:3'
Note: The in-port is container port. The output port and sampling_port
are both open flow port and the output port is a GRE tunnel port.
6) Ping from the container whose host enabled flow-based IPFIX.
7) Get the IPFIX template pakcets and IPFIX information packets.
Signed-off-by: Benli Ye <daniely@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When calling odp_flow_key_from_flow (or _mask), the in_port included
as part of the flow is ignored and must be explicitly passed as a
separate parameter. This is because the assumption was that the flow's
version would often be in OFP format, rather than ODP.
However, at this point all flows that are ready for serialization in
netlink format already have their in_port properly set to ODP format.
As a result, every caller needs to explicitly initialize the extra
paramter to the value that is in the flow. This switches to just use
the value in the flow to simply things and avoid the possibility of
forgetting to initialize the extra parameter.
Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Commit f2d105b5 (ofproto-dpif-xlate: xlate ct_{mark, label} correctly.)
introduced the ovs_u128_and() function. It directly takes ovs_u128
values as arguments instead of pointers to them. As this is a bit more
direct way to deal with 128-bit values, modify the other utility
functions to do the same.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Fix build warning: 'flags_mask' may be used uninitialized.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Even though the number of supported MPLS labels may vary between a
datapath and the OVS userspace, it is better to use the
FLOW_MAX_MPLS_LABELS than a hard-coded '3' as the maximum number of
labels to scan.
Requested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
So far we have been limited to including only one MPLS label in the
textual datapath flow format. Allow upto 3 labels to be included so
that testing with multiple labels becomes easier.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
All of the callers of hash_words() and hash_words64() actually find it
easier to pass in the number of bytes instead of the number of 32-bit
or 64-bit words. These new functions allow the callers to be a little
simpler.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
It is possible to pass some fields to the kernel with a zero mask, but
ovs-dpctl doesn't currently allow it. Change the code to allow it to
mimic what vswitchd is allowed to do.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
When converting between ODP attributes and struct flow_wildcards, we
check that all the prerequisites are exact matched on the mask.
For ND(ICMPv6) attributes, an exact match on tp_src and tp_dst
(which in this context are the icmp type and code) shold look like
htons(0xff), not htons(0xffff). Fix this in two places.
The consequences were that the ODP mask wouldn't include the ND
attributes and the flow would be deleted by the revalidation.
In the ODP context an empty mask netlink attribute usually means that
the flow should be an exact match.
odp_flow_key_to_mask{,_udpif}() instead return a struct flow_wildcards
with matches only on recirc_id and vlan_tci.
A more appropriate behavior is to handle a missing (zero length) netlink
mask specially (like we do in userspace and Linux datapath) and create
an exact match flow_wildcards from the original flow.
This fixes a bug in revalidate_ukey(): every flow created with
megaflows disabled would be revalidated away, because the mask would
seem too generic. (Another possible fix would be to handle the special
case of a missing mask in revalidate_ukey(), but this seems a more
generic solution).
commit_set_icmp_action() should do its job only if the packet is ICMP,
otherwise there will be two problems:
* A set ICMP action will be inserted in the ODP actions and the flow
will be slow pathed.
* The tp_src and tp_dst field will be unwildcarded.
Normal TCP or UDP packets won't be impacted, because
commit_set_icmp_action() is called after commit_set_port_action() and it
will see the fields as already committed (TCP/UCP transport ports and ICMP
code/type are stored in the same members in struct flow).
MPLS packets though will hit the bug, causing a nonsensical set action
(which will end up zeroing the transport source port) and an invalid
mask to be generated.
The commit also alters an MPLS testcase to trigger the bug.
Limit the scope of the local vlan variable in format_odp_action()
to where it is used. This is consistent with the treatment of mpls
in the same function.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Note that because there's been no prerequisite on the outer protocol,
we cannot add it now. Instead, treat the ipv4 and ipv6 dst fields in the way
that either both are null, or at most one of them is non-null.
[cascardo: abstract testing either dst with flow_tnl_dst_is_set]
cascardo: using IPv4-mapped address is an exercise for the future, since this
would require special handling of MFF_TUN_SRC and MFF_TUN_DST and OpenFlow
messages.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Co-authored-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add in6_addr counterparts to the existing format and scan functions.
Otherwise we'd need to recast all the time.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Extend OVS conntrack interface to cover NAT. New nested NAT action
may be included with a CT action. A bare NAT action only mangles
existing connections. If a NAT action with src or dst range attribute
is included, new (non-committed) connections are mangled according to
the NAT attributes.
This work extends on a branch by Thomas Graf at
https://github.com/tgraf/ovs/tree/nat.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Instead of taking the source and destination as arguments, make these
functions act like their short and long counterparts.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
This patch adds support for specifying a "helper" or ALG to assist
connection tracking for protocols that consist of multiple streams.
Initially, only support for FTP is included.
Below is an example set of flows to allow FTP control connections from
port 1->2 to establish active data connections in the reverse direction:
table=0,priority=1,action=drop
table=0,arp,action=normal
table=0,in_port=1,tcp,action=ct(alg=ftp,commit),2
table=0,in_port=2,tcp,ct_state=-trk,action=ct(table=1)
table=1,in_port=2,tcp,ct_state=+trk+est,action=1
table=1,in_port=2,tcp,ct_state=+trk+rel,action=ct(commit),1
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>