As the police actions with indexes of 0x10000000-0x1fffffff are
reserved for meter offload, to provide a clean environment for OVS,
these reserved police actions should be deleted on startup. So dump
all the police actions, delete those actions with indexes in this
range.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Add helpers to add, delete and get stats of police action with
the specified index.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Add function to parse police action from netlink message.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Currently, tc merges all header rewrite actions into one tc pedit
action. So the header rewrite actions order is lost. Save each header
rewrite action into one tc pedit action to keep the order. And only
append one tc csum action to the last pedit action of a series.
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Fragmented packets with offset=0 are defragmented by tc act_ct, and
only when assembled pass to next action, in ovs offload case,
a goto action. Since stats are overwritten on each action dump,
only the stats for last action in the tc filter action priority
list is taken, the stats on the goto action, which count
only the assembled packets. See below for example.
Hardware updates just part of the actions (gact, ct, mirred) - those
that support stats_update() operation. Since datapath rules end
with either an output (mirred) or recirc/drop (both gact), tc rule
will at least have one action that supports it. For software packets,
the first action will have the max software packets count.
Tc dumps total packets (hw + sw) and hardware packets, then
software packets needs to be calculated from this (total - hw).
To fix the above, get hardware packets and calculate software packets
for each action, take the max of each set, then combine back
to get the total packets that went through software and hardware.
Example by running ping above MTU (ping <IP> -s 2000):
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=first),
packets:14, bytes:19544,..., actions:ct(zone=1),recirc(0x1)
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=later),
packets:14, bytes:28392,..., actions:ct(zone=1),recirc(0x1)
Second rule should have had bytes=14*<size of 'later' frag>, but instead
it's bytes=14*<size of assembled packets - size of 'first' + 'later'
frags>.
Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Previously, only chain 0 is deleted before attach the ingress block.
If there are rules on the chain other than 0, these rules are not flushed.
In this case, the recreation of qdisc also fails. To fix this issue, flush
rules from all chains.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
There is no need to pass tc rules to hw when just probing
for tc features. this will avoid redundant errors from hw drivers
that may happen.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Acked-By: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
This patch allows to install arp rules to tc dp.
In the future, arp will be offloaded to hardware to
be processed. So OvS enable this now.
$ ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(3),eth(),\
eth_type(0x0806),arp(op=2,tha=00:50:56:e1:4b:ab,tip=10.255.1.116)' 2
$ ovs-appctl dpctl/dump-flows
... arp(tip=10.255.1.116,op=2,tha=00:50:56:e1:4b:ab) ...
$ tc filter show dev <ethx> ingress
...
eth_type arp
arp_tip 10.255.1.116
arp_op reply
arp_tha 00:50:56:e1:4b:ab
not_in_hw
action order 1: mirred (Egress Redirect to device <ethy>) stolen
...
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
When dumping flows in terse mode set TCA_DUMP_FLAGS attribute to
TCA_DUMP_FLAGS_TERSE flag to prevent unnecessary copying of data between
kernel and user spaces. Only expect kernel to provide cookie, stats and
flags when dumping filters in terse mode.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Port range struct is currently union so the last min/max port assignment
wins, and kernel doesn't receive the range.
Change it to struct type.
Fixes: 2bf6ffb76ac6 ("netdev-offload-tc: Add conntrack nat support")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
tc_make_tcf_id() doesn't initialize the 'chain' field leaving it with a
random value from the stack. This makes request_from_tcf_id() create
request with random TCA_CHAIN included. These requests are obviously
doesn't work as needed leading to broken flow dump and various other
issues. Fix that by using designated initializer instead.
Fixes: acdd544c4c9a ("tc: Introduce tcf_id to specify a tc filter")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Zone and ct_state first.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Each recirculation id will create a tc chain, and we translate
the recirculation action to a tc goto chain action.
We check for kernel support for this by probing OvS Datapath for the
tc recirc id sharing feature. If supported, we can offload rules
that match on recirc_id, and recirculation action safely.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Move all that is needed to identify a tc filter to a
new structure, tcf_id. This removes a lot of duplication
in accessing/creating tc filters.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Currently, ovs supports to offload max TCA_ACT_MAX_PRIO(32) actions.
But net sched api has a limit of 4K message size which is not enough
for 32 actions when echo flag is set.
After a lot of testing, we find that 16 actions is a reasonable number.
So in this commit, we introduced a new define to limit the max actions.
Fixes: 0c70132cd288("tc: Make the actions order consistent")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Recent modifications to TC allows the modifying of fields within the
outermost MPLS header of a packet. OvS datapath rules impliment an MPLS
set action by supplying a new MPLS header that should overwrite the
current one.
Convert the OvS datapath MPLS set action to a TC modify action and allow
such rules to be offloaded to a TC datapath.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
TC can now be used to push an MPLS header onto a packet. The MPLS label is
the only information that needs to be passed here with the rest reverting
to default values if none are supplied. OvS, however, gives the entire
MPLS header to be pushed along with the MPLS protocol to use. TC can
optionally accept these values so can be made replicate the OvS datapath
rule.
Convert OvS MPLS push datapath rules to TC format and offload to a TC
datapath.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
TC now supports an action to pop the outer MPLS header from a packet. The
next protocol after the header is required alongside this. Currently, OvS
datapath rules also supply this information.
Offload OvS MPLS pop actions to TC along with the next protocol.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Offloading rules to a TC datapath only allows the creating of ingress hook
qdiscs and the application of filters to these. However, there may be
certain situations where an egress qdisc is more applicable (e.g. when
offloading to TC rules applied to OvS internal ports).
Extend the TC API in OvS to allow the creation of egress qdiscs and to add
or interact with flower filters applied to these.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
The TC datapath only permits the offload of mirred actions if they are
egress. To offload TC actions that output to OvS internal ports, ingress
mirred actions are required. At the TC layer, an ingress mirred action
passes the packet back into the network stack as if it came in the action
port rather than attempting to egress the port.
Update OvS-TC offloads to support ingress mirred actions. To ensure
packets that match these rules are properly passed into the network stack,
add a TC skbedit action along with ingress mirred that sets the pkt_type
to PACKET_HOST. This mirrors the functionality of the OvS internal port
kernel module.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Firstly this patch introduces the notion of reserved priority, as the
filter implementing ingress policing would require the highest priority.
Secondly it allows setting rate limiters while tc-offloads has been
enabled. Lastly it installs a matchall filter that matches all traffic
and then applies a police action, when configuring an ingress rate
limiter.
An example of what to expect:
OvS CLI:
ovs-vsctl set interface <netdev_name> ingress_policing_rate=5000
ovs-vsctl set interface <netdev_name> ingress_policing_burst=100
Resulting TC filter:
filter protocol ip pref 1 matchall chain 0
filter protocol ip pref 1 matchall chain 0 handle 0x1
not_in_hw
action order 1: police 0x1 rate 5Mbit burst 125Kb mtu 64Kb
action drop/continue overhead 0b
ref 1 bind 1 installed 3 sec used 3 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
10.0.0.200 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 60.13 4.49
ovs-vsctl list interface <netdev_name>
_uuid : 2ca774e8-8b95-430f-a2c2-f8f742613ab1
admin_state : up
...
ingress_policing_burst: 100
ingress_policing_rate: 5000
...
type : ""
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Currently the TC tunnel_key action is always
initialized with the given tunnel id value. However,
some tunneling protocols define the tunnel id as an optional field.
This patch initializes the id field of tunnel_key:set and tunnel_key:unset
only if a value is provided.
In the case that a tunnel key value is not provided by the user
the key flag will not be set.
Signed-off-by: Adi Nissim <adin@mellanox.com>
Acked-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Extend ovs-tc translation by allowing non-byte-aligned fields
for set actions. Use new boundary shifts and add set ipv6 traffic
class action offload via pedit.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add setting of ipv4 dscp and ecn fields in tc offload using pedit.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add support for IPv6 hlimit field.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Tunnels (gre, geneve, vxlan) support 'csum' option (true/false), default is false.
Generated encap TC rule will now be configured as the tunnel configuration.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add TC offload support for classifying geneve tunnels with options.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add TC offload support for encapsulating geneve tunnels with options.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add TC offload support for classifying single MPLS tagged traffic.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Move the tunnel match fields to be part of the tc/flower key structure.
This is pre-step for being able to apply masked match where needed.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Support matching on tos and ttl of ip tunnels
for the TC data-path.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Allow to set the tos and ttl for TC tunnels.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add the missing code to match on ip tos when dealing
with the TC data-path.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Currently the inner VLAN header is ignored when using the TC data-path.
As TC flower supports QinQ, now we can offload the rules to match on both
outer and inner VLAN headers.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Currently we only support 802.1q, so we can offload push action without
specifying any vlan type. Kernel will push 802.1q ethertype by default.
But to support QinQ, we need to tell what ethertype is in push action as
it could be 802.1ad.
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Blocks, in tc classifiers, allow the grouping of multiple qdiscs with an
associated block id. Whenever a filter is added to/removed from this
block, the filter is added to/removed from all associated qdiscs.
Extend TC offload functions to take a block id as a parameter. If the id
is zero then the dqisc is not considered part of a block.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Previously, any rule that is offloaded via a netdev, not necessarily
to the HW, would be reported as "offloaded". This patch fixes this
misalignment, and introduces the 'dp' state, as follows:
rule is in HW via TC offload -> offloaded=yes dp:tc
rule is in not HW over TC DP -> offloaded=no dp:tc
rule is in not HW over OVS DP -> offloaded=no dp:ovs
To achieve this, the flows's 'offloaded' flag was encapsulated in a new
attrs struct, which contains the offloaded state of the flow and the
DP layer the flow is handled in, and instead of setting the flow's
'offloaded' state based solely on the type of dump it was acquired
via, for netdev flows it now sends the new attrs struct to be
collected along with the rest of the flow via the netdev, allowing
it to be set per flow.
For TC offloads, the offloaded state is set based on the 'in_hw' and
'not_in_hw' flags received from the TC as part of the flower. If no
such flag was received, due to lack of kernel support, it defaults
to true.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
[simon: resolved conflict in lib/dpctl.man]
Signed-off-by: Simon Horman <simon.horman@netronome.com>
When OVS DP passes the actions to TC library, we save all the
actions in data structure tc_flower and each action type has its
own field in tc_flower. So when TC library passes the actions to
kernel, actually the actions order is lost.
We add an actions array in tc_flower to keep the actions order
in this patch.
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add support for frag no, first and later.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
FreeBSD insists that <sys/types.h> be included before <netinet/in.h> and
that <netinet/in.h> be included before <arpa/inet.h>. This adds guards to
the "sparse" headers to yield a warning if this order is violated. This
commit also adjusts the order of many #includes to suit this requirement.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
On dump, parse and verify the tc csum action update flags
in the same way as we put them.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
To be later used to implement ovs action set offloading.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellacom>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
To be used later for offloading rules matching on tcp_flags.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Add matching on ip layer ttl, to be used later.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Split dst/src_port and ipv4/ipv6 union so we can
distingush them easily for later features.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Fix minor style variations and unnecessary includes.
Signed-off-by: Joe Stringer <joe@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Greg Rose <gvrose8192@gmail.com>
linux/pkt_cls.h indirectly includes linux/stddef.h, which may define
__always_inline when using Linux kernel headers 4.6 or later.
netinet/in.h indirectly includes sys/cdefs.h, which before Glibc 2.24
would unconditionally define __always_inline.
This combination of headers causes compilation failure. Resolve the
issue by placing the netinet/in.h include above linux/pkt_cls.h.
Sourceware-BZ: #20215.
Fixes: f98e418fbdb6 ("tc: Add tc flower functions")
Signed-off-by: Joe Stringer <joe@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Greg Rose <gvrose8192@gmail.com>