mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

91 Commits

Author SHA1 Message Date
Jianbo Liu
218304df18 netdev-offload-tc: Offloading rules with police actions
When offloading a rule, tc should be given the police index instead
of the meter id. Because each meter is mapped to a police action and
the mapping is stored in the meter_id_to_police_idx hmap, this hmap is
used to find the police index. The reverse mapping from police index
to meter id is also added, so the meter id can be retrieved and passed
to dpif while dumping rules.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:20:50 +02:00
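Note on 218304df18: the two-way lookup described in that message can be sketched as below. This is only an illustration under stated assumptions: OVS keeps the mapping in hmaps, whereas this sketch uses a small fixed array with linear search, and every name in it (`demo_pairs`, `demo_map_insert`, `DEMO_MAX_METERS`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_METERS 64   /* Hypothetical capacity for this sketch. */

struct demo_pair {
    uint32_t meter_id;
    uint32_t police_idx;
};

struct demo_pair demo_pairs[DEMO_MAX_METERS];
size_t demo_n_pairs;

/* Records one meter id <-> police index pair.  Returns false when full. */
bool
demo_map_insert(uint32_t meter_id, uint32_t police_idx)
{
    if (demo_n_pairs >= DEMO_MAX_METERS) {
        return false;
    }
    demo_pairs[demo_n_pairs].meter_id = meter_id;
    demo_pairs[demo_n_pairs].police_idx = police_idx;
    demo_n_pairs++;
    return true;
}

/* Forward lookup: needed when a rule is offloaded to tc. */
bool
demo_meter_to_police(uint32_t meter_id, uint32_t *police_idx)
{
    assert(police_idx != NULL);
    for (size_t i = 0; i < demo_n_pairs; i++) {
        if (demo_pairs[i].meter_id == meter_id) {
            *police_idx = demo_pairs[i].police_idx;
            return true;
        }
    }
    return false;
}

/* Reverse lookup: needed when rules are dumped back towards dpif. */
bool
demo_police_to_meter(uint32_t police_idx, uint32_t *meter_id)
{
    assert(meter_id != NULL);
    for (size_t i = 0; i < demo_n_pairs; i++) {
        if (demo_pairs[i].police_idx == police_idx) {
            *meter_id = demo_pairs[i].meter_id;
            return true;
        }
    }
    return false;
}
```

A hash map makes both lookups cheap for large meter counts; the linear search here only keeps the sketch short.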
Jianbo Liu
f6ed09ef55 netdev-offload-tc: Cleanup police actions with reserved indexes on startup
As the police actions with indexes of 0x10000000-0x1fffffff are
reserved for meter offload, to provide a clean environment for OVS,
these reserved police actions should be deleted on startup. So dump
all the police actions and delete those with indexes in this
range.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:20:37 +02:00
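Note on f6ed09ef55: the range test behind that cleanup can be sketched as follows. The range bounds come from the commit message; the macro and function names are hypothetical, and the sketch only counts matching indexes in an already dumped list rather than talking to the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Range reserved for meter offload, as stated in the commit message.
 * The macro names are made up for this sketch. */
#define DEMO_POLICE_IDX_BASE 0x10000000u
#define DEMO_POLICE_IDX_MAX  0x1fffffffu

/* Returns true if 'idx' falls inside the reserved range. */
bool
demo_police_idx_is_reserved(uint32_t idx)
{
    return idx >= DEMO_POLICE_IDX_BASE && idx <= DEMO_POLICE_IDX_MAX;
}

/* Counts how many dumped police indexes would be deleted on startup. */
size_t
demo_count_reserved(const uint32_t *idxs, size_t n)
{
    size_t count = 0;

    assert(idxs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (demo_police_idx_is_reserved(idxs[i])) {
            count++;
        }
    }
    return count;
}
```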
Jianbo Liu
5c039ddc64 netdev-linux: Add functions to manipulate tc police action
Add helpers to add, delete and get stats of police action with
the specified index.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:18:16 +02:00
Jianbo Liu
a9b8cdde69 tc: Add support parsing tc police action
Add function to parse police action from netlink message.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:16:56 +02:00
Ilya Maximets
a2d202bdee tc: Fix misaligned access to stats and time values.
Pointers to gnet_stats_basic and tcf_t are not correctly aligned,
so we need to copy the data before accessing.  Found by running
check-offloads testsuite with UBsan:

 lib/tc.c:1791:50: runtime error:
   member access within misaligned address 0x61900005ce1c for type
   'const struct gnet_stats_basic', which requires 8 byte alignment
   0x61900005ce1c: note: pointer points here
      14 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  ...
                  ^
   0 0xd69044 in nl_parse_single_action lib/tc.c:1791:50
   1 0xd69044 in nl_parse_flower_actions lib/tc.c:1841:19
   2 0xd57612 in nl_parse_flower_options lib/tc.c:1893:12
   3 0xd5468d in parse_netlink_to_tc_flower lib/tc.c:1941:12
   4 0xd5a7ec in tc_replace_flower lib/tc.c:3199:19
   5 0xd2baea in probe_tc_block_support lib/netdev-offload-tc.c:2226:13
   6 0xd2baea in netdev_tc_init_flow_api lib/netdev-offload-tc.c:2279:9
   7 0x94d726 in netdev_assign_flow_api lib/netdev-offload.c:183:14
   8 0x94d726 in netdev_init_flow_api lib/netdev-offload.c:323:9
   9 0x9515c7 in netdev_ports_flow_init lib/netdev-offload.c:775:8
   10 0x9515c7 in netdev_set_flow_api_enabled lib/netdev-offload.c:815:13
   11 0x562ec8 in bridge_run vswitchd/bridge.c:3292:9
   12 0x59a98c in main vswitchd/ovs-vswitchd.c:129:9
   13 0x7fb5c583acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
   14 0x47e60d in _start (vswitchd/ovs-vswitchd+0x47e60d)

 lib/tc.c:1306:43: runtime error:
   member access within misaligned address 0x619000140324 for type
   'const struct tcf_t', which requires 8 byte alignment
   0x619000140324: note: pointer points here
      24 00 01 00 00 00 00 00  00 00 00 00 00 00 00 00  ...
                  ^
   0 0xd6ee6f in nl_parse_tcf lib/tc.c:1306:43
   1 0xd6423f in nl_parse_act_mirred lib/tc.c:1389:5
   2 0xd6423f in nl_parse_single_action lib/tc.c:1747:15
   3 0xd6423f in nl_parse_flower_actions lib/tc.c:1843:19
   4 0xd57612 in nl_parse_flower_options lib/tc.c:1895:12
   5 0xd5468d in parse_netlink_to_tc_flower lib/tc.c:1943:12
   6 0xd5a7ec in tc_replace_flower lib/tc.c:3201:19
   7 0xd28ae8 in netdev_tc_flow_put lib/netdev-offload-tc.c:1969:11
   8 0x94cc97 in netdev_flow_put lib/netdev-offload.c:257:14
   9 0xcba2be in parse_flow_put lib/dpif-netlink.c:2289:11
   10 0xcba2be in try_send_to_netdev lib/dpif-netlink.c:2376:15
   11 0xcba2be in dpif_netlink_operate lib/dpif-netlink.c:2447:23
   12 0x8647ce in dpif_operate lib/dpif.c:1372:13
   13 0x6b50a6 in push_dp_ops ofproto/ofproto-dpif-upcall.c:2406:5
   14 0x6c9bcd in revalidate ofproto/ofproto-dpif-upcall.c:2792:13
   15 0x6b79b5 in udpif_revalidator ofproto/ofproto-dpif-upcall.c:980:9
   16 0xb4d5ea in ovsthread_wrapper lib/ovs-thread.c:422:12
   17 0x7ff1090081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
   18 0x7ff107c39dd2 in clone (/lib64/libc.so.6+0x39dd2)

This patch fixes only the problems reported by UBsan in current tests;
there could be more issues like this not currently covered by the
testsuite.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-06-24 23:50:31 +02:00
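Note on a2d202bdee: the "copy the data before accessing" approach can be sketched as below. The struct is a stand-in with the same shape of problem (a 64-bit member that wants 8-byte alignment), not the kernel's definition, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a stats structure containing a 64-bit member. */
struct demo_stats_basic {
    uint64_t bytes;
    uint32_t packets;
};

/* Reads stats from a payload that may sit at a 4-byte aligned address,
 * as netlink attribute payloads can.  Dereferencing a cast pointer would
 * be a misaligned access; memcpy() into a local copy is always safe. */
struct demo_stats_basic
demo_read_stats(const void *payload)
{
    struct demo_stats_basic copy;

    assert(payload != NULL);
    memcpy(&copy, payload, sizeof copy);
    return copy;
}
```

Compilers typically turn the `memcpy()` into plain loads where the target allows unaligned access, so the copy usually costs little.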
Eelco Chaudron
6d76cfc444 netdev-offload-tc: Fix IP and port ranges in flower returns.
When programming NAT rules OVS only sets the minimum value for a
single IP/port value. However, responses from flower will always
return min == max for single IP/port values. This is causing the
verification to fail as the request is different than the response.
To avoid this, we will update the response to match the request.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
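Note on 6d76cfc444: one way to "update the response to match the request" is to collapse a min == max range in the parsed response back to the single-value form, in which only the minimum is set. The sketch below assumes that form uses max == 0; the struct and function names are hypothetical and the actual OVS code may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port range; max == 0 means "single value, min only". */
struct demo_port_range {
    uint16_t min;
    uint16_t max;
};

/* Rewrites a response range so that a single value is expressed the
 * same way the request expressed it. */
void
demo_normalize_range(struct demo_port_range *r)
{
    assert(r != NULL);
    if (r->min == r->max) {
        r->max = 0;
    }
}

/* Field-by-field comparison used by the verification step. */
bool
demo_range_equal(const struct demo_port_range *a,
                 const struct demo_port_range *b)
{
    return a->min == b->min && a->max == b->max;
}
```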
Eelco Chaudron
38298a877b netdev-offload-tc: Fix use of ICMP values instead of masks defines.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Eelco Chaudron
a039636950 netdev-offload-tc: Always include conntrack information to tc.
Regardless of the traffic type, if requested, the conntrack information
should be included to keep the datapath and tc rules in sync.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Eelco Chaudron
b4868ee163 netdev-offload-tc: Set the correct VLAN_VID and VLAN_PCP masks.
This change will set the correct VID and PCP masks, as well as the
ethernet type mask.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Eelco Chaudron
2bdf5b288c netdev-offload-tc: Add debug logs on tc rule verify failures.
This patch adds more detailed debug logs on tc verify failures to
ease debugging the actual cause after the fact.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Chris Mi
920ec5761e tc: Keep header rewrite actions order.
Currently, tc merges all header rewrite actions into one tc pedit
action. So the header rewrite actions order is lost. Save each header
rewrite action into one tc pedit action to keep the order. And only
append one tc csum action to the last pedit action of a series.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-18 17:41:14 +01:00
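Note on 920ec5761e: the ordering rule in that message (one pedit per header rewrite, one csum after the last pedit of a series) can be sketched as below. The enums and `demo_translate()` are hypothetical, and the sketch always appends a csum after each series, which simplifies whatever conditions the real code applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_ovs_act { DEMO_OVS_REWRITE, DEMO_OVS_OTHER };
enum demo_tc_act { DEMO_TC_PEDIT, DEMO_TC_CSUM, DEMO_TC_OTHER };

/* Translates datapath actions to tc actions while keeping their order.
 * Returns the number of tc actions written to 'out'. */
size_t
demo_translate(const enum demo_ovs_act *in, size_t n_in,
               enum demo_tc_act *out, size_t out_cap)
{
    size_t n_out = 0;

    /* Worst case: every input is a lone rewrite (pedit + csum). */
    assert(out_cap >= 2 * n_in);

    for (size_t i = 0; i < n_in; i++) {
        bool last_in_series;

        if (in[i] != DEMO_OVS_REWRITE) {
            out[n_out++] = DEMO_TC_OTHER;
            continue;
        }

        /* Each rewrite becomes its own pedit, so order is preserved. */
        out[n_out++] = DEMO_TC_PEDIT;

        /* Only the last pedit of a consecutive run gets a csum. */
        last_in_series = i + 1 == n_in || in[i + 1] != DEMO_OVS_REWRITE;
        if (last_in_series) {
            out[n_out++] = DEMO_TC_CSUM;
        }
    }
    return n_out;
}
```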
Paul Blakey
f34a7626cc tc: Fix stats byte count on fragmented packets.
Fragmented packets with offset=0 are defragmented by tc act_ct and
pass to the next action only once assembled; in the OVS offload case
that is a goto action. Since stats are overwritten on each action
dump, only the stats of the last action in the tc filter action
priority list are taken, that is, the stats of the goto action, which
count only the assembled packets. See below for an example.

Hardware updates just part of the actions (gact, ct, mirred) - those
that support stats_update() operation. Since datapath rules end
with either an output (mirred) or recirc/drop (both gact), tc rule
will at least have one action that supports it. For software packets,
the first action will have the max software packets count.
Tc dumps total packets (hw + sw) and hardware packets, so
software packets need to be calculated from these (total - hw).

To fix the above, get hardware packets and calculate software packets
for each action, take the max of each set, then combine back
to get the total packets that went through software and hardware.

Example by running ping above MTU (ping <IP> -s 2000):
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=first),
  packets:14, bytes:19544,..., actions:ct(zone=1),recirc(0x1)
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=later),
  packets:14, bytes:28392,..., actions:ct(zone=1),recirc(0x1)

Second rule should have had bytes=14*<size of 'later' frag>, but instead
it's bytes=14*<size of assembled packets - size of 'first' + 'later'
frags>.

Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-04 16:02:45 +01:00
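Note on f34a7626cc: the calculation described in that message (software = total - hw per action, take the maximum of each set, then add them back together) can be sketched as below. The struct and function names are hypothetical and only packet counts are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-action counters as dumped by tc: total (hw + sw) and hw only. */
struct demo_action_stats {
    uint64_t total_packets;
    uint64_t hw_packets;
};

/* Combines per-action counters into one packet count for the rule. */
uint64_t
demo_combine_packets(const struct demo_action_stats *acts, size_t n)
{
    uint64_t max_sw = 0;
    uint64_t max_hw = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t sw;

        assert(acts[i].total_packets >= acts[i].hw_packets);
        sw = acts[i].total_packets - acts[i].hw_packets;

        if (sw > max_sw) {
            max_sw = sw;
        }
        if (acts[i].hw_packets > max_hw) {
            max_hw = acts[i].hw_packets;
        }
    }
    return max_sw + max_hw;
}
```

Taking the maximum per set means an action that saw fewer packets, such as a goto that only sees assembled packets, no longer hides the larger count seen by an earlier action.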
Roi Dayan
96ad83bc7b tc: Fix incorrect TC rule for decap+encap datapath flow.
A datapath flow generated for traffic from vxlan port to another vxlan port
looks like this:

 tunnel(tun_id=0x65,src=10.10.11.3,dst=10.10.11.2,ttl=0/0,tp_dst=4789,flags(+key)),
 ...,in_port(vxlan_sys_4789),...,
 actions:set(tunnel(tun_id=0x66,src=10.10.12.2,dst=10.10.12.3,tp_dst=4789,flags(key))),
          vxlan_sys_4789

The generated TC rule has an explicit tunnel key unset action added after
the tunnel key set action, which is wrong.

filter protocol ip pref 7 flower chain 0 handle 0x1
  dst_mac fa:16:3e:2a:4e:23
  eth_type ipv4
  ip_tos 0x0/3
  enc_dst_ip 10.10.11.2
  enc_src_ip 10.10.11.3
  enc_key_id 101
  enc_dst_port 4789
  ip_flags nofrag
  not_in_hw
        action order 1: tunnel_key  set
        src_ip 10.10.12.2
        dst_ip 10.10.12.3
        key_id 102
        dst_port 4789
        nocsum pipe
         index 1 ref 1 bind 1 installed 568 sec used 0 sec
        Action statistics:
        Sent 46620 bytes 555 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: tunnel_key  unset pipe
         index 2 ref 1 bind 1 installed 568 sec used 0 sec
        Action statistics:
        Sent 46620 bytes 555 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 3: mirred (Egress Redirect to device vxlan_sys_4789) stolen
        index 1 ref 1 bind 1 installed 568 sec used 0 sec
        Action statistics:
        Sent 46620 bytes 555 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
        cookie e0c82bfd504b701428b00db6b08db3b2

Fix it by also adding the tunnel key unset action before the tunnel
key set action and not only before the output port.

Fixes: 7c53bd7839d8 ("tc: Move tunnel_key unset action before output ports")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-02-09 15:40:19 +01:00
Vlad Buslov
6bb3f363d6 tc: Set action flags for tunnel_key release.
The commit that enabled 'no_percpu' flag for compatible actions missed the
tunnel_key release action code. Add missing call to nl_msg_put_act_flags().

Fixes: 292d5bd9bb34 ("tc: Set 'no_percpu' flag for compatible actions")
Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Tested-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-08-16 19:40:29 +02:00
Eelco Chaudron
464b5b13e6 netdev-offload-tc: Verify the flower rule installed.
When OVS installs the flower rule, it only checks for the OK from the
kernel. It does not check if the rule requested matches the one
actually programmed. This change adds this check and warns the
user if this is not the case.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-07-09 19:54:56 +02:00
Jianbo Liu
47e73f7f00 netdev-offload-tc: Flush rules on all chains before attach ingress block
Previously, only chain 0 was deleted before attaching the ingress block.
If there are rules on chains other than 0, these rules are not flushed.
In this case, the recreation of the qdisc also fails. To fix this issue, flush
rules from all chains.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2021-02-24 11:07:20 +01:00
Eelco Chaudron
0f252daa94 tc: Fix mpls bottom of stack bit mask reporting.
Fix the bos mask value that is reported back and used by the
revalidator threads.

Fixes: 34b1695506f8 ("lib/tc: add single mpls match offload support")
Reported-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-02-02 19:30:58 +01:00
Maor Dickman
75e1e6fd2d lib/tc: add ICMP type and code match offload
Add TC offload support for classifying ICMPv4/6 type and code.

$ ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(3),eth(),\
  eth_type(0x0800),ipv4(proto=1),icmp(type=9,code=0)' 2

$ ovs-appctl dpctl/dump-flows
  ... icmp(type=9,code=0) ...

$ tc filter show dev <ethx> ingress
  ...
  eth_type ipv4
  ip_proto icmp
  icmp_type 9
  icmp_code 0
  not_in_hw
  action order 1: mirred (Egress Redirect to device <ethy>) stolen
  ...

Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2021-02-01 16:54:15 +01:00
wenxu
71d0c0d8b4 lib/tc: fix parse act pedit for tos rewrite
Check overlap between current pedit key, which is always 4 bytes
(range [off, off + 3]), and a map entry in flower_pedit_map
sf = ROUND_DOWN(mf, 4) (range [sf|mf, (mf + sz - 1)|ef]).

So for the tos rewrite, off + 3 (3) is greater than mf, and it should
be less than ef (4), not mf + sz (2).

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-12-04 07:33:14 +01:00
Roi Dayan
d5659751f6 tc: Use skip_hw flag when probing tc features
There is no need to pass tc rules to hw when just probing
for tc features. This will avoid redundant errors from hw drivers
that may happen.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Acked-By: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-08-06 17:26:46 +02:00
wenxu
0fc38297ba lib/tc: only update the stats for non-empty counter
When a packet that is a first fragment executes the act_ct action,
the packet is stolen by defrag. So the stats counter for
"gact action goto chain" will always be 0. Open vSwitch updates the
stats from each action in order, so the flower stats will finally
always be zero. The rule will be deleted after the max-idle time
even if there are packets executing the action.

ovs-appctl dpctl/dump-flows
recirc_id(0),in_port(1),eth_type(0x0800),ipv4(dst=11.0.0.7,frag=first), packets:0, bytes:0, used:5.390s, actions:ct(zone=1,nat),recirc(0x4)

filter protocol ip pref 2 flower chain 0 handle 0x2
  eth_type ipv4
  dst_ip 1.1.1.1
  ip_flags frag/firstfrag
  skip_hw
  not_in_hw
 action order 1: ct zone 1 nat pipe
  index 2 ref 1 bind 1 installed 11 sec used 1 sec
 Action statistics:
 Sent 15000 bytes 11 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 cookie e04106c2ac41769b278edaa9b5309960

 action order 2: gact action goto chain 1
  random type none pass val 0
  index 2 ref 1 bind 1 installed 11 sec used 11 sec
 Action statistics:
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 cookie e04106c2ac41769b278edaa9b5309960

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-07-01 10:00:15 +02:00
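Note on 0fc38297ba: the idea of updating the rule stats only from non-empty action counters can be sketched as below. The struct and function names are hypothetical; the values in the usage note are the ones from the tc dump in that commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_counters {
    uint64_t packets;
    uint64_t bytes;
};

/* Walks the actions in order and lets an action overwrite the rule
 * stats only when its own counter is non-empty. */
struct demo_counters
demo_pick_stats(const struct demo_counters *acts, size_t n)
{
    struct demo_counters result = { 0, 0 };

    assert(acts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (acts[i].packets != 0) {
            result = acts[i];
        }
    }
    return result;
}
```

With the dump from the message (ct action: 11 packets and 15000 bytes, goto chain action: 0 and 0), the sketch keeps the ct counters instead of letting the empty goto counters zero them.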
Roi Dayan
1fe4297563 netdev-offload-tc: Revert tunnel src/dst port masks handling
The cited commit intended to add tc support for masking tunnel src/dst
IPs and ports. It's not possible to do tunnel port masking with
OpenFlow rules, and the default mask for tunnel ports is set to 0 in
tnl_wc_init(), unlike the default mask for tunnel IPs, which is a full mask.
So instead of never passing tunnel ports to tc, revert the changes
to tunnel ports to always pass the tunnel port.
In sw, classification is done by the kernel, but for hw we must match
the tunnel dst port.

Fixes: 5f568d049130 ("netdev-offload-tc: Allow to match the IP and port mask of tunnel")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-19 08:51:11 +02:00
Tonghao Zhang
a3db6e473d netdev-offload-tc: Allow installing arp rules to TC dp.
This patch allows installing arp rules to the tc dp.
In the future, arp will be offloaded to hardware to
be processed, so OvS enables this now.

$ ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(3),eth(),\
  eth_type(0x0806),arp(op=2,tha=00:50:56:e1:4b:ab,tip=10.255.1.116)' 2

$ ovs-appctl dpctl/dump-flows
  ... arp(tip=10.255.1.116,op=2,tha=00:50:56:e1:4b:ab) ...

$ tc filter show dev <ethx> ingress
  ...
  eth_type arp
  arp_tip 10.255.1.116
  arp_op reply
  arp_tha 00:50:56:e1:4b:ab
  not_in_hw
    action order 1: mirred (Egress Redirect to device <ethy>) stolen
    ...

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-08 11:08:05 +02:00
Vlad Buslov
5db012c4ac tc: Support new terse dump kernel API
When dumping flows in terse mode set TCA_DUMP_FLAGS attribute to
TCA_DUMP_FLAGS_TERSE flag to prevent unnecessary copying of data between
kernel and user spaces. Only expect kernel to provide cookie, stats and
flags when dumping filters in terse mode.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-05 10:14:27 +02:00
Tonghao Zhang
5f568d0491 netdev-offload-tc: Allow to match the IP and port mask of tunnel
This patch allows users to offload the TC flower rules with
a tunnel mask. It allows masked match of the following,
where previously only an exact match was supported:
* Remote (dst) tunnel endpoint address
* Local (src) tunnel endpoint address
* Remote (dst) tunnel endpoint UDP port

And also allows masked match of the following, where previously
no match was supported:
* Local (src) tunnel endpoint UDP port

In some cases, a mask is useful as a wildcard. During a DDoS attack,
for instance, we do not want to allow specified host IPs or
source ports to access the targeted host. For example:

$ ovs-appctl dpctl/add-flow "tunnel(dst=2.2.2.100,src=2.2.2.0/255.255.255.0,tp_dst=4789),\
  recirc_id(0),in_port(3),eth(),eth_type(0x0800),ipv4()" ""

$ tc filter show dev vxlan_sys_4789 ingress
  ...
  eth_type ipv4
  enc_dst_ip 2.2.2.100
  enc_src_ip 2.2.2.0/24
  enc_dst_port 4789
  enc_ttl 64
  in_hw in_hw_count 2
	action order 1: gact action drop
    ...

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-03 09:56:07 +02:00
Tonghao Zhang
4f4be08e47 netdev-offload-tc: Use ipv6_addr_is_set instead of is_all_zeros
Not a bug fix; this makes the code more readable.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-03 09:53:30 +02:00
Paul Blakey
2bf6ffb76a netdev-offload-tc: Add conntrack nat support
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:41 +01:00
Paul Blakey
9221c721be netdev-offload-tc: Add conntrack label and mark support
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:40 +01:00
Paul Blakey
576126a931 netdev-offload-tc: Add conntrack support
Zone and ct_state first.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:40 +01:00
Paul Blakey
b2ae40690e netdev-offload-tc: Add recirculation support via tc chains
Each recirculation id will create a tc chain, and we translate
the recirculation action to a tc goto chain action.

We check for kernel support for this by probing OvS Datapath for the
tc recirc id sharing feature. If supported, we can offload rules
that match on recirc_id, and recirculation action safely.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:40 +01:00
Paul Blakey
7c53bd7839 tc: Move tunnel_key unset action before output ports
Since OvS datapath gets packets already decapsulated from tunnel devices,
it doesn't explicitly decapsulate them. So in a recirculation setup,
the tunnel matching continues in the recirculation as the tunnel
metadata still exists on the SKB.

Tunnel key unset action unsets this metadata. Some drivers might rely
on this explicit tunnel key unset to know when to decapsulate the packet
instead of the device type. So instead of removing it completely,
we move it near the output actions.

This way, we also keep SKB metadata through recirculation, and for
non-recirculation rules, the resulting tc rules should remain the same.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:40 +01:00
Paul Blakey
acdd544c4c tc: Introduce tcf_id to specify a tc filter
Move all that is needed to identify a tc filter to a
new structure, tcf_id. This removes a lot of duplication
in accessing/creating tc filters.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:40 +01:00
Vlad Buslov
292d5bd9bb tc: Set 'no_percpu' flag for compatible actions
Recent changes in Linux kernel TC action subsystem introduced new
TCA_ACT_FLAGS_NO_PERCPU_STATS flag. The purpose of the flag is to request
action implementation to skip allocating action stats with expensive percpu
allocator and use regular built-in action stats instead. Such an approach
significantly improves rule insertion rate and reduces memory usage for
hardware-offloaded rules that don't need benefits provided by percpu
allocated stats (improved software TC fast-path performance). Set the flag
for all compatible actions.

Modify acinclude.m4 to use OVS-internal pkt_cls.h implementation when
TCA_ACT_FLAGS is not defined by kernel headers and to manually define
struct nla_bitfield32 in netlink.h (new file) when it is not defined by
kernel headers.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-11-11 14:42:52 +01:00
Chris Mi
d0fbb09f90 tc: Limit the max action number to 16
Currently, OVS supports offloading at most TCA_ACT_MAX_PRIO (32) actions.
But the net sched API has a message size limit of 4K, which is not enough
for 32 actions when the echo flag is set.

After a lot of testing, we found that 16 actions is a reasonable number.
So in this commit, we introduce a new define to limit the max actions.

Fixes: 0c70132cd288 ("tc: Make the actions order consistent")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-10-18 10:56:44 +02:00
John Hurley
a8f005cf26 ovs-tc: offload MPLS set actions to TC datapath
Recent modifications to TC allow the modifying of fields within the
outermost MPLS header of a packet. OvS datapath rules implement an MPLS
set action by supplying a new MPLS header that should overwrite the
current one.

Convert the OvS datapath MPLS set action to a TC modify action and allow
such rules to be offloaded to a TC datapath.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-08-01 18:09:42 +02:00
John Hurley
283dcf850d ovs-tc: offload MPLS push actions to TC datapath
TC can now be used to push an MPLS header onto a packet. The MPLS label is
the only information that needs to be passed here with the rest reverting
to default values if none are supplied. OvS, however, gives the entire
MPLS header to be pushed along with the MPLS protocol to use. TC can
optionally accept these values so can be made to replicate the OvS datapath
rule.

Convert OvS MPLS push datapath rules to TC format and offload to a TC
datapath.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-08-01 18:09:42 +02:00
John Hurley
55412eac69 ovs-tc: offload MPLS pop actions to TC datapath
TC now supports an action to pop the outer MPLS header from a packet. The
next protocol after the header is required alongside this. Currently, OvS
datapath rules also supply this information.

Offload OvS MPLS pop actions to TC along with the next protocol.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-08-01 18:09:42 +02:00
John Hurley
95255018a8 ovs-tc: allow offloading TC rules to egress qdiscs
Offloading rules to a TC datapath only allows the creating of ingress hook
qdiscs and the application of filters to these. However, there may be
certain situations where an egress qdisc is more applicable (e.g. when
offloading to TC rules applied to OvS internal ports).

Extend the TC API in OvS to allow the creation of egress qdiscs and to add
or interact with flower filters applied to these.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-04-09 17:34:07 +02:00
John Hurley
4aa2dc04ec ovs-tc: allow offloading of ingress mirred TC actions to datapath
The TC datapath only permits the offload of mirred actions if they are
egress. To offload TC actions that output to OvS internal ports, ingress
mirred actions are required. At the TC layer, an ingress mirred action
passes the packet back into the network stack as if it came in the action
port rather than attempting to egress the port.

Update OvS-TC offloads to support ingress mirred actions. To ensure
packets that match these rules are properly passed into the network stack,
add a TC skbedit action along with ingress mirred that sets the pkt_type
to PACKET_HOST. This mirrors the functionality of the OvS internal port
kernel module.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-04-09 17:34:03 +02:00
Pieter Jansen van Vuuren
e7f6ba220e lib/tc: add ingress ratelimiting support for tc-offload
Firstly this patch introduces the notion of reserved priority, as the
filter implementing ingress policing would require the highest priority.
Secondly it allows setting rate limiters while tc-offloads has been
enabled. Lastly it installs a matchall filter that matches all traffic
and then applies a police action, when configuring an ingress rate
limiter.

An example of what to expect:

OvS CLI:
ovs-vsctl set interface <netdev_name> ingress_policing_rate=5000
ovs-vsctl set interface <netdev_name> ingress_policing_burst=100

Resulting TC filter:
filter protocol ip pref 1 matchall chain 0
filter protocol ip pref 1 matchall chain 0 handle 0x1
  not_in_hw
	action order 1:  police 0x1 rate 5Mbit burst 125Kb mtu 64Kb
action drop/continue overhead 0b
        ref 1 bind 1 installed 3 sec used 3 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
10.0.0.200 () port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072  16384  16384    60.13       4.49

ovs-vsctl list interface <netdev_name>
_uuid               : 2ca774e8-8b95-430f-a2c2-f8f742613ab1
admin_state         : up
...
ingress_policing_burst: 100
ingress_policing_rate: 5000
...
type                : ""

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-03-04 17:22:34 +01:00
Adi Nissim
0227bf092e lib/tc: Support optional tunnel id
Currently the TC tunnel_key action is always
initialized with the given tunnel id value. However,
some tunneling protocols define the tunnel id as an optional field.

This patch initializes the id field of tunnel_key:set and tunnel_key:unset
only if a value is provided.

In the case that a tunnel key value is not provided by the user
the key flag will not be set.

Signed-off-by: Adi Nissim <adin@mellanox.com>
Acked-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-02-11 11:02:43 +01:00
Pieter Jansen van Vuuren
dbcb014d1f lib/tc: add set ipv6 traffic class action offload via pedit
Extend ovs-tc translation by allowing non-byte-aligned fields
for set actions. Use new boundary shifts and add set ipv6 traffic
class action offload via pedit.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:25 +01:00
Pieter Jansen van Vuuren
95431229b9 lib/tc: add set ipv4 dscp and ecn action offload via pedit
Add setting of ipv4 dscp and ecn fields in tc offload using pedit.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:20 +01:00
Pieter Jansen van Vuuren
0d9f0cd44b lib/tc: fix 32 bits shift for pedit offset calculation
pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words. One mask for leading bytes that should not be
updated and another mask for trailing bytes that should not be
updated. The masks are created using bit shifts.

In the case of the mask to omit trailing bytes a right bit shift
is used. Currently the code can produce shifts of 1, 2, 3 or 4
bytes (8, 16, 24 or 32 bits) based on the alignment of the end
of field being set.

However, a shift of 32 bits on a 32bit value is undefined.
As it stands the code relies on the result of UINT32_MAX >> 32
being UINT32_MAX. Or in other words a mask that results in the
pedit action setting all bytes of the word under operation.

This patch adjusts the code to use a shift of 0 for this case,
which gives the same result as the undefined behaviour that was
relied on, and appears logically correct as the desire is for no
trailing bytes (or bits!) to be omitted from the set action.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:18 +01:00
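Note on 0d9f0cd44b: the adjustment described in that message, using a shift of 0 instead of 32 when no trailing bytes are omitted, can be sketched as below. The function names and the `field_end` parameter (byte offset just past the end of the field) are hypothetical, and the sketch ignores byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits to shift so that the trailing bytes of the last 32-bit
 * word, those after 'field_end', are left out of the mask.  The outer
 * "% 4" maps the word-aligned case to 0 bytes, never to 4 bytes. */
unsigned int
demo_trailing_shift_bits(unsigned int field_end)
{
    unsigned int omitted_bytes = (4 - field_end % 4) % 4;

    return 8 * omitted_bytes;   /* 0, 8, 16 or 24. */
}

/* Mask built with the right shift mentioned in the commit message.
 * Shifting a 32-bit value by 32 would be undefined behaviour in C. */
uint32_t
demo_trailing_mask(unsigned int field_end)
{
    unsigned int shift = demo_trailing_shift_bits(field_end);

    assert(shift < 32);
    return (uint32_t) UINT32_MAX >> shift;
}
```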
Pieter Jansen van Vuuren
f8b63e5930 lib/tc: make pedit mask calculations byte order agnostic
pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words.

The struct tc_pedit_key structure, which is part of the kernel
ABI, uses host byte order fields to store the mask and value for
a pedit action, however, these fields contain values in network
byte order.

In order to allow static analysis tools to check for endianness
problems this patch adds a local version of struct tc_pedit_key
which uses big endian types and refactors the relevant code as
appropriate.

In the course of making this change it became apparent that the
calculation of masks was occurring using host byte order although
the values are in network byte order. This patch also fixes that
problem by shifting values in host byte order and then converting
them to network byte order. It is believed this fixes a bug on big
endian systems although we are not in a position to test that.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:10 +01:00
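Note on f8b63e5930: "shifting values in host byte order and then converting them to network byte order" can be sketched as below. The function names are hypothetical; `htonl()` is the standard POSIX conversion. The helper that copies the mask into a byte array is there only to show the in-memory layout, which is the same on little and big endian hosts.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Builds a network byte order mask that leaves out the first
 * 'skip_bytes' bytes of a 32-bit word. */
uint32_t
demo_leading_mask_be(unsigned int skip_bytes)
{
    uint32_t host_mask;

    assert(skip_bytes < 4);

    /* Shift while the value is still in host byte order ... */
    host_mask = (uint32_t) UINT32_MAX >> (8 * skip_bytes);

    /* ... and only then convert it to network byte order. */
    return htonl(host_mask);
}

/* Copies the mask into a byte array to expose its wire layout. */
void
demo_mask_bytes(uint32_t be_mask, unsigned char out[4])
{
    memcpy(out, &be_mask, 4);
}
```

Doing the shift on an already converted value would give different bytes on little and big endian hosts, which is the class of problem the commit message describes.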
Eli Britstein
46df7fac76 netdev-tc-offloads: Support IPv6 hlimit rewrite
Add support for IPv6 hlimit field.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-12-29 09:16:36 +09:00
Eli Britstein
e48f49e03e netdev-tc-offloads: Do not set 0 port attribute to TC tunnel
For non-UDP tunnels such as GRE there is no UDP port, i.e. it is initialized to 0.

Do not set the port attribute in such a case.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-12-29 09:14:50 +09:00
Eli Britstein
d9677a1f0e netdev-tc-offloads: TC csum option is not matched with tunnel configuration
Tunnels (gre, geneve, vxlan) support 'csum' option (true/false), default is false.
Generated encap TC rule will now be configured as the tunnel configuration.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-10-16 09:28:30 +02:00
Pieter Jansen van Vuuren
a468645c6d lib/tc: add geneve with option match offload
Add TC offload support for classifying geneve tunnels with options.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-09-20 15:58:24 +02:00
Pieter Jansen van Vuuren
202469aa9e lib/tc: add geneve with option encap action offload
Add TC offload support for encapsulating geneve tunnels with options.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-09-20 15:58:22 +02:00