mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

122 Commits

Author SHA1 Message Date
Eelco Chaudron
1be33d52af netdev-tc-offloads: Don't offload header modification on ip fragments.
While offloading header modifications to TC, OVS is using {TCA_PEDIT} +
{TCA_CSUM} combination as that is the only way to represent header
rewrite.  However, {TCA_CSUM} is unable to calculate L4 checksums for
IP fragments.

Since TC already applies fragmentation bit masking, this patch simply
needs to prevent these packets from being processed through TC.

Reported-at: https://issues.redhat.com/browse/FDP-545
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2024-12-13 16:07:08 +01:00
Eelco Chaudron
316fcc631a netdev-offload-tc: Add don't fragment support to encap action.
This patch adds "Don't Fragment" (DF) flag support to the encap action,
if supported by the kernel. If the kernel does not support this,
it falls back to the previous behavior of ignoring the DF request.

Acked-by: Roi Dayan <roid@nvidia.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2024-10-28 14:39:25 +01:00
Eelco Chaudron
3f7af5233a netdev-offload-tc: Check if TCA_FLOWER_KEY_ENC_FLAGS is supported.
This patch checks to see if the TCA_FLOWER_KEY_ENC_FLAGS key is supported.

Acked-by: Roi Dayan <roid@nvidia.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2024-10-28 14:38:51 +01:00
Ilya Maximets
169ff9ea93 tc: Fix -Wgnu-variable-sized-type-not-at-end warning with Clang 18.
Clang 18.1.3-2.fc41 throws a warning:

  lib/tc.c:3060:25: error: field 'sel' with variable sized type
            'struct tc_pedit_sel' not at the end of a struct or class is a
            GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]

   3060 |         struct tc_pedit sel;
        |                         ^

Refactor the structure into a proper union to avoid the build failure.

Interestingly, clang 18.1.3-2.fc41 on Fedora throws a warning, but
nearly the same version, 18.1.3 (1), on Ubuntu 24.04 does not.

Acked-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-04-30 17:23:30 +02:00
Ilya Maximets
a413fed99b tc: Improve logging of mismatched actions.
Currently we log the 980-ish byte long tc_action structure as a
single long hex string.  That is very hard to read, and it is hard
to spot the difference between two such dumps.  Most of the fields
are also zero.

Use the sparse hex dump instead as we do for keys already.

Ex.:

  Action 1 mismatch:
   - Expected Action:
  00000000  f0 3c 00 00 01 00 00 00-00 00 00 00 00 00 00 00
  000003d0  00 00 00 00 ff ff ff ff-
   - Received Action:
  00000000  f0 3c 00 00 01 01 00 00-00 00 00 00 00 00 00 00
  000003d0  00 00 00 00 ff ff ff ff-

Without the change, each action would be a 1900+ characters
long string of mostly zeroes.

Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-10-20 23:33:18 +02:00
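A minimal sketch of the sparse hex dump idea from a413fed99b, assuming a
hypothetical helper named sparse_hex_dump() (this is not the OVS
implementation): the buffer is printed in 16-byte rows and rows that are
entirely zero are skipped, so a 980-ish byte structure shrinks from a
1900+ character hex string to a handful of rows with offsets.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Prints 'size' bytes of 'buf' to 'out' as 16-byte rows prefixed with the
 * row offset, skipping rows that contain only zero bytes.  Returns the
 * number of rows printed.  Hypothetical helper for illustration only. */
static size_t
sparse_hex_dump(FILE *out, const uint8_t *buf, size_t size)
{
    size_t rows = 0;

    for (size_t off = 0; off < size; off += 16) {
        size_t n = size - off < 16 ? size - off : 16;
        int nonzero = 0;

        for (size_t i = 0; i < n; i++) {
            nonzero |= buf[off + i];
        }
        if (!nonzero) {
            continue;   /* All-zero row: leave it out of the dump. */
        }

        fprintf(out, "%08zx ", off);
        for (size_t i = 0; i < n; i++) {
            /* A '-' separates the two 8-byte halves of a row. */
            fprintf(out, "%c%02x", i == 8 ? '-' : ' ', buf[off + i]);
        }
        fputc('\n', out);
        rows++;
    }
    return rows;
}
```

With this sketch, two structures that differ in a single byte produce
dumps that differ in a single row, and the offset column shows where.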
Faicker Mo
c29ba54018 tc: Add IPIP/GRE protocols to offload in IP rewrite.
Currently checksum recalculation is not supported with TC offload for
IPIP and GRE packets. This patch adds support for TC offloading of
IPIP and GRE packets by adding the correct csum action.

Without this patch the following warning can be seen in the logging:
  Can't offload rewrite of IP/IPV6 with ip_proto: X.

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2023-10-13 11:24:41 +02:00
Faicker Mo
b16ef52002 tc: Add csum offload of IGMP/UDPLITE/SCTP in IP rewrite.
When the IP header is modified, for example, by NAT or a ToS/TTL change,
the IP header checksum needs recalculation. In addition to the IP header
checksum, for UDPLITE, its checksum also needs recalculation when any
of the addresses change.

This patch adds support for TC offloading of IGMP, UDPLITE, and SCTP
packets by adding the correct csum action.

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2023-10-13 11:24:22 +02:00
Adrian Moreno
07ce41da11 netdev-linux: Support 64-bit rates in tc policing.
Use TCA_POLICE_RATE64 if the rate cannot be expressed using 32bits.

This breaks the 32Gbps barrier. The new barrier is ~4Tbps caused by
netdev's API expressing kbps rates using 32-bit integers.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2137643
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-07-17 20:03:54 +02:00
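A small sketch of the arithmetic behind 07ce41da11, assuming the 32-bit
police rate field is expressed in bytes per second (the function names
are hypothetical).  Under that assumption the 32-bit limit is 2^32
bytes/s, roughly 34.4 Gbit/s, and a 32-bit kbps value tops out at about
4.29 Tbit/s.

```c
#include <stdbool.h>
#include <stdint.h>

/* Converts a rate in kilobits per second to bytes per second.  The result
 * is 64-bit because the largest 32-bit kbps value does not fit in 32 bits
 * once converted. */
static uint64_t
kbps_to_bytes_per_sec(uint32_t kbps)
{
    return (uint64_t) kbps * 1000 / 8;
}

/* Returns true if the rate cannot be expressed in a 32-bit bytes-per-second
 * field, i.e. when a 64-bit rate attribute would be required. */
static bool
rate_needs_64bit(uint32_t kbps)
{
    return kbps_to_bytes_per_sec(kbps) > UINT32_MAX;
}
```

For example, a 10 Gbit/s rate (10000000 kbps) still fits in 32 bits,
while a 40 Gbit/s rate (40000000 kbps) does not.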
Gavin Li
a2a3f1983f tc: Add vxlan encap action with gbp option offload.
Add TC offload support for vxlan encap with gbp option.

Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2023-07-03 11:56:39 +02:00
Gavin Li
256c1e5819 tc: Pass encap entirely to nl_msg_put_act_tunnel_key_set.
Most of the data members of struct tc_action{ } are defined as anonymous
structs in place. Instead of passing every member of an anonymous struct,
which does not adapt well to new members being added, expose encap as a
named struct and pass it in its entirety.

Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2023-07-03 11:56:39 +02:00
Gavin Li
a4332b5e68 tc: Add vxlan gbp option flower match offload.
Add TC offload support for filtering vxlan tunnels with gbp option.

Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2023-07-03 11:56:39 +02:00
Gavin Li
affb9b8183 tc: Pass tunnel entirely to tunnel option parse and put functions.
Tc flower tunnel key options were encoded in nl_msg_put_flower_tunnel_opts
and decoded in nl_parse_flower_tunnel_opts. Only geneve was supported.

To avoid adding more arguments to these functions when more vxlan
options are supported in the future, change them to take the whole
tunnel instead of an ever-growing list of arguments.

Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2023-07-03 11:56:39 +02:00
Frode Nordahl
106ef21860 tc: Fix crash on malformed reply from kernel.
The tc module combines the use of the `tc_transact` helper
function for communication with the in-kernel tc infrastructure
with assertions on the reply data by `ofpbuf_at_assert` on the
received data prior to further processing.

With the presence of bugs on the kernel side, we need to treat
the kernel as an unreliable service provider and replace assertions
on the reply from it with checks to avoid a fatal crash of OVS.

For the record, the symptom of the crash is this in the log:
  EMER|include/openvswitch/ofpbuf.h:194:
      assertion offset + size <= b->size failed in ofpbuf_at_assert()

And an excerpt of the backtrace looks like this:
  ofpbuf_at_assert (offset=16, size=20) at include/openvswitch/ofpbuf.h:194
  tc_replace_flower  at lib/tc.c:3223
  netdev_tc_flow_put at lib/netdev-offload-tc.c:2096
  netdev_flow_put    at lib/netdev-offload.c:257
  parse_flow_put     at lib/dpif-netlink.c:2297
  try_send_to_netdev at lib/dpif-netlink.c:2384

Reported-At: https://launchpad.net/bugs/2018500
Fixes: 5c039ddc64ff ("netdev-linux: Add functions to manipulate tc police action")
Fixes: e7f6ba220e10 ("lib/tc: add ingress ratelimiting support for tc-offload")
Fixes: f98e418fbdb6 ("tc: Add tc flower functions")
Fixes: c1c9c9c4b636 ("Implement QoS framework.")
Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-06-07 22:46:45 +02:00
Roi Dayan
77d8228985 tc: Fix cleaning chains.
Sometimes there is a need to clean empty chains, as done in
delete_chains_from_netdev().  The cited commit doesn't remove
the chain completely, which causes adding an ingress_block later to fail.
This can be reproduced by adding a bond as an OVS port, which makes OVS
use ingress_block for it.
While at it, add the name of the failing netdev to the log.

Fixes: e1e5eac5b016 ("tc: Add TCA_KIND flower to delete and get operation to avoid rtnl_lock().")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-04-28 19:01:14 +02:00
Eelco Chaudron
a08a589ce4 netdev-offload-tc: If the flow has not been used, report it as such.
If a tc flow was installed but has not yet been used, report it as such.

In addition, add a delay to the "IGMP - flood under normal action" test
case to make it work with many repetitions. This delay is also present
in other ICMP/IGMP tests.

Fixes: f98e418fbdb6 ("tc: Add tc flower functions")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-02-09 00:32:55 +01:00
Eelco Chaudron
e1e5eac5b0 tc: Add TCA_KIND flower to delete and get operation to avoid rtnl_lock().
A long long time ago, an effort was made to make tc flower
rtnl_lock() free. However, on the OVS part we forgot to add
the TCA_KIND "flower" attribute, which tells the kernel to skip
the lock. This patch corrects this by adding the attribute for
the delete and get operations.

The kernel code calls tcf_proto_is_unlocked() to determine whether the
rtnl_lock() is needed for the specific tc protocol. It does this
in tc_new_tfilter(), tc_del_tfilter(), and tc_get_tfilter().

If the name is not set, tcf_proto_is_unlocked() will always return
false. If set, the specific protocol is queried for unlocked support.

Fixes: f98e418fbdb6 ("tc: Add tc flower functions")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-01-30 21:12:31 +01:00
Mike Pattrick
006e1c6dbf tc: Add support for TCA_STATS_PKT64.
Currently tc offload flow packet counters will roll over every ~4
billion packets. This is because the packet counter in struct
tc_stats provided by TCA_STATS_BASIC is a 32bit integer.

Now we check for the optional TCA_STATS_PKT64 attribute which provides
the full 64bit packet counter if the 32bit one has rolled over. Because
the TCA_STATS_PKT64 attribute may appear multiple times in a netlink
message, the method of parsing attributes was changed.

Fixes: f98e418fbdb6 ("tc: Add tc flower functions")
Reported-at: https://bugzilla.redhat.com/1776816
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-01-06 16:09:45 +01:00
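A minimal sketch of the counter selection described in 006e1c6dbf.  The
struct and function names are hypothetical; it only assumes that a
32-bit counter is always available and that a 64-bit counter is
optional.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the packet counters parsed from one stats reply. */
struct pkt_counters {
    uint32_t packets32;    /* Always present; wraps around at 2^32. */
    uint64_t packets64;    /* Valid only if 'has_packets64' is true. */
    bool has_packets64;    /* Set when the optional 64-bit value was seen. */
};

/* Prefers the 64-bit counter when it was provided and falls back to the
 * 32-bit one otherwise. */
static uint64_t
packet_count(const struct pkt_counters *c)
{
    return c->has_packets64 ? c->packets64 : c->packets32;
}
```

After 2^32 + 5 packets, the 32-bit counter alone reads 5, while the
64-bit counter keeps the real value.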
Roi Dayan
6ccf8efffc tc: Fix coverity dereference null return value
CID 550702 (#1 of 1): Dereference null return value (NULL_RETURNS)
7. dereference: Dereferencing a pointer that might be NULL ex_type when calling nl_attr_get_u16.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-11-07 05:41:40 -05:00
Roi Dayan
d77f93f363 tc: Pass tun_metadata by reference
Fix coverity big parameter passed by value

CID 549858 (#1 of 1): Big parameter passed by value (PASS_BY_VALUE)
pass_by_value: Passing parameter metadata of type struct tun_metadata (size 272 bytes) by value,
which exceeds the medium threshold of 256 bytes

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-11-07 05:40:08 -05:00
Ilya Maximets
fec5424aed tc: Fix misaligned writes while parsing pedit.
Offsets within the 'rewrite' action are not 4-byte aligned, so they have
to be accessed carefully.

 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/tc.c:1132:17 in

 lib/tc.c:1132:17: runtime error: store to misaligned address 0x7fba215b2025
   for type 'ovs_be32' (aka 'unsigned int'), which requires 4 byte alignment

    0 0xd78857 in nl_parse_act_pedit lib/tc.c:1132:24
    1 0xd68103 in nl_parse_single_action lib/tc.c:1936:15
    2 0xd624ee in nl_parse_flower_actions lib/tc.c:2024:19
    3 0xd624ee in nl_parse_flower_options lib/tc.c:2139:12
    4 0xd5f082 in parse_netlink_to_tc_flower lib/tc.c:2187:12
    5 0xd6a2a1 in tc_replace_flower lib/tc.c:3776:19
    6 0xd2ae8f in netdev_tc_flow_put lib/netdev-offload-tc.c:2350:11
    7 0x951d07 in netdev_flow_put lib/netdev-offload.c:318:14
    8 0xcbb81a in parse_flow_put lib/dpif-netlink.c:2297:11
    9 0xcbb81a in try_send_to_netdev lib/dpif-netlink.c:2384:15
    10 0xcbb81a in dpif_netlink_operate lib/dpif-netlink.c:2455:23
    11 0x8678ae in dpif_operate lib/dpif.c:1372:13
    12 0x6bcc89 in handle_upcalls ofproto/ofproto-dpif-upcall.c:1674:5
    13 0x6bcc89 in recv_upcalls ofproto/ofproto-dpif-upcall.c:905:9
    14 0x6b7f9a in udpif_upcall_handler ofproto/ofproto-dpif-upcall.c:801:13
    15 0xb54c5a in ovsthread_wrapper lib/ovs-thread.c:422:12
    16 0x7fba2f2081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
    17 0x7fba2de39dd2 in clone (/lib64/libc.so.6+0x39dd2)

Fixes: 8ada482bbe19 ("tc: Add header rewrite using tc pedit action")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-11-02 19:47:21 +01:00
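A minimal sketch of the kind of access that avoids the misaligned store
reported for fec5424aed.  The helper names below are hypothetical and
this is not the code from the commit; it only shows the general
technique of going through memcpy() instead of dereferencing a 32-bit
pointer at an arbitrary byte offset.

```c
#include <stdint.h>
#include <string.h>

/* Stores a 32-bit value at 'p', which may have any alignment. */
static void
put_unaligned_u32(void *p, uint32_t value)
{
    memcpy(p, &value, sizeof value);
}

/* Loads a 32-bit value from 'p', which may have any alignment. */
static uint32_t
get_unaligned_u32(const void *p)
{
    uint32_t value;

    memcpy(&value, p, sizeof value);
    return value;
}
```

Compilers typically turn such fixed-size memcpy() calls into plain loads
and stores on architectures that allow unaligned access, so the safe
form usually costs nothing there.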
Baowen Zheng
ffcb6f115f netdev-linux: Allow meter to work in tc software datapath when tc-policy is specified
Add tc action flags when adding police action to offload meter table.

There is a restriction that the skip_sw/skip_hw flag must be the same for
a filter rule and for the independently created tc actions the rule uses.
So if tc-policy is configured as skip_hw, the filter rule is created with
the skip_hw flag while the police action for the meter table has no
action flag, and the flower rule then fails to be added to the kernel tc
subsystem.

To fix this issue, add the tc action flag when adding a police action to
offload a meter table, so that the meter table also works in the tc
software datapath.

Fixes: 5c039ddc64ff ("netdev-linux: Add functions to manipulate tc police action")
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-11-01 10:18:16 +01:00
Tianyu Yuan
743499607b Revert "tc: Fix stats dump when using same meter table"
This reverts commit dd9881ed55e6 ('tc: Fix stats dump when
using same meter table')

This patch doesn't solve the tc flow stats update issue and
causes the system-offloads-traffic testsuite to fail: it only
counts packets that survive the tc filter, rather than packets
that hit the filter.

A follow-up patch will solve this flow stats update issue.

Signed-off-by: Tianyu Yuan <tianyu.yuan@corigine.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-10-31 13:55:06 +01:00
Roi Dayan
7a5ee32518 tc: On last action use drop action attribute instead of pipe
OVN sets a ct drop rule with a ct clear action.
In the OVS datapath, the default is to drop if there is no
forward action, while TC continues with the next match.
Make tc match the OVS behavior by setting the last action
attribute to drop instead of pipe.
Also update lastused when parsing the ct action.

example rule
recirc_id(0x1),in_port(2),ct_state(+trk),eth(),eth_type(0x0800),ipv4(frag=no),
packets:82, bytes:8036, used:2.108s, actions:ct_clear

Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-10-31 10:34:37 +01:00
Tianyu Yuan
dd9881ed55 tc: Fix stats dump when using same meter table
When we apply meter police on both directions of TCP traffic, the
dumped stats are shown as identical (see below). This issue was
introduced by modifying the stats update strategy.

...,in_port(6),eth(),eth_type(0x0800),ipv4(frag=no), packets:1488557,
bytes:2089059644, used:0.040s, actions:meter(0),9
...,in_port(9),eth(),eth_type(0x0800),ipv4(frag=no), packets:1488557,
bytes:2089059644, used:0.040s, actions:meter(0),6

In the previous patch, after parsing a police action, the flower stats
are updated with the dumped meter table stats, which results in the
issue above.

Thus, the stats of the meter table should not be used when dumping flow
stats. Ignore the stats update when police.index belongs to a meter.

Fixes: a9b8cdde69de ("tc: Add support parsing tc police action")
Signed-off-by: Tianyu Yuan <tianyu.yuan@corigine.com>
Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
2022-09-12 18:54:33 +02:00
Ilya Maximets
ff55e8f385 netdev-offload-tc: Add missing handling of the tunnel source port.
netdev_tc_flow_put() "consumes" the tunnel.tp_src value, but
it's never passed down to TC, and not parsed back.  Fix that.

Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-08-19 19:40:00 +02:00
Ilya Maximets
5d91bdf12e netdev-offload-tc: Explicitly handle mask for the tunnel destination port.
netdev_tc_flow_put() ignores the tunnel.tp_dst mask.  That results
in the exact match on the value.  TC supports the masked match on
this field and it does return the mask back during the flow dump
even if it wasn't provided initially.  OVS should correctly handle
that.  There is a problem though.  Some drivers (mlx5) don't
support offloading if the destination port is not an exact match [1].

Keeping the logic as-is for now, but making it explicit and somewhat
documented in the comment, so it is clear what is happening and we can
revisit this in the future.

[1] https://patchwork.ozlabs.org/project/openvswitch/patch/20220704224505.1117988-3-i.maximets@ovn.org/#2927396

Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-08-19 19:39:51 +02:00
Ilya Maximets
7a6c8074c5 netdev-offload-tc: Fix the mask for tunnel metadata length.
'wc.masks.tunnel.metadata.present.len' must be a mask for the same
field in the flow key, but in the tc_flower structure it's the real
length of metadata masks.

That is correctly handled for the individual opt->length, setting all
the masks to 0x1f like it's done in the tun_metadata_to_geneve_mask__(),
but not handled for the main 'len' field.

Fix that by setting the mask to 0xff like it's done during the flow
translation in xlate_actions() and during the flow dump in the
tun_metadata_from_geneve_nlattr().

Also, flower always has an exact match on the present.len field
regardless of its value and regardless of this field being masked
by OVS flow translation layer while installing the flow.  Hence,
all tunnel flows dumped from TC should have an exact match on
present.len and also UDPIF flag, because present.len doesn't make
sense without that flag.  Without the change, zero-length options
match is incorrectly reported as a wildcard match.  The side effect
though is that zero-length match on geneve options is reported even
for other tunnel types, e.g. vxlan.  But that should be fairly
harmless.  To avoid reporting a match on empty geneve options for
vxlan/etc. tunnels we'll need to check the tunnel port type, as there
is not enough information in the TUNNEL attribute itself.

Extra checks and comments added around the code to better explain
what is going on.

Fixes: a468645c6d33 ("lib/tc: add geneve with option match offload")
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-08-19 19:39:45 +02:00
Ilya Maximets
398623a63e tc: Use sparse hex dump while printing inconsistencies.
Instead of a very long hex string something like this will be printed:

 |DBG|tc flower compare failed mask compare:
 Expected Mask:
 00000000  ff ff 00 00 ff ff ff ff-ff ff ff ff ff ff ff ff
 00000020  00 00 00 00 00 00 00 00-00 00 00 00 00 00 03 00
 00000090  00 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff
 000000c0  ff 00 00 00 ff ff 00 00-ff ff ff ff ff ff ff ff

 Received Mask:
 00000000  ff ff 00 00 ff ff ff ff-ff ff ff ff ff ff ff ff
 00000020  00 00 00 00 00 00 00 00-00 00 00 00 00 00 03 00
 00000090  00 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff
 000000c0  ff 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff

It's easier to spot the difference this way and count which bytes are
to blame, since offsets are printed as well.

A sparse dump is used to avoid printing a huge number of all-zero lines.

Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-08-04 14:18:17 +02:00
Ilya Maximets
eacc544c4d tc: Fix misaligned access while creating pedit actions.
calc_offsets() function returns 'data' and 'mask' pointers, which
are pointers somewhere inside struct tc_flower_key, and they are not
aligned, causing misaligned memory access.  For example:

  ipv6.rewrite_hlimit is at 148 byte offset inside the struct
  tc_flower_key.  While the actual field is in the 7th byte of
  the IPv6 header in the actual packet.  So, pedit will need
  to write the last byte of the [4-7] range to the actual packet.
  So, data pointer is positioned to 145th byte inside the tc_flower_key
  with the 000000FF mask.  Obviously, 145th byte inside the structure
  is not 4-byte aligned.

 lib/tc.c:2879:34: runtime error:
   load of misaligned address 0x7f2802eaa321 for type 'ovs_be32' (aka
   'unsigned int'), which requires 4 byte alignment
   0x7f2802eaa321: note: pointer points here
     00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 ...
                  ^
     0 0xd7f2fb in nl_msg_put_flower_rewrite_pedits lib/tc.c:2879:34
     1 0xd7f2fb in nl_msg_put_flower_acts lib/tc.c:3141:25
     2 0xd6ae5a in nl_msg_put_flower_options lib/tc.c:3445:12
     3 0xd6a2be in tc_replace_flower lib/tc.c:3712:17
     4 0xd2bf25 in netdev_tc_flow_put lib/netdev-offload-tc.c:2224:11
     5 0x94f6b7 in netdev_flow_put lib/netdev-offload.c:316:14
     6 0xcbd19e in parse_flow_put lib/dpif-netlink.c:2289:11
     7 0xcbd19e in try_send_to_netdev lib/dpif-netlink.c:2376:15
     8 0xcbd19e in dpif_netlink_operate lib/dpif-netlink.c:2447:23
     9 0x86536e in dpif_operate lib/dpif.c:1372:13
    10 0x6bc289 in handle_upcalls ofproto/ofproto-dpif-upcall.c:1654:5
    11 0x6bc289 in recv_upcalls ofproto/ofproto-dpif-upcall.c:892:9
    12 0x6b766a in udpif_upcall_handler ofproto/ofproto-dpif-upcall.c:792:13
    13 0xb5015a in ovsthread_wrapper lib/ovs-thread.c:422:12
    14 0x7f280b2081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
    15 0x7f2809e39dd2 in clone (/lib64/libc.so.6+0x39dd2)

Fix misaligned read by using appropriate functions.

Fixes: 8ada482bbe19 ("tc: Add header rewrite using tc pedit action")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-14 14:58:10 +02:00
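A simplified sketch of the offset arithmetic in the eacc544c4d example,
assuming pedit operates on 4-byte words and that the field fits in one
word.  The struct and function names are hypothetical.  For the IPv6 hop
limit at header offset 7, the word starts at offset 4 and only its last
byte is masked in, which is why the matching source pointer ends up 3
bytes before the field (148 - 3 = 145) and is therefore not 4-byte
aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the 4-byte word that covers a header field. */
struct pedit_window {
    size_t word_offset;   /* 4-byte aligned offset within the header. */
    uint8_t mask[4];      /* 0xff for each byte that the field occupies. */
};

/* Computes the aligned word covering a field of 'field_size' bytes at
 * 'field_offset'.  Bytes of the field that would fall outside the word
 * are ignored in this sketch. */
static struct pedit_window
pedit_window_for_field(size_t field_offset, size_t field_size)
{
    struct pedit_window w;

    w.word_offset = field_offset & ~(size_t) 3;
    memset(w.mask, 0, sizeof w.mask);
    for (size_t i = 0; i < field_size; i++) {
        size_t pos = field_offset - w.word_offset + i;

        if (pos < sizeof w.mask) {
            w.mask[pos] = 0xff;
        }
    }
    return w;
}
```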
Ilya Maximets
e5162ac42a tc: Fix misaligned access to struct tcf_t for police action.
lib/tc.c:1334:44: runtime error: member access within misaligned address
  0x6210001f5a2c for type 'const struct tcf_t', which requires 8 byte alignment
  0x6210001f5a2c: note: pointer points here
      24 00 06 00 c3 00 00 00  00 00 00 00 c3 00 00 00 ...
                  ^
     0 0xd7c7ea in nl_parse_tcf lib/tc.c:1334:44
     1 0xd7bd3a in nl_parse_act_police lib/tc.c:1433:9
     2 0xd68b1a in nl_parse_single_action lib/tc.c:1922:9
     3 0xd62c7e in nl_parse_flower_actions lib/tc.c:1992:19
     4 0xd62c7e in nl_parse_flower_options lib/tc.c:2107:12
     5 0xd5fa32 in parse_netlink_to_tc_flower lib/tc.c:2155:12
     6 0xd21760 in netdev_tc_flow_dump_next lib/netdev-offload-tc.c:1158:13
     7 0x94f442 in netdev_flow_dump_next lib/netdev-offload.c:301:14
     8 0xcba2f6 in dpif_netlink_flow_dump_next lib/dpif-netlink.c:1901:20
     9 0x8665b6 in dpif_flow_dump_next lib/dpif.c:1135:9
    10 0xee5f0f in dpctl_dump_flows lib/dpctl.c:1106:12
    11 0xee27a3 in dpctl_unixctl_handler lib/dpctl.c:3035:17
    12 0xc7f78b in process_command lib/unixctl.c:310:13
    13 0xc7f78b in run_connection lib/unixctl.c:344:17
    14 0xc7f78b in unixctl_server_run lib/unixctl.c:395:21
    15 0x59acb4 in main vswitchd/ovs-vswitchd.c:130:9
    16 0x7f1be043acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
    17 0x47e91d in _start (vswitchd/ovs-vswitchd+0x47e91d)

Fixes: a9b8cdde69de ("tc: Add support parsing tc police action")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-14 14:58:10 +02:00
Eelco Chaudron
e4daf88a43 netdev-offload-tc: Handle check_pkt_len datapath action.
This change implements support for the check_pkt_len
action using the TC police action, which supports MTU
checking.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-07-13 17:39:22 +02:00
Jianbo Liu
218304df18 netdev-offload-tc: Offloading rules with police actions
When offloading a rule, tc should be given the police index instead
of the meter id. Since a meter is mapped to a police action and the
mapping is inserted into the meter_id_to_police_idx hmap, this hmap is
used to find the police index. In addition, the reverse mapping from
police index to meter id is added, so the meter id can be retrieved
from it and passed to dpif while dumping rules.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:20:50 +02:00
Jianbo Liu
f6ed09ef55 netdev-offload-tc: Cleanup police actions with reserved indexes on startup
As police actions with indexes 0x10000000-0x1fffffff are reserved
for meter offload, these reserved police actions should be deleted
on startup to provide a clean environment for OVS. So dump all the
police actions and delete those with indexes in this range.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:20:37 +02:00
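A minimal sketch of the range check implied by f6ed09ef55.  The range is
the one quoted in the commit message; the macro and function names are
hypothetical, and the actual dump and delete requests are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Range quoted in the commit message; the macro names are hypothetical. */
#define METER_POLICE_IDX_BASE 0x10000000u
#define METER_POLICE_IDX_MAX  0x1fffffffu

/* Returns true if 'index' lies in the range reserved for meter offload. */
static bool
is_meter_police_index(uint32_t index)
{
    return index >= METER_POLICE_IDX_BASE && index <= METER_POLICE_IDX_MAX;
}

/* Copies the reserved indexes found in 'dumped' into 'to_delete' and
 * returns how many were copied.  'to_delete' must have room for 'n'
 * entries. */
static size_t
select_stale_police_indexes(const uint32_t *dumped, size_t n,
                            uint32_t *to_delete)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (is_meter_police_index(dumped[i])) {
            to_delete[count++] = dumped[i];
        }
    }
    return count;
}
```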
Jianbo Liu
5c039ddc64 netdev-linux: Add functions to manipulate tc police action
Add helpers to add, delete and get stats of police action with
the specified index.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:18:16 +02:00
Jianbo Liu
a9b8cdde69 tc: Add support parsing tc police action
Add function to parse police action from netlink message.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
2022-07-11 11:16:56 +02:00
Ilya Maximets
a2d202bdee tc: Fix misaligned access to stats and time values.
Pointers to gnet_stats_basic and tcf_t are not correctly aligned,
so we need to copy the data before accessing.  Found by running
check-offloads testsuite with UBsan:

 lib/tc.c:1791:50: runtime error:
   member access within misaligned address 0x61900005ce1c for type
   'const struct gnet_stats_basic', which requires 8 byte alignment
   0x61900005ce1c: note: pointer points here
      14 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  ...
                  ^
   0 0xd69044 in nl_parse_single_action lib/tc.c:1791:50
   1 0xd69044 in nl_parse_flower_actions lib/tc.c:1841:19
   2 0xd57612 in nl_parse_flower_options lib/tc.c:1893:12
   3 0xd5468d in parse_netlink_to_tc_flower lib/tc.c:1941:12
   4 0xd5a7ec in tc_replace_flower lib/tc.c:3199:19
   5 0xd2baea in probe_tc_block_support lib/netdev-offload-tc.c:2226:13
   6 0xd2baea in netdev_tc_init_flow_api lib/netdev-offload-tc.c:2279:9
   7 0x94d726 in netdev_assign_flow_api lib/netdev-offload.c:183:14
   8 0x94d726 in netdev_init_flow_api lib/netdev-offload.c:323:9
   9 0x9515c7 in netdev_ports_flow_init lib/netdev-offload.c:775:8
   10 0x9515c7 in netdev_set_flow_api_enabled lib/netdev-offload.c:815:13
   11 0x562ec8 in bridge_run vswitchd/bridge.c:3292:9
   12 0x59a98c in main vswitchd/ovs-vswitchd.c:129:9
   13 0x7fb5c583acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
   14 0x47e60d in _start (vswitchd/ovs-vswitchd+0x47e60d)

 lib/tc.c:1306:43: runtime error:
   member access within misaligned address 0x619000140324 for type
   'const struct tcf_t', which requires 8 byte alignment
   0x619000140324: note: pointer points here
      24 00 01 00 00 00 00 00  00 00 00 00 00 00 00 00  ...
                  ^
   0 0xd6ee6f in nl_parse_tcf lib/tc.c:1306:43
   1 0xd6423f in nl_parse_act_mirred lib/tc.c:1389:5
   2 0xd6423f in nl_parse_single_action lib/tc.c:1747:15
   3 0xd6423f in nl_parse_flower_actions lib/tc.c:1843:19
   4 0xd57612 in nl_parse_flower_options lib/tc.c:1895:12
   5 0xd5468d in parse_netlink_to_tc_flower lib/tc.c:1943:12
   6 0xd5a7ec in tc_replace_flower lib/tc.c:3201:19
   7 0xd28ae8 in netdev_tc_flow_put lib/netdev-offload-tc.c:1969:11
   8 0x94cc97 in netdev_flow_put lib/netdev-offload.c:257:14
   9 0xcba2be in parse_flow_put lib/dpif-netlink.c:2289:11
   10 0xcba2be in try_send_to_netdev lib/dpif-netlink.c:2376:15
   11 0xcba2be in dpif_netlink_operate lib/dpif-netlink.c:2447:23
   12 0x8647ce in dpif_operate lib/dpif.c:1372:13
   13 0x6b50a6 in push_dp_ops ofproto/ofproto-dpif-upcall.c:2406:5
   14 0x6c9bcd in revalidate ofproto/ofproto-dpif-upcall.c:2792:13
   15 0x6b79b5 in udpif_revalidator ofproto/ofproto-dpif-upcall.c:980:9
   16 0xb4d5ea in ovsthread_wrapper lib/ovs-thread.c:422:12
   17 0x7ff1090081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
   18 0x7ff107c39dd2 in clone (/lib64/libc.so.6+0x39dd2)

This patch fixes only problems reported by UBsan in current tests,
there could be more issues like this not currently covered by the
testsuite.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-06-24 23:50:31 +02:00
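A minimal sketch of the "copy before accessing" approach described in
a2d202bdee.  Netlink attribute payloads follow a 4-byte attribute
header, so a struct with 64-bit members is only guaranteed 4-byte
alignment there.  The struct below is a hypothetical stand-in, not the
kernel's definition, and the function name is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stats struct with a 64-bit member, which
 * therefore normally requires 8-byte alignment. */
struct basic_stats {
    uint64_t bytes;
    uint32_t packets;
};

/* Returns a properly aligned copy of the struct stored at 'payload',
 * which may itself be misaligned.  Members are only ever read from the
 * copy, never through a cast of 'payload'. */
static struct basic_stats
read_basic_stats(const void *payload)
{
    struct basic_stats copy;

    memcpy(&copy, payload, sizeof copy);
    return copy;
}
```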
Eelco Chaudron
6d76cfc444 netdev-offload-tc: Fix IP and port ranges in flower returns.
When programming NAT rules OVS only sets the minimum value for a
single IP/port value. However, responses from flower will always
return min == max for single IP/port values. This is causing the
verification to fail as the request is different than the response.
To avoid this, we will update the response to match the request.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
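A minimal sketch of the normalization described in 6d76cfc444, assuming
that "only sets the minimum value" means the maximum is left at zero in
the request.  The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical port range as used in a NAT request or reply. */
struct port_range {
    uint16_t min;
    uint16_t max;   /* Assumed to be 0 when a single value was requested. */
};

/* If the request asked for a single value and the reply reports it as
 * min == max, rewrite the reply so that it compares equal to the
 * request.  Real ranges are left untouched. */
static void
normalize_reply_range(const struct port_range *request,
                      struct port_range *reply)
{
    if (!request->max && reply->min == reply->max) {
        reply->max = 0;
    }
}
```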
Eelco Chaudron
38298a877b netdev-offload-tc: Fix use of ICMP values instead of masks defines.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Eelco Chaudron
a039636950 netdev-offload-tc: Always include conntrack information to tc.
Regardless of the traffic type, if requested, the conntrack information
should be included to keep the datapath and tc rules in sync.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Eelco Chaudron
b4868ee163 netdev-offload-tc: Set the correct VLAN_VID and VLAN_PCP masks.
This change will set the correct VID and PCP masks, as well as the
ethernet type mask.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Eelco Chaudron
2bdf5b288c netdev-offload-tc: Add debug logs on tc rule verify failures.
This patch adds more detailed debug logs on tc verify failures to
ease debugging the actual cause after the fact.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:31:24 +01:00
Chris Mi
920ec5761e tc: Keep header rewrite actions order.
Currently, tc merges all header rewrite actions into one tc pedit
action, so the order of the header rewrite actions is lost. Save each
header rewrite action into its own tc pedit action to keep the order,
and append only one tc csum action, after the last pedit action of a
series.
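
The ordering rule can be sketched as follows. The enums and the function are
hypothetical simplifications for illustration, not the types used in lib/tc.c;
the caller is assumed to provide an output array with room for 2 * n entries:

```c
#include <stddef.h>

/* Hypothetical, simplified input and output action kinds. */
enum act_type { ACT_REWRITE, ACT_OUTPUT };
enum tc_act { TC_PEDIT, TC_CSUM, TC_MIRRED };

/* Emits one pedit per header rewrite, preserving the input order, and one
 * csum only after the last pedit of each consecutive series of rewrites.
 * 'out' must have room for 2 * n entries.  Returns the number of tc
 * actions written. */
static size_t
emit_tc_actions(const enum act_type *in, size_t n, enum tc_act *out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] == ACT_REWRITE) {
            out[k++] = TC_PEDIT;
            /* The series ends at the last input or at a non-rewrite. */
            if (i + 1 == n || in[i + 1] != ACT_REWRITE) {
                out[k++] = TC_CSUM;
            }
        } else {
            out[k++] = TC_MIRRED;
        }
    }
    return k;
}
```

For the input rewrite, rewrite, output, rewrite, output this yields pedit,
pedit, csum, mirred, pedit, csum, mirred.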

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-18 17:41:14 +01:00
Paul Blakey
f34a7626cc tc: Fix stats byte count on fragmented packets.
Fragmented packets with offset=0 are defragmented by tc act_ct and
pass to the next action only once assembled; in the OVS offload case,
that is a goto action. Since stats are overwritten on each action dump,
only the stats of the last action in the tc filter action priority
list are taken, i.e. the stats on the goto action, which count
only the assembled packets. See below for an example.

Hardware updates just part of the actions (gact, ct, mirred) - those
that support stats_update() operation. Since datapath rules end
with either an output (mirred) or recirc/drop (both gact), tc rule
will at least have one action that supports it. For software packets,
the first action will have the max software packets count.
Tc dumps total packets (hw + sw) and hardware packets, so the
software packets need to be calculated from these (total - hw).

To fix the above, get hardware packets and calculate software packets
for each action, take the max of each set, then combine back
to get the total packets that went through software and hardware.
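
A minimal sketch of this calculation, assuming a hypothetical per-action
counter structure (the names are illustrative and not taken from lib/tc.c):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-action counters as dumped by tc. */
struct act_stats {
    uint64_t total_pkts;   /* Hardware + software packets. */
    uint64_t hw_pkts;      /* Hardware packets only. */
};

/* Derives the software count of each action as (total - hw), takes the
 * maximum software and the maximum hardware count over all actions, and
 * combines the two maxima into the total for the rule. */
static uint64_t
combine_packets(const struct act_stats *acts, size_t n)
{
    uint64_t max_sw = 0;
    uint64_t max_hw = 0;

    for (size_t i = 0; i < n; i++) {
        /* Guard against underflow on inconsistent counters. */
        uint64_t sw = acts[i].total_pkts >= acts[i].hw_pkts
                      ? acts[i].total_pkts - acts[i].hw_pkts : 0;

        if (sw > max_sw) {
            max_sw = sw;
        }
        if (acts[i].hw_pkts > max_hw) {
            max_hw = acts[i].hw_pkts;
        }
    }
    return max_sw + max_hw;
}
```

The same approach would apply to byte counters.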

Example by running ping above MTU (ping <IP> -s 2000):
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=first),
  packets:14, bytes:19544,..., actions:ct(zone=1),recirc(0x1)
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=later),
  packets:14, bytes:28392,..., actions:ct(zone=1),recirc(0x1)

Second rule should have had bytes=14*<size of 'later' frag>, but instead
it's bytes=14*<size of assembled packets - size of 'first' + 'later'
frags>.

Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-04 16:02:45 +01:00
Roi Dayan
96ad83bc7b tc: Fix incorrect TC rule for decap+encap datapath flow.
A datapath flow generated for traffic from vxlan port to another vxlan port
looks like this:

 tunnel(tun_id=0x65,src=10.10.11.3,dst=10.10.11.2,ttl=0/0,tp_dst=4789,flags(+key)),
 ...,in_port(vxlan_sys_4789),...,
 actions:set(tunnel(tun_id=0x66,src=10.10.12.2,dst=10.10.12.3,tp_dst=4789,flags(key))),
          vxlan_sys_4789

The generated TC rule has an explicit tunnel key unset action added after
the tunnel key set action, which is wrong.

filter protocol ip pref 7 flower chain 0 handle 0x1
  dst_mac fa:16:3e:2a:4e:23
  eth_type ipv4
  ip_tos 0x0/3
  enc_dst_ip 10.10.11.2
  enc_src_ip 10.10.11.3
  enc_key_id 101
  enc_dst_port 4789
  ip_flags nofrag
  not_in_hw
        action order 1: tunnel_key  set
        src_ip 10.10.12.2
        dst_ip 10.10.12.3
        key_id 102
        dst_port 4789
        nocsum pipe
         index 1 ref 1 bind 1 installed 568 sec used 0 sec
        Action statistics:
        Sent 46620 bytes 555 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: tunnel_key  unset pipe
         index 2 ref 1 bind 1 installed 568 sec used 0 sec
        Action statistics:
        Sent 46620 bytes 555 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 3: mirred (Egress Redirect to device vxlan_sys_4789) stolen
        index 1 ref 1 bind 1 installed 568 sec used 0 sec
        Action statistics:
        Sent 46620 bytes 555 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
        cookie e0c82bfd504b701428b00db6b08db3b2

Fix it by also adding the tunnel key unset action before the tunnel
key set action and not only before the output port.

Fixes: 7c53bd7839d8 ("tc: Move tunnel_key unset action before output ports")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-02-09 15:40:19 +01:00
Vlad Buslov
6bb3f363d6 tc: Set action flags for tunnel_key release.
The commit that enabled 'no_percpu' flag for compatible actions missed the
tunnel_key release action code. Add missing call to nl_msg_put_act_flags().

Fixes: 292d5bd9bb34 ("tc: Set 'no_percpu' flag for compatible actions")
Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Tested-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-08-16 19:40:29 +02:00
Eelco Chaudron
464b5b13e6 netdev-offload-tc: Verify the flower rule installed.
When OVS installs the flower rule, it only checks for the OK from the
kernel. It does not check if the rule requested matches the one
actually programmed. This change adds this check and warns the
user if this is not the case.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-07-09 19:54:56 +02:00
Jianbo Liu
47e73f7f00 netdev-offload-tc: Flush rules on all chains before attach ingress block
Previously, only chain 0 was deleted before attaching the ingress block.
If there were rules on chains other than 0, these rules were not flushed.
In this case, the recreation of the qdisc also fails. To fix this issue,
flush rules from all chains.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2021-02-24 11:07:20 +01:00
Eelco Chaudron
0f252daa94 tc: Fix mpls bottom of stack bit mask reporting.
Fix the value of the bos mask that is reported back and used by the
revalidator threads.

Fixes: 34b1695506f8 ("lib/tc: add single mpls match offload support")
Reported-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-02-02 19:30:58 +01:00
Maor Dickman
75e1e6fd2d lib/tc: add ICMP type and code match offload
Add TC offload support for classifying ICMPv4/6 type and code.

$ ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(3),eth(),\
  eth_type(0x0800),ipv4(proto=1),icmp(type=9,code=0)' 2

$ ovs-appctl dpctl/dump-flows
  ... icmp(type=9,code=0) ...

$ tc filter show dev <ethx> ingress
  ...
  eth_type ipv4
  ip_proto icmp
  icmp_type 9
  icmp_code 0
  not_in_hw
  action order 1: mirred (Egress Redirect to device <ethy>) stolen
  ...

Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2021-02-01 16:54:15 +01:00
wenxu
71d0c0d8b4 lib/tc: fix parse act pedit for tos rewrite
Check overlap between current pedit key, which is always 4 bytes
(range [off, off + 3]), and a map entry in flower_pedit_map
sf = ROUND_DOWN(mf, 4) (range [sf|mf, (mf + sz - 1)|ef]).

So for the tos rewrite, off + 3 (3) is greater than mf, and should be
checked against ef (4), not against mf + sz (2).
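
The arithmetic can be sketched as follows. This is an illustration, not the
code in lib/tc.c: the function names are hypothetical, ef is assumed to be
mf + sz rounded up to a multiple of 4 (consistent with ef = 4 for tos), and
the unaligned variant is shown only for contrast:

```c
#include <stdbool.h>

#define ROUND_DOWN4(x) ((x) & ~3u)
#define ROUND_UP4(x)   (((x) + 3u) & ~3u)

/* Unaligned check, for contrast only: tests the first and last byte of
 * the 4-byte pedit key [off, off + 3] against [mf, mf + sz). */
static bool
overlaps_unaligned(unsigned off, unsigned mf, unsigned sz)
{
    return (off >= mf && off < mf + sz)
           || (off + 3 >= mf && off + 3 < mf + sz);
}

/* Aligned check: the same tests against [sf, ef), where sf rounds mf
 * down and ef rounds mf + sz up to a 4-byte boundary. */
static bool
overlaps_aligned(unsigned off, unsigned mf, unsigned sz)
{
    unsigned sf = ROUND_DOWN4(mf);
    unsigned ef = ROUND_UP4(mf + sz);

    return (off >= sf && off < ef)
           || (off + 3 >= sf && off + 3 < ef);
}
```

For tos (mf = 1, sz = 1) and a key at off = 0, the unaligned check finds no
overlap because off + 3 = 3 is not below mf + sz = 2, while the aligned check
does, since 3 is below ef = 4.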

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-12-04 07:33:14 +01:00