mir/ovs - ovs - Mike's Git repositories

mir/ovs

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

Author	SHA1	Message	Date
Maor Dickman	6280f5d04a	netdev-offload-tc: Reserve lower tc prio for vlan ethertype. The cited commit reserved lower tc priorities for IP ethertypes in order to give IP traffic higher priority than other management traffic. In case of of vlan encap traffic, IP traffic will still get lower priority. Fix it by also reserving low priority tc prio for vlan. Fixes: c230c7579c14 ("netdev-offload-tc: Reserve lower tc prios for ip ethertypes") Signed-off-by: Maor Dickman <maord@nvidia.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <horms@ovn.org>	2024-06-06 13:45:48 +01:00
Timothy Redaelli	be695f26fd	netdev-offload-tc: Check geneve metadata length. Currently ovs-vswitchd crashes, with hw offloading enabled, if a geneve packet with corrupted metadata is received, because the metadata header is not verified correctly. This commit adds a check for geneve metadata length and, if the header is wrong, the packet is not sent to flower. It also includes a system-traffic test for geneve packets with corrupted metadata. Fixes: a468645c6d33 ("lib/tc: add geneve with option match offload") Reported-by: Haresh Khandelwal <hakhande@redhat.com> Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-08 16:25:42 +01:00
Ilya Maximets	472dd66423	netdev-offload-tc: Fix offload of tunnel key tp_src. There is no TCA_TUNNEL_KEY_ENC_SRC_PORT in the kernel, so the offload should not be attempted if OVS_TUNNEL_KEY_ATTR_TP_SRC is requested by OVS. Current code just ignores the attribute in the tunnel(set()) action leading to a flow mismatch and potential incorrect datapath behavior: \|tc(handler21)\|DBG\|tc flower compare failed action compare ... Action 0 mismatch: - Expected Action: 00000010 01 00 00 00 00 00 00 00-00 00 00 00 00 ff 00 11 00000020 c0 5b 17 c1 00 40 00 00-0a 01 00 6d 0a 01 01 12 00000050 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000060 01 02 80 01 00 18 00 0b-00 00 00 00 00 00 00 00 ... - Received Action: 00000010 01 00 00 00 00 00 00 00-00 00 00 00 00 ff 00 11 00000020 00 00 17 c1 00 40 00 00-0a 01 00 6d 0a 01 01 12 00000050 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000060 01 02 80 01 00 18 00 0b-00 00 00 00 00 00 00 00 ... In the tc_action dump above we can see the difference on the second line. The action dumped from a kernel is missing 'c0 5b' - source port for a tunnel(set()) action on the second line. Removing the field from the tc_action_encap structure entirely to avoid any potential confusion. Note: In general, the source port number in the tunnel(set()) action is not particularly useful for most tunnels, because they may just ignore the value. Specs for Geneve and VXLAN suggest using a value based on the headers of the inner packet as a source port. In vast majority of scenarios the source port doesn't actually end up in the action itself. Having a mismatch between the userspace and TC leads to constant revalidation of the flow and warnings in the log. Adding a test case that demonstrates a scenario where the issue occurs - bridging of two tunnels. Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc interface") Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-October/052744.html Reported-by: Vladislav Odintsov <odivlad@gmail.com> Tested-by: Vladislav Odintsov <odivlad@gmail.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-12-04 17:57:24 +01:00
Gavin Li	7f04588d78	netdev-tc-offloads: Probe for allowing vxlan gbp support. Kernels that do not support vxlan gbp would treat the rule that has vxlan gbp encap action or vxlan gbp id match differently, either reject it or just skip the action/match and continue processing the knowing ones. To solve the issue, probe and disallow inserting rules with vxlan gbp action/match if kernel does not support it. Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	a2a3f1983f	tc: Add vxlan encap action with gbp option offload. Add TC offload support for vxlan encap with gbp option. Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	a4332b5e68	tc: Add vxlan gbp option flower match offload. Add TC offload support for filtering vxlan tunnels with gbp option. Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Roi Dayan	77d8228985	tc: Fix cleaning chains. Sometimes there is a need to clean empty chains as done in delete_chains_from_netdev(). The cited commit doesn't remove the chain completely which cause adding ingress_block later to fail. This can be reproduced with adding bond as ovs port which makes ovs use ingress_block for it. While at it add the netdev name that fails to the log. Fixes: e1e5eac5b016 ("tc: Add TCA_KIND flower to delete and get operation to avoid rtnl_lock().") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-04-28 19:01:14 +02:00
Faicker Mo	f9507c1ea4	netdev-offload-tc: Del ufid mapping if device not exist. The device may be deleted and added with ifindex changed. The tc rules on the device will be deleted if the device is deleted. The func tc_del_filter will fail when flow del. The mapping of ufid to tc will not be deleted. The traffic will trigger the same flow(with same ufid) to put to tc on the new device. Duplicated ufid mapping will be added. If the hashmap is expanded, the old mapping entry will be the first entry, and now the dp flow can't be deleted. Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-04-03 20:39:33 +02:00
Mike Pattrick	306583b568	netdev-tc-offloads: Fix misaligned 8 byte read. UB Sanitizer report: lib/netdev-offload-tc.c:1276:19: runtime error: load of misaligned address 0x7f74e801976c for type 'union ovs_u128', which requires 8 byte alignment 0 in netdev_tc_flow_dump_next lib/netdev-offload-tc.c:1276 1 in netdev_flow_dump_next lib/netdev-offload.c:303 2 in dpif_netlink_flow_dump_next lib/dpif-netlink.c:1921 [...] Fixes: 8f7620e6a406 ("netdev-tc-offloads: Implement netdev flow dump api using tc interface") Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-03-29 23:20:06 +02:00
Eelco Chaudron	d53ee36aa6	netdev-offload-tc: Fix parse_tc_flower_to_actions() reporting errors. parse_tc_flower_to_actions() was not reporting errors, which would cause parse_tc_flower_to_match() to ignore them. Fixes: dd03672f7bbb ("netdev-offload-tc: Move flower_to_match action handling to isolated function.") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-03-22 20:06:34 +01:00
Eelco Chaudron	b292cce2ff	netdev-offload-tc: Conntrack ALGs are not supported with tc. tc does not support conntrack ALGs. Even worse, with tc enabled, they should not be used/configured at all. This is because even though TC will ignore the rules with ALG configured, i.e., they will flow through the kernel module, return traffic might flow through a tc conntrack rule, and it will not invoke the ALG helper. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-02-09 00:32:55 +01:00
Eelco Chaudron	564d09ef53	netdev-offload-tc: Fix tc conntrack force commit support. tc was not setting the OVS_CT_ATTR_FORCE_COMMIT flag when a forced commit was requested. This patch will fix this. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-02-09 00:25:36 +01:00
Eelco Chaudron	f68e757ef1	tests: Include working system-traffic tests into the system-offloads-testsuite. Include and run the system-traffic.at tests as part of the system offload testsuite. Exclude all the tests that will not run without any special modifications. Lowered log level for "recirc_id sharing not supported" message, so tests will not fail with older kernels. This is not an error level message, but should be debug, like all other, EOPNOTSUPP, related log messages. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-02-08 22:14:22 +01:00
Eelco Chaudron	b1f58f5072	netdev-offload-tc: Preserve tc statistics when flow gets modified. When a flow gets modified, i.e. the actions are changes, the tc layer will remove, and re-add the flow. This is causing all the counters to be reset. This patch will remember the previous tc counters and adjust any requests for statistics. This is done in a similar way as the rte_flow implementation. It also updates the check_pkt_len tc test to purge the flows, so we do not use existing updated tc flow counters, but start with fresh installed set of datapath flows. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-02-03 16:39:59 +01:00
Eelco Chaudron	e1e5eac5b0	tc: Add TCA_KIND flower to delete and get operation to avoid rtnl_lock(). A long long time ago, an effort was made to make tc flower rtnl_lock() free. However, on the OVS part we forgot to add the TCA_KIND "flower" attribute, which tell the kernel to skip the lock. This patch corrects this by adding the attribute for the delete and get operations. The kernel code calls tcf_proto_is_unlocked() to determine the rtnl_lock() is needed for the specific tc protocol. It does this in the tc_new_tfilter(), tc_del_tfilter(), and in tc_get_tfilter(). If the name is not set, tcf_proto_is_unlocked() will always return false. If set, the specific protocol is queried for unlocked support. Fixes: f98e418fbdb6 ("tc: Add tc flower functions") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-01-30 21:12:31 +01:00
Ilya Maximets	9117f4d54f	netdev-offload-tc: Fix misaligned access to ct label. UndefinedBehaviorSanitizer: lib/netdev-offload-tc.c:1356:50: runtime error: member access within misaligned address 0x60700001a89c for type 'const struct (unnamed struct at lib/netdev-offload-tc.c:1350:27)', which requires 8 byte alignment 0x60700001a89c: note: pointer points here 24 00 04 00 01 00 00 05 00 00 0d 00 0a 00 00 00 00 00 00 00 ... ^ 0 0xd5d183 in parse_put_flow_ct_action lib/netdev-offload-tc.c:1356:50 1 0xd5783f in netdev_tc_parse_nl_actions lib/netdev-offload-tc.c:2015:19 2 0xd4027c in netdev_tc_flow_put lib/netdev-offload-tc.c:2355:11 3 0x9666d7 in netdev_flow_put lib/netdev-offload.c:318:14 4 0xcd4c0a in parse_flow_put lib/dpif-netlink.c:2297:11 5 0xcd4c0a in try_send_to_netdev lib/dpif-netlink.c:2384:15 6 0xcd4c0a in dpif_netlink_operate lib/dpif-netlink.c:2455:23 7 0x87d40e in dpif_operate lib/dpif.c:1372:13 8 0x6d43e9 in handle_upcalls ofproto/ofproto-dpif-upcall.c:1674:5 9 0x6d43e9 in recv_upcalls ofproto/ofproto-dpif-upcall.c:905:9 10 0x6cf6ea in udpif_upcall_handler ofproto/ofproto-dpif-upcall.c:801:13 11 0xb6d7ea in ovsthread_wrapper lib/ovs-thread.c:423:12 12 0x7f5ccf017801 in start_thread 13 0x7f5ccefb744f in __GI___clone3 Fixes: 9221c721bec0 ("netdev-offload-tc: Add conntrack label and mark support") Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-01-27 16:57:27 +01:00
Paul Blakey	c230c7579c	netdev-offload-tc: Reserve lower tc prios for ip ethertypes Currently ethertype to prio hmap is static and the first ethertype being used gets a lower priority. Usually there is an arp request before the ip traffic and the arp ethertype gets a lower tc priority while the ip traffic proto gets a higher priority. In this case ip traffic will go through more hops in tc and HW. Instead, reserve lower priorities for ip ethertypes. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-11-08 06:59:20 -05:00
Ilya Maximets	1336eeb570	netdev-offload-tc: Parse tunnel options only for geneve ports. Cited commit correctly fixed the issue of incorrect reporting of zero-length geneve option matches as wildcarded. But as a side effect, exact match on the metadata length was added to every tunnel flow match, even if the tunnel is not geneve. That doesn't generate any functional issues, but it maybe confusing to see 'tunnel(...,geneve(),...)' while looking at datapath flow dumps for, e.g., vxlan tunnel flows. Fix that by checking the port type before parsing geneve options. tunnel() attribute itself doesn't have enough information to figure out the tunnel type. Fixes: 7a6c8074c5d2 ("netdev-offload-tc: Fix the mask for tunnel metadata length.") Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-30 22:37:35 +02:00
Ilya Maximets	ff55e8f385	netdev-offload-tc: Add missing handling of the tunnel source port. netdev_tc_flow_put() "consumes" the tunnel.tp_src value, but it's never passed down to TC, and not parsed back. Fix that. Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:40:00 +02:00
Ilya Maximets	262eded5fb	netdev-offload-tc: Fix ignoring unknown tunnel keys. Current offloading code supports only limited number of tunnel keys and silently ignores everything it doesn't understand. This is causing, for example, offloaded ERSPAN tunnels to not work, because flow is offloaded, but ERSPAN options are not provided to TC. There is a number of tunnel keys, which are supported by the userspace, but silently ignored during offloading: OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT OVS_TUNNEL_KEY_ATTR_OAM OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS OVS_TUNNEL_KEY_ATTR_CSUM is kind of supported, but only for actions and for some reason is set from the tunnel port instead of the provided action, and not currently supported for the tunnel key in the match. Addig a default case to fail offloading of unknown attributes. For now explicitly allowing incorrect behavior for the DONT_FRAGMENT flag, otherwise we'll break all tunnel offloading by default. VXLAN and ERSPAN options has to fail offloading, because the tunnel will not work otherwise. OAM is not a default configurations, so failing it as well. The missing DONT_FRAGMENT flag though should, probably, cause frequent flow revalidation, but that is not new with this patch. Same for the 'match' key, only clearing masks that was actually consumed, except for the DONT_FRAGMENT and CSUM flags, which are explicitly allowed and highlighted as broken. Also, destination port as well as CSUM configuration for unknown reason was not taken from the actions list and were passed via HW offload info instead of being consumed from the set() action. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/395522.html Reported-by: Eelco Chaudron <echaudro@redhat.com> Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc interface") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:51 +02:00
Ilya Maximets	7a26634678	netdev-offload-tc: Use masks instead of keys while parsing tunnel attributes. If the key is zero, it doesn't mean we don't need to match on it. Masks should be checked instead. Fixes: 49a7961fca65 ("lib/tc: Avoid matching on tunnel ttl or tos if not needed") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:51 +02:00
Ilya Maximets	5d91bdf12e	netdev-offload-tc: Explicitly handle mask for the tunnel destination port. netdev_tc_flow_put() ignores the tunnel.tp_dst mask. That results in the exact match on the value. TC supports the masked match on this field and it does return the mask back during the flow dump even if it wasn't provided initially. OVS should correctly handle that. There is a problem though. Some drivers (mlx5) doesn't support offloading if the destination port is not an exact match [1]. Keeping the logic as-is for now, but making it explicit and somewhat documented in the comment, so it is clear what is happening and we can revisit this in the future. [1] https://patchwork.ozlabs.org/project/openvswitch/patch/20220704224505.1117988-3-i.maximets@ovn.org/#2927396 Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:51 +02:00
Ilya Maximets	7a6c8074c5	netdev-offload-tc: Fix the mask for tunnel metadata length. 'wc.masks.tunnel.metadata.present.len' must be a mask for the same field in the flow key, but in the tc_flower structure it's the real length of metadata masks. That is correctly handled for the individual opt->length, setting all the masks to 0x1f like it's done in the tun_metadata_to_geneve_mask__(), but not handled for the main 'len' field. Fix that by setting the mask to 0xff like it's done during the flow translation in xlate_actions() and during the flow dump in the tun_metadata_from_geneve_nlattr(). Also, flower always has an exact match on the present.len field regardless of its value and regardless of this field being masked by OVS flow translation layer while installing the flow. Hence, all tunnel flows dumped from TC should have an exact match on present.len and also UDPIF flag, because present.len doesn't make sense without that flag. Without the change, zero-length options match is incorrectly reported as a wildcard match. The side effect though is that zero-length match on geneve options is reported even for other tunnel types, e.g. vxlan. But that should be fairly harmless. To avoid reporting a match on empty geneve options for vxlan/etc. tunnels we'll need to check the tunnel port type, there is no enough information in the TUNNEL attribute itself. Extra checks and comments added around the code to better explain what is going on. Fixes: a468645c6d33 ("lib/tc: add geneve with option match offload") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:45 +02:00
Ilya Maximets	1fd336ccee	netdev-offload-tc: Disable offload of IPv6 fragments. OVS kernel datapath and TC are parsing IPv6 fragments differently. For IPv6 later fragments, according to the original design [1], OVS always sets nw_proto=44 (IPPROTO_FRAGMENT), regardless of the type of the L4 protocol. This leads to situation where flow for nw_proto=44 gets installed to TC, but packets can not match on it, causing all IPv6 later fragments to go to userspace significantly degrading performance. Disabling offload for such packets, so the flow can be installed to the OVS kernel datapath instead. Disabling for all IPv6 fragments including the first one, because it doesn't make a lot of sense to handle them separately. It may also cause potential problems with conntrack trying to re-assemble a packet from fragments handled by different datapaths (first in HW, later in OVS kernel). Checking both 'nw_proto' and 'nw_frag' as classifier might decide to match only on one of them and also nw_proto will not be 44 for the first fragment. The issue was hidden for some time due to incorrect behavior of the OVS kernel datapath that was recently fixed in kernel commit: 12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments") To allow offloading in the future either flow dissector in TC should be changed to parse packets in the same way as OVS does, or parsing in OVS kernel and userspace should be made configurable, so users can opt-in to the behavior change. Silent change of the behavior (change by default) is not an option, because existing OpenFlow pipelines may depend on a certain behavior. [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation") Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-08 19:15:09 +02:00
Ilya Maximets	a7680c3caf	netdev-offload-tc: Print unused mask bits on failure. This change extends the debug logging with the sparse dump of the flow mask structure to make debug process easier. Sample output: \|netdev_offload_tc\|DBG\|offloading isn't supported, unknown attribute Unused mask bits: 00000270 00 00 00 00 00 00 00 00-00 00 00 ff 00 00 00 00 In this example, 0x270 + 11 = 635, which is an offset of the nsh.mdtype in the struct flow. Suggested-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-04 14:18:09 +02:00
Eelco Chaudron	e4daf88a43	netdev-offload-tc: Handle check_pkt_len datapath action. This change implements support for the check_pkt_len action using the TC police action, which supports MTU checking. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-13 17:39:22 +02:00
Eelco Chaudron	dd03672f7b	netdev-offload-tc: Move flower_to_match action handling to isolated function. Move handling of the individual actions in the parse_tc_flower_to_match() function to a separate function that will make recursive action handling easier. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-13 17:39:22 +02:00
Eelco Chaudron	64365293a5	netdev-offload-tc: Move flow_put action handling to isolated function. Move handling of the individual actions in the netdev_tc_flow_put() function to a separate function that will make recursive action handling easier. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-13 17:39:22 +02:00
Jianbo Liu	218304df18	netdev-offload-tc: Offloading rules with police actions When offloading rule, tc should be filled with police index, instead of meter id. As meter is mapped to police action, and the mapping is inserted into meter_id_to_police_idx hmap, this hmap is used to find the police index. Besides, the reverse mapping between meter id and police index is also added, so meter id can be retrieved from this hashmap and pass to dpif while dumping rules. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:20:50 +02:00
Jianbo Liu	f6ed09ef55	netdev-offload-tc: Cleanup police actions with reserved indexes on startup As the police actions with indexes of 0x10000000-0x1fffffff are reserved for meter offload, to provide a clean environment for OVS, these reserved police actions should be deleted on startup. So dump all the police actions, delete those actions with indexes in this range. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:20:37 +02:00
Jianbo Liu	4c226944f7	netdev-offload-tc: Implement meter offload API for tc For dpif-netlink, meters are mapped by tc to police actions with one-to-one relationship. Implement meter offload API to set/get/del the police action, and a hmap is used to save the mappings. An id-pool is used to manage all the available police indexes, which are 0x10000000-0x1fffffff, reserved only for OVS. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:20:04 +02:00
Jianbo Liu	a9b8cdde69	tc: Add support parsing tc police action Add function to parse police action from netlink message. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:16:56 +02:00
Eelco Chaudron	d632ad0aa1	odp_util: Fix parse_key_and_mask_to_match() vlan parsing. The parse_key_and_mask_to_match() is a function to translate a netlink formatted key/mask to match structure. And should not consider any configuration setting when translating. In addition we also enforce the encap_eth_type[0] mask as it's required for the VLAN match. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-24 20:24:55 +02:00
Eelco Chaudron	21b76c774d	netdev-offload-tc: Check for ct_state flag combinations that are not offloadable. This patch checks for none offloadable ct_state match flag combinations. If they exist force the +trk flag down to TC Flower Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-06 13:34:45 +02:00
Adrian Moreno	9e56549c2b	hmap: use short version of safe loops if possible. Using SHORT version of the *_SAFE loops makes the code cleaner and less error prone. So, use the SHORT version and remove the extra variable when possible for hmap and all its derived types. In order to be able to use both long and short versions without changing the name of the macro for all the clients, overload the existing name and select the appropriate version depending on the number of arguments. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Eelco Chaudron	db40eb79ec	netdev-offload-tc: Check for valid netdev ifindex in flow_put. Verify that the returned ifindex by netdev_get_ifindex() is valid. This might not be the case in the ERSPAN port scenario, which can not be offloaded. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-21 00:31:24 +01:00
Chris Mi	920ec5761e	tc: Keep header rewrite actions order. Currently, tc merges all header rewrite actions into one tc pedit action. So the header rewrite actions order is lost. Save each header rewrite action into one tc pedit action to keep the order. And only append one tc csum action to the last pedit action of a series. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-18 17:41:14 +01:00
Paul Blakey	f34a7626cc	tc: Fix stats byte count on fragmented packets. Fragmented packets with offset=0 are defragmented by tc act_ct, and only when assembled pass to next action, in ovs offload case, a goto action. Since stats are overwritten on each action dump, only the stats for last action in the tc filter action priority list is taken, the stats on the goto action, which count only the assembled packets. See below for example. Hardware updates just part of the actions (gact, ct, mirred) - those that support stats_update() operation. Since datapath rules end with either an output (mirred) or recirc/drop (both gact), tc rule will at least have one action that supports it. For software packets, the first action will have the max software packets count. Tc dumps total packets (hw + sw) and hardware packets, then software packets needs to be calculated from this (total - hw). To fix the above, get hardware packets and calculate software packets for each action, take the max of each set, then combine back to get the total packets that went through software and hardware. Example by running ping above MTU (ping <IP> -s 2000): ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=first), packets:14, bytes:19544,..., actions:ct(zone=1),recirc(0x1) ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=later), packets:14, bytes:28392,..., actions:ct(zone=1),recirc(0x1) Second rule should have had bytes=14<size of 'later' frag>, but instead it's bytes=14<size of assembled packets - size of 'first' + 'later' frags>. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-04 16:02:45 +01:00
Paolo Valerio	63c01b8956	netdev-offload-tc: Use nl_msg_put_flag for OVS_TUNNEL_KEY_ATTR_CSUM. When a tunnel port gets added to the bridge setting the checksum option to true: ovs-vsctl add-port br0 geneve0 \ -- set interface geneve0 type=geneve \ options:remote_ip=<remote_ip> options:key=<key> options:csum=true the flow dump for the outgoing traffic will include a "bad key length 1 ..." message: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, key6(bad key length 1, expected 0)(01)flags(key))) ,genev_sys_6081 This is due to a mismatch present between the expected length (zero for OVS_TUNNEL_KEY_ATTR_CSUM in ovs_tun_key_attr_lens) and the current one. With this patch the same flow dump becomes: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, flags(csum\|key))),genev_sys_6081 Fixes: d9677a1f0eaf ("netdev-tc-offloads: TC csum option is not matched with tunnel configuration") Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-06-30 23:56:03 +02:00
Ilya Maximets	a1ec428037	netdev-offload: Disallow offloading to unrelated tunneling vports. 'linux_tc' flow API suitable only for tunneling vports with backing linux interfaces. DPDK flow API is not suitable for such ports. With this change we could drop vport restriction from dpif-netdev. This is a prerequisite for enabling vport offloading in DPDK. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Eli Britstein <elibr@nvidia.com> Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com> Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Tested-by: Emma Finn <emma.finn@intel.com> Tested-by: Marko Kovacevic <marko.kovacevic@intel.com>	2021-06-24 22:22:08 +02:00
Ariel Levkovich	ea71a9d443	netdev-offload-tc: Add support for ct_state flag rel. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-04-19 23:57:05 +02:00
Paul Blakey	edcfd7176f	netdev-offload-tc: Add support for ct_state flags inv and rpl Signed-off-by: Paul Blakey <paulb@nvidia.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-03-15 19:38:24 +01:00
Paul Blakey	1e4aa061ac	netdev-offload-tc: Probe for support for any of the ct_state flags Upstream kernel now rejects unsupported ct_state flags. Earlier kernels, ignored it but still echoed back the requested ct_state, if ct_state was supported. ct_state initial support had trk, new, est, and rel flags. If kernel echos back ct_state, assume support for trk, new, est, and rel. If kernel rejects a specific unsupported flag, continue and use reject mechanisim to probe for flags rep and inv. Disallow inserting rules with unnsupported ct_state flags. Signed-off-by: Paul Blakey <paulb@nvidia.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-03-15 19:38:24 +01:00
Jianbo Liu	47e73f7f00	netdev-offload-tc: Flush rules on all chains before attach ingress block Previously, only chain 0 is deleted before attach the ingress block. If there are rules on the chain other than 0, these rules are not flushed. In this case, the recreation of qdisc also fails. To fix this issue, flush rules from all chains. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-02-24 11:07:20 +01:00
wenxu	498cf3eaed	netdev-offload-tc: Reject rules with unsupported ct_state flags. TC flower doesn't support some ct state flags such as INVALID/SNAT/DNAT/REPLY. So it is better to reject this rule. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-02-04 20:11:12 +01:00
Maor Dickman	75e1e6fd2d	lib/tc: add ICMP type and code match offload Add TC offload support for classifying ICMPv4/6 type and code. $ ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(3),eth(),\ eth_type(0x0800),ipv4(proto=1),icmp(type=9,code=0)' 2 $ ovs-appctl dpctl/dump-flows ... icmp(type=9,code=0) ... $ tc filter show dev <ethx> ingress ... eth_type ipv4 ip_proto icmp icmp_type 9 icmp_code 0 not_in_hw action order 1: mirred (Egress Redirect to device <ethy>) stolen ... Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-02-01 16:54:15 +01:00
Jianbo Liu	c5b4b0ce95	dpif-netlink: Fix issues of the offloaded flows counter. The n_offloaded_flows counter is saved in dpif, and this is the first one when ofproto is created. When flow operation is done by ovs-appctl commands, such as, dpctl/add-flow, a new dpif is opened, and the n_offloaded_flows in it can't be used. So, instead of using counter, the number of offloaded flows is queried from each netdev, then sum them up. To achieve this, a new API is added in netdev_flow_api to get how many flows assigned to a netdev. In order to get better performance, this number is calculated directly from tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc takes much time if there are many flows offloaded. Fixes: af0618470507 ("dpif-netlink: Count the number of offloaded rules") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-21 20:25:59 +01:00
Roi Dayan	ecadc3a30b	netdev-offload-tc: Use single 'once' variable for probing tc features There is no need for a 'once' variable per probe. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-11-11 11:36:31 +01:00
Jianbo Liu	dd8ca104ac	netdev-tc-offloads: Don't delete ufid mapping if fail to delete filter tc_replace_flower may fail, so the return value must be checked. If not zero, ufid can't be deleted. Otherwise the operations on this filter may fail because its ufid is not found. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-10-23 16:20:46 +02:00
Roi Dayan	d5659751f6	tc: Use skip_hw flag when probing tc features There is no need to pass tc rules to hw when just probing for tc features. this will avoid redundant errors from hw drivers that may happen. Signed-off-by: Roi Dayan <roid@mellanox.com> Acked-By: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-08-06 17:26:46 +02:00

1 2

81 Commits