mir/ovs - ovs - Mike's Git repositories

mir/ovs

mirror of https://github.com/openvswitch/ovs synced 2025-08-29 05:18:13 +00:00

Author	SHA1	Message	Date
Paul Blakey	c230c7579c	netdev-offload-tc: Reserve lower tc prios for ip ethertypes Currently ethertype to prio hmap is static and the first ethertype being used gets a lower priority. Usually there is an arp request before the ip traffic and the arp ethertype gets a lower tc priority while the ip traffic proto gets a higher priority. In this case ip traffic will go through more hops in tc and HW. Instead, reserve lower priorities for ip ethertypes. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-11-08 06:59:20 -05:00
Ilya Maximets	1336eeb570	netdev-offload-tc: Parse tunnel options only for geneve ports. Cited commit correctly fixed the issue of incorrect reporting of zero-length geneve option matches as wildcarded. But as a side effect, exact match on the metadata length was added to every tunnel flow match, even if the tunnel is not geneve. That doesn't generate any functional issues, but it maybe confusing to see 'tunnel(...,geneve(),...)' while looking at datapath flow dumps for, e.g., vxlan tunnel flows. Fix that by checking the port type before parsing geneve options. tunnel() attribute itself doesn't have enough information to figure out the tunnel type. Fixes: 7a6c8074c5d2 ("netdev-offload-tc: Fix the mask for tunnel metadata length.") Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-30 22:37:35 +02:00
Ilya Maximets	ff55e8f385	netdev-offload-tc: Add missing handling of the tunnel source port. netdev_tc_flow_put() "consumes" the tunnel.tp_src value, but it's never passed down to TC, and not parsed back. Fix that. Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:40:00 +02:00
Ilya Maximets	262eded5fb	netdev-offload-tc: Fix ignoring unknown tunnel keys. Current offloading code supports only limited number of tunnel keys and silently ignores everything it doesn't understand. This is causing, for example, offloaded ERSPAN tunnels to not work, because flow is offloaded, but ERSPAN options are not provided to TC. There is a number of tunnel keys, which are supported by the userspace, but silently ignored during offloading: OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT OVS_TUNNEL_KEY_ATTR_OAM OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS OVS_TUNNEL_KEY_ATTR_CSUM is kind of supported, but only for actions and for some reason is set from the tunnel port instead of the provided action, and not currently supported for the tunnel key in the match. Addig a default case to fail offloading of unknown attributes. For now explicitly allowing incorrect behavior for the DONT_FRAGMENT flag, otherwise we'll break all tunnel offloading by default. VXLAN and ERSPAN options has to fail offloading, because the tunnel will not work otherwise. OAM is not a default configurations, so failing it as well. The missing DONT_FRAGMENT flag though should, probably, cause frequent flow revalidation, but that is not new with this patch. Same for the 'match' key, only clearing masks that was actually consumed, except for the DONT_FRAGMENT and CSUM flags, which are explicitly allowed and highlighted as broken. Also, destination port as well as CSUM configuration for unknown reason was not taken from the actions list and were passed via HW offload info instead of being consumed from the set() action. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/395522.html Reported-by: Eelco Chaudron <echaudro@redhat.com> Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc interface") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:51 +02:00
Ilya Maximets	7a26634678	netdev-offload-tc: Use masks instead of keys while parsing tunnel attributes. If the key is zero, it doesn't mean we don't need to match on it. Masks should be checked instead. Fixes: 49a7961fca65 ("lib/tc: Avoid matching on tunnel ttl or tos if not needed") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:51 +02:00
Ilya Maximets	5d91bdf12e	netdev-offload-tc: Explicitly handle mask for the tunnel destination port. netdev_tc_flow_put() ignores the tunnel.tp_dst mask. That results in the exact match on the value. TC supports the masked match on this field and it does return the mask back during the flow dump even if it wasn't provided initially. OVS should correctly handle that. There is a problem though. Some drivers (mlx5) doesn't support offloading if the destination port is not an exact match [1]. Keeping the logic as-is for now, but making it explicit and somewhat documented in the comment, so it is clear what is happening and we can revisit this in the future. [1] https://patchwork.ozlabs.org/project/openvswitch/patch/20220704224505.1117988-3-i.maximets@ovn.org/#2927396 Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:51 +02:00
Ilya Maximets	7a6c8074c5	netdev-offload-tc: Fix the mask for tunnel metadata length. 'wc.masks.tunnel.metadata.present.len' must be a mask for the same field in the flow key, but in the tc_flower structure it's the real length of metadata masks. That is correctly handled for the individual opt->length, setting all the masks to 0x1f like it's done in the tun_metadata_to_geneve_mask__(), but not handled for the main 'len' field. Fix that by setting the mask to 0xff like it's done during the flow translation in xlate_actions() and during the flow dump in the tun_metadata_from_geneve_nlattr(). Also, flower always has an exact match on the present.len field regardless of its value and regardless of this field being masked by OVS flow translation layer while installing the flow. Hence, all tunnel flows dumped from TC should have an exact match on present.len and also UDPIF flag, because present.len doesn't make sense without that flag. Without the change, zero-length options match is incorrectly reported as a wildcard match. The side effect though is that zero-length match on geneve options is reported even for other tunnel types, e.g. vxlan. But that should be fairly harmless. To avoid reporting a match on empty geneve options for vxlan/etc. tunnels we'll need to check the tunnel port type, there is no enough information in the TUNNEL attribute itself. Extra checks and comments added around the code to better explain what is going on. Fixes: a468645c6d33 ("lib/tc: add geneve with option match offload") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:45 +02:00
Ilya Maximets	1fd336ccee	netdev-offload-tc: Disable offload of IPv6 fragments. OVS kernel datapath and TC are parsing IPv6 fragments differently. For IPv6 later fragments, according to the original design [1], OVS always sets nw_proto=44 (IPPROTO_FRAGMENT), regardless of the type of the L4 protocol. This leads to situation where flow for nw_proto=44 gets installed to TC, but packets can not match on it, causing all IPv6 later fragments to go to userspace significantly degrading performance. Disabling offload for such packets, so the flow can be installed to the OVS kernel datapath instead. Disabling for all IPv6 fragments including the first one, because it doesn't make a lot of sense to handle them separately. It may also cause potential problems with conntrack trying to re-assemble a packet from fragments handled by different datapaths (first in HW, later in OVS kernel). Checking both 'nw_proto' and 'nw_frag' as classifier might decide to match only on one of them and also nw_proto will not be 44 for the first fragment. The issue was hidden for some time due to incorrect behavior of the OVS kernel datapath that was recently fixed in kernel commit: 12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments") To allow offloading in the future either flow dissector in TC should be changed to parse packets in the same way as OVS does, or parsing in OVS kernel and userspace should be made configurable, so users can opt-in to the behavior change. Silent change of the behavior (change by default) is not an option, because existing OpenFlow pipelines may depend on a certain behavior. [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation") Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-08 19:15:09 +02:00
Ilya Maximets	a7680c3caf	netdev-offload-tc: Print unused mask bits on failure. This change extends the debug logging with the sparse dump of the flow mask structure to make debug process easier. Sample output: \|netdev_offload_tc\|DBG\|offloading isn't supported, unknown attribute Unused mask bits: 00000270 00 00 00 00 00 00 00 00-00 00 00 ff 00 00 00 00 In this example, 0x270 + 11 = 635, which is an offset of the nsh.mdtype in the struct flow. Suggested-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-04 14:18:09 +02:00
Eelco Chaudron	e4daf88a43	netdev-offload-tc: Handle check_pkt_len datapath action. This change implements support for the check_pkt_len action using the TC police action, which supports MTU checking. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-13 17:39:22 +02:00
Eelco Chaudron	dd03672f7b	netdev-offload-tc: Move flower_to_match action handling to isolated function. Move handling of the individual actions in the parse_tc_flower_to_match() function to a separate function that will make recursive action handling easier. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-13 17:39:22 +02:00
Eelco Chaudron	64365293a5	netdev-offload-tc: Move flow_put action handling to isolated function. Move handling of the individual actions in the netdev_tc_flow_put() function to a separate function that will make recursive action handling easier. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-13 17:39:22 +02:00
Jianbo Liu	218304df18	netdev-offload-tc: Offloading rules with police actions When offloading rule, tc should be filled with police index, instead of meter id. As meter is mapped to police action, and the mapping is inserted into meter_id_to_police_idx hmap, this hmap is used to find the police index. Besides, the reverse mapping between meter id and police index is also added, so meter id can be retrieved from this hashmap and pass to dpif while dumping rules. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:20:50 +02:00
Jianbo Liu	f6ed09ef55	netdev-offload-tc: Cleanup police actions with reserved indexes on startup As the police actions with indexes of 0x10000000-0x1fffffff are reserved for meter offload, to provide a clean environment for OVS, these reserved police actions should be deleted on startup. So dump all the police actions, delete those actions with indexes in this range. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:20:37 +02:00
Jianbo Liu	4c226944f7	netdev-offload-tc: Implement meter offload API for tc For dpif-netlink, meters are mapped by tc to police actions with one-to-one relationship. Implement meter offload API to set/get/del the police action, and a hmap is used to save the mappings. An id-pool is used to manage all the available police indexes, which are 0x10000000-0x1fffffff, reserved only for OVS. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:20:04 +02:00
Jianbo Liu	a9b8cdde69	tc: Add support parsing tc police action Add function to parse police action from netlink message. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:16:56 +02:00
Eelco Chaudron	d632ad0aa1	odp_util: Fix parse_key_and_mask_to_match() vlan parsing. The parse_key_and_mask_to_match() is a function to translate a netlink formatted key/mask to match structure. And should not consider any configuration setting when translating. In addition we also enforce the encap_eth_type[0] mask as it's required for the VLAN match. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-24 20:24:55 +02:00
Eelco Chaudron	21b76c774d	netdev-offload-tc: Check for ct_state flag combinations that are not offloadable. This patch checks for none offloadable ct_state match flag combinations. If they exist force the +trk flag down to TC Flower Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-06 13:34:45 +02:00
Adrian Moreno	9e56549c2b	hmap: use short version of safe loops if possible. Using SHORT version of the *_SAFE loops makes the code cleaner and less error prone. So, use the SHORT version and remove the extra variable when possible for hmap and all its derived types. In order to be able to use both long and short versions without changing the name of the macro for all the clients, overload the existing name and select the appropriate version depending on the number of arguments. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Eelco Chaudron	db40eb79ec	netdev-offload-tc: Check for valid netdev ifindex in flow_put. Verify that the returned ifindex by netdev_get_ifindex() is valid. This might not be the case in the ERSPAN port scenario, which can not be offloaded. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-21 00:31:24 +01:00
Chris Mi	920ec5761e	tc: Keep header rewrite actions order. Currently, tc merges all header rewrite actions into one tc pedit action. So the header rewrite actions order is lost. Save each header rewrite action into one tc pedit action to keep the order. And only append one tc csum action to the last pedit action of a series. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-18 17:41:14 +01:00
Paul Blakey	f34a7626cc	tc: Fix stats byte count on fragmented packets. Fragmented packets with offset=0 are defragmented by tc act_ct, and only when assembled pass to next action, in ovs offload case, a goto action. Since stats are overwritten on each action dump, only the stats for last action in the tc filter action priority list is taken, the stats on the goto action, which count only the assembled packets. See below for example. Hardware updates just part of the actions (gact, ct, mirred) - those that support stats_update() operation. Since datapath rules end with either an output (mirred) or recirc/drop (both gact), tc rule will at least have one action that supports it. For software packets, the first action will have the max software packets count. Tc dumps total packets (hw + sw) and hardware packets, then software packets needs to be calculated from this (total - hw). To fix the above, get hardware packets and calculate software packets for each action, take the max of each set, then combine back to get the total packets that went through software and hardware. Example by running ping above MTU (ping <IP> -s 2000): ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=first), packets:14, bytes:19544,..., actions:ct(zone=1),recirc(0x1) ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=later), packets:14, bytes:28392,..., actions:ct(zone=1),recirc(0x1) Second rule should have had bytes=14<size of 'later' frag>, but instead it's bytes=14<size of assembled packets - size of 'first' + 'later' frags>. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-04 16:02:45 +01:00
Paolo Valerio	63c01b8956	netdev-offload-tc: Use nl_msg_put_flag for OVS_TUNNEL_KEY_ATTR_CSUM. When a tunnel port gets added to the bridge setting the checksum option to true: ovs-vsctl add-port br0 geneve0 \ -- set interface geneve0 type=geneve \ options:remote_ip=<remote_ip> options:key=<key> options:csum=true the flow dump for the outgoing traffic will include a "bad key length 1 ..." message: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, key6(bad key length 1, expected 0)(01)flags(key))) ,genev_sys_6081 This is due to a mismatch present between the expected length (zero for OVS_TUNNEL_KEY_ATTR_CSUM in ovs_tun_key_attr_lens) and the current one. With this patch the same flow dump becomes: ovs-appctl dpctl/dump-flows --names -m ufid:<>, ..., dp:tc, actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081, flags(csum\|key))),genev_sys_6081 Fixes: d9677a1f0eaf ("netdev-tc-offloads: TC csum option is not matched with tunnel configuration") Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-06-30 23:56:03 +02:00
Ilya Maximets	a1ec428037	netdev-offload: Disallow offloading to unrelated tunneling vports. 'linux_tc' flow API suitable only for tunneling vports with backing linux interfaces. DPDK flow API is not suitable for such ports. With this change we could drop vport restriction from dpif-netdev. This is a prerequisite for enabling vport offloading in DPDK. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Eli Britstein <elibr@nvidia.com> Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com> Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Tested-by: Emma Finn <emma.finn@intel.com> Tested-by: Marko Kovacevic <marko.kovacevic@intel.com>	2021-06-24 22:22:08 +02:00
Ariel Levkovich	ea71a9d443	netdev-offload-tc: Add support for ct_state flag rel. Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-04-19 23:57:05 +02:00
Paul Blakey	edcfd7176f	netdev-offload-tc: Add support for ct_state flags inv and rpl Signed-off-by: Paul Blakey <paulb@nvidia.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-03-15 19:38:24 +01:00
Paul Blakey	1e4aa061ac	netdev-offload-tc: Probe for support for any of the ct_state flags Upstream kernel now rejects unsupported ct_state flags. Earlier kernels, ignored it but still echoed back the requested ct_state, if ct_state was supported. ct_state initial support had trk, new, est, and rel flags. If kernel echos back ct_state, assume support for trk, new, est, and rel. If kernel rejects a specific unsupported flag, continue and use reject mechanisim to probe for flags rep and inv. Disallow inserting rules with unnsupported ct_state flags. Signed-off-by: Paul Blakey <paulb@nvidia.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-03-15 19:38:24 +01:00
Jianbo Liu	47e73f7f00	netdev-offload-tc: Flush rules on all chains before attach ingress block Previously, only chain 0 is deleted before attach the ingress block. If there are rules on the chain other than 0, these rules are not flushed. In this case, the recreation of qdisc also fails. To fix this issue, flush rules from all chains. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-02-24 11:07:20 +01:00
wenxu	498cf3eaed	netdev-offload-tc: Reject rules with unsupported ct_state flags. TC flower doesn't support some ct state flags such as INVALID/SNAT/DNAT/REPLY. So it is better to reject this rule. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-02-04 20:11:12 +01:00
Maor Dickman	75e1e6fd2d	lib/tc: add ICMP type and code match offload Add TC offload support for classifying ICMPv4/6 type and code. $ ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(3),eth(),\ eth_type(0x0800),ipv4(proto=1),icmp(type=9,code=0)' 2 $ ovs-appctl dpctl/dump-flows ... icmp(type=9,code=0) ... $ tc filter show dev <ethx> ingress ... eth_type ipv4 ip_proto icmp icmp_type 9 icmp_code 0 not_in_hw action order 1: mirred (Egress Redirect to device <ethy>) stolen ... Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2021-02-01 16:54:15 +01:00
Jianbo Liu	c5b4b0ce95	dpif-netlink: Fix issues of the offloaded flows counter. The n_offloaded_flows counter is saved in dpif, and this is the first one when ofproto is created. When flow operation is done by ovs-appctl commands, such as, dpctl/add-flow, a new dpif is opened, and the n_offloaded_flows in it can't be used. So, instead of using counter, the number of offloaded flows is queried from each netdev, then sum them up. To achieve this, a new API is added in netdev_flow_api to get how many flows assigned to a netdev. In order to get better performance, this number is calculated directly from tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc takes much time if there are many flows offloaded. Fixes: af0618470507 ("dpif-netlink: Count the number of offloaded rules") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-21 20:25:59 +01:00
Roi Dayan	ecadc3a30b	netdev-offload-tc: Use single 'once' variable for probing tc features There is no need for a 'once' variable per probe. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-11-11 11:36:31 +01:00
Jianbo Liu	dd8ca104ac	netdev-tc-offloads: Don't delete ufid mapping if fail to delete filter tc_replace_flower may fail, so the return value must be checked. If not zero, ufid can't be deleted. Otherwise the operations on this filter may fail because its ufid is not found. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-10-23 16:20:46 +02:00
Roi Dayan	d5659751f6	tc: Use skip_hw flag when probing tc features There is no need to pass tc rules to hw when just probing for tc features. this will avoid redundant errors from hw drivers that may happen. Signed-off-by: Roi Dayan <roid@mellanox.com> Acked-By: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-08-06 17:26:46 +02:00
Ilya Maximets	8842fdf1b3	netdev-offload: Use dpif type instead of class. There is no real difference between the 'class' and 'type' in the context of common lookup operations inside netdev-offload module because it only checks the value of pointers without using the value itself. However, 'type' has some meaning and can be used by offload provides on the initialization phase to check if this type of Flow API in pair with the netdev type could be used in particular datapath type. For example, this is needed to check if Linux flow API could be used for current tunneling vport because it could be used only if tunneling vport belongs to system datapath, i.e. has backing linux interface. This is needed to unblock tunneling offloads in userspace datapath with DPDK flow API. Acked-by: Eli Britstein <elibr@mellanox.com> Acked-by: Roni Bar Yanai <roniba@mellanox.com> Acked-by: Ophir Munk <ophirmu@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-07-08 19:07:21 +02:00
William Tu	002682727e	netdev-offload-tc: Add drop action support. Currently drop action is not offloaded when using userspace datapath with tc offload. The patch programs tc gact (generic action) chain ID 0 to drop the packet by setting it to TC_ACT_SHOT. Example: $ ovs-appctl dpctl/add-flow netdev@ovs-netdev \ 'recirc_id(0),in_port(2),eth(),eth_type(0x0806),\ arp(op=2,tha=00:50:56:e1:4b:ab,tip=10.255.1.116)' drop Or no action also infers drop $ ovs-appctl dpctl/add-flow netdev@ovs-netdev \ 'recirc_id(0),in_port(2),eth(),eth_type(0x0806),\ arp(op=2,tha=00:50:56:e1:4b:ab,tip=10.255.1.116)' '' $ tc filter show dev ovs-p0 ingress filter protocol arp pref 2 flower chain 0 filter protocol arp pref 2 flower chain 0 handle 0x1 eth_type arp arp_tip 10.255.1.116 arp_op reply arp_tha 00:50:56:e1:4b:ab skip_hw not_in_hw action order 1: gact action drop ... Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-07-08 10:02:24 +02:00
Roi Dayan	1fe4297563	netdev-offload-tc: Revert tunnel src/dst port masks handling The cited commit intended to add tc support for masking tunnel src/dst ips and ports. It's not possible to do tunnel ports masking with openflow rules and the default mask for tunnel ports set to 0 in tnl_wc_init(), unlike tunnel ports default mask which is full mask. So instead of never passing tunnel ports to tc, revert the changes to tunnel ports to always pass the tunnel port. In sw classification is done by the kernel, but for hw we must match the tunnel dst port. Fixes: 5f568d049130 ("netdev-offload-tc: Allow to match the IP and port mask of tunnel") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-19 08:51:11 +02:00
Tonghao Zhang	a3db6e473d	netdev-offload-tc: Allow installing arp rules to TC dp. This patch allows to install arp rules to tc dp. In the future, arp will be offloaded to hardware to be processed. So OvS enable this now. $ ovs-appctl dpctl/add-flow 'recirc_id(0),in_port(3),eth(),\ eth_type(0x0806),arp(op=2,tha=00:50:56:e1:4b:ab,tip=10.255.1.116)' 2 $ ovs-appctl dpctl/dump-flows ... arp(tip=10.255.1.116,op=2,tha=00:50:56:e1:4b:ab) ... $ tc filter show dev <ethx> ingress ... eth_type arp arp_tip 10.255.1.116 arp_op reply arp_tha 00:50:56:e1:4b:ab not_in_hw action order 1: mirred (Egress Redirect to device <ethy>) stolen ... Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-08 11:08:05 +02:00
Vlad Buslov	5db012c4ac	tc: Support new terse dump kernel API When dumping flows in terse mode set TCA_DUMP_FLAGS attribute to TCA_DUMP_FLAGS_TERSE flag to prevent unnecessary copying of data between kernel and user spaces. Only expect kernel to provide cookie, stats and flags when dumping filters in terse mode. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-05 10:14:27 +02:00
Vlad Buslov	191536574e	netdev-offload: Implement terse dump support In order to improve revalidator performance by minimizing unnecessary copying of data, extend netdev-offloads to support terse dump mode. Extend netdev_flow_api->flow_dump_create() with 'terse' bool argument. Implement support for terse dump in functions that convert netlink to flower and flower to match. Set flow stats "used" value based on difference in number of flow packets because lastuse timestamp is not included in TC terse dump. Kernel API support is implemented in following patch. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-05 10:14:27 +02:00
Tonghao Zhang	3f82ac1fe3	netdev-offload-tc: Expand tunnel source IPs masked match To support more use case, for example, DDOS, which packets should be dropped in hardware, this patch allows users to match only the tunnel source IPs with masked value. $ ovs-appctl dpctl/add-flow "tunnel(src=2.2.2.0/255.255.255.0,tp_dst=4789,ttl=64),\ recirc_id(2),in_port(3),eth(),eth_type(0x0800),ipv4()" "" $ ovs-appctl dpctl/dump-flows tunnel(src=2.2.2.0/255.255.255.0,ttl=64,tp_dst=4789) ... actions:drop $ tc filter show dev vxlan_sys_4789 ingress ... eth_type ipv4 enc_src_ip 2.2.2.0/24 enc_dst_port 4789 enc_ttl 64 in_hw in_hw_count 2 action order 1: gact action drop ... Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-03 09:57:16 +02:00
Tonghao Zhang	5f568d0491	netdev-offload-tc: Allow to match the IP and port mask of tunnel This patch allows users to offload the TC flower rules with tunnel mask. This patch allows masked match of the following, where previously supported an exact match was supported: * Remote (dst) tunnel endpoint address * Local (src) tunnel endpoint address * Remote (dst) tunnel endpoint UDP port And also allows masked match of the following, where previously no match was supported: * Local (src) tunnel endpoint UDP port In some case, mask is useful as wildcards. For example, DDOS, in that case, we don’t want to allow specified hosts IPs or only source Ports to access the targeted host. For example: $ ovs-appctl dpctl/add-flow "tunnel(dst=2.2.2.100,src=2.2.2.0/255.255.255.0,tp_dst=4789),\ recirc_id(0),in_port(3),eth(),eth_type(0x0800),ipv4()" "" $ tc filter show dev vxlan_sys_4789 ingress ... eth_type ipv4 enc_dst_ip 2.2.2.100 enc_src_ip 2.2.2.0/24 enc_dst_port 4789 enc_ttl 64 in_hw in_hw_count 2 action order 1: gact action drop ... Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-03 09:56:07 +02:00
Tonghao Zhang	4f4be08e47	netdev-offload-tc: Use ipv6_addr_is_set instead of is_all_zeros Not bugfix, make the codes more readable. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-03 09:53:30 +02:00
Aaron Conole	8508a57228	netdev-offload-tc: Re-fetch block ID after probing. It's possible that block_id could changes after the probe for block support. Therefore, fetch the block_id again after the probe. Fixes: edc2055a2bf7 ("netdev-offload-tc: Flush rules on ingress block when init tc flow api") Cc: Dmytro Linkin <dmitrolin@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com> Co-authored-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-05-16 13:15:50 +02:00
Dmytro Linkin	edc2055a2b	netdev-offload-tc: Flush rules on ingress block when init tc flow api OVS can fail to attach ingress block on iface when init tc flow api, if block already exist with rules on it and is shared with other iface. Fix by flush all existing rules on the ingress block prior to deleting it. Fixes: 093c9458fb02 ("tc: allow offloading of block ids") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Acked-by: Raed Salem <raeds@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-03-25 18:44:32 +01:00
wenxu	65b84d4a32	dpif-netlink: avoid netlink modify flow put op failed after tc modify flow put op failed. The tc modify flow put always delete the original flow first and then add the new flow. If the modfiy flow put operation failed, the flow put operation will change from modify to create if success to delete the original flow in tc (which will be always failed with ENOENT, the flow is already be deleted before add the new flow in tc). Finally, the modify flow put will failed to add in kernel datapath. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-03-19 11:58:45 +01:00
John Hurley	56c8027b5f	tc: handle packet mark of zero Openstack may set an skb mark of 0 in tunnel rules. This is considered to be an unused/unset value. However, it prevents the rule from being offloaded. Check if the key value of the skb mark is 0 when it is in use (mask is set to all ones). If it is then ignore the field and continue with TC offload. Only the exact-match case is covered by this patch as it addresses the Openstack use-case and seems most robust against feature evolution: f.e. in future there may exist hardware offload scenarios where an operation, such as a BPF offload, sets the SKB mark before proceeding tho the in-HW OVS. datapath. Signed-off-by: John Hurley <john.hurley@netronome.com> Co-Authored-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Aaron Conole <aconole@redhat.com>	2020-01-22 14:20:18 +01:00
Ilya Maximets	d7b55c5c94	dpif: Fix leak and usage of uninitialized dp_extra_info. 'dpif_probe_feature'/'revalidate' doesn't free the 'dp_extra_info' string. Also, all the implementations of dpif_flow_get() should initialize the value to avoid printing/freeing of random memory. 30 bytes in 1 blocks are definitely lost in loss record 323 of 889 at 0x483AD19: realloc (vg_replace_malloc.c:836) by 0xDDAD89: xrealloc (util.c:149) by 0xCE1609: ds_reserve (dynamic-string.c:63) by 0xCE1A90: ds_put_format_valist (dynamic-string.c:161) by 0xCE19B9: ds_put_format (dynamic-string.c:142) by 0xCCCEA9: dp_netdev_flow_to_dpif_flow (dpif-netdev.c:3170) by 0xCCD2DD: dpif_netdev_flow_get (dpif-netdev.c:3278) by 0xCCEA0A: dpif_netdev_operate (dpif-netdev.c:3868) by 0xCDF81B: dpif_operate (dpif.c:1361) by 0xCDEE93: dpif_flow_get (dpif.c:1002) by 0xCDECF9: dpif_probe_feature (dpif.c:962) by 0xC635D2: check_recirc (ofproto-dpif.c:896) by 0xC65C02: check_support (ofproto-dpif.c:1567) by 0xC63274: open_dpif_backer (ofproto-dpif.c:818) by 0xC65E3E: construct (ofproto-dpif.c:1605) by 0xC4D436: ofproto_create (ofproto.c:549) by 0xC3931A: bridge_reconfigure (bridge.c:877) by 0xC3FEAC: bridge_run (bridge.c:3324) by 0xC4551D: main (ovs-vswitchd.c:127) CC: Emma Finn <emma.finn@intel.com> Fixes: 0e8f5c6a38d0 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Roi Dayan <roid@mellanox.com>	2020-01-20 17:51:16 +01:00
Ilya Maximets	695e350919	netdev-offload-tc: Fix crash if offloading is not configured on outdev. If output device is not yet added to netdev-offload, netdev_ports_get() will not find it leading to NULL pointer dereference inside netdev_get_ifindex(). Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc interface") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>	2020-01-15 13:57:00 +01:00
Ilya Maximets	aeee33444c	netdev-offload-tc: Fix using uninitialized recirc_act. Fixes: b2ae40690ed7 ("netdev-offload-tc: Add recirculation support via tc chains") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-01-07 09:54:35 +01:00

1 2

65 Commits