mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-22 09:58:01 +00:00

Author	SHA1	Message	Date
David Marchand	cf7b86db1f	dp-packet: Rework TCP segmentation. Rather than marking packets with an offload flag plus a segmentation size, simply rely on the netdev implementation, which sets a segmentation size when appropriate. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 21:03:09 +02:00
David Marchand	2956a61265	dp-packet: Rework L4 checksum offloads. The DPDK mbuf API specifies 4 status when it comes to L4 checksums: - RTE_MBUF_F_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum - RTE_MBUF_F_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong - RTE_MBUF_F_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid - RTE_MBUF_F_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet data, but the integrity of the L4 data is verified. Similarly to the IP checksum offloads API, revise OVS L4 offloads API. No information about the L4 protocol is provided by any netdev-* implementation, so OVS needs to mark this L4 protocol during flow extraction. Rename current API for consistency with dp_packet_(inner_)?l4_checksum_. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 21:02:56 +02:00
David Marchand	3daf04a4c5	dp-packet: Rework IP checksum offloads. As the packet traverses through OVS, offloading Tx flags must be carefully evaluated and updated, which results in a bit of complexity because of a separate "outer" Tx offloading flag coming from the DPDK API, and a "normal"/"inner" Tx offloading flag. On the other hand, the DPDK mbuf API specifies 4 status when it comes to IP checksums: - RTE_MBUF_F_RX_IP_CKSUM_UNKNOWN: no information about the RX IP checksum - RTE_MBUF_F_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong - RTE_MBUF_F_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid - RTE_MBUF_F_RX_IP_CKSUM_NONE: the IP checksum is not correct in the packet data, but the integrity of the IP header is verified. This patch changes the OVS API so that OVS code only tracks the status of the checksum of the "current" L3 header and leaves the Tx flags aspect to the netdev-* implementations. With this API, the flow extraction can be cleaned up. During packet processing, OVS can simply look for the IP checksum validity (either good, or partial) before changing some IP header, and then mark the checksum as partial. In the conntrack case, when natting packets, the checksum status of the inner part (ICMP error case) must be forced temporarily as unknown to force checksum resolution. When tunneling comes into play, the IP checksum status is bit-shifted for future considerations in the processing if, for example, the tunnel header gets decapsulated again, or in the netdev-* implementations that support tunnel offloading. Finally, netdev-* implementations only need to care about packets in partial status: a good checksum does not need touching, a bad checksum has been updated but kept as bad by OVS, and an unknown checksum either belongs to an IPv6 packet or, if it was an IPv4 packet, OVS updated it too (keeping it good or bad accordingly). Rename current API for consistency with dp_packet_(inner_)?ip_checksum_. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 21:00:54 +02:00
David Marchand	67abd51540	dp-packet: Rework tunnel offloads. Rather than set bits in the mbuf ol_flags field, that only makes sense for netdev-dpdk ports, mark packet for tunnel offload in OVS offloads API. While at it, since there is nothing really "hardware" related, rename current API for consistency with dp_packet_tunnel_ prefix. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 21:00:48 +02:00
David Marchand	e2200485c5	dp-packet: Expand offloads preparation helper. Expand this helper to clearly separate the non tunnel case from the tunnel one. This will make later changes easier to read. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 21:00:45 +02:00
David Marchand	d29ba0abdc	dp-packet: Add OVS offloading API. As a preparation for tracking inner checksums, separate Rx checksum status from the DPDK ol_flags field. To minimize the cost of translating from DPDK API to OVS API, simply map OVS flags to DPDK Rx mbuf flags. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 21:00:34 +02:00
David Marchand	19ef1b1f0f	dp-packet: Remove DPDK specific IP version. Flagging packets with IP version is only needed at the netdev-dpdk level. In most cases, OVS is already inspecting the IP header in packet data, so maintaining such IP version metadata won't save much cycles (given the cost of additional branches necessary for handling outer/inner flags). Cleanup OVS shared code and only set these flags in netdev-dpdk.c. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 20:59:22 +02:00
David Marchand	52fdeda11a	dp-packet: Remove Linux specific L4 offloads. As the virtio-net offload API is used for netdev-linux ports, but provides no information about the potentially encapsulated protocol concerned by a checksum request, specific information from this netdev-specific implementation is propagated into OVS code, and must be carefully evaluated when some tunnel gets decapsulated. This induces a cost in the "normal" processing path, while the netdev-linux path is not performance critical. This patch removes such specific information, yet tries harder to parse the packet on the Rx side and sets offload flags accordingly for non encapsulated traffic. For encapsulated traffic, the inner checksum is computed. Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-19 20:59:04 +02:00
David Marchand	585c8088eb	dpif-netdev: Enhance checksum coverage. Enhance netdev-dummy: - add debug log, - split Rx and Tx aspects, - add coverage for bad status, Enhance unit tests: - enable Tx offloads on the transmitting port, - test L4 checksums for TCP and UDP (and partial status), - test IPv6, Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-05-21 19:43:01 +02:00
David Marchand	d49994634e	flow: Fix bad IP checksum flag. flow_compose() can generate packets with bad IPv4 checksum, however the associated Rx flags were not correctly set. The usefulness of setting this metadata seems limited, yet fix this for consistency. Fixes: c62b4ac8f8da ("ovs-ofctl: Implement compose-packet --bare [--bad-csum].") Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-05-21 19:08:36 +02:00
David Marchand	c771758249	dpif-netdev: Preserve inner offloads on recirculation. Rather than drop all pending Tx offloads on recirculation, preserve inner offloads (and mark packet with outer Tx offloads) when parsing the packet again. Fixes: c6538b443984 ("dpif-netdev: Fix crash due to tunnel offloading on recirculation.") Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.") Reported-at: https://issues.redhat.com/browse/FDP-1144 Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-02-13 21:32:15 +01:00
Mike Pattrick	2276c3a2c6	userspace: Support GRE TSO. This patch extends the userspace datapaths support of tunnel tso from only supporting VxLAN and Geneve to also supporting GRE tunnels. There is also a software fallback for cases where the egress netdev does not support this feature. Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-01-17 00:20:48 +01:00
Mike Pattrick	82c1028e37	Userspace: Software fallback for UDP encapsulated TCP segmentation. When sending packets that are flagged as requiring segmentation to an interface that does not support this feature, send the packet to the TSO software fallback instead of dropping it. Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2024-09-11 15:36:27 +02:00
Mike Pattrick	2ff8ed8de0	dp-packet: Correct IPv4 checksum calculation. During the transition towards checksum offloading, the function to handle software fallback of IPv4 checksums didn't account for the options field. Fixes: 5d11c47d3ebe ("userspace: Enable IP checksum offloading by default.") Reported-by: Jun Wang <junwang01@cestc.cn> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2024-July/053236.html Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-08-15 12:25:14 +02:00
David Marchand	c39a84c131	netdev-dpdk: Refactor tunnel checksum offloading. All information required for checksum offloading can be deduced by already tracked dp_packet l3_ofs, l4_ofs, inner_l3_ofs and inner_l4_ofs fields. Remove DPDK specific l[2-4]_len from generic OVS code. netdev-dpdk code then fills mbuf specifics step by step: - outer_l2_len and outer_l3_len are needed for tunneling (and below features), - l2_len and l3_len are needed for IP and L4 checksum (and below features), - l4_len and tso_segsz are needed when doing TSO, Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>	2024-06-06 17:10:29 +01:00
Ilya Maximets	c6538b4439	dpif-netdev: Fix crash due to tunnel offloading on recirculation. Recirculation involves re-parsing the packet from scratch and that process is not aware of multiple header levels nor the inner/outer offsets. So, it overwrites offsets with new ones from the outermost headers and sets offloading flags that change their meaning when the packet is marked for tunnel offloading. For example: 1. TCP packet enters OVS. 2. TCP packet gets encapsulated into UDP tunnel. 3. Recirculation happens. 4. Packet is re-parsed after recirculation with miniflow_extract() or similar function. 5. Packet is marked for UDP checksumming because we parse the outermost set of headers. But since it is tunneled, it means inner UDP checksumming. And that makes no sense, because the inner packet is TCP. This is causing packet drops due to malformed packets or even assertions and crashes in the code that is trying to fixup checksums for packets using incorrect metadata: SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/packets.c:2061:15: runtime error: member access within null pointer of type 'struct udp_header' 0 0xbe5221 in packet_udp_complete_csum lib/packets.c:2061:15 1 0x7e5662 in dp_packet_ol_send_prepare lib/dp-packet.c:638:9 2 0x96ef89 in netdev_send lib/netdev.c:940:9 3 0x818e94 in dp_netdev_pmd_flush_output_on_port lib/dpif-netdev.c:5577:9 4 0x817606 in dp_netdev_pmd_flush_output_packets lib/dpif-netdev.c:5618:27 5 0x81cfa5 in dp_netdev_process_rxq_port lib/dpif-netdev.c:5677:9 6 0x7eefe4 in dpif_netdev_run lib/dpif-netdev.c:7001:25 7 0x610e87 in type_run ofproto/ofproto-dpif.c:367:9 8 0x5b9e80 in ofproto_type_run ofproto/ofproto.c:1879:31 9 0x55bbb4 in bridge_run__ vswitchd/bridge.c:3281:9 10 0x558b6b in bridge_run vswitchd/bridge.c:3346:5 11 0x591dc5 in main vswitchd/ovs-vswitchd.c:130:9 12 0x172b89 in __libc_start_call_main (/lib64/libc.so.6+0x27b89) 13 0x172c4a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x27c4a) 14 0x47eff4 in _start (vswitchd/ovs-vswitchd+0x47eff4) Tests added for both IPv4 and IPv6 cases. Though IPv6 test doesn't trigger the issue it's better to have a symmetric test. Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.") Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2024-March/053014.html Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-22 20:45:37 +01:00
Mike Pattrick	f81d782c19	netdev-native-tnl: Mark all vxlan/geneve packets as tunneled. Previously some packets were excluded from the tunnel mark if they weren't L4. However, this causes problems with multi encapsulated packets like arp. Due to these flags being set, additional checks are required in checksum modification code. Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.") Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-16 15:23:26 +01:00
Mike Pattrick	a2d4ad651d	netdev-linux: Only repair IP checksum in IPv4. Previously a change was added to the vnet prepend code to solve for the case where no L4 checksum offloading was needed but the L3 checksum hadn't been calculated. But the added check didn't properly account for IPv6 traffic. Fixes: 85bcbbed839a ("userspace: Enable tunnel tests with TSO.") Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-16 15:23:25 +01:00
Mike Pattrick	bf921e5677	dp-packet: Validate correct offset for L4 inner size. This patch fixes the correctness of dp_packet_inner_l4_size() when checking for the existence of an inner L4 header. Previously it checked for the outer L4 header. This function is currently only used when a packet is already flagged for tunneling, so an incorrect determination isn't possible as long as the flags of the packet are correct. Fixes: 85bcbbed839a ("userspace: Enable tunnel tests with TSO.") Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-13 20:51:43 +01:00
Mike Pattrick	96990ea1e4	dp-packet: Reset offload/offsets when clearing a packet. The OVN test suite identified a bug in dp_packet_ol_send_prepare() where a BFD packet flagged as double encapsulated would trigger a seg fault. The problem surfaced because bfd_put_packet was reusing a packet allocated on the stack that wasn't having its flags reset between calls. This change will reset OL flags as well as the layer offsets in data_clear(), which should fix this type of packet reuse issue in general as long as data_clear() is called in between uses. Fixes: 8b5fe2dc6080 ("userspace: Add Generic Segmentation Offloading.") Reported-by: Dumitru Ceara <dceara@redhat.com> Reported-at: https://issues.redhat.com/browse/FDP-300 Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-01-26 16:31:56 +01:00
Ilya Maximets	bacd2c304a	dp-packet: Avoid checks while preparing non-offloading packets. Currently, dp_packet_ol_send_prepare() performs multiple checks for each offloading flag separately. That takes a noticeable amount of extra cycles for packets that do not have any offloading flags set. Skip most of the work if no checksumming flags are set. The change improves performance of direct forwarding between two virtio-user ports (V2V) by ~2.5 % and offsets all the negative effects of TSO support introduced recently. It adds an extra check to the offloading path, but it is not a default configuration and also should take much smaller hit due to lower number of larger packets. Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-01-19 13:52:22 +01:00
Mike Pattrick	85bcbbed83	userspace: Enable tunnel tests with TSO. This patch enables most of the tunnel tests in the testsuite, and adds a large TCP transfer to a vxlan and geneve test to verify TSO functionality. Some additional changes were required to accommodate these changes with netdev-linux interfaces. The test for vlan over vxlan is purposely not enabled as the traffic produced by this test gives incorrect values in the vnet header. Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-01-17 22:06:51 +01:00
Dexia Li	084c808729	userspace: Support VXLAN and GENEVE TSO. For the userspace datapath, this patch provides vxlan and geneve tunnel tso. Only userspace vxlan or geneve tunnels are supported; tunnel outer and inner csum offload are supported as well. If the netdev does not support offload features, there is a software fallback. If the netdev does not support vxlan and geneve tso, packets will be dropped. Front-end devices can also disable offload features with ethtool. Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Dexia Li <dexia.li@jaguarmicro.com> Co-authored-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-01-17 22:06:45 +01:00
Mike Pattrick	9e3c842d57	dp-packet: Set checksum flags during software TSO. When OVS needs to fall back on the software TSO implementation to segment a packet, it currently doesn't guarantee that IP and TCP checksum offload flags are set. However, it is possible that these are required. This is true in the case of dp_netdev_upcall(), which clears these flags. This patch explicitly sets the appropriate flags when the segmentation flag is removed, to guarantee that packets always end up with correct checksums. Fixes: 8b5fe2dc6080 ("userspace: Add Generic Segmentation Offloading.") Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-01-17 21:40:51 +01:00
Flavio Leitner	8b5fe2dc60	userspace: Add Generic Segmentation Offloading. This provides a software implementation in the case the egress netdev doesn't support segmentation in hardware. The challenge here is to guarantee packet ordering in the original batch that may be full of TSO packets. Each TSO packet can go up to ~64kB, so with segment size of 1440 that means about 44 packets for each TSO. Each batch has 32 packets, so the total batch amounts to 1408 normal packets. The segmentation estimates the total number of packets and then the total number of batches. Then allocate enough memory and finally do the work. Finally each batch is sent in order to the netdev. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-12-02 01:33:37 +01:00
Flavio Leitner	e0056018c4	userspace: Respect tso/gso segment size. Currently OVS will calculate the segment size based on the MTU of the egress port. That usually happens to be correct when the ports share the same MTU, but that is not always true. Therefore, if the segment size is provided, then use that and make sure the over sized packets are dropped. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-12-02 00:56:36 +01:00
Mike Pattrick	3337e6d91c	userspace: Enable L4 checksum offloading by default. The netdev receiving packets is supposed to provide the flags indicating if the L4 checksum was verified and it is OK or BAD, otherwise the stack will check when appropriate by software. If the packet comes with good checksum, then postpone the checksum calculation to the egress device if needed. When encapsulating a packet with that flag, set the checksum of the inner L4 header since that is not yet supported. Calculate the L4 checksum when the packet is going to be sent over a device that doesn't support the feature. Linux tap devices allow enabling L3 and L4 offload, so this patch enables the feature. However, Linux socket interface remains disabled because the API doesn't allow enabling those two features without enabling TSO too. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-15 23:50:30 +02:00
Mike Pattrick	5d11c47d3e	userspace: Enable IP checksum offloading by default. The netdev receiving packets is supposed to provide the flags indicating if the IP checksum was verified and it is GOOD or BAD, otherwise the stack will check when appropriate by software. If the packet comes with good checksum, then postpone the checksum calculation to the egress device if needed. When encapsulating a packet with that flag, set the checksum of the inner IP header since that is not yet supported. Calculate the IP checksum when the packet is going to be sent over a device that doesn't support the feature. Linux devices don't support IP checksum offload alone, so the support is not enabled. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-15 23:49:51 +02:00
Mike Pattrick	4339e7b19f	dp-packet: Allocate on cacheline boundary with DPDK. UB Sanitizer report: lib/dp-packet.h:587:22: runtime error: member access within misaligned address 0x000001ecde10 for type 'struct dp_packet', which requires 64 byte alignment #0 in dp_packet_set_base lib/dp-packet.h:587 #1 in dp_packet_use__ lib/dp-packet.c:46 #2 in dp_packet_use lib/dp-packet.c:60 #3 in dp_packet_init lib/dp-packet.c:126 #4 in dp_packet_new lib/dp-packet.c:150 [...] Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-02-03 22:18:16 +01:00
Cian Ferriter	9855f35dd2	dpif-netdev/mfex: Add AVX512 NVGRE traffic profiles. A typical NVGRE encapsulated packet starts with the ETH/IP/GRE protocols. Miniflow extract will parse just the ETH and IP headers. The GRE header will be processed later as part of the pop action. Add support for parsing the ETH/IP headers in this scenario. Signed-off-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-12-21 15:44:17 +00:00
David Marchand	0937209fc7	netdev-dpdk: Cleanup code when DPDK is disabled. Remove one unused stub: netdev_dpdk_register() can't be called if DPDK is disabled at build time. Remove unneeded #ifdef in call to free_dpdk_buf. Drop unneeded cast when calling free_dpdk_buf. Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-11-30 13:58:15 +01:00
Emma Finn	eec8227614	odp-execute: Add auto validation function for actions. This commit introduced the auto-validation function which allows users to compare the batch of packets obtained from different action implementations against the linear action implementation. The autovalidator function can be triggered at runtime using the following command: $ ovs-appctl odp-execute/action-impl-set autovalidator Signed-off-by: Emma Finn <emma.finn@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-07-15 11:38:54 +01:00
Kumar Amber	b80f58cde2	dpif-netdev/mfex: Add ipv6 profile based hashing. For packets which don't already have a hash calculated, miniflow_hash_5tuple() calculates the hash of a packet using the previously built miniflow. This commit adds IPv6 profile specific hashing which uses fixed offsets into the packet to improve hashing performance. Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-07-05 15:42:07 +01:00
Rosemarie O'Riorden	7e7083cc46	dpif-netdev: Replace loop iterating over packet batch with macro. The function dp_netdev_pmd_flush_output_on_port() iterates over the p->output_pkts batch directly, when it should be using the special iterator macro, DP_PACKET_BATCH_FOR_EACH. However, this wasn't possible because the macro could not accept &p->output_pkts. The addition of parentheses when BATCH is dereferenced allows the macro to expand properly. Parenthesizing arguments in macros is good practice to be able to handle whichever expressions are passed in. Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-04 21:18:08 +02:00
Kumar Amber	af864cedb0	dpif-netdev/mfex: Add ipv4 profile based hashing. For packets which don't already have a hash calculated, miniflow_hash_5tuple() calculates the hash of a packet using the previously built miniflow. This commit adds IPv4 profile specific hashing which uses fixed offsets into the packet to improve hashing performance. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Co-authored-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-05-04 14:24:04 +01:00
Ian Stokes	17346b3899	dpdk: Update to use DPDK v21.11. This commit adds support for DPDK v21.11, it includes the following changes. 1. ci: Install python elftools for DPDK 21.02. 2. ci: Update meson requirement for DPDK 21.05. 3. netdev-dpdk: Fix build with 21.05. 4. ci: Compile DPDK in non developer mode. http://patchwork.ozlabs.org/project/openvswitch/list/?series=242480&state=* 5. netdev-dpdk: Remove access to DPDK internals. 6. netdev-dpdk: Remove unused attribute from rte_flow rule. 7. netdev-dpdk: Fix mbuf macros namespace with 21.11-rc1. 8. netdev-dpdk: Fix vhost namespace with 21.11-rc2. http://patchwork.ozlabs.org/project/openvswitch/list/?series=271159&state=* In addition documentation and DPDK unit tests were also updated in this commit for use with DPDK v21.11. For credit all authors of the original commits to 'dpdk-latest' with the above changes have been added as co-authors for this commit. Signed-off-by: David Marchand <david.marchand@redhat.com> Co-authored-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Emma Finn <emma.finn@intel.com> Tested-by: Seamus Ryan <seamus.ryan@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2021-12-09 18:40:14 +00:00
Tony van der Peet	7e6b41ac8d	dpif-netdev: Fix crash when PACKET_OUT is metered. When a PACKET_OUT has output port of OFPP_TABLE, and the rule table includes a meter and this causes the packet to be deleted, execute with a clone of the packet, restoring the original packet if it is changed by the execution. Add tests to verify the original issue is fixed, and that the fix doesn't break tunnel processing. Reported-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz> Signed-off-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-09-08 17:52:35 +02:00
Aaron Conole	640d4db788	ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag. As reported by Wang Liang, the way packets are passed to the ipf module doesn't allow for use later on in reassembly. Such packets may get released anyway, such as during cleanup of tx processing. Because the ipf module lacks a way of forcing the dp_packet to be retained, it will later reuse the packet. Instead, just clone the packet and let the ipf queue own the copy until the queue is destroyed. After this change, there are no more in-tree users of the batch 'do_not_steal' flag. Thus, we remove it as well. Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") Fixes: 0b3ff31d35f5 ("dp-packet: Add 'do_not_steal' packet batch flag.") Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-April/382098.html Reported-by: Wang Liang <wangliangrt@didiglobal.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Co-authored-by: Wang Liang <wangliangrt@didiglobal.com> Signed-off-by: Wang Liang <wangliangrt@didiglobal.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-06-15 10:46:33 +02:00
Flavio Leitner	79349cbab0	flow: Support extra padding length. Although not required, padding can be optionally added until the packet length is MTU bytes. A packet with extra padding currently fails sanity checks. Vulnerability: CVE-2020-35498 Fixes: fa8d9001a624 ("miniflow_extract: Properly handle small IP packets.") Reported-by: Joakim Hindersson <joakim.hindersson@elastx.se> Acked-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-02-10 14:59:55 +01:00
Ben Pfaff	3eec7fb075	pcap-file: Fix calculation of TCP payload length in tcp_reader_run(). The calculation in tcp_reader_run() failed to account for L2 padding. This fixes the problem, by moving the existing function tcp_payload_length() from a conntrack private header file into dp-packet.h and renaming it to suit the dp_packet style. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ilya Maximets <i.maximets@ovn.org>	2021-02-02 09:59:31 -08:00
William Tu	29bb3093eb	userspace: Enable TSO support for non-DPDK. This patch enables TSO support for non-DPDK use cases, and also adds the check-system-tso testsuite. Before TSO, we had to disable checksum offload, allowing the kernel to calculate the TCP/UDP packet checksum. With TSO, we can skip the checksum validation by enabling checksum offload, and with large packet size, we see better performance. Consider container to container use cases: iperf3 -c (ns0) -> veth peer -> OVS -> veth peer -> iperf3 -s (ns1) And I got around 6Gbps, similar to TSO with DPDK-enabled. Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: William Tu <u9012063@gmail.com>	2020-05-14 07:21:34 -07:00
Flavio Leitner	c724012970	dp-packet: prefetch the next packet when cloning a batch. There is a cache miss when accessing mbuf->data_off while cloning a batch and using prefetch improved the throughput by ~2.3%. Before: 13709416.30 pps After: 14031475.80 pps Fixes: d48771848560 ("dp-packet: preserve headroom when cloning a pkt batch") Signed-off-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ben Pfaff <blp@ovn.org>	2020-02-10 09:41:15 -08:00
Flavio Leitner	73858f9dbe	netdev-linux: Prepend the std packet in the TSO packet. Usually TSO packets are close to 50k, 60k bytes long, so to copy fewer bytes when receiving a packet from the kernel, change the approach. Instead of extending the MTU sized packet received and appending the remaining TSO data from the TSO buffer, allocate a TSO packet with enough headroom to prepend the std packet data. Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support") Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ben Pfaff <blp@ovn.org>	2020-02-06 11:37:23 -08:00
Flavio Leitner	29cf9c1b3b	userspace: Add TCP Segmentation Offload support. Abbreviated as TSO, TCP Segmentation Offload is a feature which enables the network stack to delegate the TCP segmentation to the NIC reducing the per packet CPU overhead. A guest using vhostuser interface with TSO enabled can send TCP packets much bigger than the MTU, which saves CPU cycles normally used to break the packets down to MTU size and to calculate checksums. It also saves CPU cycles used to parse multiple packets/headers during the packet processing inside virtual switch. If the destination of the packet is another guest in the same host, then the same big packet can be sent through a vhostuser interface skipping the segmentation completely. However, if the destination is not local, the NIC hardware is instructed to do the TCP segmentation and checksum calculation. It is recommended to check if NIC hardware supports TSO before enabling the feature, which is off by default. For additional information please check the tso.rst document. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Tested-by: Ciara Loftus <ciara.loftus.intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2020-01-17 22:27:25 +00:00
Flavio Leitner	d487718485	dp-packet: preserve headroom when cloning a pkt batch. The headroom is useful if an additional header needs to be inserted into the packet, so preserve the original headroom when cloning the batch. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Tested-by: Ciara Loftus <ciara.loftus.intel.com> Acked-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2020-01-17 17:01:24 +00:00
Ilya Maximets	9965fef8db	dp-packet: Fix clearing/copying of memory layout flags. 'ol_flags' of DPDK mbuf could contain bits responsible for external or indirect buffers which are not actually offload flags in a common sense. Clearing/copying of these flags could lead to memory leaks of external memory chunks and crashes due to access to wrong memory. OVS should not clear these flags while resetting offloads and also should not copy them to the newly allocated packets. This change is required to support DPDK 19.11, as some drivers may return mbufs with these flags set. However, it might be good to do the same for DPDK 18.11, because these flags are present and should be taken into account. Fixes: 03f3f9c0faf8 ("dpdk: Update to use DPDK 18.11.") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Ben Pfaff <blp@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com>	2019-11-28 16:50:10 +01:00
William Tu	0de1b42596	netdev-afxdp: add new netdev type for AF_XDP. The patch introduces experimental AF_XDP support for OVS netdev. AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket type built upon the eBPF and XDP technology. It aims to have comparable performance to DPDK but cooperate better with the existing kernel's networking stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program attached to the netdev, bypassing a couple of the Linux kernel's subsystems. As a result, an AF_XDP socket shows much better performance than AF_PACKET. For more details about AF_XDP, please see the Linux kernel's Documentation/networking/af_xdp.rst. Note that by default, this feature is not compiled in. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>	2019-07-19 17:42:06 +03:00
Ilya Maximets	0f706b37d8	dp-packet: Add flow_mark support for non-DPDK case. Additionally, new API call 'dp_packet_set_flow_mark' is needed for packet clone. Mostly for dummy HWOL implementation. Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2019-03-13 11:07:24 +00:00
Ilya Maximets	a47e2db209	dp-packet: Refactor offloading API. 1. No reason to have mbuf related APIs in a generic code. 2. Not only RSS/checksums should be invalidated in case of tunnel decapsulation or sending to 'ring' ports. In order to fix two above issues, new function 'dp_packet_reset_offload' introduced. In order to clean up/unify the code and simplify addition of new offloading features to non-DPDK version of dp_packet, introduced 'ol_flags' bitmask. Additionally reduced code complexity in 'dp_packet_clone_with_headroom' by using already existent generic APIs. Unfortunately, we still need to have a special case for mbuf initialization inside 'dp_packet_init__()'. 'dp_packet_init_specific()' introduced for this purpose as a generic API for initialization of the implementation-specific fields. Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2019-03-13 09:51:30 +00:00
Ilya Maximets	92330af529	dp-packet: Constantify offloading APIs. Getters should have const arguments. Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2019-02-27 22:28:34 +00:00

90 Commits