mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

Author	SHA1	Message	Date
Mike Pattrick	85bcbbed83	userspace: Enable tunnel tests with TSO. This patch enables most of the tunnel tests in the testsuite, and adds a large TCP transfer to a vxlan and geneve test to verify TSO functionality. Some additional changes were required to accommodate these changes with netdev-linux interfaces. The test for vlan over vxlan is purposely not enabled as the traffic produced by this test gives incorrect values in the vnet header. Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-01-17 22:06:51 +01:00
Mike Pattrick	3337e6d91c	userspace: Enable L4 checksum offloading by default. The netdev receiving packets is supposed to provide the flags indicating if the L4 checksum was verified and it is OK or BAD, otherwise the stack will check when appropriate by software. If the packet comes with a good checksum, then postpone the checksum calculation to the egress device if needed. When encapsulating a packet with that flag, set the checksum of the inner L4 header since that is not yet supported. Calculate the L4 checksum when the packet is going to be sent over a device that doesn't support the feature. Linux tap devices allow enabling L3 and L4 offload, so this patch enables the feature. However, the Linux socket interface remains disabled because the API doesn't allow enabling those two features without enabling TSO too. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-15 23:50:30 +02:00
Mike Pattrick	5d11c47d3e	userspace: Enable IP checksum offloading by default. The netdev receiving packets is supposed to provide the flags indicating if the IP checksum was verified and it is GOOD or BAD, otherwise the stack will check when appropriate by software. If the packet comes with a good checksum, then postpone the checksum calculation to the egress device if needed. When encapsulating a packet with that flag, set the checksum of the inner IP header since that is not yet supported. Calculate the IP checksum when the packet is going to be sent over a device that doesn't support the feature. Linux devices don't support IP checksum offload alone, so the support is not enabled. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-15 23:49:51 +02:00
Cian Ferriter	9855f35dd2	dpif-netdev/mfex: Add AVX512 NVGRE traffic profiles. A typical NVGRE encapsulated packet starts with the ETH/IP/GRE protocols. Miniflow extract will parse just the ETH and IP headers. The GRE header will be processed later as part of the pop action. Add support for parsing the ETH/IP headers in this scenario. Signed-off-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-12-21 15:44:17 +00:00
Kumar Amber	b80f58cde2	dpif-netdev/mfex: Add ipv6 profile based hashing. For packets which don't already have a hash calculated, miniflow_hash_5tuple() calculates the hash of a packet using the previously built miniflow. This commit adds IPv6 profile specific hashing which uses fixed offsets into the packet to improve hashing performance. Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-07-05 15:42:07 +01:00
Kumar Amber	8cab30a9d2	dpif-netdev/mfex: Add AVX512 ipv6 traffic profiles. Add AVX512 IPv6 optimized profiles for vlan/IPv6/UDP, vlan/IPv6/TCP, IPv6/UDP and IPv6/TCP. The MFEX autovalidation test-case already has IPv6 support for validating against the scalar mfex. Signed-off-by: Kumar Amber <kumar.amber@intel.com> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-07-05 15:41:27 +01:00
Kumar Amber	3e6be8a0a8	mfex_avx512: Calculate miniflow_bits at compile time. The patch removes magic numbers from miniflow_bits and calculates the bits at compile time. This also makes it easier to handle any ABI changes. Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-07-05 15:03:48 +01:00
Kumar Amber	95be97a5a3	mfex_avx512: Calculate pkt offsets at compile time. The patch removes the magic numbers for packet offsets and minimum packet length and instead calculates them at compile time. Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-07-05 15:01:39 +01:00
David Marchand	fe171e4f10	dpif-netdev: Refactor AVX512 runtime checks. As described in the bugzilla below, cpu_has_isa code may be compiled with some AVX512 instructions in it, because cpu.c is built as part of the libopenvswitchavx512. This is a problem when this function (supposed to probe for AVX512 instructions availability) is invoked from generic OVS code, on older CPUs that don't support them. For the same reason, dpcls_subtable_avx512_gather_probe, dp_netdev_input_outer_avx512_probe, mfex_avx512_probe and mfex_avx512_vbmi_probe are potential runtime bombs and can't be built as part of libopenvswitchavx512 either. Move cpu.c to be part of the "normal" libopenvswitch. And move other helpers into generic OVS code. Note: - dpcls_subtable_avx512_gather_probe is split in two, because it also needs to do its own magic, - while moving those helpers, prefer direct calls to cpu_has_isa and avoid cast to intermediate integer variables when a simple boolean is enough, Fixes: 352b6c7116cd ("dpif-lookup: add avx512 gather implementation.") Fixes: abb807e27dd4 ("dpif-netdev: Add command to switch dpif implementation.") Fixes: 250ceddcc2d0 ("dpif-netdev/mfex: Add AVX512 based optimized miniflow extract") Fixes: b366fa2f4947 ("dpif-netdev: Call cpuid for x86 isa availability.") Reported-at: https://bugzilla.redhat.com/2100393 Reported-by: Ales Musil <amusil@redhat.com> Co-authored-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Acked-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-29 11:27:09 +02:00
Cian Ferriter	23ed22594d	dpif-netdev-extract-avx512: Protect GCC builtin usage. __builtin_constant_p is only available in GCC and only versions >= 4. Use the same "#if __GNUC__ >= 4" check used in other parts of OVS for this builtin. Signed-off-by: Cian Ferriter <cian.ferriter@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-28 13:36:52 +02:00
Cian Ferriter	cb1c640077	acinclude: Add separate checks for AVX512 ISA. Checking for each of the required AVX512 ISA separately will allow the compiler to generate some AVX512 code where there is some support in the compiler rather than only generating all AVX512 code when all of it is supported or no AVX512 code at all. For example, in GCC 4.9 where there is just support for AVX512F, this patch will allow building the AVX512 DPIF. Another example, in GCC 5 and 6, most AVX512 code can be generated, just without AVX512VPOPCNTDQ support. Signed-off-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-30 23:12:51 +02:00
Cian Ferriter	34a77ca704	dpif-netdev-extract: Remove unnecessary compiler targets. No instructions from the AVX512VL ISA are used. Compilation for AVX512F and AVX512 BW ISA are already enabled in lib/automake.mk for the dpif-netdev-lookup-avx512-gather.c file because it's part of the libopenvswitchavx512.la library. They don't need to be enabled at a function level. Remove these unnecessary function-level compiler target attributes. Signed-off-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-30 23:12:51 +02:00
Kumar Amber	af864cedb0	dpif-netdev/mfex: Add ipv4 profile based hashing. For packets which don't already have a hash calculated, miniflow_hash_5tuple() calculates the hash of a packet using the previously built miniflow. This commit adds IPv4 profile specific hashing which uses fixed offsets into the packet to improve hashing performance. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Co-authored-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-05-04 14:24:04 +01:00
Harry van Haaren	5db8aa39d9	dpif-netdev-avx512: Fix ubsan shift error in bitmasks. The code changes here are to handle (1 << i) shifts where 'i' is the packet index in the batch, and 1 << 31 is an overflow of the signed '1'. Fixed by adding UINT32_C() around the 1 character, ensuring compiler knows the 1 is unsigned (and 32-bits). Undefined Behaviour sanitizer is now happy with the bit-shifts at runtime. Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-27 00:36:28 +02:00
Harry van Haaren	4f810deab9	dpif-netdev: fix vlan and ipv4 parsing in avx512 This commit fixes the minimum packet size for the vlan/ipv4/tcp traffic profile, which was previously incorrectly set. This commit also disallows any fragmented IPv4 packets from being matched in the optimized miniflow-extract, avoiding complexity of handling fragmented packets and using scalar fallback instead. The DF (don't fragment) bit is now ignored, and stripped from the resulting miniflow. Fixes: aa85a25095 ("dpif-netdev/mfex: Add more AVX512 traffic profiles.") Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Tested-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-02-08 09:36:25 +00:00
Harry van Haaren	3b489a3b1b	dpif-netdev: Improve loading of packet data for undersized packets. This commit improves handling of packets where the allocated memory is less than 64 bytes. For packets received from DPDK ports this never matters, as an mbuf always pre-allocates enough space, however this can occur in cases where a packet is received from a kernel interface or injected by an OpenFlow controller. The fix is required to ensure OVS doesn't overread the allocated memory, e.g.: ==49944==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6060000d8181 at pc 0x000001cb9d24 bp 0x7ffce3b385d0 sp 0x7ffce3b385c8 READ of size 64 at 0x6060000d8181 thread T0 #0 0x1cb9d23 in mfex_avx512_process lib/dpif-netdev-extract-avx512.c:491:26 #1 0x1cb9d23 in mfex_avx512_ip_udp lib/dpif-netdev-extract-avx512.c:625:1 #2 0x18786a1 in dpif_miniflow_extract_autovalidator lib/dpif-netdev-private-extract.c:277:29 #3 0x1cbca5c in dp_netdev_input_outer_avx512 lib/dpif-netdev-avx512.c:159:19 #4 0x1853048 in dp_netdev_process_rxq_port lib/dpif-netdev.c:4900:19 #5 0x1837c76 in dpif_netdev_run lib/dpif-netdev.c:6197:25 #6 0x1727a02 in type_run ofproto/ofproto-dpif.c:370:9 #7 0x16f6e07 in ofproto_type_run ofproto/ofproto.c:1778:31 #8 0x16c1a8b in bridge_run__ vswitchd/bridge.c:3245:9 #9 0x16bd2fd in bridge_run vswitchd/bridge.c:3310:5 #10 0x16db8fe in main vswitchd/ovs-vswitchd.c:127:9 #11 0x7fbc0c5b61a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) #12 0xedabbd in _start (vswitchd/ovs-vswitchd+0xedabbd) 0x6060000d8181 is located 9 bytes to the right of 56-byte region [0x6060000d8140,0x6060000d8178) allocated by thread T0 here: #0 0xf7b09f in malloc (vswitchd/ovs-vswitchd+0xf7b09f) #1 0x1aff3b9 in xmalloc__ lib/util.c:137:15 #2 0x1aff3b9 in xmalloc lib/util.c:172:12 #3 0x1afe211 in process_command lib/unixctl.c:310:13 #4 0x1afe211 in run_connection lib/unixctl.c:344:17 #5 0x1afe211 in unixctl_server_run lib/unixctl.c:395:21 #6 0x16db918 in main vswitchd/ovs-vswitchd.c:128:9 #7 0x7fbc0c5b61a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) The solution implemented uses a mask-to-zero if the available buffer size is less than 64 bytes, and a branch for which type of load is used. Fixes: 250ceddcc2d0 ("dpif-netdev/mfex: Add AVX512 based optimized miniflow extract") Reported-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-12 13:59:33 +01:00
David Marchand	b366fa2f49	dpif-netdev: Call cpuid for x86 isa availability. DPIF AVX512 optimizations currently rely on DPDK availability while they can be used without DPDK. Besides, checking for availability of some isa only has to be done once and won't change while an OVS process runs. Resolve isa availability in constructors by using a simplified query based on cpuid API that comes from the compiler. Note: this also fixes the check on BMI2 availability: DPDK had a bug for this isa, see https://git.dpdk.org/dpdk/commit/?id=aae3037ab1e0. Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-03 18:45:40 +01:00
Harry van Haaren	f0266292b7	dpif-netdev: Improve handling of IP/TCP in avx512 mfex. This commit tightens the requirements for processing TCP packets in AVX512, ensuring that there are no TCP options by validating that the "data offset" field of the TCP header is exactly equal to 5. This ensures that the TCP header is not too short, and that it does not contain extra options. On the IP handling side, improve checks around total packet length. Now the next protocol is included in the length checks, ensuring that the IP header reported length is of appropriate size to contain the next protocol (e.g. UDP requires 8 bytes, TCP requires 20). Note that the inner protocol is always of a fixed size per profile, so it can be set using the UDP_ and TCP_ HEADER_LEN defines. Fixes: 250ceddcc2d0 ("dpif-netdev/mfex: Add AVX512 based optimized miniflow extract") Reported-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-12-17 19:50:55 +01:00
Harry van Haaren	aa85a25095	dpif-netdev/mfex: Add more AVX512 traffic profiles This commit adds 3 new traffic profile implementations to the existing avx512 miniflow extract infrastructure. The profiles added are: - Ether()/IP()/TCP() - Ether()/Dot1Q()/IP()/UDP() - Ether()/Dot1Q()/IP()/TCP() The design of the avx512 code here is for scalability to add more traffic profiles, as well as enabling CPU ISA. Note that an implementation is primarily adding static const data, which the compiler then specializes away when the profile specific function is declared below. As a result, the code is relatively maintainable, and scalable for new traffic profiles as well as new ISA, and does not lower performance compared with manually written code for each profile/ISA. Note that confidence in the correctness of each implementation is achieved through autovalidation, unit tests with known packets, and fuzz tested packets. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2021-07-16 11:31:49 +01:00
Harry van Haaren	250ceddcc2	dpif-netdev/mfex: Add AVX512 based optimized miniflow extract This commit adds AVX512 implementations of miniflow extract. By using the 64 bytes available in an AVX512 register, it is possible to convert a packet to a miniflow data-structure in a small number of instructions. The implementation here probes for Ether()/IP()/UDP() traffic, and builds the appropriate miniflow data-structure for packets that match the probe. The implementation here is auto-validated by the miniflow extract autovalidator, hence its correctness can be easily tested and verified. Note that this commit is designed to easily allow addition of new traffic profiles in a scalable way, without code duplication for each traffic profile. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2021-07-16 11:31:42 +01:00
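
Note on commit 5db8aa39d9 (ubsan shift error): the message describes replacing (1 << i) with a UINT32_C() constant so that shifting by 31 no longer overflows a signed int. The following is a minimal sketch of that idea only. The helper name pkt_index_to_mask is hypothetical and is not taken from the OVS sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from OVS): returns a mask with only bit 'i' set,
 * where 'i' is the index of a packet within a batch of at most 32 packets. */
static inline uint32_t
pkt_index_to_mask(unsigned int i)
{
    /* Shifting by 32 or more would be undefined for a 32-bit operand. */
    assert(i < 32);

    /* A plain (1 << 31) shifts a signed int into its sign bit, which is
     * undefined behaviour in C.  UINT32_C(1) makes the operand unsigned,
     * so every shift count from 0 to 31 is well defined. */
    return UINT32_C(1) << i;
}
```

With this form, index 31 yields 0x80000000 without tripping the undefined behaviour sanitizer.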

20 Commits
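
Note on commits 4f810deab9 and f0266292b7: the messages describe rejecting fragmented IPv4 packets while ignoring the DF bit, and requiring the TCP "data offset" field to be exactly 5. The scalar sketch below illustrates those two checks under stated assumptions. It is not the AVX512 code from OVS, and every name in it (the SKETCH_ macros and both functions) is hypothetical. The bit positions follow the standard IPv4 and TCP header layouts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IPv4 flags/fragment-offset field, in host byte order (assumed layout:
 * bit 14 = DF, bit 13 = MF, low 13 bits = fragment offset). */
#define SKETCH_IP_DF          0x4000
#define SKETCH_IP_MF          0x2000
#define SKETCH_IP_OFFSET_MASK 0x1fff

/* True if the packet is a fragment: either more fragments follow or the
 * fragment offset is non-zero.  The DF bit is deliberately ignored. */
static inline bool
sketch_ipv4_is_fragment(uint16_t frag_off_host)
{
    return (frag_off_host & (SKETCH_IP_MF | SKETCH_IP_OFFSET_MASK)) != 0;
}

/* 'off_byte' is byte 12 of the TCP header; its high nibble is the data
 * offset in 32-bit words.  Exactly 5 means a 20-byte header, no options. */
static inline bool
sketch_tcp_has_no_options(uint8_t off_byte)
{
    return (off_byte >> 4) == 5;
}

/* Small usage example exercising both checks. */
static inline void
sketch_self_check(void)
{
    assert(!sketch_ipv4_is_fragment(SKETCH_IP_DF));  /* DF only: accepted. */
    assert(sketch_ipv4_is_fragment(SKETCH_IP_MF));   /* First fragment. */
    assert(sketch_tcp_has_no_options(0x50));         /* 20-byte header. */
    assert(!sketch_tcp_has_no_options(0x60));        /* Options present. */
}
```

Packets that fail either check would fall back to the scalar miniflow extract, as the commit messages state.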