2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-24 19:07:45 +00:00
ovs/lib/dpif-netdev-lookup.h

93 lines
3.4 KiB
C
Raw Normal View History

/*
* Copyright (c) 2020 Intel Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at:
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef DPIF_NETDEV_LOOKUP_H
#define DPIF_NETDEV_LOOKUP_H 1
#include <config.h>
#include "dpif-netdev.h"
dpif-netdev: Refactor to multiple header files. Split the very large file dpif-netdev.c and the datastructures it contains into multiple header files. Each header file is responsible for the datastructures of that component. This logical split allows better reuse and modularity of the code, and reduces the very large file dpif-netdev.c to be more managable. Due to dependencies between components, it is not possible to move component in smaller granularities than this patch. To explain the dependencies better, eg: DPCLS has no deps (from dpif-netdev.c file) FLOW depends on DPCLS (struct dpcls_rule) DFC depends on DPCLS (netdev_flow_key) and FLOW (netdev_flow_key) THREAD depends on DFC (struct dfc_cache) DFC_PROC depends on THREAD (struct pmd_thread) DPCLS lookup.h/c require only DPCLS DPCLS implementations require only dpif-netdev-lookup.h. - This change was made in 2.12 release with function pointers - This commit only refactors the name to "private-dpcls.h" netdev_flow_key_equal_mf() is renamed to emc_flow_key_equal_mf(). Rename functions specific to dpcls from netdev_* namespace to the dpcls_* namespace, as they are only used by dpcls code. 'inline' is added to the dp_netdev_flow_hash() when it is moved definition to fix a compiler error. One valid checkpatch issue with the use of the EMC_FOR_EACH_POS_WITH_HASH() macro was fixed. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Co-authored-by: Cian Ferriter <cian.ferriter@intel.com> Signed-off-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2021-07-09 15:58:15 +00:00
#include "dpif-netdev-private-dpcls.h"
#include "dpif-netdev-private-thread.h"
/* Function to perform a probe for the subtable bit fingerprint.
* Returns NULL if not valid, or a valid function pointer to call for this
* subtable on success.
*/
typedef
dpcls_subtable_lookup_func (*dpcls_subtable_probe_func)(uint32_t u0_bit_count,
uint32_t u1_bit_count);
/* Prototypes for subtable implementations */
dpcls_subtable_lookup_func
dpcls_subtable_autovalidator_probe(uint32_t u0_bit_count,
uint32_t u1_bit_count);
/* Probe function to select a specialized version of the generic lookup
* implementation. This provides performance benefit due to compile-time
* optimizations such as loop-unrolling. These are enabled by the compile-time
* constants in the specific function implementations.
*/
dpcls_subtable_lookup_func
dpcls_subtable_generic_probe(uint32_t u0_bit_count, uint32_t u1_bit_count);
dpif-lookup: add avx512 gather implementation. This commit adds an AVX-512 dpcls lookup implementation. It uses the AVX-512 SIMD ISA to perform multiple miniflow operations in parallel. To run this implementation, the "avx512f" and "bmi2" ISAs are required. These ISA checks are performed at runtime while probing the subtable implementation. If a CPU does not provide both "avx512f" and "bmi2", then this code does not execute. The avx512 code is built as a separate static library, with added CFLAGS to enable the required ISA features. By building only this static library with avx512 enabled, it is ensured that the main OVS core library is *not* using avx512, and that OVS continues to run as before on CPUs that do not support avx512. The approach taken in this implementation is to use the gather instruction to access the packet miniflow, allowing any miniflow blocks to be loaded into an AVX-512 register. This maximizes the usefulness of the register, and hence this implementation handles any subtable with up to miniflow 8 bits. Note that specialization of these avx512 lookup routines still provides performance value, as the hashing of the resulting data is performed in scalar code, and compile-time loop unrolling occurs when specialized to miniflow bits. This commit checks at configure time if the assembling in use has a known bug in assembling AVX512 code. If this bug is present, all AVX512 code is disabled. Checking the version string of the binutils or assembler is not a good method to detect the issue, as back ported fixes would not be reflected. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2020-07-13 13:42:14 +01:00
/* Probe function for AVX-512 gather implementation */
dpcls_subtable_lookup_func
dpif-netdev: Refactor AVX512 runtime checks. As described in the bugzilla below, cpu_has_isa code may be compiled with some AVX512 instructions in it, because cpu.c is built as part of the libopenvswitchavx512. This is a problem when this function (supposed to probe for AVX512 instructions availability) is invoked from generic OVS code, on older CPUs that don't support them. For the same reason, dpcls_subtable_avx512_gather_probe, dp_netdev_input_outer_avx512_probe, mfex_avx512_probe and mfex_avx512_vbmi_probe are potential runtime bombs and can't either be built as part of libopenvswitchavx512. Move cpu.c to be part of the "normal" libopenvswitch. And move other helpers in generic OVS code. Note: - dpcls_subtable_avx512_gather_probe is split in two, because it also needs to do its own magic, - while moving those helpers, prefer direct calls to cpu_has_isa and avoid cast to intermediate integer variables when a simple boolean is enough, Fixes: 352b6c7116cd ("dpif-lookup: add avx512 gather implementation.") Fixes: abb807e27dd4 ("dpif-netdev: Add command to switch dpif implementation.") Fixes: 250ceddcc2d0 ("dpif-netdev/mfex: Add AVX512 based optimized miniflow extract") Fixes: b366fa2f4947 ("dpif-netdev: Call cpuid for x86 isa availability.") Reported-at: https://bugzilla.redhat.com/2100393 Reported-by: Ales Musil <amusil@redhat.com> Co-authored-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Acked-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-06-29 09:32:24 +02:00
dpcls_subtable_avx512_gather_probe__(uint32_t u0_bit_cnt, uint32_t u1_bit_cnt,
bool use_vpop);
dpif-lookup: add avx512 gather implementation. This commit adds an AVX-512 dpcls lookup implementation. It uses the AVX-512 SIMD ISA to perform multiple miniflow operations in parallel. To run this implementation, the "avx512f" and "bmi2" ISAs are required. These ISA checks are performed at runtime while probing the subtable implementation. If a CPU does not provide both "avx512f" and "bmi2", then this code does not execute. The avx512 code is built as a separate static library, with added CFLAGS to enable the required ISA features. By building only this static library with avx512 enabled, it is ensured that the main OVS core library is *not* using avx512, and that OVS continues to run as before on CPUs that do not support avx512. The approach taken in this implementation is to use the gather instruction to access the packet miniflow, allowing any miniflow blocks to be loaded into an AVX-512 register. This maximizes the usefulness of the register, and hence this implementation handles any subtable with up to miniflow 8 bits. Note that specialization of these avx512 lookup routines still provides performance value, as the hashing of the resulting data is performed in scalar code, and compile-time loop unrolling occurs when specialized to miniflow bits. This commit checks at configure time if the assembling in use has a known bug in assembling AVX512 code. If this bug is present, all AVX512 code is disabled. Checking the version string of the binutils or assembler is not a good method to detect the issue, as back ported fixes would not be reflected. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2020-07-13 13:42:14 +01:00
/* Subtable registration and iteration helpers */
struct dpcls_subtable_lookup_info_t {
/* higher priority gets used over lower values. This allows deployments
* to select the best implementation for the use-case.
*/
uint8_t prio;
/* Probe function: tests if the (u0,u1) combo is supported. If not
* supported, this function returns NULL. If supported, a function pointer
* is returned which when called will perform the lookup on the subtable.
*/
dpcls_subtable_probe_func probe;
/* Human readable name, used in setting subtable priority commands */
const char *name;
/* Counter which holds the usage count of each implementations. */
atomic_count usage_cnt;
};
int dpcls_subtable_set_prio(const char *name, uint8_t priority);
void dpcls_info_inc_usage(struct dpcls_subtable_lookup_info_t *info);
void dpcls_info_dec_usage(struct dpcls_subtable_lookup_info_t *info);
/* Lookup the best subtable lookup implementation for the given u0,u1 count. */
dpcls_subtable_lookup_func
dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
struct dpcls_subtable_lookup_info_t **info);
/* Retrieve the array of lookup implementations for iteration.
* On error, returns a negative number.
* On success, returns the size of the arrays pointed to by the out parameter.
*/
int
dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr);
/* Prints dpcls subtables in use for different implementations. */
void
dpcls_impl_print_stats(struct ds *reply);
#endif /* dpif-netdev-lookup.h */