2016-10-04 17:58:05 -07:00
|
|
|
/*
|
2017-03-08 15:44:39 -08:00
|
|
|
* Copyright (c) 2014, 2015, 2016, 2017 Nicira, Inc.
|
2016-10-04 17:58:05 -07:00
|
|
|
*
|
|
|
|
* Licensed under the Apache License, Version 2.0 (the "License");
|
|
|
|
* you may not use this file except in compliance with the License.
|
|
|
|
* You may obtain a copy of the License at:
|
|
|
|
*
|
|
|
|
* http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
*
|
|
|
|
* Unless required by applicable law or agreed to in writing, software
|
|
|
|
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
|
|
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
|
|
* See the License for the specific language governing permissions and
|
|
|
|
* limitations under the License.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include <config.h>
|
|
|
|
#include "dpdk.h"
|
|
|
|
|
2017-03-06 09:49:11 +03:00
|
|
|
#include <stdio.h>
|
2016-10-04 17:58:05 -07:00
|
|
|
#include <sys/types.h>
|
|
|
|
#include <getopt.h>
|
|
|
|
|
2020-07-13 13:42:13 +01:00
|
|
|
#include <rte_cpuflags.h>
|
2018-05-03 15:08:00 -04:00
|
|
|
#include <rte_errno.h>
|
2017-03-06 09:49:11 +03:00
|
|
|
#include <rte_log.h>
|
2021-05-19 08:00:53 +00:00
|
|
|
#include <rte_malloc.h>
|
2016-10-04 17:58:05 -07:00
|
|
|
#include <rte_memzone.h>
|
2018-01-15 19:21:12 +01:00
|
|
|
#include <rte_version.h>
|
2016-10-04 17:58:05 -07:00
|
|
|
|
|
|
|
#include "dirs.h"
|
2016-10-13 18:27:51 +01:00
|
|
|
#include "fatal-signal.h"
|
2016-10-04 17:58:05 -07:00
|
|
|
#include "netdev-dpdk.h"
|
2019-05-07 12:24:08 +03:00
|
|
|
#include "netdev-offload-provider.h"
|
2016-10-04 17:58:05 -07:00
|
|
|
#include "openvswitch/dynamic-string.h"
|
|
|
|
#include "openvswitch/vlog.h"
|
dpdk: Support running PMD threads on any core.
Previously in OVS, a PMD thread running on cpu X used lcore X.
This assumption limited OVS to run PMD threads on physical cpu <
RTE_MAX_LCORE.
DPDK 20.08 introduced a new API that associates a non-EAL thread to a free
lcore. This new API does not change the thread characteristics (like CPU
affinity) and lets OVS run its PMD threads on any CPU regardless of
RTE_MAX_LCORE.
The DPDK multiprocess feature is not compatible with this new API and is
disabled.
DPDK still limits the number of lcores to RTE_MAX_LCORE (128 on x86_64)
which should be enough for OVS pmd threads (hopefully).
The DPDK lcore to OVS PMD thread mappings are logged when attaching an OVS
PMD thread and when detaching it.
A new command is added to show DPDK's view of the DPDK lcores
at any time:
$ ovs-appctl dpdk/lcore-list
lcore 0, socket 0, role RTE, cpuset 0
lcore 1, socket 0, role NON_EAL, cpuset 1
lcore 2, socket 0, role NON_EAL, cpuset 15
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-11-10 17:53:41 +01:00
|
|
|
#include "ovs-atomic.h"
|
2018-05-09 11:14:25 +01:00
|
|
|
#include "ovs-numa.h"
|
2016-10-04 17:58:05 -07:00
|
|
|
#include "smap.h"
|
2019-01-29 11:11:49 +03:00
|
|
|
#include "svec.h"
|
dpdk: Add commands to configure log levels.
When enabling debug logs in DPDK, it can be hard to be sure what is
actually enabled. Add commands to list and change those log levels.
However, these commands do not help when tracking issues in DPDK init
itself, so also dump the log levels right after init.
Example:
$ ovs-appctl dpdk/log-list
global log level is debug
id 0: lib.eal, level is info
id 1: lib.malloc, level is info
id 2: lib.ring, level is info
id 3: lib.mempool, level is info
id 4: lib.timer, level is info
id 5: pmd, level is info
[...]
id 37: pmd.net.bnxt.driver, level is notice
id 38: pmd.net.e1000.init, level is notice
id 39: pmd.net.e1000.driver, level is notice
id 40: pmd.net.enic, level is info
[...]
$ ovs-appctl dpdk/log-set debug pmd.*:notice
$ ovs-appctl dpdk/log-list
global log level is debug
id 0: lib.eal, level is debug
id 1: lib.malloc, level is debug
id 2: lib.ring, level is debug
id 3: lib.mempool, level is debug
id 4: lib.timer, level is debug
id 5: pmd, level is debug
[...]
id 37: pmd.net.bnxt.driver, level is notice
id 38: pmd.net.e1000.init, level is notice
id 39: pmd.net.e1000.driver, level is notice
id 40: pmd.net.enic, level is notice
[...]
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-13 10:06:21 +02:00
|
|
|
#include "unixctl.h"
|
2019-05-14 16:08:43 +03:00
|
|
|
#include "util.h"
|
2018-05-03 15:08:01 -04:00
|
|
|
#include "vswitch-idl.h"
|
2016-10-04 17:58:05 -07:00
|
|
|
|
|
|
|
VLOG_DEFINE_THIS_MODULE(dpdk);
|
|
|
|
|
2017-03-06 09:49:11 +03:00
|
|
|
static FILE *log_stream = NULL; /* Stream for DPDK log redirection */
|
|
|
|
|
2021-11-10 17:53:41 +01:00
|
|
|
/* Indicates successful initialization of DPDK. */
|
2023-02-28 18:30:56 -08:00
|
|
|
static atomic_bool dpdk_initialized = false;
|
2021-11-10 17:53:41 +01:00
|
|
|
|
2016-10-04 17:58:05 -07:00
|
|
|
static bool
|
2019-01-29 11:11:49 +03:00
|
|
|
args_contains(const struct svec *args, const char *value)
|
2016-10-04 17:58:05 -07:00
|
|
|
{
|
2019-01-29 11:11:49 +03:00
|
|
|
const char *arg;
|
|
|
|
size_t i;
|
|
|
|
|
|
|
|
/* We can't just use 'svec_contains' because args are not sorted. */
|
|
|
|
SVEC_FOR_EACH (i, arg, args) {
|
|
|
|
if (!strcmp(arg, value)) {
|
2016-10-04 17:58:05 -07:00
|
|
|
return true;
|
2019-01-29 11:11:49 +03:00
|
|
|
}
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
return false;
|
|
|
|
}
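The linear scan in args_contains() can be tried in isolation. The following is a minimal sketch, assuming a plain NULL-terminated string array stands in for OVS's `struct svec`; `argv_contains` is a hypothetical name used only for illustration and is not part of OVS.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for args_contains(): scans a NULL-terminated,
 * unsorted argument array for an exact string match. */
static bool
argv_contains(const char *const *args, const char *value)
{
    for (size_t i = 0; args[i] != NULL; i++) {
        if (!strcmp(args[i], value)) {
            return true;
        }
    }
    return false;
}
```

Because the comparison is an exact match on whole words, a prefix such as "--socket" does not match "--socket-mem".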
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
static void
|
|
|
|
construct_dpdk_options(const struct smap *ovs_other_config, struct svec *args)
|
2016-10-04 17:58:05 -07:00
|
|
|
{
|
|
|
|
struct dpdk_options_map {
|
|
|
|
const char *ovs_configuration;
|
|
|
|
const char *dpdk_option;
|
|
|
|
bool default_enabled;
|
|
|
|
const char *default_value;
|
|
|
|
} opts[] = {
|
dpdk: Limit DPDK memory usage.
Since 18.05 release, DPDK moved to dynamic memory model in which
hugepages could be allocated on demand. At the same time '--socket-mem'
option was re-defined as a size of pre-allocated memory, i.e. memory
that should be allocated at startup and could not be freed.
So, DPDK with a new memory model could allocate more hugepage memory
than specified in '--socket-mem' or '-m' options.
This change adds new configurable 'other_config:dpdk-socket-limit'
which could be used to limit the amount of memory DPDK could use.
It uses new DPDK option '--socket-limit'.
Ex.:
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-limit="1024,1024"
Also, in order to preserve old behaviour, if '--socket-limit' is not
specified, it will be defaulted to the amount of memory specified by
'--socket-mem' option, i.e. OVS will not be able to allocate more.
This is needed, for example, to prevent OVS from allocating more memory
than reserved for it by Nova in OpenStack installations.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-01-29 11:11:50 +03:00
|
|
|
{"dpdk-lcore-mask", "-c", false, NULL},
|
|
|
|
{"dpdk-hugepage-dir", "--huge-dir", false, NULL},
|
|
|
|
{"dpdk-socket-limit", "--socket-limit", false, NULL},
|
2016-10-04 17:58:05 -07:00
|
|
|
};
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
int i;
|
2016-10-04 17:58:05 -07:00
|
|
|
|
|
|
|
/* First, construct from the flat options (non-mutex). */
|
|
|
|
for (i = 0; i < ARRAY_SIZE(opts); ++i) {
|
2019-01-29 11:11:49 +03:00
|
|
|
const char *value = smap_get(ovs_other_config,
|
|
|
|
opts[i].ovs_configuration);
|
|
|
|
if (!value && opts[i].default_enabled) {
|
|
|
|
value = opts[i].default_value;
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
if (value) {
|
|
|
|
if (!args_contains(args, opts[i].dpdk_option)) {
|
|
|
|
svec_add(args, opts[i].dpdk_option);
|
|
|
|
svec_add(args, value);
|
2016-10-04 17:58:05 -07:00
|
|
|
} else {
|
|
|
|
VLOG_WARN("Ignoring database defined option '%s' due to "
|
2019-01-29 11:11:49 +03:00
|
|
|
"dpdk-extra config", opts[i].dpdk_option);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
#define MAX_DPDK_EXCL_OPTS 10
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
static void
|
2016-10-04 17:58:05 -07:00
|
|
|
construct_dpdk_mutex_options(const struct smap *ovs_other_config,
|
2019-01-29 11:11:49 +03:00
|
|
|
struct svec *args)
|
2016-10-04 17:58:05 -07:00
|
|
|
{
|
|
|
|
struct dpdk_exclusive_options_map {
|
|
|
|
const char *category;
|
|
|
|
const char *ovs_dpdk_options[MAX_DPDK_EXCL_OPTS];
|
|
|
|
const char *eal_dpdk_options[MAX_DPDK_EXCL_OPTS];
|
|
|
|
const char *default_value;
|
|
|
|
int default_option;
|
|
|
|
} excl_opts[] = {
|
|
|
|
{"memory type",
|
|
|
|
{"dpdk-alloc-mem", "dpdk-socket-mem", NULL,},
|
|
|
|
{"-m", "--socket-mem", NULL,},
|
dpdk: Remove default values for socket-mem and limit.
This change removes the default values for EAL args socket-mem and
socket-limit. As DPDK supports dynamic memory allocation, there is no
need to allocate a certain amount of memory on start-up, nor limit the
amount of memory available, if not requested.
Currently, socket-mem has a default value of 1024 when it is not
configured by the user, and socket-limit takes on the value of
socket-mem, 1024, by default. With this change, socket-mem is not
configured by default, meaning that socket-limit is not either.
Neither, either or both options can be set.
Removed extra logs that announce this change and fixed documentation.
Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1949850
Signed-off-by: Rosemarie O'Riorden <roriorde@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-07-15 17:37:52 -04:00
|
|
|
NULL, 0
|
2016-10-04 17:58:05 -07:00
|
|
|
},
|
|
|
|
};
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
int i;
|
2016-10-04 17:58:05 -07:00
|
|
|
for (i = 0; i < ARRAY_SIZE(excl_opts); ++i) {
|
|
|
|
int found_opts = 0, scan, found_pos = -1;
|
|
|
|
const char *found_value;
|
|
|
|
struct dpdk_exclusive_options_map *popt = &excl_opts[i];
|
|
|
|
|
|
|
|
for (scan = 0; scan < MAX_DPDK_EXCL_OPTS
|
|
|
|
&& popt->ovs_dpdk_options[scan]; ++scan) {
|
2019-01-29 11:11:49 +03:00
|
|
|
const char *value = smap_get(ovs_other_config,
|
|
|
|
popt->ovs_dpdk_options[scan]);
|
|
|
|
if (value && strlen(value)) {
|
2016-10-04 17:58:05 -07:00
|
|
|
found_opts++;
|
|
|
|
found_pos = scan;
|
2019-01-29 11:11:49 +03:00
|
|
|
found_value = value;
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!found_opts) {
|
|
|
|
if (popt->default_option) {
|
|
|
|
found_pos = popt->default_option;
|
|
|
|
found_value = popt->default_value;
|
|
|
|
} else {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (found_opts > 1) {
|
|
|
|
VLOG_ERR("Multiple defined options for %s. Please check your"
|
|
|
|
" database settings and reconfigure if necessary.",
|
|
|
|
popt->category);
|
|
|
|
}
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
if (!args_contains(args, popt->eal_dpdk_options[found_pos])) {
|
|
|
|
svec_add(args, popt->eal_dpdk_options[found_pos]);
|
|
|
|
svec_add(args, found_value);
|
2016-10-04 17:58:05 -07:00
|
|
|
} else {
|
|
|
|
VLOG_WARN("Ignoring database defined option '%s' due to "
|
2019-01-29 11:11:49 +03:00
|
|
|
"dpdk-extra config", popt->eal_dpdk_options[found_pos]);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
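The scan over a group of mutually exclusive options can be sketched on its own. This is an illustrative reduction, not the OVS code: `struct excl_choice` and `pick_exclusive` are hypothetical names, and the configured values are passed in as a plain array instead of being looked up in an smap.

```c
#include <stddef.h>

/* Hypothetical result of scanning one group of mutually exclusive keys. */
struct excl_choice {
    int n_found;   /* How many keys in the group had a non-empty value. */
    int pos;       /* Index of the last such key, or -1 if none. */
};

/* 'values[i]' is the configured value for the i-th key of the group, or NULL
 * if unset.  As in the scan loop above, the last non-empty value wins, and
 * 'n_found' > 1 signals a conflict worth reporting to the user. */
static struct excl_choice
pick_exclusive(const char *const *values, size_t n)
{
    struct excl_choice c = { 0, -1 };

    for (size_t i = 0; i < n; i++) {
        if (values[i] && values[i][0] != '\0') {
            c.n_found++;
            c.pos = (int) i;
        }
    }
    return c;
}
```

With both "dpdk-alloc-mem" and "dpdk-socket-mem" set, the sketch reports two hits and selects the second, which corresponds to the function above logging an error and still passing the last option found to the EAL.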
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
static void
|
|
|
|
construct_dpdk_args(const struct smap *ovs_other_config, struct svec *args)
|
2016-10-04 17:58:05 -07:00
|
|
|
{
|
2019-01-29 11:11:49 +03:00
|
|
|
const char *extra_configuration = smap_get(ovs_other_config, "dpdk-extra");
|
2016-10-04 17:58:05 -07:00
|
|
|
|
|
|
|
if (extra_configuration) {
|
2019-01-29 11:11:49 +03:00
|
|
|
svec_parse_words(args, extra_configuration);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
construct_dpdk_options(ovs_other_config, args);
|
|
|
|
construct_dpdk_mutex_options(ovs_other_config, args);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
|
2017-03-06 09:49:11 +03:00
|
|
|
static ssize_t
|
|
|
|
dpdk_log_write(void *c OVS_UNUSED, const char *buf, size_t size)
|
|
|
|
{
|
2018-03-23 12:56:53 +03:00
|
|
|
static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(600, 600);
|
|
|
|
static struct vlog_rate_limit dbg_rl = VLOG_RATE_LIMIT_INIT(600, 600);
|
2017-03-06 09:49:11 +03:00
|
|
|
|
|
|
|
switch (rte_log_cur_msg_loglevel()) {
|
|
|
|
case RTE_LOG_DEBUG:
|
2019-09-06 13:26:02 +02:00
|
|
|
VLOG_DBG_RL(&dbg_rl, "%.*s", (int) size, buf);
|
2017-03-06 09:49:11 +03:00
|
|
|
break;
|
|
|
|
case RTE_LOG_INFO:
|
|
|
|
case RTE_LOG_NOTICE:
|
2019-09-06 13:26:02 +02:00
|
|
|
VLOG_INFO_RL(&rl, "%.*s", (int) size, buf);
|
2017-03-06 09:49:11 +03:00
|
|
|
break;
|
|
|
|
case RTE_LOG_WARNING:
|
2019-09-06 13:26:02 +02:00
|
|
|
VLOG_WARN_RL(&rl, "%.*s", (int) size, buf);
|
2017-03-06 09:49:11 +03:00
|
|
|
break;
|
|
|
|
case RTE_LOG_ERR:
|
2019-09-06 13:26:02 +02:00
|
|
|
VLOG_ERR_RL(&rl, "%.*s", (int) size, buf);
|
2017-03-06 09:49:11 +03:00
|
|
|
break;
|
|
|
|
case RTE_LOG_CRIT:
|
|
|
|
case RTE_LOG_ALERT:
|
|
|
|
case RTE_LOG_EMERG:
|
2019-09-06 13:26:02 +02:00
|
|
|
VLOG_EMER("%.*s", (int) size, buf);
|
2017-03-06 09:49:11 +03:00
|
|
|
break;
|
|
|
|
default:
|
|
|
|
OVS_NOT_REACHED();
|
|
|
|
}
|
|
|
|
|
|
|
|
return size;
|
|
|
|
}
|
|
|
|
|
|
|
|
static cookie_io_functions_t dpdk_log_func = {
|
|
|
|
.write = dpdk_log_write,
|
|
|
|
};
|
|
|
|
|
2020-07-13 10:06:21 +02:00
|
|
|
static void
|
|
|
|
dpdk_unixctl_mem_stream(struct unixctl_conn *conn, int argc OVS_UNUSED,
|
|
|
|
const char *argv[] OVS_UNUSED, void *aux)
|
|
|
|
{
|
|
|
|
void (*callback)(FILE *) = aux;
|
|
|
|
char *response = NULL;
|
|
|
|
FILE *stream;
|
|
|
|
size_t size;
|
|
|
|
|
|
|
|
stream = open_memstream(&response, &size);
|
|
|
|
if (!stream) {
|
|
|
|
response = xasprintf("Unable to open memstream: %s.",
|
|
|
|
ovs_strerror(errno));
|
|
|
|
unixctl_command_reply_error(conn, response);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
callback(stream);
|
|
|
|
fclose(stream);
|
|
|
|
unixctl_command_reply(conn, response);
|
|
|
|
out:
|
|
|
|
free(response);
|
|
|
|
}
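The memstream pattern used by dpdk_unixctl_mem_stream() works outside OVS as well. Below is a minimal sketch, assuming a POSIX.1-2008 system that provides open_memstream(); `capture_output` and `dump_example` are hypothetical names, and the dump function merely stands in for a DPDK routine that writes to a `FILE *`.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Runs 'callback' against an in-memory stream and returns everything it
 * wrote as a malloc()'ed string, or NULL if the stream could not be opened.
 * The caller frees the result. */
static char *
capture_output(void (*callback)(FILE *))
{
    char *buf = NULL;
    size_t size = 0;
    FILE *stream = open_memstream(&buf, &size);

    if (!stream) {
        return NULL;
    }
    callback(stream);
    fclose(stream);   /* Flushes the stream and finalizes 'buf' and 'size'. */
    return buf;
}

/* Hypothetical dump function that writes one line to the given stream. */
static void
dump_example(FILE *f)
{
    fprintf(f, "lcore %d, socket %d\n", 0, 0);
}
```

The buffer is only guaranteed to be complete after fclose() (or fflush()), which is why the function above closes the stream before replying with the captured text.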
|
|
|
|
|
|
|
|
static int
|
|
|
|
dpdk_parse_log_level(const char *s)
|
|
|
|
{
|
|
|
|
static const char * const levels[] = {
|
|
|
|
[RTE_LOG_EMERG] = "emergency",
|
|
|
|
[RTE_LOG_ALERT] = "alert",
|
|
|
|
[RTE_LOG_CRIT] = "critical",
|
|
|
|
[RTE_LOG_ERR] = "error",
|
|
|
|
[RTE_LOG_WARNING] = "warning",
|
|
|
|
[RTE_LOG_NOTICE] = "notice",
|
|
|
|
[RTE_LOG_INFO] = "info",
|
|
|
|
[RTE_LOG_DEBUG] = "debug",
|
|
|
|
};
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 1; i < ARRAY_SIZE(levels); ++i) {
|
|
|
|
if (!strcmp(s, levels[i])) {
|
|
|
|
return i;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
dpdk_unixctl_log_set(struct unixctl_conn *conn, int argc, const char *argv[],
|
|
|
|
void *aux OVS_UNUSED)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
/* With no argument, set all components level to 'debug'. */
|
|
|
|
if (argc == 1) {
|
|
|
|
rte_log_set_level_pattern("*", RTE_LOG_DEBUG);
|
|
|
|
}
|
|
|
|
for (i = 1; i < argc; i++) {
|
|
|
|
char *err_msg = NULL;
|
|
|
|
char *level_string;
|
|
|
|
char *pattern;
|
|
|
|
char *s;
|
|
|
|
int level;
|
|
|
|
|
|
|
|
s = xstrdup(argv[i]);
|
|
|
|
level_string = strchr(s, ':');
|
|
|
|
if (level_string == NULL) {
|
|
|
|
pattern = "*";
|
|
|
|
level_string = s;
|
|
|
|
} else {
|
|
|
|
pattern = s;
|
|
|
|
level_string[0] = '\0';
|
|
|
|
level_string++;
|
|
|
|
}
|
|
|
|
|
|
|
|
level = dpdk_parse_log_level(level_string);
|
|
|
|
if (level == -1) {
|
|
|
|
err_msg = xasprintf("invalid log level: '%s'", level_string);
|
|
|
|
} else if (rte_log_set_level_pattern(pattern, level) < 0) {
|
|
|
|
err_msg = xasprintf("cannot set log level for '%s'", argv[i]);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (err_msg) {
|
|
|
|
unixctl_command_reply_error(conn, err_msg);
|
|
|
|
free(err_msg);
|
|
|
|
free(s);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
free(s);
|
|
|
|
}
|
|
|
|
unixctl_command_reply(conn, NULL);
|
|
|
|
}
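The argument handling in dpdk_unixctl_log_set() splits each argument at the first ':' into a pattern and a level name. The sketch below reproduces that split without DPDK, assuming levels are numbered 1 (emergency) through 8 (debug) as in DPDK's rte_log.h; `level_names`, `copy_prefix` and `parse_log_spec` are hypothetical names for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Level names ordered so that index + 1 is the numeric level
 * (assumption: 1 = emergency ... 8 = debug). */
static const char *const level_names[] = {
    "emergency", "alert", "critical", "error",
    "warning", "notice", "info", "debug",
};

/* Returns a malloc()'ed copy of the first 'len' bytes of 's'. */
static char *
copy_prefix(const char *s, size_t len)
{
    char *p = malloc(len + 1);

    if (p) {
        memcpy(p, s, len);
        p[len] = '\0';
    }
    return p;
}

/* Splits 'spec' ("pattern:level" or just "level") at the first ':'.
 * On success, returns the numeric level and stores a malloc()'ed pattern in
 * '*patternp' ("*" when no pattern was given).  Returns -1 on an unknown
 * level or allocation failure, leaving '*patternp' NULL. */
static int
parse_log_spec(const char *spec, char **patternp)
{
    const char *colon = strchr(spec, ':');
    const char *level = colon ? colon + 1 : spec;
    size_t n = sizeof level_names / sizeof level_names[0];

    *patternp = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(level, level_names[i])) {
            *patternp = colon ? copy_prefix(spec, (size_t) (colon - spec))
                              : copy_prefix("*", 1);
            return *patternp ? (int) i + 1 : -1;
        }
    }
    return -1;
}
```

For the command line shown in the commit message, "debug" applies to every component through the "*" pattern, and "pmd.*:notice" then narrows the PMD components to notice.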
|
|
|
|
|
2021-05-19 08:00:53 +00:00
|
|
|
static void
|
|
|
|
malloc_dump_stats_wrapper(FILE *stream)
|
|
|
|
{
|
|
|
|
rte_malloc_dump_stats(stream, NULL);
|
|
|
|
}
|
|
|
|
|
2018-05-03 15:08:00 -04:00
|
|
|
static bool
|
2016-10-04 17:58:05 -07:00
|
|
|
dpdk_init__(const struct smap *ovs_other_config)
|
|
|
|
{
|
2019-01-29 11:11:49 +03:00
|
|
|
char **argv = NULL;
|
2016-10-04 17:58:05 -07:00
|
|
|
int result;
|
|
|
|
bool auto_determine = true;
|
2019-08-13 14:28:26 +03:00
|
|
|
struct ovs_numa_dump *affinity = NULL;
|
2019-01-29 11:11:49 +03:00
|
|
|
struct svec args = SVEC_EMPTY_INITIALIZER;
|
2016-10-04 17:58:05 -07:00
|
|
|
|
2017-03-06 09:49:11 +03:00
|
|
|
log_stream = fopencookie(NULL, "w+", dpdk_log_func);
|
|
|
|
if (log_stream == NULL) {
|
|
|
|
VLOG_ERR("Can't redirect DPDK log: %s.", ovs_strerror(errno));
|
|
|
|
} else {
|
|
|
|
rte_openlog_stream(log_stream);
|
|
|
|
}
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
svec_add(&args, ovs_get_program_name());
|
|
|
|
construct_dpdk_args(ovs_other_config, &args);
|
2016-10-04 17:58:05 -07:00
|
|
|
|
2021-08-06 08:44:32 -04:00
|
|
|
#ifdef DPDK_IN_MEMORY_SUPPORTED
|
|
|
|
if (!args_contains(&args, "--in-memory") &&
|
|
|
|
!args_contains(&args, "--legacy-mem")) {
|
|
|
|
svec_add(&args, "--in-memory");
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2024-06-13 10:05:00 +01:00
|
|
|
if (args_contains(&args, "-c") ||
|
|
|
|
args_contains(&args, "-l") ||
|
|
|
|
args_contains(&args, "--lcores")) {
|
2019-01-29 11:11:49 +03:00
|
|
|
auto_determine = false;
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* NOTE: This is an unsophisticated mechanism for determining the DPDK
|
2020-12-15 16:41:28 +00:00
|
|
|
* main core.
|
2016-10-04 17:58:05 -07:00
|
|
|
*/
|
|
|
|
if (auto_determine) {
|
2019-08-13 14:28:26 +03:00
|
|
|
const struct ovs_numa_info_core *core;
|
2019-01-29 11:11:49 +03:00
|
|
|
int cpu = 0;
|
|
|
|
|
2016-10-04 17:58:05 -07:00
|
|
|
/* Get the main thread affinity */
|
2019-08-13 14:28:26 +03:00
|
|
|
affinity = ovs_numa_thread_getaffinity_dump();
|
|
|
|
if (affinity) {
|
|
|
|
cpu = INT_MAX;
|
|
|
|
FOR_EACH_CORE_ON_DUMP (core, affinity) {
|
|
|
|
if (cpu > core->core_id) {
|
|
|
|
cpu = core->core_id;
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
/* User did not set dpdk-lcore-mask and unable to get current
|
2019-01-29 11:11:49 +03:00
|
|
|
* thread affinity - default to core #0 */
|
2019-08-13 14:28:26 +03:00
|
|
|
VLOG_ERR("Thread getaffinity failed. Using core #0");
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
2025-04-17 13:30:52 +02:00
|
|
|
svec_add(&args, "--lcores");
|
|
|
|
svec_add_nocopy(&args, xasprintf("0@%d", cpu));
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
svec_terminate(&args);
|
2016-10-04 17:58:05 -07:00
|
|
|
|
|
|
|
optind = 1;
|
|
|
|
|
|
|
|
if (VLOG_IS_INFO_ENABLED()) {
|
2019-01-29 11:11:49 +03:00
|
|
|
struct ds eal_args = DS_EMPTY_INITIALIZER;
|
|
|
|
char *joined_args = svec_join(&args, " ", ".");
|
|
|
|
|
|
|
|
ds_put_format(&eal_args, "EAL ARGS: %s", joined_args);
|
2016-10-04 17:58:05 -07:00
|
|
|
VLOG_INFO("%s", ds_cstr_ro(&eal_args));
|
|
|
|
ds_destroy(&eal_args);
|
2019-01-29 11:11:49 +03:00
|
|
|
free(joined_args);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
|
2019-01-29 11:11:49 +03:00
|
|
|
/* Copy because 'rte_eal_init' will change the argv, i.e. it will remove
|
|
|
|
* some arguments from it. '+1' to copy the terminating NULL. */
|
|
|
|
argv = xmemdup(args.names, (args.n + 1) * sizeof args.names[0]);
|
2016-12-09 11:22:27 -05:00
|
|
|
|
2016-10-04 17:58:05 -07:00
|
|
|
/* Make sure things are initialized ... */
|
2019-01-29 11:11:49 +03:00
|
|
|
result = rte_eal_init(args.n, argv);
|
|
|
|
|
|
|
|
free(argv);
|
|
|
|
svec_destroy(&args);
|
2016-10-04 17:58:05 -07:00
|
|
|
|
|
|
|
/* Set the main thread affinity back to its pre-rte_eal_init() value. */
|
2019-08-13 14:28:26 +03:00
|
|
|
if (affinity) {
|
|
|
|
ovs_numa_thread_setaffinity_dump(affinity);
|
|
|
|
ovs_numa_dump_destroy(affinity);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
|
2018-05-03 15:08:00 -04:00
|
|
|
if (result < 0) {
|
|
|
|
VLOG_EMER("Unable to initialize DPDK: %s", ovs_strerror(rte_errno));
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2021-11-10 17:53:41 +01:00
|
|
|
if (!rte_mp_disable()) {
|
|
|
|
VLOG_EMER("Could not disable multiprocess, DPDK won't be available.");
|
|
|
|
rte_eal_cleanup();
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2019-03-14 17:43:48 +03:00
|
|
|
if (VLOG_IS_DBG_ENABLED()) {
|
|
|
|
size_t size;
|
|
|
|
char *response = NULL;
|
|
|
|
FILE *stream = open_memstream(&response, &size);
|
|
|
|
|
|
|
|
if (stream) {
|
2020-07-13 10:06:21 +02:00
|
|
|
fprintf(stream, "rte_memzone_dump:\n");
|
2019-03-14 17:43:48 +03:00
|
|
|
rte_memzone_dump(stream);
|
2020-07-13 10:06:21 +02:00
|
|
|
fprintf(stream, "rte_log_dump:\n");
|
|
|
|
rte_log_dump(stream);
|
2019-03-14 17:43:48 +03:00
|
|
|
fclose(stream);
|
2020-07-13 10:06:21 +02:00
|
|
|
VLOG_DBG("%s", response);
|
2019-03-14 17:43:48 +03:00
|
|
|
free(response);
|
|
|
|
} else {
|
2020-07-13 10:06:21 +02:00
|
|
|
VLOG_DBG("Could not dump memzone and log levels. "
|
|
|
|
"Unable to open memstream: %s.", ovs_strerror(errno));
|
2019-03-14 17:43:48 +03:00
|
|
|
}
|
|
|
|
}
|
2016-10-04 17:58:05 -07:00
|
|
|
|
dpdk: Support running PMD threads on any core.
Previously in OVS, a PMD thread running on cpu X used lcore X.
This assumption limited OVS to run PMD threads on physical cpu <
RTE_MAX_LCORE.
DPDK 20.08 introduced a new API that associates a non-EAL thread to a free
lcore. This new API does not change the thread characteristics (like CPU
affinity) and let OVS run its PMD threads on any cpu regardless of
RTE_MAX_LCORE.
The DPDK multiprocess feature is not compatible with this new API and is
disabled.
DPDK still limits the number of lcores to RTE_MAX_LCORE (128 on x86_64)
which should be enough for OVS pmd threads (hopefully).
DPDK lcore/OVS PMD thread mappings are logged when attaching an OVS PMD
thread, and when detaching.
A new command is added to help get DPDK's point of view of the DPDK lcores
at any time:
$ ovs-appctl dpdk/lcore-list
lcore 0, socket 0, role RTE, cpuset 0
lcore 1, socket 0, role NON_EAL, cpuset 1
lcore 2, socket 0, role NON_EAL, cpuset 15
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-11-10 17:53:41 +01:00
|
|
|
unixctl_command_register("dpdk/lcore-list", "", 0, 0,
|
|
|
|
dpdk_unixctl_mem_stream, rte_lcore_dump);
|
2020-07-13 10:06:21 +02:00
|
|
|
unixctl_command_register("dpdk/log-list", "", 0, 0,
|
|
|
|
dpdk_unixctl_mem_stream, rte_log_dump);
|
|
|
|
unixctl_command_register("dpdk/log-set", "{level | pattern:level}", 0,
|
|
|
|
INT_MAX, dpdk_unixctl_log_set, NULL);
|
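The dpdk/log-set command registered above takes arguments of the form
"level" or "pattern:level". As a minimal sketch of how such an argument could
be split, assuming a bare level applies to every log type (pattern "*") and
that level names match those printed by dpdk/log-list: parse_level() and
parse_log_set_arg() are hypothetical helpers for illustration, not the
functions used in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Assumed level names, ordered like DPDK's RTE_LOG_* constants
 * (1 = emergency ... 8 = debug). */
static const char *const level_names[] = {
    "emergency", "alert", "critical", "error",
    "warning", "notice", "info", "debug",
};

/* Returns 1..8 for a known level name, or -1 if the name is unknown. */
static int
parse_level(const char *name)
{
    for (size_t i = 0; i < sizeof level_names / sizeof level_names[0]; i++) {
        if (!strcasecmp(name, level_names[i])) {
            return (int) i + 1;
        }
    }
    return -1;
}

/* Splits one "level" or "pattern:level" argument.  A bare level is treated
 * as applying to all log types, represented by the pattern "*".  Returns 0
 * on success, -1 if the level is unknown or the pattern does not fit. */
static int
parse_log_set_arg(const char *arg, char *pattern, size_t pattern_size,
                  int *level)
{
    assert(arg && pattern && pattern_size > 1 && level);

    const char *colon = strchr(arg, ':');
    if (colon) {
        size_t len = (size_t) (colon - arg);
        if (len >= pattern_size) {
            return -1;
        }
        memcpy(pattern, arg, len);
        pattern[len] = '\0';
        arg = colon + 1;
    } else {
        snprintf(pattern, pattern_size, "*");
    }

    *level = parse_level(arg);
    return *level < 0 ? -1 : 0;
}
```

With this sketch, "debug" yields pattern "*" and level 8, while
"pmd.*:notice" yields pattern "pmd.*" and level 6, which mirrors the two
arguments in the dpdk/log-set example from the commit message.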
2021-05-19 08:00:53 +00:00
|
|
|
unixctl_command_register("dpdk/get-malloc-stats", "", 0, 0,
|
|
|
|
dpdk_unixctl_mem_stream,
|
|
|
|
malloc_dump_stats_wrapper);
|
2025-01-21 12:41:38 +02:00
|
|
|
unixctl_command_register("dpdk/get-memzone-stats", "", 0, 0,
|
|
|
|
dpdk_unixctl_mem_stream, rte_memzone_dump);
|
2020-07-13 10:06:21 +02:00
|
|
|
|
2016-10-04 17:58:05 -07:00
|
|
|
/* We are called from the main thread here */
|
|
|
|
RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID;
|
|
|
|
|
|
|
|
/* Finally, register the dpdk classes */
|
netdev-dpdk: Add shared mempool config.
Mempools may currently be shared between DPDK ports based
on port MTU and NUMA. With some hint from the user we can
increase the sharing on MTU and hence reduce memory
consumption in many cases.
For example, a port with MTU 9000, uses a mempool with an
mbuf size based on 9000 MTU. A port with MTU 1500, uses a
different mempool with an mbuf size based on 1500 MTU.
In this case, assuming same NUMA, both these ports could
share the 9000 MTU mempool.
The user must give a hint, as the order of creation of ports and
setting of MTUs may vary and we need to ensure that upgrades
from older OVS versions do not require more memory.
This scheme can also prevent multiple mempools being created
for cases where a port is added picking up a default MTU and
an appropriate mempool, but later has its MTU changed to a
different value requiring a different mempool.
Example usage:
$ ovs-vsctl --no-wait set Open_vSwitch . \
other_config:shared-mempool-config=9000,1500:1,6000:1
Port added on NUMA 0:
* MTU 1500, use mempool based on 9000 MTU
* MTU 5000, use mempool based on 9000 MTU
* MTU 9000, use mempool based on 9000 MTU
* MTU 9300, use mempool based on 9300 MTU (existing behaviour)
Port added on NUMA 1:
* MTU 1500, use mempool based on 1500 MTU
* MTU 5000, use mempool based on 6000 MTU
* MTU 9000, use mempool based on 9000 MTU
* MTU 9300, use mempool based on 9300 MTU (existing behaviour)
Default behaviour is unchanged and mempools are still only created
when needed.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2022-06-24 11:13:23 +01:00
|
|
|
netdev_dpdk_register(ovs_other_config);
|
2019-05-07 12:24:09 +03:00
|
|
|
netdev_register_flow_api_provider(&netdev_offload_dpdk);
|
2018-05-03 15:08:00 -04:00
|
|
|
return true;
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
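The shared-mempool-config example in the commit message above can be
reproduced with a small sketch. It assumes one selection rule that is
consistent with the listed results: among the hints that apply to the port's
NUMA node, pick the smallest hinted MTU that is at least the port MTU, and
fall back to the port MTU when none fits. The struct, the ANY_NUMA marker and
both function names are hypothetical; the actual logic lives in netdev-dpdk
and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ANY_NUMA (-1)   /* Hypothetical marker: hint applies to all NUMA nodes. */

struct mtu_hint {
    int mtu;
    int numa;
};

/* Parses a string such as "9000,1500:1,6000:1" into 'hints'.
 * Returns the number of entries parsed, or -1 on malformed input. */
static int
parse_shared_mempool_config(const char *config, struct mtu_hint *hints,
                            size_t max)
{
    assert(config && hints);

    int n = 0;
    const char *p = config;

    while (*p) {
        char *end;
        long mtu = strtol(p, &end, 10);
        long numa = ANY_NUMA;

        if (end == p || mtu <= 0 || (size_t) n >= max) {
            return -1;
        }
        if (*end == ':') {
            p = end + 1;
            numa = strtol(p, &end, 10);
            if (end == p || numa < 0) {
                return -1;
            }
        }
        hints[n].mtu = (int) mtu;
        hints[n].numa = (int) numa;
        n++;

        if (*end == ',') {
            p = end + 1;
        } else if (*end == '\0') {
            p = end;
        } else {
            return -1;
        }
    }
    return n;
}

/* Returns the MTU the mempool would be sized for under the assumed rule. */
static int
select_mempool_mtu(const struct mtu_hint *hints, int n, int port_mtu,
                   int port_numa)
{
    int best = port_mtu;   /* Fallback: existing per-MTU behaviour. */
    int found = 0;

    for (int i = 0; i < n; i++) {
        if (hints[i].numa != ANY_NUMA && hints[i].numa != port_numa) {
            continue;      /* Hint is for another NUMA node. */
        }
        if (hints[i].mtu < port_mtu) {
            continue;      /* Mbufs would be too small for this port. */
        }
        if (!found || hints[i].mtu < best) {
            best = hints[i].mtu;
            found = 1;
        }
    }
    return best;
}
```

For the configuration "9000,1500:1,6000:1", a port with MTU 5000 maps to 9000
on NUMA 0 and to 6000 on NUMA 1, and a port with MTU 9300 keeps 9300 on either
node, matching the table in the commit message.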
|
|
|
|
|
|
|
void
|
|
|
|
dpdk_init(const struct smap *ovs_other_config)
|
|
|
|
{
|
2016-10-04 17:58:05 -07:00
|
|
|
static bool enabled = false;
|
|
|
|
|
|
|
|
if (enabled || !ovs_other_config) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2018-05-03 15:08:01 -04:00
|
|
|
const char *dpdk_init_val = smap_get_def(ovs_other_config, "dpdk-init",
|
|
|
|
"false");
|
|
|
|
|
2019-03-01 14:59:33 +03:00
|
|
|
bool try_only = !strcasecmp(dpdk_init_val, "try");
|
|
|
|
if (!strcasecmp(dpdk_init_val, "true") || try_only) {
|
2016-10-04 17:58:05 -07:00
|
|
|
static struct ovsthread_once once_enable = OVSTHREAD_ONCE_INITIALIZER;
|
2016-10-04 17:58:05 -07:00
|
|
|
|
2016-10-04 17:58:05 -07:00
|
|
|
if (ovsthread_once_start(&once_enable)) {
|
2018-01-15 19:21:12 +01:00
|
|
|
VLOG_INFO("Using %s", rte_version());
|
2016-10-04 17:58:05 -07:00
|
|
|
VLOG_INFO("DPDK Enabled - initializing...");
|
2018-05-03 15:08:00 -04:00
|
|
|
enabled = dpdk_init__(ovs_other_config);
|
|
|
|
if (enabled) {
|
|
|
|
VLOG_INFO("DPDK Enabled - initialized");
|
2018-05-03 15:08:01 -04:00
|
|
|
} else if (!try_only) {
|
2018-05-03 15:08:00 -04:00
|
|
|
ovs_abort(rte_errno, "Cannot init EAL");
|
|
|
|
}
|
2016-10-04 17:58:05 -07:00
|
|
|
ovsthread_once_done(&once_enable);
|
2018-05-03 15:08:00 -04:00
|
|
|
} else {
|
|
|
|
VLOG_ERR_ONCE("DPDK Initialization Failed.");
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
|
|
|
} else {
|
2017-03-08 15:44:39 -08:00
|
|
|
VLOG_INFO_ONCE("DPDK Disabled - Use other_config:dpdk-init to enable");
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
2021-11-10 17:53:41 +01:00
|
|
|
atomic_store_relaxed(&dpdk_initialized, enabled);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
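The handling of other_config:dpdk-init in dpdk_init() above distinguishes
three cases: "true" (a failed init aborts), "try" (a failed init is
tolerated), and anything else (DPDK stays disabled). A minimal sketch of that
decision, using a hypothetical enum and helper names that do not exist in this
file:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical names, for illustration only. */
enum dpdk_init_mode {
    DPDK_INIT_DISABLED,   /* "false", unset, or any unrecognized value. */
    DPDK_INIT_REQUIRED,   /* "true": failure to initialize is fatal. */
    DPDK_INIT_TRY,        /* "try": failure to initialize is tolerated. */
};

/* Maps the dpdk-init string to a mode, case-insensitively. */
static enum dpdk_init_mode
parse_dpdk_init(const char *value)
{
    if (!value) {
        return DPDK_INIT_DISABLED;
    }
    if (!strcasecmp(value, "true")) {
        return DPDK_INIT_REQUIRED;
    }
    if (!strcasecmp(value, "try")) {
        return DPDK_INIT_TRY;
    }
    return DPDK_INIT_DISABLED;
}

/* True if the caller should abort after an initialization attempt. */
static bool
init_failure_is_fatal(enum dpdk_init_mode mode, bool init_ok)
{
    assert(mode != DPDK_INIT_DISABLED);   /* No attempt is made when disabled. */
    return !init_ok && mode == DPDK_INIT_REQUIRED;
}
```

Keeping the parse step separate from the abort decision makes the "try" case
easy to check in isolation: init_failure_is_fatal(DPDK_INIT_TRY, false) is
false, while the same failure under DPDK_INIT_REQUIRED is fatal.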
|
|
|
|
2019-07-05 08:43:15 -04:00
|
|
|
bool
|
|
|
|
dpdk_available(void)
|
|
|
|
{
|
2021-11-10 17:53:41 +01:00
|
|
|
bool initialized;
|
|
|
|
|
|
|
|
atomic_read_relaxed(&dpdk_initialized, &initialized);
|
|
|
|
return initialized;
|
2019-07-05 08:43:15 -04:00
|
|
|
}
|
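dpdk_available() above reads a flag that dpdk_init() stores with relaxed
atomics, so any thread can query it without a lock. The same pattern can be
sketched with standard C11 <stdatomic.h> instead of the OVS atomic macros;
initialized_flag, set_initialized() and is_initialized() are hypothetical
stand-ins for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Static storage, so the flag starts out false. */
static atomic_bool initialized_flag;

/* Publishes the result of initialization.  Relaxed ordering only guarantees
 * that the flag itself is read and written atomically; it does not order
 * other memory accesses around it. */
static void
set_initialized(bool value)
{
    atomic_store_explicit(&initialized_flag, value, memory_order_relaxed);
}

/* Returns the most recently observed value of the flag. */
static bool
is_initialized(void)
{
    return atomic_load_explicit(&initialized_flag, memory_order_relaxed);
}
```

Relaxed ordering is enough when readers only need the boolean itself; a
reader that also depends on data written before the flag was set would need
acquire/release ordering or another form of synchronization.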
|
|
|
|
2021-11-10 17:53:41 +01:00
|
|
|
bool
|
|
|
|
dpdk_attach_thread(unsigned cpu)
|
2016-10-04 17:58:05 -07:00
|
|
|
{
|
|
|
|
/* NON_PMD_CORE_ID is reserved for use by non pmd threads. */
|
|
|
|
ovs_assert(cpu != NON_PMD_CORE_ID);
|
2021-11-10 17:53:41 +01:00
|
|
|
|
|
|
|
if (!dpdk_available()) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (rte_thread_register() < 0) {
|
|
|
|
VLOG_WARN("DPDK max threads count has been reached. "
|
|
|
|
"PMD thread performance may be impacted.");
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
VLOG_INFO("PMD thread uses DPDK lcore %u.", rte_lcore_id());
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
dpdk_detach_thread(void)
|
|
|
|
{
|
|
|
|
unsigned int lcore_id;
|
|
|
|
|
|
|
|
lcore_id = rte_lcore_id();
|
|
|
|
rte_thread_unregister();
|
|
|
|
VLOG_INFO("PMD thread released DPDK lcore %u.", lcore_id);
|
2016-10-04 17:58:05 -07:00
|
|
|
}
|
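dpdk_attach_thread() and dpdk_detach_thread() above rely on DPDK handing out a
free lcore on registration and taking it back on unregistration, with
registration failing once all lcores are in use. The toy model below
illustrates that bookkeeping only. It is not DPDK's implementation:
MODEL_MAX_LCORE stands in for RTE_MAX_LCORE, the model_* functions are
hypothetical, and the sketch is single-threaded with no locking.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_LCORE 4        /* Hypothetical; stands in for RTE_MAX_LCORE. */
#define MODEL_LCORE_NONE (-1)    /* Returned when no lcore is free. */

static bool lcore_in_use[MODEL_MAX_LCORE];

/* Claims the lowest free lcore id, or returns MODEL_LCORE_NONE if every
 * slot is taken.  The caller's CPU affinity plays no part in the choice. */
static int
model_thread_register(void)
{
    for (int i = 0; i < MODEL_MAX_LCORE; i++) {
        if (!lcore_in_use[i]) {
            lcore_in_use[i] = true;
            return i;
        }
    }
    return MODEL_LCORE_NONE;
}

/* Releases a previously claimed lcore id so another thread can reuse it. */
static void
model_thread_unregister(int lcore)
{
    assert(lcore >= 0 && lcore < MODEL_MAX_LCORE && lcore_in_use[lcore]);
    lcore_in_use[lcore] = false;
}
```

In this model the lcore id is just an index into a fixed table, which is why
a thread pinned to a high-numbered CPU can still receive a low lcore id, and
why the number of attached threads stays bounded by the table size.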
2018-01-15 19:21:12 +01:00
|
|
|
|
|
|
|
void
|
|
|
|
print_dpdk_version(void)
|
|
|
|
{
|
|
|
|
puts(rte_version());
|
|
|
|
}
|
2018-05-03 15:08:01 -04:00
|
|
|
|
|
|
|
void
|
|
|
|
dpdk_status(const struct ovsrec_open_vswitch *cfg)
|
|
|
|
{
|
|
|
|
if (cfg) {
|
2021-11-10 17:53:41 +01:00
|
|
|
ovsrec_open_vswitch_set_dpdk_initialized(cfg, dpdk_available());
|
2018-05-03 15:08:01 -04:00
|
|
|
ovsrec_open_vswitch_set_dpdk_version(cfg, rte_version());
|
|
|
|
}
|
|
|
|
}
|