mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-26 03:47:27 +00:00

Author	SHA1	Message	Date
Ilya Maximets	9474073615	netdev-dpdk: Dump flow patterns only if debug enabled. No need to waste time for fields checking in case DBG disabled. Additionally sequence of prints replaced with single print to avoid output interrupting by other log messages. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-11-02 15:16:14 +00:00
Ilya Maximets	faf71e4922	netdev-dpdk: Print port name in offload API messages. This is useful for understanding which flows offloaded to which ports. Code refactored a bit to reduce number of casts. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-11-02 15:15:27 +00:00
Ilya Maximets	5752eae485	dpif-netdev: Fix cmap node use after free on flow disassociation. Data pointed by cmap node must not be freed while iterating. ovsrcu_postpone should be used instead. CC: Finn Christensen <fc@napatech.com> Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-11-02 15:13:54 +00:00
Ilya Maximets	95ca79d542	netdev-dpdk: Secure flow offload API. rte API is not thread safe. We have to get netdev mutex before using it and also before using fields of netdev structure. This is important because offload API is used from a separate thread and could be used at the same time with other netdev functions called from the main thread. CC: Finn Christensen <fc@napatech.com> Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-11-02 15:13:40 +00:00
Ilya Maximets	c0af6425d7	netdev-dpdk: Drop offload API for vhost ports. vhost ports are not DPDK eth ports and have no rte_flow API. Stop calling this API with DPDK_ETH_PORT_ID_INVALID to avoid time wasting and errors in log. Additionally, DPDK_FLOW_OFFLOAD_API definition moved to .c file, because there is no need to expose it in header. CC: Finn Christensen <fc@napatech.com> Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-11-02 15:13:19 +00:00
Ben Pfaff	89c09c1cd1	netdev: Clean up class initialization. The macros are hard to read. This makes it a little more readable. Signed-off-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-08-27 17:48:23 +01:00
Xu Binbin	74cd69a479	netdev-dpdk: Support the link speed of XL710 In the scenario of XL710, the link speed which stored in the table of Interface is not 40G. Because the implementation of query of link speed only support to 10G, the parameter 'current' will be a random value in the scenario of higher link speed. In this case, incorrect link speed of XL710 nic will be stored in the database. Signed-off-by: Xu Binbin <xu.binbin1@zte.com.cn> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-08-27 17:48:23 +01:00
Kevin Traynor	51c6a5a3c8	netdev-dpdk: Use hex for PCI vendor ID. Match the prefix and formatting. Fixes: 8a9562d21a40 ("dpif-netdev: Add DPDK netdev.") Cc: pshelar@ovn.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-08-08 22:06:21 +01:00
Sugesh Chandran	7e1de65e8d	netdev-dpdk: Fix failure to configure flow control at netdev-init. Configuring flow control at ixgbe netdev-init is throwing error in port start. For eg: without this fix, user cannot configure flow control on ixgbe dpdk port as below, " ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ options:dpdk-devargs=0000:05:00.1 options:rx-flow-ctrl=true " Instead, it must be configured as two different commands, " ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ options:dpdk-devargs=0000:05:00.1 ovs-vsctl set Interface dpdk0 options:rx-flow-ctrl=true " The DPDK ixgbe driver is now validating all the 'rte_eth_fc_conf' fields before trying to configuring the dpdk ethdev. Hence OVS can no longer set the 'dont care' fields to just '0' as before. This commit make sure all the 'rte_eth_fc_conf' fields are populated with default values before the dev init. Also to avoid read error on unsupported ports, the flow control parameters are now read only when user is trying to configure/update it. Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-08-08 22:06:21 +01:00
Ben Pfaff	773c3cb40f	netdev-dpdk: Use ETH_ADDR_BYTES_ARGS instead of open-coding it. Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-07-24 22:36:38 +01:00
Ben Pfaff	31a033cb71	netdev-dpdk: Fix sparse complaints. Neither of these is a real problem. Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-07-24 22:36:29 +01:00
Ben Pfaff	2b7b5dbb07	netdev-dpdk: Fix incorrect byte order conversion in log message. uint8_t values shouldn't be passed to ntohs(). Found by sparse. Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-07-24 22:36:21 +01:00
Ian Stokes	43307ad0e2	dpdk: Support both shared and per port mempools. This commit re-introduces the concept of shared mempools as the default memory model for DPDK devices. Per port mempools are still available but must be enabled explicitly by a user. OVS previously used a shared mempool model for ports with the same MTU and socket configuration. This was replaced by a per port mempool model to address issues flagged by users such as: https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/042560.html However the per port model potentially requires an increase in memory resource requirements to support the same number of ports and configuration as the shared port model. This is considered a blocking factor for current deployments of OVS when upgrading to future OVS releases as a user may have to redimension memory for the same deployment configuration. This may not be possible for users. This commit resolves the issue by re-introducing shared mempools as the default memory behaviour in OVS DPDK but also refactors the memory configuration code to allow for per port mempools. This patch adds a new global config option, per-port-memory, that controls the enablement of per port mempools for DPDK devices. ovs-vsctl set Open_vSwitch . other_config:per-port-memory=true This value defaults to false; to enable per port memory support, this field should be set to true when setting other global parameters on init (such as "dpdk-socket-mem", for example). Changing the value at runtime is not supported, and requires restarting the vswitch daemon. The mempool sweep functionality is also replaced with the sweep functionality from OVS 2.9 found in commits c77f692 (netdev-dpdk: Free mempool only when no in-use mbufs.) a7fb0a4 (netdev-dpdk: Add mempool reuse/free debug.) A new document to discuss the specifics of the memory models and example memory requirement calculations is also added. Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Tiago Lam <tiago.lam@intel.com> Tested-by: Tiago Lam <tiago.lam@intel.com>	2018-07-06 12:46:26 +01:00
Yuanhan Liu	daf90186e2	netdev-dpdk: add debug for rte flow patterns For debug purpose. Co-authored-by: Finn Christensen <fc@napatech.com> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org> Signed-off-by: Finn Christensen <fc@napatech.com> Co-authored-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-07-06 10:32:52 +01:00
Finn Christensen	e8a2b5bf92	netdev-dpdk: implement flow offload with rte flow The basic yet the major part of this patch is to translate the "match" to rte flow patterns. And then, we create a rte flow with MARK + RSS actions. Afterwards, all packets that match the flow will have the mark id in the mbuf. The reason RSS is needed is, for most NICs, a MARK only action is not allowed. It has to be used together with some other actions, such as QUEUE, RSS, etc. However, QUEUE action can specify one queue only, which may break the rss. Likely, RSS action is currently the best we can do for now. Thus, RSS action is chosen. For any unsupported flows, such as MPLS, -1 is returned, meaning the flow offload has failed and is skipped. Co-authored-by: Yuanhan Liu <yliu@fridaylinux.org> Signed-off-by: Finn Christensen <fc@napatech.com> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org> Co-authored-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-07-06 10:32:52 +01:00
John Hurley	88dcf2aa82	netdev-provider: add class op to get block_id Add a new class op for netdevs to get the block_id if one exists. The block_id is used in offload ops to group multiple qdiscs together. Stub calls are made to the new class op (implementation to follow in further patches). The default block_id of 0 (no block) will be used in these cases. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2018-06-29 14:51:47 +02:00
Aaron Conole	b9a3183d3a	netdev-dpdk: Avoid warning for snprintf() call. lib/netdev-dpdk.c: In function : lib/netdev-dpdk.c:2865:49: warning: output may be truncated before the last format character [-Wformat-truncation=] snprintf(vhost_vring, 16, "vring_%d_size", i); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ilya Maximets <i.maximets@samsung.com>	2018-06-15 11:26:14 -07:00
Ian Stokes	4dd16ca0c3	netdev-dpdk: Handle ENOTSUP for rte_eth_dev_set_mtu. The function rte_eth_dev_set_mtu is not supported for all DPDK drivers. Currently if it is not supported we return an error in dpdk_eth_dev_queue_setup. There are two issues with this. (i) A device can still function even if rte_eth_dev_set_mtu is not supported albeit with the default max rx packet length. (ii) When ENOTSUP is returned it will not be caught in port_reconfigure() at the dpif-netdev layer. Port_reconfigure() checks if a netdev_reconfigure() function is supported for a given netdev and ignores EOPNOTSUPP errors as it assumes errors of this value mean there is no reconfiguration function. In this case the reconfiguration function is supported for netdev dpdk but a function called as part of the reconfigure (rte_eth_dev_set_mtu) may not be supported. As this is a corner case, this commit warns a user when rte_eth_dev_set_mtu is not supported and informs them of the default max rx packet length that will be used instead. Signed-off-by: Ian Stokes <ian.stokes@intel.com> Co-author: Michal Weglicki <michalx.weglicki@intel.com> Tested-By: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Tested-by: Cian Ferriter <cian.ferriter@intel.com>	2018-06-08 17:27:56 +01:00
Michal Weglicki	e10ca8b921	netdev-dpdk: Enable HW_CRC_STRIP for virtual functions. Virtual functions such as igb_vf and i40e_vf require HW_CRC_STRIP to be explicitly enabled before configuration, otherwise device configuration will fail. This commit achieves this by adding NETDEV_RX_HW_CRC_STRIP to dpdk_hw_ol_features. When a dpdk device is added, the driver for the device is examined, if the device is a virtual function enable HW_CRC_STRIP. Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> Co-Authored: Ian Stokes <ian.stokes@intel.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Tested-by: Cian Ferriter <cian.ferriter@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-06-08 17:27:56 +01:00
Timothy Redaelli	7bbc2e1def	netdev-dpdk: fix check for "net_nfp" driver Currently the check of "net_nfp" driver while enabling scatter compares only the first 6 bytes, but "net_nfp" is 7 bytes long. This change fixes the check by comparing the first 7 bytes. CC: Pablo Cascón <pablo.cascon@netronome.com> CC: Simon Horman <simon.horman@netronome.com> Fixes: 65a87968f4cf ("netdev-dpdk: don't enable scatter for jumbo RX support for nfp") Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Acked-by: Pablo Cascón <pablo.cascon@netronome.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-25 09:09:50 +01:00
Eelco Chaudron	606f665072	netdev-dpdk: Don't use PMD driver if not configured successfully When initialization of the DPDK PMD driver fails (dpdk_eth_dev_init()), the reconfigure_datapath() function will remove the port from dp_netdev, and the port is not used. Now when bridge_reconfigure() is called again, no changes to the previous failing netdev configuration are detected and therefore the ports gets added to dp_netdev and used uninitialized. This is causing exceptions... The fix has two parts to it. First in netdev-dpdk.c we remember if the DPDK port was started or not, and when calling netdev_dpdk_reconfigure() we also try re-initialization if the port was not already active. The second part of the change is in dpif-netdev.c where it makes sure netdev_reconfigure() is called if the port needs reconfiguration, as netdev_is_reconf_required() is only true until netdev_reconfigure() is called (even if it fails). Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Tested-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-25 09:09:50 +01:00
Kevin Traynor	1f84a2d5b5	netdev-dpdk: Remove use of rte_mempool_ops_get_count. rte_mempool_ops_get_count is not exported by DPDK so it means it cannot be used by OVS when using DPDK as a shared library. Remove rte_mempool_ops_get_count but still use rte_mempool_full and document its behavior. Fixes: 91fccdad72a2 ("netdev-dpdk: Free mempool only when no in-use mbufs.") Reported-by: Timothy Redaelli <tredaelli@redhat.com> Reported-by: Markos Chandras <mchandras@suse.de> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-25 09:09:50 +01:00
Darrell Ball	7d7ded7af7	odp-execute: Rename 'may_steal' to 'should_steal'. Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-05-23 11:36:47 -07:00
Eelco Chaudron	eaa4358119	netdev-dpdk: Fixed netdev_dpdk structure alignment Currently, the code tells us we have 4 pad bytes left in cacheline0 while actually we are 8 bytes short: struct netdev_dpdk { union { OVS_CACHE_LINE_MARKER cacheline0; /* 1 */ struct { dpdk_port_t port_id; /* 0 2 */ _Bool attached; /* 2 1 */ struct eth_addr hwaddr; /* 4 6 */ int mtu; /* 12 4 */ int socket_id; /* 16 4 */ int buf_size; /* 20 4 */ int max_packet_len; /* 24 4 */ enum dpdk_dev_type type; /* 28 4 */ enum netdev_flags flags; /* 32 4 */ char *devargs; /* 40 8 */ struct dpdk_tx_queue *tx_q; /* 48 8 */ struct rte_eth_link link; /* 56 8 */ int link_reset_cnt; /* 64 4 */ }; /* 72 */ uint8_t pad9[128]; /* 128 */ }; /* 0 128 */ /* --- cacheline 2 boundary (128 bytes) --- */ Re-located one member, link_reset_cnt, and now it's one cache line: struct netdev_dpdk { union { OVS_CACHE_LINE_MARKER cacheline0; /* 1 */ struct { dpdk_port_t port_id; /* 0 2 */ _Bool attached; /* 2 1 */ struct eth_addr hwaddr; /* 4 6 */ int mtu; /* 12 4 */ int socket_id; /* 16 4 */ int buf_size; /* 20 4 */ int max_packet_len; /* 24 4 */ enum dpdk_dev_type type; /* 28 4 */ enum netdev_flags flags; /* 32 4 */ int link_reset_cnt; /* 36 4 */ char *devargs; /* 40 8 */ struct dpdk_tx_queue *tx_q; /* 48 8 */ struct rte_eth_link link; /* 56 8 */ }; /* 64 */ uint8_t pad9[64]; /* 64 */ }; /* 0 64 */ /* --- cacheline 1 boundary (64 bytes) --- */ Fixes: 5e925ccc2a6f ("netdev-dpdk: DPDK v17.11 upgrade") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Tiago Lam <tiago.lam@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-11 08:08:24 +01:00
Róbert Mulik	f8b64a61bc	Configurable Link State Change (LSC) detection mode It is possible to set LSC detection mode to polling or interrupt mode for DPDK interfaces. The default is polling mode. To set interrupt mode, option dpdk-lsc-interrupt has to be set to true. For detailed description and usage see the dpdk install documentation. Signed-off-by: Robert Mulik <robert.mulik@ericsson.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-11 08:08:24 +01:00
Jan Scheurich	8492adc270	netdev: Add optional qfill output parameter to rxq_recv() If the caller provides a non-NULL qfill pointer and the netdev implementation supports reading the rx queue fill level, the rxq_recv() function returns the remaining number of packets in the rx queue after reception of the packet burst to the caller. If the implementation does not support this, it returns -ENOTSUP instead. Reading the remaining queue fill level should not substantially slow down the recv() operation. A first implementation is provided for ethernet and vhostuser DPDK ports in netdev-dpdk.c. This output parameter will be used in the upcoming commit for PMD performance metrics to supervise the rx queue fill level for DPDK vhostuser ports. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Acked-by: Billy O'Mahony <billy.o.mahony@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-11 08:08:24 +01:00
Pablo Cascón	65a87968f4	netdev-dpdk: don't enable scatter for jumbo RX support for nfp Currently to RX jumbo packets fails for NICs not supporting scatter. Scatter is not strictly needed for jumbo RX support. This change fixes the issue by not enabling scatter only for the PMD/NIC known not to need it to support jumbo RX. Note: this change is temporary and not needed for later releases OVS/DPDK Reported-by: Louis Peens <louis.peens@netronome.com> Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-11 08:08:24 +01:00
Kevin Traynor	91fccdad72	netdev-dpdk: Free mempool only when no in-use mbufs. DPDK mempools are freed when they are no longer needed. This can happen when a port is removed or a port's mtu is reconfigured so that a new mempool is used. It is possible that an mbuf is attempted to be returned to a freed mempool from NIC Tx queues and this can lead to a segfault. In order to prevent this, only free mempools when they are not needed and have no in-use mbufs. As this might not be possible immediately, create a free list of mempools and sweep it anytime a port tries to get a mempool. Fixes: 8d38823bdf8b ("netdev-dpdk: fix memory leak") Cc: mark.b.kavanagh81@gmail.com Cc: Ilya Maximets <i.maximets@samsung.com> Reported-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-04-21 16:59:45 +01:00
Kevin Traynor	1dfebee971	netdev-dpdk: Remove 'error' from non error log. Presently, if OVS tries to setup more queues than are allowed by a specific NIC, OVS will handle this case by retrying with a lower amount of queues. Rather than reporting initial failed queue setups in the logs as ERROR, they are reported as INFO but contain the word 'error'. Unless a user has detailed knowledge of OVS-DPDK workings, this is confusing. Let's remove 'error' from the INFO log. Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-03-23 11:35:34 +00:00
Ilya Maximets	fa9f4eebd3	netdev-dpdk: Fix print format for dpdk port ids. Since 17.11 release DPDK uses uint16 for port_id. Format strings for printing functions must be updated accordingly. CC: Mark Kavanagh <mark.b.kavanagh@intel.com> Fixes: 5e925ccc2a6f ("netdev-dpdk: DPDK v17.11 upgrade") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-03-23 11:28:35 +00:00
Justin Pettit	e883448e3f	dp-packet: Add index to DP_PACKET_BATCH_FOR_EACH to prevent shadowing. Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>	2018-02-28 14:53:27 -08:00
Ciara Loftus	10087cba9d	netdev-dpdk: Add support for vHost dequeue zero copy (experimental) Zero copy is disabled by default. To enable it, set the 'dq-zero-copy' option to 'true' when configuring the Interface: ovs-vsctl set Interface dpdkvhostuserclient0 options:vhost-server-path=/tmp/dpdkvhostuserclient0 options:dq-zero-copy=true When packets from a vHost device with zero copy enabled are destined for a single 'dpdk' port, the number of tx descriptors on that 'dpdk' port must be set to a smaller value. 128 is recommended. This can be achieved like so: ovs-vsctl set Interface dpdkport options:n_txq_desc=128 Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send to should not exceed 128. Due to this requirement, the feature is considered 'experimental'. Testing of the patch showed a ~8% improvement when switching 512B packets between vHost devices on different VMs on the same host when zero copy was enabled on the transmitting device. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-31 14:04:35 +00:00
Ilya Maximets	ac1a9bb93f	netdev-dpdk: Fix xstats leak on port destruction. CC: Michal Weglicki <michalx.weglicki@intel.com> Fixes: 971f4b394c6e ("netdev: Custom statistics.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Ilya Maximets	34eb086342	netdev-dpdk: Fix memory leak in netdev_dpdk_configure_xstats(). CC: Michal Weglicki <michalx.weglicki@intel.com> Fixes: 971f4b394c6e ("netdev: Custom statistics.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Ilya Maximets	526259f22c	netdev-dpdk: Fix memory leak in netdev_dpdk_get_custom_stats(). CC: Michal Weglicki <michalx.weglicki@intel.com> Fixes: 971f4b394c6e ("netdev: Custom statistics.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Yuanhan Liu	5e75881868	netdev-dpdk: fix port addition for ports sharing same PCI id Some NICs have only one PCI address associated with multiple ports. This patch extends the dpdk-devargs option's format to cater for such devices. To achieve that, this patch uses a new syntax that will be adapted and implemented in future DPDK release (likely, v18.05): http://dpdk.org/ml/archives/dev/2017-December/084234.html And since it's the DPDK duty to parse the (complete and full) syntax and this patch is more likely to serve as an intermediate workaround, here I take a simpler and shorter syntax from it (note it's allowed to have only one category being provided): class=eth,mac=00:11:22:33:44:55:66 Also, old compatibility is kept. Users can still go on with using the PCI id to add a port (if that's enough for them). Meaning, this patch will not break anything. This patch is basically based on the one from Ciara: https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339496.html Cc: Loftus Ciara <ciara.loftus@intel.com> Cc: Thomas Monjalon <thomas@monjalon.net> Cc: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Ian Stokes	f6f50552a3	netdev-dpdk: Fix requested MTU size validation. This commit replaces MTU_TO_FRAME_LEN(mtu) with MTU_TO_MAX_FRAME_LEN(mtu) in netdev_dpdk_set_mtu(), in order to determine if the total length of the L2 frame with an MTU of ’mtu’ exceeds NETDEV_DPDK_MAX_PKT_LEN. When setting an MTU we first check if the requested total frame length (which includes associated L2 overhead) will exceed the maximum frame length supported in netdev_dpdk_set_mtu(). The frame length is calculated by MTU_TO_FRAME_LEN as MTU + ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later stage in dpdk_eth_dev_init() using rte_eth_dev_set_mtu(mtu). However when using rte_eth_dev_set_mtu(mtu) the calculation used to check that the frame does not exceed the max frame length for that device varies between DPDK device drivers. For example ixgbe driver calculates the frame length for a given MTU as mtu + ETHER_HDR_LEN + ETHER_CRC_LEN i40e driver calculates it as mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2 em driver calculates it as mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE Currently it is possible to set an MTU for a netdev_dpdk device that exceeds the upper limit MTU for that devices DPDK driver. This leads to a segfault. This is because the frame length comparison as is, does not take into account the addition of the vlan tag overhead expected in the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the subsequent dpdk_eth_dev_init() will fail before the queues have been created for the DPDK device. This coupled with assumptions regarding reconfiguration requirements for the netdev will lead to a segfault when the rxq is polled for this device. A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when validating a requested MTU in netdev_dpdk_set_mtu(). MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following: mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN) By using MTU_TO_MAX_FRAME_LEN at the netdev_dpdk_set_mtu() stage, OvS now takes into account the maximum L2 overhead that a DPDK driver could allow for in its frame size calculation. This allows OVS to flag an error rather than the DPDK driver if the frame length exceeds the max DPDK frame length. OVS can fail gracefully at this point and use the default MTU of 1500 to continue to configure the port. Note: this fix is a work around, a better approach would be if DPDK devices could report the maximum MTU value that can be requested on a per device basis. This capability however is not currently available. A downside of this patch is that the MTU upper limit will be reduced by 8 bytes for DPDK devices that do not need to account for vlan tags in the frame length driver calculations e.g. ixgbe devices upper MTU limit is reduced from the OVS point of view from 9710 to 9702. CC: Mark Kavanagh <mark.b.kavanagh@intel.com> Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames") Signed-off-by: Ian Stokes <ian.stokes@intel.com> Co-authored-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Flavio Leitner <fbl@sysclose.org>	2018-01-26 20:49:18 +00:00
Flavio Leitner	b2e8b12f8a	netdev-dpdk: add vhost-user get_status. Expose relevant vhost-user information in status. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Tested-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-17 18:12:46 +00:00
zhangliping	4c47ddde34	netdev-dpdk: fix ingress_policer leak on error path Fix memory leak by freeing the policer if rte_meter_srtcm_config fails. Fixes: 9509913aa722 ("netdev-dpdk.c: Add ingress-policing functionality.") Signed-off-by: zhangliping <zhangliping02@baidu.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-17 18:11:28 +00:00
Michal Weglicki	971f4b394c	netdev: Custom statistics. - New get_custom_stats interface function is added to netdev. It allows particular netdev implementation to expose custom counters in dictionary format (counter name/counter value). - New statistics are retrieved using experimenter code and are printed as a result to ofctl dump-ports. - New counters are available for OpenFlow 1.4+. - New statistics are printed to output via ofctl only if those are present in reply message. - New statistics definition is added to include/openflow/intel-ext.h. - Custom statistics are implemented only for dpdk-physical port type. - DPDK-physical implementation uses xstats to collect statistics. Only dropped and error counters are exposed. Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-01-10 15:29:13 -08:00
Ilya Maximets	ad8b0b4fe7	netdev: Remove useless cutlen. Cutlen already applied while processing OVS_ACTION_ATTR_OUTPUT. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Ilya Maximets	b30896c969	netdev: Remove unused may_steal. Not needed anymore because 'may_steal' already handled on dpif-netdev layer and always true. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Ilya Maximets	be48173310	netdev-dpdk: Add debug appctl to get mempool information. New appctl 'netdev-dpdk/get-mempool-info' implemented to get result of 'rte_mempool_list_dump()' function if no arguments passed and 'rte_mempool_dump()' if DPDK netdev passed as argument. Could be used for debugging mbuf leaks and other mempool related issues. Most useful in pair with `grep -v "cache_count.*=0"`. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Michal Weglicki	3eb8d4fa0d	netdev-dpdk: extend netdev_dpdk_get_status to include if_type and if_descr This commit extends netdev_dpdk_get_status API to include additional driver-related information: if_type and if_descr. v2->v3: Code rebase. v3->v4: Minor comments applied. v5->v6: Adds DPDK port specific description in documentation. Co-authored-by: Michal Weglicki <michalx.weglicki@intel.com> Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> Signed-off-by: Przemyslaw Szczerbik <przemyslawx.szczerbik@intel.com> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Mark Kavanagh	a14d1cc8a7	netdev-dpdk: vHost IOMMU support DPDK v17.11 introduces support for the vHost IOMMU feature. This is a security feature, which restricts the vhost memory that a virtio device may access. This feature also enables the vhost REPLY_ACK protocol, the implementation of which is known to work in newer versions of QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the feature is disabled by default in (and should remain so), for the aforementioned older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even without having an IOMMU device, with no performance penalty. This patch adds a new global config option, vhost-iommu-support, that controls enablement of the vhost IOMMU feature: ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true This value defaults to false; to enable IOMMU support, this field should be set to true when setting other global parameters on init (such as "dpdk-socket-mem", for example). Changing the value at runtime is not supported, and requires restarting the vswitch daemon. Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Mark Kavanagh	5e925ccc2a	netdev-dpdk: DPDK v17.11 upgrade This commit adds support for DPDK v17.11: - minor updates to accommodate DPDK API changes - update references to DPDK version in Documentation - update DPDK version in travis' linux-build script - document DPDK v17.11 virtio driver bug Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Jan Scheurich <jan.scheurich@ericsson.com> Tested-by: Jan Scheurich <jan.scheurich@ericsson.com> Tested-by: Guoshuai Li <ligs@dtdream.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Kevin Traynor	255b7bda98	netdev-dpdk: Remove unneeded call to rte_eth_dev_count(). The call to rte_eth_dev_count() was added as a workaround for rte_eth_dev_get_port_by_name() not handling cases when there were no DPDK ports. In versions of DPDK >= 17.02 rte_eth_dev_get_port_by_name() does handle this case (DPDK commit f9ae888b1e19). rte_eth_dev_count() is no longer needed so remove it. Acked-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Ilya Maximets	b2e72a9c9d	netdev-dpdk: Add comment about variables naming convention. It'll be nice to document current naming convention for variables of the following types used in netdev-dpdk: * netdev * netdev_dpdk * netdev_rxq * netdev_rxq_dpdk to be sure that we will not return to chaos which was before commit d46285a2206f ("netdev-dpdk: Consistent variable naming."). Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Ilya Maximets	3d0d5ab153	netdev-dpdk: Fix variables naming in set_admin_state function. Function 'netdev_dpdk_set_admin_state()' was missed while fixing variables naming according to the following convention: 'struct netdev':'netdev' 'struct netdev_dpdk':'dev' 'struct netdev_rxq':'rxq' 'struct netdev_rxq_dpdk':'rx' Fixes: d46285a2206f ("netdev-dpdk: Consistent variable naming.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Ilya Maximets	af5b0dad30	netdev-dpdk: Fix mempool creation with large MTU. Currently the mempool name size is limited to 25 characters by RTE_MEMPOOL_NAMESIZE. netdev-dpdk tries to create a mempool with the following name pattern: "ovs_%{hash}_%{socket}_%{mtu}_%{n_mbuf}". We have 3 chars for "ovs" + 4 chars for delimiters + 8 chars for hash (because it's a 32-bit integer printed in hex) + 1 char for socket_id (mostly 1, but it could be 2 on some systems; larger?) = 16. Only 25 - 16 = 9 characters remain for mtu + n_mbufs. The minimum usual value for mtu is 1500 --> 2030 (4 chars) after dpdk_buf_size conversion, and the minimum value for n_mbufs is 16384 (5 chars). So, all 9 characters are used. If we try to create a port with mtu = 9500, mempool creation will fail, because FRAME_LEN_TO_MTU(dpdk_buf_size(9500)) = 10222 (5 chars) and this value will overflow the RTE_MEMPOOL_NAMESIZE limit. The same issue will happen if we try to create a port with a large enough number of queues or a large enough number of PMD threads (the number of tx queues enlarges the mempool requirements). Fix that by removing the delimiters. To keep the mempool names (at least partially) readable, exact field sizes with zero padding are used. 
Following limits should be suitable for now: - Hash length: 8 chars (uint32_t in hex) - Socket ID : 2 chars (For systems with up to 10 sockets) - MTU : 5 chars (MTU (10^5 - 1) should be enough for now) - n_mbufs : 7 chars (Up to 10^7 of mbufs) Total : 22 + 3 (for "ovs") = 25 CC: Antonio Fischetti <antonio.fischetti@intel.com> CC: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> Fixes: f06546a51dd8 ("Fix mempool names to reflect socket id.") Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-11-17 16:26:33 +00:00
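
The length arithmetic in the af5b0dad30 message can be checked with a rough sketch. Python is used here only to do the counting. The two format strings are reconstructed from the description in the message (a delimited pattern before the fix, fixed-width zero-padded fields after it) and are not copied from netdev-dpdk.c. The helper names, the sample hash value, and the reading of the 25-character limit as visible characters are assumptions made for illustration.

```python
# Usable name length as stated in the commit message (assumption: this
# counts visible characters and excludes any terminating NUL).
NAME_LIMIT = 25


def old_name(hash_, socket_id, mtu, n_mbufs):
    """Delimited pattern described in the message (hypothetical helper)."""
    return "ovs_%x_%d_%d_%d" % (hash_, socket_id, mtu, n_mbufs)


def new_name(hash_, socket_id, mtu, n_mbufs):
    """Fixed-width pattern: 8 hex + 2 + 5 + 7 digits after "ovs"."""
    return "ovs%08x%02d%05d%07d" % (hash_, socket_id, mtu, n_mbufs)


if __name__ == "__main__":
    sample_hash = 0xDEADBEEF  # arbitrary 32-bit value, 8 hex digits
    for mtu in (2030, 10222):  # values quoted in the message
        for make in (old_name, new_name):
            name = make(sample_hash, 0, mtu, 16384)
            verdict = "fits" if len(name) <= NAME_LIMIT else "too long"
            print(make.__name__, name, len(name), verdict)
```

With the values quoted in the message, the delimited name is exactly 25 characters for mtu 2030 and 26 for mtu 10222, while the fixed-width name stays at 3 + 8 + 2 + 5 + 7 = 25 as long as each field fits its width.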


350 Commits