mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

Author	SHA1	Message	Date
Ilya Maximets	ac1a9bb93f	netdev-dpdk: Fix xstats leak on port destruction. CC: Michal Weglicki <michalx.weglicki@intel.com> Fixes: 971f4b394c6e ("netdev: Custom statistics.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Ilya Maximets	34eb086342	netdev-dpdk: Fix memory leak in netdev_dpdk_configure_xstats(). CC: Michal Weglicki <michalx.weglicki@intel.com> Fixes: 971f4b394c6e ("netdev: Custom statistics.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Ilya Maximets	526259f22c	netdev-dpdk: Fix memory leak in netdev_dpdk_get_custom_stats(). CC: Michal Weglicki <michalx.weglicki@intel.com> Fixes: 971f4b394c6e ("netdev: Custom statistics.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Yuanhan Liu	5e75881868	netdev-dpdk: fix port addition for ports sharing same PCI id Some NICs have only one PCI address associated with multiple ports. This patch extends the dpdk-devargs option's format to cater for such devices. To achieve that, this patch uses a new syntax that will be adapted and implemented in future DPDK release (likely, v18.05): http://dpdk.org/ml/archives/dev/2017-December/084234.html And since it's the DPDK duty to parse the (complete and full) syntax and this patch is more likely to serve as an intermediate workaround, here I take a simpler and shorter syntax from it (note it's allowed to have only one category being provided): class=eth,mac=00:11:22:33:44:55:66 Also, old compatibility is kept. Users can still go on with using the PCI id to add a port (if that's enough for them). Meaning, this patch will not break anything. This patch is basically based on the one from Ciara: https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339496.html Cc: Loftus Ciara <ciara.loftus@intel.com> Cc: Thomas Monjalon <thomas@monjalon.net> Cc: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-26 20:49:18 +00:00
Ian Stokes	f6f50552a3	netdev-dpdk: Fix requested MTU size validation. This commit replaces MTU_TO_FRAME_LEN(mtu) with MTU_TO_MAX_FRAME_LEN(mtu) in netdev_dpdk_set_mtu(), in order to determine if the total length of the L2 frame with an MTU of ’mtu’ exceeds NETDEV_DPDK_MAX_PKT_LEN. When setting an MTU we first check if the requested total frame length (which includes associated L2 overhead) will exceed the maximum frame length supported in netdev_dpdk_set_mtu(). The frame length is calculated by MTU_TO_FRAME_LEN as MTU + ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later stage in dpdk_eth_dev_init() using rte_eth_dev_set_mtu(mtu). However when using rte_eth_dev_set_mtu(mtu) the calculation used to check that the frame does not exceed the max frame length for that device varies between DPDK device drivers. For example: the ixgbe driver calculates the frame length for a given MTU as mtu + ETHER_HDR_LEN + ETHER_CRC_LEN; the i40e driver calculates it as mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2; the em driver calculates it as mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE. Currently it is possible to set an MTU for a netdev_dpdk device that exceeds the upper limit MTU for that device's DPDK driver. This leads to a segfault. This is because the frame length comparison as is, does not take into account the addition of the vlan tag overhead expected in the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the subsequent dpdk_eth_dev_init() will fail before the queues have been created for the DPDK device. This coupled with assumptions regarding reconfiguration requirements for the netdev will lead to a segfault when the rxq is polled for this device. A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when validating a requested MTU in netdev_dpdk_set_mtu().
MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following: mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN) By using MTU_TO_MAX_FRAME_LEN at the netdev_dpdk_set_mtu() stage, OvS now takes into account the maximum L2 overhead that a DPDK driver could allow for in its frame size calculation. This allows OVS to flag an error rather than the DPDK driver if the frame length exceeds the max DPDK frame length. OVS can fail gracefully at this point and use the default MTU of 1500 to continue to configure the port. Note: this fix is a work around, a better approach would be if DPDK devices could report the maximum MTU value that can be requested on a per device basis. This capability however is not currently available. A downside of this patch is that the MTU upper limit will be reduced by 8 bytes for DPDK devices that do not need to account for vlan tags in the frame length driver calculations e.g. ixgbe devices upper MTU limit is reduced from the OVS point of view from 9710 to 9702. CC: Mark Kavanagh <mark.b.kavanagh@intel.com> Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames") Signed-off-by: Ian Stokes <ian.stokes@intel.com> Co-authored-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Flavio Leitner <fbl@sysclose.org>	2018-01-26 20:49:18 +00:00
Flavio Leitner	b2e8b12f8a	netdev-dpdk: add vhost-user get_status. Expose relevant vhost-user information in status. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Tested-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-17 18:12:46 +00:00
zhangliping	4c47ddde34	netdev-dpdk: fix ingress_policer leak on error path Fix memory leak by freeing the policer if rte_meter_srtcm_config fails. Fixes: 9509913aa722 ("netdev-dpdk.c: Add ingress-policing functionality.") Signed-off-by: zhangliping <zhangliping02@baidu.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-01-17 18:11:28 +00:00
Michal Weglicki	971f4b394c	netdev: Custom statistics. - New get_custom_stats interface function is added to netdev. It allows particular netdev implementation to expose custom counters in dictionary format (counter name/counter value). - New statistics are retrieved using experimenter code and are printed as a result to ofctl dump-ports. - New counters are available for OpenFlow 1.4+. - New statistics are printed to output via ofctl only if those are present in reply message. - New statistics definition is added to include/openflow/intel-ext.h. - Custom statistics are implemented only for dpdk-physical port type. - DPDK-physical implementation uses xstats to collect statistics. Only dropped and error counters are exposed. Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-01-10 15:29:13 -08:00
Ilya Maximets	ad8b0b4fe7	netdev: Remove useless cutlen. Cutlen already applied while processing OVS_ACTION_ATTR_OUTPUT. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Ilya Maximets	b30896c969	netdev: Remove unused may_steal. Not needed anymore because 'may_steal' already handled on dpif-netdev layer and always true. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Ilya Maximets	be48173310	netdev-dpdk: Add debug appctl to get mempool information. New appctl 'netdev-dpdk/get-mempool-info' implemented to get result of 'rte_mempool_list_dump()' function if no arguments passed and 'rte_mempool_dump()' if DPDK netdev passed as argument. Could be used for debugging mbuf leaks and other mempool related issues. Most useful in pair with `grep -v "cache_count.*=0"`. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Michal Weglicki	3eb8d4fa0d	netdev-dpdk: extend netdev_dpdk_get_status to include if_type and if_descr This commit extends netdev_dpdk_get_status API to include additional driver-related information: if_type and if_descr. v2->v3: Code rebase. v3->v4: Minor comments applied. v5->v6: Adds DPDK port specific description in documentation. Co-authored-by: Michal Weglicki <michalx.weglicki@intel.com> Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> Signed-off-by: Przemyslaw Szczerbik <przemyslawx.szczerbik@intel.com> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Mark Kavanagh	a14d1cc8a7	netdev-dpdk: vHost IOMMU support DPDK v17.11 introduces support for the vHost IOMMU feature. This is a security feature, which restricts the vhost memory that a virtio device may access. This feature also enables the vhost REPLY_ACK protocol, the implementation of which is known to work in newer versions of QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the feature is disabled by default (and should remain so) for the aforementioned older QEMU versions. Starting with QEMU v2.9.1, vhost-iommu-support can safely be enabled, even without having an IOMMU device, with no performance penalty. This patch adds a new global config option, vhost-iommu-support, that controls enablement of the vhost IOMMU feature: ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true This value defaults to false; to enable IOMMU support, this field should be set to true when setting other global parameters on init (such as "dpdk-socket-mem", for example). Changing the value at runtime is not supported, and requires restarting the vswitch daemon. Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Mark Kavanagh	5e925ccc2a	netdev-dpdk: DPDK v17.11 upgrade This commit adds support for DPDK v17.11: - minor updates to accommodate DPDK API changes - update references to DPDK version in Documentation - update DPDK version in travis' linux-build script - document DPDK v17.11 virtio driver bug Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Jan Scheurich <jan.scheurich@ericsson.com> Tested-by: Jan Scheurich <jan.scheurich@ericsson.com> Tested-by: Guoshuai Li <ligs@dtdream.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Kevin Traynor	255b7bda98	netdev-dpdk: Remove unneeded call to rte_eth_dev_count(). The call to rte_eth_dev_count() was added as workaround for rte_eth_dev_get_port_by_name() not handling cases when there were no DPDK ports. In versions of DPDK >= 17.02 rte_eth_dev_get_port_by_name() does handle this case (DPDK commit f9ae888b1e19). rte_eth_dev_count() is no longer needed so remove it. Acked-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Ilya Maximets	b2e72a9c9d	netdev-dpdk: Add comment about variables naming convention. It'll be nice to document current naming convention for variables of the following types used in netdev-dpdk: * netdev * netdev_dpdk * netdev_rxq * netdev_rxq_dpdk to be sure that we will not return to chaos which was before commit d46285a2206f ("netdev-dpdk: Consistent variable naming."). Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Ilya Maximets	3d0d5ab153	netdev-dpdk: Fix variables naming in set_admin_state function. Function 'netdev_dpdk_set_admin_state()' was missed while fixing variables naming according to the following convention: 'struct netdev':'netdev' 'struct netdev_dpdk':'dev' 'struct netdev_rxq':'rxq' 'struct netdev_rxq_dpdk':'rx' Fixes: d46285a2206f ("netdev-dpdk: Consistent variable naming.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-08 21:42:54 +00:00
Ilya Maximets	af5b0dad30	netdev-dpdk: Fix mempool creation with large MTU. Currently mempool name size limited to 25 characters by RTE_MEMPOOL_NAMESIZE. netdev-dpdk tries to create mempool with the following name pattern: "ovs_%{hash}_%{socket}_%{mtu}_%{n_mbuf}". We have 3 chars for "ovs" + 4 chars for delimiters + 8 chars for hash (because it's the 32 bit integer printed in hex) + 1 char for socket_id (mostly 1, but it could be 2 on some systems; larger?) = 16. Only 25 - 16 = 9 characters remains for mtu + n_mbufs. Minimum usual value for mtu is 1500 --> 2030 (4 chars) after dpdk_buf_size conversion and the minimum value for n_mbufs is 16384 (5 chars). So, all the 9 characters are used. If we'll try to create port with mtu = 9500, mempool creation will fail, because FRAME_LEN_TO_MTU(dpdk_buf_size(9500)) = 10222 (5 chars) and this value will overflow the RTE_MEMPOOL_NAMESIZE limit. Same issue will happen if we'll try to create port with big enough number of queues or will try to create big enough number of PMD threads (number of tx queues will enlarge the mempool requirements). Fix that by removing the delimiters. To keep the readability (at least partial) of the mempool names exact field sizes with zero padding are used. 
Following limits should be suitable for now: - Hash length: 8 chars (uint32_t in hex) - Socket ID : 2 chars (For systems with up to 10 sockets) - MTU : 5 chars (MTU (10^5 - 1) should be enough for now) - n_mbufs : 7 chars (Up to 10^7 of mbufs) Total : 22 + 3 (for "ovs") = 25 CC: Antonio Fischetti <antonio.fischetti@intel.com> CC: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> Fixes: f06546a51dd8 ("Fix mempool names to reflect socket id.") Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-11-17 16:26:33 +00:00
Ilya Maximets	daf22bf7a8	netdev-dpdk: Fix calling vhost API with negative vid. Currently, rx and tx functions for vhost interfaces always obtain 'vid' twice. First time inside 'is_vhost_running' for checking the value and the second time in enqueue/dequeue function calls to send/receive packets. But second time we're not checking the returned value. If vhost device will be destroyed between checking and enqueue/dequeue, DPDK API will be called with '-1' instead of valid 'vid'. DPDK API does not validate the 'vid'. This leads to getting random memory value as a pointer to internal device structure inside DPDK. Access by this pointer leads to segmentation fault. For example: |00503|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE [New Thread 0x7fb6754910 (LWP 21246)] Program received signal SIGSEGV, Segmentation fault. rte_vhost_enqueue_burst at lib/librte_vhost/virtio_net.c:630 630 if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) (gdb) bt full #0 rte_vhost_enqueue_burst at lib/librte_vhost/virtio_net.c:630 dev = 0xffffffff #1 __netdev_dpdk_vhost_send at lib/netdev-dpdk.c:1803 tx_pkts = <optimized out> cur_pkts = 0x7f340084f0 total_pkts = 32 dropped = 0 i = <optimized out> retries = 0 ... (gdb) p ((struct netdev_dpdk ) netdev) $8 = { ... , flags = (NETDEV_UP | NETDEV_PROMISC), ... , vid = {v = -1}, vhost_reconfigured = false, ... } Issue can be reproduced by stopping DPDK application (testpmd) inside guest while heavy traffic flows to this VM. Fix that by obtaining and checking the 'vid' only once. CC: Ciara Loftus <ciara.loftus@intel.com> Fixes: 0a0f39df1d5a ("netdev-dpdk: Add support for DPDK 16.07") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Billy O'Mahony <billy.o.mahony@intel.com> Acked-by: Billy O'Mahony <billy.o.mahony@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-11-16 16:24:11 +00:00
Ilya Maximets	bc57ed901f	netdev-dpdk: Remove unused MAX_NB_MBUF. MAX_NB_MBUF was used as a default mempool size for almost all ports. Not used since new per-port mempool allocation introduced. MIN_NB_MBUF still used as a lower limit. CC: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-11-16 16:24:11 +00:00
Ilya Maximets	24e78f9350	netdev-dpdk: Factor out struct dpdk_mp. Since commit d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port."), struct dpdk_mp is redundant because each mempool can be used by single port only and this port already contains all the information we store in dpdk_mp. There is no need to duplicate the information. Fields of this structure currently used only to generate mempool name. But it's required only while creation and after that we can use mp->name directly from the struct rte_mempool. Let's remove this structure and use struct rte_mempool directly instead. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-11-16 16:24:11 +00:00
Ilya Maximets	173ef76bbf	netdev-dpdk: Fix dpdk_mp leak in case of EEXIST. In case of EEXIST, 'dpdk_mp_create()' will allocate yet another 'struct dpdk_mp' with same 'mp' pointer inside. We need to free this structure to avoid the leak. CC: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> CC: Antonio Fischetti <antonio.fischetti@intel.com> Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Fixes: b6b26021d2e2 ("netdev-dpdk: fix management of pre-existing mempools.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-11-16 16:24:11 +00:00
Mark Kavanagh	7ee94cbac8	netdev-dpdk: replace uint8_t with dpdk_port_t netdev_dpdk_detach() declares a 'port_id' variable, of type uint8_t. This variable should instead be of type dpdk_port_t. Fixes: bb37956ac ("netdev-dpdk: Use uint8_t for port_id.") CC: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-11-16 16:24:11 +00:00
Bhanuprakash Bodireddy	23d4d53f14	netdev-dpdk: Refactor netdev_dpdk structure. This commit introduces below changes to netdev_dpdk structure. - Mark cachelines and reorder few member variables. - Maintain the grouping of related member variables. - Add comment on the information on pad bytes where ever appropriate, so new members can be introduced in the future to fill the gaps. Below is how this structure looks with this commit. Member size OVS_CACHE_LINE_MARKER cacheline0; dpdk_port_t port_id; 1 bool attached; 1 ... OVS_CACHE_LINE_MARKER cacheline1; struct ovs_mutex; 48 struct dpdk_mp *dpdk_mp; 8 ... Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-11-03 13:36:53 -07:00
Ilya Maximets	ec6edc8cde	netdev-dpdk: Fix mp_name leak on snprintf failure. CC: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> CC: Antonio Fischetti <antonio.fischetti@intel.com> Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Fixes: 65056fd79694 ("netdev-dpdk: manage failure in mempool name creation.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-10-30 14:58:52 -07:00
antonio.fischetti@intel.com	a08a115d26	netdev-dpdk: Rename dpdk_mp_put as dpdk_mp_free. For readability purposes dpdk_mp_put is renamed as dpdk_mp_free. CC: Mark B Kavanagh <mark.b.kavanagh@intel.com> CC: Darrell Ball <dlu998@gmail.com> CC: Ciara Loftus <ciara.loftus@intel.com> CC: Kevin Traynor <ktraynor@redhat.com> CC: Aaron Conole <aconole@redhat.com> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-10-26 20:45:33 +01:00
antonio.fischetti@intel.com	ad9b5b9bc7	netdev-dpdk: Reword mp_size as n_mbufs. For code readability purposes mp_size is renamed as n_mbufs in dpdk_mp structure. This parameter is passed to rte mempool creation functions and is meant to contain the number of elements inside the requested mempool. CC: Ciara Loftus <ciara.loftus@intel.com> CC: Aaron Conole <aconole@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-10-26 20:23:41 +01:00
antonio.fischetti@intel.com	65056fd796	netdev-dpdk: manage failure in mempool name creation. In case a mempool name could not be generated log a message and return a null mempool pointer to the caller. CC: Mark B Kavanagh <mark.b.kavanagh@intel.com> CC: Darrell Ball <dlu998@gmail.com> CC: Ciara Loftus <ciara.loftus@intel.com> CC: Kevin Traynor <ktraynor@redhat.com> CC: Aaron Conole <aconole@redhat.com> Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-10-26 20:17:27 +01:00
antonio.fischetti@intel.com	837c1761be	netdev-dpdk: skip init for existing mempools. Skip initialization of mempool packet areas if this was already done in a previous call to dpdk_mp_create. CC: Darrell Ball <dlu998@gmail.com> CC: Ciara Loftus <ciara.loftus@intel.com> CC: Aaron Conole <aconole@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-10-26 20:08:01 +01:00
antonio.fischetti@intel.com	f06546a51d	Fix mempool names to reflect socket id. Create mempool names by considering also the NUMA socket number. So a name reflects what socket the mempool is allocated on. This change is needed for the NUMA-awareness feature. CC: Mark B Kavanagh <mark.b.kavanagh@intel.com> CC: Aaron Conole <aconole@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Reported-by: Ciara Loftus <ciara.loftus@intel.com> Tested-by: Ciara Loftus <ciara.loftus@intel.com> Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-10-26 19:49:29 +01:00
antonio.fischetti@intel.com	b6b26021d2	netdev-dpdk: fix management of pre-existing mempools. Fix an issue on reconfiguration of pre-existing mempools. This patch avoids to call dpdk_mp_put() - and erroneously release the mempool - when it already exists. CC: Mark B Kavanagh <mark.b.kavanagh@intel.com> CC: Aaron Conole <aconole@redhat.com> CC: Darrell Ball <dlu998@gmail.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Reported-by: Ciara Loftus <ciara.loftus@intel.com> Tested-by: Ciara Loftus <ciara.loftus@intel.com> Reported-by: Róbert Mulik <robert.mulik@ericsson.com> Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.") Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-10-26 19:46:58 +01:00
Zoltan Balogh	75fb914892	netdev-dpdk: reset packet_type for reused dp_packets. DPDK uses dp-packet pool for storing received packets. The pool is reused by rxq_recv functions of the DPDK netdevs. The datapath is capable of modifying the packet_type property of packets, for instance when encapsulated L3 packets are received on a ptap gre port. In this case the packet_type property of struct dp_packet can be modified and later the same dp_packet with the modified packet_type can be reused in the rxq_recv function, so it can contain corrupted data. The dp_packet_batch_init_cutlen() in the rxq_recv functions iterates over dp_packets and sets their cutlen. So I modified this function to set packet_type to Ethernet for the dp_packets as well. I also renamed this function because of the added functionality. The dp_packet_batch_init_cutlen() iterates over batch->count dp_packet. Therefore setting of batch->count = nb_rx needs to be done before the former function is invoked. This is an additional fix. Signed-off-by: Zoltan Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Laszlo Suru <laszlo.suru@ericsson.com> Co-authored-by: Laszlo Suru <laszlo.suru@ericsson.com> CC: Jan Scheurich <jan.scheurich@ericsson.com> CC: Sugesh Chandran <sugesh.chandran@intel.com> CC: Darrell Ball <dlu998@gmail.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-09-22 02:58:13 -07:00
Bhanuprakash Bodireddy	fd57eebacb	netdev-dpdk: Minor cleanup of netdev_dpdk_send__. The variable 'cnt' is initialized and reused in multiple function calls inside netdev_dpdk_send__() and is confusing sometimes. Instead introduce 'batch_cnt' to hold the original packet count and 'tx_cnt' to store the final packet count resulting after filtering and qos operations. Finally 'tx_cnt' packets gets transmitted on the respective 'qid'. Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-09-22 02:14:59 -07:00
Bhanuprakash Bodireddy	8a14bd7b6b	netdev-dpdk: Cleanup dpdk_do_tx_copy. Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-09-22 02:12:22 -07:00
Bhanuprakash Bodireddy	8a543eb03a	netdev-dpdk: Use DP_PACKET_BATCH_FOR_EACH in netdev_dpdk_ring_send. Use DP_PACKET_BATCH_FOR_EACH macro in netdev_dpdk_ring_send(). Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-09-22 02:05:49 -07:00
Gao Zhenyu	3e90f7d753	netdev-dpdk: Execute QoS Checking before copying to mbuf. In dpdk_do_tx_copy function, all packets were copied to mbuf first, but QoS checking may drop some of them. Move the QoS checking in front of copying data to mbuf, it helps to reduce useless copy. Signed-off-by: Zhenyu Gao <sysugaozhenyu@gmail.com> Acked-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-09-06 22:27:53 -07:00
Robert Wojciechowicz	d555d9bded	netdev-dpdk: Create separate memory pool for each port. Since it's possible to delete memory pool in DPDK we can try to estimate better required memory size when port is reconfigured, e.g. with different number of rx queues. CC: Kevin Traynor <ktraynor@redhat.com> CC: Aaron Conole <aconole@redhat.com> Signed-off-by: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> Co-authored-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-09-05 12:02:25 -07:00
wangzhike	894af647a8	netdev-dpdk: update vhost user client port status. After ovs-vswitchd reboots, vhost user client port status is displayed as LINK DOWN though the traffic is OK. The problem is that the port may be updated while the vhost_reconfigured is false. Then the vhost_reconfigured is updated to true. As a result, the vhost port status is kept as LINK-DOWN. Signed-off-by: wangzhike <wangzhike@jd.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-09-05 12:02:25 -07:00
wangzhike	50986e7842	netdev-dpdk: vhost get stats fix. In netdev_dpdk_vhost_get_stats, '+=' was used in a few places where '=' was expected. Signed-off-by: wangzhike <wangzhike@jd.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-08-25 14:49:55 -07:00
Lance Richardson	602c86681c	netdev-dpdk: use 64-bit arithmetic when converting rates. Force 64-bit arithmetic to be used when converting uint32_t rate and burst parameters from kilobits per second to bytes per second, avoiding incorrect behavior for rates exceeding UINT_MAX bits per second. Reported-by: "王志克" <wangzhike@jd.com> Fixes: 9509913aa722 ("netdev-dpdk.c: Add ingress-policing functionality.") Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-By: Mark Michelson <mmichels@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-08-25 14:35:31 -07:00
Darrell Ball	11d4c7a843	dp-packet: Refactor DPDK packet initialization. DPDK uses dp-packet pools and manages the mbuf portion of each packet. When a pool is created, partial initialization is also done on the OVS portion (i.e. non-mbuf). Since packet memory is reused, this is not very useful for transient fields and is also misleading. Furthermore, some of these transient fields are properly initialized for DPDK packets entering OVS anyways, which is the only reasonable way to do this. Another field, cutlen, is initialized in this manner in the pool and intended to be reset when cutlen is applied on sending the packet out. However, if cutlen context is set but the packet is not sent out for some reason, then the packet header would be corrupted in the memory pool. It is better to just reset the cutlen in the packets when received. I did not detect a degradation in performance, however, I would be willing to have some degradation, since this is a proper way to handle this. In addition to initializing cutlen in received packets, the other OVS transient fields are removed from the DPDK pool initialization. Acked-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>	2017-08-24 22:09:58 -07:00
Aaron Conole	fc56f5e0f5	netdev-dpdk: include dpdk PCI header directly As part of a devargs rework in DPDK, the PCI header file was removed, and needs to be directly included. This isn't required to build with 17.05 or earlier, but will be required should a future update happen. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-By: Timothy Redaelli <tredaelli@redhat.com> Acked-by: Ciara Loftus <ciara.loftus@intel.com>	2017-08-10 13:44:24 -07:00
Darrell Ball	f8121b3912	dp-packet: Reset DPDK hwol flags on init. Reset the DPDK hwol flags in dp_packet_init_. The new hwol bad checksum flag is uninitialized for non-dpdk ports and this is noticed as test failures using netdev-dummy ports, when built with the --with-dpdk flag set. Hence, in this case, packets may be falsely marked as having a bad checksum. The existing APIs are simplified at the same time by making them specific to either DPDK or otherwise; they also now manage a single field. Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-August/045081.html Fixes: 7451af618e0d ("dp-packet : Update DPDK rx checksum validation functions.") CC: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-08-10 13:34:00 -07:00
Darrell Ball	0ee821c2e6	dpdk: Fix device cleanup. Commit 5dcde09c80a8 was introduced to make detaching more automatic without using an additional command beyond ovs-vsctl del-port <br> <port>. Sometimes, since commit 5dcde09c80a8, dpdk devices are not detached when del-port is issued; command example: sudo ovs-vsctl del-port br0 dpdk1 This can happen when vswitchd is (re)started with an existing database and devices are already bound to dpdk. A minimal recipe to reproduce the issue is: 1/ Starting with darrell@prmh-nsx-perf-server125:~$ sudo ovs-vsctl show 1c50d8ee-b17f-4fac-a595-03b0da8c8275 Bridge "br0" Port "br0" Interface "br0" type: internal Port "dpdk1" Interface "dpdk1" type: dpdk options: {dpdk-devargs="0000:04:00.1"} Port "dpdk0" Interface "dpdk0" type: dpdk options: {dpdk-devargs="0000:04:00.0"} darrell@prmh-nsx-perf-server125:~$ /usr/src/dpdk-16.11/tools/dpdk-devbind.py --status Network devices using DPDK-compatible driver ============================================ 0000:04:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=uio_pci_generic unused=ixgbe,vfio-pci 0000:04:00.1 'Ethernet Controller 10-Gigabit X540-AT2' drv=uio_pci_generic unused=ixgbe,vfio-pci 2/ restart vswitchd 3/ run sudo ovs-vsctl del-port br0 dpdk1 and find the interface is NOT detached; there is no info log ‘Device '0000:04:00.1' detached’. A more verbose discussion is here: https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/333462.html along with another possible solution. Since we are nearing the end of a release, a safe approach is needed, at this time. One approach is to revert 5dcde09c80a8. This patch does not do that but reinstates the command ovs-appctl netdev-dpdk/detach to handle cases when del-port will not work. 
To detach the device, run the reinstated command ovs-appctl netdev-dpdk/detach 0000:04:00.1 Observe console output ‘Device '0000:04:00.1' has been detached’ Fixes: 5dcde09c80a8 ("netdev-dpdk: Fix device leak on port deletion.") CC: Ilya Maximets <i.maximets@samsung.com> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Fischetti, Antonio <antonio.fischetti@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-08-02 10:18:04 -07:00
Michal Weglicki	f3e7ec2547	Update relevant artifacts to add support for DPDK 17.05.1. Upgrading to DPDK 17.05.1 stable release adds new significant features relevant to OVS, including, but not limited to: - tun/tap PMD, - VFIO hotplug support, - Generic flow API. Following changes are applied: - netdev-dpdk: Changes required by DPDK API modifications. - doc: Because of DPDK API changes, backward compatibility with previous DPDK releases will be broken, thus all relevant documentation entries are updated. - .travis: DPDK version change from 16.11.1 to 17.05.1. - rhel/openvswitch-fedora.spec.in: DPDK version change from 16.11 to 17.05.1 Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Tested-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-08-02 10:18:00 -07:00
Mark Kavanagh	67fe6d6351	netdev-dpdk: use rte_eth_dev_set_mtu. DPDK provides an API to set the MTU of compatible physical devices - rte_eth_dev_set_mtu(). Prior to DPDK v16.07 however, this API was not implemented in some DPDK PMDs (i40e, specifically). To allow the use of jumbo frames with affected NICs in OvS-DPDK, MTU configuration was achieved by setting the jumbo frame flag, and corresponding maximum permitted Rx frame size, in an rte_eth_conf structure for the NIC port, and subsequently invoking rte_eth_dev_configure() with that configuration. However, that method does not set the MTU field of the underlying DPDK structure (rte_eth_dev) for the corresponding physical device; consequently, rte_eth_dev_get_mtu() reports the incorrect MTU for an OvS-DPDK phy device with non-standard MTU. Resolve this issue by invoking rte_eth_dev_set_mtu() when setting up or modifying the MTU of a DPDK phy port. Fixes: 0072e93 ("netdev-dpdk: add support for jumbo frames") Reported-by: Aaron Conole <aconole@redhat.com> Reported-by: Vipin Varghese <vipin.varghese@intel.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Acked-by: Sugesh Chandran <sugesh.chandran@intel.com> Tested-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-08-02 10:17:58 -07:00
Kevin Traynor	2cfe866fb3	netdev-dpdk: Log Rx checksum offload not supported. Rx checksum offload is enabled by default on DPDK NICs where supported. Previously Rx checksum offload not supported was logged only once. It meant that if multiple NICs did not support Rx checksum offload, it was only reported for the first NIC configured. Fixes: 1a2bb11817a4 ("netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports.") Reported-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-07-11 22:09:22 -07:00
Kevin Traynor	d4f5282cf1	netdev-dpdk: Remove Rx checksum reconfigure. Rx checksum offload is enabled by default on DPDK physical NICs where available, with reconfiguration through options:rx-checksum-offload. However, changing rx-checksum-offload did not result in a reconfiguration of the NIC and wrong status is reported for it. As there seems to be diminishing reasons why a user would want to disable Rx checksum offload, just remove the broken reconfiguration option. Fixes: 1a2bb11817a4 ("netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports.") Reported-by: Kevin Traynor <ktraynor@redhat.com> Suggested-by: Sugesh Chandran <sugesh.chandran@intel.com> Acked-by: Darrell Ball <dlu998@gmail.com> Tested-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-07-11 22:09:19 -07:00
Ben Pfaff	875ab13020	userspace: Handling of versatile tunnel ports In netdev_gre_build_header(), GRE protocol and VXLAN next_protocol are set based on packet_type of flow. If it's about an Ethernet packet, it is set to ETP_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERNET, it is set according to the name space type. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-06-27 17:28:30 -04:00
Santosh Shukla	31b88c9751	netdev-dpdk: round up mbuf_size to cache_line_size Some PMD drivers (e.g. the vNIC thunderx PMD) require mbuf_size to be a multiple of cache_line_size. Without this fix, netdev-dpdk initialization would fail for those PMDs. Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ian Stokes <ian.stokes@intel.com>	2017-06-14 14:04:40 -07:00
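
Note on 31b88c9751: the rounding that commit describes can be illustrated with a minimal sketch. This is not the code from the patch. The helper name round_up_pow2 and its parameters are hypothetical, and the sketch assumes the cache line size is a nonzero power of two (64 bytes is typical, but that is an assumption here, not something the commit message states).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from OVS or DPDK: round 'size' up to the
 * next multiple of 'align'.  'align' must be a nonzero power of two, which
 * is assumed to hold for cache line sizes. */
static inline uint32_t
round_up_pow2(uint32_t size, uint32_t align)
{
    /* A power of two has exactly one bit set. */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Add (align - 1) to cross the next boundary unless already aligned,
     * then clear the low bits to land exactly on the boundary. */
    return (size + align - 1) & ~(align - 1);
}
```

With an assumed 64-byte cache line, a size that is already a multiple of 64 is returned unchanged, and any other size is rounded up to the next multiple of 64.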

318 Commits