mirror of https://github.com/openvswitch/ovs synced 2025-08-23 10:28:00 +00:00

211 Commits

Author SHA1 Message Date
Kevin Traynor
47a45d868f dpif-netdev/netdev-dpdk: Fix line lengths.
Fix line lengths to be <= 79 as per coding style and so that checkpatch
will not show up existing warnings on these files.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-05-18 13:48:38 -07:00
Przemyslaw Lal
12d0d1242c netdev-dpdk: fix ifindex assignment for DPDK ports
In the current implementation, port_id is used as the ifindex for all
netdev-dpdk interfaces.

For physical DPDK interfaces, using port_id as the ifindex means that 'dpdk0'
gets ifindex '0', 'dpdk1' gets '1', and so on. For DPDK vHost interfaces,
ifindexes are not assigned at all (0 is used by default) because vHost ports
don't use the port_id field from the DPDK library.

This causes multiple negative side-effects. First of all, 0 is an invalid
ifindex value. The other issue is that the ifindex values of 'dpdkX'
interfaces may overlap with those of kernel space interfaces, which may cause
problems in any external tools that use those values. Neither 'dpdk0' nor any
DPDK vHost interface is visible in sFlow collector tools, as all interfaces
with ifindexes smaller than 1 are ignored.

The proposed solution to these issues is to calculate a hash of the
interface's name and use the calculated value as the ifindex. This way,
interfaces keep their ifindexes across OVS-DPDK restarts, port
re-initialization events, etc., show up in sFlow collectors, and meet the
RFC 2863 requirements regarding reuse of ifindex values by the same virtual
interfaces and the maximum ifindex value.

Signed-off-by: Przemyslaw Lal <przemyslawx.lal@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Darrell Ball <dlu998@gmail.com>
2017-05-18 13:40:50 -07:00
Hemant Agrawal
401b70d632 netdev-dpdk: leverage the mempool offload support
DPDK 16.07 introduced mempool offload support.
rte_pktmbuf_pool_create is the recommended method for creating pktmbuf
pools. Buffer pools created with rte_mempool_create may not get offloaded
to the underlying offloaded mempools.

This patch replaces rte_mempool_create with the helper wrapper
"rte_pktmbuf_pool_create" provided by DPDK, so that it can leverage
offloaded mempools.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-05-03 12:38:37 -07:00
Billy O'Mahony
86bd9ed93c netdev-dpdk: Enable INDIRECT_DESC on DPDK vHostUser.
This gives much better performance for linux apps in the guest without
affecting dpdk applications in the guest.

I'm creating this patch on the basis of the performance results outlined
below. In summary, it appears that enabling INDIRECT_DESC on DPDK vHostUser
ports leads to a very large increase in performance when using linux stack
applications in the guest, with no noticeable performance drop for DPDK based
applications in the guest.

Test#1 (VM-VM iperf3 performance)
 VMs use DPDK vhostuser ports
 OVS bridge is configured for normal action.
 OVS version 603381a (on 2.7.0 branch but before release,
     also seen on v2.6.0 and v2.6.1)
 DPDK v16.11
 QEMU v2.5.0 (also seen with v2.7.1)

 Results:
  INDIRECT_DESC enabled    5.30 Gbit/s
  INDIRECT_DESC disabled   0.05 Gbit/s

Test#2  (Phy-VM-Phy RFC2544 Throughput)
 DPDK PMDs are polling NIC, DPDK loopback app running in guest.
 OVS bridge is configured with port forwarding to VM (via dpdkvhostuser ports).
 OVS version 603381a (on 2.7.0 branch but before release),
     other versions not tested.
 DPDK v16.11
 QEMU v2.5.0 (also seen with v2.7.1)

 Results:
  INDIRECT_DESC enabled    2.75 Mpps @64B pkts (0.176 GB/s, i.e. 1.408 Gbit/s)
  INDIRECT_DESC disabled   2.75 Mpps @64B pkts (0.176 GB/s, i.e. 1.408 Gbit/s)

Signed-off-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Darrell Ball <dlu998@gmail.com>
2017-05-03 12:35:07 -07:00
Mark Kavanagh
c67e46c01e netdev-dpdk: fix mempool_configure error state
netdev_dpdk_mempool_configure obtains a handle to a
DPDK memory pool via a call to dpdk_mp_get. If dpdk_mp_get
fails, the former informs the user that insufficient memory
is available, and returns ENOMEM. However, this is
potentially misleading, as there are a number of reasons why
creation of a mempool can fail (as per rte_mempool_create),
including:
   - insufficient memory available
   - mempool already exists
   - other memory allocation error

Update the error log to reflect this fact, and return rte_errno
in the event of error, instead of ENOMEM.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Darrell Ball <dlu998@gmail.com>
2017-05-03 11:35:19 -07:00
Ian Stokes
96e9b168e0 netdev-dpdk: Fix mempool segfault.
The dpdk_mp_get() function can return a NULL pointer which leads to a
segfault when a mempool cannot be created. The lack of a return value
check for the function netdev_dpdk_mempool_configure() when called in
netdev_dpdk_reconfigure() can result in a segfault also as
a NULL pointer for the mempool will be passed to rte_eth_rx_queue_setup().

Fix this by adding appropriate NULL pointer and return value checks to
dpdk_mp_get(), netdev_dpdk_reconfigure() and dpdk_vhost_reconfigure_helper().

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Fixes: 2ae3d542 ("netdev-dpdk: Refactor dpdk_mp_get().")
Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
CC: Daniele Di Proietto <diproiettod@vmware.com>
CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-03-09 17:21:55 -08:00
Ian Stokes
21e9844c59 netdev-dpdk: Fix rx_error stat for dpdk ports.
"rx_error" stat for a DPDK interface was calculated with the assumption that
dropped packets due to hardware buffer overload were counted as errors
in DPDK and the rte ierror stat included rte imissed packets i.e.

rx_errors = rte_stats.ierrors - rte_stats.imissed

This results in negative statistic values as imissed packets are no longer
counted as part of ierror since DPDK v16.04.

Fix this by setting rx_errors equal to ierrors only.
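
A small sketch of why the old formula misbehaves. The struct below is a
hypothetical two-field stand-in for the DPDK counters named above, and the
counters are assumed to be unsigned 64-bit values: once imissed exceeds
ierrors, the subtraction wraps around, which shows up as a negative number
when the result is treated as signed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the port counters named in the text. */
struct port_counters {
    uint64_t ierrors;   /* Erroneous received packets. */
    uint64_t imissed;   /* Packets dropped because hardware buffers
                         * were full. */
};

/* Old calculation: wraps around when imissed > ierrors. */
static uint64_t
rx_errors_old(const struct port_counters *c)
{
    return c->ierrors - c->imissed;
}

/* Fixed calculation: imissed is no longer part of ierrors, so ierrors is
 * reported unchanged. */
static uint64_t
rx_errors_fixed(const struct port_counters *c)
{
    return c->ierrors;
}
```

With ierrors = 2 and imissed = 10, the old formula yields 2^64 - 8, while
the fixed one yields 2.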

Fixes: 9e3ddd45 (netdev-dpdk: Add some missing statistics.)
CC: Timo Puha <timox.puha@intel.com>
Reported-by: Stepan Andrushko <stepanx.andrushko@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-02-16 16:11:10 -08:00
Daniele Di Proietto
1ce30dfd34 netdev-dpdk: Refactor construct and destruct.
Some refactoring for _construct() and _destruct() methods:
* Rename netdev_dpdk_init() to common_construct(). init() has a
  different meaning in the netdev context.
* Remove DPDK_DEV_ETH and DPDK_DEV_VHOST checks in common_construct()
  and move them to different functions
* Introduce common_destruct().
* Avoid taking 'dev->mutex' in construct and destruct: we're guaranteed
  to be the only thread with access to the object.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2017-01-15 19:25:11 -08:00
Daniele Di Proietto
7f381c2e54 netdev-dpdk: Start also dpdkr devices only once on port-add.
Since commit 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"),
we don't call rte_eth_start() from netdev_open() anymore, we only call
it from netdev_reconfigure().  This commit does that also for 'dpdkr'
devices, and removes some useless code.

Calling rte_eth_start() also from netdev_open() was unnecessary and
wasteful. Not doing it reduces code duplication and makes adding a port
faster (~900ms before the patch, ~400ms after).

Another reason why this is useful is that some DPDK drivers might have
problems with reconfiguration. For example, until DPDK commit
8618d19b52b1("net/vmxnet3: reallocate shared memzone on re-config"),
vmxnet3 didn't support being restarted with a different number of
queues.

Technically, the netdev interface changed because before opening rxqs or
calling netdev_send() the user must check if reconfiguration is
required.  This patch also documents that, even though no change to the
userspace datapath (the only user) is required.

Lastly, this patch makes sure the errors returned by ofproto_port_add
(which includes the first port reconfiguration) are reported back to the
database.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2017-01-15 19:25:11 -08:00
Daniele Di Proietto
3b1fb0779b netdev-dpdk: Don't call rte_dev_stop() in update_flags().
Calling rte_eth_dev_stop() while the device is running causes a crash.

We could use rte_eth_dev_set_link_down(), but not every PMD implements
that, and I found one NIC where that has no effect.

Instead, this commit checks if the device has the NETDEV_UP flag when
transmitting or receiving (similarly to what we do for vhostuser). I
didn't notice any performance difference with this check in case the
device is up.

An alternative would be to remove the device queues from the pmd threads
tx and receive cache, but that requires reconfiguration and I'd prefer
to avoid it, because the change can come from OpenFlow.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2017-01-15 19:25:11 -08:00
Daniele Di Proietto
9fff138ec3 netdev: Add 'errp' to set_config().
Since 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"),
set_config() is used to identify a DPDK device, so it's better to report
its detailed error message to the user.  Tunnel devices and patch ports
rely a lot on set_config() as well.

This commit adds a param to set_config() that can be used to return
an error message and makes use of that in netdev-dpdk and netdev-vport.

Before this patch:

$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl: Error detected while setting up 'dpdk0': dpdk0: could not set
    configuration (Invalid argument).  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
ovs-vsctl: Error detected while setting up 'p+': p+: could not set
    configuration (Invalid argument).  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
ovs-vsctl: Error detected while setting up 'gnv0': gnv0: could not set
    configuration (Invalid argument).  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

After this patch:

$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl: Error detected while setting up 'dpdk0': 'dpdk0' is missing
    'options:dpdk-devargs'. The old 'dpdk<port_id>' names are not
    supported.  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
ovs-vsctl: Error detected while setting up 'p+': p+: patch type requires
    valid 'peer' argument.  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
ovs-vsctl: Error detected while setting up 'gnv0': gnv0: geneve type
    requires valid 'remote_ip' argument.  See ovs-vswitchd log for
    details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

CC: Ciara Loftus <ciara.loftus@intel.com>
CC: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
2017-01-11 18:29:39 -08:00
xu.binbin1@zte.com.cn
bd4e172b28 netdev-dpdk: Assign socket id according to device's numa id
We can hotplug attach DPDK ports specified via the 'dpdk-devargs'
option now.

But the socket id of DPDK ports can't be assigned correctly,
it is always 0. The socket id of DPDK ports should be assigned
according to the numa id of the device.

Fixes: 55e075e65ef9e ("netdev-dpdk: Arbitrary 'dpdk' port naming")
Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-11 18:11:30 -08:00
nickcooper-zhangtonghao
182ca1ae94 netdev-dpdk: Fix formatting typo.
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-08 18:04:41 -08:00
Ciara Loftus
69876ed786 netdev-dpdk: Add support for virtual DPDK PMDs (vdevs)
Prior to this commit, the 'dpdk' port type could only be used for
physical DPDK devices. Now, virtual devices (or 'vdevs') are supported.
'vdev' devices are those which use virtual DPDK Poll Mode Drivers, e.g.
null, pcap. To add a DPDK vdev, a valid 'dpdk-devargs' must be set for
the given dpdk port. The format expected is 'eth_<driver_name><x>' where
'x' is a number between 0 and RTE_MAX_ETHPORTS - 1.

For example to add a port that uses the 'null' DPDK PMD driver:

ovs-vsctl set Interface null0 options:dpdk-devargs=eth_null0

Not all DPDK vdevs have been verified to work at this point in time.
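
The expected 'eth_<driver_name><x>' format can be checked with a sketch
like the one below. The function name and the max_ports parameter are
hypothetical (max_ports plays the role of RTE_MAX_ETHPORTS, whose value
depends on the DPDK build), and this is not the validation code from the
commit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: returns true if 'devargs' looks like
 * "eth_<driver_name><x>" with 0 <= x < max_ports. */
static bool
vdev_devargs_is_valid(const char *devargs, long max_ports)
{
    static const char prefix[] = "eth_";
    const size_t prefix_len = sizeof prefix - 1;
    size_t len, digits;
    long x;

    assert(devargs != NULL);
    if (strncmp(devargs, prefix, prefix_len) != 0) {
        return false;
    }

    /* Find the index of the first character of the trailing number. */
    len = strlen(devargs);
    digits = len;
    while (digits > prefix_len
           && isdigit((unsigned char) devargs[digits - 1])) {
        digits--;
    }

    /* Require at least one digit and a non-empty driver name. */
    if (digits == len || digits == prefix_len) {
        return false;
    }

    x = strtol(devargs + digits, NULL, 10);
    return x >= 0 && x < max_ports;
}
```

A driver name that itself ends in a digit would be ambiguous under this
simple scheme; the sketch treats all trailing digits as the index.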

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Finucane <stephen@that.guru>  # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-05 20:10:19 -08:00
Ciara Loftus
55e075e65e netdev-dpdk: Arbitrary 'dpdk' port naming
'dpdk' ports no longer have naming restrictions. Now, instead of
specifying the dpdk port ID as part of the name, the PCI address of the
device must be specified via the 'dpdk-devargs' option, e.g.:

ovs-vsctl add-port br0 my-port
ovs-vsctl set Interface my-port type=dpdk \
  options:dpdk-devargs=0000:06:00.3

The user no longer needs to hotplug attach DPDK ports by issuing the
specific ovs-appctl netdev-dpdk/attach command. The hotplug is now
automatically invoked when a valid PCI address is set in the
dpdk-devargs. The format of the ovs-appctl netdev-dpdk/detach command
has changed: the user must now specify the relevant PCI address
as input instead of the port name.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Co-authored-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Stephen Finucane <stephen@that.guru>  # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-05 20:10:13 -08:00
Mauricio Vásquez
b8374d0d04 netdev-dpdk: add hotplug support
In order to use dpdk ports in ovs they have to be bound to a DPDK
compatible driver before ovs is started.

This patch adds the possibility to hotplug (or hot-unplug) a device
after ovs has been started. The implementation adds two appctl commands:
netdev-dpdk/attach and netdev-dpdk/detach

After the user attaches a new device, it has to be added to a bridge
using the add-port command, similarly, before detaching a device,
it has to be removed using the del-port command.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Co-authored-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Finucane <stephen@that.guru>  # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-05 20:10:07 -08:00
Kevin Traynor
48fffdee11 netdev-dpdk: Rename ivshmem structures.
Rename some structures that call themselves ivshmem,
as they are just a collection of dpdk rings and other
information.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-04 16:35:18 -08:00
Sugesh Chandran
1a2bb11817 netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports.
Add Rx checksum offloading feature support on DPDK physical ports. By default,
Rx checksum offloading is enabled if the NIC supports it. However,
checksum offloading can be turned OFF either while adding a new DPDK
physical port to OVS or at runtime.

Rx checksum offloading can be turned off by setting the parameter to
'false'. For example, to disable Rx checksum offloading when adding a port:

     'ovs-vsctl add-port br0 dpdk0 -- \
      set Interface dpdk0 type=dpdk options:rx-checksum-offload=false'

OR (to disable at run time after the port has been added to OVS)

    'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=false'

Similarly, to turn ON Rx checksum offloading at run time:
    'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=true'

The Tx checksum offloading support is not implemented due to the following
reasons.

1) Checksum offloading and vectorization are mutually exclusive in the DPDK
poll mode driver. Vector packet processing is turned OFF when checksum
offloading is enabled, which causes a significant performance drop on the Tx
side.

2) Normally, OVS generates checksums for tunnel packets in software at the
'tunnel push' operation, where the tunnel headers are created. However,
enabling Tx checksum offloading involves:

*) Marking every packet for Tx checksum offloading at 'tunnel_push' and
recirculating.
*) At the time of xmit, validating the same flag and instructing the NIC to do
the checksum calculation.  In case the NIC doesn't support Tx checksum
offloading, the checksum calculation has to be done in software before sending
out the packets.

No significant performance improvement was noticed with Tx checksum offloading
due to the overhead of additional validations + non vector packet processing.
In some test scenarios, it introduces a performance drop too.

Rx checksum offloading still offers an 8-9% improvement on VxLAN tunneling
decapsulation even though the SSE vector Rx function is disabled in DPDK poll
mode driver.

Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
Acked-by: Jesse Gross <jesse@kernel.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2017-01-04 01:10:35 -08:00
Ilya Maximets
2a21e75796 netdev: Set the default number of queues at removal from the database
Expected behavior for attribute removal from the database is
resetting it to default value. Currently this doesn't work for
n_rxq/n_txq options of pmd netdevs (last requested value used):

	# ovs-vsctl set interface dpdk0 options:n_rxq=4
	# ovs-vsctl remove interface dpdk0 options n_rxq
	# ovs-appctl dpif/show | grep dpdk0
	  <...>
	  dpdk0 1/1: (dpdk: configured_rx_queues=4, <...> \
	                    requested_rx_queues=4,  <...>)

Fix that by using NR_QUEUE or 1 as a default value for 'smap_get_int'.
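
The difference between the two defaults can be sketched as follows. The
key/value array and options_get_int() are simplified stand-ins for the
options map and 'smap_get_int', and the value 1 for NR_QUEUE is an
assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NR_QUEUE 1   /* Assumed default number of queues. */

struct kv {
    const char *key;
    const char *value;
};

/* Simplified stand-in for an integer lookup in a string map: returns the
 * value stored under 'key', or 'def' if the key is absent. */
static int
options_get_int(const struct kv *opts, size_t n, const char *key, int def)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(opts[i].key, key)) {
            return atoi(opts[i].value);
        }
    }
    return def;
}

/* Behaviour before the fix: the last requested value is the fallback, so
 * removing the key changes nothing. */
static int
requested_rxq_old(const struct kv *opts, size_t n, int last_requested)
{
    return options_get_int(opts, n, "n_rxq", last_requested);
}

/* Behaviour after the fix: a fixed default is the fallback, so removing
 * the key resets the value. */
static int
requested_rxq_new(const struct kv *opts, size_t n)
{
    return options_get_int(opts, n, "n_rxq", NR_QUEUE);
}
```

With the key removed and a previous request of 4, the old variant still
returns 4, whereas the new one returns NR_QUEUE.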

Fixes: a14b8947fd13 ("dpif-netdev: Allow different numbers of
                      rx queues for different ports.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-12-09 18:15:51 -08:00
Ilya Maximets
569c26da9d netdev-dpdk: Don't use dev->vhost_id without mutex.
The copy should be used here.
Additionally, the 'strlen' call is replaced with a faster check.

Fixes: 821b86649a90 ("netdev-dpdk: Don't try to unregister empty vhost_id.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-12-06 09:38:35 -08:00
Ilya Maximets
821b86649a netdev-dpdk: Don't try to unregister empty vhost_id.
If 'vhost-server-path' is not provided for a vhostuserclient port,
'netdev_dpdk_vhost_destruct()' will try to unregister an empty string.
This leads to an error message in the log:

netdev_dpdk|ERR|vhost2: Unable to unregister vhost driver for socket ''.

CC: Ciara Loftus <ciara.loftus@intel.com>
Fixes: 2d24d165d6a5 ("netdev-dpdk: Add new 'dpdkvhostuserclient' port type")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-12-02 10:59:48 -08:00
Binbin Xu
ca3d4f55fb netdev-dpdk: Assign value '0' to unsupported netdev features
When OVS is used with DPDK, DPDK doesn't support the features 'advertised',
'supported' and 'peer'. If a physical port is added to a bridge, the features
described above can't be assigned, and their values are random.

Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-11-28 14:54:23 -08:00
Ciara Loftus
04de404e1b netdev-dpdk: Add support for DPDK 16.11
This commit announces support for DPDK 16.11. Compatibility with DPDK
v16.07 is not broken yet thanks to only minor code changes being needed
for the upgrade.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-11-25 14:57:29 +01:00
Ilya Maximets
451f26fdce netdev-dpdk: Return rx/tx queue sizes only for ETH devices.
'dev->requested_{rxq,txq}_size' and 'dev->{rxq,txq}_size' are
relevant only for DPDK_DEV_ETH devices and should be skipped
in 'netdev_dpdk_get_config()' for other ports.

CC: Ciara Loftus <ciara.loftus@intel.com>
Fixes: b685696b8c81 ("netdev-dpdk: Allow configurable queue sizes for 'dpdk' ports")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-11-15 17:31:06 -08:00
xu.binbin1@zte.com.cn
314fb5ad07 netdev-dpdk: Fix the issue of physical port's admin state configuration
When we set a physical port's admin state via ovs-appctl, the command
appears to work and returns "OK", but the state stored in the database
doesn't change.

Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-11-15 13:55:53 -08:00
xu.binbin1@zte.com.cn
b3722dca6f netdev-dpdk: Can't set specified vhost port's admin state
When we set a vhost port's admin state via ovs-appctl, the
command fails and returns the message
"Not a DPDK Interface".

Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-11-15 13:55:53 -08:00
Daniele Di Proietto
44975bb06f netdev-dpdk: Fix crash in QoS.
qos_conf can be NULL.  This can be easily reproduced by setting egress
QoS on a port:

```
ovs-vsctl set port dpdk2 qos=@newqos -- --id=@newqos create qos \
type=egress-policer other-config:cir=46000000 other-config:cbs=2048
```

Reported-by: Ian Stokes <ian.stokes@intel.com>
Fixes: 78bd47cf44a5 ("netdev-dpdk: Use RCU for egress QoS.")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
2016-11-14 17:24:34 -08:00
Ciara Loftus
a0cbc627a6 dpdk: Fix DPDK pdump compilation
The rte_pdump header file was not included in the file that requires it.
Fix this.

Fixes: 01961bbdd34a ("dpdk: New module with some code from netdev-dpdk.")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-10-13 11:08:30 -07:00
Daniele Di Proietto
01961bbdd3 dpdk: New module with some code from netdev-dpdk.
There's a lot of code in netdev-dpdk which is not at all related to the
netdev interface, mostly the library initialization code.

This commit moves it to a new 'dpdk' module, to simplify 'netdev-dpdk'.

Also a new module 'dpdk-stub' is introduced to implement some functions
when DPDK is not available.  This replaces the old 'netdev-nodpdk'
module.

Some redundant includes are removed or reorganized as a consequence.

No functional change.

CC: Aaron Conole <aconole@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:31:06 -07:00
Daniele Di Proietto
05b49df6e5 netdev-dpdk: Change vlog module name to 'netdev_dpdk'.
It is customary to have the vlog module name similar to the filename.
Plus a following commit will introduce a 'dpdk' module.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:29:35 -07:00
Daniele Di Proietto
ecc1a34ea5 netdev-dpdk: Use init() function to initialize classes.
It's better to use the classes init() functions to perform
initialization required for classes.

This will make it easier to move dpdk_init__() to a separate module in a
future commit.

No functional change.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
65e19e70e3 netdev-dpdk: Remove useless 'rte_eal_init_ret'.
If rte_eal_init() fails, we do not register the DPDK netdev classes,
therefore it's impossible to reach the classes construct functions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
1166b0d820 netdev-dpdk: Remove useless nonpmd_mempool_mutex.
Since DPDK commit 30e639989227("mempool: support non-EAL thread"),
non-EAL threads can use the mempool API safely.  Plus, nonpmd threads
access to netdev is already serialized with 'non_pmd_mutex' in
dpif-netdev.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
0c6f39e5b1 netdev-dpdk: Use xasprintf() when possible.
We're in the slowpath.  I find it easier to allocate and free memory,
than to handle snprintf() error conditions.
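
For readers unfamiliar with the pattern, the following is a minimal sketch
of an xasprintf()-style helper. It is not the OVS implementation; the name
format_alloc() is hypothetical. The helper measures the formatted length,
allocates exactly that much, and aborts on failure, so callers only have to
free() the result.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical xasprintf()-like helper: returns a newly allocated,
 * null-terminated string that the caller must free().  Aborts instead of
 * returning an error, so there is no truncation case to handle. */
static char *
format_alloc(const char *format, ...)
{
    va_list args, args2;
    int needed;
    char *s;

    va_start(args, format);
    va_copy(args2, args);

    /* First pass: compute the required length without writing. */
    needed = vsnprintf(NULL, 0, format, args);
    va_end(args);
    if (needed < 0) {
        va_end(args2);
        abort();
    }

    s = malloc((size_t) needed + 1);
    if (!s) {
        va_end(args2);
        abort();
    }

    /* Second pass: write into the exactly sized buffer. */
    vsnprintf(s, (size_t) needed + 1, format, args2);
    va_end(args2);
    return s;
}
```

Compared with a fixed buffer and snprintf(), the caller trades one
allocation for not having to check for truncation, which is reasonable in
a slow path.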

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
eff23640bd netdev-dpdk: Do not abort if out of hugepage memory.
We can run out of hugepage memory coming from rte_*alloc() more easily
than heap coming from malloc().

Therefore:

* We should not use hugepage memory if we're going to access it only in
  the slowpath.
* We shouldn't abort if we're out of hugepage memory.
* We should gracefully handle out of memory conditions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:26:59 -07:00
Daniele Di Proietto
819f13bd39 netdev-dpdk: Acquire dev->stats_lock only once.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 15:30:04 -07:00
Daniele Di Proietto
78bd47cf44 netdev-dpdk: Use RCU for egress QoS.
I think it's clearer to use RCU than to check for a pointer twice in the
fast path (before and after taking the spinlock). Now the spinlock is
integrated into 'qos_conf'.

'qos_conf' objects cannot be modified, so, instead of having
'qos_set()', we now have 'qos_is_equal()', which tells us if an object
must be destroyed and recreated.

With this patch we also avoid passing the netdev parameter to qos ops,
since it was unused most of the time.

Lastly, some duplication is removed.

CC: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 15:25:45 -07:00
Daniele Di Proietto
2ae3d5421b netdev-dpdk: Refactor dpdk_mp_get().
The error handling path in dpdk_mp_get() is getting complicated, it
even requires a boolean variable.

Simplify it by extracting the function dpdk_mp_create().

CC: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 15:16:34 -07:00
Ilya Maximets
b614c894ee netdev-dpdk: Configure flow control only when necessary.
It is not necessary to touch the physical device each time if the
configuration has not changed. Also, a few style issues are fixed.

A thread-safety annotation is added to 'dpdk_set_rxq_config()'. It was
missed during the previous refactoring of the flow control configuration.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: Sugesh Chandran <sugesh.chandran@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-30 10:59:00 -07:00
Ciara Loftus
b685696b8c netdev-dpdk: Allow configurable queue sizes for 'dpdk' ports
The 'options:n_rxq_desc' and 'n_txq_desc' fields allow the number of rx
and tx descriptors for dpdk ports to be modified. By default the values
are set to 2048, but can be modified to an integer between 1 and 4096
that is a power of two. The values can be modified at runtime; however,
changing them requires the NIC to restart.
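
The constraint on descriptor counts (a power of two between 1 and 4096,
default 2048) can be expressed as a short check. The macro and function
names are hypothetical, and falling back to the default for an invalid
request is an assumption of this sketch, not necessarily what the commit
does.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_SIZE_DEFAULT 2048   /* Default from the text above. */
#define QUEUE_SIZE_MAX     4096   /* Upper bound from the text above. */

/* Returns true if 'x' is a positive power of two. */
static bool
is_pow2(int x)
{
    return x > 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: returns 'requested' if it is a power of two in
 * [1, QUEUE_SIZE_MAX], otherwise the default (assumed behaviour). */
static int
queue_size_or_default(int requested)
{
    if (requested >= 1 && requested <= QUEUE_SIZE_MAX
        && is_pow2(requested)) {
        return requested;
    }
    return QUEUE_SIZE_DEFAULT;
}
```

For example, a request of 1024 is accepted as is, while 1000 or 8192 would
resolve to 2048 under the assumption above.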

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-30 10:58:39 -07:00
Mark Kavanagh
58be5c0eec netdev-dpdk: Fix coding style
Coding style violations of the following conventions are present in
netdev-dpdk.c:
    - limit lines to 79 characters
    - put a space after (but not before) the "sizeof" keyword
    - put a space between the () used in a cast and the
      expression whose type is cast: (void *) 0.

Resolve occurrences of each, and any other minor style infractions.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-29 11:41:33 -07:00
Mark Kavanagh
2391135ca7 netdev-dpdk: consistent naming for mbuf variables
Pointers to struct rte_mbuf are typically denoted within functions as
'pkt'; similarly, arrays of, and pointer-to-pointer to, struct rte_mbuf
are denoted by 'pkts'.

Update discrepancies to the above convention for consistency.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-29 11:41:07 -07:00
Ilya Maximets
c2adb102e2 netdev-dpdk: Introduce dpdk_mp_mutex.
'dpdk_mutex' protects two independent things: list of dpdk devices
and list of memory pools. Let's split it in two to avoid global blocking
inside 'netdev_dpdk.*_reconfigure()' where possible.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-29 11:04:11 -07:00
Ilya Maximets
4196454379 netdev-dpdk: More correct log message on vhost_driver_unregister failure.
The current error message is incorrect for client mode.

Fixes: c1ff66ac80b5 ("netdev-dpdk: vHost client mode and reconnect")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-23 13:27:21 -07:00
Ilya Maximets
6881885a3b netdev-dpdk: Add missed lock in set_config for vhost client mode.
'vhost_driver_flags' and 'vhost_id' are mutable and must be protected
by 'dev->mutex'.

Fixes: 2d24d165d6a5 ("netdev-dpdk: Add new 'dpdkvhostuserclient' port type")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-23 13:23:14 -07:00
Ilya Maximets
5f88de0d9f netdev-dpdk: Fix memory leak in dpdk_mp_{get, put}().
'dmp' should be freed on failure and on put.
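
The ownership rule can be sketched with a generic reference-counted handle.
The types and functions below are hypothetical and use plain malloc() in
place of a DPDK mempool; they only show the two places where the wrapper
object (the role played by 'dmp') has to be released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper, playing the role of 'dmp'. */
struct pool_handle {
    int refcount;
    void *mem;      /* Stands in for the underlying memory pool. */
};

/* Returns a new handle with one reference, or NULL on failure.  On
 * failure the partially constructed handle is freed, not leaked. */
static struct pool_handle *
pool_get(size_t size)
{
    struct pool_handle *h = malloc(sizeof *h);

    if (!h) {
        return NULL;
    }
    h->mem = size ? malloc(size) : NULL;
    if (!h->mem) {
        free(h);            /* Free the wrapper on failure. */
        return NULL;
    }
    h->refcount = 1;
    return h;
}

/* Drops one reference.  Returns true if this was the last reference and
 * both the pool memory and the wrapper were freed. */
static bool
pool_put(struct pool_handle *h)
{
    assert(h != NULL && h->refcount > 0);
    if (--h->refcount > 0) {
        return false;
    }
    free(h->mem);
    free(h);                /* Free the wrapper on the final put. */
    return true;
}
```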

Fixes: 8a9562d21a40 ("dpif-netdev: Add DPDK netdev.")
Fixes: 8d38823bdf8b ("netdev-dpdk: fix memory leak")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-19 14:32:52 -07:00
Ciara Loftus
2d24d165d6 netdev-dpdk: Add new 'dpdkvhostuserclient' port type
The 'dpdkvhostuser' port type no longer supports both server and client
mode. Instead, 'dpdkvhostuser' ports are always 'server' mode and
'dpdkvhostuserclient' ports are always 'client' mode.

Suggested-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-19 14:02:06 -07:00
Ciara Loftus
5b9bf9e067 netdev-dpdk: Fix occurance of error log
If NUMA information can't be derived from a vHost User device, only
print an error if the VHOST_NUMA option is enabled in DPDK. Otherwise
'fail' silently.

Fixes: 0a0f39df1d5a ("netdev-dpdk: Add support for DPDK 16.07")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reported-by: Ian Stokes <ian.stokes@intel.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-18 17:04:27 -07:00
Ilya Maximets
6b094bf47b netdev-dpdk: Simplify send function for ETH devices.
'netdev_dpdk_send__()' function can be greatly simplified by using
recently introduced 'netdev_dpdk_filter_packet_len()'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-18 13:15:52 -07:00
Ilya Maximets
c6ec9d176d netdev-dpdk: Fix vHost stats.
This patch introduces function 'netdev_dpdk_filter_packet_len()' which is
intended to find and remove all packets with 'pkt_len > max_packet_len'
from the Tx batch.

It fixes inaccurate counting of 'tx_bytes' in the vHost case when packets
were dropped, and allows the send function to be simplified.
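
The filtering step can be sketched as an in-place compaction of the batch.
The struct and function below are simplified, hypothetical stand-ins (a
real implementation operates on DPDK mbufs and also frees the dropped
packets); they show how oversized packets are removed while the order of
the remaining ones is preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a packet buffer. */
struct pkt {
    uint32_t pkt_len;
};

/* Hypothetical filter: keeps only packets with
 * pkt_len <= max_packet_len, compacting 'pkts' in place.  Returns the
 * number of packets kept; entries past that index are stale. */
static size_t
filter_packet_len(struct pkt **pkts, size_t cnt, uint32_t max_packet_len)
{
    size_t kept = 0;

    for (size_t i = 0; i < cnt; i++) {
        if (pkts[i]->pkt_len <= max_packet_len) {
            pkts[kept++] = pkts[i];
        }
        /* A real implementation would free the dropped packet here. */
    }
    return kept;
}
```

Byte counters can then be computed over the kept packets only, and the
number of drops is simply 'cnt' minus the return value.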

Fixes: 0072e931b207 ("netdev-dpdk: add support for jumbo frames")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-18 13:15:52 -07:00