2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-24 10:58:04 +00:00

249 Commits

Author SHA1 Message Date
nickcooper-zhangtonghao
182ca1ae94 netdev-dpdk: Fix formatting typo.
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-08 18:04:41 -08:00
Ciara Loftus
69876ed786 netdev-dpdk: Add support for virtual DPDK PMDs (vdevs)
Prior to this commit, the 'dpdk' port type could only be used for
physical DPDK devices. Now, virtual devices (or 'vdevs') are supported.
'vdev' devices are those which use virtual DPDK Poll Mode Drivers eg.
null, pcap. To add a DPDK vdev, a valid 'dpdk-devargs' must be set for
the given dpdk port. The format expected is 'eth_<driver_name><x>' where
'x' is a number between 0 and RTE_MAX_ETHPORTS -1.

For example to add a port that uses the 'null' DPDK PMD driver:

ovs-vsctl set Interface null0 options:dpdk-devargs=eth_null0

Not all DPDK vdevs have been verified to work at this point in time.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Finucane <stephen@that.guru>  # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-05 20:10:19 -08:00
Ciara Loftus
55e075e65e netdev-dpdk: Arbitrary 'dpdk' port naming
'dpdk' ports no longer have naming restrictions. Now, instead of
specifying the dpdk port ID as part of the name, the PCI address of the
device must be specified via the 'dpdk-devargs' option. eg.

ovs-vsctl add-port br0 my-port
ovs-vsctl set Interface my-port type=dpdk
  options:dpdk-devargs=0000:06:00.3

The user must no longer hotplug attach DPDK ports by issuing the
specific ovs-appctl netdev-dpdk/attach command. The hotplug is now
automatically invoked when a valid PCI address is set in the
dpdk-devargs. The format for ovs-appctl netdev-dpdk/detach command
has changed in that the user now must specify the relevant PCI address
as input instead of the port name.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Co-authored-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Stephen Finucane <stephen@that.guru>  # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-05 20:10:13 -08:00
Mauricio Vásquez
b8374d0d04 netdev-dpdk: add hotplug support
In order to use dpdk ports in ovs they have to be bound to a DPDK
compatible driver before ovs is started.

This patch adds the possibility to hotplug (or hot-unplug) a device
after ovs has been started. The implementation adds two appctl commands:
netdev-dpdk/attach and netdev-dpdk/detach

After the user attaches a new device, it has to be added to a bridge
using the add-port command, similarly, before detaching a device,
it has to be removed using the del-port command.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Co-authored-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Finucane <stephen@that.guru>  # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-05 20:10:07 -08:00
Kevin Traynor
48fffdee11 netdev-dpdk: Rename ivshmem structures.
Rename some structures that call themselves ivshmem,
as they are just a collection of dpdk rings and other
information.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2017-01-04 16:35:18 -08:00
Sugesh Chandran
1a2bb11817 netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports.
Add Rx checksum offloading feature support on DPDK physical ports. By default,
the Rx checksum offloading is enabled if NIC supports. However,
the checksum offloading can be turned OFF either while adding a new DPDK
physical port to OVS or at runtime.

The rx checksum offloading can be turned off by setting the parameter to
'false'. For eg: To disable the rx checksum offloading when adding a port,

     'ovs-vsctl add-port br0 dpdk0 -- \
      set Interface dpdk0 type=dpdk options:rx-checksum-offload=false'

OR (to disable at run time after port is being added to OVS)

    'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=false'

Similarly to turn ON rx checksum offloading at run time,
    'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=true'

The Tx checksum offloading support is not implemented due to the following
reasons.

1) Checksum offloading and vectorization are mutually exclusive in DPDK poll
mode driver. Vector packet processing is turned OFF when checksum offloading
is enabled which causes significant performance drop at Tx side.

2) Normally, OVS generates checksum for tunnel packets in software at the
'tunnel push' operation, where the tunnel headers are created. However
enabling Tx checksum offloading involves,

*) Mark every packets for tx checksum offloading at 'tunnel_push' and
recirculate.
*) At the time of xmit, validate the same flag and instruct the NIC to do the
checksum calculation.  In case NIC doesnt support Tx checksum offloading,
the checksum calculation has to be done in software before sending out the
packets.

No significant performance improvement noticed with Tx checksum offloading
due to the e overhead of additional validations + non vector packet processing.
In some test scenarios, it introduces performance drop too.

Rx checksum offloading still offers 8-9% of improvement on VxLAN tunneling
decapsulation even though the SSE vector Rx function is disabled in DPDK poll
mode driver.

Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
Acked-by: Jesse Gross <jesse@kernel.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2017-01-04 01:10:35 -08:00
Ilya Maximets
2a21e75796 netdev: Set the default number of queues at removal from the database
Expected behavior for attribute removal from the database is
resetting it to default value. Currently this doesn't work for
n_rxq/n_txq options of pmd netdevs (last requested value used):

	# ovs-vsctl set interface dpdk0 options:n_rxq=4
	# ovs-vsctl remove interface dpdk0 options n_rxq
	# ovs-appctl dpif/show | grep dpdk0
	  <...>
	  dpdk0 1/1: (dpdk: configured_rx_queues=4, <...> \
	                    requested_rx_queues=4,  <...>)

Fix that by using NR_QUEUE or 1 as a default value for 'smap_get_int'.

Fixes: a14b8947fd13 ("dpif-netdev: Allow different numbers of
                      rx queues for different ports.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-12-09 18:15:51 -08:00
Ilya Maximets
569c26da9d netdev-dpdk: Don't use dev->vhost_id without mutex.
The copy should be used here.
Additionally, 'strlen' changed to the faster check.

Fixes: 821b86649a90 ("netdev-dpdk: Don't try to unregister empty vhost_id.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-12-06 09:38:35 -08:00
Ilya Maximets
821b86649a netdev-dpdk: Don't try to unregister empty vhost_id.
If 'vhost-server-path' not provided for vhostuserclient port,
'netdev_dpdk_vhost_destruct()' will try to unregister an empty string.
This leads to error message in log:

netdev_dpdk|ERR|vhost2: Unable to unregister vhost driver for socket ''.

CC: Ciara Loftus <ciara.loftus@intel.com>
Fixes: 2d24d165d6a5 ("netdev-dpdk: Add new 'dpdkvhostuserclient' port type")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-12-02 10:59:48 -08:00
Binbin Xu
ca3d4f55fb netdev-dpdk: Assign value '0' to unsupported netdev features
When OVS&DPDK is used, DPDK doesn't support features 'advertised',
'supported' and 'peer'. If a physical port added to bridge, features
descirbed above can't be assigned, and the values are random.

Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-11-28 14:54:23 -08:00
Ciara Loftus
04de404e1b netdev-dpdk: Add support for DPDK 16.11
This commit announces support for DPDK 16.11. Compatibility with DPDK
v16.07 is not broken yet thanks to only minor code changes being needed
for the upgrade.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-11-25 14:57:29 +01:00
Ilya Maximets
451f26fdce netdev-dpdk: Return rx/tx queue sizes only for ETH devices.
'dev->requested_{rxq,txq}_size' and 'dev->{rxq,txq}_size' are
relevant only for DPDK_DEV_ETH devices and should be skipped
in 'netdev_dpdk_get_config()' for other ports.

CC: Ciara Loftus <ciara.loftus@intel.com>
Fixes: b685696b8c81 ("netdev-dpdk: Allow configurable queue sizes for 'dpdk' ports")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-11-15 17:31:06 -08:00
xu.binbin1@zte.com.cn
314fb5ad07 netdev-dpdk: Fix the issue of physical port's admin state configuration
When we set physical port's admin state via ovs-appctl, the application
seems to work and returns "OK". But the application doesn't work perfectly,
the state stored in database doesn't change.

Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-11-15 13:55:53 -08:00
xu.binbin1@zte.com.cn
b3722dca6f netdev-dpdk: Can't set specified vhost port's admin state
When we set a vhost port's admin state via ovs-appctl, the
application doesn't work and returns the information
"Not a DPDK Interface".

Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-11-15 13:55:53 -08:00
Daniele Di Proietto
44975bb06f netdev-dpdk: Fix crash in QoS.
qos_conf can be NULL.  This can be easily reproduced by setting egress
QoS on a port:

```
ovs-vsctl set port dpdk2 qos=@newqos -- --id=@newqos create qos
type=egress-policer other-config:cir=46000000 other-config:cbs=2048
```

Reported-by: Ian Stokes <ian.stokes@intel.com>
Fixes: 78bd47cf44a5 ("netdev-dpdk: Use RCU for egress QoS.")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
2016-11-14 17:24:34 -08:00
Ciara Loftus
a0cbc627a6 dpdk: Fix DPDK pdump compilation
The rte_pdump header file was not included in the file that requires it.
Fix this.

Fixes: 01961bbdd34a ("dpdk: New module with some code from netdev-dpdk.")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-10-13 11:08:30 -07:00
Daniele Di Proietto
01961bbdd3 dpdk: New module with some code from netdev-dpdk.
There's a lot of code in netdev-dpdk which is not at all related to the
netdev interface, mostly the library initialization code.

This commit moves it to a new 'dpdk' module, to simplify 'netdev-dpdk'.

Also a new module 'dpdk-stub' is introduced to implement some functions
when DPDK is not available.  This replaces the old 'netdev-nodpdk'
module.

Some redundant includes are removed or reorganized as a consequence.

No functional change.

CC: Aaron Conole <aconole@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:31:06 -07:00
Daniele Di Proietto
05b49df6e5 netdev-dpdk: Change vlog module name to 'netdev_dpdk'.
It is customary to have the vlog module name similar to the filename.
Plus a following commit will introduce a 'dpdk' module.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:29:35 -07:00
Daniele Di Proietto
ecc1a34ea5 netdev-dpdk: Use init() function to initialize classes.
It's better to use the classes init() functions to perform
initialization required for classes.

This will make it easier to move dpdk_init__() to a separate module in a
future commit.

No functional change.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
65e19e70e3 netdev-dpdk: Remove useless 'rte_eal_init_ret'.
If rte_eal_init() fails, we do not register the DPDK netdev classes,
therefore it's impossible to reach the classes construct functions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
1166b0d820 netdev-dpdk: Remove useless nonpmd_mempool_mutex.
Since DPDK commit 30e639989227("mempool: support non-EAL thread"),
non-EAL threads can use the mempool API safely.  Plus, nonpmd threads
access to netdev is already serialized with 'non_pmd_mutex' in
dpif-netdev.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
0c6f39e5b1 netdev-dpdk: Use xasprintf() when possible.
We're in the slowpath.  I find it easier to allocate and free memory,
than to handle snprintf() error conditions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:27:16 -07:00
Daniele Di Proietto
eff23640bd netdev-dpdk: Do not abort if out of hugepage memory.
We can run out of hugepage memory coming from rte_*alloc() more easily
than heap coming from malloc().

Therefore:

* We should not use hugepage memory if we're going to access it only in
  the slowpath.
* We shouldn't abort if we're out of hugepage memory.
* We should gracefully handle out of memory conditions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 16:26:59 -07:00
Daniele Di Proietto
819f13bd39 netdev-dpdk: Acquire dev->stats_lock only once.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 15:30:04 -07:00
Daniele Di Proietto
78bd47cf44 netdev-dpdk: Use RCU for egress QoS.
I think it's clearer to use RCU than to check for a pointer twice in the
fast path (before and after taking the spinlock). Now the spinlock is
integrated into 'qos_conf'.

'qos_conf' objects cannot be modified, so, instead of having
'qos_set()', we now have 'qos_is_equal()', which tells us if an object
must be destroyed and recreated.

With this patch we also avoid passing the netdev parameter to qos ops,
since it was unused most of the times.

Lastly, some duplication is removed.

CC: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 15:25:45 -07:00
Daniele Di Proietto
2ae3d5421b netdev-dpdk: Refactor dpdk_mp_get().
The error handling path in dpdk_mp_get() is getting complicated, it
even requires a boolean variable.

Simplify it by extracting the function dpdk_mp_create().

CC: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
2016-10-12 15:16:34 -07:00
Ilya Maximets
b614c894ee netdev-dpdk: Configure flow control only when necessary.
It is not necessary to touch the physical device each time, if the
configuration has not been changed. Also, few style issues fixed.

Thread-safety annotation added to 'dpdk_set_rxq_config()'. It was
missed while previous refactoring of the flow control configuration.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: Sugesh Chandran <sugesh.chandran@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-30 10:59:00 -07:00
Ciara Loftus
b685696b8c netdev-dpdk: Allow configurable queue sizes for 'dpdk' ports
The 'options:n_rxq_desc' and 'n_txq_desc' fields allow the number of rx
and tx descriptors for dpdk ports to be modified. By default the values
are set to 2048, but can be modified to an integer between 1 and 4096
that is a power of two. The values can be modified at runtime, however
require the NIC to restart when changed.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-30 10:58:39 -07:00
Mark Kavanagh
58be5c0eec netdev-dpdk: Fix coding style
Coding style violations of the following conventions are present in netdev-dpdk.c:
    - limit lines to 79 characters
    - put a space after (but not before) the "sizeof" keyword
    - put a space between the () used in a cast and the
      expression whose type is cast: (void *) 0.

Resolve occurrences of each, and any other minor style infractions.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-29 11:41:33 -07:00
Mark Kavanagh
2391135ca7 netdev-dpdk: consistent naming for mbuf variables
Pointers to struct rte_mbuf are typically denoted within functions as
'pkt'; similarly, arrays of, and pointer-to-pointer to, struct rte_mbuf
are denoted by 'pkts'.

Update discrepancies to the above convention for consistency.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-29 11:41:07 -07:00
Ilya Maximets
c2adb102e2 netdev-dpdk: Introduce dpdk_mp_mutex.
'dpdk_mutex' protects two independent things: list of dpdk devices
and list of memory pools. Let's spit it in two to avoid global blocking
inside 'netdev_dpdk.*_reconfigure()' as possible.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-29 11:04:11 -07:00
Ilya Maximets
4196454379 netdev-dpdk: More correct log message on vhost_driver_unregister failure.
Current error message incorrect for the client mode.

Fixes: c1ff66ac80b5 ("netdev-dpdk: vHost client mode and reconnect")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-23 13:27:21 -07:00
Ilya Maximets
6881885a3b netdev-dpdk: Add missed lock in set_config for vhost client mode.
'vhost_driver_flags' and 'vhost_id' are mutable and must be protected
by 'dev->mutex'.

Fixes: 2d24d165d6a5 ("netdev-dpdk: Add new 'dpdkvhostuserclient' port type")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-23 13:23:14 -07:00
Ilya Maximets
5f88de0d9f netdev-dpdk: Fix memory leak in dpdk_mp_{get, put}().
'dmp' should be freed on failure and on put.

Fixes: 8a9562d21a40 ("dpif-netdev: Add DPDK netdev.")
Fixes: 8d38823bdf8b ("netdev-dpdk: fix memory leak")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-19 14:32:52 -07:00
Ciara Loftus
2d24d165d6 netdev-dpdk: Add new 'dpdkvhostuserclient' port type
The 'dpdkvhostuser' port type no longer supports both server and client
mode. Instead, 'dpdkvhostuser' ports are always 'server' mode and
'dpdkvhostuserclient' ports are always 'client' mode.

Suggested-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-09-19 14:02:06 -07:00
Ciara Loftus
5b9bf9e067 netdev-dpdk: Fix occurance of error log
If NUMA information can't be derived from a vHost User device, only
print an error if the VHOST_NUMA option is enabled in DPDK. Otherwise
'fail' silently.

Fixes: 0a0f39df1d5a ("netdev-dpdk: Add support for DPDK 16.07")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reported-by: Ian Stokes <ian.stokes@intel.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-18 17:04:27 -07:00
Ilya Maximets
6b094bf47b netdev-dpdk: Simplify send function for ETH devices.
'netdev_dpdk_send__()' function can be greatly simplified by using
recently introduced 'netdev_dpdk_filter_packet_len()'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-18 13:15:52 -07:00
Ilya Maximets
c6ec9d176d netdev-dpdk: Fix vHost stats.
This patch introduces function 'netdev_dpdk_filter_packet_len()' which is
intended to find and remove all packets with 'pkt_len > max_packet_len'
from the Tx batch.

It fixes inaccurate counting of 'tx_bytes' in vHost case if there was
dropped packets and allows to simplify send function.

Fixes: 0072e931b207 ("netdev-dpdk: add support for jumbo frames")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-18 13:15:52 -07:00
Ciara Loftus
c1ff66ac80 netdev-dpdk: vHost client mode and reconnect
Until now, vHost ports in OVS have only been able to operate in 'server'
mode whereby OVS creates and manages the vHost socket and essentially
acts as the vHost 'server'. With this commit a new mode, 'client' mode,
is available. In this mode, OVS acts as the vHost 'client' and connects
to the socket created and managed by QEMU which now acts as the vHost
'server'. This mode allows for reconnect capability, which allows a
vHost port to resume normal connectivity in event of switch reset.

By default dpdkvhostuser ports still operate in 'server' mode. That is
unless a valid 'vhost-server-path' is specified for a device like so:

ovs-vsctl set Interface dpdkvhostuser0
options:vhost-server-path=/path/to/socket

'vhost-server-path' represents the full path of the vhost user socket
that has been or will be created by QEMU. Once specified, the port stays
in 'client' mode for the remainder of its lifetime.

QEMU v2.7.0+ is required when using OVS in vHost client mode and QEMU in
vHost server mode.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-15 17:29:12 -07:00
Ciara Loftus
53f50d2400 netdev-dpdk: Consistent naming for vhost
A mix of vhost_user_ and vhost_ is used when naming vhost functions. The
'user_' has been dropped for consistency. Also remove empty init
functions for netdev dpdk classes.

Suggested-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Daniele Di Proietto <diproiettod at vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
2016-08-15 17:29:12 -07:00
Ciara Loftus
4198764443 netdev-dpdk: Remove dpdkvhostcuse ports
This commit removes the 'dpdkvhostcuse' port type from the userspace
datapath. vhost-cuse ports are quickly becoming obsolete as the
vhost-user port type begins to support a greater feature-set thanks to
the addition of things like vhost-user multiqueue and potential
upcoming features like vhost-user client-mode and vhost-user reconnect.
The feature is also expected to be removed from DPDK soon.

One potential drawback of the removal of this support is that a
userspace vHost port type is not available in OVS for use with older
versions of QEMU (pre v2.2). Considering v2.2 is nearly two years old
this should however be a low impact change.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2016-08-15 17:29:12 -07:00
Ciara Loftus
c3d062a777 netdev-dpdk: Do not attempt to initialise flow control for 'dpdkr' ports
Only 'dpdk' ports support flow control. This patch stops 'dpdkr' ports
from attempting to initialise this feature as this port type does not
support it.

Fixes: 9fd39370c12c ("netdev-dpdk: Add Flow Control support.")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-15 13:10:18 -07:00
Ciara Loftus
7cd1261d13 netdev-dpdk: Use rte_eth_is_valid_port instead of manual check
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Mauricio Vásquez B <mauricio.vasquez@polito.it>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-15 13:10:18 -07:00
Mark Kavanagh
0072e931b2 netdev-dpdk: add support for jumbo frames
Add support for Jumbo Frames to DPDK-enabled port types,
using single-segment-mbufs.

Using this approach, the amount of memory allocated to each mbuf
to store frame data is increased to a value greater than 1518B
(typical Ethernet maximum frame length). The increased space
available in the mbuf means that an entire Jumbo Frame of a specific
size can be carried in a single mbuf, as opposed to partitioning
it across multiple mbuf segments.

The amount of space allocated to each mbuf to hold frame data is
defined dynamically by the user with ovs-vsctl, via the 'mtu_request'
parameter.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
[diproiettod@vmware.com rebased]
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-12 19:33:56 -07:00
Daniele Di Proietto
56abcf497b vswitchd: Introduce 'mtu_request' column in Interface.
The 'mtu_request' column can be used to set the MTU of a specific
interface.

This column is useful because it will allow changing the MTU of DPDK
devices (implemented in a future commit), which are not accessible
outside the ovs-vswitchd process, but it can be used for kernel
interfaces as well.

The current implementation of set_mtu() in netdev-dpdk is removed
because it's broken.  It will be reintroduced by a subsequent commit on
this series.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2016-08-12 19:32:12 -07:00
Ciara Loftus
4b88d6787d netdev-dpdk: add DPDK pdump capability
This commit provides the ability to 'listen' on DPDK ports and save
packets to a pcap file with a DPDK app that uses the librte_pdump
library. One such app is the 'pdump' app that can be found in the DPDK
'app' directory. Instructions on how to use this can be found in
INSTALL.DPDK-ADVANCED.md

Pdump capability in OVS with DPDK will only be initialised if the
CONFIG_RTE_LIBRTE_PMD_PCAP=y and CONFIG_RTE_LIBRTE_PDUMP=y options are
set in DPDK. libpcap is required if the above configuration is used.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-12 17:56:43 -07:00
Ilya Maximets
dd52de45b7 netdev-dpdk: vhost: Fix double free and use after free with QoS.
While using QoS with vHost interfaces 'netdev_dpdk_qos_run__()' will
free mbufs while executing 'netdev_dpdk_policer_run()'. After
that same mbufs will be freed at the end of '__netdev_dpdk_vhost_send()'
if 'may_steal == true'. This behaviour will break mempool.

Also 'netdev_dpdk_qos_run__()' will free packets even if we shouldn't
do this ('may_steal == false'). This will lead to using of already freed
packets by the upper layers.

Fix that by copying all packets that we can't steal like it done
for DPDK_DEV_ETH devices and freeing only packets not freed by QoS.

Fixes: 0bf765f753fd ("netdev_dpdk.c: Add QoS functionality.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-10 18:42:57 -07:00
Ilya Maximets
7f5f2bd0ce netdev-dpdk: Avoid reconfiguration on reconnection of same vhost device.
Binding/unbinding of virtio driver inside VM leads to reconfiguration
of PMD threads. This behaviour may be abused by executing bind/unbind
in an infinite loop to break normal networking on all ports attached
to the same instance of Open vSwitch.

Fix that by avoiding reconfiguration if it's not necessary.
Number of queues will not be decreased to 1 on device disconnection but
it's not very important in comparison with possible DOS attack from the
inside of guest OS.

Fixes: 81acebdaaf27 ("netdev-dpdk: Obtain number of queues for vhost
                      ports from attached virtio.")
Reported-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-09 17:35:50 -07:00
Ian Stokes
7ea266e953 netdev-dpdk: Fix egress policer error detection bug.
When egress policer is set as a QoS type for a port, an error may occur during
setup if incorrect parameters are used for the rte_meter. If this occurs
the egress policer construct and set functions should free any allocated
memory relevant to the policer and set the QoS configuration pointer to
null. The netdev_dpdk_set_qos function should check the error value returned
for any QoS construct/set calls with an assertion to avoid segfault.
Also this commit modifies egress_policer_qos_set() to correctly lock the QoS
spinlock while the egress policer configuration is updated to avoid
segfault.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-09 15:01:10 -07:00
Bhanuprakash Bodireddy
27373420c8 netdev-dpdk: Fix dead initialization reported by clang.
Clang reports that value stored to 'tok' during initialization is never
read.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-09 14:54:20 -07:00