mirror of https://github.com/openvswitch/ovs synced 2025-08-22 18:07:40 +00:00

326 Commits

Author SHA1 Message Date
Daniele Di Proietto
033e9df25f netdev-dpdk: Refactor dpdk_class_init()
The following changes were made:

- Since we have two dpdk classes, we should split the initial operations needed
  by both classes from the initialization needed by each class.
- The dpdk_ring class does not need an initialization function: it has been
  removed. This also prevents many test cases from failing, because
  dpdk_ring_class_init() was printing an unexpected log message
  (OVS_VSWITCHD_START at tests/ofproto-macros.at:54 checks for a specific set of
  startup log messages).
- If the user doesn't pass the --dpdk option, we do not register the dpdk*
  classes.
- Do not call VLOG_ERR if there are 0 dpdk ethernet devices. OVS can now be used
  with dpdk_ring devices.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-07-17 11:16:32 -07:00
maryam.tahhan
95fb793ae7 netdev-dpdk: add dpdk rings to netdev-dpdk
Shared memory ring patch

This patch enables the client dpdk rings within the netdev-dpdk.  It adds
a new dpdk device called dpdkr (other naming suggestions?).  This allows
for the use of shared memory to communicate with other dpdk applications,
on the host or within a virtual machine.  Instructions for use are in
INSTALL.DPDK.

This has been tested on Intel multi-core platforms and with the client
application within the host.

Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-07-16 09:43:15 -07:00
Ryan Wilson
f98d78641c netdev-dpdk: Add OVS_UNLIKELY annotations in dpdk_do_tx_copy().
Since dropped packets due to large packet size or lack of memory
are unlikely, it is best to add OVS_UNLIKELY annotations to these
conditions.

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-30 13:48:49 -07:00
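
The annotation idea in f98d78641c can be sketched as follows. This is a minimal illustration, not the code from the commit: SKETCH_UNLIKELY is a hypothetical stand-in for OVS_UNLIKELY (assumed here to be built on the GCC/Clang __builtin_expect builtin), and SKETCH_MAX_PKT_LEN and sketch_count_oversized() are invented for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for OVS_UNLIKELY.  On GCC and Clang a branch hint
 * is commonly expressed with __builtin_expect; other compilers fall back
 * to the plain condition. */
#if defined(__GNUC__)
#define SKETCH_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#define SKETCH_UNLIKELY(cond) (!!(cond))
#endif

#define SKETCH_MAX_PKT_LEN 1518   /* Assumed size limit for this sketch. */

/* Walks a batch of packet lengths and returns how many would be dropped
 * for being too large.  The drop branch is marked as unlikely so the
 * compiler can lay out the common (non-drop) path first. */
static int
sketch_count_oversized(const size_t *lens, int n)
{
    int dropped = 0;

    for (int i = 0; i < n; i++) {
        if (SKETCH_UNLIKELY(lens[i] > SKETCH_MAX_PKT_LEN)) {
            dropped++;
            continue;
        }
        /* Common path: the packet would be copied and queued here. */
    }
    return dropped;
}
```

The hint changes only code layout and branch prediction hints, never the result of the condition.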
Ryan Wilson
175cf4de3f netdev-dpdk: Fix memory leak in dpdk_do_tx_copy().
This patch fixes a bug where, when rte_pktmbuf_alloc() failed, the
packets that had already been allocated successfully with
rte_pktmbuf_alloc() were not sent and leaked memory.

Also, as a byproduct of using a local variable to record dropped
packets, this reduces the locking of the netdev's mutex when
multiple packets are dropped in dpdk_do_tx_copy().

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-30 10:53:58 -07:00
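
The leak fix described in 175cf4de3f can be sketched roughly as below, assuming a batch of at most SKETCH_BATCH packets. Every name here is hypothetical: sketch_alloc() stands in for rte_pktmbuf_alloc() with an artificial budget, and sketch_tx() stands in for the transmit path that takes ownership of the buffers.

```c
#include <stdlib.h>

#define SKETCH_BATCH 32   /* Assumed maximum batch size for this sketch. */

/* Hypothetical pool with a fixed allocation budget. */
struct sketch_pool {
    int remaining;
};

/* Stand-in allocator: fails once the budget is used up. */
static void *
sketch_alloc(struct sketch_pool *pool)
{
    if (pool->remaining <= 0) {
        return NULL;
    }
    pool->remaining--;
    return malloc(64);
}

/* Stand-in transmit: takes ownership of the buffers and releases them. */
static void
sketch_tx(void **bufs, int n)
{
    for (int i = 0; i < n; i++) {
        free(bufs[i]);
    }
}

/* Copies up to 'n_pkts' packets.  On allocation failure the remaining
 * packets are counted as dropped in a local variable, and the buffers
 * that were already allocated are still handed to sketch_tx() instead
 * of being leaked.  Returns the number of dropped packets. */
static int
sketch_tx_copy(struct sketch_pool *pool, int n_pkts, int *n_sent)
{
    void *bufs[SKETCH_BATCH];
    int newcnt = 0;
    int dropped = 0;

    for (int i = 0; i < n_pkts; i++) {
        void *buf = i < SKETCH_BATCH ? sketch_alloc(pool) : NULL;

        if (!buf) {
            dropped += n_pkts - i;
            break;
        }
        bufs[newcnt++] = buf;
    }

    sketch_tx(bufs, newcnt);
    *n_sent = newcnt;
    return dropped;
}
```

Because the drop count is accumulated locally, a caller would only need to update the shared device statistics once per batch.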
Ryan Wilson
844f2d749a netdev-dpdk: Set current timestamp when flushing TX queue.
The current timestamp should be set every time the queue is flushed.
Thus, if DRAIN_TSC timer cycles have passed since the last timestamp,
the send queue should be flushed again.

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-30 10:53:58 -07:00
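
The timestamp behaviour described in 844f2d749a might look like the sketch below. It is an assumption-laden illustration: the struct, the function names, and the SKETCH_DRAIN_TSC value are invented, and the cycle counter is passed in as a plain integer rather than read from hardware.

```c
#include <stdint.h>

#define SKETCH_DRAIN_TSC 200000ULL   /* Assumed drain interval in cycles. */

/* Hypothetical transmit queue state. */
struct sketch_txq {
    int count;       /* Packets currently queued. */
    uint64_t tsc;    /* Cycle timestamp of the last flush. */
    int flushes;     /* Number of flushes, kept for demonstration. */
};

/* Flushes the queue and records the current timestamp, so that the
 * next drain interval is measured from this flush. */
static void
sketch_flush(struct sketch_txq *q, uint64_t now)
{
    q->count = 0;
    q->tsc = now;
    q->flushes++;
}

/* Queues 'n' packets and flushes if at least SKETCH_DRAIN_TSC cycles
 * have passed since the last recorded flush. */
static void
sketch_queue(struct sketch_txq *q, int n, uint64_t now)
{
    q->count += n;
    if (now - q->tsc >= SKETCH_DRAIN_TSC) {
        sketch_flush(q, now);
    }
}
```

If sketch_flush() did not update q->tsc, every later call would see a stale timestamp and flush unconditionally.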
Ryan Wilson
b170db2aaa netdev-dpdk: Refactor dpdk_queue_flush().
This patch refactors dpdk_queue_flush() to reuse code in
dpdk_queue_pkts().

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-30 10:53:53 -07:00
Polehn, Mike A
79f5354c4c dpdk: High speed PMD physical NIC queue size
Large TX and RX queues are needed for high speed 10 GbE physical NICs.
Observed a 250% zero-loss improvement over the small NIC queue test in a
port-to-port flow test.

Signed-off-by: Mike A. Polehn <mike.a.polehn@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-27 09:55:51 -07:00
Daniele Di Proietto
9477751009 netdev-dpdk: Disable NIC offloading and multiseg mbufs
We do not use any offloading (now) or multiple segments per packet, so
we might as well disable those features while configuring the NIC.

This could give performance improvements. For ixgbe, for example, this change
allows the driver to use a simpler tx routine, resulting in throughput
improvements (~7.5%).

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-25 10:54:04 -07:00
Daniele Di Proietto
a28ddd11c6 netdev-dpdk: Fix coding style in TX/RX conf structs
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-25 10:54:02 -07:00
Daniele Di Proietto
1ebfe1ac52 netdev-dpdk: Count and delete every dropped packet
Commit f4fd623c4c25 introduced a bug in netdev_dpdk_send(): if multiple
consecutive packets exceed MTU, only the first one is deleted and
counted.

This should fix the bug.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-25 10:54:00 -07:00
Pravin B Shelar
e381def971 lib: Rename ofp to buf.
dpif-packet contains an ofpbuf which points to packet data.  Here buf
is a better name than ofp.
The following patch renames all remaining instances of the ofp variable.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
2014-06-25 09:28:42 -07:00
Ben Pfaff
451450fa4b netdev-dpdk: Coding style improvements.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
2014-06-24 12:36:48 -07:00
Daniele Di Proietto
f4fd623c4c netdev: netdev_send accepts multiple packets
The netdev_send function has been modified to accept multiple packets, to
allow netdev providers to amortize locking and queuing costs.
This is especially true for netdev-dpdk.

Later commits exploit the new API.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:13 -07:00
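
The amortization argument in f4fd623c4c can be illustrated with the sketch below. It is not the netdev_send() API itself: struct sketch_netdev, its fields, and sketch_send_batch() are hypothetical, and packets are represented only by their lengths. The point is that per-packet decisions are made without the lock, and the shared counters are updated under a single lock acquisition per batch.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device with mutex-protected statistics. */
struct sketch_netdev {
    pthread_mutex_t mutex;
    size_t max_len;                    /* Largest packet accepted. */
    unsigned long tx_packets;
    unsigned long tx_dropped;
    unsigned long lock_acquisitions;   /* Kept for demonstration only. */
};

/* Sends a batch of 'cnt' packets described by their lengths.  Oversized
 * packets are counted as dropped, including runs of consecutive ones.
 * The mutex is taken once for the whole batch, not once per packet. */
static void
sketch_send_batch(struct sketch_netdev *dev, const size_t *lens, int cnt)
{
    unsigned long sent = 0;
    unsigned long dropped = 0;

    for (int i = 0; i < cnt; i++) {
        if (lens[i] > dev->max_len) {
            dropped++;
        } else {
            sent++;   /* A real provider would queue the packet here. */
        }
    }

    pthread_mutex_lock(&dev->mutex);
    dev->lock_acquisitions++;
    dev->tx_packets += sent;
    dev->tx_dropped += dropped;
    pthread_mutex_unlock(&dev->mutex);
}
```

With a one-packet-per-call interface, the same four packets would cost four lock acquisitions instead of one.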
Daniele Di Proietto
910885540a dpif-netdev: use dpif_packet structure for packets
This commit introduces a new data structure used for receiving packets from
netdevs and passing them to dpifs.
The purpose of this change is to allow storing some private data for each
packet. The subsequent commits make use of it.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:12 -07:00
Daniele Di Proietto
9441caf372 vswitchd: skip right number of arguments in dpdk_init()
rte_eal_init() returns the number of parsed dpdk arguments to skip.
dpdk_init() should add 1 to that number, because it has already skipped
the "--dpdk" argument itself.

This patch also makes sure the program name is ovs-vswitchd in
rte_eal_init() and proctitle_init().

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:09 -07:00
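
The off-by-one described in 9441caf372 can be sketched as below. sketch_eal_init() is a hypothetical stand-in for rte_eal_init(): it is assumed to consume arguments up to a "--" terminator and to return the index of the last argument it consumed. sketch_dpdk_init() is likewise an illustration of the described logic, not the function from the commit.

```c
#include <string.h>

/* Hypothetical stand-in for rte_eal_init().  Assumed to parse arguments
 * after argv[0] up to and including "--", returning the index of the
 * last argument it consumed. */
static int
sketch_eal_init(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (!strcmp(argv[i], "--")) {
            return i;
        }
    }
    return argc - 1;
}

/* Returns how many leading arguments the caller should skip.  When
 * "--dpdk" is present it is overwritten with the program name and
 * skipped before the EAL-style parser runs, so the parser's result has
 * to be increased by one to account for it. */
static int
sketch_dpdk_init(int argc, char **argv)
{
    if (argc < 2 || strcmp(argv[1], "--dpdk")) {
        return 0;
    }

    argv[1] = argv[0];   /* Keep the program name in the argv[0] slot. */
    argc--;
    argv++;

    return sketch_eal_init(argc, argv) + 1;
}
```

A caller would then advance with argc -= n and argv += n, where n is the returned value, before parsing its own options.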
Ryan Wilson
143859ec63 dpif-netdev: Upcall: Remove an extra memcpy of packet data.
When a bridge of datapath type netdev receives a packet, it
copies the packet from the NIC to a buffer in userspace.
Currently, when making an upcall, the packet is again copied
to the upcall's buffer. However, this extra copy is not
necessary when the datapath exists in userspace as the upcall
can directly access the packet data.

This patch eliminates this extra copy of the packet data in
most cases. In cases where the packet may still be used later
by callers of dp_netdev_execute_actions, making a copy of the
packet data is still necessary.

This patch also adds a dpdk_buf field to 'struct ofpbuf' when
using DPDK. This field holds a pointer to the allocated DPDK
buffer in the rte_mempool. Thus, an upcall packet ofpbuf
allocated on the stack can now share data and free memory of
a rte_mempool allocated ofpbuf.

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-04 15:48:30 -07:00
Daniele Di Proietto
d221ffa1e1 netdev-dpdk: create queues on configured NUMA node
This patch makes sure that the tx and rx queues are allocated on the NUMA socket
chosen at device initialization time, instead of the NUMA socket 0.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-04 15:39:49 -07:00
Daniele Di Proietto
7d08d53ed5 netdev-dpdk: receive up to NETDEV_MAX_RX_BATCH
As per the netdev-provider interface, netdev_dpdk_rxq_recv should receive at
most NETDEV_MAX_RX_BATCH packets.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-04 15:38:45 -07:00
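
The batch limit from 7d08d53ed5 amounts to clamping the burst size requested from the driver. The sketch below assumes a limit of 256 for SKETCH_MAX_RX_BATCH and uses a hypothetical sketch_rx_burst() in place of a real burst-receive call; neither is taken from the commit.

```c
#define SKETCH_MAX_RX_BATCH 256   /* Assumed stand-in for NETDEV_MAX_RX_BATCH. */

/* Hypothetical receive queue that only tracks how many packets wait. */
struct sketch_rxq {
    int available;
};

/* Stand-in for a burst receive: hands out at most 'max_pkts' packets. */
static int
sketch_rx_burst(struct sketch_rxq *rxq, int max_pkts)
{
    int n = rxq->available < max_pkts ? rxq->available : max_pkts;

    rxq->available -= n;
    return n;
}

/* Receives at most SKETCH_MAX_RX_BATCH packets per call, even if the
 * hardware queue could deliver a larger burst. */
static int
sketch_rxq_recv(struct sketch_rxq *rxq, int queue_len)
{
    int limit = queue_len < SKETCH_MAX_RX_BATCH ? queue_len
                                                : SKETCH_MAX_RX_BATCH;

    return sketch_rx_burst(rxq, limit);
}
```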
Daniele Di Proietto
a715f600c3 netdev-dpdk: use defined values for queues length
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-05-24 10:01:22 -07:00
Ben Pfaff
8ba0a5227f ovs-thread: Make caller provide thread name when creating a thread.
Thread names are occasionally very useful for debugging, but from time to
time we've forgotten to set one.  This commit adds the new thread's name
as a parameter to the function to start a thread, to make that mistake
impossible.  This also simplifies code, since two function calls become
only one.

This makes a few other changes to the thread creation function:

    * Since it is no longer a direct wrapper around a pthread function,
      rename it to avoid giving that impression.

    * Remove 'pthread_attr_t *' param that every caller supplied as NULL.

    * Change 'pthread *' parameter into a return value, for convenience.

The system-stats code hadn't set a thread name, so this fixes that issue.

This patch is a prerequisite for making RCU report the name of a thread
that is blocking RCU synchronization, because the easiest way to do that is
for ovsrcu_quiesce_end() to record the current thread's name.
ovsrcu_quiesce_end() is called before the thread function is called, so it
won't get a name set within the thread function itself.  Setting the thread
name earlier, as in this patch, avoids the problem.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
2014-04-28 15:25:49 -07:00
Alex Wang
045c0d1a77 netdev-dpdk: Indicate the change of etheraddr and mtu.
This commit makes the netdev-dpdk module signal the change of
etheraddr and mtu by changing the global sequence number and
incrementing its 'change_seq'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-10 12:55:28 -07:00
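
The signalling described in 045c0d1a77 can be sketched as below. All names are hypothetical: sketch_global_seq stands in for the global sequence number, and skipping zero in the per-device counter is an assumption made for this sketch so that zero can mean "never changed".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical device state. */
struct sketch_netdev {
    uint8_t etheraddr[6];
    int mtu;
    uint64_t change_seq;   /* Per-device change counter. */
};

/* Stand-in for the global sequence number that watchers poll. */
static uint64_t sketch_global_seq;

/* Records a change both globally and on the device. */
static void
sketch_netdev_changed(struct sketch_netdev *dev)
{
    sketch_global_seq++;
    dev->change_seq++;
    if (!dev->change_seq) {
        dev->change_seq++;   /* Assumed: zero is reserved. */
    }
}

/* Updates the MTU and signals only when the value actually changes. */
static void
sketch_set_mtu(struct sketch_netdev *dev, int mtu)
{
    if (dev->mtu != mtu) {
        dev->mtu = mtu;
        sketch_netdev_changed(dev);
    }
}

/* Updates the Ethernet address and signals only on a real change. */
static void
sketch_set_etheraddr(struct sketch_netdev *dev, const uint8_t mac[6])
{
    if (memcmp(dev->etheraddr, mac, 6)) {
        memcpy(dev->etheraddr, mac, 6);
        sketch_netdev_changed(dev);
    }
}
```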
Alex Wang
3e912ffcbb netdev: Add 'change_seq' back to netdev.
This commit can be seen as a partial revert of commit
da4a619179d (netdev: Globally track port status changes)
by adding the 'change_seq' to 'struct netdev'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-10 12:55:28 -07:00
Pravin Shelar
b3cd9f9d6a netdev-dpdk: Remove alloc from packet recv.
On DPDK packet recv, ovs is given a pointer to an mbuf which has
information about a packet, for example a pointer to the data and its size.
By moving the mbuf to an ofpbuf we can let dpdk allocate the ofpbuf and
pass that to ovs for processing the packet.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-03-30 06:26:11 -07:00
Pravin Shelar
1f317cb5c2 ofpbuf: Introduce access api for base, data and size.
These functions will be used by later patches.  The following patch
does not change functionality.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-30 06:18:43 -07:00
Pravin
8617affff4 netdev-dpdk: Use multiple core for dpdk IO.
DPDK needs _lcore_id to be set in order to use multiple cores.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
8a9562d21a dpif-netdev: Add DPDK netdev.
The following patch adds a DPDK netdev-class to the userspace datapath. Now
OVS can use a DPDK port for IO by just configuring the DPDK port and then
adding a dpdk type port to the userspace datapath.

Refer to INSTALL.DPDK doc for further info.

This is based on a patch from Gerald Rogers.

Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00