mir/ovs - ovs - Mike's Git repositories

mir/ovs

mirror of https://github.com/openvswitch/ovs synced 2025-08-23 10:28:00 +00:00

Author	SHA1	Message	Date
Daniele Di Proietto	bd8baf47a1	netdev-dpdk: Fix sparse and clang warnings Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2015-06-25 11:21:38 -07:00
Wei li	dc6ba5dc8b	netdev-dpdk: Do not flush tx queue which is shared among CPUs since it is always flushed When tx queue is shared among CPUS,the pkts always be flush in 'netdev_dpdk_eth_send'. So it is unnecessarily for flushing in netdev_dpdk_rxq_recv Otherwise tx will be accessed without locking. Signed-off-by: Wei li <liw@dtdream.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2015-06-25 11:21:38 -07:00
Ciara Loftus	7d1ced0177	netdev-dpdk: add dpdk vhost-user ports This patch adds support for a new port type to the userspace datapath called dpdkvhostuser. A new dpdkvhostuser port will create a unix domain socket which when provided to QEMU is used to facilitate communication between the virtio-net device on the VM and the OVS port on the host. vhost-cuse ('dpdkvhost') ports are still available as 'dpdkvhostcuse' ports and will be enabled if vhost-cuse support is detected in the DPDK build specified during compilation of the switch. Otherwise, vhost-user ports are enabled. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2015-06-14 20:36:52 -07:00
Daniele Di Proietto	a0cb2d66f5	netdev-dpdk: Adapt the requested number of tx and rx queues. This commit changes the semantics of 'netdev_set_multiq()' to allow OVS DPDK to run on device with limited multi queue support. * If a netdev doesn't have the requested number of rxqs it can simply inform the datapath without failing. * If a netdev doesn't have the requested number of txqs it should try to create as many as possible and use locking. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2015-05-22 11:28:19 -07:00
Daniele Di Proietto	45d947c400	netdev-dpdk: Use specific spinlock for stats. Right now ethernet and ring devices use a mutex, while vhost devices use a mutex or a spinlock to protect statistics. This commit introduces a single spinlock that's always used for stats updates. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2015-05-22 11:28:19 -07:00
Daniele Di Proietto	d5c199ea7f	netdev-dpdk: Properly support non pmd threads. We used to reserve DPDK lcore 0 for non pmd operations, making it difficult to use core 0 for packet processing. DPDK 2.0 properly support non EAL threads with lcore LCORE_ID_ANY. Using non EAL threads for non pmd threads, we do not need to reserve any core for non pmd operations Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2015-05-22 11:28:19 -07:00
Daniele Di Proietto	bd5131ba76	ovs-numa: Change 'core_id' to unsigned. DPDK lcore_id is unsigned. We need to support big values like LCORE_ID_ANY (=UINT32_MAX). Therefore I am changing the type everywhere in OVS. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2015-05-22 11:28:19 -07:00
Kevin Traynor	9154f798ef	netdev-dpdk: Use default NIC configuration. This patch simplifies Rx/Tx NIC configuration by removing custom values and using the defaults provided by the DPDK PMDs. This also enables Rx vectorisation which improves performance. Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>	2015-05-21 13:48:18 -07:00
Ethan Jackson	cd159f1a82	dpdk: Ditch MAX_PKT_BURST macro. The MAX_PKT_BURST and NETDEV_MAX_RX_BATCH macros had a confusing relationship. They basically purport to do the same thing, making it unclear which is the source of truth. Furthermore, while NETDEV_MAX_RX_BATCH was 256, MAX_PKT_BURST was 32, meaning we never process a batch larger than 32 packets further adding to the confusion. This patch resolves the issue by removing MAX_PKT_BURST completely, and shrinking the new NETDEV_MAX_BURST macro to only 32. This should have no change in the execution path except shrinking a couple of structs and memory allocations (can't hurt). Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>	2015-05-19 14:47:00 -07:00
Ethan Jackson	bce01e3a89	netdev-dpdk: Fix sparse warnings. These are all minor style issues. Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>	2015-05-19 14:47:00 -07:00
Kevin Traynor	95e9881f84	netdev-dpdk: Add vhost enqueue retries. The max allowed burst size for a single vhost enqueue is 32. This code facilitates trying to send greater than the burst size of packets to the vhost interface by adding a retry loop and calling vhost enqueue multiple times. As this could potentially block, a timeout is added. Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2015-05-11 21:58:14 -07:00
Kevin Traynor	4345e1b5bf	netdev-dpdk: Change phy rx burst size. Change phy rx burst size from 192 to 32. This aligns the burst size with the other dpdk interfaces and significantly improves performance when forwarding to dpdk vhost ports. Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2015-05-11 21:58:12 -07:00
Mark Kavanagh	543342a41c	DPDK: add support for v2.0.0 Update relevant artifacts to add support for DPDK v2.0.0 - INSTALL.DPDK.md - travis build script - acinclude.m4: add 'mssse3' flag to OVS_CFLAGS - netdev-dpdk: fix build with unified offload types in DPDK v2.0.0 Note that this breaks compatibility with DPDK v1.8.0 Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Panu Matilainen <pmatilai@redhat.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2015-04-29 20:49:37 -07:00
Mark D. Gray	1b99bb0552	netdev-dpdk: Reset RSS hash on transmit When using DPDK rings (dpdkr port type), packet buffers get shared to consumers of the rings (e.g. Virtual Machines). The packet buffers also include the RSS hash. This is a hash of a number of fields in the packet and is used in order to do a fast lookup in the EMC. However, if a consumer of the packet modifies the packet without regenerating the RSS hash, the EMC will use the same hash for lookup even though the packet may belong to a different flow. This would cause unnecessary collisions in the EMC reducing performance in the presence of multiple flows. To avoid receiving an incorrect RSS hash on reception from a DPDK ring, the RSS hash needs to be reset on transmission. This will reduce performance of the forwarding path as the RSS hash will need to calculated for every packet received from an dpdkr but will behave correctly in the presence of a large number of flows that get modified by the consumer of a DPDK ring Signed-off-by: Mark D. Gray <mark.d.gray@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2015-04-22 20:09:52 -07:00
Kevin Traynor	618f44f7a4	netdev-dpdk: Put cuse thread into quiescent state. ovsrcu_synchronize() is used when setting virtio_dev to NULL. This results in an ovsrcu_quiesce_end() call which means the cuse thread may not go into quiescent state again for an indefinite time. Add an ovsrcu_quiesce_start() call to prevent this. Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2015-03-27 11:12:39 -07:00
Daniele Di Proietto	da79ce2b71	netdev-dpdk: create smaller mempools in case of failure If rte_mempool_create() fails with ENOMEM, try asking for a smaller mempools. This patch enables OVS DPDK to run on systems without 1GB hugepages Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2015-03-24 14:52:01 -07:00
Kevin Traynor	58397e6c1e	netdev-dpdk: add dpdk vhost-cuse ports This patch adds support for a new port type to userspace datapath called dpdkvhost. This allows KVM (QEMU) to offload the servicing of virtio-net devices to its associated dpdkvhost port. Instructions for use are in INSTALL.DPDK. This has been tested on Intel multi-core platforms and with clients that have virtio-net interfaces. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2015-03-19 20:26:03 -07:00
Mark Kavanagh	b8e57534ec	lib: upgrade to DPDK v1.8.0 DPDK v1.8.0 makes significant changes to struct rte_mbuf, including removal of the 'pkt' and 'data' fields. The latter, formally a pointer, is now calculated via an offset from the start of the segment buffer. So now dp_packet data is also stored as offset from base pointer. Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Signed-off-by: Rory Sexton <rory.sexton@intel.com> Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2015-03-04 10:05:39 -08:00
Pravin B Shelar	cf62fa4c70	dp-packet: Remove ofpbuf dependency. Currently dp-packet make use of ofpbuf for managing packet buffers. That complicates ofpbuf, by making dp-packet independent of ofpbuf both libraries can be optimized for their own use case. This avoids mapping operation between ofpbuf and dp_packet in datapath upcalls. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2015-03-03 13:37:37 -08:00
Pravin B Shelar	e14deea0bd	dpif_packet: Rename to dp_packet dp_packet is short and better name for datapath packet structure. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>	2015-03-03 13:37:34 -08:00
Mark D. Gray	ee32150e7f	netdev-dpdk: set_miimon should return EOPNOTSUPP. According to netdev-provider, this function should return EOPNOTSUPP if not supported. Signed-off-by: Mark D. Gray <mark.d.gray@intel.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2015-02-13 12:32:59 -08:00
Alex Wang	abb5943dbb	netdev-dpdk: Allow changing NON_PMD_CORE_ID for testing purpose. For testing purpose, developers may want to change the NON_PMD_CORE_ID and use a different core for non-pmd threads. Since the netdev-dpdk module is hard-coded to assert the non-pmd threads using core 0, such change will cause abortion of OVS. This commit fixes the assertion and allows changing NON_PMD_CORE_ID. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2015-02-10 17:38:03 -08:00
Thomas Graf	e6211adce4	lib: Move vlog.h to <openvswitch/vlog.h> A new function vlog_insert_module() is introduced to avoid using list_insert() from the vlog.h header. Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-12-15 14:15:19 +01:00
Thomas Graf	55951e15e5	lib: Expose struct ovs_list definition in <openvswitch/list.h> Expose the struct ovs_list definition in <openvswitch/list.h>. Keep the list access API private for now. Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-12-15 14:15:16 +01:00
Thomas Graf	ca6ba70092	list: Rename struct list to struct ovs_list struct list is a common name and can't be used in public headers. Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-12-15 14:15:12 +01:00
Pravin B Shelar	a36de779d7	openvswitch: Userspace tunneling. Following patch adds support for userspace tunneling. Tunneling needs three more component first is routing table which is configured by caching kernel routes and second is ARP cache which build automatically by snooping arp. And third is tunnel protocol table which list all listening protocols which is populated by vswitchd as tunnel ports are added. GRE and VXLAN protocol support is added in this patch. Tunneling works as follows: On packet receive vswitchd check if this packet is targeted to tunnel port. If it is then vswitchd inserts tunnel pop action which pops header and sends packet to tunnel port. On packet xmit rather than generating Set tunnel action it generate tunnel push action which has tunnel header data. datapath can use tunnel-push action data to generate header for each packet and forward this packet to output port. Since tunnel-push action contains most of packet header vswitchd needs to lookup routing table and arp table to build this action. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-11-12 15:08:33 -08:00
David Verbeiren	7251515ea9	netdev-dpdk: Fix DPDK rings broken by multi queue DPDK rings don't need one queue per PMD thread and don't support multiple queues (set_multiq function is undefined). To fix operation with DPDK rings, this patch ignores EOPNOTSUPP error on netdev_set_multiq() and provides, for DPDK rings, a netdev send() function that ignores the provided queue id (= PMD thread core id). Suggested-by: Maryam Tahhan <maryam.tahhan@intel.com> Signed-off-by: David Verbeiren <david.verbeiren@intel.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-04 09:21:27 -08:00
Alex Wang	1b7a04e05b	netdev-dpdk: Fix crash when there is no pci numa info. When kernel cannot obtain the pci numa info, the numa_node file in corresponding pci directory in sysfs will show -1. Then the rte_eth_dev_socket_id() function will return it to ovs. On current master, ovs assumes rte_eth_dev_socket_id() always returns non-negative value. So using this -1 in pmd thread creation will cause ovs crash. To fix the above issue, this commit makes ovs always check the return value of rte_eth_dev_socket_id() and use numa node 0 if the return value is negative. Reported-by: Daniel Badea <daniel.badea@windriver.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>	2014-09-25 14:17:54 -07:00
Alex Wang	91968eb096	netdev-dpdk: Fix a bug in netdev_dpdk_set_multiq(). Commit 5a0340 (dpif-netdev: Create multiple tx/rx queues when adding dpdk interface.) introduced a bug which causes the function netdev_dpdk_set_multiq() never resetting the tx queues. This bug could cause pmd thread accessing unassigned memory, resulting in segfault. This commit fixes the bug. Reported-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>	2014-09-19 11:57:29 -07:00
Alex Wang	ba0358a118	netdev-dpdk: Fix a typo. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>	2014-09-19 11:57:16 -07:00
Alex Wang	2654cc338b	netdev-dpdk: Pass queue id to dpdk_do_tx_copy(). Since dpdk_do_tx_copy() will be called by both pmd and non-pmd thread, it should take the queue id as input. The current ovs always uses NON_PMD_THREAD_TX_QUEUE as queue id, which causes unprotected multi-access to the same queue. This commit fixes the issue by passing the queue id to dpdk_do_tx_copy(). Reported-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>	2014-09-18 17:26:13 -07:00
Alex Wang	b7ccaf673e	netdev-dpdk: Fix thread-safety breach. dpdk_eth_dev_init() must be called with dpdk_mutex. However, netdev_dpdk_set_multiq() fails to follow this rule. This commit fixes this breach. Found by clang. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>	2014-09-15 14:35:41 -07:00
Alex Wang	476590621e	netdev-dpdk: Make get_config() report correct queue info. With the separation of tx queue and rx queue configuration in netdev-dpdk module, the netdev_dpdk_get_config() can no longer report 'n_rxq' as tx queue configuration. This commit fixes the above issue. Reported-by: Daniele Di Proietto <ddiproietto@vmware.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>	2014-09-15 14:35:41 -07:00
Alex Wang	65f13b50c5	dpif-netdev: Create multiple pmd threads by default. With this commit, ovs by default will create one pmd thread for each numa node and pin the pmd thread to available cpu core on the numa node. NON_PMD_CORE_ID (currently 0) is used to reserve a particular cpu core for the I/O of all non-pmd threads. No pmd thread can be pinned to this reserved core. As side-effects of this commit: - pmd thread will not be created, if there is no dpdk interface from the corresponding numa node added to ovs. - the exact-match cache for non-pmd threads is removed from 'struct dp_netdev'. Instead, all non-pmd threads will use the exact-match cache defined in the 'struct dp_netdev_pmd_thread' for NON_PMD_CORE_ID. - the rx packet processing functions are refactored to use 'struct dp_netdev_pmd_thread' as input. - the 'netdev_send()' function will be called with the proper queue id. - both pmd and non-pmd threads can call the dpif_netdev_execute(). so, use a per-thread key to help recognize the calling thread. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 11:43:49 -07:00
Alex Wang	95a596e3d9	netdev-dpdk: Remove the tx queue spinlock. The previous commit makes OVS create one tx queue for each cpu core, each pmd thread will use a separate tx queue. Also, tx of non-pmd threads on dpdk interface is all through 'NON_PMD_THREAD_TX_QUEUE', protected by the 'nonpmd_mempool_mutex'. Therefore, the spinlock is no longer needed. And this commit removes it from 'struct dpdk_tx_queue'. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 11:43:49 -07:00
Alex Wang	94143fc41e	netdev-dpdk: Add indicator for flushing tx queue. Previous commit makes OVS create one tx queue for each cpu core. An upcoming patch will allow multiple pmd threads be created and pinned to cpu cores. So each pmd thread will use the tx queue corresponding to its core id. Moreover, the pmd threads running on different numa node than the dpdk interface (called non-local pmd thread) will not handle the rx of the interface. Consequently, there need to be a way to flush the tx queues of the non-local pmd threads. To address the queue flushing issue, this commit introduces a new flag 'flush_tx' in the 'struct dpdk_tx_queue' which is set if the queue is to be used by a non-local pmd thread. Then, when enqueueing the tx pkts, if the flag is set, the tx queue will always be flushed immediately after the enqueue. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 11:43:48 -07:00
Alex Wang	5a03406477	dpif-netdev: Create multiple tx/rx queues when adding dpdk interface. Before this commit, ovs creates one tx and one rx queue for each dpdk interface and uses only one poll thread for handling I/O of all dpdk interfaces. An upcoming patch will allow multiple poll threads be created. As a preparation, this commit changes the dpif-netdev to create multiple tx/rx queues when the dpdk interface is added. Specifically, the number of rx queues will still be one per-dpdk interface for this commit. But upcoming work will allow user create multiple rx queues. The number of tx queues will be the number of cpu cores on the machine. Although not all the tx queues will be used, each poll thread will have its own queue for transmission on the dpdk interface. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 11:43:48 -07:00
Alex Wang	5496878cbf	netdev: Add function for configuring tx and rx queues. This commit adds a new API to the 'struct netdev_class' which allows user to configure the number of tx queues and rx queues of 'netdev'. Upcoming patches will use this function to set multiple tx/rx queues when adding the netdev to dpif-netdev. Currently, only netdev-dpdk module implements this function. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 11:43:48 -07:00
Pravin B Shelar	2f9dd77fcd	ofproto: Do not update stats on fake bond interface. There are couple of reasons to remove this support: * This is used in very old OVS use-case. It is much better to read stats directly from OVS. * Forthcoming commit will remove support for setting stats for vport. The stats update depends on stats-set. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-09-15 10:08:56 -07:00
Alex Wang	f00fa8cbad	netdev: Add n_txq to 'struct netdev'. This commit adds new variable n_txq to 'struct netdev' for recording the number of tx queues. Correspondingly, the send_() functions are extended to accept queue id as input argument. All 'netdev-' implementation will ignore the queue id since having multiple tx queues is not supported. Upcomping patches will start using it and create multiple tx queues for dpdk netdev. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-12 11:30:58 -07:00
Alex Wang	7dec44fe1c	netdev: Add function for getting the numa node id of netdev. This commit adds a new API to the 'struct netdev_class' which allows user to query the numa node id the 'netdev' is on. Currently, only netdev-dpdk module implements this function. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-12 11:30:58 -07:00
Alex Wang	e0a801c7fd	netdev-dpdk: Show interface status for dpdk0. This commit fixes a bug which prevents the display of interface status for dpdk0. Found by inspection. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-02 21:48:28 -07:00
Alex Wang	34631d72fb	netdev-dpdk: Make memory pool name contain the socket id. This commit makes the memory pool name contain the socket id. Since dpdk library do not allow creation of memory pool with same name, this commit serves as a simple way of making each name unique. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-02 15:46:57 -07:00
Daniele Di Proietto	61a2647e15	packet-dpif: Add dpif_packet_{get, set}_hash() These function are used to stored the packet hash. 'netdev-dpdk' automatically set this value to the RSS hash returned by the NIC. Other 'netdev's set it to 0 (which is an invalid hash value), so that callers can compute the hash on their own. If DPDK support is enabled, struct dpif_packet's member 'dp_hash' is removed and 'pkt.hash.rss' from DPDK mbuf is used This commit also configure DPDK devices to compute RSS hash for UDP and IPv6 packets Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-08-29 16:32:21 -07:00
Daniele Di Proietto	58f7c37b1f	netdev-dpdk: Use different constant for ring size DPDK rings must have a power-of-two size. Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-08-29 15:48:59 -07:00
Daniele Di Proietto	1304f1f8a7	netdev-dpdk: Keep calling rte_eth_tx_burst() until it returns 0 rte_eth_tx_burst() _should_ transmit every packet that it is passed unless the queue is full. Nontheless some implementation of rte_eth_tx_burst (e.g. ixgbe_xmit_pkts_vec()) does not transmit more than a fixed number (32) of packets at a time. With this commit we assume that there's an error only if rte_eth_tx_burst returns 0. Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-08-12 17:38:48 -07:00
Daniele Di Proietto	d731058395	netdev-dpdk: Move to DPDK 1.7.0 With this commit we move our DPDK support to 1.7.0. DPDK binaries (starting with dpdk 1.7.0) should be linked with --whole-archive to include pmd drivers Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-08-12 17:38:48 -07:00
Ethan Jackson	645b893423	style: Replace TODO with XXX. In accordance with CodingStyle. Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-08-05 14:13:20 -07:00
Daniele Di Proietto	3a10026527	netdev-dpdk: Increase tx queue size and rx batch size These values has been found to give the best throughput in simple cases (1 flow 64 bytes UDP packets). Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-07-22 20:06:02 -07:00
Daniele Di Proietto	db73f7166a	netdev-dpdk: Fix race condition with DPDK mempools in non pmd threads DPDK mempools rely on rte_lcore_id() to implement a thread-local cache. Our non pmd threads had rte_lcore_id() == 0. This allowed concurrent access to the "thread-local" cache, causing crashes. This commit resolves the issue with the following changes: - Every non pmd thread has the same lcore_id (0, for management reasons), which is not shared with any pmd thread (lcore_id for pmd threads now start from 1) - DPDK mbufs must be allocated/freed in pmd threads. When there is the need to use mempools in non pmd threads, like in dpdk_do_tx_copy(), a mutex must be held. - The previous change does not allow us anymore to pass DPDK mbufs to handler threads: therefore this commit partially revert 143859ec63d45e. Now packets are copied for upcall processing. We can remove the extra memcpy by processing upcalls in the pmd thread itself. With the introduction of the extra locking, the packet throughput will be lower in the following cases: - When using internal (tap) devices with DPDK devices on the same datapath. Anyway, to support internal devices efficiently, we needed DPDK KNI devices, which will be proper pmd devices and will not need this locking. - When packets are processed in the slow path by non pmd threads. This overhead can be avoided by handling the upcalls directly in pmd threads (a change that has already been proposed by Ryan Wilson) Also, the following two fixes have been introduced: - In dpdk_free_buf() use rte_pktmbuf_free_seg() instead of rte_mempool_put(). This allows OVS to run properly with CONFIG_RTE_LIBRTE_MBUF_DEBUG DPDK option - Do not bulk free mbufs in a transmission queue. They may belong to different mempools Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-07-20 10:13:22 -07:00

... 3 4 5 6 7

326 Commits