
doc: Add "PMD" topic document

This continues the breakup of the huge DPDK "howto" into smaller
components. There are a couple of related changes included, such as
using "Rx queue" instead of "rxq" and noting how Tx queues cannot be
configured.

Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
This commit is contained in:
Stephen Finucane
2018-04-19 13:57:22 +01:00
committed by Ian Stokes
parent 6477dbb9d6
commit 31d0dae22a
6 changed files with 183 additions and 95 deletions


@@ -35,6 +35,7 @@ DOC_SOURCE = \
Documentation/topics/design.rst \
Documentation/topics/dpdk/index.rst \
Documentation/topics/dpdk/phy.rst \
Documentation/topics/dpdk/pmd.rst \
Documentation/topics/dpdk/ring.rst \
Documentation/topics/dpdk/vhost-user.rst \
Documentation/topics/testing.rst \


@@ -81,92 +81,6 @@ To stop ovs-vswitchd & delete bridge, run::
    $ ovs-appctl -t ovsdb-server exit
    $ ovs-vsctl del-br br0

PMD Thread Statistics
---------------------

To show current stats::

    $ ovs-appctl dpif-netdev/pmd-stats-show

To clear previous stats::

    $ ovs-appctl dpif-netdev/pmd-stats-clear

Port/RXQ Assignment to PMD Threads
----------------------------------

To show port/rxq assignment::

    $ ovs-appctl dpif-netdev/pmd-rxq-show

To change default rxq assignment to pmd threads, rxqs may be manually pinned to
desired cores using::

    $ ovs-vsctl set Interface <iface> \
        other_config:pmd-rxq-affinity=<rxq-affinity-list>

where:

- ``<rxq-affinity-list>`` is a CSV list of ``<queue-id>:<core-id>`` values

For example::

    $ ovs-vsctl set interface dpdk-p0 options:n_rxq=4 \
        other_config:pmd-rxq-affinity="0:3,1:7,3:8"

This will ensure:

- Queue #0 pinned to core 3
- Queue #1 pinned to core 7
- Queue #2 not pinned
- Queue #3 pinned to core 8

After that, PMD threads on cores where RX queues were pinned will become
``isolated``. This means that each such thread will poll only pinned RX queues.

.. warning::

   If there are no ``non-isolated`` PMD threads, ``non-pinned`` RX queues will
   not be polled. Also, if the provided ``core_id`` is not available (e.g. the
   ``core_id`` is not in ``pmd-cpu-mask``), the RX queue will not be polled by
   any PMD thread.

If pmd-rxq-affinity is not set for rxqs, they will be assigned to pmds (cores)
automatically. The processing cycles that have been stored for each rxq will
be used where known to assign rxqs to pmds based on a round robin of the
sorted rxqs.

For example, in the case where there are 5 rxqs and 3 cores (e.g. 3,7,8)
available, and the measured usage of core cycles per rxq over the last
interval is seen to be:

- Queue #0: 30%
- Queue #1: 80%
- Queue #3: 60%
- Queue #4: 70%
- Queue #5: 10%

The rxqs will be assigned to cores 3,7,8 in the following order::

    Core 3: Q1 (80%) |
    Core 7: Q4 (70%) | Q5 (10%)
    Core 8: Q3 (60%) | Q0 (30%)

To see the current measured usage history of pmd core cycles for each rxq::

    $ ovs-appctl dpif-netdev/pmd-rxq-show

.. note::

   A history of one minute is recorded and shown for each rxq to allow for
   traffic pattern spikes. Changes in an rxq's pmd core cycles usage, due to
   traffic pattern or reconfig changes, will take one minute before they are
   fully reflected in the stats.

Rxq to pmds assignment takes place whenever there are configuration changes
or can be triggered by using::

    $ ovs-appctl dpif-netdev/pmd-rxq-rebalance

QoS
---


@@ -31,3 +31,4 @@ The DPDK Datapath
phy
vhost-user
ring
pmd


@@ -113,3 +113,15 @@ tool::
For more information, refer to the `DPDK documentation <dpdk-drivers>`__.
.. _dpdk-drivers: http://dpdk.org/doc/guides/linux_gsg/linux_drivers.html
.. _dpdk-phy-multiqueue:

Multiqueue
----------

Poll Mode Driver (PMD) threads are the threads that do the heavy lifting for
the DPDK datapath. Correct configuration of PMD threads and the Rx queues they
utilize is a requirement in order to deliver the high performance possible
with DPDK acceleration. It is possible to configure multiple Rx queues for
``dpdk`` ports, thus ensuring this is not a bottleneck for performance. For
information on configuring PMD threads, refer to :doc:`pmd`.


@@ -0,0 +1,161 @@
..
      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

      Convention for heading levels in Open vSwitch documentation:

      =======  Heading 0 (reserved for the title in a document)
      -------  Heading 1
      ~~~~~~~  Heading 2
      +++++++  Heading 3
      '''''''  Heading 4

      Avoid deeper levels because they do not render well.

===========
PMD Threads
===========

Poll Mode Driver (PMD) threads are the threads that do the heavy lifting for
the DPDK datapath and perform tasks such as continuous polling of input ports
for packets, classifying packets once received, and executing actions on the
packets once they are classified.

PMD threads utilize Receive (Rx) and Transmit (Tx) queues, commonly known as
*rxq*\s and *txq*\s. While Tx queue configuration happens automatically, Rx
queues can be configured by the user. This can happen in one of two ways:

- For physical interfaces, configuration is done using the
  :program:`ovs-appctl` utility.

- For virtual interfaces, configuration is done using the
  :program:`ovs-appctl` utility, but this configuration must be reflected in
  the guest configuration (e.g. QEMU command line arguments).

The :program:`ovs-appctl` utility also provides a number of commands for
querying PMD threads and their respective queues. This, and all of the above,
is discussed here.

.. todo::

   Add an overview of Tx queues including numbers created, how they relate to
   PMD threads, etc.

PMD Thread Statistics
---------------------

To show current stats::

    $ ovs-appctl dpif-netdev/pmd-stats-show

To clear previous stats::

    $ ovs-appctl dpif-netdev/pmd-stats-clear

Port/Rx Queue Assignment to PMD Threads
---------------------------------------

.. todo::

   This needs a more detailed overview of *why* this should be done, along
   with the impact on things like NUMA affinity.

Correct configuration of PMD threads and the Rx queues they utilize is a
requirement in order to achieve maximum performance. This is particularly true
for enabling things like multiqueue for :ref:`physical <dpdk-phy-multiqueue>`
and :ref:`vhost-user <dpdk-vhost-user>` interfaces.

To show port/Rx queue assignment::

    $ ovs-appctl dpif-netdev/pmd-rxq-show

Rx queues may be manually pinned to cores. This will change the default Rx
queue assignment to PMD threads::

    $ ovs-vsctl set Interface <iface> \
        other_config:pmd-rxq-affinity=<rxq-affinity-list>

where:

- ``<rxq-affinity-list>`` is a CSV list of ``<queue-id>:<core-id>`` values

For example::

    $ ovs-vsctl set interface dpdk-p0 options:n_rxq=4 \
        other_config:pmd-rxq-affinity="0:3,1:7,3:8"

This will ensure there are *4* Rx queues and that these queues are configured
like so:

- Queue #0 pinned to core 3
- Queue #1 pinned to core 7
- Queue #2 not pinned
- Queue #3 pinned to core 8
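As an illustration of the ``<queue-id>:<core-id>`` list format, here is a
short Python sketch (my own illustration, not part of OVS) that parses such an
affinity string:

```python
# Hypothetical helper, not part of OVS: parse a pmd-rxq-affinity string,
# a CSV list of <queue-id>:<core-id> pairs, into a queue -> core mapping.
def parse_rxq_affinity(affinity):
    mapping = {}
    for pair in affinity.split(","):
        queue_id, core_id = pair.split(":")
        mapping[int(queue_id)] = int(core_id)
    return mapping

print(parse_rxq_affinity("0:3,1:7,3:8"))  # {0: 3, 1: 7, 3: 8}
```

Any queue missing from the mapping (here, queue 2) is left unpinned, matching
the list above.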

PMD threads on cores where Rx queues are *pinned* will become *isolated*. This
means that these threads will poll only the *pinned* Rx queues.

.. warning::

   If there are no *non-isolated* PMD threads, *non-pinned* Rx queues will not
   be polled. Also, if the provided ``<core-id>`` is not available (e.g. the
   ``<core-id>`` is not in ``pmd-cpu-mask``), the Rx queue will not be polled
   by any PMD thread.

If ``pmd-rxq-affinity`` is not set for Rx queues, they will be assigned to
PMDs (cores) automatically. Where known, the processing cycles that have been
stored for each Rx queue will be used to assign Rx queues to PMDs based on a
round robin of the sorted Rx queues. Take the following example, where there
are five Rx queues and three cores - 3, 7, and 8 - available, and the measured
usage of core cycles per Rx queue over the last interval is seen to be:

- Queue #0: 30%
- Queue #1: 80%
- Queue #3: 60%
- Queue #4: 70%
- Queue #5: 10%

The Rx queues will be assigned to the cores in the following order::

    Core 3: Q1 (80%) |
    Core 7: Q4 (70%) | Q5 (10%)
    Core 8: Q3 (60%) | Q0 (30%)

To see the current measured usage history of PMD core cycles for each Rx
queue::

    $ ovs-appctl dpif-netdev/pmd-rxq-show

.. note::

   A history of one minute is recorded and shown for each Rx queue to allow
   for traffic pattern spikes. Any changes in the Rx queue's PMD core cycles
   usage, due to traffic pattern or reconfig changes, will take one minute to
   be fully reflected in the stats.

Rx queue to PMD assignment takes place whenever there are configuration
changes or can be triggered by using::

    $ ovs-appctl dpif-netdev/pmd-rxq-rebalance

.. versionchanged:: 2.8.0

   Automatic assignment of Rx queues to PMDs and the two related commands,
   ``pmd-rxq-show`` and ``pmd-rxq-rebalance``, were added in OVS 2.8.0. Prior
   to this, behavior was round-robin and processing cycles were not taken into
   consideration. Tracking for stats was not available.

.. versionchanged:: 2.9.0

   The output of ``pmd-rxq-show`` was modified to include utilization as a
   percentage.


@@ -130,11 +130,10 @@ an additional set of parameters::

    -netdev type=vhost-user,id=mynet2,chardev=char2,vhostforce
    -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2
In addition, QEMU must allocate the VM's memory on hugetlbfs. vhost-user ports
access a virtio-net device's virtual rings and packet buffers mapping the VM's
physical memory on hugetlbfs. To enable vhost-user ports to map the VM's memory
into their process address space, pass the following parameters to QEMU::

    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on
    -numa node,memdev=mem -mem-prealloc
@@ -154,18 +153,18 @@ where:
The number of vectors, which is ``$q`` * 2 + 2
The vhost-user interface will be automatically reconfigured with required
number of Rx and Tx queues after connection of virtio device. Manual
configuration of ``n_rxq`` is not supported because OVS will work properly
only if ``n_rxq`` matches the number of queues configured in QEMU.
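The vector count noted above (``$q`` * 2 + 2) can be expressed as a trivial
sketch. The helper name is my own illustration, not a QEMU or OVS API:

```python
# Illustration of the vector-count rule quoted above: a virtio-net device
# with q queue pairs needs q * 2 + 2 MSI-X vectors (one per Rx queue, one
# per Tx queue, plus two extra vectors).
def virtio_net_vectors(queues):
    return queues * 2 + 2

for q in (1, 2, 4):
    print(q, virtio_net_vectors(q))
```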
At least two PMDs should be configured for the vswitch when using multiqueue.
Using a single PMD will cause traffic to be enqueued to the same vhost queue
rather than being distributed among different vhost queues for a vhost-user
interface.
If traffic destined for a VM configured with multiqueue arrives at the vswitch
via a physical DPDK port, then the number of Rx queues should also be set to at
least two for that physical DPDK port. This is required to increase the
probability that a different PMD will handle the multiqueue transmission to the
guest using a different vhost queue.