When compiling Open vSwitch on aarch64, the compiler warns about an
uninitialized variable in lib/dpif-netdev-perf.c. If the clock_gettime
call in rdtsc_syscall fails, the member last_tsc of the uninitialized
struct is returned. To avoid the warning, the variable must be
initialized before use.
Reviewed-by: Yanqin Wei <Yanqin.Wei@arm.com>
Reviewed-by: Malvika Gupta <Malvika.Gupta@arm.com>
Signed-off-by: Lance Yang <Lance.Yang@arm.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Unlike 'rte_get_tsc_cycles()', which doesn't need any specific
initialization, 'rte_get_tsc_hz()' can only be used after a successful
call to 'rte_eal_init()', which estimates the TSC frequency
for later use by 'rte_get_tsc_hz()'. Strictly speaking, we're not
allowed to use 'rte_get_tsc_cycles()' before initializing DPDK either,
but it works this way for now and provides correct results.
This patch provides TSC frequency estimation code that will be used
in two cases:
* DPDK is not compiled in, i.e. DPDK_NETDEV not defined.
* DPDK compiled in but not initialized,
i.e. other_config:dpdk-init=false
This change is mostly useful for AF_XDP netdev support, i.e. it allows
using the dpif-netdev/pmd-perf-show command and various PMD perf metrics.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
Unlike x86, where the TSC frequency usually matches the CPU frequency,
other architectures can have much slower TSCs.
For example, it's common for Arm SoCs to have a 100 MHz TSC by default.
In this case the perf module will check for the end of the current
millisecond every 10K cycles, i.e. 10 times per millisecond. This may
not be enough to collect precise statistics.
Fix that by taking current TSC frequency into account instead of
hardcoding the number of cycles.
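The idea can be sketched as below, with an assumed target of 100 checks
per millisecond (the constant and function names are illustrative, not
the OVS definitions):

```c
#include <assert.h>
#include <stdint.h>

#define CHECKS_PER_MS 100   /* assumed target check granularity */

/* Derive the check interval from the actual TSC frequency so that slow
 * TSCs (e.g. 100 MHz on some Arm SoCs) are still checked often enough
 * (illustrative sketch, not the OVS code). */
static uint64_t
cycles_per_check(uint64_t tsc_hz)
{
    uint64_t cycles_per_ms = tsc_hz / 1000;

    return cycles_per_ms / CHECKS_PER_MS;
}
```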
CC: Jan Scheurich <jan.scheurich@ericsson.com>
Fixes: 79f368756ce8 ("dpif-netdev: Detailed performance stats for PMDs")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Real values of 'packets per batch' and 'cycles per upcall' are already
added to histograms in 'dpif-netdev' on receive. Adding the averages on
top of that makes the statistics wrong. We should not add values to
histograms that never actually occurred.
For example, in the current code the following situation is possible:
pmd thread numa_id 0 core_id 5:
...
Rx packets: 83 (0 Kpps, 13873 cycles/pkt)
...
- Upcalls: 3 ( 3.6 %, 248.6 us/upcall)
Histograms
packets/it pkts/batch upcalls/it cycles/upcall
1 83 1 166 1 3 ...
15848 2
19952 2
...
50118 2
i.e. all the packets are counted twice in the 'pkts/batch' column and
all the upcalls are counted twice in the 'cycles/upcall' column.
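The bug class can be illustrated with a minimal histogram sketch (names
are illustrative, not the OVS API): samples recorded per event on
receive must not be re-added as averages later, or every sample ends up
counted twice.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BINS 16

struct hist_sketch {
    uint32_t bin[NUM_BINS];
    uint64_t samples;       /* total samples recorded */
};

/* Record one real sample; calling this again with a computed average
 * inflates both the bins and the sample count. */
static void
hist_add(struct hist_sketch *h, uint32_t val)
{
    uint32_t i = val < NUM_BINS ? val : NUM_BINS - 1;

    h->bin[i]++;
    h->samples++;
}
```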
CC: Jan Scheurich <jan.scheurich@ericsson.com>
Fixes: 79f368756ce8 ("dpif-netdev: Detailed performance stats for PMDs")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
The 'dpif-netdev/pmd-perf-show' command prints a frequency number
calculated from the total number of cycles spent on iterations
during the measured period. This number can be confusing, because
users may expect it to equal the CPU frequency, especially
on non-x86 systems where the TSC frequency likely does not match
the CPU frequency.
Moreover, the counted TSC cycles can differ from the HW TSC cycles
when there is a large number of PMD reloads, because cycles spent
outside of the main polling loop are not taken into account anywhere.
In this case the frequency will not match even the TSC frequency.
Let's clarify the meaning in order to avoid this misunderstanding.
'Cycles' is replaced with 'Used TSC cycles', which describes how many
TSC cycles were consumed by the main polling loop. The percentage of
the total TSC cycles is now printed instead of a GHz frequency, because
a GHz figure is hard to interpret, especially without knowing the exact
TSC frequency.
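The reported share reduces to a trivial computation; a sketch (not the
exact OVS code):

```c
#include <assert.h>
#include <stdint.h>

/* Share of all TSC cycles in the measured period that were consumed by
 * the main polling loop, as a percentage (illustrative sketch). */
static double
used_tsc_pct(uint64_t used_cycles, uint64_t total_cycles)
{
    return total_cycles
           ? 100.0 * (double) used_cycles / (double) total_cycles
           : 0.0;
}
```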
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
SMC hits were missing from the output of the 'dpif-netdev/pmd-perf-show'
appctl command.
CC: Yipeng Wang <yipeng1.wang@intel.com>
Fixes: 60d8ccae135f ("dpif-netdev: Add SMC cache after EMC cache")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
MSVC complains:
"libopenvswitch.lib(dpif-netdev.obj) : error LNK2019: unresolved external
symbol pmd_perf_start_iteration referenced in function pmd_thread_main
libopenvswitch.lib(dpif-netdev.obj) : error LNK2019: unresolved external
symbol pmd_perf_end_iteration referenced in function pmd_thread_main"
Remove the inline keyword from the declarations of
`pmd_perf_start_iteration` and `pmd_perf_end_iteration`.
More on the subject:
https://docs.microsoft.com/en-us/cpp/error-messages/tool-errors/function-inlining-problems
Fixes: broken build on Windows
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
This patch enhances dpif-netdev-perf to detect iterations with
suspicious statistics according to the following criteria:
- iteration lasts longer than US_THR microseconds (default 250).
This can be used to capture events where a PMD is blocked or
interrupted for such a period of time that there is a risk for
dropped packets on any of its Rx queues.
- max vhost qlen exceeds a threshold Q_THR (default 128). This can
be used to infer virtio queue overruns and dropped packets inside
a VM, which are not visible in OVS otherwise.
Such suspicious iterations can be logged together with their iteration
statistics to be able to correlate them to packet drop or other events
outside OVS.
A new command is introduced to enable/disable logging at run-time and
to adjust the above thresholds for suspicious iterations:
ovs-appctl dpif-netdev/pmd-perf-log-set on | off
[-b before] [-a after] [-e|-ne] [-us usec] [-q qlen]
Turn logging on or off at run-time (on|off).
-b before: The number of iterations before the suspicious iteration to
be logged (default 5).
-a after: The number of iterations after the suspicious iteration to
be logged (default 5).
-e: Extend logging interval if another suspicious iteration is
detected before logging occurs.
-ne: Do not extend logging interval (default).
-q qlen: Suspicious vhost queue fill level threshold. Increase this
to 512 if the Qemu supports 1024 virtio queue length.
(default 128).
-us usec: change the duration threshold for a suspicious iteration
(default 250 us).
Note: Logging of suspicious iterations itself consumes a considerable
amount of PMD processing cycles, which may be visible in the iteration
history. In the worst case this can lead OVS to detect another
suspicious iteration caused by the logging itself.
If more than 100 iterations around a suspicious iteration have been
logged once, OVS falls back to the safe default values (-b 5/-a 5/-ne)
to avoid logging itself causing continuous further logging.
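The two detection criteria above reduce to a simple predicate; a sketch
with assumed parameter names, where us_thr and q_thr correspond to the
-us and -q settings (not the exact OVS code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An iteration is suspicious if it ran longer than the duration
 * threshold or any vhost queue exceeded the fill-level threshold
 * (illustrative sketch). */
static bool
iteration_is_suspicious(uint64_t duration_us, uint32_t max_vhost_qlen,
                        uint64_t us_thr, uint32_t q_thr)
{
    return duration_us > us_thr || max_vhost_qlen > q_thr;
}
```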
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
This patch instruments the dpif-netdev datapath to record detailed
statistics of what is happening in every iteration of a PMD thread.
The collection of detailed statistics can be controlled by a new
Open_vSwitch configuration parameter "other_config:pmd-perf-metrics".
By default it is disabled. The run-time overhead, when enabled, is
in the order of 1%.
The covered metrics per iteration are:
- cycles
- packets
- (rx) batches
- packets/batch
- max. vhostuser qlen
- upcalls
- cycles spent in upcalls
This raw recorded data is used threefold:
1. In histograms for each of the following metrics:
- cycles/iteration (log.)
- packets/iteration (log.)
- cycles/packet
- packets/batch
- max. vhostuser qlen (log.)
- upcalls
- cycles/upcall (log)
The histogram bins are divided linearly or logarithmically.
2. A cyclic history of the above statistics for 999 iterations
3. A cyclic history of the cumulative/average values per millisecond
wall clock for the last 1000 milliseconds:
- number of iterations
- avg. cycles/iteration
- packets (Kpps)
- avg. packets/batch
- avg. max vhost qlen
- upcalls
- avg. cycles/upcall
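The logarithmic bins noted above can be sketched as follows: the bin
index grows with log2 of the value so that metrics with a wide dynamic
range (such as cycles/iteration) fit a small fixed number of bins. The
exact OVS binning differs; this is only illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Map a value to a logarithmic bin: bin 0 holds 0..1, each further bin
 * doubles the covered range, and the last bin catches everything
 * larger (illustrative sketch). */
static int
log_bin_index(uint64_t val, int num_bins)
{
    int idx = 0;

    while (val > 1 && idx < num_bins - 1) {
        val >>= 1;
        idx++;
    }
    return idx;
}
```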
The gathered performance metrics can be printed at any time with the
new CLI command
ovs-appctl dpif-netdev/pmd-perf-show [-nh] [-it iter_len] [-ms ms_len]
[-pmd core] [dp]
The options are
-nh: Suppress the histograms
-it iter_len: Display the last iter_len iteration stats
-ms ms_len: Display the last ms_len millisecond stats
-pmd core: Display only the specified PMD
The performance statistics are reset with the existing
dpif-netdev/pmd-stats-clear command.
The output always contains the following global PMD statistics,
similar to the pmd-stats-show command:
Time: 15:24:55.270
Measurement duration: 1.008 s
pmd thread numa_id 0 core_id 1:
Cycles: 2419034712 (2.40 GHz)
Iterations: 572817 (1.76 us/it)
- idle: 486808 (15.9 % cycles)
- busy: 86009 (84.1 % cycles)
Rx packets: 2399607 (2381 Kpps, 848 cycles/pkt)
Datapath passes: 3599415 (1.50 passes/pkt)
- EMC hits: 336472 ( 9.3 %)
- Megaflow hits: 3262943 (90.7 %, 1.00 subtbl lookups/hit)
- Upcalls: 0 ( 0.0 %, 0.0 us/upcall)
- Lost upcalls: 0 ( 0.0 %)
Tx packets: 2399607 (2381 Kpps)
Tx batches: 171400 (14.00 pkts/batch)
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Add module dpif-netdev-perf to host all PMD performance-related
data structures and functions in dpif-netdev. Refactor the PMD
stats handling in dpif-netdev and delegate whatever possible into
the new module, using clean interfaces to shield dpif-netdev from
the implementation details. Accordingly, all PMD statistics
members are moved from the main struct dp_netdev_pmd_thread into
a dedicated member of type struct pmd_perf_stats.
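The consolidation can be pictured roughly as below; the enum values,
field names, and helper are illustrative, not the exact OVS definitions:

```c
#include <assert.h>
#include <stdint.h>

enum pmd_stat_type_sketch {
    PMD_STAT_RECV,        /* packets received */
    PMD_STAT_RECIRC,      /* packets recirculated */
    PMD_STAT_EXACT_HIT,   /* EMC hits */
    PMD_STAT_MASKED_HIT,  /* megaflow hits via masked lookups */
    PMD_N_STATS
};

/* All per-PMD counters live in one dedicated structure instead of
 * being scattered across struct dp_netdev_pmd_thread (sketch). */
struct pmd_perf_stats_sketch {
    uint64_t counters[PMD_N_STATS];
};

static void
perf_update_counter(struct pmd_perf_stats_sketch *s,
                    enum pmd_stat_type_sketch type, uint64_t delta)
{
    s->counters[type] += delta;
}
```

A clean interface like this lets dpif-netdev increment counters without
knowing how the perf module stores or aggregates them.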
Include Darrel's prior refactoring of PMD stats contained in
[PATCH v5,2/3] dpif-netdev: Refactor some pmd stats:
1. The cycles per packet counts are now based on packets
received rather than packet passes through the datapath.
2. Packet counters are now kept for packets received and
packets recirculated. These are kept as separate counters for
maintainability reasons. The cost of incrementing these counters
is negligible. These new counters are also displayed to the user.
3. A display statistic is added for the average number of
datapath passes per packet. This should be useful for user
debugging and understanding of packet processing.
4. The user visible 'miss' counter is used for successful upcalls,
rather than the sum of successful and unsuccessful upcalls. Hence,
this becomes what users historically understand by an OVS 'miss upcall'.
The user display is annotated to make this clear as well.
5. The user visible 'lost' counter remains as failed upcalls, but
is annotated to make it clear what the meaning is.
6. The enum pmd_stat_type is annotated to make the usage of the
stats counters clear.
7. The subtable lookup stat is renamed to make it clear that it
relates to masked lookups.
8. The PMD stats test is updated to handle the new user stats of
packets received, packets recirculated and average number of datapath
passes per packet.
On top of that introduce a "-pmd <core>" option to the PMD info
commands to filter the output for a single PMD.
Made the pmd-stats-show output a bit more readable by adding a blank
between colon and value.
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>