mirror of
https://gitlab.isc.org/isc-projects/bind9
synced 2025-08-29 21:47:59 +00:00
CPU profiling using perf
parent
96ac49945c
commit
a85b12e284
60
CPU-profiling-using-perf.md
Normal file
@ -0,0 +1,60 @@
To profile `named` using `perf`, follow these steps:

1. Install the prerequisites. For example, on Fedora:

```sh
sudo dnf install perf inferno
```

`inferno` is a Rust implementation of Brendan Gregg's flamegraph utility. Because it is compiled to native code, it runs faster than the original Perl script.
If your distribution does not package `inferno`, you can install the original `flamegraph.pl` script instead. For example:

```sh
sudo dnf install perf flamegraph
```

If you use the original scripts, replace the `inferno-*` commands in the steps below with their FlameGraph counterparts (`stackcollapse-perf.pl` and `flamegraph.pl`).

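Before going further, it can help to confirm that the tools are actually on your `PATH`. The following is a minimal sketch, not part of the original instructions: `missing_tools` is a hypothetical helper name, and the command names checked are the ones used later in this guide.

```sh
# Hypothetical helper: print the name of every command in "$@"
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the commands used in the later steps of this guide.
missing=$(missing_tools perf inferno-collapse-perf inferno-flamegraph)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
fi
```

If you installed the original scripts instead, pass their names to `missing_tools` in place of the `inferno-*` commands.
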
1. Ensure `named` is compiled with debug symbols. Optionally, set an installation prefix with `--prefix`:

```sh
cd ~/Path/to/bind9/checkout
./configure CFLAGS='-O2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -ggdb3' --prefix=/path/to/parent
make clean && make
```

2. Install `named`. In the build tree, `libtool` runs `named` through a wrapper shell script, which can make `perf` results harder to read. Profiling the installed binary avoids the wrapper.

```sh
make install
```

3. Run `named` under `perf`:

```sh
perf record --compression-level=1 --user-callchains -g --call-graph=dwarf,65528 -e cycles:ppp --output="$OUTPUT_FILE" -- $YOUR_NAMED_CMDLINE_GOES_HERE
```

Below is an explanation of what each command-line flag does:

`perf record`: Runs a command under a sampling profiler. See `perf-record(1)`.

`--compression-level=1`: *Optional.* Compresses the output file with zstd at level 1, the fastest setting.

`--user-callchains`: Collects call chains from user space only; kernel stack frames are not recorded.

`-g --call-graph=dwarf,65528`: Unwinds call stacks using DWARF debug information, copying up to 65528 bytes (just under 64 KiB) of user stack per sample. If this option is omitted, inlined functions might not be visualized correctly.

`-e cycles:ppp`: Samples the `cycles` event at the highest precision level (`:ppp`), so that samples are attributed to the instruction that caused them. This option is needed only if you want to look at the assembly.

`--output=$OUTPUT_FILE`: Name of the output file. If omitted, it defaults to `perf.data`.

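The value 65528 can look arbitrary. The sketch below shows one way to arrive at it, assuming (this is not stated in the original text) that `perf` caps the stack dump size at the largest unsigned 16-bit value rounded down to a multiple of 8 bytes. The variable names are illustrative only.

```sh
# Assumption: the cap is USHRT_MAX (65535) rounded down to a multiple of 8.
ushrt_max=65535
max_dump=$(( ushrt_max / 8 * 8 ))   # integer division drops the remainder
echo "$max_dump"
```

Under that assumption, the number used in the command above is simply the largest stack dump size `perf` accepts.
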
4. Generate a flamegraph with `inferno`:

```sh
perf script --input="$INPUT_DATA" | inferno-collapse-perf | inferno-flamegraph --width=1920 > "$OUTPUT_SVG"
```

You can also instruct `inferno` to output a "reversed" flamegraph. This is useful for spotting leaf functions that are called throughout the codebase and could be optimized:

```sh
perf script --input="$INPUT_DATA" | inferno-collapse-perf | inferno-flamegraph --reverse --width=1920 > "$OUTPUT_SVG"
```
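
The two commands above each rerun `perf script` and the collapse step. As a possible shortcut, the sketch below collapses the stacks once and renders both views from the saved file. `make_flamegraphs` and the output file names are hypothetical and not part of the original instructions; the sketch relies on `inferno-flamegraph` reading folded stacks from standard input, as it does in the pipelines above.

```sh
# Hypothetical helper: make_flamegraphs PERF_DATA OUTPUT_PREFIX
# Writes OUTPUT_PREFIX.folded, OUTPUT_PREFIX.svg and OUTPUT_PREFIX-reversed.svg.
make_flamegraphs() {
    perf_data=$1
    prefix=$2

    # Collapse the recorded stacks once and keep the folded file for reuse.
    perf script --input="$perf_data" | inferno-collapse-perf > "$prefix.folded" || return 1

    # Regular flamegraph.
    inferno-flamegraph --width=1920 < "$prefix.folded" > "$prefix.svg" || return 1

    # Reversed flamegraph, rendered from the same folded stacks.
    inferno-flamegraph --reverse --width=1920 < "$prefix.folded" > "$prefix-reversed.svg"
}
```

For example, `make_flamegraphs perf.data named` would produce `named.folded`, `named.svg` and `named-reversed.svg` in the current directory.
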