diff --git a/CPU-profiling-using-perf.md b/CPU-profiling-using-perf.md
new file mode 100644
index 0000000..6801dcf
--- /dev/null
+++ b/CPU-profiling-using-perf.md
@@ -0,0 +1,60 @@
+To profile `named` using `perf`, follow these steps:
+
+ 1. Install the prerequisites. E.g. on Fedora:
+
+```
+sudo dnf install perf inferno
+```
+
+`inferno` is a Rust implementation of the flamegraph utility by Brendan Gregg. Since it compiles to native code, it is faster than the original Perl script. If your distribution does not package `inferno`, you can install the original `flamegraph.pl` script instead, for example:
+
+```
+sudo dnf install perf flamegraph
+```
+
+In that case, replace the `inferno` commands in the steps below with their FlameGraph equivalents (`stackcollapse-perf.pl` for `inferno-collapse-perf`, `flamegraph.pl` for `inferno-flamegraph`).
+
+ 2. Ensure `named` is compiled with debug symbols and frame pointers. Optionally, set an installation prefix:
+
+```sh
+cd ~/Path/to/bind9/checkout
+./configure CFLAGS='-O2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -ggdb3' --prefix=/path/to/parent
+make clean && make
+```
+
+ 3. Install `named`. In the build tree, `libtool` runs `named` through a wrapper shell script, which can make `perf` results harder to read; profiling the installed binary avoids this.
+
+```sh
+make install
+```
+
+ 4. Run `named` under `perf`:
+```sh
+perf record --compression-level=1 --user-callchains -g --call-graph=dwarf,65528 -e cycles:ppp --output=$OUTPUT_FILE -- $YOUR_NAMED_CMDLINE_GOES_HERE
+```
+
+ Below is an explanation of what each command-line flag does:
+
+ `perf record` Runs a command under a sampling profiler. See `perf-record(1)`.
+
+ `--compression-level=1` *Optional.* Compresses the output file using zstd at level 1.
+
+ `--user-callchains` Record only user-space call chains; kernel stacks are omitted.
+
+ `-g --call-graph=dwarf,65528` Unwind call stacks using DWARF debug information, copying up to 65528 bytes (just under 64 KiB) of stack per sample. If this option is omitted, inlined functions might not be visualized correctly.
+
+ `-e cycles:ppp` Sample the `cycles` event at the highest precision level (`:ppp`). This option is needed only if you want to look at the assembly.
+
+ `--output=$OUTPUT_FILE` Name of the output file. If omitted, it defaults to `perf.data`.
+
+ 5. Generate a flamegraph with `inferno`:
+
+```sh
+perf script --input=$INPUT_DATA | inferno-collapse-perf | inferno-flamegraph --width=1920 > $OUTPUT_SVG
+```
+
+You can also instruct `inferno` to output a "reversed" flamegraph. This is useful to see whether any leaf function is used throughout the codebase and can be optimized:
+
+```sh
+perf script --input=$INPUT_DATA | inferno-collapse-perf | inferno-flamegraph --reverse --width=1920 > $OUTPUT_SVG
+```
\ No newline at end of file
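
The two rendering commands in the patch above could be wrapped in a small shell helper so that the normal and reversed flamegraphs are always produced together. The following is a minimal sketch and not part of the patch: `render_flamegraphs` and the `DRY_RUN` variable are hypothetical names introduced here, and the sketch assumes `perf`, `inferno-collapse-perf`, and `inferno-flamegraph` are on `PATH`. With `DRY_RUN=1` it only prints the pipelines it would run.

```sh
#!/bin/sh
# Sketch only: render_flamegraphs and DRY_RUN are hypothetical names,
# not part of perf, inferno, or BIND 9.

# Usage: render_flamegraphs INPUT_DATA OUTPUT_PREFIX
# Writes OUTPUT_PREFIX.svg and OUTPUT_PREFIX-reversed.svg.
render_flamegraphs() {
    if [ "$#" -ne 2 ]; then
        echo "usage: render_flamegraphs INPUT_DATA OUTPUT_PREFIX" >&2
        return 2
    fi
    input=$1
    prefix=$2

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the two pipelines instead of running them.
        echo "perf script --input=$input | inferno-collapse-perf | inferno-flamegraph --width=1920 > ${prefix}.svg"
        echo "perf script --input=$input | inferno-collapse-perf | inferno-flamegraph --reverse --width=1920 > ${prefix}-reversed.svg"
        return 0
    fi

    # Normal flamegraph, same pipeline as in the steps above.
    perf script --input="$input" | inferno-collapse-perf |
        inferno-flamegraph --width=1920 > "${prefix}.svg" || return 1

    # Reversed flamegraph, to spot hot leaf functions.
    perf script --input="$input" | inferno-collapse-perf |
        inferno-flamegraph --reverse --width=1920 > "${prefix}-reversed.svg"
}
```

For example, `DRY_RUN=1 render_flamegraphs perf.data named` prints both pipelines without reading `perf.data`, which is a cheap way to check the file names before a long rendering run.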