mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-31 06:25:31 +00:00

Update DNS Shotgun integration into Gitlab CI - use bind9-shotgun-ci repo

Tom Krizek
2022-08-22 11:29:57 +00:00
parent 9d77085257
commit 88608f0962

@@ -1,4 +1,4 @@
Work in progress, please report problems to @pspacek
Work in progress, please report problems to @tkrizek (@pspacek)
Usage
=====
@@ -11,11 +11,13 @@ Configurable parameters:
- list of versions to be tested
- at least one value must be provided
- accepts commit or branch or tag name
- v9_11_31 is always added on top of user-specified list of versions and serves as reference
- `SHOTGUN_SCENARIO` - udp (default), tcp, dot, doh
- `SHOTGUN_TRAFFIC_MULTIPLIER` - simulated load - if unsure leave default value "10", which is roughly 80 k QPS; that is about the maximum v9_11_31 can handle in our setup on UDP
- `SHOTGUN_TRAFFIC_MULTIPLIER` - simulated load - if unsure leave default value "10", which is roughly 94 k QPS
- `SHOTGUN_DURATION` - first 60 seconds (default) is most interesting because we always start with fresh instance and an empty cache
- `SHOTGUN_ROUNDS` - how many test rounds - three recommended (default) so we are not fooled by noise
- `SHOTGUN_FLAMEGRAPH` - whether flamegraph should be created, don't use if test is longer than 60 seconds otherwise runner will run out of space and the job will fail
- `SHOTGUN_SERVER_THREADS` - default 16 is also max; can be used to limit online CPUs to test performance with fewer cores
- `SHOTGUN_CI_IMAGE_TAG` - leave it as `latest` unless you know what you're doing
Parameters `SHOTGUN_TEST_VERSION`, `SHOTGUN_SCENARIO`, `SHOTGUN_TRAFFIC_MULTIPLIER` accept either one string or list of strings in Python syntax: `['main', 'v9_16_15', '1234567abcdef']`. If multiple parameters contain list then Cartesian product of all provided values is tested.
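The expansion described above can be sketched in Python. This is only an illustration of how a single string or a Python-syntax list could be turned into a Cartesian product of test combinations; the helper names `parse_param` and `expand_combinations` are hypothetical and are not taken from the actual CI scripts.

```python
import ast
from itertools import product


def parse_param(value):
    """Return a list of strings from either a plain string or a
    Python-syntax list such as "['main', 'v9_16_15']".
    (Hypothetical helper, not the real CI implementation.)"""
    text = value.strip()
    if text.startswith("["):
        # literal_eval only evaluates literals, so it is safe for user input
        return [str(item) for item in ast.literal_eval(text)]
    return [text]


def expand_combinations(versions, scenarios, multipliers):
    """Cartesian product of all provided values, one tuple per test setup."""
    return list(
        product(
            parse_param(versions),
            parse_param(scenarios),
            parse_param(multipliers),
        )
    )


combos = expand_combinations("['main', 'v9_16_15']", "udp", "['10', '12', '14']")
for version, scenario, multiplier in combos:
    print(version, scenario, multiplier)
```

With two versions, one scenario and three multipliers this yields six combinations, each of which would then be repeated `SHOTGUN_ROUNDS` times.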
@@ -26,7 +28,7 @@ Example #1: Determining maximum load for one version
- `SHOTGUN_DURATION` = `60`
- `SHOTGUN_ROUNDS` = 3
Run specified version and fire at it over UDP "10 x base load", "12 x base load", "14 x base load". Repeat three times for each load value. Produces 9 (3 load x 3 runs) charts with response rates + one for reference v9_11_31. Good for finding maximum load by determining when response rate starts dropping.
Run specified version and fire at it over UDP "10 x base load", "12 x base load", "14 x base load". Repeat three times for each load value. Produces 9 (3 load x 3 runs) charts with response rates. Good for finding maximum load by determining when response rate starts dropping.
Example #2: Comparing performance between versions
- `SHOTGUN_TEST_VERSION` = `['main', 'v9_16_15', '1234567abcdef']`
@@ -35,34 +37,26 @@ Example #2: Comparing performance between versions
- `SHOTGUN_DURATION` = `60`
- `SHOTGUN_ROUNDS` = 3
Run each version three times, and fire "10 x base load" at it over UDP (roughly 10 x 8 k QPS). Produces 9 (3 versions x 3 runs) charts with response rates + one for reference v9_11_31. Good for comparison between versions. Assumes the load (traffic multiplier) is set to a value where at least one version is able to keep up, otherwise it would be hard to interpret results.
Run each version three times, and fire "10 x base load" at it over UDP (roughly 10 x 9.4 k QPS). Produces 9 (3 versions x 3 runs) charts with response rates. Good for comparison between versions. Assumes the load (traffic multiplier) is set to a value where at least one version is able to keep up, otherwise it would be hard to interpret results.
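The arithmetic behind both examples can be written down as a small Python sketch. The base load of 9.4 k QPS is an assumption derived from the "10 x 9.4 k QPS" figure above; the function names are hypothetical and only illustrate the calculation.

```python
# Assumption: base load inferred from "10 x 9.4 k QPS" in the text above.
BASE_LOAD_QPS = 9_400


def approx_qps(multiplier, base=BASE_LOAD_QPS):
    """Approximate query rate for a given SHOTGUN_TRAFFIC_MULTIPLIER."""
    return multiplier * base


def chart_count(n_versions, n_scenarios, n_multipliers, rounds):
    """Number of per-run charts: one per tested combination per round."""
    return n_versions * n_scenarios * n_multipliers * rounds


print(approx_qps(10))            # default multiplier
print(chart_count(1, 1, 3, 3))   # Example #1: one version, three loads, three rounds
print(chart_count(3, 1, 1, 3))   # Example #2: three versions, one load, three rounds
```

Both examples come out at nine charts, which matches the counts given in the text.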
Start test
--------------
Go to https://gitlab.isc.org/isc-projects/bind9/-/pipelines/new and select branch `pspacek/ci-aws-integr2`. Do not worry, you will enter versions to be tested later.
Go to https://gitlab.isc.org/isc-projects/bind9-shotgun-ci/-/pipelines/new, fill in the parameters and click on `Run pipeline`.
Wait a couple of seconds until Gitlab shows you this form:
![form](uploads/345439bbcce5fa150b20b4040ffd388d/form.png)
(If it does not show up, reload page and click around for a while. This Gitlab form can be flaky.)
Fill in parameters and click on `Run pipeline`.
![2022-08-22_131328](uploads/fae2d4ae547d199558e00d5525380681/2022-08-22_131328.png)
Getting results
----------------
Go to https://gitlab.isc.org/isc-projects/bind9/-/pipelines and find your new pipeline. Beware, branch will be shown as `pspacek/ci-aws-integr2`. Open the "job map":
Go to https://gitlab.isc.org/isc-projects/bind9-shotgun-ci/-/pipelines and find your new pipeline. The pipeline will dynamically create a "[child pipeline](https://docs.gitlab.com/ee/ci/parent_child_pipelines.html)" where the actual jobs and results will be. To access it, you can click on the performance or downstream job.
![Screenshot_2021-05-06_Pipeline___ISC_Open_Source_Projects_BIND](uploads/cfbbb6c314b536c7edd190f1488538c1/Screenshot_2021-05-06_Pipeline___ISC_Open_Source_Projects_BIND.png)
![2022-08-22_132016](uploads/f42018a05ff2b6244468ee68032ed4ba/2022-08-22_132016.png)
Eventually the job map will dynamically add a "[child pipeline](https://docs.gitlab.com/ee/ci/parent_child_pipelines.html)". Here do the least obvious thing and click on the little black arrow (denoted by the huge red arrow), and it will magically expand and show all the test jobs and final postprocessing job.
![2022-08-22_132426](uploads/fa093e0d6e7fe0b06d03b046814bee37/2022-08-22_132426.png)
![job-map-with-arrow](uploads/f92a2013c1e58b150e18c80b41b1e0e8/job-map-with-arrow.png)
Wait until all jobs are finished. Then download all artifacts from the `postproc` job; it has all the charts and also (optionally) profiling flame charts for each run.
Wait until all jobs are finished. Then download all artifacts from the `postproc` job; it has all the charts and also profiling flame charts for each run.
SVG charts in the artifacts are various representations of the same data. Depending on what you are trying to find out it might be beneficial to either look at individual runs and study rcodes, or look at summary charts for all runs without rcodes etc.
The charts in the artifacts are various representations of the same data. Depending on what you are trying to find out it might be beneficial to either look at individual runs and study rcodes, or look at summary charts for all runs without rcodes etc.
Interpretation
@@ -73,7 +67,7 @@ Obviously DNS Shotgun does not provide information "why" something is happening,
Bear in mind that we are testing against the live Internet, so results are noisy and can change over time. Do not compare "old" and "new" results; it's better to retest than to chase ghosts of non-existing performance regressions.
Obviously you can also ask @pspacek :-)
Obviously you can also ask @tkrizek or @pspacek :-)
Implementation overview