mirror of
https://gitlab.isc.org/isc-projects/kea
synced 2025-09-03 07:25:18 +00:00
[#3297] Updated ARM, added a ChangeLog
This commit is contained in:
@@ -1,3 +1,10 @@
|
|||||||
|
2216. [func] tmark
|
||||||
|
PerfMon hook library is now functional. It accumulates
|
||||||
|
and reports performance data, and supports alarms. Still
|
||||||
|
lacking are API commands.
|
||||||
|
(Gitlab #3297)
|
||||||
|
|
||||||
|
|
||||||
2215. [bug] tmark
|
2215. [bug] tmark
|
||||||
Corrected an issue that can cause kea-dhcp-ddns to
|
Corrected an issue that can cause kea-dhcp-ddns to
|
||||||
stop processing queued requests. Thanks to Shawn
|
stop processing queued requests. Thanks to Shawn
|
||||||
|
@@ -9,7 +9,8 @@ to extend them with the ability to track and report performance related data.
|
|||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
This library is currently under development and not fully functional.
|
This library is experimental and not recommended for use in production
|
||||||
|
environments.
|
||||||
|
|
||||||
Overview
|
Overview
|
||||||
~~~~~~~~
|
~~~~~~~~
|
||||||
@@ -93,23 +94,127 @@ uniquely identified by a "duration key" which consists of the following values:
|
|||||||
Once the server has finished constructing a response to a query, the query's event stack
|
Once the server has finished constructing a response to a query, the query's event stack
|
||||||
is processed into a series of updates to monitored durations. If upon updating, a
|
is processed into a series of updates to monitored durations. If upon updating, a
|
||||||
duration's sample interval is found to have been completed, it is sent to reporting
|
duration's sample interval is found to have been completed, it is sent to reporting
|
||||||
and a new sample interval is begun.
|
and a new sample interval is begun. The interval width is dictacted by configuration
|
||||||
|
parameter ``interval-width-secs``.
|
||||||
|
|
||||||
.. Note:
|
The event stack for the multi-threaded mode DHCPDISCOVER/DHCPOFFER cycle shown above
|
||||||
Monitoring is not yet functional.
|
contains the following events:
|
||||||
|
|
||||||
|
+-----------------------------+--------------------+
|
||||||
|
| Event Timestamp | Event Label |
|
||||||
|
+=============================+====================+
|
||||||
|
| 2024-Mar-20 14:52:20.067563 | socket_received |
|
||||||
|
+-----------------------------+--------------------+
|
||||||
|
| 2024-Mar-20 14:52:20.067810 | buffer_read |
|
||||||
|
+-----------------------------+--------------------+
|
||||||
|
| 2024-Mar-20 14:52:20.067897 | mt_queued |
|
||||||
|
+-----------------------------+--------------------+
|
||||||
|
| 2024-Mar-20 14:52:20.067952 | process_started |
|
||||||
|
+-----------------------------+--------------------+
|
||||||
|
| 2024-Mar-20 14:52:20.069614 | process_completed |
|
||||||
|
+-----------------------------+--------------------+
|
||||||
|
|
||||||
|
Assuming the selected subnet's ID was 100, the duration updates formed by PerfMon
|
||||||
|
from these events are shown below:
|
||||||
|
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| Duration Keys for SubnetID 100 | Update in |
|
||||||
|
| | microseconds |
|
||||||
|
+==============================================================+==============+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.socket_received-buffer_read.100 | 247 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.buffer_read-mt_queue.100 | 87 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.mt_queued-process_started.100 | 55 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.process_started-process_completed.100 | 1662 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.composite-total_response.100 | 2051 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
|
||||||
|
Notice that in addition to the adjacent event updates, there is an additional duration
|
||||||
|
update for the total duration of the entire stack whose key contains the event-pair
|
||||||
|
``composite-total_response``. This tracks the total time to responds from query
|
||||||
|
received until the response is ready to send. Finally, there would also be global
|
||||||
|
duration updates for each of the above:
|
||||||
|
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| Global Duration Keys | Update in |
|
||||||
|
| | milliseconds |
|
||||||
|
+==============================================================+==============+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.socket_received-buffer_read.0 | 247 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.buffer_read-mt_queue.0 | 87 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.mt_queued-process_started.0 | 55 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.process_started-process_completed.0 | 1662 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
| DHCPDISCOVER.DHCPOFFER.composite-total_response.0 | 2051 |
|
||||||
|
+--------------------------------------------------------------+--------------+
|
||||||
|
|
||||||
|
Statistics Reporting
|
||||||
|
~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
When enabled (see ``stats-mgr-reporting``), PerfMon will report a duration's data
|
||||||
|
data each time the duration completes a sampling interval. Each statistic employs
|
||||||
|
the following naming convention:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
{subnet-id[x]}.perfmon.<query type>-<response type>.<start event>-<end event>.<value-name>
|
||||||
|
|
||||||
|
And there will be both a global and a subnet-specific value for each. Currently the only
|
||||||
|
value reported for a given duration key is "average-ms". This statistic is the average time
|
||||||
|
between the duration's event pair over the most recently completed interval. In other
|
||||||
|
words if during a given interval there were 7 occurrences (i.e. updates) totaling
|
||||||
|
350ms, the the average-ms reported would be 50ms. Continuing with example above, the
|
||||||
|
statistics reported would be named as follows for subnet level values:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
subnet[100].perfmon.DHCPDISCOVER.DHCPOFFER.socket_received-buffer_read.average-ms
|
||||||
|
subnet[100].perfmon.DHCPDISCOVER.DHCPOFFER.buffer_read-mt_queue.average-ms
|
||||||
|
subnet[100].perfmon.DHCPDISCOVER.DHCPOFFER.mt_queued-process_started.average-ms
|
||||||
|
subnet[100].perfmon.DHCPDISCOVER.DHCPOFFER.process_started-process_completed.average-ms
|
||||||
|
subnet[100].perfmon.DHCPDISCOVER.DHCPOFFER.composite-total_response.average-ms
|
||||||
|
|
||||||
|
and as shown for global values:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
perfmon.DHCPDISCOVER.DHCPOFFER.socket_received-buffer_read.average-ms
|
||||||
|
perfmon.DHCPDISCOVER.DHCPOFFER.buffer_read-mt_queue.average-ms
|
||||||
|
perfmon.DHCPDISCOVER.DHCPOFFER.mt_queued-process_started.average-ms
|
||||||
|
perfmon.DHCPDISCOVER.DHCPOFFER.process_started-process_completed.average-ms
|
||||||
|
perfmon.DHCPDISCOVER.DHCPOFFER.composite-total_response.average-ms
|
||||||
|
|
||||||
|
Since they are reported to StatsMgr they may be fetched using the commands :isccmd:`statistic-get-all`
|
||||||
|
or :isccmd:`statistic-get`.
|
||||||
|
|
||||||
Alarms
|
Alarms
|
||||||
~~~~~~
|
~~~~~~
|
||||||
|
|
||||||
Alarms may be defined to watch specific durations. Each alarm defines a high-water mark and a
|
Alarms may be defined to watch specific durations. Each alarm defines a high-water mark,
|
||||||
low-water mark. When the reported average value for duration exceeds the high-water mark, a
|
``high-water-ms`` and a low-water mark, ``low-water-ms``. When the reported average value
|
||||||
WARN level alarm log will be emitted at which point the alarm is considered "triggered". Once
|
for duration exceeds the high-water mark, a WARN level alarm log will be emitted at which
|
||||||
triggered the WARN level log will be repeated at a specified alarm report interval as long the
|
point the alarm is considered "triggered". Once triggered the WARN level log will be
|
||||||
reported average for the duration remains above the low-water mark. Once the average falls
|
repeated at a specified alarm report interval, ``alarm-report-secs`` as long the reported
|
||||||
below the low-water mark the alarm is "cleared" and an INFO level log will be emitted.
|
average for the duration remains above the low-water mark. Once the average falls below the
|
||||||
|
low-water mark the alarm is "cleared" and an INFO level log will be emitted.
|
||||||
|
|
||||||
.. Note:
|
The alarm triggered WARN log will look similar to the following:
|
||||||
Alarms are not yet functional.
|
|
||||||
|
::
|
||||||
|
|
||||||
|
2024-03-20 10:22:14.030 WARN [kea-dhcp6.leases/47195.139913679886272] PERFMON_ALARM_TRIGGERED Alarm for DHCPDISCOVER.DHCPOFFER.composite-total_response.0 has been triggered since 2024-03-20 10:18:20.070000, reported average duration 00:00:00.700000 exceeds high-water-ms: 500
|
||||||
|
|
||||||
|
|
||||||
|
and he alarm cleared INFO log will look similar to the following:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
2024-03-20 10:30:14.030 INFO [kea-dhcp6.leases/47195.139913679886272] PERFMON_ALARM_CLEARED Alarm for DHCPDISCOVER.DHCPOFFER.composite-total_response.0 has been cleared, reported average duration 00:00:00.010000 is now below low-water-ms: 25
|
||||||
|
|
||||||
API Commands
|
API Commands
|
||||||
~~~~~~~~~~~~
|
~~~~~~~~~~~~
|
||||||
@@ -153,7 +258,7 @@ An example of the anticipated configuration is shown below:
|
|||||||
Where:
|
Where:
|
||||||
|
|
||||||
* enable-monitoring
|
* enable-monitoring
|
||||||
Enables event data aggregation for reporting, statisitcs, and alarms. Defaults to false.
|
Enables event data aggregation for reporting, statistics, and alarms. Defaults to false.
|
||||||
* interval-width-secs
|
* interval-width-secs
|
||||||
The amount of time, in seconds, that individual task durations are accumulated into an
|
The amount of time, in seconds, that individual task durations are accumulated into an
|
||||||
aggregate before it is reported. Default is 60 seconds.
|
aggregate before it is reported. Default is 60 seconds.
|
||||||
@@ -167,7 +272,7 @@ Where:
|
|||||||
defined by the following:
|
defined by the following:
|
||||||
|
|
||||||
* duration-key
|
* duration-key
|
||||||
Idenitifies the monitored duration to watch
|
Identifies the monitored duration to watch
|
||||||
|
|
||||||
* query-type - Message type of the client query (e.g.DHCPDISCOVER, DHCPV6_REQUEST)
|
* query-type - Message type of the client query (e.g.DHCPDISCOVER, DHCPV6_REQUEST)
|
||||||
* response-type - Message type of the server response (e.g. DHCPOFFER, DHCPV6_REPLY)
|
* response-type - Message type of the server response (e.g. DHCPOFFER, DHCPV6_REPLY)
|
||||||
|
@@ -20,7 +20,7 @@ extern const isc::log::MessageID PERFMON_INIT_OK = "PERFMON_INIT_OK";
|
|||||||
namespace {
|
namespace {
|
||||||
|
|
||||||
const char* values[] = {
|
const char* values[] = {
|
||||||
"PERFMON_ALARM_CLEARED", "Alarm for %1 has been cleared, reported average duration %2 is now beloe low-water-ms: %3",
|
"PERFMON_ALARM_CLEARED", "Alarm for %1 has been cleared, reported average duration %2 is now below low-water-ms: %3",
|
||||||
"PERFMON_ALARM_TRIGGERED", "Alarm for %1 has been triggered since %2, reported average duration %3 exceeds high-water-ms: %4",
|
"PERFMON_ALARM_TRIGGERED", "Alarm for %1 has been triggered since %2, reported average duration %3 exceeds high-water-ms: %4",
|
||||||
"PERFMON_DEINIT_FAILED", "unloading PerfMon hooks library failed: %1",
|
"PERFMON_DEINIT_FAILED", "unloading PerfMon hooks library failed: %1",
|
||||||
"PERFMON_DEINIT_OK", "unloading PerfMon hooks library successful",
|
"PERFMON_DEINIT_OK", "unloading PerfMon hooks library successful",
|
||||||
|
Reference in New Issue
Block a user