mirror of https://gitlab.isc.org/isc-projects/kea synced 2025-08-30 05:27:55 +00:00

[#3278] Updated ARM

Added ChangeLog entry, updated ARM, and changed
enable-monitoring to false by default.

doc/sphinx/arm/hooks-perfmon.rst
    Added a good deal more information

src/hooks/dhcp/perfmon/perfmon_config.cc
src/hooks/dhcp/perfmon/perfmon_mgr.cc
    enable-monitoring now defaults to false. Users must
    explicitly enable it.

src/hooks/dhcp/perfmon/tests/perfmon_config_unittests.cc
src/hooks/dhcp/perfmon/tests/perfmon_mgr_unittests.cc
    Updated tests
Thomas Markwalder 2024-03-20 15:20:23 -04:00
parent e7c5bbb5e2
commit 12e4dddcc5
6 changed files with 177 additions and 17 deletions


@ -1,3 +1,9 @@
2214. [func] tmark
PerfMon hook library can now parse its configuration
and the ARM has been updated with more detailed
information. Functionality is still limited.
(Gitlab #3278)
Kea 2.5.7 (development) released on March 27, 2024
2213. [build] razvan


@ -9,7 +9,7 @@ to extend them with the ability to track and report performance related data.
.. note::
This library is currently under development and not yet functional.
This library is currently under development and not fully functional.
Overview
~~~~~~~~
@ -32,8 +32,161 @@ the server's configuration:
],
...
}
..
It tracks the life cycle of client query packets as they are processed by Kea,
beginning with when the query was received by the kernel to when the response
is ready to be sent. This tracking is driven by a packet event stack which
contains a list of event/timestamp pairs for significant milestones that
occur during the processing of a client query. The list of possible events is
shown below:
#. socket_received - Kernel placed the client query into the socket buffer
#. buffer_read - Server read the client query from the socket buffer
#. mt_queued - Server placed the client query into its thread-pool work queue (MT mode only)
#. process_started - Server has begun processing the query
#. process_completed - Server has constructed the response and is ready to send it
This list is likely to expand over time. It will also be possible for hook developers
to add their own events. This will be detailed in a future release.
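As a rough illustration of how an event stack could be consumed, the C++ sketch below measures
the time between two labelled events. The ``PktEvent`` type and the ``elapsedBetween`` function
are hypothetical names invented for this example; they are not part of the hook library's API.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <vector>

// Hypothetical stand-in for one event/timestamp pair on the packet event stack.
struct PktEvent {
    std::string label;
    std::chrono::system_clock::time_point timestamp;
};

// Finds the first event labelled start_label and the first later event
// labelled stop_label. On success stores the time between them in
// elapsed and returns true; otherwise returns false and leaves elapsed alone.
bool elapsedBetween(const std::vector<PktEvent>& events,
                    const std::string& start_label,
                    const std::string& stop_label,
                    std::chrono::microseconds& elapsed) {
    // A duration needs two distinct milestones.
    assert(start_label != stop_label);

    bool have_start = false;
    std::chrono::system_clock::time_point start;
    for (const auto& event : events) {
        if (!have_start) {
            if (event.label == start_label) {
                start = event.timestamp;
                have_start = true;
            }
        } else if (event.label == stop_label) {
            elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
                event.timestamp - start);
            return true;
        }
    }
    return false;
}
```

Applied to the sample :iscman:`kea-dhcp4` entry in the Passive Event Logging section, where
socket_received is stamped 14:52:20.067563 and buffer_read 14:52:20.067810, the two events are
247 microseconds apart.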
Passive Event Logging
~~~~~~~~~~~~~~~~~~~~~
As long as the PerfMon hook library is loaded it will log the packet event stack
contents for each client query which generates a response packet. The log entry
will contain client query identifiers followed by the list of event/timestamp pairs
in the order they occurred.
For :iscman:`kea-dhcp4` the log is identified by the label, ``PERFMON_DHCP4_PKT_EVENTS``,
and emitted at logger debug level 50 or higher. For a DHCPDISCOVER it is emitted
once the DHCPOFFER is ready to send and will look similar to the following (see
the second entry)::
2024-03-20 10:52:20.069 INFO [kea-dhcp4.leases/50033.140261252249344] DHCP4_LEASE_OFFER [hwtype=1 08:00:27:25:d3:f4], cid=[no info], tid=0xc288f9: lease 178.16.2.0 will be offered
2024-03-20 10:52:20.070 DEBUG [kea-dhcp4.perfmon-hooks/50033.140261252249344] PERFMON_DHCP4_PKT_EVENTS query: [hwtype=1 08:00:27:25:d3:f4], cid=[no info], tid=0xc288f9 events=[2024-Mar-20 14:52:20.067563 : socket_received, 2024-Mar-20 14:52:20.067810 : buffer_read, 2024-Mar-20 14:52:20.067897 : mt_queued, 2024-Mar-20 14:52:20.067952 : process_started, 2024-Mar-20 14:52:20.069614 : process_completed]
..
For :iscman:`kea-dhcp6` the log is identified by the label, ``PERFMON_DHCP6_PKT_EVENTS``,
and emitted only at logger debug level 50 or higher. For a DHCPV6_SOLICIT it is emitted
once the DHCPV6_ADVERTISE is ready to send and will look similar to the following (see
the second entry)::
2024-03-20 10:22:14.030 INFO [kea-dhcp6.leases/47195.139913679886272] DHCP6_LEASE_ADVERT duid=[00:03:00:01:08:00:27:25:d3:f4], [no hwaddr info], tid=0xb54806: lease for address 3002:: and iaid=11189196 will be advertised
2024-03-20 10:22:14.031 DEBUG [kea-dhcp6.perfmon-hooks/47195.139913679886272] PERFMON_DHCP6_PKT_EVENTS query: duid=[00:03:00:01:08:00:27:25:d3:f4], [no hwaddr info], tid=0xb54806 events=[2024-Mar-20 14:22:14.028729 : socket_received, 2024-Mar-20 14:22:14.028924 : buffer_read, 2024-Mar-20 14:22:14.029005 : process_started, 2024-Mar-20 14:22:14.030566 : process_completed]
..
Duration Monitoring
~~~~~~~~~~~~~~~~~~~
When monitoring is enabled, stack event data will be aggregated over a specified interval. These
aggregates are referred to as monitored durations or simply durations for ease. Durations are
uniquely identified by a "duration key" which consists of the following values:
* query message type - Message type of the client query (e.g. DHCPDISCOVER, DHCPV6_REQUEST)
* response message type - Message type of the server response (e.g. DHCPOFFER, DHCPV6_REPLY)
* start event label - Event that defines the beginning of the task (e.g. socket_received, process_started)
* stop event label - Event that defines the end of the task (e.g. buffer_read, process_completed)
* subnet id - subnet selected during message processing (or 0 for global durations)
As client queries are responded to, their event stacks are used to update the monitored
durations. When the sampling interval has been reached for a given duration, it is reported.
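The aggregation described above can be pictured with the following C++ sketch. It is only a
model of the idea, not the library's implementation: ``DurationKey``, ``DurationAggregate`` and
``DurationStore`` are hypothetical names, and the choice to track a count, total, minimum and
maximum per key is an assumption made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

using Micros = std::chrono::microseconds;

// Hypothetical duration key holding the five values listed above.
struct DurationKey {
    std::string query_type;
    std::string response_type;
    std::string start_event;
    std::string stop_event;
    uint32_t subnet_id;

    // Lexicographic ordering over all five values so keys can index a std::map.
    bool operator<(const DurationKey& other) const {
        return std::tie(query_type, response_type, start_event,
                        stop_event, subnet_id) <
               std::tie(other.query_type, other.response_type, other.start_event,
                        other.stop_event, other.subnet_id);
    }
};

// Hypothetical running aggregate for one key within the current interval.
struct DurationAggregate {
    uint64_t occurrences = 0;
    Micros total{0};
    Micros min_duration{0};
    Micros max_duration{0};

    // Folds one measured duration into the aggregate.
    void addSample(Micros sample) {
        assert(sample.count() >= 0);
        if (occurrences == 0 || sample < min_duration) {
            min_duration = sample;
        }
        if (occurrences == 0 || sample > max_duration) {
            max_duration = sample;
        }
        total += sample;
        ++occurrences;
    }

    // Mean of the samples seen so far, or zero when there are none.
    Micros average() const {
        return (occurrences ? total / static_cast<Micros::rep>(occurrences)
                            : Micros(0));
    }
};

// One aggregate per distinct key.
using DurationStore = std::map<DurationKey, DurationAggregate>;
```

In this model two queries that differ only in subnet id update two separate aggregates, while a
key with subnet id 0 would collect the global figure.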
.. note::
Monitoring is not yet functional.
Alarms
~~~~~~
Alarms may be defined to watch specific durations. Each alarm defines a high-water mark and a
low-water mark. When the reported average value for a duration exceeds the high-water mark, a
WARN level alarm log will be emitted, at which point the alarm is considered "triggered". Once
triggered, the WARN level log will be repeated at a specified alarm report interval as long as the
reported average for the duration remains above the low-water mark. Once the average falls
below the low-water mark, the alarm is "cleared" and an INFO level log will be emitted.
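The trigger and clear behavior described above amounts to a small two-state machine with
hysteresis. The C++ sketch below shows one way to express it. The ``Alarm`` class and
``AlarmAction`` enum are hypothetical and written only for this illustration. How an average
exactly equal to the low-water mark is treated is also an assumption: here the alarm stays
triggered.

```cpp
#include <cassert>
#include <chrono>

using Millis = std::chrono::milliseconds;
using Seconds = std::chrono::seconds;

// What the caller should log after a reported average has been checked.
enum class AlarmAction { NONE, LOG_WARN, LOG_INFO };

// Hypothetical model of a single alarm.
class Alarm {
public:
    Alarm(Millis low_water, Millis high_water, Seconds report_interval)
        : low_water_(low_water), high_water_(high_water),
          report_interval_(report_interval), triggered_(false),
          last_report_(0) {
        assert(low_water > Millis(0) && low_water < high_water);
    }

    // average: the newly reported average for the watched duration.
    // now: seconds elapsed on any steadily increasing clock; passed in
    // rather than read from the system so the logic is easy to exercise.
    AlarmAction check(Millis average, Seconds now) {
        if (!triggered_) {
            if (average > high_water_) {
                // Crossing the high-water mark triggers the alarm.
                triggered_ = true;
                last_report_ = now;
                return AlarmAction::LOG_WARN;
            }
            return AlarmAction::NONE;
        }
        if (average < low_water_) {
            // Falling below the low-water mark clears the alarm.
            triggered_ = false;
            return AlarmAction::LOG_INFO;
        }
        if (now - last_report_ >= report_interval_) {
            // Still triggered: repeat the warning once per report interval.
            last_report_ = now;
            return AlarmAction::LOG_WARN;
        }
        return AlarmAction::NONE;
    }

    bool triggered() const { return triggered_; }

private:
    Millis low_water_;
    Millis high_water_;
    Seconds report_interval_;
    bool triggered_;
    Seconds last_report_;
};
```

Keeping the two marks apart prevents an average that hovers near a single threshold from
flipping the alarm on and off with every report.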
.. note::
Alarms are not yet functional.
API Commands
~~~~~~~~~~~~
Commands to enable or disable monitoring, clear or alter alarms, and fetch duration data
are anticipated but not yet supported.
Configuration
~~~~~~~~~~~~~
TBD
An example of the anticipated configuration is shown below:
.. code-block:: javascript
{
"hooks-libraries": [
{
"library": "lib/kea/hooks/libdhcp_perfmon.so",
"parameters": {
"enable-monitoring" : true,
"interval-width-secs" : 5,
"stats-mgr-reporting" : true,
"alarm-report-secs" : 600,
"alarms": [
{
"duration-key": {
"query-type" : "DHCPDISCOVER",
"response-type" : "DHCPOFFER",
"start-event" : "process_started",
"stop-event" : "process_completed",
"subnet-id" : 0
},
"enable-alarm" : true,
"high-water-ms" : 500,
"low-water-ms" : 25
}]
}
}]
}
..
Where:
* enable-monitoring
Enables event data aggregation for reporting, statistics, and alarms. Defaults to false.
* interval-width-secs
The amount of time, in seconds, that individual task durations are accumulated into an
aggregate before it is reported. Default is 60 seconds.
* stats-mgr-reporting
Enables reporting aggregates to StatsMgr. Defaults to true.
* alarm-report-secs
The amount of time, in seconds, between logging for an alarm once it has been triggered.
Defaults to 300 seconds.
* alarms
An optional list of alarms that monitor specific duration aggregates. Each alarm is
defined by the following:
* duration-key
Identifies the monitored duration to watch.
* query-type - Message type of the client query (e.g. DHCPDISCOVER, DHCPV6_REQUEST)
* response-type - Message type of the server response (e.g. DHCPOFFER, DHCPV6_REPLY)
* start-event - Event that defines the beginning of the task (e.g. socket_received, process_started)
* stop-event - Event that defines the end of the task (e.g. buffer_read, process_completed)
* subnet-id - subnet selected during message processing (or 0 for global durations)
* enable-alarm
Enables or disables this alarm. Defaults to true.
* high-water-ms
The value, in milliseconds, that must be exceeded to trigger this alarm.
Must be greater than zero.
* low-water-ms
The value, in milliseconds, that the average must fall below to clear this alarm.
Must be greater than zero but less than high-water-ms.
.. note::
Passive event logging is always enabled, even without specifying the 'parameters' section.
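To summarize the defaults and constraints listed above, the C++ sketch below models the
parameters as plain structures with a small validation helper. ``PerfMonSettings``,
``AlarmSettings``, ``waterMarkError`` and ``makeAlarm`` are hypothetical names used only for
this illustration; the hook library's own configuration classes may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical mirror of the global parameters, initialized to the
// documented defaults.
struct PerfMonSettings {
    bool enable_monitoring = false;
    uint32_t interval_width_secs = 60;
    bool stats_mgr_reporting = true;
    uint32_t alarm_report_secs = 300;
};

// Hypothetical mirror of the per-alarm parameters (duration key omitted).
struct AlarmSettings {
    bool enable_alarm = true;
    uint32_t high_water_ms = 0;
    uint32_t low_water_ms = 0;
};

// Returns an empty string when the water marks satisfy the documented
// constraints, otherwise a description of the first violation found.
std::string waterMarkError(uint32_t high_water_ms, uint32_t low_water_ms) {
    if (high_water_ms == 0) {
        return "high-water-ms must be greater than zero";
    }
    if (low_water_ms == 0) {
        return "low-water-ms must be greater than zero";
    }
    if (low_water_ms >= high_water_ms) {
        return "low-water-ms must be less than high-water-ms";
    }
    return std::string();
}

// Builds alarm settings from values that have already passed waterMarkError().
AlarmSettings makeAlarm(uint32_t high_water_ms, uint32_t low_water_ms) {
    assert(waterMarkError(high_water_ms, low_water_ms).empty());
    AlarmSettings alarm;
    alarm.high_water_ms = high_water_ms;
    alarm.low_water_ms = low_water_ms;
    return alarm;
}
```

Under these rules the example configuration's alarm, with a high-water mark of 500 and a
low-water mark of 25, is accepted, while swapping the two values is rejected.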


@ -264,7 +264,7 @@ PerfMonConfig::CONFIG_KEYWORDS =
PerfMonConfig::PerfMonConfig(uint16_t family)
: family_(family),
enable_monitoring_(true),
enable_monitoring_(false),
interval_width_secs_(60),
stats_mgr_reporting_(true),
alarm_report_secs_(300) {


@ -27,7 +27,8 @@ PerfMonMgr::PerfMonMgr(uint16_t family_)
void PerfMonMgr::configure(const ConstElementPtr & params) {
if (!params) {
isc_throw(dhcp::DhcpConfigError, "params must not be null");
// User wants passive logging only.
setEnableMonitoring(false);
return;
}


@ -44,15 +44,15 @@ public:
// Verify initial values.
ASSERT_NO_THROW_LOG(config.reset(new PerfMonConfig(family_)));
ASSERT_TRUE(config);
EXPECT_TRUE(config->getEnableMonitoring());
EXPECT_FALSE(config->getEnableMonitoring());
EXPECT_EQ(config->getIntervalWidthSecs(), 60);
EXPECT_TRUE(config->getStatsMgrReporting());
EXPECT_EQ(config->getAlarmReportSecs(), 300);
EXPECT_TRUE(config->getAlarmStore());
// Verify accessors.
EXPECT_NO_THROW_LOG(config->setEnableMonitoring(false));
EXPECT_FALSE(config->getEnableMonitoring());
EXPECT_NO_THROW_LOG(config->setEnableMonitoring(true));
EXPECT_TRUE(config->getEnableMonitoring());
EXPECT_NO_THROW_LOG(config->setIntervalWidthSecs(4));
EXPECT_EQ(config->getIntervalWidthSecs(), 4);
@ -65,7 +65,7 @@ public:
// Verify shallow copy construction.
PerfMonConfigPtr config2(new PerfMonConfig(*config));
EXPECT_FALSE(config2->getEnableMonitoring());
EXPECT_TRUE(config2->getEnableMonitoring());
EXPECT_EQ(config2->getIntervalWidthSecs(), 4);
EXPECT_FALSE(config2->getStatsMgrReporting());
EXPECT_EQ(config2->getAlarmReportSecs(), 120);
@ -92,43 +92,43 @@ public:
// Empty map
__LINE__,
R"({ })",
true, 60, true, 300
false, 60, true, 300
},
{
// Only enable-monitoring",
__LINE__,
R"({ "enable-monitoring" : false })",
false, 60, true, 300
R"({ "enable-monitoring" : true })",
true, 60, true, 300
},
{
// Only interval-width-secs",
__LINE__,
R"({ "interval-width-secs" : 3 })",
true, 3, true, 300
false, 3, true, 300
},
{
// Only stats-mgr-reporting",
__LINE__,
R"({ "stats-mgr-reporting" : false })",
true, 60, false, 300
false, 60, false, 300
},
{
// Only alarm-report-secs",
__LINE__,
R"({ "alarm-report-secs" : 77 })",
true, 60, true, 77
false, 60, true, 77
},
{
// All parameters",
__LINE__,
R"(
{
"enable-monitoring" : false,
"enable-monitoring" : true,
"interval-width-secs" : 2,
"stats-mgr-reporting" : false,
"alarm-report-secs" : 120
})",
false, 2, false, 120
true, 2, false, 120
},
};


@ -58,7 +58,7 @@ public:
// Verify initial values.
ASSERT_NO_THROW_LOG(mgr.reset(new PerfMonMgr(family_)));
ASSERT_TRUE(mgr);
EXPECT_TRUE(mgr->getEnableMonitoring());
EXPECT_FALSE(mgr->getEnableMonitoring());
EXPECT_EQ(mgr->getIntervalDuration(), seconds(60));
EXPECT_TRUE(mgr->getStatsMgrReporting());
EXPECT_EQ(mgr->getAlarmReportInterval(), seconds(300));