mirror of
https://gitlab.isc.org/isc-projects/kea
synced 2025-08-30 13:37:55 +00:00
[#3278] Updated ARM
Added ChangeLog entry, updated ARM, and changed enable-monitoring to false by default. doc/sphinx/arm/hooks-perfmon.rst Added a good deal more information src/hooks/dhcp/perfmon/perfmon_config.cc src/hooks/dhcp/perfmon/perfmon_mgr.cc enable-monitoring now defaults to false. Users must explicitly enable it. src/hooks/dhcp/perfmon/tests/perfmon_config_unittests.cc src/hooks/dhcp/perfmon/tests/perfmon_mgr_unittests.cc Updated tests
This commit is contained in:
@@ -1,3 +1,9 @@
|
||||
22124. [func] tmark
|
||||
PerfMon hook library can now parse its configuration
|
||||
and the ARM has been updated with more detailed
|
||||
information. Functionality is still limited.
|
||||
(Gitlab #3278)
|
||||
|
||||
Kea 2.5.7 (development) released on March 27, 2024
|
||||
|
||||
2213. [build] razvan
|
||||
|
@@ -9,7 +9,7 @@ to extend them with the ability to track and report performance related data.
|
||||
|
||||
.. note::
|
||||
|
||||
This library is currently under development and not yet functional.
|
||||
This library is currently under development and not fully functional.
|
||||
|
||||
Overview
|
||||
~~~~~~~~
|
||||
@@ -32,8 +32,161 @@ the server's configuration:
|
||||
],
|
||||
...
|
||||
}
|
||||
..
|
||||
|
||||
It tracks the life cycle of client query packets as they are processed by Kea,
|
||||
beginning with when the query was received by the kernel to when the response
|
||||
is ready to be sent. This tracking is driven by a packet event stack which
|
||||
contains a list of event/timestamp pairs for significant milestones that
|
||||
occur during the processing of a client query. The list of possible events is
|
||||
shown below:
|
||||
|
||||
#. socket_received - Kernel placed the client query into the socket buffer
|
||||
#. buffer_read - Server read the client query from the socket buffer
|
||||
#. mt_queued - Server placed the client query into its thread-pool work queue (MT mode only)
|
||||
#. process_started - Server has begun processing the query
|
||||
#. process_completed - Server has constructed the response and is ready to send it
|
||||
|
||||
This list is likely to expand over time. It will also be possible for hook developers
|
||||
to add their own events. This will be detailed in a future release in t.
|
||||
|
||||
Passive Event Logging
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
As long as the PerfMon hook library is loaded it will log the packet event stack
|
||||
contents for each client query which generates a response packet. The log entry
|
||||
will contain client query identifiers followed the list of event/timestamp pairs
|
||||
as they occurred in the order they occurred.
|
||||
|
||||
For :iscman:`kea-dhcp4` the log is identified by the label, ``PERFMON_DHCP4_PKT_EVENTS``,
|
||||
and emitted at logger debug level 50 or higher. For a DHCPDISCOVER it is emitted
|
||||
once the DHCPOFFER is ready to send and will look similar to the following (see
|
||||
the second entry)::
|
||||
|
||||
2024-03-20 10:52:20.069 INFO [kea-dhcp4.leases/50033.140261252249344] DHCP4_LEASE_OFFER [hwtype=1 08:00:27:25:d3:f4], cid=[no info], tid=0xc288f9: lease 178.16.2.0 will be offered
|
||||
2024-03-20 10:52:20.070 DEBUG [kea-dhcp4.perfmon-hooks/50033.140261252249344] PERFMON_DHCP4_PKT_EVENTS query: [hwtype=1 08:00:27:25:d3:f4], cid=[no info], tid=0xc288f9 events=[2024-Mar-20 14:52:20.067563 : socket_received, 2024-Mar-20 14:52:20.067810 : buffer_read, 2024-Mar-20 14:52:20.067897 : mt_queued, 2024-Mar-20 14:52:20.067952 : process_started, 2024-Mar-20 14:52:20.069614 : process_completed]
|
||||
|
||||
..
|
||||
|
||||
For :iscman:`kea-dhcp6` the log is identified by the label, ``PERFMON_DHCP6_PKT_EVENTS``,
|
||||
and emitted only at logger debug level 50 or higher. For a DHCPV6_SOLICIT it is emitted
|
||||
once the DHCPV6_ADVERTISE is ready to send and will look similar to the following (see
|
||||
the second entry)::
|
||||
|
||||
2024-03-20 10:22:14.030 INFO [kea-dhcp6.leases/47195.139913679886272] DHCP6_LEASE_ADVERT duid=[00:03:00:01:08:00:27:25:d3:f4], [no hwaddr info], tid=0xb54806: lease for address 3002:: and iaid=11189196 will be advertised
|
||||
2024-03-20 10:22:14.031 DEBUG [kea-dhcp6.perfmon-hooks/47195.139913679886272] PERFMON_DHCP6_PKT_EVENTS query: duid=[00:03:00:01:08:00:27:25:d3:f4], [no hwaddr info], tid=0xb54806 events=[2024-Mar-20 14:22:14.028729 : socket_received, 2024-Mar-20 14:22:14.028924 : buffer_read, 2024-Mar-20 14:22:14.029005 : process_started, 2024-Mar-20 14:22:14.030566 : process_completed]
|
||||
|
||||
..
|
||||
|
||||
Duration Monitoring
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
When monitoring is enabled, stack event data will be aggregated over a specified interval. These
|
||||
aggregates are referred to as monitored durations or simply durations for ease. Durations are
|
||||
uniquely identified by a "duration key" which consists of the following values:
|
||||
|
||||
* query message type - Message type of the client query (e.g.DHCPDISCOVER, DHCPV6_REQUEST)
|
||||
* response message type - Message type of the server response (e.g. DHCPOFFER, DHCPV6_REPLY)
|
||||
* start event label - Event that defines the beginning of the task (e.g. socket_received, process_started)
|
||||
* stop event label - Event that defines the end of the task (e.g. buffer_read, process_completed)
|
||||
* subnet id - subnet selected during message processing (or 0 for global durations)
|
||||
|
||||
As client queries are responded to their event stacks are used add to the monitored
|
||||
durations. When the sampling interval has been reached for a given duration, it is reported.
|
||||
|
||||
.. Note:
|
||||
Monitoring is not yet functional.
|
||||
|
||||
Alarms
|
||||
~~~~~~
|
||||
|
||||
Alarms may be defined to watch specific durations. Each alarm defines a high-water mark and a
|
||||
low-water mark. When the reported average value for duration exceeds the high-water mark, a
|
||||
WARN level alarm log will be emitted at which point the alarm is considered "triggered". Once
|
||||
triggered the WARN level log will be repeated at a specified alarm report interval as long the
|
||||
reported average for the duration remains above the low-water mark. Once the average falls
|
||||
below the low-water mark the alarm is "cleared" and an INFO level log will be emitted.
|
||||
|
||||
.. Note:
|
||||
Alarms are not yet functional.
|
||||
|
||||
API Commands
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Commands to enable or disable monitoring, clear or alter alarms, and fetch duration datax
|
||||
are anticipated but not yet supported.
|
||||
|
||||
Configuration
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
TBD
|
||||
An example of the anticipated configuration is shown below:
|
||||
|
||||
.. code-block:: javascript
|
||||
|
||||
{
|
||||
"hooks-libraries": [
|
||||
{
|
||||
"library": "lib/kea/hooks/libdhcp_perfmon.so",
|
||||
"parameters": {
|
||||
"enable-monitoring" : true,
|
||||
"interval-width-secs" : 5,
|
||||
"stats-mgr-reporting" : true,
|
||||
"alarm-report-secs" : 600,
|
||||
"alarms": [
|
||||
{
|
||||
"duration-key": {
|
||||
"query-type" : "DHCPDISCOVER",
|
||||
"response-type" : "DHCPOFFER",
|
||||
"start-event" : "process-started",
|
||||
"stop-event" : "process-completed",
|
||||
"subnet-id" : 0
|
||||
},
|
||||
"enable-alarm" : true,
|
||||
"high-water-ms" : 500,
|
||||
"low-water-ms" : 25
|
||||
}]
|
||||
}
|
||||
}],
|
||||
}
|
||||
|
||||
..
|
||||
|
||||
Where:
|
||||
|
||||
* enable-monitoring
|
||||
Enables event data aggregation for reporting, statisitcs, and alarms. Defaults to false.
|
||||
* interval-width-secs
|
||||
The amount of time, in seconds, that individual task durations are accumulated into an
|
||||
aggregate before it is reported. Default is 60 seconds.
|
||||
* stats-mgr-reporting
|
||||
Enables reporting aggregates to StatsMgr. Defaults to true.
|
||||
* alarm-report-secs
|
||||
The amount of time, in seconds, between logging for an alarm once it has been triggered.
|
||||
Defaults to 300 seconds.
|
||||
* alarms
|
||||
A optional list of alarms that monitor specific duration aggregates. Each alarm is
|
||||
defined by the following:
|
||||
|
||||
* duration-key
|
||||
Idenitifies the monitored duration to watch
|
||||
|
||||
* query-type - Message type of the client query (e.g.DHCPDISCOVER, DHCPV6_REQUEST)
|
||||
* response-type - Message type of the server response (e.g. DHCPOFFER, DHCPV6_REPLY)
|
||||
* start-event - Event that defines the beginning of the task (e.g. socket_received, process_started)
|
||||
* stop-event - Event that defines the end of the task
|
||||
* subnet-id - subnet selected during message processing (or 0 for global durations)
|
||||
|
||||
* enable-alarm
|
||||
Enables or disables this alarm. Defaults to true.
|
||||
|
||||
* high-water-ms
|
||||
The value, in milliseconds, that must be exceeded to trigger this alarm.
|
||||
Must be greater than zero.
|
||||
|
||||
* low-water-ms
|
||||
The value, in milliseconds, that must be subceeded to clear this alarm
|
||||
Must be greater than zero but less than high-water-ms.
|
||||
|
||||
.. note::
|
||||
Passive event logging is always enabled, even without specifying the 'parameters' section.
|
||||
|
||||
|
@@ -264,7 +264,7 @@ PerfMonConfig::CONFIG_KEYWORDS =
|
||||
|
||||
PerfMonConfig::PerfMonConfig(uint16_t family)
|
||||
: family_(family),
|
||||
enable_monitoring_(true),
|
||||
enable_monitoring_(false),
|
||||
interval_width_secs_(60),
|
||||
stats_mgr_reporting_(true),
|
||||
alarm_report_secs_(300) {
|
||||
|
@@ -27,7 +27,8 @@ PerfMonMgr::PerfMonMgr(uint16_t family_)
|
||||
|
||||
void PerfMonMgr::configure(const ConstElementPtr & params) {
|
||||
if (!params) {
|
||||
isc_throw(dhcp::DhcpConfigError, "params must not be null");
|
||||
// User wants passive logging only.
|
||||
setEnableMonitoring(false);
|
||||
return;
|
||||
}
|
||||
|
||||
|
@@ -44,15 +44,15 @@ public:
|
||||
// Verify initial values.
|
||||
ASSERT_NO_THROW_LOG(config.reset(new PerfMonConfig(family_)));
|
||||
ASSERT_TRUE(config);
|
||||
EXPECT_TRUE(config->getEnableMonitoring());
|
||||
EXPECT_FALSE(config->getEnableMonitoring());
|
||||
EXPECT_EQ(config->getIntervalWidthSecs(), 60);
|
||||
EXPECT_TRUE(config->getStatsMgrReporting());
|
||||
EXPECT_EQ(config->getAlarmReportSecs(), 300);
|
||||
EXPECT_TRUE(config->getAlarmStore());
|
||||
|
||||
// Verify accessors.
|
||||
EXPECT_NO_THROW_LOG(config->setEnableMonitoring(false));
|
||||
EXPECT_FALSE(config->getEnableMonitoring());
|
||||
EXPECT_NO_THROW_LOG(config->setEnableMonitoring(true));
|
||||
EXPECT_TRUE(config->getEnableMonitoring());
|
||||
|
||||
EXPECT_NO_THROW_LOG(config->setIntervalWidthSecs(4));
|
||||
EXPECT_EQ(config->getIntervalWidthSecs(), 4);
|
||||
@@ -65,7 +65,7 @@ public:
|
||||
|
||||
// Verify shallow copy construction.
|
||||
PerfMonConfigPtr config2(new PerfMonConfig(*config));
|
||||
EXPECT_FALSE(config2->getEnableMonitoring());
|
||||
EXPECT_TRUE(config2->getEnableMonitoring());
|
||||
EXPECT_EQ(config2->getIntervalWidthSecs(), 4);
|
||||
EXPECT_FALSE(config2->getStatsMgrReporting());
|
||||
EXPECT_EQ(config2->getAlarmReportSecs(), 120);
|
||||
@@ -92,43 +92,43 @@ public:
|
||||
// Empty map
|
||||
__LINE__,
|
||||
R"({ })",
|
||||
true, 60, true, 300
|
||||
false, 60, true, 300
|
||||
},
|
||||
{
|
||||
// Only enable-monitoring",
|
||||
__LINE__,
|
||||
R"({ "enable-monitoring" : false })",
|
||||
false, 60, true, 300
|
||||
R"({ "enable-monitoring" : true })",
|
||||
true, 60, true, 300
|
||||
},
|
||||
{
|
||||
// Only interval-width-secs",
|
||||
__LINE__,
|
||||
R"({ "interval-width-secs" : 3 })",
|
||||
true, 3, true, 300
|
||||
false, 3, true, 300
|
||||
},
|
||||
{
|
||||
// Only stats-mgr-reporting",
|
||||
__LINE__,
|
||||
R"({ "stats-mgr-reporting" : false })",
|
||||
true, 60, false, 300
|
||||
false, 60, false, 300
|
||||
},
|
||||
{
|
||||
// Only alarm-report-secs",
|
||||
__LINE__,
|
||||
R"({ "alarm-report-secs" : 77 })",
|
||||
true, 60, true, 77
|
||||
false, 60, true, 77
|
||||
},
|
||||
{
|
||||
// All parameters",
|
||||
__LINE__,
|
||||
R"(
|
||||
{
|
||||
"enable-monitoring" : false,
|
||||
"enable-monitoring" : true,
|
||||
"interval-width-secs" : 2,
|
||||
"stats-mgr-reporting" : false,
|
||||
"alarm-report-secs" : 120
|
||||
})",
|
||||
false, 2, false, 120
|
||||
true, 2, false, 120
|
||||
},
|
||||
};
|
||||
|
||||
|
@@ -58,7 +58,7 @@ public:
|
||||
// Verify initial values.
|
||||
ASSERT_NO_THROW_LOG(mgr.reset(new PerfMonMgr(family_)));
|
||||
ASSERT_TRUE(mgr);
|
||||
EXPECT_TRUE(mgr->getEnableMonitoring());
|
||||
EXPECT_FALSE(mgr->getEnableMonitoring());
|
||||
EXPECT_EQ(mgr->getIntervalDuration(), seconds(60));
|
||||
EXPECT_TRUE(mgr->getStatsMgrReporting());
|
||||
EXPECT_EQ(mgr->getAlarmReportInterval(), seconds(300));
|
||||
|
Reference in New Issue
Block a user