Objectives
To allow hooks modules to add new metrics to the BIND statistics channels dynamically, the statistics code needs to be refactored to use a more abstract data representation, one to which metrics can be added at run-time as hooks are loaded. (#38)
Requirements
Data Structure
The statistics will be held in an abstract data store made up of objects containing key/value pairs. The values therein can also be objects, or arrays of objects, allowing the creation of a nested tree of metrics.
The module-specific code that adds metrics to the tree (and subsequently updates those metrics) MUST be agnostic to the serialisation formats used to describe the metrics (currently XML and JSON).
Conversely, the serialisation code MUST have no specific knowledge of the content of the tree, and MUST be able to produce its output based on traversal of the tree and the objects therein.
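As a rough illustration of the two requirements above, the sketch below models the tree with plain Python dicts, lists and scalars, and serialises it to JSON and XML purely by traversal. The function names, the example metrics and the "statistics" and "item" element names are assumptions made for this sketch, not part of any existing BIND interface.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical metrics tree: objects are dicts, arrays are lists,
# leaves are plain numbers or strings.
example_tree = {
    "server": {"queries": 1042, "boot-time": "2024-01-01T00:00:00.000Z"},
    "views": [{"name": "default", "cache-size": 512}],
}


def to_json(node):
    """Serialise any subtree to JSON without knowing what it contains."""
    return json.dumps(node, sort_keys=True)


def to_xml(node, tag="statistics"):
    """Build an XML element from the shape of the node alone."""
    elem = ET.Element(tag)
    if isinstance(node, dict):
        for key, value in node.items():
            elem.append(to_xml(value, key))
    elif isinstance(node, list):
        for item in node:
            elem.append(to_xml(item, "item"))
    else:
        # Leaf value: emit its text form.
        elem.text = str(node)
    return elem


def to_xml_string(node):
    """Return the XML serialisation of a subtree as a string."""
    return ET.tostring(to_xml(node), encoding="unicode")
```

Neither serialiser mentions "queries" or "views"; adding a metric to the tree changes both outputs without touching the serialisation code.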
Exact compatibility with the current output of the statistics channel is an explicit non-requirement.
Marking the output data with a serialisation format version number is an explicit non-requirement. Code that parses the data MUST be able to cope with individual metrics (or groups thereof) being missing, or appearing in different places in the tree.
To simplify transitions, the tree MAY support "aliasing" of metrics so that they appear in multiple places in the tree, but if so care MUST be taken to ensure that cycles are not introduced.
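One possible way to guard against cycles, sketched under the assumption that the tree is built from Python dicts and lists, is to refuse an alias whenever the new parent is already reachable from the alias target. The names `contains` and `add_alias` are hypothetical.

```python
def contains(node, target):
    """Return True if target is node itself or is reachable from node."""
    if node is target:
        return True
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return False
    return any(contains(child, target) for child in children)


def add_alias(parent, key, target):
    """Expose target as parent[key] too, refusing to create a cycle."""
    # If parent lies inside target, linking target under parent
    # would make target an ancestor of itself.
    if contains(target, parent):
        raise ValueError("alias would introduce a cycle")
    parent[key] = target
```

The check walks the target subtree, so its cost is proportional to the size of that subtree; it is paid once when the alias is created, not on every traversal.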
Name Spacing
Each dynamically-loaded hook MUST be allocated a "root" object under which its statistics are stored in order to prevent collisions, e.g.
/statistics/modules/<hook-name>/...
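A minimal sketch of how such a root might be handed out, assuming the statistics tree is a dict and that hook names are unique strings. `allocate_module_root` is a hypothetical helper, not an existing BIND function.

```python
def allocate_module_root(statistics, hook_name):
    """Create and return the object at modules/<hook-name>."""
    modules = statistics.setdefault("modules", {})
    if hook_name in modules:
        # A second hook with the same name would collide with the first.
        raise KeyError("module root already allocated: " + hook_name)
    modules[hook_name] = {}
    return modules[hook_name]
```

The hook then adds metrics only beneath the object it was given, so two hooks can both define a metric called "queries" without clashing.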
Diagnostics
Certain parts of the current statistics output (e.g. lists of tasks and memory contexts) are intended mainly for diagnostic purposes and are also expensive to generate.
These metrics SHOULD be moved out of statistics and into a separate tree such that they are only generated when specifically requested and not generated automatically whenever a request for statistics is made.
Data Types
The data types that MUST be supported are:
- gauge: for representing instantaneous readings of an absolute value
- counter: for representing the cumulative count of events such as incoming queries, etc. These values are only ever incremented, and are often subsequently graphed as a "per second" value. NB: some of the metrics described in the current implementation as "counters" are really "gauges" (such as the count of active sockets).
- timestamp: to indicate a moment in time, to 1 ms accuracy
- text: to hold a label, for example the name of a memory context. These would not change once created.
TBD: query on the numeric range required. JS manages with 64-bit doubles, which offer 53 bits of integer range (±9e15). That should be more than sufficient, whilst also allowing for floating point gauges if required, and it can also encode timestamps since the UNIX epoch. However, atomic operations are not generally available for double variables.
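The 53-bit figure can be checked directly. The short sketch below uses Python floats, which are IEEE 754 doubles, to show where integers stop round-tripping exactly; the helper name and the sample millisecond timestamp are illustrative only.

```python
# Largest power of two up to which every integer is exactly
# representable in an IEEE 754 double (about 9.007e15).
MAX_EXACT = 2 ** 53


def survives_double(n):
    """Return True if integer n round-trips through a double unchanged."""
    return int(float(n)) == n


# A millisecond timestamp since the UNIX epoch (illustrative value).
SAMPLE_TIMESTAMP_MS = 1_700_000_000_123
```

Millisecond timestamps are currently around 1.7e12, several orders of magnitude below the limit, so they fit with room to spare.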
Concurrency
Since BIND is multithreaded, access to mutable values MUST be done in a thread-safe manner (via e.g. atomic operations or mutexes).
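The mutex option can be sketched as follows. This is an illustration in Python rather than BIND's own C code, and the `Counter` class and `run_workers` helper are hypothetical; an implementation in C would more likely use atomic integer operations where they are available.

```python
import threading


class Counter:
    """Monotonic counter guarded by a mutex."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, amount=1):
        # The whole read-modify-write happens under the lock.
        with self._lock:
            self._value += amount

    def read(self):
        with self._lock:
            return self._value


def run_workers(counter, threads=4, per_thread=10_000):
    """Increment the counter concurrently from several threads."""
    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

Because every update holds the lock, no increments are lost regardless of how the threads interleave.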
On-demand Metrics
Some metrics (especially gauges) need only be computed when they are requested. To support this, the hooks interface MUST contain a mechanism to inform a hook that a request for its statistics is about to be made and that it must fill in the values of any on-demand metrics.
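One way such a mechanism could look, sketched with hypothetical names (`HookStats`, `register_on_demand`, `prepare`, `collect`): each hook registers a callback per on-demand metric, and the statistics code calls `prepare()` on every hook just before reading its values.

```python
class HookStats:
    """Per-hook statistics object with on-demand metrics."""

    def __init__(self):
        self.metrics = {}
        self._on_demand = {}

    def register_on_demand(self, key, compute):
        """Declare a metric whose value is computed only when requested."""
        self._on_demand[key] = compute
        self.metrics[key] = None

    def prepare(self):
        """Called before a statistics request; fills in on-demand values."""
        for key, compute in self._on_demand.items():
            self.metrics[key] = compute()


def collect(hooks):
    """Notify each hook, then take a copy of its metrics."""
    snapshot = {}
    for name, stats in hooks.items():
        stats.prepare()
        snapshot[name] = dict(stats.metrics)
    return snapshot
```

Between requests the callbacks are never invoked, so an expensive gauge costs nothing until somebody asks for it.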
Efficiency
Modules MUST retain a pointer to each of the individual metrics that they maintain and subsequent changes to those metrics MUST be of O(1) complexity.
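The idea can be illustrated as follows, assuming each metric is a small mutable object stored in the tree. `Metric` and `register_metric` are hypothetical names; in C the retained handle would be a pointer.

```python
class Metric:
    """Mutable leaf node; the module keeps a reference to it."""

    __slots__ = ("value",)

    def __init__(self, value=0):
        self.value = value


def register_metric(obj, key, initial=0):
    """Insert a metric into a tree object and return a handle to it."""
    metric = Metric(initial)
    # The cost of locating and modifying the tree object is paid once, here.
    obj[key] = metric
    return metric
```

Later updates go through the retained handle (for example `queries.value += 1`) and never search the tree, which keeps them O(1) however large the tree grows.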
Enumeration of the tree (or any subtree thereof) MUST be O(n) where n is the number of nodes below that point of the tree. Traversal of the tree SHOULD ideally be lockless.
Insertion or deletion of key/value pairs into an object MUST be O(log n) or better.
Appending a new value to an array MUST be O(1). Insertion or deletion of a value in the middle of the array SHOULD be O(log n) or better. Access to an arbitrary index in the array MUST be O(log n) or better.
HTTP Access
The HTTP API MUST permit access to any part of the nested tree, even down to individual metrics.
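A sketch of how a request path might be resolved to a subtree or a single metric, assuming a tree of dicts and lists in which array elements are addressed by numeric index. The `resolve` function, the indexing convention and the example tree are assumptions for illustration.

```python
example_stats = {
    "server": {"queries": 1042},
    "views": [{"name": "default", "cache-size": 512}],
}


def resolve(tree, path):
    """Walk a slash-separated path and return the node it names."""
    node = tree
    for part in path.strip("/").split("/"):
        if part == "":
            continue  # an empty path or "/" selects the whole tree
        if isinstance(node, dict):
            node = node[part]  # raises KeyError if absent
        elif isinstance(node, list):
            try:
                node = node[int(part)]  # raises IndexError if out of range
            except ValueError:
                raise LookupError("not an array index: " + part)
        else:
            raise LookupError("cannot descend below a leaf: " + part)
    return node
```

Every failure surfaces as a `LookupError` (the base class of `KeyError` and `IndexError`), which an HTTP layer could map to a 404 response.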
Access to alternate serialisation formats SHOULD be specified using the Accept: HTTP request header rather than encoded in the URL.
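Format selection from the standard HTTP `Accept` request header might be sketched as below. This is a deliberately simplified reading of HTTP content negotiation: it honours q-values but ignores the finer precedence rules for media ranges. The supported media types, the default and the name `choose_format` are assumptions.

```python
SUPPORTED = ("application/json", "application/xml")


def choose_format(accept_header, default="application/json"):
    """Pick a serialisation format; None means nothing acceptable."""
    if not accept_header:
        return default
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by order of appearance.
        candidates.append((-quality, position, media_type))
    for neg_quality, _, media_type in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 marks a type as not acceptable
        if media_type in SUPPORTED:
            return media_type
        if media_type in ("*/*", "application/*"):
            return default
    return None
```

A `None` result would normally translate to a 406 Not Acceptable response.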
On-the-fly Gzip compression of the results SHOULD be supported when requested by the client via the Accept-Encoding header; compressed responses are then labelled with the Content-Encoding header.
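In standard HTTP the client advertises gzip support in the `Accept-Encoding` request header and the server labels a compressed body with `Content-Encoding: gzip`. The sketch below shows that exchange using Python's `gzip` module; the function names are hypothetical and the header parsing is simplified (wildcard codings are not handled).

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding value allows gzip."""
    for item in (accept_encoding or "").split(","):
        coding, _, params = item.partition(";")
        if coding.strip().lower() != "gzip":
            continue
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                return float(value) > 0  # q=0 means "do not use"
            except ValueError:
                return False
        return True
    return False


def encode_body(body, accept_encoding):
    """Return (payload, extra response headers) for a serialised result."""
    if accepts_gzip(accept_encoding):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}
```

Clients that do not ask for gzip receive the body unchanged, with no extra headers.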