mirror of
https://gitlab.isc.org/isc-projects/kea
synced 2025-09-02 23:15:20 +00:00
[5112] Several text corrections
@@ -9,14 +9,14 @@
 
 @section parserIntro Parser background
 
-Kea's format of choice is JSON, which is used in configuration files, in the
-command channel and also when communicating between DHCP servers and DHCP-DDNS
-component. It is almost certain that it will be used as the syntax for any
-upcoming features.
+Kea's data format of choice is JSON (https://tools.ietf.org/html/rfc7159), which
+is used in configuration files, in the command channel and also when
+communicating between DHCP servers and DHCP-DDNS component. It is almost certain
+it will be used as the data format for any new features.
 
 Historically, Kea used @ref isc::data::Element::fromJSON and @ref
 isc::data::Element::fromJSONFile methods to parse received data that is expected
-to be in JSON syntax. This in-house parser was developed back in early BIND10
+to be in JSON syntax. This in-house parser was developed back in the early BIND10
 days. Its two main advantages were that it didn't have any external dependencies
 and that it was already available in the source tree when the Kea project
 started. On the other hand, it was very difficult to modify (several attempts to
@@ -49,9 +49,9 @@ and here: http://kea.isc.org/wiki/SimpleParser.
 To solve the issue of phase 1 mentioned earlier, a new parser has been developed
 that is based on flex and bison tools. The following text uses DHCPv6 as an
 example, but the same principle applies to DHCPv4 and D2 and CA will likely to
-follow. The new parser consists of two core elements (the following description
-is slightly oversimplified to convey the intent, more detailed description
-is available in the following sections):
+follow. The new parser consists of two core elements with a wrapper around them
+(the following description is slightly oversimplified to convey the intent, more
+detailed description is available in the following sections):
 
 -# Flex lexer (src/bin/dhcp6/dhcp6_lexer.ll) that is essentially a set of
 regular expressions with C++ code that creates new tokens that represent whatever
@@ -87,20 +87,23 @@ is available in the following sections):
 (a token with a value of 100), RCURLY_BRACKET, RCURLY_BRACKET, END
 
 -# Parser context. As there is some information that needs to be passed between
-parser and lexer, @ref isc::dhcp::Parser6Context is a convenient to wrapper
+parser and lexer, @ref isc::dhcp::Parser6Context is a convenience wrapper
 around those two bundled together. It also works as a nice encapsulation,
 hiding all the flex/bison details underneath.
 
 @section parserBuild Building flex/bison code
 
-The only input file used by flex is the .ll file. The only input file used
-by bison is the .yy file. When processed, those two tools will generate
-a number of .hh and .cc files. The major ones are names the same as their
-.ll and .yy counterparts (e.g. dhcp6_lexer.cc, dhcp6_parser.cc and dhcp6_parser.h),
-but there's a number of additional files created: location.hh, position.hh
-and stack.hh. Those are internal bison headers that are needed. To avoid every
-user to have flex and bison installed, we chose to generate the files and
-add them to the Kea repository. To generate those files, do the following:
+The only input file used by flex is the .ll file. The only input file used by
+bison is the .yy file. When making changes to the lexer or parser, only those
+two files are edited. When processed, those two tools will generate a number of
+.hh and .cc files. The major ones are named the same as their .ll and .yy
+counterparts (e.g. dhcp6_lexer.cc, dhcp6_parser.cc and dhcp6_parser.h), but
+there's a number of additional files created: location.hh, position.hh and
+stack.hh. Those are internal bison headers that are needed for compilation.
+
+To avoid every user to have flex and bison installed, we chose to generate the
+files and add them to the Kea repository. To generate those files, do the
+following:
 
 @code
 ./configure --enable-generate-parser
@@ -120,7 +123,9 @@ generated may be different and cause unnecessarily large diffs, may cause
 coverity/cpp-check issues appear and disappear and cause general unhappiness.
 To avoid those problems, we will introduce a requirement to generate flex/bison
 files on one dedicated machine. This machine will likely be docs. Currently Ops
-is working on installing the necessary versions of flex/bison required
+is working on installing the necessary versions of flex/bison required, but
+for the time being we can use the versions installed in Francis' home directory
+(export PATH=/home/fdupont/bin:$PATH).
 
 Note: the above applies only to the code being merged on master. It is probably
 ok to generate the files on your development branch with whatever version you
@@ -145,10 +150,10 @@ documented, but the docs for it may be a bit cryptic. When developing new
 parsers, it's best to start by copying whatever we have for DHCPv6 and tweak as
 needed.
 
-Second addition are flex conditions. They're defined with %x and they define a
+Second addition are flex conditions. They're defined with %%x and they define a
 state of the lexer. A good example of a state may be comment. Once the lexer
-detects that a comment has started, it switches to certain condition (by calling
-BEGIN(COMMENT) for example) and the code should ignore whatever follows
+detects that a comment's beginning, it switches to a certain condition (by calling
+BEGIN(COMMENT) for example) and the code then ignores whatever follows
 (especially strings that look like valid tokens) until the comment is closed
 (when it returns to the default condition by calling BEGIN(INITIAL)). This is
 something that is not frequently used and the only use cases for it are the
@@ -157,7 +162,7 @@ forementioned comments and file inclusions.
 Second addition are parser contexts. Let's assume we have a parser that uses
 "ip-address" regexp that would return IP_ADDRESS token. Whenever we want to
 allow "ip-address", the grammar allows IP_ADDRESS token to appear. When the
-lexer is called, it will match the regexp, will generate IP_ADDRESS token and
+lexer is called, it will match the regexp, will generate the IP_ADDRESS token and
 the parser will carry out its duty. This works fine as long as you have very
 specific grammar that defines everything. Sadly, that's not the case in DHCP as
 we have hooks. Hook libraries can have parameters that are defined by third
@@ -193,7 +198,7 @@ in src/bin/dhcp6/dhcp6_parser.yy. Here's a simplified excerpt of it:
 dhcp6_object: DHCP6 COLON LCURLY_BRACKET global_params RCURLY_BRACKET;
 
 // This defines all parameters that may appear in the Dhcp6 object.
 // It can either contain a global_param (defined below) or a
 // global_params list, followed by a comma followed by a global_param.
 // Note this definition is recursive and can expand to a single
 // instance of global_param or multiple instances separated by commas.
@@ -201,7 +206,7 @@ dhcp6_object: DHCP6 COLON LCURLY_BRACKET global_params RCURLY_BRACKET;
 global_params: global_param
 | global_params COMMA global_param
 ;
 
 // These are the parameters that are allowed in the top-level for
 // Dhcp6.
 global_param: preferred_lifetime
@@ -222,9 +227,9 @@ global_param: preferred_lifetime
 | server_id
 | dhcp4o6_port
 ;
 
 renew_timer: RENEW_TIMER COLON INTEGER;
 
 // Many other definitions follow.
 @endcode
 
@@ -244,7 +249,7 @@ rule.
 
 The "leaf" rules that don't contain any other rules, must be defined by a
 series of tokens. An example of such a rule is renew_timer above. It is defined
 as a series of 3 tokens: RENEW_TIMER, COLON and INTEGER.
 
 Speaking of integers, it is worth noting that some tokens can have values. Those
 values are defined using %token clause. For example, dhcp6_parser.yy has the
@@ -272,7 +277,7 @@ renew_timer with some extra code:
 @code
 renew_timer: RENEW_TIMER {
 cout << "renew-timer token detected, so far so good" << endl;
 } COLON {
 cout << "colon detected!" << endl;
 } INTEGER {
 uint32_t timer = $3;
@@ -298,11 +303,11 @@ ncr_protocol: NCR_PROTOCOL {
 ctx.enter(ctx.NCR_PROTOCOL); (1)
 } COLON ncr_protocol_value {
 ctx.stack_.back()->set("ncr-protocol", $4); (3)
-ctx.leave();
+ctx.leave(); (4)
 };
 
 ncr_protocol_value:
 UDP { $$ = ElementPtr(new StringElement("UDP", ctx.loc2pos(@1))); }
 | TCP { $$ = ElementPtr(new StringElement("TCP", ctx.loc2pos(@1))); } (2)
 ;
 @endcode
@@ -358,8 +363,8 @@ The first line creates an instance of IntElement with a value of the token. The
 second line adds it to the current map (current = the last on the stack). This
 approach has a very nice property of being generic. This rule can be referenced
 from global and subnet scope (and possibly other scopes as well) and the code
-will add the IntElement object to whatever is last on the stack, be it
-global, subnet or perhaps even something else (maybe we will allow preferred
+will add the IntElement object to whatever is last on the stack, be it global,
+subnet or perhaps even something else (maybe one day we will allow preferred
 lifetime to be defined on a per pool or per host basis?).
 
 @section parserSubgrammar Parsing partial grammar
@@ -385,6 +390,9 @@ This trick is also implemented in the lexer. There's a flag called start_token_f
 When initially set to true, it will cause the lexer to emit an artificial
 token once, before parsing any input whatsoever.
 
+This optional feature can be skipped altogether if you don't plan to parse parts
+of the configuration.
+
 @section parserBisonExtend Extending grammar
 
 Adding new parameters to existing parsers is very easy once you get hold of the
@@ -402,7 +410,7 @@ Here's the complete set of necessary changes.
 @code
 SUBNET_4O6_INTERFACE_ID "4o6-interface-id"
 @endcode
 This defines a token called SUBNET_4O6_INTERFACE_ID that, when needed to
 be printed, will be represented as "4o6-interface-id".
 
 2. Tell lexer how to recognize the new parameter:
@@ -439,7 +447,7 @@ Here's the complete set of necessary changes.
 weird that happens to match our reserved keywords. Therefore we switch to
 no keyword context. This tells the lexer to interpret everything as string,
 integer or float.
 
 4. Finally, extend the existing subnet4_param that defines all allowed parameters
 in Subnet4 scope to also cover our new parameter (the new line marked with *):
 @code