OBSOLETE: This page documents an early look at the HA+MT problem. For the actual design, see this page. This page is kept for historical reasons, as it contains useful comments.
For the HA lib to truly take advantage of the multi-threaded Kea implementation, changes are needed in the Kea core, the Control Agent and the HA lib.
HA <- http -> Control Agent <- UNIX socket Kea commands/responses in JSON format -> Kea
The minimum requirement to improve performance is to make the entire communication ASYNC (non-blocking) and to support out-of-order requests/replies.
Because the http protocol only allows in-order requests/replies (it supports interleaved transactions, but only in order), it limits the design for multi-threading support: if one packet requires an update that takes 500 ms to process, and another packet then requires an update that takes 5 ms, the second packet is postponed until the first is resolved. This adds unnecessary waiting time by design.
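To make the cost concrete, here is a minimal Python sketch of the arithmetic for the two updates in the example above. It assumes a strictly serial send/receive exchange for the in-order case and ignores network latency; the function names are hypothetical and not part of Kea.

```python
from itertools import accumulate


def in_order_completion(durations_ms):
    """Completion time of each request when the next request is only
    handled after the previous reply has been received (serial exchange)."""
    return list(accumulate(durations_ms))


def out_of_order_completion(durations_ms):
    """Completion time of each request when every request is processed
    independently and its reply is delivered as soon as it is ready."""
    return list(durations_ms)


updates_ms = [500, 5]  # processing times taken from the example in the text
print(in_order_completion(updates_ms))      # [500, 505]
print(out_of_order_completion(updates_ms))  # [500, 5]
```

In this model the 5 ms update waits 500 ms for no reason of its own, which is the waiting time the out-of-order design removes.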
Currently, the communication and connection states allow only sequential and in order execution: send/receive send/receive
Truly ASYNC, out-of-order execution would permit:
send(1) send(2) send(3) receive(2) receive(3) send(4) receive(1) receive(4) ...
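The trace above can be reproduced with a small simulation. This is only a sketch using Python's asyncio, not Kea code: `fake_peer` is a hypothetical stand-in for the remote server, and the delays are arbitrary values chosen so that the replies arrive in a different order than the requests were sent.

```python
import asyncio


async def fake_peer(trans_id, delay_s, replies):
    """Hypothetical peer: answers one request after delay_s seconds."""
    await asyncio.sleep(delay_s)
    replies.append(trans_id)  # record the order in which replies arrive


async def run_demo():
    replies = []
    # Requests 1, 2 and 3 are all started before any reply is awaited.
    delays = {1: 0.20, 2: 0.02, 3: 0.10}
    await asyncio.gather(*(fake_peer(t, d, replies) for t, d in delays.items()))
    return replies


reply_order = asyncio.run(run_demo())
print(reply_order)  # [2, 3, 1]
```

Request 1 is sent first but answered last, without delaying requests 2 and 3.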
The most ambitious requirement would be to have all connections working in parallel.
Fully parallel (multiple connections), ASYNC, out-of-order execution would permit:
send(11) send(12) send(13) receive(12) receive(13) send(14) receive(11) receive(14) ...
send(21) send(22) receive(22) send(23) receive(21) receive(23) send(24) receive(24) ...
The most restrictive element in the current implementation is that all connections use the same IO service and do not keep a state for each transaction. This is why all communication is serial and in order (which makes sense given the http protocol requirements).
The proper way to handle this is to simply remove the http layer and remove Control Agent from the HA <-> Kea communication.
The compromise, which keeps the http protocol and the Control Agent, would be to use multiple http connections, so that responses can be sent in parallel, and to multiplex the processing at each end (Kea and HA).
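A rough way to see what the compromise buys is to model each http connection as strictly serial and hand every request to whichever connection becomes free first. The sketch below is an illustration under those assumptions only: the function name is hypothetical, all requests are assumed to be ready at time 0, and network latency is ignored.

```python
import heapq


def multiplexed_completion(durations_ms, connections):
    """Completion time of each request when it is sent on the connection
    that becomes free first; each connection handles one request at a time."""
    free_at = [(0, index) for index in range(connections)]
    heapq.heapify(free_at)
    finished = []
    for duration in durations_ms:
        start, index = heapq.heappop(free_at)  # earliest available connection
        done = start + duration
        finished.append(done)
        heapq.heappush(free_at, (done, index))
    return finished


updates_ms = [500, 5, 5, 5]
print(multiplexed_completion(updates_ms, 1))  # [500, 505, 510, 515]
print(multiplexed_completion(updates_ms, 2))  # [500, 5, 10, 15]
```

With a second connection the short updates no longer queue behind the slow one, although each individual connection is still in order.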
Current limitations and required actions:
- Kea core:
- currently the main thread handles all commands -> requirements:
- Kea should handle HA commands in parallel
- create a thread pool for commands and queue requests to its task queue
- the current race avoidance mechanism should work for HA-specific commands
- locking on the address resource should be enough, as peers use disjoint pools and a peer lease update should not interfere with locally assigned DHCP traffic
- HA lib / http client:
- currently the http ASYNC API supports only in-order execution -> requirements:
- out of order (non http) ASYNC API OR
- use multiple http connections and multiplex the data at each end
- transaction ID can be used to identify and properly manage responses
- if using one MT instance and one ST instance, replies can be received out of order, so this must be supported in ST mode also
- each transaction should use a separate state so out of order or parallel send/receive can be supported
- each connection should use its own IO service if full parallel communication is desired
- detailed in:
- currently state is kept per connection -> requirements:
- drop the http protocol and use a TCP connection with its own protocol supporting ASYNC and out-of-order requests/replies (recommended) OR
- create multiple connections
- state per transaction
- current_request_, current_response_, parser_, current_callback_ and current_transid_ must be moved inside a transaction object (in the new TCP connection object)
- Control Agent:
- currently using http client and inherits all drawbacks -> requirements:
- simply remove Control Agent from HA <-> Kea communication (recommended) OR
- support ASYNC requests/replies (same requirements as HA lib)
- support multiple connections with multiple IO services (one per connection)
- support multiple UNIX socket connections with Kea for full parallel communication
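The "state per transaction" requirement listed above can be sketched as follows. This is a simplified illustration in Python, not the Kea implementation: the `Transaction` and `Connection` classes and their methods are hypothetical, the requests are opaque strings, and only the idea of moving the request, response, callback and transaction ID out of the connection and into a per-transaction object is taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Transaction:
    """State of one request/response exchange (hypothetical layout)."""
    transid: int
    request: str
    callback: Callable[[int, str], None]
    response: Optional[str] = None


class Connection:
    """Keeps many in-flight transactions instead of one 'current' exchange."""

    def __init__(self):
        self._next_transid = 1
        self._pending: Dict[int, Transaction] = {}

    def send(self, request, callback):
        transid = self._next_transid
        self._next_transid += 1
        self._pending[transid] = Transaction(transid, request, callback)
        return transid  # the ID would travel with the request on the wire

    def on_reply(self, transid, response):
        # The reply is matched by transaction ID, so arrival order is irrelevant.
        transaction = self._pending.pop(transid)
        transaction.response = response
        transaction.callback(transid, response)

    def pending(self):
        return sorted(self._pending)


received = []
conn = Connection()
first = conn.send("update A", lambda t, r: received.append((t, r)))
second = conn.send("update B", lambda t, r: received.append((t, r)))
conn.on_reply(second, "ok B")  # the reply to the second request arrives first
conn.on_reply(first, "ok A")
print(received)  # [(2, 'ok B'), (1, 'ok A')]
```

Because each callback is stored with its own transaction, a single-threaded peer can also consume replies that a multi-threaded peer produces out of order.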
Enhancements
multiple connections per url/endpoint (connection pool)
- each thread should use its own 'url' connection and pop/push it from a connection pool, just like lease managers do
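A minimal sketch of the pop/push pattern, assuming a fixed-size pool per URL. The `ConnectionPool` class is hypothetical, the "connections" are placeholder strings rather than real sockets, and the URL uses a documentation-only address.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool of connections to one URL (hypothetical sketch)."""

    def __init__(self, url, size):
        self._idle = queue.LifoQueue()
        for index in range(size):
            # Placeholder string standing in for an open connection object.
            self._idle.put(f"{url}#conn{index}")

    @contextmanager
    def borrow(self):
        connection = self._idle.get()  # pop; blocks while the pool is empty
        try:
            yield connection
        finally:
            self._idle.put(connection)  # push back, even if the caller failed

    def idle_count(self):
        return self._idle.qsize()


pool = ConnectionPool("http://192.0.2.1:8000/", 2)
with pool.borrow() as connection:
    idle_while_borrowed = pool.idle_count()
print(idle_while_borrowed, pool.idle_count())  # 1 2
```

The queue is thread-safe, so several worker threads can borrow connections concurrently; a thread that finds the pool empty waits until another thread returns one.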
HA connections are handled by main thread using global IO service
- this limits the speed at which http replies are handled
- multiple HA IO services should be run on different threads to handle replies and unpark packets in parallel (unparking is also done by the main thread today)
- (will become the next bottleneck - connections not handled in parallel)
a new listening socket for HA commands can be used
- no need to refactor IfaceMgr (which should be done anyway)
- to take advantage of the particularities of HA implementation, a separate socket can be used for parallel commands, which can use light-weight race avoidance while generic commands should stop dhcp thread pool if updating leases
- currently IfaceMgr handles the fds related to DHCP traffic, HA connections, the command channel, DHCPv4-over-DHCPv6 IPCs and D2
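The idea of running HA commands in parallel with light-weight race avoidance can be illustrated as below. This sketch assumes that race avoidance amounts to holding a lock on the address being updated, which is a simplification and may differ from Kea's actual mechanism; the lease table, the addresses (documentation range) and the function name are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lease table: address -> number of updates applied so far.
lease_updates = {"192.0.2.10": 0, "192.0.2.11": 0}

# One lock per address: the only resource a peer lease update holds here.
address_locks = {address: threading.Lock() for address in lease_updates}


def apply_lease_update(address):
    """Apply one peer lease update while holding only that address's lock."""
    with address_locks[address]:
        lease_updates[address] += 1


# Commands are queued to a thread pool instead of being run by the main thread.
commands = ["192.0.2.10", "192.0.2.11"] * 50
with ThreadPoolExecutor(max_workers=4) as pool:
    for address in commands:
        pool.submit(apply_lease_update, address)
# Leaving the "with" block waits for every queued command to finish.

print(lease_updates)  # {'192.0.2.10': 50, '192.0.2.11': 50}
```

Updates for different addresses never contend with each other, while a generic command that touches many leases would still need a heavier mechanism such as stopping the DHCP thread pool.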
a simple TCP connection can be used
- command channel in Kea should also listen on TCP, not only on unix sockets
- this would require less encoding/decoding of commands inside the http payload and would speed up communication
- this would support out-of-order requests/replies, avoiding the unnecessary waiting times that in-order processing adds
- http does not provide security, so an option would be to use secure tunnel for HA interface (IPsec or TLS VPN) - out of Kea scope
- direct TCP connections are easier to secure: for instance, it is possible to check which port the peer uses and drop incoming connections from an unprivileged and/or unexpected port (the client can fix its port and/or address by calling bind before connect).
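The bind-before-connect check described above can be sketched with plain sockets. This is an illustration over the loopback interface, not Kea code: `is_acceptable_peer` is a hypothetical helper, and the client lets the OS pick its source port only to keep the example free of port clashes, whereas a deployment would configure a fixed, agreed port.

```python
import socket


def is_acceptable_peer(peer, expected_address, expected_port):
    """Accept only connections coming from the agreed address and port."""
    address, port = peer
    return address == expected_address and port == expected_port


# Listener standing in for the HA command socket (loopback, ephemeral port).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

# The client fixes its source address/port with bind() before connect().
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.bind(("127.0.0.1", 0))
expected_address, expected_port = client.getsockname()
client.connect(listener.getsockname())

accepted, peer = listener.accept()
peer_ok = is_acceptable_peer(peer, expected_address, expected_port)
wrong_port_ok = is_acceptable_peer(peer, expected_address, expected_port + 1)
if not peer_ok:
    accepted.close()  # drop connections from an unexpected source

print(peer_ok, wrong_port_ok)  # True False

for sock in (accepted, client, listener):
    sock.close()
```

A source-port check of this kind only filters out unexpected peers; it does not encrypt or authenticate the traffic, so it complements rather than replaces a secure tunnel.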