mirror of https://github.com/openvswitch/ovs synced 2025-08-22 01:51:26 +00:00

12 Commits

Author SHA1 Message Date
Ilya Maximets
1de4a08c22 json: Use functions to access json arrays.
The internal implementation of JSON arrays will be changed in future
commits.  Add access functions that users can rely on instead of
accessing the internals of 'struct json' directly, and convert all
the users.  Structure fields are intentionally renamed to make sure
that no code uses the old fields directly.

The json_array() function is removed, as it is no longer needed.
New functions are added: json_array_size(), json_array_at(),
json_array_set() and json_array_pop().  These are enough to cover
all the use cases within OVS.

The change is fairly large; however, it is a much overdue cleanup
that we need even without changing the underlying implementation.
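
For illustration, a minimal sketch of the new access pattern; the
prototypes below are assumptions based on the function names above,
not the exact OVS declarations:

#include <stddef.h>

struct json;                    /* Opaque here; see openvswitch/json.h. */

/* Assumed prototypes for the accessors named above. */
size_t json_array_size(const struct json *);
const struct json *json_array_at(const struct json *, size_t index);

static void
walk_array(const struct json *array)
{
    /* Before: code reached into the array fields of 'struct json'
     * directly.  After: only the accessors are used. */
    for (size_t i = 0; i < json_array_size(array); i++) {
        const struct json *elem = json_array_at(array, i);
        (void) elem;            /* Process the element here. */
    }
}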

Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2025-06-30 16:53:56 +02:00
Dmitry Porokh
421c94ee14 ovsdb: Introduce and use specialized uuid print functions.
According to profiling data, converting UUIDs to strings is a frequent
operation in some workloads. This typically results in a call to
xasprintf(), which internally calls vsnprintf() twice, first to
calculate the required buffer size, and then to format the string.

This patch introduces specialized functions for printing UUIDs, which
both reduces code duplication and improves performance.

For example, on my laptop, 10,000,000 calls to the new
uuid_to_string() function take 1296 ms, while the same number of
xasprintf() calls using UUID_FMT take 2498 ms.
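
For illustration, a sketch of the old and new call patterns;
UUID_FMT, UUID_ARGS and xasprintf() are existing OVS helpers, while
the exact signature of uuid_to_string() is an assumption here:

#include "openvswitch/uuid.h"
#include "util.h"

/* Old way: two vsnprintf() passes inside xasprintf(). */
char *
uuid_to_string_old(const struct uuid *uuid)
{
    return xasprintf(UUID_FMT, UUID_ARGS(uuid));
}

/* New way: one specialized formatting pass (assumed signature). */
char *
uuid_to_string_new(const struct uuid *uuid)
{
    return uuid_to_string(uuid);
}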

Signed-off-by: Dmitry Porokh <dporokh@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2025-05-08 09:28:21 +02:00
Adrian Moreno
9e56549c2b hmap: use short version of safe loops if possible.
Using the SHORT version of the *_SAFE loops makes the code cleaner
and less error-prone.  So, use the SHORT version and remove the
extra variable when possible for hmap and all its derived types.

In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.
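
For illustration, a minimal sketch of the short form, assuming the
OVS hmap API; 'struct item' is illustrative:

#include <stdlib.h>
#include "openvswitch/hmap.h"

struct item {
    struct hmap_node node;
};

static void
drain(struct hmap *items)
{
    struct item *it;

    /* Long version: HMAP_FOR_EACH_SAFE (it, next, node, items),
     * with an extra 'struct item *next' declared by the caller.
     * Short version, selected by argument count: */
    HMAP_FOR_EACH_SAFE (it, node, items) {
        hmap_remove(items, &it->node);
        free(it);
    }
}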

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Ilya Maximets
08e9e53373 ovsdb: raft: Fix inability to read the database with DNS host names.
Clustered OVSDB allows DNS names to be used as addresses of raft
members.  However, if DNS resolution fails during the initial
database read, this causes a fatal failure and exit of the
ovsdb-server process.

Also, if the DNS name of a joining server is not resolvable for one
of the followers, this follower will reject append requests for the
new server to join until the name is successfully resolved.  This
makes the follower effectively non-functional while DNS is
unavailable.

To fix the problem, relax the address verification, allowing
validation to pass if only name resolution failed and the address is
otherwise valid.  This allows addresses to be added to the database,
so connections can be established later, when DNS is available.

Additionally, fix the missed initialization of the dns-resolve
module.  Without it, DNS requests are blocking, which causes
unexpected delays at runtime.
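
For illustration, a self-contained sketch of the relaxed check (the
helper below is hypothetical, not the OVS validation code):

#include <netdb.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

static bool
address_is_acceptable(const char *host, const char *port)
{
    struct addrinfo hints = { .ai_socktype = SOCK_STREAM };
    struct addrinfo *res;
    int err = getaddrinfo(host, port, &hints, &res);

    if (!err) {
        freeaddrinfo(res);
        return true;    /* Resolves right now: clearly valid. */
    }
    /* Resolution failed, but the address may still be fine: accept
     * it so the connection can be established once DNS is back. */
    return err == EAI_NONAME || err == EAI_AGAIN;
}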

Fixes: 771680d96fb6 ("DNS: Add basic support for asynchronous DNS resolving")
Reported-at: https://bugzilla.redhat.com/2055097
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Ilya Maximets
0de8829540 raft: Don't keep full json objects in memory if no longer needed.
Raft log entries (and the raft database snapshot) contain json
objects of the data.  A follower receives append requests with data
that gets parsed and added to the raft log.  The leader receives
execution requests, parses data out of them, and adds it to the log.
In both cases, ovsdb-server later reads the log with
ovsdb_storage_read(), constructs a transaction and updates the
database.  On followers, these json objects are in the common case
never used again.  The leader may use them to send append requests
or snapshot installation requests to followers.  However, all these
operations (except for ovsdb_storage_read()) just serialize the json
in order to send it over the network.

Json objects are significantly larger than their serialized string
representation.  For example, the snapshot of the database from one
of the ovn-heater scale tests takes 270 MB as a string, but 1.6 GB
as a json object, out of the total 3.8 GB consumed by the
ovsdb-server process.

ovsdb_storage_read() for a given raft entry happens only once in its
lifetime, so after this call we can serialize the json object, store
the string representation, and free the actual json object that
ovsdb will never need again.  This can save a lot of memory and can
also save serialization time, because each raft entry for append
requests and snapshot installation requests is serialized only once,
instead of every time such a request needs to be sent.

JSON_SERIALIZED_OBJECT can be used in order to seamlessly integrate
pre-serialized data into raft_header and similar json objects.
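
For illustration, a minimal sketch of the serialize-once idea;
json_to_string() and json_destroy() are existing OVS helpers, while
the containing structure and its fields are illustrative:

#include <stdlib.h>
#include "openvswitch/json.h"

struct entry_sketch {
    struct json *data;     /* Parsed json, needed once to apply. */
    char *serialized;      /* Much smaller; kept for re-sending. */
};

static void
entry_compact(struct entry_sketch *e)
{
    if (e->data && !e->serialized) {
        /* Serialize once, then drop the large json tree. */
        e->serialized = json_to_string(e->data, 0);
        json_destroy(e->data);
        e->data = NULL;
    }
}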

One major special case is the creation of a database snapshot.
A snapshot installation request received over the network will be
parsed and read by ovsdb-server just like any other raft log entry.
However, snapshots created locally with raft_store_snapshot() will
never be read back, because they reflect the current state of the
database and hence are already applied.  In this case we can free
the json object right after writing the snapshot to disk.

Tests performed with ovn-heater on a 60-node density-light scenario,
where the on-disk database grows up to 97 MB, show that the average
memory consumption of ovsdb-server Southbound DB processes decreased
by 58% (from 602 MB to 256 MB per process) and that peak memory
consumption decreased by 40% (from 1288 MB to 771 MB).

A test with 120 nodes on the density-heavy scenario with a 270 MB
on-disk database shows the expected 1.5 GB decrease in memory
consumption.  Also, the total CPU time consumed by the Southbound
DB process dropped from 296 to 256 minutes, and the number of
unreasonably long poll intervals fell from 2896 down to 1934.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-09-01 00:14:42 +02:00
Ilya Maximets
83fbd2e9dc raft: Avoid having more than one snapshot in-flight.
The previous commit 8c2c503bdb0d ("raft: Avoid sending equal
snapshots.") took a "safe" approach and avoided sending only exactly
identical snapshot installation requests.  However, it doesn't make
much sense to send more than one snapshot at a time: if an obsolete
snapshot is installed, the leader will re-send the most recent one.

With this change the leader will have only one snapshot in-flight
per connection.  This reduces backlogs on raft connections in case a
new snapshot is created while an 'install_snapshot_request' is in
progress, or if the election timer is changed in that period.

Also, not tracking the exact 'install_snapshot_request' we've sent
allows the code to be simplified.
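
For illustration, a sketch of the one-in-flight rule; the structure
and field names are illustrative, not the raft implementation:

#include <stdbool.h>

struct conn_sketch {
    bool snapshot_in_flight;   /* One install request at a time. */
};

static bool
may_send_snapshot(struct conn_sketch *conn)
{
    if (conn->snapshot_in_flight) {
        return false;          /* Wait for the reply first. */
    }
    conn->snapshot_in_flight = true;
    return true;               /* Caller sends the latest snapshot. */
}

/* On install_snapshot_reply: clear the flag, and if the installed
 * snapshot is already obsolete, send the most recent one. */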

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Fixes: 8c2c503bdb0d ("raft: Avoid sending equal snapshots.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-11-03 13:01:33 +01:00
Ilya Maximets
8c2c503bdb raft: Avoid sending equal snapshots.
Snapshots are huge.  In some cases we could receive several outdated
append replies from the remote server.  This can happen at high
scale if the remote server is overloaded and not able to process all
the raft requests in time.  In reaction to each outdated append
reply we send a full database snapshot.  While the remote server is
already overloaded, those snapshots get stuck in the jsonrpc backlog
for a long time, making it grow up to a few GB.  Since the remote
server wasn't able to process incoming messages in a timely manner,
it will likely not be able to process the snapshots either, leading
to the same situation with low chances of recovery.  The remote
server will likely get stuck in the 'candidate' state, and other
servers will grow their memory consumption due to growing jsonrpc
backlogs:

jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:192.16.0.3:6644,
             num of msgs: 3795, backlog: 8838994624.

This patch tries to avoid that situation by avoiding the sending of
equal snapshot install requests.  This helps maintain reasonable
memory consumption and allows the cluster to recover at larger
scale.
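
For illustration, a sketch of the de-duplication idea; the structure
and helper below are hypothetical:

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct snap_conn_sketch {
    char *last_request;        /* Serialized last install request. */
};

static bool
should_send(struct snap_conn_sketch *conn, const char *request)
{
    if (conn->last_request && !strcmp(conn->last_request, request)) {
        return false;          /* Equal request already sent. */
    }
    free(conn->last_request);
    conn->last_request = strdup(request);
    return true;
}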

Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-05-28 11:46:39 +02:00
Han Zhou
a76ba8254d raft: Save and read new election timer in header snapshot.
This patch stores the latest election timer in the snapshot during
log compression, and when the server restarts it reads the value
back from the log.  Without this, any previous change to the
election timer is lost from the log, and if the server restarts it
will use the default value instead of the changed one.
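
For illustration, a sketch of the idea; the header structure here is
illustrative, not the actual raft_header layout:

#include <stdint.h>

#define ELECTION_BASE_MSEC 1000    /* Assumed default base value. */

struct raft_header_sketch {
    uint64_t snap_index;
    uint64_t election_timer;       /* Saved during log compression. */
};

static uint64_t
election_timer_on_restart(const struct raft_header_sketch *h)
{
    /* Snapshots taken before this change lack the field (0), so
     * fall back to the default in that case. */
    return h->election_timer ? h->election_timer : ELECTION_BASE_MSEC;
}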

Fixes: 8e35461419 ("ovsdb raft: Support leader election time change online.")
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-23 14:42:57 -07:00
Han Zhou
8e35461419 ovsdb raft: Support leader election time change online.
A new unixctl command, cluster/change-election-timer, is implemented
to change the leader election timeout base value according to the
scale needs.

The change takes effect upon consensus of the cluster, implemented
through the append-request RPC.  A new field, "election-timer", is
added to the raft log entry for this purpose.
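
For illustration, a sketch of how such a command is wired up using
the existing unixctl API; the handler body and names are
illustrative:

#include "unixctl.h"

static void
cluster_change_election_timer(struct unixctl_conn *conn, int argc,
                              const char *argv[], void *aux)
{
    /* argv[1] is the database name, argv[2] the new base timeout in
     * ms.  A real handler validates the value and appends an
     * election-timer entry to the raft log; the change then takes
     * effect once the cluster reaches consensus on that entry. */
    (void) argc; (void) argv; (void) aux;
    unixctl_command_reply(conn, "change of election timer initiated");
}

void
register_cluster_commands(void *aux)
{
    unixctl_command_register("cluster/change-election-timer",
                             "DB TIME", 2, 2,
                             cluster_change_election_timer, aux);
}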

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:08 -07:00
Numan Siddique
f31b8ae7a7 ovn-nbctl: Fix the ovn-nbctl test "LBs - daemon" which fails during rpm build
When 'make check' is called by the mock rpm build (which disables
networking), the test "ovn-nbctl: LBs - daemon" fails when it runs
the command "ovn-nbctl lb-add lb0 30.0.0.1a
192.168.10.10:80,192.168.10.20:80".  ovn-nbctl extracts the vip by
calling the socket-util function inet_parse_active(), and this
function blocks when the libunbound function ub_resolve() is called
further down.  ub_resolve() is a blocking function without a
timeout, and all the ovs/ovn utilities use it.

As reported by Timothy Redaelli, the issue can also be reproduced by
running the commands below:

$ sudo unshare -mn -- sh -c 'ip addr add dev lo 127.0.0.1 && \
  mount --bind /dev/null /etc/resolv.conf && runuser $SUDO_USER'
$ make sandbox SANDBOXFLAGS="--ovn"
$ ovn-nbctl -vsocket_util:off lb-add lb0 30.0.0.1a \
  192.168.10.10:80,192.168.10.20:80

To address this issue, this patch adds a new bool argument,
'resolve_host', to the function inet_parse_active(), so that the
host is resolved only when it is 'true'.

ovn-nbctl/ovn-northd will pass 'false' when calling this function to
parse the load balancer values.
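
For illustration, a sketch of the new call pattern;
inet_parse_active() is the real socket-util function, but its exact
signature at this point is an assumption:

#include <stdbool.h>
#include <sys/socket.h>
#include "socket-util.h"

static bool
parse_lb_vip(const char *vip, struct sockaddr_storage *ss)
{
    /* resolve_host == false: parse and validate only, never call
     * the blocking resolver, so this works without networking. */
    return inet_parse_active(vip, 0, ss, false);
}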

Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1641672
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-11-05 07:11:10 -08:00
Ben Pfaff
1bb011218d socket-util: Make inet_parse_active() and inet_parse_passive() more alike.
Until now, the default_port parameters to these functions have had
different types and different behavior.  There is a reason for this:
it makes sense to listen on a kernel-selected port, but it does not
make sense to connect to a kernel-selected port.  However, this
overlooks the possibility that a caller might want to parse a string
in the format understood by inet_parse_active() without actually
using it to connect to a remote host.  This commit makes the
behavior consistent and updates all the callers to work with the new
semantics.
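
For illustration, a sketch contrasting the two calls; the signatures
of this era are an assumption:

#include <stdbool.h>
#include <sys/socket.h>
#include "socket-util.h"

static void
parse_both(struct sockaddr_storage *ss)
{
    /* Connect-style target: 6640 is used if ":PORT" is omitted, and
     * the string can be parsed without ever connecting. */
    bool active_ok = inet_parse_active("tcp:1.2.3.4", 6640, ss);

    /* Listen-style target: the default (second argument) applies
     * only when the string omits the port; 0 would request a
     * kernel-selected port, meaningful only when listening. */
    bool passive_ok = inet_parse_passive("ptcp:6640", 0, ss);

    (void) active_ok;
    (void) passive_ok;
}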

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
2018-04-16 14:51:09 -07:00
Ben Pfaff
1b1d2e6daa ovsdb: Introduce experimental support for clustered databases.
This commit adds support for OVSDB clustering via Raft.  Please read
ovsdb(7) for information on how to set up a clustered database.  It is
simple and boils down to running "ovsdb-tool create-cluster" on one server
and "ovsdb-tool join-cluster" on each of the others and then starting
ovsdb-server in the usual way on all of them.

Once you have a clustered database, you configure ovn-controller and
ovn-northd to use it by pointing them to all of the servers, e.g. where
previously you might have said "tcp:1.2.3.4" was the database server,
now you say that it is "tcp:1.2.3.4,tcp:5.6.7.8,tcp:9.10.11.12".

This also adds support for database clustering to ovs-sandbox.

Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: aginwala <aginwala@asu.edu>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-03-24 12:04:53 -07:00