mir/ovs - ovs - Mike's Git repositories

mir/ovs

mirror of https://github.com/openvswitch/ovs synced 2025-08-22 09:58:01 +00:00

Author	SHA1	Message	Date
Ilya Maximets	1de4a08c22	json: Use functions to access json arrays. Internal implementation of JSON array will be changed in the future commits. Add access functions that users can rely on instead of accessing the internals of 'struct json' directly and convert all the users. Structure fields are intentionally renamed to make sure that no code is using the old fields directly. json_array() function is removed, as not needed anymore. Added new functions: json_array_size(), json_array_at(), json_array_set() and json_array_pop(). These are enough to cover all the use cases within OVS. The change is fairly large, however, IMO, it's a much overdue cleanup that we need even without changing the underlying implementation. Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-30 16:53:56 +02:00
Ilya Maximets	6c48b29f52	json: Always use the json_string() method to access the strings. We'll be changing the way strings are stored, so the direct access will not be safe anymore. Change all the users to use the proper API as they should have been doing anyway. This also means splitting the handling of strings and serialized objects in most cases as they will be treated differently. The only code outside of json implementation for which direct access is preserved is substitute_uuids() in test-ovsdb.c. It's an unusual string manipulation that is only needed for the testing, so doesn't seem worthy adding a new API function. We could introduce something like json_string_replace() if this use case will appear somewhere else in the future. Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2025-06-30 16:53:56 +02:00
Dmitry Porokh	421c94ee14	ovsdb: Introduce and use specialized uuid print functions. According to profiling data, converting UUIDs to strings is a frequent operation in some workloads. This typically results in a call to xasprintf(), which internally calls vsnprintf() twice, first to calculate the required buffer size, and then to format the string. This patch introduces specialized functions for printing UUIDs, which both reduces code duplication and improves performance. For example, on my laptop, 10,000,000 calls to the new uuid_to_string() function takes 1296 ms, while the same number of xasprintf() calls using UUID_FMT take 2498 ms. Signed-off-by: Dmitry Porokh <dporokh@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2025-05-08 09:28:21 +02:00
Ilya Maximets	e6844c6460	ovsdb-idl: Fix use of uninitialized datum for graph consistency check. Columns in 'new_datum' may not be initialized if they were not written, i.e., when the column in not in the 'written' bitmap. In this case the actual content should be read from the 'old_datum' instead. If the old one is also not available, then the default should be used. WARNING: MemorySanitizer: use-of-uninitialized-value 0 0x78d27f in ovsdb_idl_check_consistency lib/ovsdb-idl.c:732:21 1 0x4ee12a in idl_set tests/test-ovsdb.c:2586:9 2 0x4d7c4b in do_idl tests/test-ovsdb.c:2868:18 3 0x6c5704 in ovs_cmdl_run_command__ lib/command-line.c:247:17 4 0x6c4d28 in ovs_cmdl_run_command lib/command-line.c:278:5 5 0x4c39bf in main tests/test-ovsdb.c:80:5 6 0x7f912a02958f in __libc_start_call_main 7 0x7f912a02963f in __libc_start_main@GLIBC_2.2.5 8 0x432b54 in _start (tests/test-ovsdb+0x432b54) Use the ovsdb_idl_read() helper to read the actual correct data during the consistency check. Alternative might be to iterate over the 'written' bitmap and only check those columns, but it seems like that will reduce the intended scope of the check, since 'new_datum' may exist while the 'written' bitmap doesn't, e.g., when 'new_datum == old_datum'. Fixes: 11990a5274f7 ("ovsdb-idl: Check internal graph in OVSDB tests.") Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-11-29 18:37:50 +01:00
Xavier Simonart	c98759a4f3	ovsdb-idl: Fix IDL memory leak. In the following case, we could see multiple leaks detected for memory allocated by ovsdb_idl_txn_add_map_op: insert a row in a table, set a key and delete the row (all within the same transaction). For instance: Direct leak of 24 byte(s) in 1 object(s) allocated from: 0 0x4e60a7 in calloc (./tests/test-ovsdb+0x4e60a7) 1 0x5f9b32 in xcalloc__ ./lib/util.c:125:31 2 0x5f9b60 in xzalloc__ ./lib/util.c:135:12 3 0x5f9c25 in xzalloc ./ovs/lib/util.c:169:12 4 0x5d4899 in ovsdb_idl_txn_add_map_op ./lib/ovsdb-idl.c:4175:29 5 0x5d4758 in ovsdb_idl_txn_write_partial_map ./lib/ovsdb-idl.c:4332:5 6 0x53cbe8 in idltest_simple2_update_smap_setkey ./tests/idltest.c:4701:5 7 0x526fe2 in do_idl_partial_update_map_column ./tests/test-ovsdb.c:3027:5 8 0x59d99c in ovs_cmdl_run_command__ ./lib/command-line.c:247:17 9 0x59d79a in ovs_cmdl_run_command ./lib/command-line.c:278:5 10 0x51d458 in main ./tests/test-ovsdb.c:80:5 11 0x7f0a20a79b74 in __libc_start_main (/lib64/libc.so.6+0x27b74) Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2024-09-11 15:41:06 +02:00
Mike Pattrick	4102674b3e	ovsdb-idl: Preserve change_seqno when deleting rows. In the case of a weak reference, clearing all change_seqno's can delete useful information. Instead of clearing all seqno's when removing track_node, only clear those values in cases including row insertion, and row deleting if no dst_arcs remain. Fixes: 95689f166818 ("ovsdb-idl: Preserve references for deleted rows.") Reported-at: https://issues.redhat.com/browse/FDP-193 Signed-off-by: Mike Pattrick <mkp@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Simon Horman <horms@ovn.org>	2024-01-04 13:26:20 +00:00
Han Zhou	8833e7c8ed	ovsdb-idl: Provide API to disable set_db_change_aware request. For ovsdb clients that are short-lived, e.g. when using ovn-nbctl/ovn-sbctl to read some metrics from the OVN NB/SB server, they don't really need to be aware of db changes, because they exit immediately after getting the initial response for the requested data. In such use cases, however, the clients still send 'set_db_change_aware' request, which results in server side error logs when the server tries to send out the response for the 'set_db_change_aware' request, because at the moment the client that is supposed to receive the request has already closed the connection and exited. E.g.: 2023-01-10T18:23:29.431Z\|00007\|jsonrpc\|WARN\|unix#3: receive error: Connection reset by peer 2023-01-10T18:23:29.431Z\|00008\|reconnect\|WARN\|unix#3: connection dropped (Connection reset by peer) To avoid such problems, this patch provides an API to allow a client to choose to not send the 'set_db_change_aware' request. There was an earlier attempt to fix this [0], but it was not accepted back then as discussed in the email [1]. It was also discussed in the emails that an alternative approach is to use notification instead of request, but that would require protocol changes and taking backward compatibility into consideration. So this patch takes a different approach and tries to keep the change small. [0] http://patchwork.ozlabs.org/project/openvswitch/patch/1594380801-32134-1-git-send-email-dceara@redhat.com/ [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2021-February/050919.html Reported-by: Girish Moodalbail <gmoodalbail@nvidia.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-July/050343.html Reported-by: Tobias Hofmann <tohofman@cisco.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-February/050914.html Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-01-16 20:14:10 +01:00
Numan Siddique	55b9507e68	ovsdb-idl: Add the support to specify the uuid for row insert. ovsdb-server allows the OVSDB clients to specify the uuid for the row inserts [1]. Both the C IDL client library and Python IDL are missing this feature. This patch adds this support. In C IDL, for each schema table, a new function is generated - <schema_table>insert_persistent_uuid(txn, uuid) which can be used the clients to persist the uuid. ovs-vsctl and other derivatives of ctl now supports the same in the generic 'create' command with the option "--id=<UUID>". In Python IDL, the uuid to persist can be specified in the Transaction.insert() function. [1] - a529e3cd1f("ovsdb-server: Allow OVSDB clients to specify the UUID for inserted rows.:) Acked-by: Adrian Moreno <amorenoz@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Acked-by: Terry Wilson <twilson@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-11-30 15:15:57 +01:00
Xavier Simonart	02f31a1262	ovsdb-idl: Preserve references for rows deleted in same IDL run as their insertion. Considering two DB rows, 'a' from table A and 'b' from table B (with column 'ref_a' a reference to table A): a = {A._uuid=<U1>} b = {B._uuid=<U2>, B.ref_a=<U1>} Assuming both records are inserted in the IDL client's in-memory view of the database, if row 'b' is also deleted in the same run, it should generate the following tracked changes: - for table A: - inserted records: a = {A._uuid=<U1>} - for table B: - inserted records: b = {B._uuid=<U2>, B.ref_a=<U1>} - deleted records: b = {B._uuid=<U2>, B.ref_a=<U1>} Before this patch, inserted and deleted records in table B would (in some cases [0]) be b = {B._uuid=<U2>, B.ref_a=[]}. Having B.ref_a=[] would violate the integrity of the database from client perspective. test-ovsdb has also been updated to show that one row can be both inserted and deleted within one IDL run. [0] In ovn-controller the fact that the reference is NULL caused a crash in the following case, when both commands were handled by ovn-controller within the same loop: $ ovn-nbctl ls-add sw0 -- lsp-add sw0 sw0-port1 -- \ lsp-set-addresses sw0-port1 "50:54:00:00:00:01 192.168.0.2" $ ovn-nbctl lsp-del sw0-port1 Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2126450 Fixes: 91e1ff5dde39 ("ovsdb-idl: Don't reparse orphaned rows.") Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-09-27 00:44:35 +02:00
Frode Nordahl	0443c1557e	Fix spelling error exposed in binaries. As reported by Debian lintian. Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-14 15:24:07 +02:00
Ilya Maximets	485ac63d10	ovsdb: Add lazy-copy support for ovsdb_datum objects. Currently ovsdb-server is using shallow copies of some JSON objects by keeping a reference counter. JSON string objects are also used directly as ovsdb atoms in database rows to avoid extra copies. Taking this approach one step further ovsdb_datum objects can also be mostly deduplicated by postponing the copy until it actually needed. datum object itself contains a type and 2 pointers to data arrays. Adding a one more pointer to a reference counter we may create a shallow copy of the datum by simply copying type and pointers and increasing the reference counter. Before modifying the datum, special function needs to be called to perform an actual copy of the object, a.k.a. unshare it. Most of the datum modifications are performed inside the special functions in ovsdb-data.c, so that is not very hard to track. A few places like ovsdb-server.c and column mutations are accessing and changing the data directly, so a few extra unshare() calls has to be added there. This change doesn't affect the maximum memory consumption too much, because most of the copies are short-living. However, not actually performing these copies saves up to 40% of CPU time on operations with large sets. Reported-at: https://bugzilla.redhat.com/2069089 Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-13 20:33:07 +02:00
Dumitru Ceara	c558f9f1e1	ovsdb-idl: Get per-database memory usage statistics. Clients might be connected to multiple databases (e.g., ovn-controller is connected to OVN_Southbound and Open_vSwitch databases) and the IDL memory statistics are more useful if they're not aggregated. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-28 13:41:42 +02:00
Dumitru Ceara	d94cd0d3ee	ovsdb-idl: Support write-only-changed IDL monitor mode. At a first glance, change tracking should never be allowed for write-only columns. However, some clients (e.g., ovn-northd) that are mostly exclusive writers of a database, use change tracking to avoid duplicating the IDL row records into a local cache when implementing incremental processing. The default behavior of the IDL is to automatically turn a write-only column into a read-write column whenever the client enables change tracking for that column. For the afore mentioned clients, this becomes a performance issue. Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't change a column's value.") explains why writes that don't change a column's value cannot be optimized out early if the column is read/write. Furthermore, if there is at least one record in any table that changed during a transaction, then all records that have been written are added to the transaction, even if their values didn't change. If there are many such rows (e.g., like in ovn-northd's case) this incurs a significant overhead because: a. the client has to build this large transaction b. the transaction has to be sent over the network c. the server needs to parse this (mostly) no-op update We now introduce new IDL APIs allowing users to set a new monitoring mode flag, OVSDB_IDL_WRITE_CHANGED_ONLY, to indicate to the IDL that the atomicity constraints may be relaxed and written columns that don't change value can be skipped from the current transaction. We benchmarked ovn-northd performance when using this new mode against NB and SB databases taken from ovn-kubernetes scale tests. We noticed that when a minor change is performed to the Northbound database (e.g., NB_Global.nb_cfg is incremented) the time it takes to build the Southbound transaction becomes negligible (vs ~1.5 seconds before this change). End-to-end ovn-kubernetes scale tests on 120-node clusters also show significant reduction of latency to bring up pods; both average and P99 latency decreased by ~30%. Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-28 16:57:43 +02:00
Adrian Moreno	9e56549c2b	hmap: use short version of safe loops if possible. Using SHORT version of the *_SAFE loops makes the code cleaner and less error prone. So, use the SHORT version and remove the extra variable when possible for hmap and all its derived types. In order to be able to use both long and short versions without changing the name of the macro for all the clients, overload the existing name and select the appropriate version depending on the number of arguments. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Adrian Moreno	e9bf5bffb0	list: use short version of safe loops if possible. Using the SHORT version of the *_SAFE loops makes the code cleaner and less error-prone. So, use the SHORT version and remove the extra variable when possible. In order to be able to use both long and short versions without changing the name of the macro for all the clients, overload the existing name and select the appropriate version depending on the number of arguments. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Dumitru Ceara	4628be9ff8	ovsdb-idl: Fix use-after-free when destroying an IDL loop. Transactions that are still incomplete (waiting for a reply from the server) are kept in the IDL's 'outstanding_txns' map. When a transaction is destroyed, ovsdb_idl_txn_destroy() will take care of removing the transaction from the 'outstanding_txns' map if the transaction was incomplete but also abort it and disassemble it if needed. Aborting the transaction first, before ovsdb_idl_txn_destroy(), may cause an use-after-free if the transaction was outstanding; that's because the transaction would move to state "aborted" without being removed from the 'outstanding_txns' map. Fixes: 53a540e5311c ("ovsdb-idl: ovsdb_idl_loop_destroy must also destroy the committing txn.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-02-16 20:56:36 +01:00
Dumitru Ceara	28f36edd19	ovsdb-idl: Only process successful txn in ovsdb_idl_loop_run. Otherwise we hide the transaction result from the user. This may cause problems as the user will not detect error cases. For example, if the server refuses a transaction with "constraint violation", the user should be notified because the transaction might need to be retried. For clients that process database changes incrementally (using change tracking) this lack of failure notification creates a problem if it occurs while no other database changes happen. In that case: - ovsdb_idl_loop_run() silently consumes the failure, initializes a new transaction. - no other table update was received from the server so the user will not add anything to the new transaction. - ovsdb_idl_loop_commit_and_wait() will "succeed" as nothing changed from the client's perspective. In reality, the first transaction failed and the client wasn't given the chance to handle the failure. Commit 0401cf5f9e06 ("ovsdb idl: Try committing the pending txn in ovsdb_idl_loop_run.") tried to optimize for the common, successful case. Maintain the same approach and optimize for transactions that succeeded but fall back to the old mechanism of processing failures within ovsdb_idl_loop_commit_and_wait() instead. Fixes: 0401cf5f9e06 ("ovsdb idl: Try committing the pending txn in ovsdb_idl_loop_run.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-02-04 11:48:31 +01:00
Dumitru Ceara	53a540e531	ovsdb-idl: ovsdb_idl_loop_destroy must also destroy the committing txn. Found by AddressSanitizer when running OVN tests: Direct leak of 64 byte(s) in 1 object(s) allocated from: #0 0x498fb2 in calloc (/ic/ovn-ic+0x498fb2) #1 0x5f681e in xcalloc__ ovs/lib/util.c:121:31 #2 0x5f681e in xzalloc__ ovs/lib/util.c:131:12 #3 0x5f681e in xzalloc ovs/lib/util.c:165:12 #4 0x5e3697 in ovsdb_idl_txn_add_map_op ovs/lib/ovsdb-idl.c:4057:29 #5 0x4d3f25 in update_isb_pb_external_ids ic/ovn-ic.c:576:5 #6 0x4cc4cc in create_isb_pb ic/ovn-ic.c:716:5 #7 0x4cc4cc in port_binding_run ic/ovn-ic.c:803:21 #8 0x4cc4cc in ovn_db_run ic/ovn-ic.c:1700:5 #9 0x4c9c1c in main ic/ovn-ic.c:1984:17 #10 0x7f9ad9f4a0b2 in __libc_start_main Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-31 21:29:43 +01:00
Dumitru Ceara	91e1ff5dde	ovsdb-idl: Don't reparse orphaned rows. Rows that refer to rows that were inserted in the current IDL run should only be reparsed if they don't get deleted (become orphan) in the current IDL run. Fixes: 7b8aeadd60c8 ("ovsdb-idl: Re-parse backrefs of inserted rows only once.") Reported-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-11-30 13:44:47 +01:00
Ilya Maximets	7b8aeadd60	ovsdb-idl: Re-parse backrefs of inserted rows only once. While adding new rows ovsdb-idl re-parses all the other rows that references this new one. For example, current ovn-kubernetes creates load balancers and adds the same load balancer to all logical switches and logical routers. So, then a new load balancer is added, rows for all logical switches and routers re-parsed. During initial database connection (or re-connection with monitor/monitor_cond or monitor_cond_since with outdated last transaction id) the client downloads the whole content of a database. In case of OVN, there might be already thousands of load balancers configured. ovsdb-idl will process rows in that initial monitor reply one-by-one. Therefore, for each load balancer row, it will re-parse all rows for switches and routers. Assuming that we have 120 Logical Switches and 30K load balancers. Processing of the initial monitor reply will take 120 (switch rows) * 30K (load balancer references in a switch row) * 30K (load balancer rows) = 10^11 operations, which may take hours. ovn-kubernetes will use LB groups eventually, but there are other less obvious cases that cannot be changed that easily. Re-parsing doesn't change any internal structures of the IDL. It destroys and re-creates exactly same arcs between rows. The only thing that changes is the application-facing array of pointers. Since internal structures remains intact, suggested solution is to postpone the re-parsing of back references until all the monitor updates processed. This way we can re-parse each row only once. Tested in a sandbox with 120 LSs, 120 LRs and 3K LBs, where each load balancer added to each LS and LR, by re-statring ovn-northd and measuring the time spent in ovsdb_idl_run(). Before the change: OVN_Southbound: ovsdb_idl_run took: 924 ms OVN_Northbound: ovsdb_idl_run took: 825118 ms --> 13.75 minutes! After: OVN_Southbound: ovsdb_idl_run took: 692 ms OVN_Northbound: ovsdb_idl_run took: 1698 ms Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-11-19 17:25:30 +01:00
Dumitru Ceara	1bdda7b6d5	ovsdb-idl: Use functions to apply diff in place. On large scale deployments with records that contain large sets, this significantly improves client side performance as it avoids comparing full contents of the old and new rows. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-11-05 00:06:21 +01:00
Ilya Maximets	066741d9c5	ovsdb-idl: Add memory report function. Added new function to return memory usage statistics for database objects inside IDL. Statistics similar to what ovsdb-server reports. Not counting _Server database as it should be small, hence doesn't worth adding extra code to the ovsdb-cs module. Can be added later if needed. ovs-vswitchd is a user in OVS, but this API will be mostly useful for OVN daemons. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Han Zhou <hzhou@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>	2021-11-04 23:13:13 +01:00
Ilya Maximets	429b114c5a	ovsdb-data: Deduplicate string atoms. ovsdb-server spends a lot of time cloning atoms for various reasons, e.g. to create a diff of two rows or to clone a row to the transaction. All atoms, except for strings, contains a simple value that could be copied in efficient way, but duplicating strings every time has a significant performance impact. Introducing a new reference-counted structure 'ovsdb_atom_string' that allows to not copy strings every time, but just increase a reference counter. This change allows to increase transaction throughput in benchmarks up to 2x for standalone databases and 3x for clustered databases, i.e. number of transactions that ovsdb-server can handle per second. It also noticeably reduces memory consumption of ovsdb-server. Next step will be to consolidate this structure with json strings, so we will not need to duplicate strings while converting database objects to json and back. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Mark D. Gray <mark.d.gray@redhat.com>	2021-09-24 15:53:46 +02:00
Ilya Maximets	51946d2227	ovsdb-data: Optimize union of sets. Current algorithm of ovsdb_datum_union looks like this: for-each atom in b: if not bin_search(a, atom): push(a, clone(atom)) quicksort(a) So, the complexity looks like this: Nb * log2(Na) + Nb + (Na + Nb) * log2(Na + Nb) Comparisons clones Comparisons for quicksort for search ovsdb_datum_union() is heavily used in database transactions while new element is added to a set. For example, if new logical switch port is added to a logical switch in OVN. This is a very common use case where CMS adds one new port to an existing switch that already has, let's say, 100 ports. For this case ovsdb-server will have to perform: 1 * log2(100) + 1 clone + 101 * log2(101) Comparisons Comparisons for for search quicksort. ~7 1 ~707 Roughly 714 comparisons of atoms and 1 clone. Since binary search can give us position, where new atom should go (it's the 'low' index after the search completion) for free, the logic can be re-worked like this: copied = 0 for-each atom in b: desired_position = bin_search(a, atom) push(result, a[ copied : desired_position - 1 ]) copied = desired_position push(result, clone(atom)) push(result, a[ copied : Na ]) swap(a, result) Complexity of this schema: Nb * log2(Na) + Nb + Na Comparisons clones memory copy on push for search 'swap' is just a swap of a few pointers. 'push' is not a 'clone', but a simple memory copy of 'union ovsdb_atom'. In general, this schema substitutes complexity of a quicksort with complexity of a memory copy of Na atom structures, where we're not even copying strings that these atoms are pointing to. Complexity in the example above goes down from 714 comparisons to 7 comparisons and memcpy of 100 * sizeof (union ovsdb_atom) bytes. General complexity of a memory copy should always be lower than complexity of a quicksort, especially because these copies usually performed in bulk, so this new schema should work faster for any input. All in all, this change allows to execute several times more transactions per second for transactions that adds new entries to sets. Alternatively, union can be implemented as a linear merge of two sorted arrays, but this will result in O(Na) comparisons, which is more than Nb * log2(Na) in common case, since Na is usually far bigger than Nb. Linear merge will also mean per-atom memory copies instead of copying in bulk. 'replace' functionality of ovsdb_datum_union() had no users, so it just removed. But it can easily be added back if needed in the future. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Han Zhou <hzhou@ovn.org> Acked-by: Mark D. Gray <mark.d.gray@redhat.com>	2021-09-24 14:55:54 +02:00
Numan Siddique	7502849e95	ovsdb-idl: Add APIs to query if a table and a column is present. This patch adds 2 new APIs in the ovsdb-idl client library - ovsdb_idl_server_has_table() and ovsdb_idl_server_has_column() to query if a table and a column is present in the IDL or not. This patch also adds IDL helper functions which are auto generated from the schema which makes it easier for the clients. These APIs are required for scenarios where the server schema is old and missing a table or column and the client (built with a new schema version) does a transaction with the missing table or column. This results in a continuous loop of transaction failures. Related-Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1992705 Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-08-28 02:59:04 +02:00
Ilya Maximets	04f8881f57	ovsdb-idl: Fix the database update signaling if it has never been connected. The symptom of this issue is that OVS bridge looses its IP address on restart. Simple reproducer: 0. start ovsdb-server and ovs-vswitchd 1. ovs-vsctl add-br br0 2. ifconfig br0 10.0.0.1 up 3. ovs-appctl -t ovs-vswitchd exit 4. start ovs-vswitchd back. After step #3 ovs-vswitchd is down, but br0 interface exists and has configured IP address. After step #4 there is no IP address on the port br0. What happened: 1. ovsdb-cs connects to the database via ovsdb-idl and requests database lock. --> get_schema for _Server database --> lock request 2. ovsdb-cs receives schema for the _Server database. And sends monitor request. <-- schema for _Server --> monitor_cond for _Server 3. ovsdb-cs receives lock reply. <-- locked At this point ovsdb-cs generates OVSDB_CS_EVENT_TYPE_LOCKED event and passes it to ovsdb-idl. ovsdb-idl increases change_seqno. 4. ovsdb_idl_has_ever_connected() is 'true' now, because change_seqno is not zero. 5. ovs-vswitchd decides that it has connection with database and all the initial data, therefore initiates configuration of bridges. bridge_run():ovsdb_idl_has_ever_connected() --> true 6. Since monitor request for the Open_vSwitch database is not even sent yet, the database is empty. This leads to removal of all the ports and all other resources. 7. When data finally received, ovs-vswitchd re-creates bridges and ports, but IP addresses can not be restored. While splitting out ovsdb-cs from ovsdb-idl one part of the logic was lost. Particularly, before the split, ovsdb-idl updated change_seqno only in MONITORING state. Restoring the logic by updating the change_seqno only if may send transaction, i.e. lock is ours and ovsdb-cs is in the MONITORING state. This matches with the main purpose of increasing change_seqno at this point, i.e. to force the client to re-try the transaction. With this change ovsdb_idl_has_ever_connected() remains 'false' until the first monitor reply with the actual data received. This issue was reported several times during the last couple of weeks. Reported-at: https://bugzilla.redhat.com/1968445 Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/383512.html Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-June/051222.html Fixes: 1c337c43ac1c ("ovsdb-idl: Break into two layers.") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>	2021-06-11 01:11:57 +02:00
Dumitru Ceara	ac85cdb38c	ovsdb-idl: Mark arc sources as updated when destination is deleted. Considering two DB rows, 'a' from table A and 'b' from table B (with column 'ref_a' a reference to table A): a = {A._uuid=<U1>} b = {B._uuid=<U2>, B.ref_a=<U1>} When the IDL client processes an update that deletes row 'a', row 'b' is also marked as 'updated' if change tracking is enabled for table B. Fixes: 102781cc02c6 ("ovsdb-idl: Track changes for table references.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-04-01 14:15:49 +02:00
Dumitru Ceara	95689f1668	ovsdb-idl: Preserve references for deleted rows. Considering two DB rows, 'a' from table A and 'b' from table B (with column 'ref_a' a reference to table A): a = {A._uuid=<U1>} b = {B._uuid=<U2>, B.ref_a=<U1>} Assuming both records are present in the IDL client's in-memory view of the database, depending whether row 'b' is also deleted in the same transaction or not, deletion of row 'a' should generate the following tracked changes: 1. only row 'a' is deleted: - for table A: - deleted records: a = {A._uuid=<U1>} - for table B: - updated records: b = {B._uuid=<U2>, B.ref_a=[]} 2. row 'a' and row 'b' are deleted in the same update: - for table A: - deleted records: a = {A._uuid=<U1>} - for table B: - deleted records: b = {B._uuid=<U2>, B.ref_a=<U1>} To ensure this, we now delay reparsing row backrefs for deleted rows until all updates in the current run have been processed. Without this change, in scenario 2 above, the tracked changes for table B would be: - deleted records: b = {B._uuid=<U2>, B.ref_a=[]} In particular, for strong references, row 'a' can never be deleted in a transaction that happens strictly before row 'b' is deleted. In some cases [0] both rows are deleted in the same transaction and having B.ref_a=[] would violate the integrity of the database from client perspective. This would force the client to always validate that strong reference fields are non-NULL. This is not really an option because the information in the original reference is required for incrementally processing the record deletion. [0] with ovn-monitor-all=true, the following command triggers a crash in ovn-controller because a strong reference field becomes NULL: $ ovn-nbctl --wait=hv -- lr-add r -- lrp-add r rp 00:00:00:00:00:01 1.0.0.1/24 $ ovn-nbctl lr-del r Reported-at: https://bugzilla.redhat.com/1932642 Fixes: 72aeb243a52a ("ovsdb-idl: Tracking - preserve data for deleted rows.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-04-01 14:15:49 +02:00
Dumitru Ceara	4c0d093b17	ovsdb-idl.at: Make test outputs more predictable. IDL tests need predictable output from test-ovsdb. This used to be done by first sorting the output of test-ovsdb and then applying uuidfilt to predictably translate UUIDs. This was not reliable enough in case test-ovsdb processes two or more insert/delete operations in the same iteration because the order of lines in the output depends on the automatically generated UUID values. To fix this we change the way test-ovsdb and test-ovsdb.py generate outputs and prepend the table name and tracking information before printing the contents of a row. All existing ovsdb-idl.at and ovsdb-cluster.at tests are updated to expect the new output format. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-04-01 13:53:20 +02:00
Ben Pfaff	1c337c43ac	ovsdb-idl: Break into two layers. This change breaks the IDL into two layers: the IDL proper, whose interface to its client is unchanged, and a low-level library called the OVSDB "client synchronization" (CS) library. There are two reasons for this change. First, the IDL is big and complicated and I think that this change factors out some of that complication into a simpler lower layer. Second, the OVN northd implementation based on DDlog can benefit from the client synchronization library even though it would actually be made increasingly complicated by the IDL. Signed-off-by: Ben Pfaff <blp@ovn.org>	2021-01-21 15:33:56 -08:00
Ben Pfaff	a5c067a8b9	ovsdb-cs: New module that factors out code from ovsdb-idl. This new module has a single direct user now. In the future, it will also be used by OVN. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ilya Maximets <i.maximets@ovn.org>	2021-01-21 15:09:05 -08:00
Ben Pfaff	2653155874	ovsdb-idl: Add comment. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-18 18:32:17 -08:00
Ben Pfaff	de914f4ee5	ovsdb-idl: Fix memory leak sending messages without a session. When there's no open session, we still have to free the messages that we make but cannot send. I'm not confident that these fix actual bugs, because it seems possible that these code paths can only be hit when the session is nonnull. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-18 18:32:07 -08:00
Ben Pfaff	75439c4bdc	ovsdb-idl: Avoid redundant clearing and parsing of received data. ovsdb_idl_db_parse_monitor_reply() clears the IDL and parses the received data. There's no need to do it again afterward. Signed-off-by: Ben Pfaff <blp@ovn.org> Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.") Acked-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-18 18:31:22 -08:00
Dumitru Ceara	97dbef6de9	ovsdb-idl: Fix expected condition seqno when changes are pending. Commit 17f22fe46142 tried to address this but only covered some of the cases. The correct way to report the expected seqno is to take into account if there already is a condition change that was requested to the server but not acked yet. In that case, the new condition change request will be sent only after the already requested one is acked. That is, expected condition seqno when conditions are up to date is db->cond_seqno + 2 in this case. Fixes: 17f22fe46142 ("ovsdb-idl: Return correct seqno from ovsdb_idl_db_set_condition().") Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-04 20:00:36 +01:00
Dumitru Ceara	91a6a45802	ovsdb-idl: Fix use-after-free when deleting orphaned rows. It's possible that the IDL client processes multiple jsonrpc updates in a single ovsdb_idl_run(). Considering the following updates processed in a single IDL run: 1. Update row R1 from table A while R1 is also referenced by row R2 from table B: - this adds R1 to table A's track_list. 2. Delete row R1 from table A while R1 is also referenced by row R2 from table B: - because row R2 still refers to row R1, this will create an orphan R1. - at this point R1 is still in table A's hmap. When the IDL client calls ovsdb_idl_track_clear() after it has finished processing the tracked changes, row R1 gets freed leaving a dangling pointer in table A's hmap. To fix this we don't free rows in ovsdb_idl_track_clear() if they are orphan and still referenced by other rows, i.e., the row's 'dst_arcs' list is not empty. Later, when all arc sources (e.g., R2) are deleted, the orphan R1 will be cleaned up as well. The only exception is when the whole contents of the IDL are flushed, in ovsdb_idl_db_clear(), in which case it's safe to free all rows. Reported-by: Ilya Maximets <i.maximets@ovn.org> Fixes: 932104f483ef ("ovsdb-idl: Add support for change tracking.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-03 18:04:03 +01:00
Dumitru Ceara	08e130abb1	ovsdb-idl: Fix memleak when deleting orphan rows. Pure IDL orphan rows, i.e., for which no "insert" operation was seen, which are part of tables with change tracking enabled should also be freed when the table track_list is flushed. Reported-by: Ilya Maximets <i.maximets@ovn.org> Fixes: 72aeb243a52a ("ovsdb-idl: Tracking - preserve data for deleted rows.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-03 18:04:03 +01:00
Dumitru Ceara	2407099c9f	ovsdb-idl: Fix memleak when reinserting tracked orphan rows. Considering the following updates processed by an IDL client: 1. Delete row R1 from table A while R1 is also referenced by row R2 from table B: - because row R2 still refers to row R1, this will create an orphan R1 but also sets row->tracked_old_datum to report to the IDL client that the row has been deleted. 2. Insert row R1 to table A. - because orphan R1 already existed in the IDL, it will be reused. - R1 still has row->tracked_old_datum set (and may also be on the table->track_list). 3. Delete row R2 from table B and row R1 from table A. - row->tracked_old_datum is set again but the previous tracked_old_datum was never freed. IDL clients use the deleted old_datum values so when multiple delete operations are received for a row, always track the first one as that will match the contents of the row the IDL client knew about. Running the newly added test case with valgrind, without the fix, produces the following report: ==23113== 327 (240 direct, 87 indirect) bytes in 1 blocks are definitely lost in loss record 43 of 43 ==23113== at 0x4C29F73: malloc (vg_replace_malloc.c:309) ==23113== by 0x476761: xmalloc (util.c:138) ==23113== by 0x45D8B3: ovsdb_idl_insert_row (ovsdb-idl.c:3431) ==23113== by 0x45B7F9: ovsdb_idl_process_update2 (ovsdb-idl.c:2670) ==23113== by 0x45AFCF: ovsdb_idl_db_parse_update__ (ovsdb-idl.c:2479) ==23113== by 0x45B262: ovsdb_idl_db_parse_update (ovsdb-idl.c:2542) ==23113== by 0x45ABBE: ovsdb_idl_db_parse_update_rpc (ovsdb-idl.c:2358) ==23113== by 0x4576DD: ovsdb_idl_process_msg (ovsdb-idl.c:865) ==23113== by 0x457973: ovsdb_idl_run (ovsdb-idl.c:944) ==23113== by 0x40B7B9: do_idl (test-ovsdb.c:2523) ==23113== by 0x44425D: ovs_cmdl_run_command__ (command-line.c:247) ==23113== by 0x44430E: ovs_cmdl_run_command (command-line.c:278) ==23113== by 0x404BA6: main (test-ovsdb.c:76) Fixes: 72aeb243a52a ("ovsdb-idl: Tracking - preserve data for deleted rows.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-03 18:04:03 +01:00
Ilya Maximets	f0d23f6795	ovsdb-idl: Fix iteration over tracked rows with no actual data. When idl removes orphan rows, those rows are inserted into the 'track_list'. This allows iterators such as *_FOR_EACH_TRACKED () to return orphan rows that never had any data to the IDL user. In this case, it is difficult for the user to understand whether it is a row with no data (there was no "insert" / "modify" for this row) or it is a row with zero data (columns were cleared by DB transaction). The main problem with this condition is that rows without data will have NULL pointers instead of references that should be there according to the database schema. For example, ovn-controller might crash: ERROR: AddressSanitizer: SEGV on unknown address 0x000000000100 (pc 0x00000055e9b2 bp 0x7ffef6180880 sp 0x7ffef6180860 T0) The signal is caused by a READ memory access. Hint: address points to the zero page. #0 0x55e9b1 in handle_deleted_lport /controller/binding.c #1 0x55e903 in handle_deleted_vif_lport /controller/binding.c:2072:5 #2 0x55e059 in binding_handle_port_binding_changes /controller/binding.c:2155:23 #3 0x5a6395 in runtime_data_sb_port_binding_handler /controller/ovn-controller.c:1454:10 #4 0x5e15b3 in engine_compute /lib/inc-proc-eng.c:306:18 #5 0x5e0faf in engine_run_node /lib/inc-proc-eng.c:352:14 #6 0x5e0e04 in engine_run /lib/inc-proc-eng.c:377:9 #7 0x5a03de in main /controller/ovn-controller.c #8 0x7f4fd9c991a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) #9 0x483f0d in _start (/controller/ovn-controller+0x483f0d) It doesn't make much sense to return non-real rows to the user, so it's best to exclude them from iteration. Test included. Without the fix, provided test will print empty orphan rows that was never received by idl as tracked changes. Fixes: 932104f483ef ("ovsdb-idl: Add support for change tracking.") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>	2020-11-27 14:31:27 +01:00
Dumitru Ceara	17f22fe461	ovsdb-idl: Return correct seqno from ovsdb_idl_db_set_condition(). If an IDL client sets the same monitor condition twice, the expected seqno when the IDL contents are updated should be the same for both calls. In the following scenario: 1. Client calls ovsdb_idl_db_set_condition(db, table, cond1) 2. ovsdb_idl sends monitor_cond_change(cond1) but the server doesn't yet reply. 3. Client calls ovsdb_idl_db_set_condition(db, table, cond1) At step 3 the returned expected seqno should be the same as at step 1. Similarly, if step 2 is skipped, i.e., the client calls sets the condition twice in the same iteration, then both ovsdb_idl_db_set_condition() calls should return the same value. Fixes: 46437c5232bd ("ovsdb-idl: Enhance conditional monitoring API") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-11-16 17:47:12 +01:00
Mark Gray	12eb2f67df	ovsdb-idl: Fix _is_new() IDL functions. Currently all functions of the type _is_new() always return 'false'. This patch resolves this issue by using the 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' on clear. Further to this, the code is also updated to match the following behaviour: When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' is updated to match the new database change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' is not set for inserted rows (only for updated rows). At the end of a run, ovsdb_idl_db_track_clear() should be called to clear all tracking information, this includes resetting all row 'change_seqno' to zero. This will ensure that subsequent runs will not see a previously 'new' row. add_tracked_change_for_references() is updated to only track rows that reference the current row. Also, update unit tests in order to test the _is_new(), _is_delete() functions. Suggested-by: Dumitru Ceara <dceara@redhat.com> Reported-at: https://bugzilla.redhat.com/1883562 Fixes: ca545a787ac0 ("ovsdb-idl.c: Increase seqno for change-tracking of table references.") Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-11-16 17:47:11 +01:00
Ben Pfaff	3630ab86f4	ovsdb-idl: Add comment with program name to ovsdb_idl_loop transactions. This can make it easier to see what daemon is committing transactions. Sometimes, in OVN especially, it can be hard to guess. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>	2020-11-02 15:15:19 -08:00
Han Zhou	d08a602b35	Revert "ovsdb-idl: Fix NULL deref reported by Coverity." This reverts commit 68bc6f88a3a36549fcd3b6248c25c5e2e6deb8f3. The commit causes a regression in OVN scale test. ovn-northd's CPU more than doubled for the test scenario: create and bind 12k ports. Below are some perf data of ovn-northd when running command: ovn-nbctl --wait=sb sync Before reverting this commit: - 92.42% 0.62% ovn-northd ovn-northd [.] main - 91.80% main + 68.93% ovn_db_run (inlined) + 22.45% ovsdb_idl_loop_commit_and_wait After reverting this commit: - 92.84% 0.60% ovn-northd ovn-northd [.] main - 92.24% main + 92.03% ovn_db_run (inlined) Reverting this commit avoided 22.45% of the CPU caused by ovsdb_idl_loop_commit_and_wait(). The commit changed the logic of ovsdb_idl_txn_write__() by adding the check "datum->keys && datum->values" before discarding unchanged data in a transaction. However, it is normal for OVSDB clients ( such as ovn-northd) to try to set columns with same empty data as it is before the transaction. IDL would discard these changes and avoid sending big transactions to server (which would end up as no-op on server side). In the ovn scale test scenario mentioned above, each iteration of ovn-northd would send a transaction to server that includes all rows of the huge Port_Binding table, which caused the significant CPU increase of ovn-northd (and also the OVN SB DB server), resulted in longer end to end latency of OVN configuration changes. For the original problem the commit 68bc6f88 was trying to fix, it doesn't seem to be a real problem. The NULL deref reported by Coverity may be addressed in a future patch using a different approach, if necessary. Signed-off-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-08-12 20:31:50 +02:00
Dumitru Ceara	9ed69557e5	ovsdb-idl: Force IDL retry when missing updates encountered. Adds a generic recovery mechanism which triggers an IDL retry with fast resync disabled in case the IDL has detected that it ended up in an inconsistent state due to other bugs in the ovsdb-server/ovsdb-idl implementation. Additionally, this commit also: - bumps IDL semantic error logs to level ERR to make them more visible. - triggers an IDL retry in cases when the IDL client used to try to recover (i.e., trying to add an existing row, trying to remove a non existent row). Fixes: db2b5757328c ("lib: add monitor2 support in ovsdb-idl.") Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-07-07 12:28:21 +02:00
Numan Siddique	0401cf5f9e	ovsdb idl: Try committing the pending txn in ovsdb_idl_loop_run. The function ovsdb_idl_loop_run(), after calling ovsdb_idl_run(), returns a transaction object (of type 'struct ovsdb_idl_txn'). The returned transaction object can be NULL if there is a pending transaction (loop->committing_txn) in the idl loop object. Normally the clients of idl library, first call ovsdb_idl_loop_run(), then do their own processing and create any idl transactions during this processing and then finally call ovsdb_idl_loop_commit_and_wait(). If ovsdb_idl_loop_run() returns NULL transaction object, then much of the processing done by the client gets wasted as in the case of ovn-controller. The client (in this case ovn-controller), can skip the processing and instead call ovsdb_idl_loop_commit_and_wait() if the transaction oject is NULL. But ovn-controller uses IDL tracking and it may loose the tracked changes in that run. This patch tries to improve this scenario, by checking if the pending transaction can be committed in the ovsdb_idl_loop_run() itself and if the pending transaction is cleared (because of the response messages from ovsdb-server due to a transaction message in the previous run), ovsdb_idl_loop_run() can return a valid transaction object. CC: Han Zhou <hzhou@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>	2020-06-16 14:06:50 -07:00
Dumitru Ceara	ae25f8c8ff	ovsdb-idl: Avoid inconsistent IDL state with OVSDB_MONITOR_V3. Assuming an ovsdb client connected to a database using OVSDB_MONITOR_V3 (i.e., "monitor_cond_since" method) with the initial monitor condition MC1. Assuming the following two transactions are executed on the ovsdb-server: TXN1: "insert record R1 in table T1" TXN2: "insert record R2 in table T2" If the client's monitor condition MC1 for table T2 matches R2 then the client will receive the following update3 message: method="update3", "insert record R2 in table T2", last-txn-id=TXN2 At this point, if the presence of the new record R2 in the IDL triggers the client to update its monitor condition to MC2 and add a clause for table T1 which matches R1, a monitor_cond_change message is sent to the server: method="monitor_cond_change", "clauses from MC2" In normal operation the ovsdb-server will reply with a new update3 message of the form: method="update3", "insert record R1 in table T1", last-txn-id=TXN2 However, if the connection drops in the meantime, this last update might get lost. It might happen that during the reconnect a new transaction happens that modifies the original record R1: TXN3: "modify record R1 in table T1" When the client reconnects, it will try to perform a fast resync by sending: method="monitor_cond_since", "clauses from MC2", last-txn-id=TXN2 Because TXN2 is still in the ovsdb-server transaction history, the server replies with the changes from the most recent transactions only, i.e., TXN3: result="true", last-txbb-id=TXN3, "modify record R1 in table T1" This causes the IDL on the client in to end up in an inconsistent state because it has never seen the update that created R1. Such a scenario is described in: https://bugzilla.redhat.com/show_bug.cgi?id=1808580#c22 To avoid this issue, the IDL will now maintain (up to) 3 different types of conditions for each DB table: - new_cond: condition that has been set by the IDL client but has not yet been sent to the server through monitor_cond_change. - req_cond: condition that has been sent to the server but the reply acknowledging the change hasn't been received yet. - ack_cond: condition that has been acknowledged by the server. Whenever the IDL FSM is restarted (e.g., voluntary or involuntary disconnect): - if there is a known last_id txn-id the code ensures that new_cond will contain the most recent condition set by the IDL client (either req_cond if there was a request in flight, or new_cond if the IDL client set a condition while the IDL was disconnected) - if there is no known last_id txn-id the code ensures that ack_cond will contain the most recent conditions set by the IDL client regardless whether they were acked by the server or not. When monitor_cond_since/monitor_cond requests are sent they will always include ack_cond and if new_cond is not NULL a follow up monitor_cond_change will be generated afterwards. On the other hand ovsdb_idl_db_set_condition() will always modify new_cond. This ensures that updates of type "insert" that happened before the last transaction known by the IDL but didn't match old monitor conditions are sent upon reconnect if the monitor condition has changed to include them in the meantime. Fixes: 403a6a0cb003 ("ovsdb-idl: Fast resync from server when connection reset.") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-06-15 14:07:29 +02:00
Mark Michelson	89b522aee3	ovsdb-idl: Add function to reset min_index. If an administrator removes all of the databases in a cluster from disk, then ovsdb IDL clients will have a problem. The databases will all reset their stored indexes to 0, so The IDL client's min_index will be higher than the indexes of all databases in the cluster. This results in the client constantly connecting to databases, detecting the data as "stale", and then attempting to connect to another. This function provides a way to reset the IDL to an initial state with min_index of 0. This way, the client will not wrongly detect the database data as stale and will recover properly. Notice that this function is not actually used anywhere in this patch. This will be used by OVN, though, since OVN is the primary user of clustered OVSDB. Signed-off-by: Mark Michelson <mmichels@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-05-28 18:36:48 +02:00
William Tu	68bc6f88a3	ovsdb-idl: Fix NULL deref reported by Coverity. When 'datum.values' or 'datum.keys' is NULL, some code path calling into ovsdb_idl_txn_write__ triggers NULL deref. An example: ovsrec_open_vswitch_set_cur_cfg(const struct ovsrec_open_vswitch { struct ovsdb_datum datum; union ovsdb_atom key; datum.n = 1; datum.keys = &key; key.integer = cur_cfg; // 1. assign_zero: Assigning: datum.values = NULL. datum.values = NULL; // CID 1421356 (#1 of 1): Explicit null dereferenced (FORWARD_NULL) // 2. var_deref_model: Passing &datum to ovsdb_idl_txn_write_clone,\ // which dereferences null datum.values. ovsdb_idl_txn_write_clone(&row->header_, &ovsrec_open_vswitch_col } And with the following calls: ovsdb_idl_txn_write_clone ovsdb_idl_txn_write__ 6. deref_parm_in_call: Function ovsdb_datum_destroy dereferences datum->values ovsdb_datum_destroy Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: William Tu <u9012063@gmail.com>	2020-05-20 08:49:27 -07:00
Dumitru Ceara	2b7e536fa5	Revert "ovsdb-idl: Avoid sending redundant conditional monitoring updates" This reverts commit 5351980b047f4dd40be7a59a1e4b910df21eca0a. If the ovsdb-server reply to "monitor_cond_since" requests has "found" == false then ovsdb_idl_db_parse_monitor_reply() calls ovsdb_idl_db_clear() which iterates through all tables and unconditionally sets table->cond_changed to false. However, if the client had already set a new condition for some of the tables, this new condition request will never be sent to ovsdb-server until the condition is reset to a different value. This is due to the check in ovsdb_idl_db_set_condition(). One way to replicate the issue is described in the bugzilla reporting the bug, when ovn-controller is configured to use "ovn-monitor-all": https://bugzilla.redhat.com/show_bug.cgi?id=1808125#c6 Commit 5351980b047f tried to optimize sending redundant conditional monitoring updates but the chances that this scenario happens with the latest code is quite low since commit 403a6a0cb003 ("ovsdb-idl: Fast resync from server when connection reset.") changed the behavior of ovsdb_idl_db_parse_monitor_reply() to avoid calling ovsdb_idl_db_clear() in most cases. Reported-by: Dan Williams <dcbw@redhat.com> Reported-at: https://bugzilla.redhat.com/1808125 CC: Andy Zhou <azhou@ovn.org> Fixes: 5351980b047f ("ovsdb-idl: Avoid sending redundant conditional monitoring updates") Acked-by: Han Zhou <hzhou@ovn.org> Acked-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-03-26 22:27:45 +01:00
Han Zhou	ca367fa5f8	ovsdb-idl.c: Allows retry even when using a single remote. When clustered mode is used, the client needs to retry connecting to new servers when certain failures happen. Today it is allowed to retry new connection only if multiple remotes are used, which prevents using LB VIP with clustered nodes. This patch makes sure the retry logic works when using LB VIP: although same IP is used for retrying, the LB can actually redirect the connection to a new node. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-08-21 11:30:03 -07:00

1 2 3 4 5

233 Commits