We found some issues affecting Partial Map Update feature included in
master branch. This patch fixes a memory leak due to lack of freeing datum
allocated in the process of requesting a change to a map. It also fix an
error produced when NDEBUG flag is not set that causes an assertion when
preparing the map to be changed.
Fix of a memory leak not freeing datums.
Change use of ovsdb_idl_read function when preparing changes to maps.
Signed-off-by: arnoldo.lutz.guevara@hpe.com <arnoldo.lutz.guevara@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Calling ovsdb_idl_set_remote() might overwrite the 'idl->session'. The patch
fixes them by freeing 'idl->session' before it is overwritten.
Testcast ovn-controller - ovn-bridge-mappings reports two definitely losts in:
xmalloc (util.c:112)
jsonrpc_session_open (jsonrpc.c:784)
ovsdb_idl_create (ovsdb-idl.c:246)
main (ovn-controller.c:384)
and,
xmalloc (util.c:112)
jsonrpc_session_open (jsonrpc.c:784)
ovsdb_idl_set_remote (ovsdb-idl.c:289)
main (ovn-controller.c:409)
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
In the current implementation, every time an element of either a map or set
column has to be modified, the entire content of the column is sent to the
server to be updated. This is not a major problem if the information contained
in the column for the corresponding row is small, but there are cases where
these columns can have a significant amount of elements per row, or these
values are updated frequently, therefore the cost of the modifications becomes
high in terms of time and bandwidth.
In this solution, the ovsdb-idl code is modified to use the RFC 7047 'mutate'
operation, to allow sending partial modifications on map columns to the server.
The functionality is exposed to clients in the vswitch idl. This was
implemented through map operations.
A map operation is defined as an insertion, update or deletion of a key-value
pair inside a map. The idea is to minimize the amount of map operations
that are send to the OVSDB server when a transaction is committed.
In order to keep track of the requested map operations, structs map_op and
map_op_list were defined with accompanying functions to manipulate them. These
functions make sure that only one operation is send to the server for each
key-value that wants to be modified, so multiple operation on a key value are
collapsed into a single operation.
As an example, if a client using the IDL updates several times the value for
the same key, the functions will ensure that only the last value is send to
the server, instead of multiple updates. Or, if the client inserts a key-value,
and later on deletes the key before committing the transaction, then both
actions cancel out and no map operation is send for that key.
To keep track of the desired map operations on each transaction, a list of map
operations (struct map_op_list) is created for every column on the row on which
a map operation is performed. When a new map operation is requested on the same
column, the corresponding map_op_list is checked to verify if a previous
operations was performed on the same key, on the same transaction. If there is
no previous operation, then the new operation is just added into the list. But
if there was a previous operation on the same key, then the previous operation
is collapsed with the new operation into a single operation that preserves the
final result if both operations were to be performed sequentially. This design
keep a small memory footprint during transactions.
When a transaction is committed, the map operations lists are checked and
all map operations that belong to the same map are grouped together into a
single JSON RPC "mutate" operation, in which each map_op is transformed into
the necessary "insert" or "delete" mutators. Then the "mutate" operation is
added to the operations that will be send to the server.
Once the transaction is finished, all map operation lists are cleared and
deleted, so the next transaction starts with a clean board for map operations.
Using different structures and logic to handle map operations, instead of
trying to force the current structures (like 'old' and 'new' datums in the row)
to handle then, ensures that map operations won't mess up with the current
logic to generate JSON messages for other operations, avoids duplicating the
whole map for just a few changes, and is faster for insert and delete
operations, because there is no need to maintain the invariants in the 'new'
datum.
Signed-off-by: Edward Aymerich <edward.aymerich@hpe.com>
Signed-off-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
Co-authored-by: Arnoldo Lutz <arnoldo.lutz.guevara@hpe.com>
[blp@ovn.org made style changes and factored out error checking]
Signed-off-by: Ben Pfaff <blp@ovn.org>
Allows for auto detection and reconnect if the ovn-remote needs
to change. Ovn-controller test case updated to include testing
this code.
Signed-off-by: RYAN D. MOATS <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add a external-id 'ovn-remote-probe-interval' for setting the activity probe
interval of the json session from ovn-controller to the OVN southbound database.
Signed-off-by: Huang Lei <lhuang8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Deletes need to be reordered as well as inserts and modifies,
otherwise, following tracked changes will see out of order
seqnos.
CC: Shad Ansari <shad.ansari@hpe.com>
Signed-off-by: RYAN D. MOATS <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This attempts to prevent namespace collisions with other list libraries
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Currently changes are added to the front of the track list, so
they are looped through in LIFO order. Incremental processing
is more efficient with a FIFO presentation, so
(1) add new changes to the back of the track list, and
(2) move updated changes to the back of the track list
Signed-off-by: RYAN D. MOATS <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A common error scenario with OVN is to attempt to use ovn-nbctl when
the OVN databases have not been created in ovsdb-server:
1. ovn-nbctl sends a "get_schema" request for the OVN db to ovsdb-server.
2. ovsdb-server fails to find requested db, sends error response
to ovn-nbctl.
3. ovn-nbctl receives the error response in ovsdb_idl_run(), but
takes no specific action.
4. ovn-nbctl hangs forever in IDL_S_SCHEMA_REQUESTED state (assuming
a timeout wasn't requested on the command line).
This commit adds a new IDL state, IDL_S_NO_SCHEMA, which is entered
when a negative response to a schema request is received. When in
this state, ovsdb_idl_is_alive() now returns 'false', allowing clients
(currently ovn-nbctl, ovn-sbctl, vtep-ctl, and ovs-vsctl) to detect this
condition and exit with an appropriate error message.
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A common error scenario with OVN is to attempt to use ovn-nbctl when
the OVN databases have not been created in ovsdb-server:
1. ovn-nbctl sends "get_schema" request for OVN db to ovsdb-server.
2. ovsdb-server fails to find requested db, sends error response
to ovn-nbctl.
3. ovn-nbctl receives the error response in ovsdb_idl_run(), but
takes no specific action.
4. ovn-nbctl hangs forever in IDL_S_SCHEMA_REQUESTED state (assuming
a timeout wasn't requested on the command line).
Add a log message to inform the user of this situation.
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Remove unused implementation of ovsdb_idl_row_apply_diff().
Reported-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Recent IDL change tracking patches allow quick traversal of changed
rows. This patch adds additional support to track changed columns.
It allows an IDL client to efficiently check if a specific column
of a row was updated by IDL.
Signed-off-by: Shad Ansari <shad.ansar@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Fixes the following sparse warning messages:
lib/ovsdb-idl.c:146:12: error: symbol 'table_updates_names' was not
declared. Should it be static?
lib/ovsdb-idl.c:147:12: error: symbol 'table_update_names' was not
declared. Should it be static?
lib/ovsdb-idl.c:148:12: error: symbol 'row_update_names' was not
declared. Should it be static?
Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Add support for monitor2. When idl starts to run, monitor2 will be
attempted first. In case the server is an older version that does
not recognize monitor2. IDL will then fall back to use "monitor"
method.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@ovn.org>
The new comment reflects with more clarity what ovsdb_idl_add_table() does.
Previous comment could be misunderstood, leading to believe that this function
replicates all columns on IDL. Hopefully this fix clarifies that columns are
not replicated, just minimal data for reference integrity is replicated.
A comment in ovsdb_idl_table_class is also modified to better reflect this
behaviour.
Signed-off-by: Edward Aymerich <edward.aymerich@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ovsdb-idl notifies a client that something changed; it does not track
which table, row changed in what way (insert, modify or delete).
As a result, a client has to scan or reconfigure the entire idl after
ovsdb_idl_run(). This is presumably fine for typical ovs schemas where
tables are relatively small. In use-cases where ovsdb is used with
schemas that can have very large tables, the current ovsdb-idl
notification mechanism does not appear to scale - clients need to do a
lot of processing to determine the exact change delta.
This change adds support for:
- Table and row based change sequence numbers to record the
most recent IDL change sequence numbers associated with insert,
modify or delete update on that table or row.
- Change tracking of specific columns. This ensures that changed
rows (inserted, modified, deleted) that have tracked columns, are
tracked by IDL. The client can directly access the changed rows
with get_first, get_next operations without the need to scan the
entire table.
The tracking functionality is not enabled by default and needs to
be turned on per-column by the client after ovsdb_idl_create()
and before ovsdb_idl_run().
/* Example Usage */
idl = ovsdb_idl_create(...);
/* Track specific columns */
ovsdb_idl_track_add_column(idl, column);
/* Or, track all columns */
ovsdb_idl_track_add_all(idl);
for (;;) {
ovsdb_idl_run(idl);
seqno = ovsdb_idl_get_seqno(idl);
/* Process only the changed rows in Table FOO */
FOO_FOR_EACH_TRACKED(row, idl) {
/* Determine the type of change from the row seqnos */
if (foo_row_get_seqno(row, OVSDB_IDL_CHANGE_DELETE)
>= seqno)) {
printf("row deleted\n");
} else if (foo_row_get_seqno(row, OVSDB_IDL_CHANGE_MODIFY)
>= seqno))
printf("row modified\n");
} else if (foo_row_get_seqno(row, OVSDB_IDL_CHANGE_INSERT)
>= seqno))
printf("row inserted\n");
}
}
/* All changes processed - clear the change track */
ovsdb_idl_track_clear(idl);
}
Signed-off-by: Shad Ansari <shad.ansari@hp.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The same function is defined in both ovn-controller.c and
ovn-controller-vtep.c, so worth librarizing.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
idl-loop is needed in implementing other controller (i.e., vtep controller).
So, this commit moves the logic into ovsdb-idl library module.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
Until now, if ovs-vsctl (or another client of the C ovsdb-idl library) was
compiled against a schema that had a column or table that was not in the
database actually being used (e.g. during an upgrade), and the column or
table was selected for monitoring, then ovsdb-idl would fail to get any
data at all because ovsdb-server would report an error due to a request
about a column or a table it didn't know about.
This commit fixes the problem by making ovsdb-idl retrieve the database
schema from the database server and omit any tables or columns that don't
exist from its monitoring request. This works OK for the kinds of upgrades
that OVSDB otherwise supports gracefully because it will simply make the
missing columns or tables appear empty, which clients of the ovsdb-idl
library already have to tolerate.
VMware-BZ: #1413562
Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
struct list is a common name and can't be used in public headers.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
OVSDB has the concept of "immutable" columns, which are columns whose
values are fixed once a row is inserted. Until now, ovs-vsctl has not
allowed these columns to be modified at all. However, this is a little too
strict, because these columns can be set to any value at the time that the
row is inserted. This commit relaxes the ovs-vsctl requirement, then, to
allow an immutable column's value to be modified if its row has been
inserted within this transaction.
Requested-by: Mukesh Hira <mhira@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
If ovs-vsctl has to wait for ovs-vswitchd to reconfigure itself
according to the new database, then sometimes ovs-vsctl could
end up stuck in the event loop if OVSDB connection was dropped
while ovs-vsctl was still running.
This patch fixes this problem by letting ovs-vsctl to reconnect
to the OVSDB, if it has to wait cur_cfg field to be updated.
Issue: 1191997
Reported-by: Spiro Kourtessis <spiro@nicira.com>
Signed-Off-By: Ansis Atteka <aatteka@nicira.com>
This allows other libraries to use util.h that has already
defined NOT_REACHED.
Signed-off-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The MSVC C library printf() implementation does not support the 'z', 't',
'j', or 'hh' format specifiers. This commit changes the Open vSwitch code
to avoid those format specifiers, switching to standard macros from
<inttypes.h> where available and inventing new macros resembling them
where necessary. It also updates CodingStyle to specify the macros' use
and adds a Makefile rule to report violations.
Signed-off-by: Alin Serdean <aserdean@cloudbasesolutions.com>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
jsonrpc_session_recv() handles echo replies prior to this.
Found by inspection.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, ovs-vsctl has kept trying to the database server until it
succeeded or the timeout expired (if one was specified with --timeout).
This meant that if ovsdb-server wasn't running, then ovs-vsctl would hang.
The result was that almost every ovs-vsctl invocation in scripts specified
a timeout on the off-chance that the database server might not be running.
But it's difficult to choose a good timeout. A timeout that is too short
can cause spurious failures. A timeout that is too long causes long delays
if the server really isn't running.
This commit should alleviate this problem. It changes ovs-vsctl's behavior
so that, if it fails to connect to the server, it exits unsuccessfully.
This makes --timeout obsolete for the purpose of avoiding a hang if the
database server isn't running. (--timeout is still useful to avoid a hang
if ovsdb-server is running but ovs-vswitchd is not, for ovs-vsctl commands
that modify the database. --no-wait also avoids that issue.)
Bug #2393.
Bug #15594.
Reported-by: Jeff Merrick <jmerrick@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
For 1000 tunnels with CFM enabled, this reduces CPU use from
about 36% to about 30%.
Bug #15171.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
ovs-vswitchd should only write to write-only columns. Furthermore,
writing to a column which is not write-only can cause serious
performance degradations. This patch causes ovs-vswitchd to log
and reject writes to read-write columns.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Casts are sometimes necessary. One common reason that they are necessary
is for discarding a "const" qualifier. However, this can impede
maintenance: if the type of the expression being cast changes, then the
presence of the cast can hide a necessary change in the code that does the
cast. Using CONST_CAST, instead of a bare cast, makes these changes
visible.
Inspired by my own work elsewhere:
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/cast.h#n80
Signed-off-by: Ben Pfaff <blp@nicira.com>
String to string maps are used all over the Open vSwitch database.
Before this patch, they were implemented in the idl as parallel
string arrays. This strategy has proven a bit cumbersome. With
this patch, string to string maps are implemented using the smap
library.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Originally the IDL transaction state machine had a return value
TXN_TRY_AGAIN to signal the client to wait for a change in the database and
then retry its transaction. However, this logic was incomplete, because
it was possible for the database to change before the reply to the
transaction RPC was received, in which case the client would wait for a
further change. Commit 4fdfe5ccf84c (ovsdb-idl: Prevent occasional hang
when multiple database clients race.) fixed the problem by breaking
TXN_TRY_AGAIN into two status codes, TXN_AGAIN_WAIT that meant to wait for
a further change and TXN_AGAIN_NOW that meant that a change had already
occurred so try again immediately.
This is correct enough, but it is more complicated than necessary. It is
simpler and just as correct to use a single "try again" status that
requires the client to wait for a change relative to the database contents
*before* the transaction was committed. This commit makes that change.
It also changes ovsdb_idl_run()'s return type from bool to void because
its return type is hardly useful anymore.
Signed-off-by: Ben Pfaff <blp@nicira.com>
When a client of the IDL tries to commit a read-modify-write transaction
but the database has changed in the meantime, the IDL tells its client to
wait for the IDL to change and then try the transaction again by returning
TXN_TRY_AGAIN. The "wait for the IDL to change" part is important because
there's no point in retrying the transaction before the IDL has received
the database updates (the transaction would fail in the same way all over
again).
However, the logic was incomplete: the database update can be received
*before* the reply to the transaction RPC (I think that in the current
ovsdb-server implementation this will always happen, in fact). When this
happens, the right thing to do is to retry the transaction immediately;
if we wait, then we're waiting for an additional change to the database
that may never come, causing an indefinite hang.
This commit therefore breaks the "try again" IDL commit status code
into two, one that means "try again immediately" and another that means
"wait for a change then try again". When an update is processed after a
transaction is committed but before the reply is received, the "try again
now" tells the IDL client not to wait for another database change before
retrying its transaction.
Bug #5980.
Reported-by: Ram Jothikumar <rjothikumar@nicira.com>
Reproduced-by: Alex Yip <alex@nicira.com>
Synthetic rows lack a lot of important metadata that the IDL adds to rows
actually obtained from the database, and it's impractical to add that
metadata to synthetic rows. This means that the IDL functions to modify
these rows dereference null pointers and segfault. So, it's really
important not to pass synthetic rows to such functions. However, we've
screwed this up a number of times now and in the end it seems that it's
probably better to just ignore attempts to modify these rows. This commit
implements that.
Feature #8013.
Reported-by: Ethan Jackson <ethan@nicira.com>
Once in a while someone reports a problem caused by running multiple
ovs-vswitchd processes at the same time. This fixes the problem by
requiring ovs-vswitchd to obtain a database lock before taking any actions.
The state machine didn't have a proper state for "not yet committed or
aborted", which meant that destroying an ovsdb_idl_txn without committing
or aborting it caused a segfault. This fixes the problem by adding a new
state TXN_UNCOMMITTED to the state machine.
This is related to commit 79554078d "ovsdb-idl: Fix bad logic in
ovsdb_idl_txn_commit() state transitions", which fixed a related bug.
Bug #2438.