Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Originally the IDL transaction state machine had a return value
TXN_TRY_AGAIN to signal the client to wait for a change in the database and
then retry its transaction. However, this logic was incomplete, because
it was possible for the database to change before the reply to the
transaction RPC was received, in which case the client would wait for a
further change. Commit 4fdfe5ccf84c (ovsdb-idl: Prevent occasional hang
when multiple database clients race.) fixed the problem by breaking
TXN_TRY_AGAIN into two status codes, TXN_AGAIN_WAIT that meant to wait for
a further change and TXN_AGAIN_NOW that meant that a change had already
occurred so try again immediately.
This is correct enough, but it is more complicated than necessary. It is
simpler and just as correct to use a single "try again" status that
requires the client to wait for a change relative to the database contents
*before* the transaction was committed. This commit makes that change.
It also changes ovsdb_idl_run()'s return type from bool to void because
its return type is hardly useful anymore.
Signed-off-by: Ben Pfaff <blp@nicira.com>
When a client of the IDL tries to commit a read-modify-write transaction
but the database has changed in the meantime, the IDL tells its client to
wait for the IDL to change and then try the transaction again by returning
TXN_TRY_AGAIN. The "wait for the IDL to change" part is important because
there's no point in retrying the transaction before the IDL has received
the database updates (the transaction would fail in the same way all over
again).
However, the logic was incomplete: the database update can be received
*before* the reply to the transaction RPC (I think that in the current
ovsdb-server implementation this will always happen, in fact). When this
happens, the right thing to do is to retry the transaction immediately;
if we wait, then we're waiting for an additional change to the database
that may never come, causing an indefinite hang.
This commit therefore breaks the "try again" IDL commit status code
into two, one that means "try again immediately" and another that means
"wait for a change then try again". When an update is processed after a
transaction is committed but before the reply is received, the "try again
now" tells the IDL client not to wait for another database change before
retrying its transaction.
Bug #5980.
Reported-by: Ram Jothikumar <rjothikumar@nicira.com>
Reproduced-by: Alex Yip <alex@nicira.com>
Synthetic rows lack a lot of important metadata that the IDL adds to rows
actually obtained from the database, and it's impractical to add that
metadata to synthetic rows. This means that the IDL functions to modify
these rows dereference null pointers and segfault. So, it's really
important not to pass synthetic rows to such functions. However, we've
screwed this up a number of times now and in the end it seems that it's
probably better to just ignore attempts to modify these rows. This commit
implements that.
Feature #8013.
Reported-by: Ethan Jackson <ethan@nicira.com>
Once in a while someone reports a problem caused by running multiple
ovs-vswitchd processes at the same time. This fixes the problem by
requiring ovs-vswitchd to obtain a database lock before taking any actions.
The state machine didn't have a proper state for "not yet committed or
aborted", which meant that destroying an ovsdb_idl_txn without committing
or aborting it caused a segfault. This fixes the problem by adding a new
state TXN_UNCOMMITTED to the state machine.
This is related to commit 79554078d "ovsdb-idl: Fix bad logic in
ovsdb_idl_txn_commit() state transitions", which fixed a related bug.
Bug #2438.
Commit 1cc618c3252 "ovsdb-idl: Fix atomicity of writes that don't change a
column's value" fixed transactions that write the existing value to
some columns, ensuring that those columns still got written to the database
to avoid making the transaction nonatomic in the presence of writes that do
modify part of the database.
However, that commit was too conservative: we can still optimize out a
database transaction that writes *only* existing values to the database,
because if we drop such a transaction then the resulting database is still
one that could result from executing transactions in a serial order. This
commit implements that optimization.
As an example of what this commit does, before this commit, an "ovs-vsctl
set" command that specified the existing value for a column would do a
round-trip to the database to write that existing value. After this
commit, that round-trip would not occur.
Found by observing system startup.
Until now, ovs-vswitchd has been unable to configure IP addresses and
routes for bridges whose Bridge records lack a Port and an Interface
record for the bridge's local port (e.g. OFPP_LOCAL, the port with the
same name as the bridge itself). When such a bridge was reconfigured,
ovs-vswitchd would output a log message that worried people.
This commit fixes the internal limitation that led to the message being
printed.
Bug #5385.
Deciding what delete operations to issue on garbage-collected tables has
been a bit of a difficult issue for ovs-vsctl. When garbage collection was
introduced in commit c5f341a "ovsdb: Implement garbage collection",
ovs-vsctl did not issue any deletions for these tables at all. As a side
effect, ovs-vsctl did not notice that records were going to be deleted.
That meant that when multiple commands were issued in one ovs-vsctl run,
ovs-vsctl could get confused by apparent duplicate records that did not
in fact exist. Commit 28a14bf "ovs-vsctl: Back out garbage collection
changes" fixed the problem by putting all of the explicit deletions back
into ovs-vsctl.
However, adding these explicit deletions had the price that it then became
(again) impossible to use ovs-vsctl commands to delete duplicates, for
example to use "ovs-vsctl del-br" to delete a bridge that points to the
same Port records that some other Bridge record also does. This commit
makes that possible again, by implementing a compromise:
* Internally, ovs-vsctl deletes the records that it believes should be
deleted.
* ovsdb-idl suppresses the deletions when it makes the RPC call into
the database server.
Bug #5358.
Reported-by: Henrik Amren <henrik@nicira.com>
The existing ovsdb_idl_txn_commit() drops any writes that don't change a
column's value from what was last reported by the database. But this isn't
a valid optimization, because it breaks the atomicity of transactions.
Suppose columns A and B initially have values 1 and 2. Client 1 writes
value 1 to both columns in one transaction. Client 2 writes value 2 to
both columns in another transaction. The only possible valid results for
any serial ordering of transactions are 1,1 or 2,2. But if both clients
drop writes to columns that they have not modified, then 2,1 also becomes
possible (because client 1 just writes to B and client 2 just writes to A).
However, for write-only columns we can optimize this out because the IDL
can assume it is the only client writing to a column.
Found by inspection.
A JSONRPC_REPLY message always have a nonnull 'id' member, as ensured by
jsonrpc_msg_is_valid(). Checking for NULL here confused Coverity into
believing that the call to ovsdb_idl_txn_process_reply() just below could
cause a null pointer dereference, since ovsdb_idl_txn_process_reply() uses
the 'id' member without checking it for null.
Coverity #10713.
The IDL can only verify prerequisites for columns that it is monitoring,
but it didn't check for that. This assertion (which is the same as one in
ovsdb_idl_txn_write()) should alert us to such problems.
This would have found the problem fixed by the previous commit.
Until now, by default the IDL replicated all tables and all columns in the
database, and a few functions made it possible to avoid replicating
selected columns. This commit adds a mode in which nothing is replicated
by default and the client code is responsible for specifying each column
and table that it is interested in. The following commit adds a user for
this mode.
All of these changes avoid using the same name for two local variables
within a same function. None of them are actual bugs as far as I can tell,
but any of them could be confusing to the casual reader.
The one in lib/ovsdb-idl.c is particularly brilliant: inner and outer
loops both using (different) variables named 'i'.
Found with GCC -Wshadow.
Not replicating unneeded columns has some value in avoiding CPU time and
bandwidth to the database. In ovs-vswitchd, setting cur_cfg as write-only
also have great value in avoiding extra reconfiguration steps. When
ovs-vsctl is used in its default mode this essentially avoids half of the
reconfigurations that ovs-vswitchd currently does. What happens now is:
1. ovs-vsctl updates the database and increments next_cfg.
2. ovs-vswitchd notices the change to the database, reconfigures
itself, then increments cur_cfg to match next_cfg.
3. The database sends the change to cur_cfg back to ovs-vswitchd.
4. ovs-vswitchd reconfigures itself a second time.
By not replicating cur_cfg we avoid step 3 and save a whole reconfiguration
step.
Also, now that the database contains interface statistics, this avoids
reconfiguring every time that statistics are updated.
ovs-vswitchd has no need to replicate some parts of the database. In
particular, it doesn't need to replicate the bits that it never reads,
such as the external_ids column in the Open_vSwitch table. This saves
some memory, CPU time, and bandwidth to the database.
Another type of column that benefits from special treatment is "write-only
columns", that is, those that ovs-vswitchd writes and keeps up-to-date but
never expects another client to write, such as the cur_cfg column in the
Open_vSwitch table. If the IDL reports that the database has changed when
ovs-vswitchd updates such a column, then ovs-vswitchd reconfigures itself
for no reason, wasting CPU time. This commit also adds support for such
columns.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
The existing ovsdb_idl_txn_read() was somewhat difficult and expensive to
use, because it always made a copy of the data in the column. This was
necessary at the time it was introduced, because there was no way for it
to return a "default" value for columns that had not yet been populated
without allocating data and hence requiring the caller to free it.
Now that ovsdb_datum_default() exists, this is no longer required. This
commit introduces a pair of new functions, ovsdb_idl_read() and
ovsdb_idl_get(), that return a pointer to existing data and do not do any
copying. It also transitions all of ovsdb_idl_txn_read()'s callers to
the new interfaces.
Commit cde3f1 "ovsdb-idl: Drop unnecessary allocation from
ovsdb_idl_txn_insert()." does lazy allocation of row->written
on the assumption that ovsdb_idl_txn_write() will handle it.
However, this isn't the case for empty rows created by something
like "ovs-vsctl init" so add a check before reading the bitfield.
This makes it easy to create a bunch of records that are all related to
each other in a single ovs-vsctl invocation. It adds an example to the
ovs-vsctl manpage.
This patch fixed the following compile warning:
lib/ovsdb-idl.c: In function 'ovsdb_idl_txn_process_inc_reply':
lib/ovsdb-idl.c:1524: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'size_t'
lib/ovsdb-idl.c:1538: warning: format '%ld' expects type 'long int', but argument 5 has type 'long long int'
lib/ovsdb-idl.c:1550: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'size_t'
lib/ovsdb-idl.c: In function 'ovsdb_idl_txn_process_insert_reply':
lib/ovsdb-idl.c:1579: warning: format '%u' expects type 'unsigned int', but argument 5 has type 'size_t'
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
The fatal-signal library notices and records fatal signals (e.g. SIGTERM)
and terminates the process on the next trip through poll_block(). But
some special utilities do not always invoke poll_block() promptly, e.g.
"ovs-ofctl monitor" does not call poll_block() as long as OpenFlow messages
are available. But these special cases seem like they are all likely to
call into functions that themselves block (those with "_block" in their
names). So make a new rule that such functions should always call
fatal_signal_run(), either directly or through poll_block(). This commit
implements and documents that rule.
Bug #2625.
Before OVSDB was adopted in the vswitch, bridge ioctls were synchronous.
That is, an operation that, say, creates a new bridge was guaranteed to
have completed before brcompatd returned a success result to the kernel.
When OVSDB was adopted, however, we failed to maintain this property.
Instead, bridge creation (etc.) only happened some time after the return
value was passed back to the kernel. This causes a race condition against
software that creates or deletes bridges or ports and expects that the
operation is completed synchronously.
This commit restores the synchronous behavior.
Bug #2443.
When a transaction is in progress, newly inserted rows have NULL 'old'
values. These rows are not orphans, so ovsdb_idl_row_is_orphan() should
not treat them as such.
I do not believe that this changes behavior at all, because I have not been
able to find a case where ovsdb_idl_row_is_orphan() is called while a
transaction is in progress. It is a code cleanup.
The IDL was returning rows that had existed in the database and were
deleted by the current transaction (that is, row->old && !row->new).
This commit fixes the problem.
The condition used by next_real_row() was just blatantly wrong and
illogical. The correct condition is row->new != NULL. The old condition
only got one case wrong (the one mentioned above), even though it didn't
make much sense.
This fixes an ovs-vsctl call that assert-failed in a "set" command that
iterated through a table from which a previous ovs-vsctl command (in the
same invocation) had deleted a row.
If sending the transaction fails (jsonrpc_session_send() returns 0),
then we need to transition to TXN_TRY_AGAIN. (Transitioning to
TXN_INCOMPLETE is actually a no-op, because at this point in the code
we are guaranteed to be in that state already.)
Leaving the transaction in TXN_INCOMPLETE causes a segfault later in
ovsdb_idl_txn_destroy() when it calls hmap_remove() on the transaction's
txn_node.
This bug reveals a hole in the ovsdb_idl_txn state machine: destroying
a transaction without committing it or aborting it will cause the same
problem. This problem is *not* fixed by this patch: it really should be
handled by adding a new state TXN_UNCOMMITTED that indicates that the
transaction is not yet committed or aborted. That's too much for this
patch, and doesn't really matter for OVS at the moment since none of its
code paths destroy a transaction without committing or aborting it.
Bug #2435.
This also adds protocol compatibility to the database itself and to
ovsdb-client. It doesn't actually add multiple database support to
ovsdb-server, since we don't really need that yet.
The ovs-vsctl "create" command, and perhaps other commands, should print
the UUID of the newly created database row, but until now the IDL has not
provided a way to find that out. This commit adds the ability.
ovs-vsctl wants to use these functions directly, so make them available
through the ovsdb-idl public header instead of only through the private
one.
Also, change the prototypes to make them usable without casts.