This patch adds ASAN and UBSAN GitHub action tests for both
the userspace and kernel datapaths.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
This patch combines the existing ubsan and asan GitHub actions
tests into one.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Previously tests that were generating a SIGSEGV were excluded
from UBSAN runs. With the correct environment variable, these
tests can be run. This is working even with the clang version
supplied by Ubuntu 20.04.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
This patch identifies new static analysis issues during a GitHub action
run and reports them. The process involves analyzing the changes introduced
in the current commit and comparing them to those in the preceding commit.
However, there are two cases when the GitHub push action runner does not
provide enough details to determine the preceding commit. These cases are
a new branch or a forced push. The strategy for these exceptions is to
find the first common commit on any upstream branch, and use that.
An example error output might look like this:
error level: +0 -0 no changes
warning level: +2 +0
New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
note level: +0 -0 no changes
all levels: +2 +0
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
The make clang-analyze target reports an 'Dereference of null
pointer' and an 'Uninitialized argument value' issue due to
it assumes some function can return NULL.
This patch annotates these functions, so the static analyzer
is aware of this.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Large time warps can cause the 'long flow dump duration' log message
to happen. However, due to the level change, they could now cause
failures. This patch will stop these messages from failing the tests.
Fixes: 9bcfb8fb7784 ("ofproto-dpif-upcall: Change flow dump duration message to WARN level.")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Basic relay and active-backup command execution tests extended
to run some copies of ovsdb-server processes with --config-file.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
OVSDB server maintains a temporary file with the current database
configuration for the case it is restarted by a monitor process
after a crash. On startup the configuration from command line
arguments is stored there in a JSON format, also whenever user
changes the configuration with different UnixCtl commands, those
changes are getting added to the file. When restarted from the
crash it reads the configuration from the file and continues
with all the necessary remotes and databases.
This change allows it to be an external user-provided file that
OVSDB server will read the configuration from. The file can be
specified with a --config-file command line argument and it is
mutually exclusive with most other command line arguments that
set up remotes or databases, it is also mutually exclusive with
use of appctl commands that modify same configurations, e.g.
add/remove-db or add/remove-remote.
If the user wants to change the configuration of a running server,
they may change the file and call ovsdb-server/reload appctl.
OVSDB server will open a file, read and parse it, compare the
new configuration with the current one and adjust the running
configuration as needed. OVSDB server will try to keep existing
databases and connections intact, if the change can be applied
without disrupting the normal operation.
User-provided files are not trustworthy, so extra checks were
added to ensure a correct file format. If the file cannot be
correctly parsed, e.g. contains invalid JSON, no changes will
be applied and the server will keep using the previous
configuration until the next reload.
If config-file is provided for active-backup databases, permanent
disconnection of one of the backup databases no longer leads to
switching all other databases to 'active'. Only the disconnected
one will transition, since all of them have their own records in
the configuration file.
With this change, users can run all types of databases within
the same ovsdb-server process at the same time.
Simple configuration may look like this:
{
"remotes": {
"punix:db.sock": {},
"pssl:6641": {
"inactivity-probe": 16000,
"read-only": false,
"role": "ovn-controller"
}
},
"databases": {
"conf.db": {},
"sb.db": {
"service-model": "active-backup",
"backup": true,
"source": {
"tcp:127.0.0.1:6644": null
}
},
"OVN_Northbound": {
"service-model": "relay",
"source": {
"ssl:[fe:::1]:6642,ssl:[fe:::2]:6642": {
"max-backoff": 8000,
"inactivity-probe": 10000
}
}
}
}
}
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Allow setting all the JSON-RPC session options at once.
While at it, allow updating options the same way the source
can be updated while calling 'ovsdb_relay_add_db()' if the
relay is already configured.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Allow setting all the options for the source connection, not only the
inactivity probe interval.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Set all the options for the source connection, not only the
inactivity probe interval.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Just introduced structure 'jsonrpc_session_options' is the same
as part of the 'ovsdb_jsonrpc_options'. In fact, these options
do really belong to a lower layer. So, replace a copy of these
fields with a structure, so it can be easily passed to jsonrpc's
'jsonrpc_session_set_options()'.
Not creating separate JSON parsing/formatting functions to avoid
creating an extra nesting level for the users who will write
the JSON definition in a configuration file. I.e. keeping the
JSON object flat. Also, not changing the 'db_config->options'
to be 'jsonrpc_session_options', even though we don't need the
'role' or 'read-only' fields. This allows us to use the same
JSON parsing function for both the remotes ans database sources.
Can be changed in the future, but for now keeping as is to avoid
extra code complication.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
It's useful to have a way to update all the JSON-RPC session
options all at once and not call 3 separate functions every
time. This may also allow the internals of these options
to be better abstracted, i.e. allow users to not know what
are these options exactly.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
It is currently not possible for the role to change in runtime
(unless a manual DB transaction is crafted), but it will be with
addition of a config file.
If the role changes, listening socket will be closed, and all
the connections to this remote will be terminated.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Adding a --config-file option that will be used in the future to
allow users to provide the database server configuration via a
JSON file.
For now, it does nothing useful, but we define it as mutually
exclusive with all the command line options and UnixCtl commands
that configure values that will be available via a config file.
This will ensure that we don't have too many ways of configuring
the same thing at the same time.
New appctl command 'ovsdb-server/reload' is going to signal OVSDB
server that it needs to re-read the configuration file.
While at it, adding a missing 'usage' line for '--no-dbs'. This
option is rarely used, so it doesn't seem to be worth a separate fix.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Add a new structure 'db_config' that holds the user-provided
configuration of the database. And attach this configuration
to each of the databases on the server.
Each database has a service model: standalone, clustered, relay
or active-backup. Relays and A-B databases have a source, each
source has its own set of JSON-RPC session options. A-B also
have an indicator of it being active or backup and an optional
list of tables to exclude from replication.
All of that should be stored per database in the temporary
configuration file that is used in order to restore the config
after the OVSDB crash. For that, the save/load functions are
also updates.
This change is written in generic way assuming all the databases
can have different configuration including service model.
The only user-visible change here is a slight modification of
the ovsdb-server/sync-status appctl, since it now needs to
skip databases that are not active-backup and also should report
active-backup databases that are currently active, i.e. not
added to the replication module.
If the service model is not defined in the configuration, it
is assumed to be standalone or clustered, and determined from
the storage type while opening the database. If the service
model is defined, but doesn't match the actual storage type
in the database file, ovsdb-server will fail to open the
database. This should never happen with internally generated
config file, but may happen in the future with user-provided
configuration files. In this case the service model is used
for verification purposes only, if administrator wants to
assert a particular model.
Since the database 'source' connections can't use 'role' or
'read-only' options, a new flag added to corresponding JSON
parsing functions to skip these fields.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When database is added to the replication, it should no longer
accept transactions that can modify it. When it's removed from
the replication, it should be writable again. Add this logic
to the replication module itself, so it can be removed from the
main ovsdb-server later.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Refactoring of the replication code, so each database is handled
separately from each other. Supposed to work the same way as before
with the only difference that each backup database will have its own
connection to the source and will have its own state machine.
From the user's perspective, the only visible difference is that
ovsdb-server/sync-status appctl now shows the status of each
database separately.
If one of the connections is permanently broken, all the databases
will be switched to active. This is done in order to preserve the
old behavior where we had only one connection.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Small refactoring so we can re-use this function in later commits.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Store JSON-RPC options for each remote separately, so it will be
possible to have different configurations per remote in the future.
These are also stored to and loaded from the temporary file that
OVSDB is using to restore runtime configuration of the server
restarted by the monitor process after a crash.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
These functions will be needed when we'll need to load/save
configuration of each OVSDB remote separately.
The parsing function is written in a way that it updates the
provided options and doesn't create a new structure. This
is done in order for different callers to have their own
default values and only update them with what is provided
by the user explicitly. For example, replication and relay
have different default probe intervals.
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Currently, the read-only option can be set on connections or JSON-RPC
server as a whole. However, there is no way to allow modifications in
one database, but not in the other.
Adding an internal read-only flag for a database itself. Will be used
later for running active and backup databases in a single process.
Marking the _Server database as read only is not necessary, because
modifications of internal databases are not allowed anyway, but it
doesn't hurt.
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Martin Kennelly observes that even though this data is available to
humans via the journal/log files, these aren't exactly easy for a
developer to make any kind of behavioral inferences. This kind of
log and counter would be useful when checking on system health to
let us know that an Open vSwitch component is noticing some kind of
system level hiccup.
Add a new coverage counter to track information on these events, and
let a developer or system engineer know how long these events have
occurred with some historical context.
Reported-at: https://lists.linuxfoundation.org/pipermail/ovs-discuss/2023-June/052523.html
Suggested-by: Martin Kennelly <mkennell@redhat.com>
Co-Authored-By: Aaron Conole <aconole@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Currently the 'Spent an unreasonably long Xms dumping flows' message
is set to the INFO level. However, based on this, we are also
drastically limiting the number of flows in the datapath, and this
would warrant a WARNING level.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Aaron Conole <aconole@redhat.com>
In case the difference between 'old' and 'new' rows is readily
available, it can be used to construct added/removed datums
instead. Diffs are typically much smaller than the column
itself. This change more than doubles the performance of a
transaction replay.
For example, with this change applied, initial read of OVSDB
file containing 136K small transactions for large OVN port
groups and address sets on my laptop takes 11 seconds vs 24
seconds without.
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Database file contains the column diff, but it is discarded once
the 'new' state of a row is constructed. Keep it in the transaction
row, as it can be used later by other parts of the code.
Diffs do not live long, we keep them around only while transaction
is alive, so should not affect memory consumption.
Users for this data will be added in later commits.
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Add new 'monitor' command to test-ovsdb utilities to make them just
run IDL loop infinitely. Other commands can still be placed before
the 'monitor', e.g. setting up conditions, tracking, running a few
transactions.
Having that, adding a couple test cases for IDL with online database
conversion. Test checks that IDL receives monitor cancellation
notification and successfully re-sends monitor requests.
While at it, adding debug logging to ovsdb-server processes for easier
debugging.
While working on a test the issue was discovered that schema for
standalone databases is not getting updated in the _Server database
after conversion. Checking the new schema bits only for clustered
databases for now.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Simon Horman <horms@ovn.org>
Currently python-ovs claims to be "db change aware" but does not
parse the "monitor_canceled" notification. Transactions can continue
being made, but the monitor updates will not be sent. This handles
monitor_cancel similarly to how ovsdb-cs currently does.
Fixes: c39751e445 ("python: Monitor Database table to manage lifecycle of IDL client.")
Signed-off-by: Terry Wilson <twilson@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Simon Horman <horms@ovn.org>
All existing test iterations assume that the FTP server is running on a
standard port, which may not always be the case. These tests helped find
problems in conntrack alg processing with non-standard ports.
Perform the necessary adjustments to ensure the test suite can start the
L7 server on a user-provided port.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@arknetworks.am>
Signed-off-by: Aaron Conole <aconole@redhat.com>
When a packet hits a flow rule without an explicitly specified helper,
OvS has to rely on automatic application layer gateway detection to
find related connections. This works as long as services are running on
their standard ports, e.g. when FTP servers use TCP port 21.
However, sometimes it's necessary to run services on non-standard ports.
In that case, there is no way for OvS to guess which protocol is used
within a given flow. Of course, this means that no related connections
can be recognized.
When a connection is committed with a particular helper, it's reasonable
to assume this helper will be used in subsequent CT actions, as long as
they don't override it. Achieve this behaviour by using the committed
connection's helper when a flow rule does not specify one.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@arknetworks.am>
Acked-by: Ivan Malov <ivan.malov@arknetworks.am>
Signed-off-by: Aaron Conole <aconole@redhat.com>
The current protocol detection logic relies on two pieces of metadata
passed as arguments: tp_src and tp_dst, which represent the L4 source
and destination port numbers from the flow that triggered the current
flow rule first, and was responsible for creating the current DP flow.
Since multiple network flows of many different kinds, potentially using
different protocols on all layers, can be processed by one flow rule,
using the metadata of some unrelated flow might lead to unexpected
results. For example, ICMP type and code can be interpreted as TCP
source and destination ports. This can confuse the code responsible for
the helper selection, leading to errors in traffic handling and
incorrect detection of related flows.
One of the easiest ways to fix this problem is to simply remove the
tp_src and tp_dst parameters from the picture. The current code base has
no good use for them.
The helper selection logic was based on these values and therefore needs
to be changed. Ensure that the helper specified in a flow rule is used,
given it is compatible with the L4 protocol of the packet. When a flow
rule does not specify a helper, one can still be picked using the given
packet's metadata like TCP/UDP ports.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@arknetworks.am>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Linux kernel commit ebddb1404900 ("net: move the nat function to
nf_nat_ovs for ovs and tc") introduced a regression into the kernel
datapath which prevented the openvswitch match key from being updated
when nat was undone for packets in the related conntrack state. This
issue caused these packets (usually ICMP/ICMPv6 error packets) to
match the wrong openflow rule.
This issue was fixed in linux kernel commit e6345d2824a3 ("netfilter:
nf_nat: fix action not being set for all ct states").
This test will fail for linux kernel versions v6.2 to v6.6, so test is
skipped for versions lower than v6.7.
Link: https://lore.kernel.org/netdev/20231221224311.130319-1-brad@faucet.nz/
Suggested-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Brad Cowie <brad@faucet.nz>
Signed-off-by: Aaron Conole <aconole@redhat.com>
The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
on systems with only mlx5, this test is always skipped.
Besides, the test tries to grab the first device listed by dpdk-devbind.py,
regardless of the PCI device status regarding kmod binding.
Remove dependency on this DPDK script and use a minimal script that
reads PCI sysfs.
This script is not perfect, as one can imagine PCI devices bound to
vfio-pci for virtual machines.
Plus, this script only tries to take over vfio-pci devices. mlx5 devices
can't be taken over blindly as it could mean losing connectivity to the
machine if the netdev was in use for this system.
For those two reasons, add a new environment variable DPDK_PCI_ADDR for
testers to select the PCI device of their liking.
For consistency and grep, the temporary file PCI_ADDR is renamed
to DPDK_PCI_ADDR.
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
codespell dictionary contains a list of widely used words
which enchant alone could fail on. for an example:
refcount, pthread, enqueuing, etc.
Load that dictionary, if exists, into enchant spell checker.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
The enchant dictionary synchronizes additions to the source file.
Keep the two word source separate by adding the extra words only
to the current session.
Fixes: 999c7773a6 ("checkpatch: add a comment spell-checker")
Signed-off-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
vconn_sent counter is supposed to increase each time send() return
0, no matter if the vconn log debug is on or off.
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Cheng Li <lic121@chinatelecom.cn>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The documentation for log_backtrace() states the backtrace is
logged at DEBUG level, while in reality it is logged at ERROR
level.
Fixes: d0b99d38ed ("backtrace: Add log_backtrace()")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
It's not a big problem, but it would be nice to ensure that
the backup database cannot be locally converted.
Fixes: e51879e99b ("ovsdb: Make OVSDB backup sever read only")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The DSCP_DEFAULT is not zero and is a value that supposed
to be used for all connections by default.
Fixes: f125905cdd ("Allow configuring DSCP on controller and manager connections.")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
We compare the logs in some tests, for example record/replay tests.
And those fail if for some reason the JSON object traversal happens
in the different order.
Sort the output in debug logs in order to fix sporadic test failures.
Should not affect performance in real-world cases as the actual
outgoing message is still not sorted.
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
It's easier to analyze failures when the lines that are different
are shown next to each other.
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
While reassessing weak references the code attempts to collect added
and removed atoms, even if the column didn't change. In case the
column contains a large set, it may take significant amount of time
to process.
Add a check for the column actually being changed either by removing
references to deleted rows or by direct removal.
For example, rows in OVN Port_Group tables frequently have two large
sets - 'ports' and 'acls'. In case a new ACL is added to the set
without changing the ports, ports don't need to be reassessed.
Fixes: 4dbff9f0a6 ("ovsdb: transaction: Incremental reassessment of weak refs.")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
While counting strong references current code first generates a
difference between old and new datums and only after it checks
the types of the atoms to be strong references.
Similar thing happens while assessing weak references. First
the added/removed are generated and then we check for atoms
to be weak references.
Check the type first to avoid unnecessary work. This change
doubles the performance of transactions that modify large sets
of references.
For example, with this change applied, initial read of OVSDB
file containing 136K transactions of large OVN port groups
and address sets on my laptop takes 24 seconds vs 43 seconds
without.
Fixes: 4dbff9f0a6 ("ovsdb: transaction: Incremental reassessment of weak refs.")
Fixes: b2712d026e ("ovsdb: transaction: Use diffs for strong reference counting.")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Datum of UUID and _version columns is accessed directly via
ovsdb_row_get_uuid_rw() and ovsdb_row_get_version_rw() functions
instead of ovsdb_data_* functions. Meaning, the data will be
directly modified even if it is shared between rows.
Fix that by unsharing the data whenever RW pointer is taken.
The issue was mostly hidden because weak reference assessment
code always called ovsdb_datum_subtract() even if not needed.
This way all the new transaction rows were always implicitly
unshared.
Also making ovsdb_datum_subtract() call conditional, so the
issue can be hit by existing unit tests.
Fixes: 485ac63d10 ("ovsdb: Add lazy-copy support for ovsdb_datum objects.")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>