Update the CI and docs to use DPDK 23.11.1.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Remove the notion of cluster/leave --force since it was never
implemented. Instead of these instructions, document how a broken
cluster can be re-initialized with the old database contents.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Update link to pacemaker main page as the existing link is broken.
Also, use HTTPS.
Broken link flagged by make check-docs
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <horms@ovn.org>
The previously enabled 'hacking' checks were only applicable to Python 2
code. OVS hasn't supported Python 2 for a while now, so it's fine to
remove the dependency on hacking.
A similar change landed in OVN a while ago:
https://github.com/ovn-org/ovn/commit/271186fa7d76
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Recently OVS adopted a policy of using the inclusive naming word list v1
[1, 2].
In keeping with this policy rename the primary development branch from
'master' to 'main'. This patch does not actually make that change, but
rather updates references to the branch in documentation in the source
tree. It is intended to be applied at (approximately) the same time
that the change is made.
OVS is currently hosted on GitHub. We can expect the following behaviour
after the rename:
1. GitHub pull requests against a renamed branch are automatically
re-homed on the new branch
2. GitHub Issues do not seem to be affected - at least the test issue I
created had no association with a branch
3. URLs accessed via the GitHub web UI are automatically renamed
(so long as a new branch called master is not created).
4. Using the git cli command, fetch will fetch the new branch (main),
and fetch -p will remove (prune) the old branch (master)
[1] df5e5cf4318a ("Documentation: Add section on inclusive language.")
[2] https://inclusivenaming.org/word-lists/
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
---
Notes:
* Now is the time to raise any concerns regarding this patch.
It is planned to implement this change next week.
* If you have an automation that fetches the master branch then
the suggested action is:
1. Before the branch rename occurs: update the automation to pull main and
fall back to pulling master if that fails
2. After the rename occurs: Update the automation to only fetch main
* After the change it may be necessary to update your local
git configuration for checked out branches.
For example:
# Fetch origin: new remote main branch; remote master branch is deleted
git fetch -tp origin
# Rename local branch
git branch -m master main
# Update local main branch to use remote main branch as its upstream
git branch --set-upstream-to=origin/main main
* As a follow-up, after the rename, I plan to post a patch which removes
references to master in CI jobs
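The fallback suggested for automation in the notes above could be
sketched as follows. This is only an illustration: pick_branch() and
its arguments are hypothetical names, and a real automation would get
the list of remote branches from git itself.

```python
def pick_branch(remote_branches, preferred=("main", "master")):
    """Return the first preferred branch that exists on the remote, else None."""
    available = set(remote_branches)
    for name in preferred:
        if name in available:
            return name
    return None


# Before the rename only 'master' exists; afterwards only 'main' does.
print(pick_branch(["master", "branch-3.2"]))
print(pick_branch(["main", "branch-3.2"]))
```

Once the rename has happened, the 'master' entry can be dropped from
the preferred list.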
The Kernel datapath is no longer present in the primary development
branch of the OVS tree. Update documentation to more clearly reflect
this.
Documentation relating to the kernel datapath in the OVS tree can
be removed once 2.17 is EOL.
Also, update wording of affected text as there is more than one upstream
networking maintainer these days.
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
During normal operations, it is useful to understand when a particular flow
gets removed from the system. This can be useful when debugging performance
issues tied to ofproto flow changes, trying to determine deployed traffic
patterns, or while debugging dynamic systems where ports come and go.
Prior to this change, there was a lack of visibility around flow expiration.
The existing debugging infrastructure could tell us when a flow was added to
the datapath, but not when it was removed or why.
This change introduces a USDT probe at the point where the revalidator
determines that the flow should be removed. Additionally, we track the
reason for the flow eviction and provide that information as well. With
this change, we can track the complete flow lifecycle for the netlink
datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and
the revalidator USDT, letting us watch as flows are added and removed from
the kernel datapath.
This change only enables this information via USDT probe, so it won't be
possible to access this information any other way (see:
Documentation/topics/usdt-probes.rst).
Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
which serves as a demonstration of how the new USDT probe might be used
going forward.
Co-authored-by: Aaron Conole <aconole@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Kevin Sprague <ksprague0711@gmail.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Recently OVS adopted a policy of using the inclusive naming word list v1
[1, 2].
This patch addresses the use of the term master repository by
using the term main repository instead.
This is as distinct from addressing the use of a master branch,
which remains as a follow-up task.
[1] df5e5cf ("Documentation: Add section on inclusive language.")
[2] https://inclusivenaming.org/word-lists/
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
OpenSSL 1.1.0 changed the library names from libeay32 and ssleay32 to
standard libssl and libcrypto. All the versions of OpenSSL that used
old names reached their official EoL, so it should be safe to just
migrate to new names. They can still be supported via premium support
option, but I don't think that is important for us.
Also, OpenSSL installers for older versions had the following folder
structure:
    C:\OPENSSL-WIN64\
    +---bin
    +---include
    |   +---openssl
    +---lib
        |   libeay32.lib
        |   ssleay32.lib
        +---VC
                libeay32MD.lib
                libeay32MDd.lib
                libeay32MT.lib
                libeay32MTd.lib
                ssleay32MD.lib
                ssleay32MDd.lib
                ssleay32MT.lib
                ssleay32MTd.lib
With newer OpenSSL 3+ the structure is different:
    C:\OPENSSL-WIN64
    +---bin
    +---include
    |   +---openssl
    +---lib
        +---VC
            +---x64
                +---MD
                |       libcrypto.lib
                |       libssl.lib
                +---MDd
                |       libcrypto.lib
                |       libssl.lib
                +---MT
                |       libcrypto.lib
                |       libssl.lib
                +---MTd
                        libcrypto.lib
                        libssl.lib
Basically, instead of one generic library in the lib folder and a bunch
of differently named versions of it for different types of linkage, we
now have multiple instances of the library located in different folders
based on the linkage type. So, we have to provide an exact path in
order to find the library.
'lib/VC/x64/MT' was chosen in this patch since that is the linkage
used when building with build-aux/ccl.
MD stands for dynamic linking, MT is static, 'd' stands for debug
versions of the libraries.
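As an illustration of the new layout, a library path can be derived
from the installation root and the linkage type. This is a sketch only:
openssl_lib_dir() is a hypothetical helper and not part of the build
system.

```python
from pathlib import PureWindowsPath

# Linkage types as described above: MD = dynamic, MT = static, 'd' = debug.
LINKAGE_TYPES = ("MD", "MDd", "MT", "MTd")


def openssl_lib_dir(root, linkage="MT"):
    """Build the OpenSSL 3+ library directory for a given linkage type."""
    if linkage not in LINKAGE_TYPES:
        raise ValueError("unknown linkage type: %s" % linkage)
    return PureWindowsPath(root) / "lib" / "VC" / "x64" / linkage


lib_dir = openssl_lib_dir(r"C:\OPENSSL-WIN64")
print(lib_dir / "libssl.lib")
print(lib_dir / "libcrypto.lib")
```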
While at it, fixing documentation examples to point to Win64 default
installation folder.
Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This updates links to several upstream Kernel documents.
1. Lore is now the canonical archive for the netdev mailing list
2. net-next is now maintained by the netdev team,
of which David Miller is currently a member,
rather than only by David.
Also, use HTTPS rather than HTTP.
3. The Netdev FAQ has evolved into the Netdev Maintainer Handbook.
4. The Kernel security document link was dead,
provide the current canonical location for this document instead.
1., 2. & 3. Found by inspection
4. Flagged by check-docs
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Update link to OCF Resource Agents documentation as the existing link
is broken. Also, use HTTPS.
Broken link flagged by make check-docs
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Correct spelling errors in .rst files flagged by codespell.
Also correct some minor grammar errors in nearby documentation.
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
IANAL, but I think we can extend the copyright attached
to documentation to cover the current year: we are still
actively working on the documentation.
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
The userspace conntrack only supported hash for port selection.
With the patch, both userspace and kernel datapath support the random
flag.
The default behavior remains the same, that is, if no flags are
specified, hash is selected.
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Simon Horman <horms@ovn.org>
Update the reference documentation to describe possible build problems
with libjemalloc and suggested solutions.
Reported-at: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
Signed-off-by: Roberto Bartzen Acosta <roberto.acosta@luizalabs.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Frode Nordahl <frode.nordahl@canonical.com>
[simon: rebased; added leading '$' to last configure example]
Signed-off-by: Simon Horman <horms@ovn.org>
OVSDB server maintains a temporary file with the current database
configuration in case it is restarted by a monitor process
after a crash. On startup, the configuration from the command line
arguments is stored there in JSON format. Whenever the user
changes the configuration with UnixCtl commands, those
changes are also added to the file. When restarted after a
crash, it reads the configuration from the file and continues
with all the necessary remotes and databases.
This change allows it to be an external user-provided file that
OVSDB server will read the configuration from. The file can be
specified with a --config-file command line argument and it is
mutually exclusive with most other command line arguments that
set up remotes or databases, it is also mutually exclusive with
use of appctl commands that modify same configurations, e.g.
add/remove-db or add/remove-remote.
If the user wants to change the configuration of a running server,
they may change the file and call ovsdb-server/reload appctl.
OVSDB server will open a file, read and parse it, compare the
new configuration with the current one and adjust the running
configuration as needed. OVSDB server will try to keep existing
databases and connections intact, if the change can be applied
without disrupting the normal operation.
User-provided files are not trustworthy, so extra checks were
added to ensure a correct file format. If the file cannot be
correctly parsed, e.g. contains invalid JSON, no changes will
be applied and the server will keep using the previous
configuration until the next reload.
If config-file is provided for active-backup databases, permanent
disconnection of one of the backup databases no longer leads to
switching all other databases to 'active'. Only the disconnected
one will transition, since all of them have their own records in
the configuration file.
With this change, users can run all types of databases within
the same ovsdb-server process at the same time.
Simple configuration may look like this:
  {
    "remotes": {
      "punix:db.sock": {},
      "pssl:6641": {
        "inactivity-probe": 16000,
        "read-only": false,
        "role": "ovn-controller"
      }
    },
    "databases": {
      "conf.db": {},
      "sb.db": {
        "service-model": "active-backup",
        "backup": true,
        "source": {
          "tcp:127.0.0.1:6644": null
        }
      },
      "OVN_Northbound": {
        "service-model": "relay",
        "source": {
          "ssl:[fe:::1]:6642,ssl:[fe:::2]:6642": {
            "max-backoff": 8000,
            "inactivity-probe": 10000
          }
        }
      }
    }
  }
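The reload behaviour described above (keep the previous configuration
if the file cannot be parsed, otherwise compare and adjust) might be
sketched like this. It is not the ovsdb-server implementation:
load_config() and diff_keys() are hypothetical helpers, and the shape
check is an assumed minimum rather than the real validation.

```python
import json


def load_config(text, current):
    """Return the parsed configuration, or 'current' if 'text' is invalid."""
    try:
        new = json.loads(text)
    except json.JSONDecodeError:
        return current
    # Assumed minimal shape check: a JSON object whose "remotes" and
    # "databases" members, when present, are objects themselves.
    if not isinstance(new, dict):
        return current
    for key in ("remotes", "databases"):
        if not isinstance(new.get(key, {}), dict):
            return current
    return new


def diff_keys(old, new, section):
    """Names to add and to remove in one section of the configuration."""
    old_keys = set(old.get(section, {}))
    new_keys = set(new.get(section, {}))
    return sorted(new_keys - old_keys), sorted(old_keys - new_keys)


running = {"remotes": {"punix:db.sock": {}}, "databases": {"conf.db": {}}}
edited = ('{"remotes": {"punix:db.sock": {}, "pssl:6641": {}},'
          ' "databases": {"conf.db": {}}}')

candidate = load_config(edited, running)
print(diff_keys(running, candidate, "remotes"))
# A broken file leaves the running configuration in place.
print(load_config('{"remotes": ', running) is running)
```

Entries present in both the old and the new configuration are left
out of the diff, which mirrors the intent of keeping existing
databases and connections intact.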
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When a packet hits a flow rule without an explicitly specified helper,
OvS has to rely on automatic application layer gateway detection to
find related connections. This works as long as services are running on
their standard ports, e.g. when FTP servers use TCP port 21.
However, sometimes it's necessary to run services on non-standard ports.
In that case, there is no way for OvS to guess which protocol is used
within a given flow. Of course, this means that no related connections
can be recognized.
When a connection is committed with a particular helper, it's reasonable
to assume this helper will be used in subsequent CT actions, as long as
they don't override it. Achieve this behaviour by using the committed
connection's helper when a flow rule does not specify one.
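The selection rule can be summarized in a few lines. This is a sketch
of the described behaviour, not the conntrack code itself;
effective_helper() is a hypothetical name.

```python
def effective_helper(rule_helper, committed_helper):
    """Pick the ALG helper to use for a CT action."""
    # An explicitly specified helper always overrides the committed one.
    if rule_helper is not None:
        return rule_helper
    # Otherwise fall back to the helper the connection was committed with.
    return committed_helper


# FTP on a non-standard port: committed with 'ftp', later rule has no helper.
print(effective_helper(None, "ftp"))
```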
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@arknetworks.am>
Acked-by: Ivan Malov <ivan.malov@arknetworks.am>
Signed-off-by: Aaron Conole <aconole@redhat.com>
The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
on systems with only mlx5, this test is always skipped.
Besides, the test tries to grab the first device listed by dpdk-devbind.py,
regardless of the PCI device status regarding kmod binding.
Remove dependency on this DPDK script and use a minimal script that
reads PCI sysfs.
This script is not perfect, as one can imagine PCI devices bound to
vfio-pci for virtual machines.
Plus, this script only tries to take over vfio-pci devices. mlx5 devices
can't be taken over blindly as it could mean losing connectivity to the
machine if the netdev was in use for this system.
For those two reasons, add a new environment variable DPDK_PCI_ADDR for
testers to select the PCI device of their liking.
For consistency and grep, the temporary file PCI_ADDR is renamed
to DPDK_PCI_ADDR.
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
As a community we should strive to be inclusive.
As such it seems appropriate to adopt a word list,
to help guide the use of inclusive language.
This patch proposes use of the Inclusive Naming Word List v1.0.
Link: https://inclusivenaming.org/word-lists/
Signed-off-by: Simon Horman <horms@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Extend 'pmd-sleep-max' so that individual PMD thread cores may have a
specified max sleep request value.
Existing behaviour is maintained.
Any PMD thread core without a value will use the global value if set
or default no sleep.
To set PMD thread cores 8 and 9 to never request a load based sleep
and all other PMD thread cores to be able to request a max sleep of
50 usecs:
$ ovs-vsctl set open_vswitch . other_config:pmd-sleep-max=50,8:0,9:0
To set PMD thread cores 10 and 11 to request a max sleep of 100 usecs
and all other PMD thread cores to never request a sleep:
$ ovs-vsctl set open_vswitch . other_config:pmd-sleep-max=10:100,11:100
'pmd-sleep-show' is updated to show the max sleep value for each PMD
thread.
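To illustrate how the value format in the examples above can be read,
here is a small sketch. It only mirrors the syntax shown in this
message; parse_pmd_sleep_max() and max_sleep_for() are hypothetical
names, and the real parser may validate input differently.

```python
def parse_pmd_sleep_max(value):
    """Split a pmd-sleep-max string into (global_max, {core: max})."""
    global_max = 0  # Default described above: no sleep.
    per_core = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        if ":" in item:
            # "<core>:<usecs>" sets the max sleep for a single PMD core.
            core, usecs = item.split(":", 1)
            per_core[int(core)] = int(usecs)
        else:
            # A bare number is the global value.
            global_max = int(item)
    return global_max, per_core


def max_sleep_for(core, global_max, per_core):
    """Per-core value if set, otherwise the global value."""
    return per_core.get(core, global_max)


global_max, per_core = parse_pmd_sleep_max("50,8:0,9:0")
print(max_sleep_for(8, global_max, per_core))
print(max_sleep_for(3, global_max, per_core))
```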
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Branches 2.17/3.0/3.1/3.2 are using newer DPDK LTS releases.
Update the faq.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
It is an example and the dates are not set in stone, so updating
the table is not very important. But it's nice to see currently
supported releases there as well as the near future plans.
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
For better usability, the function pairs get_config() and
set_config() for netdevs should be symmetric: Options which are
accepted by set_config() should be returned by get_config() and the
latter should output valid options for set_config() only.
This patch moves key-value pairs which are not valid options from
get_config() to the get_status() callback. For example, get_config()
in lib/netdev-dpdk.c returned {configured,requested}_{rx,tx}_queues
previously. For requested rx queues the proper option name is n_rxq,
so requested_rx_queues has been renamed respectively. Tx queues
cannot be changed by the user, hence requested_tx_queues has been
dropped. Both configured_{rx,tx}_queues will be returned as
n_{r,t}xq in the get_status() callback.
The netdev dpdk classes no longer share a common get_config() callback,
instead both the dpdk_class and the dpdk_vhost_client_class define
their own callbacks. The get_config() callback for dpdk_vhost_class has
been dropped because it does not have a set_config() callback.
The documentation in vswitchd/vswitch.xml for status columns as well
as tests have been updated accordingly.
Reported-at: https://bugzilla.redhat.com/1949855
Signed-off-by: Jakob Meng <code@jakobmeng.de>
Reviewed-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
For better usability, the function pairs get_config() and
set_config() for netdevs should be symmetric: Options which are
accepted by set_config() should be returned by get_config() and the
latter should output valid options for set_config() only. This patch
also moves key-value pairs which are not valid options from get_config()
to the get_status() callback.
The documentation in vswitchd/vswitch.xml for status columns has been
updated accordingly.
Reported-at: https://bugzilla.redhat.com/1949855
Signed-off-by: Jakob Meng <code@jakobmeng.de>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Since last month ReadTheDocs only supports building with a new
configuration file provided in the repository itself:
https://blog.readthedocs.com/migrate-configuration-v2/
So, all our documentation builds are failing for quite some time.
Add the configuration file to unblock documentation updates.
Need to remove the upper restriction on the sphinx version.
sphinx 2.0 is very old at this point and pip fails to install
it along with other dependencies on the rtd server.
Note: Sphinx 2.0 moved from HTML4 to HTML5 renderer and tables
no longer have borders by default. That should be addressed
via CSS file in the ovs-sphinx-theme.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Added a reference to the DPDK documentation as a result of
analyzing the OVS code for potential performance impacts due
to the Downfall mitigation.
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
From: Antonin Bas <abas@vmware.com>
Since Open vSwitch 2.7, the max_len option has no effect, and the full
packet is always sent to controllers. This was confirmed with both the
kernel and netdev datapaths.
Reported-by: Antonin Bas <abas@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/295
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Antonin Bas <abas@vmware.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Max requested sleep time and status for a PMD thread
is logged at start up or when changed, but it can be
convenient to have a command to dump this information
explicitly.
It is envisaged that this will be expanded for individual
pmds in the future, hence adding to dpif_netdev_pmd_info().
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
other_config:pmd-maxsleep is a config option to allow
PMD thread cores to sleep under low or no load conditions.
Rename it to 'pmd-sleep-max' to allow a more structured
name and so that additional options or command can follow
the 'pmd-sleep-xyz' pattern.
Use of other_config:pmd-maxsleep is deprecated to be
removed in a future release and will result in a warning.
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This adds a Python version of the async DNS support added in:
771680d96 DNS: Add basic support for asynchronous DNS resolving
The above version uses the unbound C library, and this
implementation uses the SWIG-wrapped Python version of that.
In the event that the Python unbound library is not available,
a warning will be logged and the resolve() method will just
return None. For the case where inet_parse_active() is passed
an IP address, it will not try to resolve it, so existing
behavior should be preserved in the case that the unbound
library is unavailable.
Intentional differences from the C version are as follows:
OVS_HOSTS_FILE environment variable can be set to override
the system 'hosts' file. This is primarily to allow testing to
be done without requiring network connectivity.
Since resolution can still be done via hosts file lookup, DNS
lookups are not disabled when resolv.conf cannot be loaded.
The Python socket_util module has fallen behind its C equivalent.
The bare minimum change was done to inet_parse_active() to support
sync/async dns, as there is no equivalent to
parse_sockaddr_components(), inet_parse_passive(), etc. A TODO
was added to bring socket_util.py up to equivalency to the C
version.
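The fallback behaviour described above can be sketched as follows.
This is not the socket_util code: parse_host() and its 'resolve'
argument are hypothetical, with 'resolve=None' standing in for the
case where the unbound bindings are unavailable.

```python
import ipaddress


def parse_host(host, resolve=None):
    """Return an IP address string for 'host', or None if unresolved."""
    try:
        # Literal IP addresses are returned as-is, without any lookup.
        return str(ipaddress.ip_address(host))
    except ValueError:
        pass
    if resolve is None:
        # No resolver available: names cannot be resolved.
        return None
    return resolve(host)


hosts = {"db.example": "192.0.2.10"}  # Stand-in for a hosts file.
print(parse_host("192.0.2.1"))
print(parse_host("db.example"))
print(parse_host("db.example", hosts.get))
```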
Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Some control protocols are used to maintain link status between
forwarding engines (e.g. LACP). When the system is not sized properly,
the PMD threads may not be able to process all incoming traffic from the
configured Rx queues. When a signaling packet of such protocols is
dropped, it can cause link flapping, worsening the situation.
Use the rte_flow API to redirect these protocols into a dedicated Rx
queue. The assumption is made that the ratio between control protocol
traffic and user data traffic is very low and thus this dedicated Rx
queue will never get full. Re-program the RSS redirection table to only
use the other Rx queues.
The additional Rx queue will be assigned a PMD core like any other Rx
queue. Polling that extra queue may introduce increased latency and
a slight performance penalty at the benefit of preventing link flapping.
This feature must be enabled per port on specific protocols via the
rx-steering option. This option takes "rss" followed by a "+" separated
list of protocol names. It is only supported on ethernet ports. This
feature is experimental.
If the user has already configured multiple Rx queues on the port, an
additional one will be allocated for control packets. If the hardware
cannot satisfy the number of requested Rx queues, the last Rx queue will
be assigned for control plane. If only one Rx queue is available, the
rx-steering feature will be disabled. If the hardware does not support
the rte_flow matchers/actions, the rx-steering feature will be
completely disabled on the port and regular rss will be performed
instead.
It cannot be enabled when other-config:hw-offload=true as it may
conflict with the offloaded flows. Similarly, if hw-offload is enabled,
custom rx-steering will be forcibly disabled on all ports and replaced
by regular rss.
Example use:
ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \
set interface phy0 type=dpdk options:dpdk-devargs=0000:ca:00.0 -- \
set interface phy0 options:rx-steering=rss+lacp -- \
set interface phy1 type=dpdk options:dpdk-devargs=0000:ca:00.1 -- \
set interface phy1 options:rx-steering=rss+lacp
As a starting point, only one protocol is supported: LACP. Other
protocols can be added in the future. NIC compatibility should be
checked.
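The option format ("rss" followed by a "+" separated list of protocol
names) could be parsed along these lines. This is an illustrative
sketch only; parse_rx_steering() is a hypothetical name and the error
handling is an assumption.

```python
SUPPORTED_PROTOCOLS = {"lacp"}  # Only LACP, per the description above.


def parse_rx_steering(value):
    """Return the list of protocols to steer to the dedicated Rx queue."""
    parts = value.split("+")
    if parts[0] != "rss":
        raise ValueError("rx-steering must start with 'rss'")
    protocols = parts[1:]
    unknown = [p for p in protocols if p not in SUPPORTED_PROTOCOLS]
    if unknown:
        raise ValueError("unsupported protocols: %s" % ", ".join(unknown))
    return protocols


print(parse_rx_steering("rss+lacp"))
print(parse_rx_steering("rss"))
```

An empty list corresponds to regular rss with no dedicated queue.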
To validate that this works as intended, I used a traffic generator to
generate random traffic slightly above the machine capacity at line rate
on a two ports bond interface. OVS is configured to receive traffic on
two VLANs and pop/push them in a br-int bridge based on tags set on
patch ports.
+----------------------+
| DUT |
|+--------------------+|
|| br-int || in_port=patch10,actions=mod_dl_src:$patch11,
|| || mod_dl_dst:$tgen0,
|| || output:patch10
|| || in_port=patch11,actions=mod_dl_src:$patch10
|| || mod_dl_dst:$tgen0,
|| patch10 patch11 || output:patch10
|+---|-----------|----+|
| | | |
|+---|-----------|----+|
|| patch00 patch01 ||
|| tag:10 tag:20 ||
|| ||
|| br-phy || default flow, action=NORMAL
|| ||
|| bond0 || balance-slb, lacp=passive, lacp-time=fast
|| phy0 phy1 ||
|+------|-----|-------+|
+-------|-----|--------+
| |
+-------|-----|--------+
| port0 port1 | balance L3/L4, lacp=active, lacp-time=fast
| lag | mode trunk VLANs 10, 20
| |
| switch |
| |
| vlan 10 vlan 20 | mode access
| port2 port3 |
+-----|----------|-----+
| |
+-----|----------|-----+
| tgen0 tgen1 | Random traffic that is properly balanced
| | across the bond ports in both directions.
| traffic generator |
+----------------------+
Without rx-steering, the bond0 links are randomly switching to
"defaulted" when one of the LACP packets sent by the switch is dropped
because the RX queues are full and the PMD threads did not process them
fast enough. When that happens, all traffic must go through a single
link which causes above line rate traffic to be dropped.
~# ovs-appctl lacp/show-stats bond0
---- bond0 statistics ----
member: phy0:
TX PDUs: 347246
RX PDUs: 14865
RX Bad PDUs: 0
RX Marker Request PDUs: 0
Link Expired: 168
Link Defaulted: 0
Carrier Status Changed: 0
member: phy1:
TX PDUs: 347245
RX PDUs: 14919
RX Bad PDUs: 0
RX Marker Request PDUs: 0
Link Expired: 147
Link Defaulted: 1
Carrier Status Changed: 0
When rx-steering is enabled, no LACP packet is dropped and the bond
links remain enabled at all times, maximizing the throughput. Neither
the "Link Expired" nor the "Link Defaulted" counters are incremented
anymore.
This feature may be considered as "QoS". However, it does not work by
limiting the rate of traffic explicitly. It only guarantees that some
protocols have a lower chance of being dropped because the PMD cores
cannot keep up with regular traffic.
The choice of protocols is limited on purpose. This is not meant to be
configurable by users. Some limited configurability could be considered
in the future, but it would expose users to more potential issues if
they accidentally redirect all traffic to the isolated queue.
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
At some point in OVS history, some virtio features were announced as
supported (ECN and UFO virtio features).
The userspace TSO code, which has been added later, does not support
those features and tries to disable them.
This breaks OVS upgrades: if an existing VM already negotiated such
features, their lack on reconnection to an upgraded OVS triggers a
vhost socket disconnection by Qemu.
This results in an endless loop because Qemu then retries with the same
set of virtio features.
This patch proposes to detect those vhost socket disconnections
and fall back to restoring the old virtio features (and disabling TSO
for this vhost port).
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Currently, database schema conversion in case of clustered database
produces a transaction record with both new schema and converted
database data. So, the sequence of events is the following:
1. Get the new schema.
2. Convert the database to a new schema.
3. Translate the newly converted database into JSON.
4. Write the schema + data JSON to the storage.
5. Destroy converted version of a database.
6. Read schema + data JSON from the storage and parse.
7. Create a new database from a parsed database data.
8. Replace current database with the new one.
Most of these steps are very computationally expensive. Also,
conversion to/from JSON is much more expensive than direct database
conversion with ovsdb_convert() that can make use of shallow data
copies.
Instead of doing all that, let's make use of previously introduced
ability to not write the converted data into the storage. The process
will look like this then:
1. Get the new schema.
2. Convert the database to a new schema
(to verify that it is possible).
3. Write the schema to the storage.
4. Destroy converted version of a database.
5. Read the new schema from the storage and parse.
6. Convert the database to a new schema.
7. Replace current database with the new one.
Most of the operations here are performed on the small schema object,
instead of the actual database data. Two remaining data operations
(actual conversion) are noticeably faster than conversion to/from
JSON due to reference counting and shallow data copies.
Steps 4-6 can be optimized later to not convert twice on the
process that initiates the conversion.
The change results in the following performance improvements in conversion
of OVN_Southbound database schema from version 20.23.0 to 20.27.0
(measured on a single-server RAFT cluster with no clients):
          |           Before            |           After
          +---------+-------------------+---------+------------------
  DB size |  Total  | Max poll interval |  Total  | Max poll interval
  --------+---------+-------------------+---------+------------------
   542 MB | 47 sec. |      26 sec.      | 15 sec. |     10 sec.
   225 MB | 19 sec. |      10 sec.      |  6 sec. |     4.5 sec.
542 MB database had 19.5 M atoms, 225 MB database had 7.5 M atoms.
Overall performance improvement is about 3x.
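The "about 3x" figure follows directly from the numbers in the table
above. A quick check, using only those measurements (speedup() is just
a helper for this illustration):

```python
def speedup(before, after):
    """Ratio of the old duration to the new one."""
    return before / after


# (total, max poll interval) in seconds, taken from the table above.
measurements = {
    "542 MB": {"before": (47, 26), "after": (15, 10)},
    "225 MB": {"before": (19, 10), "after": (6, 4.5)},
}

for size, m in measurements.items():
    total = speedup(m["before"][0], m["after"][0])
    poll = speedup(m["before"][1], m["after"][1])
    print("%s: total %.1fx, max poll interval %.1fx" % (size, total, poll))
```

The total conversion time improves by a little over 3x for both
database sizes, while the max poll interval improves by roughly
2.2x to 2.6x.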
Also, note that before this change database conversion basically
doubles the database file on disk. Now it only writes a small
schema JSON.
Since the change requires backward-incompatible database file format
changes, documentation is updated on how to perform an upgrade.
Handled the same way as we did for the previous incompatible format
change in 2.15 (column diffs).
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
SRv6 (Segment Routing IPv6) tunnel vport is responsible
for encapsulating and decapsulating the inner packets with
an IPv6 header and an extension header called SRH
(Segment Routing Header). See spec in:
https://datatracker.ietf.org/doc/html/rfc8754
This patch implements SRv6 tunneling in userspace datapath.
It uses `remote_ip` and `local_ip` options as with existing
tunnel protocols. It also adds a dedicated `srv6_segs` option
to define a sequence of routers called segment list.
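For reference, the SRH layout from RFC 8754 can be sketched in a few
lines. This is not the datapath code: build_srh() is a hypothetical
helper, and it assumes the encapsulation form where every segment of
the list is carried in the SRH.

```python
import ipaddress
import struct

SRH_ROUTING_TYPE = 4  # Routing type assigned to the SRH by RFC 8754.


def build_srh(segments, next_header):
    """Pack an SRH for 'segments', given in the order they are visited."""
    n = len(segments)
    if n == 0:
        raise ValueError("segment list must not be empty")
    header = struct.pack(
        "!BBBBBBH",
        next_header,
        2 * n,             # Hdr Ext Len: 8-octet units past the first 8.
        SRH_ROUTING_TYPE,
        n - 1,             # Segments Left.
        n - 1,             # Last Entry.
        0,                 # Flags.
        0,                 # Tag.
    )
    # RFC 8754 encodes the list in reverse: Segment List[0] is the last
    # segment of the path.
    body = b"".join(ipaddress.IPv6Address(s).packed
                    for s in reversed(segments))
    return header + body


srh = build_srh(["fc00:100::1", "fc00:200::1"], next_header=41)
print(len(srh), srh[:8].hex())
```

The addresses are examples only; next header 41 denotes an inner
IPv6 packet.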
Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
tc does not support conntrack ALGs. Even worse, with tc enabled, they
should not be used/configured at all. This is because even though TC
will ignore the rules with ALG configured, i.e., they will flow through
the kernel module, return traffic might flow through a tc conntrack
rule, and it will not invoke the ALG helper.
Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This patch adds a Python script that can be used to analyze the
revalidator runs by providing statistics (including some real time
graphs).
The USDT events can also be captured to a file and used for
later offline analysis.
The following blog explains the Open vSwitch revalidator
implementation and how this tool can help you understand what is
happening in your system.
https://developers.redhat.com/articles/2022/10/19/open-vswitch-revalidator-process-explained
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Now that the timer slack for the PMD threads is reduced we can also
reduce the start/increment for PMD load based sleeping to match it.
This will further reduce initial sleep times making it more resilient
to interfaces that might be sensitive to large sleep times.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The default Linux timer slack groups timer expirations into 50 us intervals.
With some traffic patterns this can mean that returning to process
packets after a sleep takes too long and packets are dropped.
Add a helper to util.c and use it to reduce the timer slack
for PMD threads, so that sleeps with smaller resolutions can be done
to prevent sleeping for too long.
Fixes: de3bbdc479a9 ("dpif-netdev: Add PMD load based sleeping.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2023-January/401121.html
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Co-authored-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Sleep for an incremental amount of time if none of the Rx queues
assigned to a PMD have at least half a batch of packets (i.e. 16 pkts)
on a polling iteration of the PMD.
Upon detecting the threshold of >= 16 pkts on an Rxq, reset the
sleep time to zero (i.e. no sleep).
Sleep time will be increased on each iteration where the low load
conditions remain up to a total of the max sleep time which is set
by the user e.g:
ovs-vsctl set Open_vSwitch . other_config:pmd-maxsleep=500
The default pmd-maxsleep value is 0, which means that no sleeps
will occur and the default behaviour is unchanged from previously.
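The sleep policy described above can be modelled roughly as follows.
This is a simplified sketch, not the dpif-netdev code: next_sleep() is
a hypothetical name and the increment value is an assumption made for
the example.

```python
HALF_BATCH = 16  # Threshold from the description above.


def next_sleep(current_us, max_rxq_pkts, max_sleep_us, increment_us=10):
    """Return the sleep time (in usecs) to request after one iteration.

    'max_rxq_pkts' is the largest number of packets received from any
    single Rx queue during the iteration.  'increment_us' is an assumed
    step size.
    """
    if max_sleep_us == 0 or max_rxq_pkts >= HALF_BATCH:
        # Sleeping disabled, or load detected: reset to no sleep.
        return 0
    # Low load: grow the sleep time, capped at the configured maximum.
    return min(current_us + increment_us, max_sleep_us)


sleep_us = 0
for pkts in (0, 2, 5, 20, 1):
    sleep_us = next_sleep(sleep_us, pkts, max_sleep_us=500)
    print(pkts, sleep_us)
```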
Also add new stats to pmd-perf-show to get visibility of operation
e.g.
...
- sleep iterations: 153994 ( 76.8 % of iterations)
Sleep time (us): 9159399 ( 59 us/iteration avg.)
...
Reviewed-by: Robin Jarry <rjarry@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This archive website disappeared.
On the other hand, the link to an obsolete dpif-provider man page
probably did not provide much info and we can simply mention the current
file.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>