We used this builder before, but from the project configuration
on the website. ReadTheDocs doesn't allow to change it there
anymore and it doesn't allow to see the full name of the previously
used builder (!!), so I failed to migrate it to the config file.
The result is that older link like:
https://docs.openvswitch.org/en/latest/howto/dpdk/
Now require .html:
https://docs.openvswitch.org/en/latest/howto/dpdk.html
Fixing now by switching the builder back.
Fixes: e388bd73b70d ("readthedocs: Add the configuration file.")
Reported-by: Antonin Bas <abas@vmware.com>
Reported-by: David Marchand <david.marchand@redhat.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/310
Reviewed-by: Antonin Bas <abas@vmware.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Many system tests currently use ping with the combination of a
low packet count (-c 3), short interval between sends (-i 0.3)
and a _deadline_ of 2 seconds (-d 2).
This combination of options may lead to a situation where more
than count packets are sent however ping will stop when count
packets are received. This results in a failed test due to how
the result is checked, for example:
ping6 -q -c 3 -i 0.3 -w 2 fc00::3 | FORMAT_PING
@@ -1,2 +1,2 @@
-3 packets transmitted, 3 received, 0% packet loss, time 0ms
+4 packets transmitted, 3 received, 25% packet loss, time 0ms
To reiterate, in the above example there is no packet loss, but
ping stops after _receiving_ 3 packets, not bothering with
waiting for the response to the fourth packet it just sent out.
If we look at the iputils ping manual for the -w deadline option
we can read that this is expected behavior:
> Specify a timeout, in seconds, before ping exits regardless of
> how many packets have been sent or received. In this case ping
> does not stop after count packet are sent, it waits either for
> deadline expire or until count probes are answered or for some
> error notification from network.
To avoid these kinds of failures in checks where a response is
expected, we replace ping -w with ping -W.
We keep ping -w for checks where it is expected to NOT get a
response.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Depending on system configuration, executing tcpdump without the
-n parameter, may prolong the execution time for tcpdump while it
attempts name resolution.
This delay may in turn lead to test failures due to contents of
tables to check being evicted.
We recently started to see this problem with the
"conntrack -IPv6 ICMP6 Related with SNAT" test.
For consistency, this patch adds the -n parameter to all tcpdump
calls in system-traffic.at.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This should make it a little more visible that documentation
build fails on ReadTheDocs.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Since last month ReadTheDocs only supports building with a new
configuration file provided in the repository itself:
https://blog.readthedocs.com/migrate-configuration-v2/
So, all our documentation builds are failing for quite some time.
Add the configuration file to unblock documentation updates.
Need to remove the upper restriction on the sphinx version.
sphinx 2.0 is very old at this point and pip fails to install
it along with other dependencies on the rtd server.
Note: Sphinx 2.0 moved from HTML4 to HTML5 renderer and tables
no longer have borders by default. That should be addressed
via CSS file in the ovs-sphinx-theme.
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Currently we log the 980-ish byte long tc_action structure as a
single long hex string. That is very hard to read and hard to
spot the difference between two. And most of the fields are zero.
Use the sparse hex dump instead as we do for keys already.
Ex.:
Action 1 mismatch:
- Expected Action:
00000000 f0 3c 00 00 01 00 00 00-00 00 00 00 00 00 00 00
000003d0 00 00 00 00 ff ff ff ff-
- Received Action:
00000000 f0 3c 00 00 01 01 00 00-00 00 00 00 00 00 00 00
000003d0 00 00 00 00 ff ff ff ff-
Without the change, each action would be a 1900+ characters
long string of mostly zeroes.
Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This issue has been observed when running traffic tests with a dpdk
enabled userspace datapath (though those tests are added in a separate
series).
However, the described issue also affects the kernel datapath which is
why this patch is sent separately.
A main thread executing the 'revalidator/purge' command could race with
revalidator threads that can be dumping/sweeping the purged flows at the
same time.
This race can be reproduced (with dpif debug logs) by running the
conntrack - ICMP related unit tests with the userspace datapath:
2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request
revalidator/purge[], id=0
2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev:
flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 <empty>,
packets:0, bytes:0, used:never
2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del
ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),
ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),
packet_type(ns=0,id=0),
eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75),
eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0,
ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0,
used:never
...
2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev:
failed to flow_get (No such file or directory)
ufid:07046e91-30a6-4862-9048-1a76b5a88a5b <empty>, packets:0,
bytes:0, used:never
2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN|
Failed to acquire udpif_key corresponding to unexpected flow
(No such file or directory):
ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
...
2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: ""
To avoid this race, a first part of the fix is to pause (if not already
paused) the revalidators while the main thread is purging the datapath
flows.
Then a second issue is observed by running the same unit test with the
kernel datapath. Its dpif implementation dumps flows via a netlink request
(see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(),
nl_dump_start(), nl_sock_send__()) in the leader revalidator thread,
before pausing revalidators:
2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request
revalidator/purge[], id=0
...
2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del
ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0),
skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0),
ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2,
dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1,
tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00),
packets:0, bytes:0, used:never
...
2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: ""
...
2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system:
flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>,
packets:0, bytes:0, used:never
...
2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system:
failed to flow_get (No such file or directory)
ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0,
bytes:0, used:never
2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN|
Failed to acquire udpif_key corresponding to unexpected flow
(No such file or directory):
ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9
To avoid evaluating already deleted flows, the second part of the fix is
to ensure that dumping from the leader revalidator thread is done out of
any pause request.
As a result of this patch, the unit test "offloads - delete ufid mapping
if device not exist - offloads enabled" does not need to waive the random
warning logs when purging dp flows.
Fixes: 98bb4286970d ("tests: Add command to purge revalidators of flows.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Variable "want_key" in function check_condition and variable "key"
in function set_column were not destroyed in exception branch.
This patch calls ovsdb_atom_destroy to release resources to avoid
memory leak.
Fixes: 79c1a00fb5a5 ("db-ctl-base: Don't die in set_column() on error.")
Fixes: e09b3af3e249 ("db-ctl-base: Don't die in is_condition_satisfied() on error")
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Zengyuan Wang <wangzengyuan@huawei.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Currently checksum recalculation is not supported with TC offload for
IPIP and GRE packets. This patch adds support for TC offloading of
IPIP and GRE packets by adding the correct csum action.
Without this patch the following warning can be seen in the logging:
Can't offload rewrite of IP/IPV6 with ip_proto: X.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
When the IP header is modified, for example, by NAT or a ToS/TTL change,
the IP header checksum needs recalculation. In addition to the IP header
checksum, for UDPLITE, its checksum also needs recalculation when any
of the addresses change.
This patch adds support for TC offloading of IGMP, UDPLITE, and SCTP
packets by adding the correct csum action.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Use test title prefix according to filename the test is in
for tunnel.at and ofproto-dpif.at.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
The cited commit fixed missing mirror packets by reset mirror when
packets are modified but setting geneve options was also treated as
a modified packet but should be treated as a part of set_tunnel
which doesn't reset mirror.
Fixes: feed7f677505 ("ofproto-dpif-upcall: Mirror packets that are modified.")
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
This commit adds a non-exhaustive list of tests for some of the
functions declared in `lib/byteq`.
These unit tests have been executed via `make check` and they
successfully passed.
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
The status options pci-vendor_id and pci-device_id for dpdk netdevs
have been replaced by bus_info. This patch updates the documentation
in vswitchd/vswitch.xml accordingly.
Fixes: a77c7796f23a ("dpdk: Update to use v22.11.1.")
Signed-off-by: Jakob Meng <code@jakobmeng.de>
Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Vport's offloads are done on the tracked orig-in-port, but the flow itself
is associated in the vport's map.
Removing the physdev will flush all the ports that are on its map, but
not the ones on other netdevs' maps. Since flows take reference count on
both their vport and their physdev, the physdev still has references on.
Trying to remove it and re-add it fails with "already in use" error.
Fix it by flushing the physdev's offload flows in all related netdevs,
e.g. the netdev itself, or for physical devices, all vports.
Fixes: adbd4301a249 ("netdev-offload-dpdk: Use per-netdev offload metadata.")
Reported-by: wuxi_seu@163.com
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Added a reference to the DPDK documentation as a result of
analyzing the OVS code for potential performance impacts due
to the Downfall mitigation.
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
ofconn connection parameters, such as probe_interval and max_backoff,
are always set to their default values when vswitchd starts up even if
the user has configured these to be something different in ovsdb:
$ ovs-vsctl set controller UUID inactivity_probe=9000
$ journalctl -u ovs-vswitchd.service | grep "inactivity"
ovs|10895|rconn|DBG|dp1<->tcp:127.0.0.1:6653: idle 9 seconds,
sending inactivity probe
$ systemctl restart openvswitch-switch.service
$ journalctl -u ovs-vswitchd.service | grep "inactivity"
ovs|00848|rconn|DBG|dp1<->tcp:127.0.0.1:6653: idle 5 seconds,
sending inactivity probe
This bug was introduced by commit a0baa7df (connmgr: Make treatment of
active and passive connections more uniform.).
This happens because ofservice_reconfigure() loops over each
ofconn in ofservice->conns and calls ofconn_reconfigure() on it
to set the configuration parameters, however when ofservice_reconfigure()
is called from ofservice_create(), ofservice->conns hasn't been populated
yet so ofconn_reconfigure() is never called.
This commit moves the ofservice_reconfigure() call to ofconn_create()
where ofservice->conns is populated.
This commit also removes the hardcoded default values for
inactivity_probe (5s) and max_backoff (8s) on initial creation
of the ofservice, as these config values are available from the
ofproto_controller struct c.
Signed-off-by: Brad Cowie <brad@faucet.nz>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <horms@ovn.org>
This patch adds an utility that can be used to determine if
an issue is related to a lack of Linux kernel resources.
This tool is also featured in a Red Hat developers blog article:
https://developers.redhat.com/articles/2023/07/24/troubleshooting-open-vswitch-kernel-blame
Reviewed-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
C++ does not allow implicit conversion from void pointer to a specific
pointer type. This change removes the cast from uint32_t* to void* in
`hash_words_32aligned` and adds an explicit typecast from uint32_t* to
uint64_t* in `hash_words_inline`.
This issue was initially discovered on G++ v9.2.0 when a downstream C++
application included the hash.h header file and was compiled on an AMD
Ryzen Zen 2 CPU (__SSE4_2__ && __x86_64__). On the latest G++ version,
it would throw an error. On the latest GCC version with `-Wc++-compat`,
it would throw a warning.
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This commit adds various null pointer checks to some files in the `lib`,
`ovsdb`, and `vtep` directories to fix several Coverity defects. These
changes are grouped together as they perform similar checks, returning
early, skipping some action, or logging a warning if a null pointer is
encountered.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This commit adds some `ovs_assert()` checks to some return values of
`dp_packet_data()` to ensure that they are not NULL and to prevent
null-pointer dereferences, which might lead to unwanted crashes. We use
assertions since it should be impossible for these calls to
`dp_packet_data()` to return NULL.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This patch adds WARNINGs for the subject line length and the format,
i.e., the sentence should start with a capital and end with a dot.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
While sending a reply to the monitor_cond_since request, server
includes the last transaction ID. And it sends new IDs with each
subsequent update. Current implementation doesn't use the one
supplied with a monitor reply, and only takes into account IDs
provided with monitor updates. That may cause various issues:
1. Performance: During initialization, the last-id is set to zero.
If re-connection will happen after receiving a monitor reply,
but before any monitor update, the client will send a new
monitor request with an all-zero last-id and will re-download
the whole database again.
2. Data inconsistency: Assuming one of the clients sends a
transaction, but our python client disconnects before receiving
a monitor update for this transaction. The last-id will point
to a database state before this transaction. On re-connection,
this last-id will be sent and the monitor reply will contain
a diff with a new data from that transaction. But if another
disconnection happens right after that, on second re-connection
our python client will send another monitor_cond_since with
exactly the same last-id. That will cause receiving the same
set of updates again. And since it's an update2 message with
a diff of the data, the client will remove previously applied
result of the transaction. At this point it will have a
different database view with the server potentially leading
to all sorts of data inconsistency problems.
Fix that by always updating the last-id from the latest monitor
reply.
Fixes: 46d44cf3be0d ("python: idl: Add monitor_cond_since support.")
Acked-by: Simon Horman <horms@ovn.org>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The SCTP protocol ports were excluded from the netlink
encoding. In that case the nl_ct_flush_tuple() would
return EOPNOTSUPP, that could result in some CT entries
not being properly flushed if we would hit SCTP entry earlier
than others.
This at the same time allows to flush SCTP on its own
in during partial match. This should still be considered
a bug, because OvS currently supports SCTP CT entries,
and it should also support partial flush for them the same
way it supports partial flush for TCP/UDP.
Reported-at: https://bugzilla.redhat.com/2228037
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Many tests are retrieving the schema name twice and also producing
an ordinal schema which is not used in these tests.
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
If a packet originating from the controller recirculates after going
through a patch port, it gets dropped with the following message:
ofproto_dpif_upcall(handler8)|INFO|received packet on unassociated
datapath port 4294967295
This happens because there's no xport_uuid in the recirculation node
and at the same type in_port refers to the patch port.
The patch, in the case of zeroed uuid, checks that in_port belongs to
the bridge and returns the related ofproto.
Reported-at: https://bugzilla.redhat.com/2170920
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When the a revalidator thread is updating statistics for an XC_LEARN
xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn.
The revalidator will update stats for rules even if they are in a
removed state or marked as invisible. However, ofproto_flow_mod_learn
will detect if a flow has been removed and re-add it in that case. This
can result in an old learn action replacing the new learn action that
had replaced it in the first place.
This change adds a new last_used parameter to ofproto_flow_mod_learn
allowing the caller to provide a timestamp that will be fed into the
learned rule's modified time. The provided timestamp should be the time
of the last packet activity. If last_used is not set then the current
time is used, as is the current behaviour.
This change also adds a check when replacing a learned rule to favour the
newest rule.
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2213892
Signed-off-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This patch adds an assortment of `ovs_assert` statements to check for
null pointers. We use assertions since it should be impossible for any
of these pointers to be NULL.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This commit adds zero-initializations by changing `SFL_ALLOC` from
`malloc` to `xzalloc`, adding a `memset` call to `sflAlloc`,
initializing a `pollfd` struct variable with zeroes, and changing some
calls to `xmalloc` to `xzalloc`. This is to prevent potential data leaks
or undefined behavior from potentially uninitialized variables.
Some variables would always be initialized by either the code flow or
the compiler. Thus, some of the associated Coverity reports might be
false positives. That said, it is still considered best practice to
zero-initialize variables upfront just in case to ensure the overall
resilience and security of OVS, as long as they do not impact
performance-critical code. As a bonus, it would also make static
analyzer tools, such as Coverity, happy.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The patch avoids the extra allocation for nat_conn.
Currently, when doing NAT, the userspace conntrack will use an extra
conn for the two directions in a flow. However, each conn has actually
the two keys for both orig and rev directions. This patch introduces a
key_node[CT_DIRS] member as per Aaron's suggestion in the conn which
consists of a key, direction, and a cmap_node for hash lookup so
addressing the feedback received by the original patch [0].
With this adjustment, we also remove the assertion that connections in
the table are DEFAULT while updating connection state and/or removing
connections.
[0] https://patchwork.ozlabs.org/project/openvswitch/patch/20201129033255.64647-2-hepeng.0320@bytedance.com/
Reported-by: Michael Plato <michael.plato@tu-berlin.de>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-September/052065.html
Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Co-authored-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Tested-by: Frode Nordahl <frode.nordahl@canonical.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
When a server becomes unstable due to system overloading or intermittent
partitioning, it may miss some heartbeats and then starts election with
a new term, which would disrupt the otherwise healthy cluster formed by
the rest of the healthy nodes. Such situation may exist for a long time
until the "flapping" server is shutdown or recovered completely, which
can severely impact the availability of the cluster. The pre-vote
mechanism introduced in the raft paper section 9.6 can prevent such
problems. This patch implements the pre-vote mechanism.
Note: during the upgrade, since the old version doesn't recognize the
new optional field in the vote rpc (and the ovsdb_parse_finish validates
that all fields in the jsonrpc are parsed), an error log may be noticed
on old nodes if an upgraded node happens to become candidate first and
vote for itself, and the vote request will be discarded. If this happens
before enough nodes complete the upgrade, the vote from the upgraded
node may not reach the quorum. This results in re-election, and any old
nodes should be able to vote and get elected as leader. So, in unlucky
cases there can be more leader elections happening during the upgrade.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
As reported by Ales when doing some OVN integration tests with OVS 3.2,
net/tap has broken L4 checksum offloads.
Fixes are pending on DPDK side.
Until they get in a LTS release used by OVS, disable those Tx offloads.
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Tests using mfex_fuzzy.py will fail on systems lacking the Python
cryptography library.
cryptography is not required, it is only imported in mfex_fuzzy.py to
silence some warnings when scapy tries to load some libraries.
Fixes: c3ed0bf34b8a ("tests/mfex: Silence Blowfish/CAST5 deprecation warnings.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
autoconf 2.70 deprecated the AC_PROG_CC_C99 macro and the AC_PROG_CC
was recommended for use instead. However, older versions of that
suggested macro do not attempt enabling C99, so it is not a direct
replacement. Autoconf 2.69 and older are still widely used in many
distributions.
Another difference is that AC_PROG_CC attempts to enable C11 in new
versions of autoconf. But since we have CI jobs that check -std=c99
builds now, we can afford enabling C11 by default without risking
compatibility issues.
Fix a deprecation warning by using a new AC_PROG_CC macro with
autoconf 2.70+. AC_PROG_CC and AC_PROG_CC_C99 seems to produce the
same configuration script in autoconf 2.70+ anyway, so we're already
kind of using a new macro on systems with a new autoconf.
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
On slow systems or at high testsuite concurrency sending 256 packets
can take more than 10 seconds. This is causing expiration of one of
the flows and a subsequent test failure.
Use time warping instead to avoid the time dependency.
Fixes: 21410ff800cc ("dpif-netdev: Fix dpif_netdev_flow_put.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-August/052623.html
Reported-by: Sangeetha Elumalai <sangeethaeng2017@gmail.com>
Reviewed-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
OVS currently sets RTE_MBUF_F_TX_IPV[46] flags in early stages of the
packet reception and keeps track of the IP packet type as the packet
goes through OVS pipeline.
When a packet leaves OVS and hits a DPDK driver, OVS may not request IP
checksum offloading but leaves one of this packet type flag in ol_flags.
The DPDK api describes that RTE_MBUF_F_TX_IPV4 must be set when
requesting some Tx offloads (like RTE_MBUF_F_TX_IPSUM,
RTE_MBUF_F_TX_TCP_CKSUM, .., RTE_MBUF_F_TX_TCP_SEG).
Even though setting RTE_MBUF_F_TX_IPV4 without requesting a Tx offload
is undefined, this can confuse some drivers (like net/iavf) which then
reads zeroed l2_len and l3_len and ends up dropping the packet.
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2231081
Fixes: 5d11c47d3ebe ("userspace: Enable IP checksum offloading by default.")
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Quoting Paul Ganssle, setuptools maintainer:
* The setuptools project has stopped maintaining all direct
invocations of setup.py years ago, and distutils is deprecated.
There are undoubtedly many ways that your setup.py-based system is
broken today, even if it's not failing loudly or obviously.
* Direct invocations of setup.py cannot bootstrap their own
dependencies, and so some CLI is necessary for dependency
management.
* The setuptools project no longer wants to provide any public CLI,
and will be actively removing the existing interface (though the
time scale for this is long).
* PEP 517, 518 and other standards-based packaging are the future of
the Python ecosystem and a lot of progress has been made on making
this upgrade seamless.
As described in the recommendations in the end of the article: `python3
setup.py sdist` should be replaced by `python3 -m build --sdist`.
Link: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html#summary
Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
setup.py upload is now deprecated. When used, pypi.org returns an error:
Upload failed (400): Invalid value for blake2_256_digest. Error: Use
a valid, hex-encoded, BLAKE2 message digest.
Use twine which is the recommended replacement tool to upload on
pypi.org.
Link: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html#summary
Reported-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The python/build folder contents are completely unrelated to the ovs
python bindings. These files are only used during the build for various
subsystems (docs, man pages, code generation, etc.).
Having that folder in that location prevents from running:
cd python && python3 -m build
Which is a way to generate PEP517 compatible source archives and binary
wheel packages.
Rename that folder to ovs_build_helpers which is more explicit. Update
all imports accordingly.
Link: https://peps.python.org/pep-0517/
Link: https://pypi.org/project/build/
Signed-off-by: Robin Jarry <rjarry@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The 'len' of a netdev_flow_key initialized by netdev_flow_key_init()
is always zero, which may cause errors when cloning a netdev_flow_key
by netdev_flow_key_clone().
Currently the 'len' member of a netdev_flow_key initialized by
netdev_flow_key_init() is not used, so this error will not cause any
bad behavior for now.
Fixes: c82f496c3b69 ("dpif-netdev: Use unmasked key when adding datapath flows.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Zhiqi Chen <chenzhiqi.123@bytedance.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
From: Antonin Bas <abas@vmware.com>
Since Open vSwitch 2.7, the max_len option has no effect, and the full
packet is always sent to controllers. This was confirmed with both the
kernel and netdev datapaths.
Reported-by: Antonin Bas <abas@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/295
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Antonin Bas <abas@vmware.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
tbl defaults to expecting table entries to be separated by tab
characters. However, commit 5a0e4aec1af5cf7741c490bce704577e51e536b9
converted these to spaces and inadvertently broke the rendering. Use
semicolons as separators instead; these are less prone to being broken
by tree-wide changes, and match the style used by
build-aux/extract-ofp-fields.
Fixes: 5a0e4aec1af5 ("treewide: Convert leading tabs to spaces.")
Reported-by: Lucas Nussbaum <lucas@debian.org>
Reported-at: https://bugs.debian.org/1042358
Co-authored-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Colin Watson <cjwatson@ubuntu.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Some systems may provide backtrace() in libc but for some reason
not provide any frames when attempting to use it.
On those systems the fatal_signal_init() function currently logs
this debug message: "Capturing of dummy backtrace has failed."
A consequence of this logging may be false negative test results.
Logging the fact that backtrace() does not work has limited value
on a production system and I propose we drop it.
Fixes: 759a29dc2d97 ("backtrace: Extend the backtrace functionality.")
Reported-at: https://launchpad.net/bugs/2032623
Acked-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>