mir/ovs - ovs - Mike's Git repositories

mir/ovs

mirror of https://github.com/openvswitch/ovs synced 2025-08-29 13:27:59 +00:00

Author	SHA1	Message	Date
Ilya Maximets	1dbc3b9f34	drop-stats.at: Fix frequent failures of the recursion too deep test. The test doesn't wait for old flows being revalidated before sending the second packet. The packet hits old flows and doesn't increase the new drop counter as a result. Solution is to wait for revalidators to clean up old flows. This fixes frequent test failures on CirrusCI. Fixes: a13a0209750c ("userspace: Improved packet drop statistics.") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-24 20:29:44 +02:00
Frode Nordahl	88e3ae5d6f	ofproto-dpif-xlate: Fix internal CT state for non-recirc traffic. In some circumstances a flow may get its ct_state set without conscious intervention by the OVS user space code. Commit 355fef6f2ccbc optimizes out unnecessary ct_clear actions based on an internal struct xlate_ctx->conntracked state flag. Before this commit the xlate_ctx->conntracked state flag would be initialized to 'false' and only set during thawing for recirculation. This patch checks the flow ct_state for the non-recirc case and sets the internal conntracked state appropriately. A system traffic test is also added to avoid regression. Fixes: 355fef6f2ccbc ("ofproto-dpif-xlate: Avoid successive ct_clear datapath actions.") Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-07 15:49:26 +02:00
Aaron Conole	ca44218515	classifier: Adjust segment boundary to execute prerequisite processing. During flow processing, the flow wildcards are checked as a series of stages, and these stages are intended to carry dependencies in a single direction. But when the neighbor discovery processing, for example, was executed there is an incorrect dependency chain - we need fields from stage 4 to determine whether we need fields from stage 3. We can build a set of flow rules to demonstrate this: table=0,priority=100,ipv6,ipv6_src=1000::/10 actions=resubmit(,1) table=0,priority=0 actions=NORMAL table=1,priority=110,ipv6,ipv6_dst=1000::3 actions=resubmit(,2) table=1,priority=100,ipv6,ipv6_dst=1000::4 actions=resubmit(,2) table=1,priority=0 actions=NORMAL table=2,priority=120,icmp6,nw_ttl=255,icmp_type=135,icmp_code=0,nd_sll=10:de:ad:be:ef:10 actions=NORMAL table=2,priority=100,tcp actions=NORMAL table=2,priority=100,icmp6 actions=NORMAL table=2,priority=0 actions=NORMAL With this set of flows, any IPv6 packet that executes through this pipeline will have the corresponding nd_sll field flagged as required match for classification even if that field doesn't make sense in such a context (for example, TCP packets). When the corresponding flow is installed into the kernel datapath, this field is not reflected when the revalidator executes the dump stage (see net/openvswitch/flow_netlink.c for more details). During the sweep stage, revalidator will compare the dumped WC with a generated WC - these will mismatch because the generated WC will match on the Neighbor Discovery fields, while the datapath WC will not match on these fields. We will then invalidate the flow and as a side effect force an upcall. By redefining the boundary, we shift these fields to the l4 subtable, and cause masks to be generated matching just the requisite fields. 
The list of fields being shifted: struct in6_addr nd_target; struct eth_addr arp_sha; struct eth_addr arp_tha; ovs_be16 tcp_flags; ovs_be16 pad2; struct ovs_key_nsh nsh; A standout field would be tcp_flags moving from l3 subtable matches to the l4 subtable matches. This reverts a partial performance optimization in the case of stateless firewalling. The tcp_flags field might have been a good candidate to retain in the l3 segment, but it got overloaded with ICMPv6 ND matching, and therefore we can't preserve this kind of optimization. Two other approaches were considered - moving the nd_target field alone and collapsing the l3/l4 segments into a single subtable for matching. Moving any field individually introduces ABI mismatch, and doesn't completely address the problems with other neighbor discovery related fields (such as nd_sll/nd_tll). Collapsing the two subtables creates an issue with datapath flow explosion, since the l3 and l4 fields will be unwildcarded together (this can be seen with some of the existing classifier tests). A simple test is added to showcase the behavior. Fixes: 476f36e83bc5 ("Classifier: Staged subtable matching.") Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2081773 Reported-by: Numan Siddique <nusiddiq@redhat.com> Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Aaron Conole <aconole@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Tested-by: Numan Siddique <numans@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-06-07 15:17:52 +02:00
Peng He	071b802c61	checkpatch.py: Add checks for easy-to-misuse APIs. Signed-off-by: Peng He <hepeng.0320@bytedance.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-30 23:39:56 +02:00
Peng He	c67941e974	ovs-rcu: Add ovsrcu_barrier. ovsrcu_barrier will block the current thread until all the postponed rcu job has been finished. it's like a OVS version of the Linux kernel rcu_barrier(). Signed-off-by: Peng He <hepeng.0320@bytedance.com> Co-authored-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-30 23:34:39 +02:00
Lin Huang	ba462b3589	dpif-netdev: Fix ALB 'rebalance_intvl' max hard limit. Currently the pmd-auto-lb-rebal-interval's value was not been checked properly. It maybe a negative, or too big value (>2 weeks between rebalances), which will be lead to a big unsigned value. So reset it to default if the value exceeds the max permitted as described in vswitchd.xml. Fixes: 5bf84282482a ("Adding support for PMD auto load balancing") Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-30 23:28:22 +02:00
Ilya Maximets	04e5adfedd	ovsdb: raft: Fix transaction double commit due to lost leadership. While becoming a follower, the leader aborts all the current 'in-flight' commands, so the higher layers can re-try corresponding transactions when the new leader is elected. However, most of these commands are already sent to followers as append requests, hence they will actually be committed by the majority of the cluster members, i.e. will be treated as committed by the new leader, unless there is an actual network problem between servers. However, the old leader will decline append replies, since it's not the leader anymore and commands are already completed with RAFT_CMD_LOST_LEADERSHIP status. New leader will replicate the commit index back to the old leader. Old leader will re-try the previously "failed" transaction, because "cluster error"s are temporary. If a transaction had some prerequisites that didn't allow double committing or there are other database constraints (like indexes) that will not allow a transaction to be committed twice, the server will reply to the client with a false-negative transaction result. If there are no prerequisites or additional database constraints, the server will execute the same transaction again, but as a follower. E.g. in the OVN case, this may result in creation of duplicated logical switches / routers / load balancers. I.e. resources with the same non-indexed name. That may cause issues later where ovn-nbctl will not be able to add ports to these switches / routers. Suggested solution is to not complete (abort) the commands, but allow them to be completed with the commit index update from a new leader. It the similar behavior to what we do in order to complete commands in a backward scenario when the follower becomes a leader. That scenario was fixed by commit 5a9b53a51ec9 ("ovsdb raft: Fix duplicated transaction execution when leader failover."). 
Code paths for leader and follower inside the raft_update_commit_index were very similar, so they were refactored into one, since we also needed an ability to complete more than one command for a follower. Failure test added to exercise scenario of a leadership transfer. Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.") Fixes: 3c2d6274bcee ("raft: Transfer leadership before creating snapshots.") Reported-at: https://bugzilla.redhat.com/2046340 Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-26 11:43:53 +02:00
Ilya Maximets	e8f557df33	sha1: Use implementation from openssl if available. Implementation of SHA1 in OpenSSL library is much faster and optimized for all available CPU architectures and instruction sets. OVS should use it instead of internal implementation if possible. Depending on compiler options OpenSSL's version finishes our sha1 unit tests from 3 to 12 times faster. Performance of OpenSSL's version is constant, but OVS's implementation highly depends on compiler. Interestingly, default build with '-g -O2' works faster than optimized '-march=native -Ofast'. Tests with ovsdb-server on big databases shows ~5-10% improvement of the time needed for database compaction (sha1 is only a part of this operation), depending on compiler options. We still need internal implementation, because OpenSSL can be not available on some platforms. Tests enhanced to check both versions of API. Reviewed-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-26 11:43:53 +02:00
Aaron Conole	7b3a4c2e86	Revert "odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP." This reverts commit c645550bb249 ("odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP.") Always forcing a slow path action can result in some over-broad flows which swallow all traffic and force them to userspace, as reported in the thread at https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/387706.html and at https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/387689.html Revert the ODP_FIT_TOO_LITTLE return for IGMP specifically. Additionally, remove the userspace wc mask to prevent revalidator from cycling flows. Fixes: c645550bb249 ("odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP.") Signed-off-by: Aaron Conole <aconole@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-26 11:43:48 +02:00
Kumar Amber	738c76a503	dpcls: Change info-get function to fetch dpcls usage stats. Modified the dplcs info-get command output to include the count for different dpcls implementations. $ovs-appctl dpif-netdev/subtable-lookup-info-get Available dpcls implementations: autovalidator (Use count: 1, Priority: 5) generic (Use count: 0, Priority: 1) avx512_gather (Use count: 0, Priority: 3) Test case to verify changes: 1061: PMD - dpcls configuration ok Signed-off-by: Kumar Amber <kumar.amber@intel.com> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com> Co-authored-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-05-24 09:53:18 +01:00
Rosemarie O'Riorden	da9424ad07	tests: Properly kill ovsdb test processes. The FreeBSD CI builds keep failing because processes of tests are not properly killed. This leaves the build hanging until it times out, and ultimately fails. Changes name of pidfile pid2 to 2.pid so that the on_exit 'kill `cat *.pid`' will capture all pidfiles. Fixes pidfile name logic in test that uses OVSDB_SERVER_SHUTDOWN_N, so that all pidfile names match the form *.pid. Replaces unnecessary --pidfile="`pwd`"/pid with just --pidfile, because by default this argument creates a pidfile named <proc-name>.pid. Removes extra [test ! -e pid \|\| kill `cat pid`] that run upon AT_CHECK failure, because those processes are killed with on_exit. Also adds on_exit in tests where it was missing. Fixes: 561205007e17 ("tests: Get rid of overly specific --pidfile and --unixctl options.") Fixes: 0be15ad76f0f ("ovsdb-server.at: Add unit test for record/replay.") Fixes: 7964ffe7d2bf ("ovsdb: relay: Add support for transaction forwarding.") Fixes: e879d33e8398 ("ovsdb/jsonrpc-server: ovsdb-server closes accepted connections immediately.") Reported-at: https://github.com/cirruslabs/cirrus-ci-docs/issues/910 Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-17 23:50:42 +02:00
Dumitru Ceara	d7c0b90fa3	ci: Add UB Sanitizer. Note: There still is an UB instance when using SSE4.2 as reported here: https://mail.openvswitch.org/pipermail/ovs-dev/2022-January/390904.html Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-17 23:10:41 +02:00
Dumitru Ceara	933aaf9444	ofp-actions: Ensure aligned accesses to masked fields. For example is parsing the OVN "eth.dst[40] = 1;" action, which triggered the following warning from UndefinedBehaviorSanitizer: lib/meta-flow.c:3210:9: runtime error: member access within misaligned address 0x000000de4e36 for type 'const union mf_value', which requires 8 byte alignment 0x000000de4e36: note: pointer points here 00 00 00 00 01 00 00 00 00 00 00 00 00 00 70 4e de 00 00 00 00 00 ^ 10 51 de 00 00 00 00 00 c0 4f 0 0x5818bc in mf_format lib/meta-flow.c:3210 1 0x5b6047 in format_SET_FIELD lib/ofp-actions.c:3342 2 0x5d68ab in ofpact_format lib/ofp-actions.c:9213 3 0x5d6ee0 in ofpacts_format lib/ofp-actions.c:9237 4 0x410922 in test_parse_actions tests/test-ovn.c:1360 To avoid this we now change the internal representation of the set_field actions, struct ofpact_set_field, such that the mask is always stored at a correctly aligned address, multiple of OFPACT_ALIGNTO. We also need to adapt the "ovs-ofctl show-flows - Oversized flow" test because now the ofpact representation of the set_field action uses more bytes in memory (for the extra alignment). Change the test to use dec_ttl instead. Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-17 23:09:50 +02:00
Dumitru Ceara	3764f5188a	treewide: Fix invalid bit shift operations. UB Sanitizer reports: tests/test-hash.c:59:40: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' 0 0x44c3c9 in get_range128 tests/test-hash.c:59 1 0x44cb2e in check_hash_bytes128 tests/test-hash.c:178 2 0x44d14d in test_hash_main tests/test-hash.c:282 [...] ofproto/ofproto-dpif-xlate.c:5607:45: runtime error: left shift of 65535 by 16 places cannot be represented in type 'int' 0 0x53fe9f in xlate_sample_action ofproto/ofproto-dpif-xlate.c:5607 1 0x54d625 in do_xlate_actions ofproto/ofproto-dpif-xlate.c:7160 2 0x553b76 in xlate_actions ofproto/ofproto-dpif-xlate.c:7806 3 0x4fcb49 in upcall_xlate ofproto/ofproto-dpif-upcall.c:1237 4 0x4fe02f in process_upcall ofproto/ofproto-dpif-upcall.c:1456 5 0x4fda99 in upcall_cb ofproto/ofproto-dpif-upcall.c:1358 [...] tests/test-util.c:89:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' 0 0x476415 in test_ctz tests/test-util.c:89 [...] lib/dpif-netlink.c:396:33: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' 0 0x571b9f in dpif_netlink_open lib/dpif-netlink.c:396 Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-17 23:06:46 +02:00
Ilya Maximets	8c506d3725	ofp-monitor: Fix abort on malformed flow update event. nx_to_ofp_flow_update_event() aborts the execution if incorrect event is passed, so checking has to be done before conversion in order to avoid the crash while decoding malformed flow update message: ==397030==ERROR: AddressSanitizer: ABRT on unknown address 0x... ) 0 0x7fd26688418b in raise 1 0x7fd266863858 in abort 2 0x6a6cbd in nx_to_ofp_flow_update_event lib/ofp-monitor.c:399:9 3 0x6a6cbd in ofputil_decode_flow_update lib/ofp-monitor.c:856:25 4 0x56491d in ofp_print_flow_monitor_reply lib/ofp-print.c:779:22 5 0x55f0a0 in ofp_to_string__ lib/ofp-print.c:1154:16 6 0x55f0a0 in ofp_to_string lib/ofp-print.c:1244:21 7 0x5603a5 in ofp_print lib/ofp-print.c:1288:28 Credit to OSS-Fuzz. Additionally removed the extra 'reply' word from the error message, since ofpraw_get_name(raw) already has one. Fixes: c3e64047d1cc ("ofp-monitor: Support flow monitoring for OpenFlow 1.3, 1.4+.") Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=47112 Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-04 18:39:24 +02:00
Vasu Dasari	c3e64047d1	ofp-monitor: Support flow monitoring for OpenFlow 1.3, 1.4+. Extended OpenFlow monitoring support * OpenFlow 1.3 with ONF extensions * OpenFlow 1.4+ as defined in OpenFlow specification 1.4+. ONF extensions are similar to Nicira extensions except for onf_flow_monitor_request{} where out_port is defined as 32-bit number OF(1.1) number, oxm match formats are used in update and request messages. Flow monitoring support in 1.4+ is slightly different from Nicira and ONF extensions. * More flow monitoring flags are defined. * Monitor add/modify/delete command is introduced in flow_monitor request message. * Addition of out_group as part of flow_monitor request message Description of changes: 1. Generate ofp-msgs.inc to be able to support 1.3, 1.4+ flow Monitoring messages. include/openvswitch/ofp-msgs.h 2. Modify openflow header files with protocol specific headers. include/openflow/openflow-1.3.h include/openflow/openflow-1.4.h 3. Modify OvS abstraction of openflow headers. ofp-monitor.h leverages enums from on nicira extensions for creating protocol abstraction headers. OF(1.4+) enums are superset of nicira extensions. include/openvswitch/ofp-monitor.h 4. Changes to these files reflect encoding and decoding of new protocol messages. lib/ofp-monitor.c 5. Changes to modules using ofp-monitor APIs. Most of the changes here are to migrate enums from nicira to OF 1.4+ versions. ofproto/connmgr.c ofproto/connmgr.h ofproto/ofproto-provider.h ofproto/ofproto.c 6. Extended protocol decoding tests to verify all protocol versions FLOW_MONITOR_CANCEL FLOW_MONITOR_PAUSED FLOW_MONITOR_RESUMED FLOW_MONITOR request FLOW_MONITOR reply tests/ofp-print.at 7. Modify flow monitoring tests to be able executed by all protocol versions. tests/ofproto.at 7. 
Modified documentation highlighting the change utilities/ovs-ofctl.8.in NEWS Signed-off-by: Vasu Dasari <vdasari@gmail.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/383915.html Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-28 21:27:11 +02:00
Vasu Dasari	d8ab75cd69	ofp-monitor: Extend Flow Monitoring support for OF 1.0-1.2 with Nicira Extensions. Currently OVS supports flow-monitoring for OpenFlow 1.0 and Nicira Extenstions. Any other OpenFlow versioned messages are not accepted. This change will allow OpenFlow1.0-1.2 Flow Monitoring with Nicira extensions be accepted. Also made sure that flow-monitoring updates, flow monitoring pause messages, resume messages are sent in the same OpenFlow version as that of flow-monitor request. Description of changes: 1. Generate ofp-msgs.inc to be able to support 1.0-1.2 Flow Monitoring messages. include/openvswitch/ofp-msgs.h 2. Support vconn to accept user specified version and use it for vconn flow-monitoring session ofproto/ofproto.c 3. Modify APIs to use protocol as an argument to encode and decode messages include/openvswitch/ofp-monitor.h lib/ofp-monitor.c ofproto/connmgr.c ofproto/connmgr.h ofproto/ofproto.c 4. Modified following testcases to be verified across supported OF Versions ofproto - flow monitoring ofproto - flow monitoring with !own ofproto - flow monitoring with out_port ofproto - flow monitoring pause and resume ofproto - flow monitoring usable protocols tests/ofproto.at 5. Updated NEWS with the support added with this commit Signed-off-by: Vasu Dasari <vdasari@gmail.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-December/050820.html Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-28 21:26:40 +02:00
Dumitru Ceara	d94cd0d3ee	ovsdb-idl: Support write-only-changed IDL monitor mode. At a first glance, change tracking should never be allowed for write-only columns. However, some clients (e.g., ovn-northd) that are mostly exclusive writers of a database, use change tracking to avoid duplicating the IDL row records into a local cache when implementing incremental processing. The default behavior of the IDL is to automatically turn a write-only column into a read-write column whenever the client enables change tracking for that column. For the afore mentioned clients, this becomes a performance issue. Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't change a column's value.") explains why writes that don't change a column's value cannot be optimized out early if the column is read/write. Furthermore, if there is at least one record in any table that changed during a transaction, then all records that have been written are added to the transaction, even if their values didn't change. If there are many such rows (e.g., like in ovn-northd's case) this incurs a significant overhead because: a. the client has to build this large transaction b. the transaction has to be sent over the network c. the server needs to parse this (mostly) no-op update We now introduce new IDL APIs allowing users to set a new monitoring mode flag, OVSDB_IDL_WRITE_CHANGED_ONLY, to indicate to the IDL that the atomicity constraints may be relaxed and written columns that don't change value can be skipped from the current transaction. We benchmarked ovn-northd performance when using this new mode against NB and SB databases taken from ovn-kubernetes scale tests. We noticed that when a minor change is performed to the Northbound database (e.g., NB_Global.nb_cfg is incremented) the time it takes to build the Southbound transaction becomes negligible (vs ~1.5 seconds before this change). 
End-to-end ovn-kubernetes scale tests on 120-node clusters also show significant reduction of latency to bring up pods; both average and P99 latency decreased by ~30%. Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-28 16:57:43 +02:00
Thilak Raj Surendra Babu	c1c8cb8a18	ofproto-dpif-xlate: Clear out vlan flow fields while processing native tunnel. When a packet is received over an access port that needs to be sent over a vxlan tunnel,the access port VLAN id is used in the lookup leading to a wrong packet being crafted and sent over the tunnel. Clear out the flow 's VLAN field as it should not be used while performing mac lookup for the outer tunnel and also at this point the VLAN action related to inner flow is already committed. Fixes: 7c12dfc527a5 ("tunneling: Avoid datapath-recirc by combining recirc actions at xlate.") Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393566.html Reported-at: https://bugzilla.redhat.com/2060552 Signed-off-by: Thilak Raj Surendra Babu <thilakraj.sb@nutanix.com> Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com> Co-authored-by: Rosemarie O'Riorden <roriorden@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-27 21:38:21 +02:00
Jan Scheurich	0e0eef533f	ofproto-xlate: Fix crash when forwarding packet between legacy_l3 tunnels. A packet received from a tunnel port with legacy_l3 packet-type (e.g. lisp, L3 gre, gtpu) is conceptually wrapped in a dummy Ethernet header for processing in an OF pipeline that is not packet-type-aware. Before transmission of the packet to another legacy_l3 tunnel port, the dummy Ethernet header is stripped again. In ofproto-xlate, wrapping in the dummy Ethernet header is done by simply changing the packet_type to PT_ETH. The generation of the push_eth datapath action is deferred until the packet's flow changes need to be committed, for example at output to a normal port. The deferred Ethernet encapsulation is marked in the pending_encap flag. This patch fixes a bug in the translation of the output action to a legacy_l3 tunnel port, where the packet_type of the flow is reverted from PT_ETH to PT_IPV4 or PT_IPV6 (depending on the dl_type) to remove its Ethernet header without clearing the pending_encap flag if it was set. At the subsequent commit of the flow changes, the unexpected combination of pending_encap == true with an PT_IPV4 or PT_IPV6 packet_type hit the OVS_NOT_REACHED() abortion clause. The pending_encap is now cleared in this situation. Reported-by: Dincer Beken <dbeken@blackned.de> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Co-authored-by: Dincer Beken <dbeken@blackned.de> Signed-off-by: Dincer Beken <dbeken@blackned.de> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-27 00:32:38 +02:00
Paolo Valerio	4ea1bb6391	system-traffic: Fix fragment reassembly with L3 L4 protocol information. The test relied on the flows installed by recv_upcalls() after upcall_receive() returned ENODEV if the packet was initially originated by packet-out with OFPP_CONTROLLER as in_port. Since 323ae1e808e6 ("ofproto-dpif-xlate: Fix recirculation when in_port is OFPP_CONTROLLER.") the test stopped working because recirculation in such scenario got fixed and upcall_receive() no longer returns ENODEV. Fix it by setting an invalid as "in_port" in order to similarly trigger the same behavior. Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-26 21:55:45 +02:00
Kevin Traynor	3b18b8656b	alb.at: Add tests for cross-numa polling. PMD auto load balance currently only operates when the polling PMD thread core will not change numa after reassignment. Add a unit test for this. Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-04 22:52:12 +02:00
Kevin Traynor	cdc9a196b1	pmd.at: Add tests for multi non-local numa pmds. Ensure that if there are no local numa PMD thread cores available that pmd cores from all other non-local numas will be used. Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-04 22:52:12 +02:00
Paolo Valerio	0027b3b46c	ofproto-dpif-xlate: Fix NULL pointer dereference in xlate_normal(). Considering the following flows: ovs-ofctl dump-flows br0 cookie=0x0, table=0, priority=0 actions=NORMAL and assuming a packet originated from packet-out in this way: ovs-ofctl packet-out br0 \ "in_port=controller,packet=<UDP packet>,action=ct(table=0)" If in_port is OFPP_NONE or OFPP_CONTROLLER, this leads to a NULL pointer (xport) dereference in xlate_normal(). Fix it by checking the xport pointer validity while deciding whether it is a candidate for mac learning or not. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-04-04 21:29:37 +02:00
Adrian Moreno	745c80f52c	hindex: remove the next variable in safe loops. Using SHORT version of the *_SAFE loops makes the code cleaner and less error prone. So, use the SHORT version and remove the extra variable when possible for HINDEX_*_SAFE. In order to be able to use both long and short versions without changing the name of the macro for all the clients, overload the existing name and select the appropriate version depending on the number of arguments. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Adrian Moreno	2d40277382	hindex: use multi-variable iterators. Re-write hindex's loops using multi-variable helpers. For safe loops, use the LONG version to maintain backwards compatibility. Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:03 +02:00
Adrian Moreno	ef39616486	cmap: use multi-variable iterators. Re-write cmap's loops using multi-variable helpers. For iterators based on cmap_cursor, we just need to make sure the NODE variable is not used after the loop, so we set it to NULL. Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Adrian Moreno	9e56549c2b	hmap: use short version of safe loops if possible. Using SHORT version of the *_SAFE loops makes the code cleaner and less error prone. So, use the SHORT version and remove the extra variable when possible for hmap and all its derived types. In order to be able to use both long and short versions without changing the name of the macro for all the clients, overload the existing name and select the appropriate version depending on the number of arguments. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Adrian Moreno	860e69a8c3	hmap: implement UB-safe hmap pop iterator. HMAP_FOR_EACH_POP iterator has an additional difficulty, which is the use of two iterator variables of different types. In order to re-write this loop in a UB-safe manner, create a iterator struct to be used as loop variable. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Adrian Moreno	9e8d960a6b	hmap: use multi-variable helpers for hmap loops. Rewrite hmap's loops using multi-variable helpers. For SAFE loops, use the LONG version of the multi-variable macros to keep backwards compatibility. Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Adrian Moreno	e9bf5bffb0	list: use short version of safe loops if possible. Using the SHORT version of the *_SAFE loops makes the code cleaner and less error-prone. So, use the SHORT version and remove the extra variable when possible. In order to be able to use both long and short versions without changing the name of the macro for all the clients, overload the existing name and select the appropriate version depending on the number of arguments. Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Adrian Moreno	d4566085ed	list: use multi-variable helpers for list loops. Use multi-variable iteration helpers to rewrite non-safe loops. There is an important behavior change compared with the previous implementation: When the loop ends normally (i.e: not via "break;"), the object pointer provided by the user is NULL. This is safer because it's not guaranteed that it would end up pointing a valid address. For pop iterator, set the variable to NULL when the loop ends. Clang-analyzer has successfully picked the potential null-pointer dereference on the code that triggered this change (bond.c) and nothing else has been detected. For _SAFE loops, use the LONG version for backwards compatibility. Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-30 16:59:02 +02:00
Ilya Maximets	9d86459516	system-traffic.at: Fix flaky DNAT load balancing test. 'conntrack - DNAT load balancing' test fails from time to time because not all the group buckets are getting hit. In short, the test creates a group with 3 buckets with the same weight. It creates 12 TCP sessions and expects that each bucket will be used at least once. However, there is a solid chance that this will not happen. The probability of having at least one empty bucket is: C(3, 1) x (2/3)^N - C(3, 2) x (1/3)^N Where N is the number of distinct TCP sessions. For N=12, the probability is about 0.023, i.e. there is a 2.3% chance for a test to fail, which is not great for CI. Increasing the number of sessions to 50 to reduce the probability of failure down to 4.7e-9. In my testing, the configuration with 50 TCP sessions didn't fail after 6000 runs. Should be good enough for CI systems. Fixes: 2c66ebe47a88 ("ofp-actions: Allow conntrack action in group buckets.") Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Michael Phelan <michael.phelan@intel.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-25 20:30:37 +01:00
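The failure probability quoted in the commit above can be checked with a short calculation. This is a minimal sketch, assuming each TCP session lands in one of the 3 equally weighted buckets independently and uniformly; the function name `p_empty_bucket` is made up for illustration and is not part of OVS.

```python
from math import comb


def p_empty_bucket(n_sessions: int, buckets: int = 3) -> float:
    """Probability that at least one bucket receives no session.

    Inclusion-exclusion over the number k of buckets forced to be empty:
    for 3 buckets this is C(3,1) * (2/3)^N - C(3,2) * (1/3)^N.
    The k == buckets term is zero for N > 0, so it is skipped.
    """
    return sum(
        (-1) ** (k + 1) * comb(buckets, k) * ((buckets - k) / buckets) ** n_sessions
        for k in range(1, buckets)
    )


if __name__ == "__main__":
    for n in (12, 50):
        print(f"N={n}: {p_empty_bucket(n):.3g}")
```

Under these assumptions the result for N=12 is about 0.023 and for N=50 about 4.7e-9, in line with the figures in the commit message.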
Ilya Maximets	2e2217c126	tests: Fix incorrect usage of OVS_WAIT_UNTIL. OVS_WAIT_UNTIL() macro has only 2 arguments and doesn't check the output of the command, but bonding and route tests are trying to use it as if it was AT_CHECK macro. That makes checks in those tests mostly useless, since they are not actually checking anything except for command returning zero. Introducing a new macro OVS_WAIT_UNTIL_EQUAL that will actually perform the comparison with the desired output. Using it for the bonding and route tests and fixing all the caught incorrect expected outputs along the way. Adding an explicit argument check to the OVS_WAIT_UNTIL/WHILE to avoid the problem in the future. Fixes: b4e50218a0f8 ("bond: Add 'primary' interface concept for active-backup mode.") Fixes: 9e11517e6ca6 ("ovs-router: Fix flushing of local routes.") Acked-by: Aaron Conole <aconole@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-22 22:09:34 +01:00
Eelco Chaudron	31b467a751	odp-util: Fix output for tc to be equal to kernel. When the same flow is programmed in the kernel and tc, they look different due to the way they are translated. They take the userspace approach by always including the packet type attribute. To make the outputs the same, show the ethernet header when the packet type is wildcarded, and not printed. So without the fix the kernel would show (ovs-appctl dpctl/dump-flows): in_port(3),eth(),eth_type(0x0800),ipv4(frag=no), ..., actions:output Whereas TC would show: in_port(3),eth_type(0x0800),ipv4(frag=no), ..., actions:output Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-21 00:37:41 +01:00
Kumar Amber	c44876b9e4	system-dpdk: Fix mfex autovalidator tests. AVX512 DPIF must be active in order for the MFEX AutoValidator to be executed. If the DPIF-AVX512 is not available, the unit test is skipped, as the scalar DPIF does not use the MFEX function-pointer based optimizations. Fixes: 50be6715c083 ("test/sytem-dpdk: Add unit test for mfex autovalidator") Suggested-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Signed-off-by: Kumar Amber <kumar.amber@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-11 21:17:48 +01:00
Dumitru Ceara	b1e783dde4	tests: Ignore log about failing to set NETLINK_EXT_ACK. Since 4a6a4734622e ("netlink-socket: Log extack error messages in netlink transactions."), tests fail on older systems that don't support NETLINK_EXT_ACK. It's not really an issue, so we can just ignore the log. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-11 21:14:51 +01:00
Ilya Maximets	8d480c5cec	ovsdb-cluster.at: Avoid test failures due to different hashing. Depending on compiler flags and CPU architecture different hash functions are used. That impacts the order of tables and columns in database representation making ovsdb report different columns in the warning about ephemeral-to-persistent conversion. Stripping out changing parts of the message to avoid the issue. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-11 21:13:33 +01:00
Flavio Leitner	7ed60839d0	system-tso: Skip encap tests when userspace TSO is enabled. It seems Linux native tunnel configuration changed to enable checksum by default and that causes the check-system-tso unit test below to fail: 10: datapath - ping over vxlan tunnel FAILED (system-traffic.at:248) That happens because userspace TSO doesn't support encapsulation as mentioned in the current documentation. In this specific case, udp_extract_tnl_md() checks if the checksum is correct, but since TSO is enabled, the outer UDP header contains only the pseudo checksum and not the full packet checksum. Although the packet is marked correctly with UDP csum offload flag and the code could use that to verify the pseudo csum, more work is needed to properly translate the offloading flags from the outer headers to the inner headers. For example, if the payload is a TCP packet, most probably the flag DP_PACKET_OL_TX_UDP_CKSUM doesn't make sense after decapsulating that. This patch skips the tunnel tests when the userspace TSO is enabled. Fixes: 29bb3093eb8b ("userspace: Enable TSO support for non-DPDK.") Signed-off-by: Flavio Leitner <fbl@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-04 19:51:35 +01:00
Ilya Maximets	a3e97b1af1	ovsdb: relay: Add transaction history support. Even though relays can be scaled to a large number of servers to handle a lot more clients, lack of transaction history may cause significant load if clients are re-connecting. E.g. in case of the upgrade of a large-scale OVN deployment, relays can be taken down one by one forcing all the clients of one relay to jump to other ones. And all these clients will download the database from scratch from a new relay. Since relay itself supports monitor_cond_since connection to the main cluster, it receives the last transaction id along with each update. Since these transaction ids are 'eid's of actual transactions, they can be used by relay for a transaction history. Relay may not receive all the transaction ids, because the main cluster may combine several changes into a single monitor update. However, all relays will, likely, receive the same updates with the same transaction ids, so the case where transaction id can not be found after re-connection between relays should not be very common. If some id is missing on the relay (i.e. this update was merged with some other update and newer id was used) the client will just re-download the database as if there was a normal transaction history miss. OVSDB client synchronization module updated to provide the last transaction id along with the update. Relay module updated to use these ids as a transaction id. If ids are zero, relay decides that the main server doesn't support transaction ids and disables the transaction history accordingly. Using ovsdb_txn_replay_commit() instead of ovsdb_txn_propose_commit_block(), so transactions are added to the history. This can be done, because relays have no file storage, so there is no need to write anything. Relay tests modified to test both standalone and clustered database as a main server. Checks added to ensure that all servers receive the same transaction ids in monitor updates. Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-03-03 15:21:21 +01:00
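The history lookup described in the relay commit can be pictured with a toy model. This is only an illustrative sketch, not the ovsdb-server implementation: the class `RelayHistory`, its methods, and the `limit` parameter are hypothetical, and the handling of zero ids (history disabled) is not modeled.

```python
class RelayHistory:
    """Toy transaction history keyed by the ids received with monitor updates."""

    def __init__(self, limit: int = 100):
        self.entries = []   # (txn_id, update) pairs in arrival order
        self.limit = limit  # hypothetical cap on remembered transactions

    def add(self, txn_id: str, update) -> None:
        self.entries.append((txn_id, update))
        if len(self.entries) > self.limit:
            self.entries.pop(0)  # forget the oldest transaction

    def updates_since(self, last_id: str):
        """Return updates newer than last_id, or None on a history miss.

        None stands for "the client has to re-download the database".
        """
        for i, (txn_id, _) in enumerate(self.entries):
            if txn_id == last_id:
                return [update for _, update in self.entries[i + 1:]]
        return None


# The main cluster merged transactions "b" and "c" into one update tagged "c",
# so this relay never saw id "b".
history = RelayHistory()
history.add("a", {"row": 1})
history.add("c", {"row": 3})
```

A client reconnecting with last id "a" gets only the later update, while a client that last saw "b" on another relay hits a history miss and falls back to a full download.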
Ilya Maximets	999ba294fb	ovsdb: raft: Fix inability to join the cluster after interrupted attempt. If the joining server re-connects while catching up (e.g. if it crashed or connection got closed due to inactivity), the data we sent might be lost, so the server will never reply to append request or a snapshot installation request. At the same time, leader will decline all the subsequent requests to join from that server with the 'in progress' resolution. At this point the new server will never be able to join the cluster, because it will never receive the raft log while leader thinks that it was already sent. This happened in practice when one of the servers got preempted for a few seconds, so the leader closed connection due to inactivity. Destroying the joining server if disconnection detected. This will allow to start the joining from scratch when the server re-connects and sends the new join request. We can't track re-connection in the raft_conn_run(), because it's incoming connection and the jsonrpc will not keep it alive or try to reconnect. Next time the server re-connects it will be an entirely new raft conn. Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.") Reported-at: https://bugzilla.redhat.com/2033514 Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>	2022-02-25 14:15:12 +01:00
Ilya Maximets	6de8868d19	reconnect: Fix broken inactivity probe if there is no other reason to wake up. The purpose of reconnect_deadline__() function is twofold: 1. Its result is used to tell if the state has to be changed right now in reconnect_run(). 2. Its result is also used to determine when the process needs to wake up and call reconnect_run() the next time, i.e. when the state may need to be changed next time. Since introduction of the 'receive-attempted' feature, the function returns LLONG_MAX if the deadline is in the future. That works for the first case, but doesn't for the second one, because we don't really know when we need to call reconnect_run(). This is the problem for applications where jsonrpc connection is the only source of wake ups, e.g. ovn-northd. When the network goes down silently, e.g. server loses IP address due to DHCP failure, ovn-northd will sleep in the poll loop indefinitely after being told that it doesn't need to call reconnect_run() (deadline == LLONG_MAX). Fixing that by actually returning the expected time if it is in the future, so we will know when to wake up. In order to keep the 'receive-attempted' feature, returning 'now + 1' in case where the time has already passed, but receive wasn't attempted. That will trigger a fast wake up, so the application will be able to attempt the receive even if there were no real events. In a correctly written application we should not fall into this case more than once in a row. '+ 1' ensures that we will not transition into a different state prematurely, i.e. before the receive is actually attempted. Fixes: 4241d652e465 ("jsonrpc: Avoid disconnecting prematurely due to long poll intervals.") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-02-24 17:04:32 +01:00
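The deadline rule described in the reconnect commit can be written down as a small function. This is a hedged sketch of the described behavior only, not the C code of reconnect_deadline__(): the Python names and the convention that a deadline at or before `now_ms` means "change state now" are assumptions made for illustration.

```python
def reconnect_deadline(expiration_ms: int, now_ms: int, receive_attempted: bool) -> int:
    """Return the time at which the caller should next run the state machine."""
    if now_ms < expiration_ms:
        # Deadline is in the future: report it, so the poll loop knows when
        # to wake up even if no other event arrives.
        return expiration_ms
    if not receive_attempted:
        # Deadline passed, but no receive was attempted yet: ask for a fast
        # wake-up. 'now + 1' is still in the future, so no state change yet.
        return now_ms + 1
    # Deadline passed and a receive was attempted: the state may change now.
    return expiration_ms


def must_change_state(deadline_ms: int, now_ms: int) -> bool:
    # Assumed caller-side check: act once the deadline has been reached.
    return deadline_ms <= now_ms
```

With this rule the caller always gets a finite wake-up time, and the state transition is still held back until a receive has been attempted.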
Paolo Valerio	989895501c	system-traffic.at: Avoid sporadic failures during conntrack IPv6 HTTP/FTP tests. Some sporadic false positive may be visible for the following tests: - conntrack - IPv6 HTTP - conntrack - FTP over IPv6 The failures show up randomly. The reason appears to be source address used when performing the request using wget: -tcp,orig=(src=fc00::1,dst=fc00::2,sport=<cleared>,dport=<cleared>),\ reply=(src=fc00::2,dst=fc00::1,sport=<cleared>,dport=<cleared>),\ protoinfo=(state=<cleared>) +tcp,orig=(src=fe80::f0eb:f8ff:fef0:138f,dst=fc00::2,sport=<cleared>,dport=<cleared>),\ reply=(src=fc00::2,dst=fe80::f0eb:f8ff:fef0:138f,sport=<cleared>,dport=<cleared>),\ protoinfo=(state=<cleared>) It seems that the problem can be addressed in multiple ways, but using "nodad" seems to be safe enough to fix the issue that now, after hundreds of attempts, is no longer present. Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-02-14 21:38:02 +01:00
Paolo Valerio	e969370d30	system-traffic.at: Do not use ranges with broadcast address. Turn a bunch of test ranges from 10.1.1.240-10.1.1.255 to 10.1.1.240-10.1.1.254. 10.1.1.255 is the broadcast address for 10.1.1.0/24 and can be picked to SNAT packets causing the subsequent failure of the test. Fixes: 9ac0aadab9f9 ("conntrack: Add support for NAT.") Fixes: e32cd4c6292e ("conntrack: ignore port for ICMP/ICMPv6 NAT.") Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-02-14 20:49:39 +01:00
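The address arithmetic behind the range change above can be checked with Python's standard `ipaddress` module. This is a small sketch; the helper `range_includes_broadcast` is a made-up name and not part of the OVS test suite.

```python
import ipaddress


def range_includes_broadcast(first: str, last: str, network: str) -> bool:
    """True if the inclusive address range [first, last] contains the
    broadcast address of the given network."""
    broadcast = ipaddress.ip_network(network).broadcast_address
    return ipaddress.ip_address(first) <= broadcast <= ipaddress.ip_address(last)


if __name__ == "__main__":
    for last in ("10.1.1.255", "10.1.1.254"):
        hit = range_includes_broadcast("10.1.1.240", last, "10.1.1.0/24")
        print(f"10.1.1.240-{last}: includes broadcast = {hit}")
```

The old range ends on the broadcast address of 10.1.1.0/24, so it could be selected for SNAT; the new range stops one address short of it.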
Kumar Amber	b9cf520705	system-dpdk.at: Add warning log in mfex fuzzy test. Some specific warnings are seen on some systems but not on others, so it is good to add such logs to the test to avoid test-case failures. The warning only affects the fuzzy tests, due to more than 1000 flows being offloaded simultaneously. Wildcarding the flow count, as the number in the warning log could vary between systems under test. Suggested-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-02-11 11:42:17 +00:00
Adrian Moreno	f0a9000ca6	ofproto: Fix ipfix not always sampling on egress. We are currently requiring in_port to be a valid port number for ipfix sampling even if the sampling is done on the output port (egress). This restriction is not really needed and can affect pipelines that intentionally set the in_port to OFPP_NONE during flow processing. For instance, OVN does this, see: cfa547821 Fix ovn-controller generated packets from getting dropped for reject ACL action. This patch skips ipfix sampling only if both (ingress and egress) ports are invalid. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2016346 Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-02-09 16:02:49 +01:00
Martin Varghese	712202ff7d	ofproto-dpif-xlate: Fix packet drops with decap action on MPLS Multicast. Added PT_MPLS_MC support in function xlate_generic_decap_action to fix packet drops when decap action is applied on packets with packet_type (ns=1,type=0x8848). Fixes: 1917ace89364 ("Encap & Decap actions for MPLS packet type.") Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-31 21:27:06 +01:00
Martin Varghese	3ae3e86059	tests: Fix cosmetic errors in system-traffic.at. Removed extra lines in multiple encap decap mpls actions & encap decap mpls actions tests. Converted title of encap decap mpls actions tests to lowercase for consistency. Fixes: 1917ace89364 ("Encap & Decap actions for MPLS packet type.") Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-31 21:27:06 +01:00
Dumitru Ceara	c1691cceac	ovsdb-cs: Clear last_id on reconnect if condition changes in-flight. When reconnecting, if there are condition changes already sent to the server but not yet acked, reset the db's 'last-id', essentially clearing the local cache after reconnect. This is needed because the client cannot easily differentiate between the following cases: a. either the server already processed the requested monitor condition change but the FSM was restarted before the client was notified. In this case the client should clear its local cache because it's out of sync with the monitor view on the server side. b. OR the server hasn't processed the requested monitor condition change yet. Conditions changing at the same time with a reconnection happening are rare so the performance impact of this patch should be minimal. Also, the tests are updated to cover the fact that we cannot control which of the two scenarios ("a" and "b" above) are hit during the test. Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-31 21:23:47 +01:00
Ilya Maximets	9632f5551f	tests: Add de-serialization check to the json string benchmark. Since we're testing serialization, it also makes sense to test the opposite operation. Should be useful in the future for exploring possible optimizations. CMD: $ ./tests/ovstest json-string-benchmark Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com>	2022-01-31 21:15:25 +01:00

3402 Commits