mirror of https://github.com/openvswitch/ovs synced 2025-08-29 13:27:59 +00:00

3402 Commits

Author SHA1 Message Date
Ilya Maximets
1dbc3b9f34 drop-stats.at: Fix frequent failures of the recursion too deep test.
The test doesn't wait for old flows to be revalidated before sending
the second packet.  The packet hits old flows and doesn't increase the
new drop counter as a result.

Solution is to wait for revalidators to clean up old flows.  This fixes
frequent test failures on CirrusCI.

Fixes: a13a0209750c ("userspace: Improved packet drop statistics.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-06-24 20:29:44 +02:00
Frode Nordahl
88e3ae5d6f ofproto-dpif-xlate: Fix internal CT state for non-recirc traffic.
In some circumstances a flow may get its ct_state set without
conscious intervention by the OVS user space code.

Commit 355fef6f2ccbc optimizes out unnecessary ct_clear actions
based on an internal struct xlate_ctx->conntracked state flag.

Before this commit the xlate_ctx->conntracked state flag would
be initialized to 'false' and only set during thawing for
recirculation.

This patch checks the flow ct_state for the non-recirc case and
sets the internal conntracked state appropriately.  A system
traffic test is also added to avoid regression.

Fixes: 355fef6f2ccbc ("ofproto-dpif-xlate: Avoid successive ct_clear datapath actions.")
Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-06-07 15:49:26 +02:00
Aaron Conole
ca44218515 classifier: Adjust segment boundary to execute prerequisite processing.
During flow processing, the flow wildcards are checked as a series of
stages, and these stages are intended to carry dependencies in a single
direction.  But when the neighbor discovery processing, for example, is
executed there is an incorrect dependency chain - we need fields from
stage 4 to determine whether we need fields from stage 3.

We can build a set of flow rules to demonstrate this:
  table=0,priority=100,ipv6,ipv6_src=1000::/10 actions=resubmit(,1)
  table=0,priority=0 actions=NORMAL
  table=1,priority=110,ipv6,ipv6_dst=1000::3 actions=resubmit(,2)
  table=1,priority=100,ipv6,ipv6_dst=1000::4 actions=resubmit(,2)
  table=1,priority=0 actions=NORMAL
  table=2,priority=120,icmp6,nw_ttl=255,icmp_type=135,icmp_code=0,nd_sll=10:de:ad:be:ef:10 actions=NORMAL
  table=2,priority=100,tcp actions=NORMAL
  table=2,priority=100,icmp6 actions=NORMAL
  table=2,priority=0 actions=NORMAL

With this set of flows, any IPv6 packet that executes through this pipeline
will have the corresponding nd_sll field flagged as required match for
classification even if that field doesn't make sense in such a context
(for example, TCP packets).  When the corresponding flow is installed into
the kernel datapath, this field is not reflected when the revalidator
executes the dump stage (see net/openvswitch/flow_netlink.c for more details).

During the sweep stage, revalidator will compare the dumped WC with a
generated WC - these will mismatch because the generated WC will match on
the Neighbor Discovery fields, while the datapath WC will not match on
these fields.  We will then invalidate the flow and as a side effect
force an upcall.

By redefining the boundary, we shift these fields to the l4 subtable, and
cause masks to be generated matching just the requisite fields.  The list
of fields being shifted:

    struct in6_addr nd_target;
    struct eth_addr arp_sha;
    struct eth_addr arp_tha;
    ovs_be16 tcp_flags;
    ovs_be16 pad2;
    struct ovs_key_nsh nsh;

A standout field would be tcp_flags moving from l3 subtable matches to
the l4 subtable matches.  This reverts a partial performance optimization
in the case of stateless firewalling.  The tcp_flags field might have
been a good candidate to retain in the l3 segment, but it got overloaded
with ICMPv6 ND matching, and therefore we can't preserve this kind of
optimization.

Two other approaches were considered - moving the nd_target field alone
and collapsing the l3/l4 segments into a single subtable for matching.
Moving any field individually introduces ABI mismatch, and doesn't
completely address the problems with other neighbor discovery related
fields (such as nd_sll/nd_tll).  Collapsing the two subtables creates
an issue with datapath flow explosion, since the l3 and l4 fields will
be unwildcarded together (this can be seen with some of the existing
classifier tests).

A simple test is added to showcase the behavior.

Fixes: 476f36e83bc5 ("Classifier: Staged subtable matching.")
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2081773
Reported-by: Numan Siddique <nusiddiq@redhat.com>
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Cian Ferriter <cian.ferriter@intel.com>
Tested-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-06-07 15:17:52 +02:00
Peng He
071b802c61 checkpatch.py: Add checks for easy-to-misuse APIs.
Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-30 23:39:56 +02:00
Peng He
c67941e974 ovs-rcu: Add ovsrcu_barrier.
ovsrcu_barrier will block the current thread until all the postponed
RCU jobs have finished.  It's like an OVS version of the Linux
kernel rcu_barrier().

Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Co-authored-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-30 23:34:39 +02:00
Lin Huang
ba462b3589 dpif-netdev: Fix ALB 'rebalance_intvl' max hard limit.
Currently the pmd-auto-lb-rebal-interval value is not checked
properly.

It may be negative or too big (>2 weeks between rebalances),
which leads to a big unsigned value.  So reset it to the default
if the value exceeds the max permitted as described in vswitchd.xml.
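
The check described above can be sketched in C roughly as follows.  This
is an illustration only, not the dpif-netdev code: the function name and
both constants are assumptions, and the authoritative limits are the ones
documented in vswitchd.xml.

```c
#include <assert.h>
#include <stdint.h>

/* Both limits are assumed values for this sketch. */
#define SKETCH_REBAL_DEFAULT_MIN 1
#define SKETCH_REBAL_MAX_MIN 20000

/* Returns the rebalance interval in milliseconds for a requested interval
 * given in minutes.  A negative request becomes a huge unsigned value after
 * the cast, so one comparison rejects both negative and oversized input. */
uint64_t
sketch_rebalance_interval_ms(int64_t requested_min)
{
    uint64_t minutes = (uint64_t) requested_min;

    if (minutes > SKETCH_REBAL_MAX_MIN) {
        minutes = SKETCH_REBAL_DEFAULT_MIN;
    }
    /* The bound guarantees the multiplication below cannot overflow. */
    assert(minutes <= SKETCH_REBAL_MAX_MIN);
    return minutes * 60 * 1000;
}
```

Resetting to the default, instead of clamping to the maximum, follows the
wording of the commit message.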

Fixes: 5bf84282482a ("Adding support for PMD auto load balancing")
Signed-off-by: Lin Huang <linhuang@ruijie.com.cn>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-30 23:28:22 +02:00
Ilya Maximets
04e5adfedd ovsdb: raft: Fix transaction double commit due to lost leadership.
While becoming a follower, the leader aborts all the current
'in-flight' commands, so the higher layers can re-try corresponding
transactions when the new leader is elected.  However, most of these
commands are already sent to followers as append requests, hence they
will actually be committed by the majority of the cluster members,
i.e. will be treated as committed by the new leader, unless there is
an actual network problem between servers.  However, the old leader
will decline append replies, since it's not the leader anymore and
commands are already completed with RAFT_CMD_LOST_LEADERSHIP status.

New leader will replicate the commit index back to the old leader.
Old leader will re-try the previously "failed" transaction, because
"cluster error"s are temporary.

If a transaction had some prerequisites that didn't allow double
committing or there are other database constraints (like indexes) that
will not allow a transaction to be committed twice, the server will
reply to the client with a false-negative transaction result.

If there are no prerequisites or additional database constraints,
the server will execute the same transaction again, but as a follower.

E.g. in the OVN case, this may result in creation of duplicated logical
switches / routers / load balancers.  I.e. resources with the same
non-indexed name.  That may cause issues later where ovn-nbctl will
not be able to add ports to these switches / routers.

Suggested solution is to not complete (abort) the commands, but allow
them to be completed with the commit index update from a new leader.
This is similar to what we do in order to complete commands
in a backward scenario when the follower becomes a leader.  That
scenario was fixed by commit 5a9b53a51ec9 ("ovsdb raft: Fix duplicated
transaction execution when leader failover.").

Code paths for leader and follower inside the raft_update_commit_index
were very similar, so they were refactored into one, since we also
needed an ability to complete more than one command for a follower.

Failure test added to exercise scenario of a leadership transfer.

Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Fixes: 3c2d6274bcee ("raft: Transfer leadership before creating snapshots.")
Reported-at: https://bugzilla.redhat.com/2046340
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-26 11:43:53 +02:00
Ilya Maximets
e8f557df33 sha1: Use implementation from openssl if available.
Implementation of SHA1 in OpenSSL library is much faster and optimized
for all available CPU architectures and instruction sets.  OVS should
use it instead of internal implementation if possible.

Depending on compiler options OpenSSL's version finishes our sha1
unit tests from 3 to 12 times faster.  Performance of OpenSSL's
version is constant, but OVS's implementation highly depends on
compiler.  Interestingly, default build with  '-g -O2' works faster
than optimized '-march=native -Ofast'.

Tests with ovsdb-server on big databases show ~5-10% improvement of
the time needed for database compaction (sha1 is only a part of this
operation), depending on compiler options.

We still need the internal implementation, because OpenSSL may not be
available on some platforms.  Tests enhanced to check both versions
of API.

Reviewed-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-26 11:43:53 +02:00
Aaron Conole
7b3a4c2e86 Revert "odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP."
This reverts commit c645550bb249 ("odp-util: Always report
ODP_FIT_TOO_LITTLE for IGMP.")

Always forcing a slow path action can result in some over-broad
flows which swallow all traffic and force them to userspace, as reported
in the thread at
https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/387706.html
and at
https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/387689.html

Revert the ODP_FIT_TOO_LITTLE return for IGMP specifically.
Additionally, remove the userspace wc mask to prevent revalidator from
cycling flows.

Fixes: c645550bb249 ("odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP.")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-26 11:43:48 +02:00
Kumar Amber
738c76a503 dpcls: Change info-get function to fetch dpcls usage stats.
Modified the dpcls info-get command output to include
the count for different dpcls implementations.

$ovs-appctl dpif-netdev/subtable-lookup-info-get

Available dpcls implementations:
  autovalidator (Use count: 1, Priority: 5)
  generic (Use count: 0, Priority: 1)
  avx512_gather (Use count: 0, Priority: 3)

Test case to verify changes:
        1061: PMD - dpcls configuration     ok

Signed-off-by: Kumar Amber <kumar.amber@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>
Co-authored-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2022-05-24 09:53:18 +01:00
Rosemarie O'Riorden
da9424ad07 tests: Properly kill ovsdb test processes.
The FreeBSD CI builds keep failing because processes of tests are not
properly killed. This leaves the build hanging until it times out, and
ultimately fails.

Changes name of pidfile pid2 to 2.pid so that the
on_exit 'kill `cat *.pid`' will capture all pidfiles.

Fixes pidfile name logic in test that uses OVSDB_SERVER_SHUTDOWN_N, so
that all pidfile names match the form *.pid.

Replaces unnecessary --pidfile="`pwd`"/pid with just --pidfile, because
by default this argument creates a pidfile named <proc-name>.pid.

Removes extra [test ! -e pid || kill `cat pid`] that run upon AT_CHECK
failure, because those processes are killed with on_exit. Also adds
on_exit in tests where it was missing.

Fixes: 561205007e17 ("tests: Get rid of overly specific --pidfile and --unixctl options.")
Fixes: 0be15ad76f0f ("ovsdb-server.at: Add unit test for record/replay.")
Fixes: 7964ffe7d2bf ("ovsdb: relay: Add support for transaction forwarding.")
Fixes: e879d33e8398 ("ovsdb/jsonrpc-server: ovsdb-server closes accepted connections immediately.")
Reported-at: https://github.com/cirruslabs/cirrus-ci-docs/issues/910
Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-17 23:50:42 +02:00
Dumitru Ceara
d7c0b90fa3 ci: Add UB Sanitizer.
Note: There still is an UB instance when using SSE4.2 as reported here:
https://mail.openvswitch.org/pipermail/ovs-dev/2022-January/390904.html

Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-17 23:10:41 +02:00
Dumitru Ceara
933aaf9444 ofp-actions: Ensure aligned accesses to masked fields.
One example is parsing the OVN "eth.dst[40] = 1;" action, which
triggered the following warning from UndefinedBehaviorSanitizer:

  lib/meta-flow.c:3210:9:
  runtime error: member access within misaligned address 0x000000de4e36
  for type 'const union mf_value', which requires 8 byte alignment
  0x000000de4e36: note: pointer points here
   00 00 00 00 01 00  00 00 00 00 00 00 00 00 70 4e de 00 00 00 00 00
               ^
   10 51 de 00 00 00 00 00  c0 4f

      0 0x5818bc in mf_format lib/meta-flow.c:3210
      1 0x5b6047 in format_SET_FIELD lib/ofp-actions.c:3342
      2 0x5d68ab in ofpact_format lib/ofp-actions.c:9213
      3 0x5d6ee0 in ofpacts_format lib/ofp-actions.c:9237
      4 0x410922 in test_parse_actions tests/test-ovn.c:1360

To avoid this we now change the internal representation of the set_field
actions, struct ofpact_set_field, such that the mask is always stored
at a correctly aligned address, multiple of OFPACT_ALIGNTO.
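
A rough C sketch of the idea follows.  It is not the actual struct
ofpact_set_field layout: the helper names are hypothetical and the value 8
for the alignment is an assumption standing in for OFPACT_ALIGNTO.  For a
6-byte field such as eth.dst, a mask stored directly after the value would
start at offset 6; rounding the offset up moves it to an aligned boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for OFPACT_ALIGNTO; the value 8 is an assumption. */
#define SKETCH_ALIGNTO 8

/* Rounds 'n' up to the nearest multiple of 'align', which must be nonzero. */
size_t
sketch_round_up(size_t n, size_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/* Offset of the mask inside a value/mask buffer for a field that is
 * 'n_bytes' wide.  The value occupies [0, n_bytes) and the mask starts at
 * the next aligned offset instead of immediately after the value. */
size_t
sketch_mask_offset(size_t n_bytes)
{
    return sketch_round_up(n_bytes, SKETCH_ALIGNTO);
}
```

The padding between value and mask is the reason the action takes more
bytes in memory after this change.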

We also need to adapt the "ovs-ofctl show-flows - Oversized flow" test
because now the ofpact representation of the set_field action uses more
bytes in memory (for the extra alignment).  Change the test to use
dec_ttl instead.

Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-17 23:09:50 +02:00
Dumitru Ceara
3764f5188a treewide: Fix invalid bit shift operations.
UB Sanitizer reports:
  tests/test-hash.c:59:40:
  runtime error: shift exponent 64 is too large for 64-bit type
  'long unsigned int'
      0 0x44c3c9 in get_range128 tests/test-hash.c:59
      1 0x44cb2e in check_hash_bytes128 tests/test-hash.c:178
      2 0x44d14d in test_hash_main tests/test-hash.c:282
      [...]
  ofproto/ofproto-dpif-xlate.c:5607:45:
  runtime error: left shift of 65535 by 16 places cannot be represented
  in type 'int'
      0 0x53fe9f in xlate_sample_action ofproto/ofproto-dpif-xlate.c:5607
      1 0x54d625 in do_xlate_actions ofproto/ofproto-dpif-xlate.c:7160
      2 0x553b76 in xlate_actions ofproto/ofproto-dpif-xlate.c:7806
      3 0x4fcb49 in upcall_xlate ofproto/ofproto-dpif-upcall.c:1237
      4 0x4fe02f in process_upcall ofproto/ofproto-dpif-upcall.c:1456
      5 0x4fda99 in upcall_cb ofproto/ofproto-dpif-upcall.c:1358
      [...]
  tests/test-util.c:89:23:
  runtime error: left shift of 1 by 31 places cannot be represented in
  type 'int'
      0 0x476415 in test_ctz tests/test-util.c:89
      [...]
  lib/dpif-netlink.c:396:33:
  runtime error: left shift of 1 by 31 places cannot be represented in
  type 'int'
      0 0x571b9f in dpif_netlink_open lib/dpif-netlink.c:396
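
The three kinds of report above map to common C patterns.  The following
sketch shows one well-defined way to write each of them; the function
names are hypothetical and the code is not taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* "left shift of 1 by 31 places": shift an unsigned constant, because
 * 1 << 31 does not fit in a signed 32-bit int. */
uint32_t
sketch_top_bit32(void)
{
    return UINT32_C(1) << 31;
}

/* "shift exponent 64 is too large": a shift by the full width of the type
 * is undefined, so the 64-bit case is handled separately. */
uint64_t
sketch_low_bits64(unsigned int n_bits)
{
    assert(n_bits <= 64);
    return n_bits == 64 ? UINT64_MAX : (UINT64_C(1) << n_bits) - 1;
}

/* "left shift of 65535 by 16 places": a uint16_t operand is promoted to
 * int before the shift, so widen it to uint32_t first. */
uint32_t
sketch_pack_u16_pair(uint16_t hi, uint16_t lo)
{
    return ((uint32_t) hi << 16) | lo;
}
```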

Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-17 23:06:46 +02:00
Ilya Maximets
8c506d3725 ofp-monitor: Fix abort on malformed flow update event.
nx_to_ofp_flow_update_event() aborts the execution if incorrect
event is passed, so checking has to be done before conversion
in order to avoid the crash while decoding malformed flow update
message:

 ==397030==ERROR: AddressSanitizer: ABRT on unknown address 0x... )
  0 0x7fd26688418b in raise
  1 0x7fd266863858 in abort
  2 0x6a6cbd in nx_to_ofp_flow_update_event lib/ofp-monitor.c:399:9
  3 0x6a6cbd in ofputil_decode_flow_update lib/ofp-monitor.c:856:25
  4 0x56491d in ofp_print_flow_monitor_reply lib/ofp-print.c:779:22
  5 0x55f0a0 in ofp_to_string__ lib/ofp-print.c:1154:16
  6 0x55f0a0 in ofp_to_string lib/ofp-print.c:1244:21
  7 0x5603a5 in ofp_print lib/ofp-print.c:1288:28

Credit to OSS-Fuzz.

Additionally removed the extra 'reply' word from the error message,
since ofpraw_get_name(raw) already has one.

Fixes: c3e64047d1cc ("ofp-monitor: Support flow monitoring for OpenFlow 1.3, 1.4+.")
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=47112
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-05-04 18:39:24 +02:00
Vasu Dasari
c3e64047d1 ofp-monitor: Support flow monitoring for OpenFlow 1.3, 1.4+.
Extended OpenFlow monitoring support
* OpenFlow 1.3 with ONF extensions
* OpenFlow 1.4+ as defined in OpenFlow specification 1.4+.

ONF extensions are similar to Nicira extensions except for onf_flow_monitor_request{}
where out_port is defined as a 32-bit OF(1.1) port number and oxm match formats are
used in update and request messages.

Flow monitoring support in 1.4+ is slightly different from Nicira and ONF
extensions.
 * More flow monitoring flags are defined.
 * Monitor add/modify/delete command is introduced in flow_monitor
   request message.
 * Addition of out_group as part of flow_monitor request message

Description of changes:
1. Generate ofp-msgs.inc to be able to support 1.3, 1.4+ flow Monitoring messages.
    include/openvswitch/ofp-msgs.h

2. Modify openflow header files with protocol specific headers.
    include/openflow/openflow-1.3.h
    include/openflow/openflow-1.4.h

3. Modify OvS abstraction of openflow headers. ofp-monitor.h leverages enums
   from Nicira extensions for creating protocol abstraction headers. OF(1.4+)
   enums are a superset of Nicira extensions.
    include/openvswitch/ofp-monitor.h

4. Changes to these files reflect encoding and decoding of new protocol messages.
    lib/ofp-monitor.c

5. Changes to modules using ofp-monitor APIs. Most of the changes here are to
   migrate enums from nicira to OF 1.4+ versions.
    ofproto/connmgr.c
    ofproto/connmgr.h
    ofproto/ofproto-provider.h
    ofproto/ofproto.c

6. Extended protocol decoding tests to verify all protocol versions
        FLOW_MONITOR_CANCEL
        FLOW_MONITOR_PAUSED
        FLOW_MONITOR_RESUMED
        FLOW_MONITOR request
        FLOW_MONITOR reply
    tests/ofp-print.at

7. Modify flow monitoring tests so they can be executed by all protocol versions.
    tests/ofproto.at

8. Modified documentation highlighting the change
    utilities/ovs-ofctl.8.in
    NEWS

Signed-off-by: Vasu Dasari <vdasari@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/383915.html
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-28 21:27:11 +02:00
Vasu Dasari
d8ab75cd69 ofp-monitor: Extend Flow Monitoring support for OF 1.0-1.2 with Nicira Extensions.
Currently OVS supports flow-monitoring for OpenFlow 1.0 and Nicira Extensions.
Any other OpenFlow versioned messages are not accepted. This change will allow
OpenFlow 1.0-1.2 Flow Monitoring with Nicira extensions to be accepted. Also made
sure that flow-monitoring updates, flow monitoring pause messages, resume
messages are sent in the same OpenFlow version as that of flow-monitor request.

Description of changes:

1. Generate ofp-msgs.inc to be able to support 1.0-1.2 Flow Monitoring messages.
include/openvswitch/ofp-msgs.h

2. Support vconn to accept user specified version and use it for vconn
flow-monitoring session
ofproto/ofproto.c

3. Modify APIs to use protocol as an argument to encode and decode messages
include/openvswitch/ofp-monitor.h
lib/ofp-monitor.c
ofproto/connmgr.c
ofproto/connmgr.h
ofproto/ofproto.c

4. Modified following testcases to be verified across supported OF Versions
    ofproto - flow monitoring
    ofproto - flow monitoring with !own
    ofproto - flow monitoring with out_port
    ofproto - flow monitoring pause and resume
    ofproto - flow monitoring usable protocols
tests/ofproto.at

5. Updated NEWS with the support added with this commit

Signed-off-by: Vasu Dasari <vdasari@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-December/050820.html
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-28 21:26:40 +02:00
Dumitru Ceara
d94cd0d3ee ovsdb-idl: Support write-only-changed IDL monitor mode.
At a first glance, change tracking should never be allowed for
write-only columns.  However, some clients (e.g., ovn-northd) that are
mostly exclusive writers of a database, use change tracking to avoid
duplicating the IDL row records into a local cache when implementing
incremental processing.

The default behavior of the IDL is to automatically turn a write-only
column into a read-write column whenever the client enables change
tracking for that column.

For the aforementioned clients, this becomes a performance issue.
Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't
change a column's value.") explains why writes that don't change a
column's value cannot be optimized out early if the column is
read/write.

Furthermore, if there is at least one record in any table that
changed during a transaction, then *all* records that have been
written are added to the transaction, even if their values didn't
change.  If there are many such rows (e.g., like in ovn-northd's
case) this incurs a significant overhead because:
a. the client has to build this large transaction
b. the transaction has to be sent over the network
c. the server needs to parse this (mostly) no-op update

We now introduce new IDL APIs allowing users to set a new monitoring
mode flag, OVSDB_IDL_WRITE_CHANGED_ONLY, to indicate to the IDL that the
atomicity constraints may be relaxed and written columns that don't
change value can be skipped from the current transaction.

We benchmarked ovn-northd performance when using this new mode
against NB and SB databases taken from ovn-kubernetes scale tests.
We noticed that when a minor change is performed to the Northbound
database (e.g., NB_Global.nb_cfg is incremented) the time it takes to
build the Southbound transaction becomes negligible (vs ~1.5 seconds
before this change).

End-to-end ovn-kubernetes scale tests on 120-node clusters also show
significant reduction of latency to bring up pods; both average and P99
latency decreased by ~30%.

Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-28 16:57:43 +02:00
Thilak Raj Surendra Babu
c1c8cb8a18 ofproto-dpif-xlate: Clear out vlan flow fields while processing native tunnel.
When a packet is received over an access port that needs to be sent
over a vxlan tunnel, the access port VLAN id is used in the lookup
leading to a wrong packet being crafted and sent over the tunnel.
Clear out the flow's VLAN field as it should not be used while
performing mac lookup for the outer tunnel and also at this point
the VLAN action related to inner flow is already committed.

Fixes: 7c12dfc527a5 ("tunneling: Avoid datapath-recirc by combining recirc actions at xlate.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393566.html
Reported-at: https://bugzilla.redhat.com/2060552
Signed-off-by: Thilak Raj Surendra Babu <thilakraj.sb@nutanix.com>
Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com>
Co-authored-by: Rosemarie O'Riorden <roriorden@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-27 21:38:21 +02:00
Jan Scheurich
0e0eef533f ofproto-xlate: Fix crash when forwarding packet between legacy_l3 tunnels.
A packet received from a tunnel port with legacy_l3 packet-type (e.g.
lisp, L3 gre, gtpu) is conceptually wrapped in a dummy Ethernet header
for processing in an OF pipeline that is not packet-type-aware. Before
transmission of the packet to another legacy_l3 tunnel port, the dummy
Ethernet header is stripped again.

In ofproto-xlate, wrapping in the dummy Ethernet header is done by
simply changing the packet_type to PT_ETH. The generation of the
push_eth datapath action is deferred until the packet's flow changes
need to be committed, for example at output to a normal port. The
deferred Ethernet encapsulation is marked in the pending_encap flag.

This patch fixes a bug in the translation of the output action to a
legacy_l3 tunnel port, where the packet_type of the flow is reverted
from PT_ETH to PT_IPV4 or PT_IPV6 (depending on the dl_type) to remove
its Ethernet header without clearing the pending_encap flag if it was
set. At the subsequent commit of the flow changes, the unexpected
combination of pending_encap == true with an PT_IPV4 or PT_IPV6
packet_type hit the OVS_NOT_REACHED() abortion clause.

The pending_encap is now cleared in this situation.

Reported-by: Dincer Beken <dbeken@blackned.de>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Dincer Beken <dbeken@blackned.de>
Signed-off-by: Dincer Beken <dbeken@blackned.de>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-27 00:32:38 +02:00
Paolo Valerio
4ea1bb6391 system-traffic: Fix fragment reassembly with L3 L4 protocol information.
The test relied on the flows installed by recv_upcalls() after
upcall_receive() returned ENODEV if the packet was initially
originated by packet-out with OFPP_CONTROLLER as in_port.

Since 323ae1e808e6 ("ofproto-dpif-xlate: Fix recirculation when in_port is OFPP_CONTROLLER.")
the test stopped working because recirculation in such scenario got
fixed and upcall_receive() no longer returns ENODEV.

Fix it by setting an invalid "in_port" in order to similarly
trigger the same behavior.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-26 21:55:45 +02:00
Kevin Traynor
3b18b8656b alb.at: Add tests for cross-numa polling.
PMD auto load balance currently only operates when the polling PMD
thread core will not change numa after reassignment.

Add a unit test for this.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-04 22:52:12 +02:00
Kevin Traynor
cdc9a196b1 pmd.at: Add tests for multi non-local numa pmds.
Ensure that if there are no local numa PMD thread
cores available that pmd cores from all other non-local
numas will be used.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-04 22:52:12 +02:00
Paolo Valerio
0027b3b46c ofproto-dpif-xlate: Fix NULL pointer dereference in xlate_normal().
Considering the following flows:

ovs-ofctl dump-flows br0
 cookie=0x0, table=0, priority=0 actions=NORMAL

and assuming a packet originated from packet-out in this way:

ovs-ofctl packet-out br0 \
    "in_port=controller,packet=<UDP packet>,action=ct(table=0)"

If in_port is OFPP_NONE or OFPP_CONTROLLER, this leads to a
NULL pointer (xport) dereference in xlate_normal().

Fix it by checking the xport pointer validity while deciding whether
it is a candidate for mac learning or not.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-04-04 21:29:37 +02:00
Adrian Moreno
745c80f52c hindex: remove the next variable in safe loops.
Using SHORT version of the *_SAFE loops makes the code cleaner and less
error prone. So, use the SHORT version and remove the extra variable
when possible for HINDEX_*_SAFE.

In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.
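
Selecting a macro variant by argument count can be sketched in C as
follows.  This is a simplified illustration on a plain singly linked list,
not the OVS hindex macros; every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;
    int value;
};

/* LONG form: the caller provides the NEXT variable. */
#define NODE_FOR_EACH_SAFE_LONG(ITER, NEXT, HEAD)                     \
    for ((ITER) = (HEAD);                                             \
         (ITER) != NULL && ((NEXT) = (ITER)->next, 1);                \
         (ITER) = (NEXT))

/* SHORT form: the saved next pointer is scoped to the loop itself. */
#define NODE_FOR_EACH_SAFE_SHORT(ITER, HEAD)                          \
    for (struct node *safe_next_ = ((ITER) = (HEAD),                  \
                                    (ITER) ? (ITER)->next : NULL);    \
         (ITER) != NULL;                                              \
         (ITER) = safe_next_,                                         \
         safe_next_ = (ITER) ? (ITER)->next : NULL)

/* Expands to 2 or 3, matching the number of arguments passed. */
#define COUNT_ARGS_PICK(A1, A2, A3, N, ...) N
#define COUNT_ARGS(...) COUNT_ARGS_PICK(__VA_ARGS__, 3, 2, 1, 0)
#define PASTE_(A, B) A##B
#define PASTE(A, B) PASTE_(A, B)

#define NODE_FOR_EACH_SAFE2(ITER, HEAD) NODE_FOR_EACH_SAFE_SHORT(ITER, HEAD)
#define NODE_FOR_EACH_SAFE3(ITER, NEXT, HEAD) \
    NODE_FOR_EACH_SAFE_LONG(ITER, NEXT, HEAD)
/* One public name; the argument count selects the variant. */
#define NODE_FOR_EACH_SAFE(...) \
    PASTE(NODE_FOR_EACH_SAFE, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)

/* Unlinks every node while iterating and returns the sum of the values.
 * Clearing 'next' in the body is safe because it was saved beforehand. */
int
sum_and_unlink_short(struct node *head)
{
    struct node *n;
    int sum = 0;

    NODE_FOR_EACH_SAFE (n, head) {
        sum += n->value;
        n->next = NULL;
    }
    return sum;
}

/* Same loop written with the LONG form and an explicit 'next' variable. */
int
sum_and_unlink_long(struct node *head)
{
    struct node *n, *next;
    int sum = 0;

    NODE_FOR_EACH_SAFE (n, next, head) {
        sum += n->value;
        n->next = NULL;
    }
    assert(n == NULL);          /* The loop only ends once 'n' is NULL. */
    return sum;
}
```

Existing three-argument callers keep compiling unchanged, while new code
can drop the extra variable.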

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:03 +02:00
Adrian Moreno
2d40277382 hindex: use multi-variable iterators.
Re-write hindex's loops using multi-variable helpers.

For safe loops, use the LONG version to maintain backwards
compatibility.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:03 +02:00
Adrian Moreno
ef39616486 cmap: use multi-variable iterators.
Re-write cmap's loops using multi-variable helpers.

For iterators based on cmap_cursor, we just need to make sure the NODE
variable is not used after the loop, so we set it to NULL.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Adrian Moreno
9e56549c2b hmap: use short version of safe loops if possible.
Using SHORT version of the *_SAFE loops makes the code cleaner and less
error prone. So, use the SHORT version and remove the extra variable
when possible for hmap and all its derived types.

In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Adrian Moreno
860e69a8c3 hmap: implement UB-safe hmap pop iterator.
HMAP_FOR_EACH_POP iterator has an additional difficulty, which is the
use of two iterator variables of different types.

In order to re-write this loop in a UB-safe manner, create an iterator
struct to be used as loop variable.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Adrian Moreno
9e8d960a6b hmap: use multi-variable helpers for hmap loops.
Rewrite hmap's loops using multi-variable helpers.

For SAFE loops, use the LONG version of the multi-variable macros to
keep backwards compatibility.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Adrian Moreno
e9bf5bffb0 list: use short version of safe loops if possible.
Using the SHORT version of the *_SAFE loops makes the code cleaner
and less error-prone. So, use the SHORT version and remove the extra
variable when possible.

In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Adrian Moreno
d4566085ed list: use multi-variable helpers for list loops.
Use multi-variable iteration helpers to rewrite non-safe loops.

There is an important behavior change compared with the previous
implementation: When the loop ends normally (i.e., not via "break;"), the
object pointer provided by the user is NULL. This is safer because it's
not guaranteed that it would end up pointing to a valid address.

For pop iterator, set the variable to NULL when the loop ends.

Clang-analyzer has successfully picked the potential null-pointer
dereference on the code that triggered this change (bond.c) and nothing
else has been detected.

For _SAFE loops, use the LONG version for backwards compatibility.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Ilya Maximets
9d86459516 system-traffic.at: Fix flaky DNAT load balancing test.
'conntrack - DNAT load balancing' test fails from time to time
because not all the group buckets are getting hit.

In short, the test creates a group with 3 buckets with the same
weight.  It creates 12 TCP sessions and expects that each bucket
will be used at least once.  However, there is a solid chance that
this will not happen.  The probability of having at least one
empty bucket is:

  C(3, 1) x (2/3)^N - C(3, 2) x (1/3)^N

Where N is the number of distinct TCP sessions.  For N=12, the
probability is about 0.023, i.e. there is a 2.3% chance for a
test to fail, which is not great for CI.

Increasing the number of sessions to 50 to reduce the probability
of failure down to 4.7e-9.  In my testing, the configuration with
50 TCP sessions didn't fail after 6000 runs.  Should be good
enough for CI systems.

Fixes: 2c66ebe47a88 ("ofp-actions: Allow conntrack action in group buckets.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Michael Phelan <michael.phelan@intel.com>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-25 20:30:37 +01:00
Ilya Maximets
2e2217c126 tests: Fix incorrect usage of OVS_WAIT_UNTIL.
OVS_WAIT_UNTIL() macro has only 2 arguments and doesn't check the
output of the command, but bonding and route tests are trying to use
it as if it was AT_CHECK macro.  That makes checks in those tests
mostly useless, since they are not actually checking anything except
for command returning zero.

Introducing a new macro OVS_WAIT_UNTIL_EQUAL that will actually
perform the comparison with the desired output.  Using it for
the bonding and route tests and fixing all the caught incorrect
expected outputs along the way.

Adding an explicit argument check to the OVS_WAIT_UNTIL/WHILE to
avoid the problem in the future.

Fixes: b4e50218a0f8 ("bond: Add 'primary' interface concept for active-backup mode.")
Fixes: 9e11517e6ca6 ("ovs-router: Fix flushing of local routes.")
Acked-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-22 22:09:34 +01:00
Eelco Chaudron
31b467a751 odp-util: Fix output for tc to be equal to kernel.
When the same flow is programmed in the kernel and tc, they
look different due to the way they are translated. They take
the userspace approach by always including the packet type
attribute. To make the outputs the same, show the ethernet
header when the packet type is wildcarded, and not printed.

So without the fix the kernel would show (ovs-appctl dpctl/dump-flows):

  in_port(3),eth(),eth_type(0x0800),ipv4(frag=no), ..., actions:output

Whereas TC would show:

  in_port(3),eth_type(0x0800),ipv4(frag=no), ..., actions:output

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-21 00:37:41 +01:00
Kumar Amber
c44876b9e4 system-dpdk: Fix mfex autovalidator tests.
AVX512 DPIF must be active in order for the MFEX AutoValidator to be executed.
If the DPIF-AVX512 is not available, the unit test is skipped, as the
scalar DPIF does not use the MFEX function-pointer based optimizations.

Fixes: 50be6715c083 ("test/sytem-dpdk: Add unit test for mfex autovalidator")
Suggested-by: Cian Ferriter <cian.ferriter@intel.com>
Acked-by: Cian Ferriter <cian.ferriter@intel.com>
Signed-off-by: Kumar Amber <kumar.amber@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-11 21:17:48 +01:00
Dumitru Ceara
b1e783dde4 tests: Ignore log about failing to set NETLINK_EXT_ACK.
Since 4a6a4734622e ("netlink-socket: Log extack error messages in
netlink transactions."), tests fail on older systems that don't support
NETLINK_EXT_ACK.  It's not really an issue, so we can just ignore the
log.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-11 21:14:51 +01:00
Ilya Maximets
8d480c5cec ovsdb-cluster.at: Avoid test failures due to different hashing.
Depending on compiler flags and CPU architecture, different hash
functions are used.  That impacts the order of tables and columns
in database representation making ovsdb report different columns
in the warning about ephemeral-to-persistent conversion.

Stripping out changing parts of the message to avoid the issue.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-11 21:13:33 +01:00
Flavio Leitner
7ed60839d0 system-tso: Skip encap tests when userspace TSO is enabled.
It seems Linux native tunnel configuration changed to enable
checksum by default and that causes the check-system-tso unit
test below to fail:
 10: datapath - ping over vxlan tunnel    FAILED (system-traffic.at:248)

That happens because userspace TSO doesn't support encapsulation
as mentioned in the current documentation. In this specific case,
udp_extract_tnl_md() checks if the checksum is correct, but since
TSO is enabled, the outer UDP header contains only the pseudo
checksum and not the full packet checksum.

Although the packet is marked correctly with UDP csum offload flag
and the code could use that to verify the pseudo csum, more work
is needed to properly translate the offloading flags from the outer
headers to the inner headers.  For example, if the payload is a
TCP packet, most probably the flag DP_PACKET_OL_TX_UDP_CKSUM doesn't
make sense after decapsulating that.

This patch skips the tunnel tests when the userspace TSO is enabled.

Fixes: 29bb3093eb8b ("userspace: Enable TSO support for non-DPDK.")
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-04 19:51:35 +01:00
Ilya Maximets
a3e97b1af1 ovsdb: relay: Add transaction history support.
Even though relays can be scaled to a large number of servers to
handle a lot more clients, lack of transaction history may cause
significant load if clients are re-connecting.
E.g. in case of the upgrade of a large-scale OVN deployment, relays
can be taken down one by one forcing all the clients of one relay to
jump to other ones.  And all these clients will download the database
from scratch from a new relay.

Since relay itself supports monitor_cond_since connection to the
main cluster, it receives the last transaction id along with each
update.  Since these transaction ids are 'eid's of actual transactions,
they can be used by relay for a transaction history.

Relay may not receive all the transaction ids, because the main cluster
may combine several changes into a single monitor update.  However,
all relays will likely receive the same updates with the same transaction
ids, so the case where transaction id can not be found after
re-connection between relays should not be very common.  If some id
is missing on the relay (i.e. this update was merged with some other
update and newer id was used) the client will just re-download the
database as if there was a normal transaction history miss.

OVSDB client synchronization module updated to provide the last
transaction id along with the update.  Relay module updated to use
these ids as a transaction id.  If ids are zero, relay decides that
the main server doesn't support transaction ids and disables the
transaction history accordingly.

Using ovsdb_txn_replay_commit() instead of ovsdb_txn_propose_commit_block(),
so transactions are added to the history.  This can be done, because
relays have no file storage, so there is no need to write anything.

Relay tests modified to test both standalone and clustered database
as a main server.  Checks added to ensure that all servers receive the
same transaction ids in monitor updates.

Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-03 15:21:21 +01:00
Ilya Maximets
999ba294fb ovsdb: raft: Fix inability to join the cluster after interrupted attempt.
If the joining server re-connects while catching up (e.g. if it crashed
or connection got closed due to inactivity), the data we sent might be
lost, so the server will never reply to append request or a snapshot
installation request.  At the same time, leader will decline all the
subsequent requests to join from that server with the 'in progress'
resolution.  At this point the new server will never be able to join
the cluster, because it will never receive the raft log while leader
thinks that it was already sent.

This happened in practice when one of the servers got preempted for a
few seconds, so the leader closed connection due to inactivity.

Destroying the joining server if a disconnection is detected.  This
allows the join to start from scratch when the server re-connects
and sends a new join request.

We can't track re-connection in the raft_conn_run(), because it's
an incoming connection and the jsonrpc will not keep it alive or
try to reconnect.  Next time the server re-connects it will be an
entirely new raft conn.

Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Reported-at: https://bugzilla.redhat.com/2033514
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
2022-02-25 14:15:12 +01:00
Ilya Maximets
6de8868d19 reconnect: Fix broken inactivity probe if there is no other reason to wake up.
The purpose of reconnect_deadline__() function is twofold:

1. Its result is used to tell if the state has to be changed right now
   in reconnect_run().
2. Its result is also used to determine when the process needs to wake
   up and call reconnect_run() the next time, i.e. when the state may
   need to be changed next.

Since the introduction of the 'receive-attempted' feature, the function
returns LLONG_MAX if the deadline is in the future.  That works for
the first case, but doesn't for the second one, because we don't
really know when we need to call reconnect_run().

This is a problem for applications where the jsonrpc connection is the
only source of wake ups, e.g. ovn-northd.  When the network goes down
silently, e.g. server loses its IP address due to DHCP failure, ovn-northd
will sleep in the poll loop indefinitely after being told that it
doesn't need to call reconnect_run() (deadline == LLONG_MAX).

Fixing that by actually returning the expected time if it is in the
future, so we will know when to wake up.  In order to keep the
'receive-attempted' feature, returning 'now + 1' in case where the
time has already passed, but receive wasn't attempted.  That will
trigger a fast wake up, so the application will be able to attempt the
receive even if there were no real events.  In a correctly written
application we should not fall into this case more than once in a row.
'+ 1' ensures that we will not transition into a different state
prematurely, i.e. before the receive is actually attempted.

Fixes: 4241d652e465 ("jsonrpc: Avoid disconnecting prematurely due to long poll intervals.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-02-24 17:04:32 +01:00
Paolo Valerio
989895501c system-traffic.at: Avoid sporadic failures during conntrack IPv6 HTTP/FTP tests.
Sporadic false positives may be visible for the following tests:

- conntrack - IPv6 HTTP
- conntrack - FTP over IPv6

The failures show up randomly.
The reason appears to be the source address used when performing the
request using wget:
-tcp,orig=(src=fc00::1,dst=fc00::2,sport=<cleared>,dport=<cleared>),\
    reply=(src=fc00::2,dst=fc00::1,sport=<cleared>,dport=<cleared>),\
    protoinfo=(state=<cleared>)
+tcp,orig=(src=fe80::f0eb:f8ff:fef0:138f,dst=fc00::2,sport=<cleared>,dport=<cleared>),\
    reply=(src=fc00::2,dst=fe80::f0eb:f8ff:fef0:138f,sport=<cleared>,dport=<cleared>),\
    protoinfo=(state=<cleared>)

It seems that the problem can be addressed in multiple ways, but using
"nodad" seems to be safe enough to fix the issue that now, after
hundreds of attempts, is no longer present.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-02-14 21:38:02 +01:00
Paolo Valerio
e969370d30 system-traffic.at: Do not use ranges with broadcast address.
Turn a bunch of test ranges from 10.1.1.240-10.1.1.255 to
10.1.1.240-10.1.1.254. 10.1.1.255 is the broadcast address for
10.1.1.0/24 and can be picked to SNAT packets causing the subsequent
failure of the test.

Fixes: 9ac0aadab9f9 ("conntrack: Add support for NAT.")
Fixes: e32cd4c6292e ("conntrack: ignore port for ICMP/ICMPv6 NAT.")
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-02-14 20:49:39 +01:00
Kumar Amber
b9cf520705 system-dpdk.at: Add warning log in mfex fuzzy test.
Some specific warnings are seen on various systems but may not be
visible on others.  It is good to add such logs to the test to avoid
test-case failures.

The warning only affects the fuzzy tests, because more than 1000
flows are being offloaded simultaneously.

Wildcarding the flow count, since the number in the warning log can
vary between the systems under test.

Suggested-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Kumar Amber <kumar.amber@intel.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Cian Ferriter <cian.ferriter@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2022-02-11 11:42:17 +00:00
Adrian Moreno
f0a9000ca6 ofproto: Fix ipfix not always sampling on egress.
We are currently requiring in_port to be a valid port number for ipfix
sampling even if the sampling is done on the output port (egress).

This restriction is not really needed and can affect pipelines that
intentionally set the in_port to OFPP_NONE during flow processing. For
instance, OVN does this, see:

cfa547821 Fix ovn-controller generated packets from getting dropped for
reject ACL action.

This patch skips ipfix sampling only if both (ingress and egress) ports
are invalid.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2016346
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-02-09 16:02:49 +01:00
Martin Varghese
712202ff7d ofproto-dpif-xlate: Fix packet drops with decap action on MPLS Multicast.
Added PT_MPLS_MC support in function xlate_generic_decap_action to fix
packet drops when decap action is applied on packets with packet_type
(ns=1,type=0x8848).

Fixes: 1917ace89364 ("Encap & Decap actions for MPLS packet type.")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-31 21:27:06 +01:00
Martin Varghese
3ae3e86059 tests: Fix cosmetic errors in system-traffic.at.
Removed extra lines in multiple encap decap mpls actions &
encap decap mpls actions tests.

Converted title of encap decap mpls actions tests to lowercase
for consistency.

Fixes: 1917ace89364 ("Encap & Decap actions for MPLS packet type.")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-31 21:27:06 +01:00
Dumitru Ceara
c1691cceac ovsdb-cs: Clear last_id on reconnect if condition changes in-flight.
When reconnecting, if there are condition changes already sent to the
server but not yet acked, reset the db's 'last-id', essentially clearing
the local cache after reconnect.

This is needed because the client cannot easily differentiate between
the following cases:
a. either the server already processed the requested monitor
   condition change but the FSM was restarted before the
   client was notified.  In this case the client should
   clear its local cache because it's out of sync with the
   monitor view on the server side.
b. OR the server hasn't processed the requested monitor
   condition change yet.

Conditions changing at the same time with a reconnection happening are
rare so the performance impact of this patch should be minimal.

Also, the tests are updated to cover the fact that we cannot control
which of the two scenarios ("a" and "b" above) are hit during the test.

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-31 21:23:47 +01:00
Ilya Maximets
9632f5551f tests: Add de-serialization check to the json string benchmark.
Since we're testing serialization, it also makes sense to test
the opposite operation.  Should be useful in the future for
exploring possible optimizations.

CMD: $ ./tests/ovstest json-string-benchmark

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
2022-01-31 21:15:25 +01:00