mirror of https://github.com/openvswitch/ovs synced 2025-08-29 05:18:13 +00:00

18449 Commits

Author SHA1 Message Date
Greg Rose
acb46c58a0 doc: Deprecate building Linux kernel module from OVS source tree.
It is decided (1) to deprecate building the Linux kernel module
from the Open vSwitch source tree.

Update the NEWS and FAQ to provide notice.

1. https://mail.openvswitch.org/pipermail/ovs-dev/2020-December/378831.html

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 19:32:49 +01:00
Ilya Maximets
2ccd66f594 ovsdb: Use column diffs for ovsdb and raft log entries.
Currently, ovsdb-server stores the complete value of a column in the
database file and in the raft log whenever that column changes.  This
means that a transaction that adds, for example, one new acl to a port
group creates a log entry with the UUIDs of all existing acls plus the
new one.  The same applies to ports in logical switches and routers and
to many other columns with sets in the Northbound DB.

There could be thousands of acls in one port group or thousands of ports
in a single logical switch.  The typical use case is to add one new
entry when starting a new service/VM/container or adding a new node to a
kubernetes or OpenStack cluster.  This generates a huge amount of traffic
within the ovsdb raft cluster, increases overall memory consumption and
hurts performance, since all these UUIDs are parsed and formatted
to/from json several times and stored on disk.  The more values a set
holds, the more space a single log entry occupies and the more time
ovsdb-server cluster members need to process it.

Simple test:

1. Start OVN sandbox with clustered DBs:
   # make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'

2. Run a script that creates one port group and adds 4000 acls into it:
   # cat ../memory-test.sh
   pg_name=my_port_group
   export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
   ovn-nbctl pg-add $pg_name
   for i in $(seq 1 4000); do
     echo "Iteration: $i"
     ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
   done
   ovn-nbctl acl-del $pg_name
   ovn-nbctl pg-del $pg_name
   ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
   ovn-appctl -t ovn-nbctl exit
   ---

3. Check the current memory consumption of ovsdb-server processes and
   space occupied by database files:
   # ls sandbox/[ns]b*.db -alh
   # ps -eo vsz,rss,comm,cmd | egrep '=[ns]b[123].pid'

Test results with current ovsdb log format:

   On-disk Nb DB size     :  ~369 MB
   RSS of Nb ovsdb-servers:  ~2.7 GB
   Time to finish the test:  ~2m

In order to mitigate memory consumption issues and reduce computational
load on ovsdb-servers let's store diff between old and new values
instead.  This will make size of each log entry that adds single acl to
port group (or port to logical switch or anything else like that) very
small and independent from the number of already existing acls (ports,
etc.).

Added a new marker '_is_diff' into a file transaction to specify that
this transaction contains diffs instead of replacements for the existing
data.

One side effect is that this change will actually increase the size of
a file transaction that removes more than half of the entries from a
set, because the diff will be larger than the resulting new value.
However, such operations are rare.
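
As a rough illustration of why a diff-based entry stays small, and when it
does not, the sketch below models a set column as a Python set and the diff
as the symmetric difference between the old and new values.  Treating the
on-disk diff as a symmetric difference is an assumption made for this
example, and the helper names make_diff() and apply_diff() are hypothetical;
this is not the ovsdb-server code.

```python
def make_diff(old, new):
    """Return the elements that must be toggled to turn 'old' into 'new'."""
    return old ^ new


def apply_diff(old, diff):
    """Toggling the same elements again reconstructs the new value."""
    return old ^ diff


acls = {f"acl-{i}" for i in range(4000)}

# Adding one ACL: the diff holds a single element instead of 4001.
added = acls | {"acl-new"}
add_diff = make_diff(acls, added)

# Removing most of the set: the diff (3000 elements) is larger than
# the new value (1000 elements), which is the side effect noted above.
kept = {f"acl-{i}" for i in range(1000)}
remove_diff = make_diff(acls, kept)

print(len(add_diff), len(remove_diff), len(kept))
```

The size of the diff depends only on how many elements change, not on how
many already exist, which is the property the commit relies on.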

Test results with change applied:

   On-disk Nb DB size     :  ~2.7 MB  ---> reduced by 99%
   RSS of Nb ovsdb-servers:  ~580 MB  ---> reduced by 78%
   Time to finish the test:  ~1m27s   ---> reduced by 27%

After this change, a new ovsdb-server is still able to read old databases,
but an old ovsdb-server will not be able to read new ones.
Since new servers could join an ovsdb cluster dynamically, it's hard to
implement any runtime mechanism to handle cases where different
versions of ovsdb-server join the cluster.  However, we still need to
handle cluster upgrades.  For this case, a special command line
argument is added to disable the new functionality.  The documentation
is updated with the recommended way to upgrade the ovsdb cluster.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 19:23:02 +01:00
Ilya Maximets
980bca7079 AUTHORS: Add Yalei Li.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 19:18:47 +01:00
Eli Britstein
ac661e6280 netdev-offload-dpdk: Implement flow flush.
Remove all the rules for the specified netdev.

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 19:01:00 +01:00
Eli Britstein
95e266f788 netdev-offload-dpdk: Refactor disassociate and flow destroy.
Refactor disassociation to be removed from flow destroy, and to use
already found object instead of re-searching it.

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 19:01:00 +01:00
Eli Britstein
d131664baf netdev-offload-dpdk: Keep netdev in offload object.
Keep the netdev of the offload rule as a field in the offload object as
a pre-step towards supporting flushing of the offload rules.

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 19:01:00 +01:00
Eli Britstein
62d1c28e9c dpif-netdev: Flush offload rules upon port deletion.
When a port is deleted, flow deletion requests are posted, and the netdev
is removed from offload netdevs map. Following flow deletion handling may
be done after the netdev has already been removed from the offload
netdevs map, so the HW rule is not removed and the data object is not
freed (memory leak). Flush offload rules upon port deletion, and disable
pending handling of offloads to fix it.

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 19:01:00 +01:00
Kevin Traynor
1178df2d18 AUTHORS: Add Christophe Fontaine.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 18:52:16 +01:00
Kevin Traynor
30de755202 dpif-netdev: Add PMD auto load balance status log.
When any PMD auto load balance parameters change, it is useful
to also log if the feature is enabled or disabled.

|dpif_netdev|INFO|PMD auto load balance load threshold changed to 70%
|dpif_netdev|INFO|PMD auto load balance is disabled

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 18:52:16 +01:00
Christophe Fontaine
62ab5594c2 dpif-netdev: Add parameters to configure PMD auto load balance.
Two important parts of how PMD auto load balance operates are how
loaded a core needs to be and how much improvement is estimated
before a PMD auto load balance can trigger.

Previously they were hardcoded to 95% loaded and 25% variance
improvement.

These default values may not be suitable for all use cases and
we may want to use a more (or less) aggressive rebalance, either
on the pmd load threshold or on the minimum variance improvement
threshold.

The defaults are not changed, but "pmd-auto-lb-load-threshold" and
"pmd-auto-lb-improvement-threshold" parameters are added to override
the defaults.

$ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-load-threshold="70"
$ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-improvement-threshold="20"
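
The interaction of the two thresholds can be sketched as follows.  This is
a simplified model, not the dpif-netdev implementation: the function name
should_rebalance() is hypothetical, loads are given as percentages per
core, and it is assumed that a single core at or above the load threshold
is enough to consider a rebalance.

```python
from statistics import pvariance


def should_rebalance(current_loads, estimated_loads,
                     load_threshold=95, improvement_threshold=25):
    """Decide whether a rebalance looks worthwhile (loads in percent)."""
    # Assumption: at least one core must reach the load threshold.
    if max(current_loads) < load_threshold:
        return False

    cur_var = pvariance(current_loads)
    if cur_var == 0:
        # Load is already perfectly even; nothing to improve.
        return False

    # Estimated variance improvement, in percent of the current variance.
    est_var = pvariance(estimated_loads)
    improvement = (cur_var - est_var) * 100 / cur_var
    return improvement >= improvement_threshold


# With the defaults, an 80% loaded core does not trigger a rebalance,
# but lowering the load threshold to 70 does.
print(should_rebalance([80, 10], [45, 45]))
print(should_rebalance([80, 10], [45, 45], load_threshold=70))
```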

Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
Co-Authored-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 18:51:27 +01:00
Kevin Traynor
e4db0b69e2 dpif-netdev: Add log for PMD auto load balance interval parameter.
Previously if the parameter for the PMD auto load balance minimum
interval was changed at runtime, it was not logged unless the
PMD auto load balance feature was also changed to enabled.

Log the parameter anytime it changes, and use minutes when it is
logged as that is the user input format.

Fixes: 5bf84282482a ("Adding support for PMD auto load balancing")
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-15 18:38:47 +01:00
Ilya Maximets
32c8c10d94 acinclude: Strip out -mno-avx512f provided by DPDK.
DPDK forces '-mno-avx512f' flag for the application if the toolchain
used to build DPDK had broken AVX512 support.  But OVS could be built
with a completely different or fixed toolchain with correct avx512
support.  In this case OVS will detect that toolchain is good and will
try to build AVX512-optimized classifier.  However, '-mno-avx512f'
flag will be passed from the DPDK side breaking the build:

  In file included from /gcc/x86_64-linux-gnu/8/include/immintrin.h:55,
                 from /gcc/x86_64-linux-gnu/8/include/x86intrin.h:48,
                 from /dpdk/../x86_64-linux-gnu/dpdk/rte_vect.h:28,
                 from /dpdk/../x86_64-linux-gnu/dpdk/rte_memcpy.h:17,
                 from /dpdk/rte_mempool.h:51,
                 from /dpdk/rte_mbuf.h:38,
                 from ../lib/dp-packet.h:25,
                 from ../lib/dpif.h:380,
                 from ../lib/dpif-netdev.h:23,
                 from ../lib/dpif-netdev-lookup-avx512-gather.c:22:
  /usr/lib/gcc/x86_64-linux-gnu/8/include/avx512bwintrin.h:413:1: error:
     inlining failed in call to always_inline '_mm512_sad_epu8':
     target specific option mismatch
   _mm512_sad_epu8 (__m512i __A, __m512i __B)

Fix that by stripping out `-mno-avx512f` as we already do for '-march'.
This will allow the OVS to decide if the AVX512 can be used.

Reordering of CFLAGS (i.e. adding DPDK flags before OVS ones) is not an
option since autotools might reorder them back later and it's very
unpredictable.
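
The idea of filtering the flags, rather than reordering them, can be
illustrated with a short sketch.  The real change lives in the autoconf
macros; the Python function below, strip_dpdk_cflags(), is a hypothetical
stand-in that only shows the intended effect on a flag string.

```python
def strip_dpdk_cflags(cflags):
    """Drop flags that OVS wants to decide on its own."""
    dropped_exact = {"-mno-avx512f"}
    dropped_prefixes = ("-march=",)
    return [flag for flag in cflags.split()
            if flag not in dropped_exact
            and not flag.startswith(dropped_prefixes)]


# Example flag string; the values are made up for illustration.
dpdk_cflags = "-I/usr/include/dpdk -march=corei7 -mno-avx512f -O2"
print(strip_dpdk_cflags(dpdk_cflags))
```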

Reported-at: https://github.com/openvswitch/ovs-issues/issues/201
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2021-01-15 11:27:19 +00:00
Emma Finn
584bacb52d Revert "netdev-offload-dpdk: Fix for broken ethernet matching HWOL for XL710NIC."
Removing temporary patch - 023f257 (netdev-offload-dpdk: Fix for broken
ethernet matching HWOL for XL710NIC).
Ethernet pattern is now being set correctly within the i40e PMD.

Signed-off-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2021-01-15 10:24:20 +00:00
Aaron Conole
78e712c0b1 lldp: do not leak memory on multiple instances of TLVs
Upstream commit:
    commit a8d3c90feca548fc0656d95b5d278713db86ff61
    Date: Tue, 17 Nov 2020 09:28:17 -0500

    lldp: avoid memory leak from bad packets

    A packet that contains multiple instances of certain TLVs will cause
    lldpd to continually allocate memory and leak the old memory.  As an
    example, multiple instances of system name TLV will cause old values
    to be dropped by the decoding routine.

    Reported-at: https://github.com/openvswitch/ovs/pull/337
    Reported-by: Jonas Rudloff <jonas.t.rudloff@gmail.com>
    Signed-off-by: Aaron Conole <aconole@redhat.com>

Vulnerability: CVE-2020-27827
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-13 16:54:02 +01:00
Peng He
81158b920f ipf: Avoid accessing to a freed rp.
If there are multiple packets in the batch, the loop will access a
freed rp, which causes OVS to crash.

Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Acked-by: Mark Gray <mark.d.gray@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-13 16:05:21 +01:00
Yalei Li
bdd58e62ea rhel: Fix libunwind dev package.
There is no unwind-devel package, only libunwind-devel package is found.
No error is reported with libunwind-devel during compilation.

Fixes: 7e0c91eb0714 ("debian and rhel: Add libunwind dev package.")
Signed-off-by: Yalei Li <liyl43@chinatelecom.cn>
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-13 15:54:43 +01:00
Flavio Leitner
7f79ae2fb6 Documentation: Simplify the website main page.
The initial website page is difficult to read because of
the large amount of links from different parts of the whole
documentation. Most of all those links come from their
index page referenced in the section 'Contents' on the side.

Another issue is that because the page is static, new links
might not get included.

This patch simplifies the main page by highlighting the project
level documentation. The static part is reduced to the main
level index pages.

All the links are available by clicking on 'Full Table of
Contents' at the end of Documentation section.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2021-01-07 12:13:33 -08:00
Ben Pfaff
fcf281b0b6 reconnect: Add Python implementation of received_attempt(), and test.
This follows up on commit 4241d652e465 ("jsonrpc: Avoid disconnecting
prematurely due to long poll intervals."), which implemented the same
thing in C.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Requested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-07 10:58:53 -08:00
Ilya Maximets
98b1d633c4 dpctl: Fix broken Windows build due to missing strndup.
AppVeyor reports:

  lib/dpctl.c(1433): error C4013: 'strndup' undefined;
                                  assuming extern returning int
  make[2]: *** [lib/dpctl.lo] Error 1

Replacing missing 'strndup' with a portable pair of functions.

Fixes: bf8812cd7e20 ("dpctl: Add add/mod/del-flows command.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-06 19:07:08 +01:00
Mark Gray
fe5ff26a49 ovs-monitor-ipsec: Add option to not restart IKE daemon.
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-06 12:04:45 +01:00
Mark Gray
aa8bed0986 ovs-monitor-ipsec: Allow exit of ipsec daemon maintaining state.
When 'ovs-monitor-ipsec' exits, it clears all persistent state (i.e.
active ipsec connections, /etc/ipsec.conf, certs/keys). In some
use-cases, we may want to exit and maintain state so that ipsec
connectivity is maintained. One example of this is during an
upgrade. This will require the caller to clear this persistent
state when appropriate (e.g. before 'ovs-monitor-ipsec' is restarted).

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-06 12:03:04 +01:00
Mark Gray
409c35a2f1 ovs-ctl: Use 'stop_daemon' to stop ovs-monitor-ipsec.
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 19:44:30 +01:00
Mark Gray
2ee0f4485a ovs-monitor-ipsec: Fix active connection regex.
Connections are added to IPsec using a connection name
that is determined from the OVS port name and the tunnel
type.

GRE connections take the form:
  <iface>-<ver>
Other connections take the form:
  <iface>-in-<ver>
  <iface>-out-<ver>

The regex '|' operator parses strings left to right looking
for the first match that it can find. '.*' is also greedy. This
causes incorrect interface names to be parsed from active
connections as other tunnel types are parsed as type
GRE. This gives unexpected "is outdated" warnings and the
connection is torn down.

For example,

'ovn-424242-in-1' will produce an incorrect interface name of
'ovn-424242-in' instead of 'ovn-424242'.

There are a number of ways this could be resolved including
a cleverer regular expression, or re.findall(). However, this
approach was taken as it simplifies the code easing maintainability.
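
The parsing problem can be reproduced with a small sketch.  The pattern
below is hypothetical and only similar in spirit to the one described
above, and iface_split() is one possible string-based alternative, not
necessarily the code in this patch.

```python
import re

# Hypothetical pattern: greedy '.*' followed by alternatives tried
# left to right, with the GRE form '-<ver>' listed first.
GREEDY = re.compile(r"(?P<iface>.*)(-\d+|-in-\d+|-out-\d+)$")


def iface_greedy(conn):
    """Parse the interface name with the problematic pattern."""
    match = GREEDY.match(conn)
    return match.group("iface") if match else None


def iface_split(conn):
    """Strip '-<ver>', then an optional '-in' or '-out' direction."""
    base, sep, ver = conn.rpartition("-")
    if not sep or not ver.isdigit():
        return None
    for direction in ("-in", "-out"):
        if base.endswith(direction):
            return base[:-len(direction)]
    return base


print(iface_greedy("ovn-424242-in-1"))
print(iface_split("ovn-424242-in-1"))
```

Note that a GRE interface whose own name ends in '-in' or '-out' would
still be ambiguous in this sketch; the tunnel type would be needed to
resolve that case.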

Fixes: 22c5eafb6efa ("ipsec: reintroduce IPsec support for tunneling")
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1908789
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 19:36:39 +01:00
Mark Gray
6d2a5be5f6 ovs-monitor-ipsec: set correct 'leftcert' and 'rightcert' name.
In Libreswan case, 'ovs-monitor-ipsec' incorrectly configures
'leftcert' and 'rightcert' names for self-signed certificates.
This patch resolves that.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1906280
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 19:34:10 +01:00
Mark Gray
1d4190c1ee ovs-monitor-ipsec: Add support for tunnel 'local_ip'.
In the libreswan case, 'ovs-monitor-ipsec' sets
'left' to '%defaultroute' which will use the local address
of the default route interface as the source IP address. In
multihomed environments, this may not be correct if the user
wants to specify what the source IP address is. In OVS, this
can be set for tunnel ports using the 'local_ip' option. This
patch also uses that option to populate the 'ipsec.conf'
configuration. If the 'local_ip' option is not present, it
will default to the previous behaviour of using '%defaultroute'

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1906280
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 19:27:41 +01:00
Mark Gray
b9c6da7edc ovs-monitor-ipsec: Suppress "unknown %d argument" warning.
As 'ovs-vswitchd' does not understand IPsec tunnel options, it
gives a warning message. This can be safely suppressed.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1906701
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 19:21:32 +01:00
Mark Gray
cafe492d76 ovs-monitor-ipsec: Fix _nss_clear_database() parse error.
_nss_clear_database() runs `certutil` in order to get a list
of certificates currently loaded in NSS. This fails with error:

"ovs-monitor-ipsec | ERR | Failed to clear NSS database.
startswith first arg must be bytes or a tuple of bytes, not str"

Modify subprocess.Popen() to write in 'text' mode so that
'startswith' can correctly parse output.
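
The error and the fix can be reproduced in isolation.  In this sketch a
small Python child process stands in for `certutil`, and the certificate
name is made up; only subprocess.Popen() and its 'text' argument (Python
3.7 or later) are taken from the description above.

```python
import subprocess
import sys

# Stand-in for the 'certutil' output; the certificate name is made up.
cmd = [sys.executable, "-c", "print('ovs_cert_1  u,u,u')"]

# Default mode: stdout is bytes, so startswith() with a str prefix fails.
raw = subprocess.Popen(cmd, stdout=subprocess.PIPE).communicate()[0]
try:
    raw.startswith("ovs_")
    bytes_error = None
except TypeError as err:
    bytes_error = err

# Text mode: stdout is decoded to str and can be parsed line by line.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
out = proc.communicate()[0]
names = [line.split()[0] for line in out.splitlines()
         if line.startswith("ovs_")]
print(names)
```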

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 19:19:43 +01:00
Eelco Chaudron
bf8812cd7e dpctl: Add add/mod/del-flows command.
When you would like to add, modify, or delete a lot of flows in the
datapath, for example when you want to measure performance, adding
one flow at a time won't scale, as it takes a decent amount
of time to set up the datapath connection.

This new command is in-line with the same command available in
ovs-ofctl which allows the same thing, with the only difference that
we do not verify all lines before we start execution. This allows for
a continuous add/delete stream. For example with a command like this:

python3 -c 'while True:
  for i in range(0, 1000):
    print("add in_port(0),eth(),eth_type(0x800),ipv4(src=100.1.{}.{}) 1".format(int(i / 256), i % 256))
  for i in range(0, 1000):
    print("delete in_port(0),eth(),eth_type(0x800),ipv4(src=100.1.{}.{})".format(int(i / 256), i % 256))' \
|  sudo utilities/ovs-dpctl add-flows -

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 18:55:45 +01:00
Daniel Alvarez
2221e8b034 ovs-ctl: Don't overwrite external-id hostname.
ovs-ctl started to add the hostname as external-id [0] at some point.

However, this can be problematic as if it's already set by an external
entity it will get overwritten. In RHEL systems, systemd will invoke
ovs-ctl to start OVS and that will overwrite it to the hostname of the
machine.

For OVN this can have a big impact because if, for whatever reason, the
hostname changes and the host gets restarted, ovn-controller won't
claim the ports back, leaving the workloads inaccessible.

Also, it makes sense to not overwrite it as 1) it's an external_id,
so it will actually let external entities to configure it (unlike now),
and 2) it's optional. In the case that some systems were relying on
ovs-ctl to set the external-id for the first time (e.g onboarding
of a new hypervisor), this patch is not changing such behavior.

For more details, see discussion at [1].

[0] https://mail.openvswitch.org/pipermail/ovs-dev/2016-March/312054.html
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2020-May/370813.html

Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-01-05 18:23:32 +01:00
Justin Pettit
def6eb1ea2 security.rst: Add more information about the Downstream mailing list.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
2020-12-26 16:12:01 -08:00
Ilya Maximets
22d0244a56 AUTHORS: Add Renat Nurgaliyev.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-23 00:34:18 +01:00
Martin Varghese
ebe0e518b0 tunnel: Bareudp Tunnel Support.
There are various L3 encapsulation standards using UDP being discussed to
leverage the UDP based load balancing capability of different networks.
MPLSoUDP (https://tools.ietf.org/html/rfc7510) is one among them.

The Bareudp tunnel provides a generic L3 encapsulation support for
tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
tunnel.

An example to create bareudp device to tunnel MPLS traffic is
given

$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
             type=bareudp options:remote_ip=2.1.1.3 \
             options:local_ip=2.1.1.2 \
             options:payload_type=0x8847 options:dst_port=6635

The bareudp device supports special handling for MPLS & IP as
they can have multiple ethertypes. MPLS protocol can have ethertypes
ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).

The bareudp device to tunnel L3 traffic with multiple ethertypes
(MPLS & IP) can be created by passing the L3 protocol name as string in
the field payload_type. An example to create bareudp device to tunnel
MPLS unicast & multicast traffic is given below:

$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
            type=bareudp options:remote_ip=2.1.1.3 \
            options:local_ip=2.1.1.2 \
            options:payload_type=mpls options:dst_port=6635

Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-By: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-22 12:51:22 +01:00
Ilya Maximets
55f2b065ac odp-util: Fix netlink message overflow with userdata.
Overly large userdata could overflow the netlink message, leading to
out-of-bounds memory accesses or an assertion failure while formatting
nested actions.

Fix that by checking the size and returning correct error code.

Credit to OSS-Fuzz.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27640
Fixes: e995e3df57ea ("Allow OVS_USERSPACE_ATTR_USERDATA to be variable length.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
2020-12-22 00:25:04 +01:00
Jianbo Liu
c5b4b0ce95 dpif-netlink: Fix issues of the offloaded flows counter.
The n_offloaded_flows counter is saved in dpif, and this is the first
one when ofproto is created. When flow operation is done by ovs-appctl
commands, such as, dpctl/add-flow, a new dpif is opened, and the
n_offloaded_flows in it can't be used. So, instead of using a counter,
the number of offloaded flows is queried from each netdev and summed
up. To achieve this, a new API is added to netdev_flow_api to get
how many flows are assigned to a netdev.

In order to get better performance, this number is calculated directly
from tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc
takes much time if there are many flows offloaded.

Fixes: af0618470507 ("dpif-netlink: Count the number of offloaded rules")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 20:25:59 +01:00
Brad Cowie
7313787336 Update tutorial for newer versions of Faucet and Open vSwitch.
Newer versions of Faucet use a dynamic OpenFlow pipeline based on what
features are enabled in the configuration file. Update log output, flow
table dumps and explanations to be consistent with newer Faucet versions.

Remove mentions of bugs that have been fixed in Faucet since the
tutorial was originally written.

Adds documentation on changes to Open vSwitch commands to recommend
using a version that is compatible with the features of the tutorial.

Reported-by: Matthias Ableidinger <ableimat@gmx.at>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047180.html
Signed-off-by: Brad Cowie <brad@wand.net.nz>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 20:25:59 +01:00
Ilya Maximets
a35e9ab302 NEWS: Move '--offload-stats' entry to correct release.
Patch landed to 2.13, not 2.12.

Fixes: 164413156cf9 ("Add offload packets statistics")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
2020-12-21 20:25:59 +01:00
Ilya Maximets
9eeb44aadd ovsdb-tool: Fix datum leak in the show-log command.
Fixes: 4e92542cefb7 ("ovsdb-tool: Make "show-log" convert raw JSON to easier-to-read syntax.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
2020-12-21 20:25:59 +01:00
Ilya Maximets
b2b7e388f4 test-stream: Silence memory leak report.
AddressSanitizer reports this as a leak.
Let's just free the memory before exiting to avoid the noise.

'stream_close()' doesn't update the pointer, so this will not
change the return value.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
2020-12-21 20:25:59 +01:00
Lorenzo Bianconi
e8451e1443 raft: Add some debugging information to cluster/status command.
Introduce the following info useful for cluster debugging to
cluster/status command:
- time elapsed from last start/complete election
- election trigger (e.g. timeout)
- number of disconnections
- time elapsed from last raft message received

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 18:53:37 +01:00
Eelco Chaudron
a27d70a898 conntrack: add generic IP protocol support
Currently, userspace conntrack only tracks TCP, UDP, and ICMP; all
other IP protocols are discarded and the +inv state is returned. This
is not in line with the kernel conntrack, where traffic is treated as
generic L3 if no L4 information can be extracted. The change below
mimics the behavior of the kernel.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 18:50:52 +01:00
XiaoXiong Ding
ced0d8fb92 ofproto-dpif-xlate: Stop forwarding MLD reports to group ports.
According to RFC 4541 section 2.1.1, a snooping switch
should forward membership reports only to ports with
routers attached. The current code violates the RFC by
forwarding membership reports to group ports as well.
The same issue doesn't exist with IPv4.

Fixes: 06994f879c ("mcast-snooping: Add Multicast Listener Discovery support")
Signed-off-by: XiaoXiong Ding <dingxiaoxiong@huawei.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 18:45:36 +01:00
David Marchand
02f76fb42a github: Fix Ubuntu package installation.
Before trying to install a package, APT cache must be updated to avoid
asking for an unavailable version of a package.

Fixes: 6cb2f5a630e3 ("github: Add GitHub Actions workflow.")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 15:49:36 +01:00
Ben Pfaff
2653155874 ovsdb-idl: Add comment.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-18 18:32:17 -08:00
Ben Pfaff
47da7fa5a0 ovsdb-idl: Improve prototypes.
Adding parameter names makes these prototypes clearer.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-18 18:32:14 -08:00
Ben Pfaff
ba67afcf2b ovsdb-idl: Remove prototype for function that is not defined or used.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-18 18:32:10 -08:00
Ben Pfaff
de914f4ee5 ovsdb-idl: Fix memory leak sending messages without a session.
When there's no open session, we still have to free the messages that
we make but cannot send.

I'm not confident that these fix actual bugs, because it seems possible
that these code paths can only be hit when the session is nonnull.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-18 18:32:07 -08:00
Ben Pfaff
75439c4bdc ovsdb-idl: Avoid redundant clearing and parsing of received data.
ovsdb_idl_db_parse_monitor_reply() clears the IDL and parses the
received data.  There's no need to do it again afterward.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.")
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-18 18:31:22 -08:00
Ben Pfaff
4241d652e4 jsonrpc: Avoid disconnecting prematurely due to long poll intervals.
Open vSwitch has a few different jsonrpc-based protocols that depend on
jsonrpc_session to make sure that the connection is up and working.
In turn, jsonrpc_session uses the "reconnect" state machine to send
probes if nothing is received.  This works fine in normal circumstances.
In unusual circumstances, though, it can happen that the program is
busy and doesn't even try to receive anything for a long time.  Then the
timer can time out without a good reason; if it had tried to receive
something, it would have.

There's a solution that the clients of jsonrpc_session could adopt.
Instead of first calling jsonrpc_session_run(), which is what calls into
"reconnect" to deal with timing out, and then calling into
jsonrpc_session_recv(), which is what tries to receive something, they
could use the opposite order.  That would make sure that the timeout
was always based on a recent attempt to receive something.  Great.

The actual code in OVS that uses jsonrpc_session, though, tends to use
the opposite order, and there are enough users and this is a subtle
enough issue that it could get flipped back around even if we fixed it
now.  So this commit takes a different approach.  Instead of fixing
this in the users of jsonrpc_session, we fix it in the users of
reconnect: make them tell when they've tried to receive something (or
disable this particular feature).
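
The principle can be sketched with a toy timer.  This is not the
"reconnect" state machine: the class ProbeTimer and its method names are
hypothetical, and the rule that a timeout requires a receive attempt at or
after the deadline is a simplification of the behavior described above.

```python
class ProbeTimer:
    """Toy idle timer; all times are plain numbers in milliseconds."""

    def __init__(self, probe_interval, now=0):
        self.probe_interval = probe_interval
        self.last_activity = now
        self.last_receive_attempt = now

    def activity(self, now):
        """Record that something was actually received."""
        self.last_activity = now

    def receive_attempted(self, now):
        """Record that the caller tried to receive, successfully or not."""
        self.last_receive_attempt = now

    def timed_out(self, now):
        deadline = self.last_activity + self.probe_interval
        # Only declare the peer idle if we tried to receive at or after
        # the deadline and still saw nothing.
        return now >= deadline and self.last_receive_attempt >= deadline


timer = ProbeTimer(probe_interval=5000)
# The program was busy for 9 seconds and never tried to receive.
print(timer.timed_out(9000))
timer.receive_attempted(9000)
print(timer.timed_out(9000))
```

In this model a long poll interval alone cannot expire the timer; the
caller must first have reported a receive attempt that found nothing.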

This commit fixes the problem that way.  It's kind of hard to reproduce
but I'm pretty sure that I've seen it a number of times in testing.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-18 18:06:29 -08:00
Ian Stokes
252e1e5764 dpdk: Update to use DPDK v20.11.
This commit adds support for DPDK v20.11, it includes the following
changes.

1. travis: Remove explicit DPDK kmods configuration.
2. sparse: Fix build with 20.05 DPDK tracepoints.
3. netdev-dpdk: Remove experimental API flag.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=173216&state=*

4. sparse: Update to DPDK 20.05 trace point header.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=179604&state=*

5. sparse: Fix build with DPDK 20.08.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=200181&state=*

6. build: Add support for DPDK meson build.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=199138&state=*

7. netdev-dpdk: Remove usage of RTE_ETH_DEV_CLOSE_REMOVE flag.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=207850&state=*

8. netdev-dpdk: Fix build with 20.11-rc1.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=209006&state=*

9. sparse: Fix __ATOMIC_* redefinition errors

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=209452&state=*

10. build: Remove DPDK make build references.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=216682&state=*

For credit all authors of the original commits to 'dpdk-latest' with the
above changes have been added as co-authors for this commit.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Co-authored-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Sunil Pai G <sunil.pai.g@intel.com>
Co-authored-by: Sunil Pai G <sunil.pai.g@intel.com>
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Co-authored-by: Eli Britstein <elibr@nvidia.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Govindharajan, Hariprasad <hariprasad.govindharajan@intel.com>
Tested-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2020-12-16 17:44:06 +00:00
Jianbo Liu
af06184705 dpif-netlink: Count the number of offloaded rules
Add a counter for the offloaded rules, and display it in the command
of "ovs-appctl upcall/show".

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-12-07 07:30:15 +01:00