2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-30 22:05:19 +00:00
Commit Graph

18716 Commits

Author SHA1 Message Date
Dan Williams
fb1e7863e8 ovsdb-tool: add --election-timer=ms option to 'create-cluster'
After creating the new clustered database write a raft entry that
sets the desired election timer. This allows CMSes to set the
election timer at cluster start and avoid an error-prone
election timer modification process after the cluster is up.

Reported-at: https://bugzilla.redhat.com/1831778

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2021-05-27 13:26:29 -07:00
Ilya Maximets
13c0eaa7b4 dpif-netlink: Fix send of uninitialized memory in ct limit requests.
ct limit requests never initializes the whole 'struct ovs_zone_limit'
sending uninitialized stack memory to kernel:

 Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
    at 0x5E23867: sendmsg (in /usr/lib64/libpthread-2.28.so)
    by 0x54F761: nl_sock_transact_multiple__ (netlink-socket.c:858)
    by 0x54FB6E: nl_sock_transact_multiple.part.9 (netlink-socket.c:1079)
    by 0x54FCC0: nl_sock_transact_multiple (netlink-socket.c:1044)
    by 0x54FCC0: nl_sock_transact (netlink-socket.c:1108)
    by 0x550B6F: nl_transact (netlink-socket.c:1804)
    by 0x53BEA2: dpif_netlink_ct_get_limits (dpif-netlink.c:3052)
    by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178)
    by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870)
    by 0x52C241: process_command (unixctl.c:310)
    by 0x52C241: run_connection (unixctl.c:344)
    by 0x52C241: unixctl_server_run (unixctl.c:395)
    by 0x407526: main (ovs-vswitchd.c:128)
  Address 0x10b87480 is 32 bytes inside a block of size 4,096 alloc'd
    at 0x4C30F0B: malloc (vg_replace_malloc.c:307)
    by 0x52CDE4: xmalloc (util.c:138)
    by 0x4F7E07: ofpbuf_init (ofpbuf.c:123)
    by 0x4F7E07: ofpbuf_new (ofpbuf.c:151)
    by 0x53BDE3: dpif_netlink_ct_get_limits (dpif-netlink.c:3025)
    by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178)
    by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870)
    by 0x52C241: process_command (unixctl.c:310)
    by 0x52C241: run_connection (unixctl.c:344)
    by 0x52C241: unixctl_server_run (unixctl.c:395)
    by 0x407526: main (ovs-vswitchd.c:128)
  Uninitialised value was created by a stack allocation
    at 0x46AAA0: ct_dpif_get_limits (ct-dpif.c:197)

Fix that by using designated initializers that will clear all the
non-specified fields.

Fixes: 906ff9d229 ("dpif-netlink: Implement conntrack zone limit")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
2021-05-24 20:21:56 +02:00
Ilya Maximets
328d20e0f8 ofproto-dpif: Fix use of uninitialized attributes of timeout policy.
'cdtp' is allocated on a stack and it has uninitialized 'present'
field that specifies which attributes are actually set.  This
causes use of uninitialized attributes.

 Conditional jump or move depends on uninitialised value(s)
    at 0x539651: dpif_netlink_get_nl_tp_udp_attrs (dpif-netlink.c:3206)
    by 0x539651: dpif_netlink_get_nl_tp_attrs (dpif-netlink.c:3234)
    by 0x539651: dpif_netlink_ct_set_timeout_policy (dpif-netlink.c:3370)
    by 0x42B615: ct_add_timeout_policy_to_dpif (ofproto-dpif.c:5421)
    by 0x42B615: ct_set_zone_timeout_policy (ofproto-dpif.c:5525)
    by 0x40BD98: ct_zones_reconfigure (bridge.c:756)
    by 0x40BD98: datapath_reconfigure (bridge.c:804)
    by 0x40D737: bridge_reconfigure (bridge.c:903)
    by 0x411720: bridge_run (bridge.c:3331)
    by 0x40751C: main (ovs-vswitchd.c:127)

Clearing the whole structure to avoid this kind of problems.

Fixes: 993cae678b ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
2021-05-24 20:21:56 +02:00
Ilya Maximets
577b9a8169 netdev-linux: Fix use of uninitialized LAG master name.
'if_indextoname' may fail leaving the 'master_name' uninitialized:

 Conditional jump or move depends on uninitialised value(s)
    at 0x4C34329: strlen (vg_replace_strmem.c:459)
    by 0x51C638: hash_string (hash.h:342)
    by 0x51C638: hash_name (shash.c:28)
    by 0x51CC51: shash_find (shash.c:231)
    by 0x51CD38: shash_find_data (shash.c:245)
    by 0x4A797F: netdev_from_name (netdev.c:2013)
    by 0x544148: netdev_linux_update_lag (netdev-linux.c:676)
    by 0x544148: netdev_linux_run (netdev-linux.c:769)
    by 0x4A5997: netdev_run (netdev.c:186)
    by 0x40752B: main (ovs-vswitchd.c:129)
  Uninitialised value was created by a stack allocation
    at 0x543AFA: netdev_linux_run (netdev-linux.c:722)

Fixes: d22f8927c3 ("netdev-linux: monitor and offload LAG slaves to TC")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
2021-05-24 20:21:56 +02:00
Adrian Moreno
0b3ff31d35 ofp_actions: Fix set_mpls_tc formatting.
Apart from a cut-and-paste typo, the man page claims that mpls_labels
can be provided in hexadecimal format but that's currently not the case.

Fix mpls ofp-action formatting, add size checks on ofp-action parsing
and add some unit tests.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-05-19 12:42:16 +02:00
Ilya Maximets
7731d26144 dpif-netdev: Remove meter rate from the bucket size calculation.
Implementation of meters supposed to be a classic token bucket with 2
typical parameters: rate and burst size.

Burst size in this schema is the maximum number of bytes/packets that
could pass without being rate limited.

Recent changes to userspace datapath made meter implementation to be
in line with the kernel one, and this uncovered several issues.

The main problem is that maximum bucket size for unknown reason
accounts not only burst size, but also the numerical value of rate.
This creates a lot of confusion around behavior of meters.

For example, if rate is configured as 1000 pps and burst size set to 1,
this should mean that meter will tolerate bursts of 1 packet at most,
i.e. not a single packet above the rate should pass the meter.
However, current implementation calculates maximum bucket size as
(rate + burst size), so the effective bucket size will be 1001.  This
means that first 1000 packets will not be rate limited and average
rate might be twice as high as the configured rate.  This also makes
it practically impossible to configure meter that will have burst size
lower than the rate, which might be a desirable configuration if the
rate is high.

Inability to configure low values of a burst size and overall inability
for a user to predict what will be a maximum and average rate from the
configured parameters of a meter without looking at the OVS and kernel
code might be also classified as a security issue, because drop meters
are frequently used as a way of protection from DoS attacks.

This change removes rate from the calculation of a bucket size, making
it in line with the classic token bucket algorithm and essentially
making the rate and burst tolerance being predictable from a users'
perspective.

Same change will be proposed for the kernel implementation.
Unit tests changed back to their correct version and enhanced.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
2021-05-18 22:11:14 +02:00
Ben Pfaff
76b720ed11 ovs-actions: Document normal pipeline.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2021-05-14 13:58:01 -07:00
Ilya Maximets
60638d61aa AUTHORS: Add Wang Yibo.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-05-14 21:36:19 +02:00
Wang Yibo
f173527ec8 ovs-ofctl: Fix coredump when using "add-groups" command.
When using ovs-ofctl add-groups with only "switch" argument, a coredump
is generated.  The main reason is that the command "ovs-ofctl add-groups"
need two arguments but the limitation of min-args of this command is
set to 1.

Fixes: 7395c05254 ("Implement OpenFlow 1.1+ "groups" protocol.")
Submitted-at: https://github.com/openvswitch/ovs/pull/360
Signed-off-by: Wang Yibo <bobxxwang@126.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-05-14 16:08:43 +02:00
Ilya Maximets
3c2d6274bc raft: Transfer leadership before creating snapshots.
With a big database writing snapshot could take a lot of time, for
example, on one of the systems compaction of 300MB database takes
about 10 seconds to complete.  For the clustered database, 40% of this
time takes conversion of the database to the file transaction json
format, the rest of time is formatting a string and writing to disk.
Of course, this highly depends on the disc and CPU speeds.  300MB is
the very possible database size for the OVN Southbound DB, and it might
be even bigger than that.

During compaction the database is not available and the ovsdb-server
doesn't do any other tasks.  If leader spends 10-15 seconds writing a
snapshot, the cluster is not functional for that time period.  Leader
also, likely, has some monitors to serve, so the one poll interval may
be 15-20 seconds long in the end.  Systems with so big databases
typically has very high election timers configured (16 seconds), so
followers will start election only after this significant amount of
time.  Once leader is back to the operational state, it will
re-connect and try to join the cluster back.  In some cases, this might
also trigger the 'connected' state flapping on the old leader
triggering a re-connection of clients.  This issue has been observed
with large-scale OVN deployments.

One of the methods to improve the situation is to transfer leadership
before compacting.  This allows to keep the cluster functional,
while one of the servers writes a snapshot.

Additionally logging the time spent for compaction if it was longer
than 1 second.  This adds a bit of visibility to 'unreasonably long
poll interval's.

Reported-at: https://bugzilla.redhat.com/1960391
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
2021-05-14 16:00:22 +02:00
Dumitru Ceara
b5bb044fbe ovsdb-cs: Consider all tables when computing expected cond seqno.
In ovsdb_cs_db_set_condition(), take into account all pending condition
changes for all tables when computing the db->cond_seqno at which the
monitor is expected to be updated.

In the following scenario, with two tables, A and B, the old code
performed the following steps:
1. Initial db->cond_seqno = X.
2. Client changes condition for table A:
   - A->new_cond gets set
   - expected cond seqno returned to the client: X + 1
3. ovsdb-cs sends the monitor_cond_change for table A
   - A->req_cond <- A->new_cond
4. Client changes condition for table B:
   - B->new_cond gets set
   - expected cond seqno returned to the client: X + 1
   - however, because the condition change at step 3 is still not replied
     to, table B's monitor_cond_change request is not sent yet.
5. ovsdb-cs receives the reply for the condition change at step 3:
   - db->cond_seqno <- X + 1
6. ovsdb-cs sends the monitor_cond_change for table B
7. ovsdb-cs receives the reply for the condition change at step 6:
  - db->cond_seqno <- X + 2

The client was incorrectly informed that it will have all relevant
updates for table B at seqno X + 1 while actually that happens later, at
seqno X + 2.

Fixes: 46437c5232 ("ovsdb-idl: Enhance conditional monitoring API")
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-05-14 13:44:37 +02:00
Ben Pfaff
7100c220e6 mac-learning: Remove obsolete comments about tags.
Commit a14502a7c5 ("classifier: Retire partitions.") in 2015 removed
OVS support for tags for the second time (!), so these comments about
them should be removed too.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2021-05-12 13:17:31 -07:00
Hariprasad Govindharajan
b45958bf75 dpdk: Use DPDK 20.11.1 release.
Modify ci linux build script to use the latest DPDK stable release.
Modify Documentation to use the latest DPDK stable release 20.11.1
Update NEWS file to reflect the latest DPDK stable releases.
FAQ is updated to reflect the latest DPDK for each branch.

Signed-off-by: Hariprasad Govindharajan <hariprasad.govindharajan@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2021-05-12 09:15:44 +01:00
Ilya Maximets
484e9a4dd7 github: Fix up malformed /etc/hosts.
For some reason /etc/hosts in GHA now contains a plain text line
like this:

  Note: Don't Delete this file. Also, don't remove this line. ...

This breaks libunbound and makes a series of unit tests to emit
following warning:
  |00001|dns_resolve|WARN|Failed to read etc/hosts: syntax error

Working around this issue by removing a bad line from /etc/hosts
until this fixed properly by GitHub team.  This in combination
with other fixes should unblock CI.

Bug for virtual-environments:
  https://github.com/actions/virtual-environments/issues/3353

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
2021-05-11 17:06:32 +02:00
Ilya Maximets
843cd43040 doc: automake: Add support for sphinx 4.0.
File layout for man pages in sphinx 4 by default changed [1] from:

  Documentation/_ref/man/page.section

 to:

  Documentation/_ref/man/section/page.section

Ajusting our build scripts so they will be able to locate files
in new places.  This fixes our CI build.

[1] https://github.com/sphinx-doc/sphinx/issues/7996

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
2021-05-11 16:32:20 +02:00
Ilya Maximets
b03daccfda cirrus: Look up existing versions of python dependencies.
FreeBSD sometimes changes the base version of python3 that is
used for packages.  This affects package names.  For example,
currently CI is broken, because there is no more py37- versions
of sphinx and openssl available, only py38- ones:

  pkg: No packages available to install matching 'py37-openssl'
       have been found in the repositories
  pkg: No packages available to install matching 'py37-sphinx'
       have been found in the repositories

We had the same issue last year with 3.6 -> 3.7 transition:
  dfa2e3d049 ("cirrus: Use python 3.7 packages on FreeBSD.")

Fixing that by searching for a package instead of using a specific
version.  This should help to avoid same issues in the future.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-05-11 16:32:20 +02:00
Ben Pfaff
721488d4a5 classifier: Make find_match_wc() prototype and definition match.
The prototype said *, the definition said [CLS_MAX_TRIES].  GCC 11
complains about this (though it is perfectly valid from a C standards
perspective).  It would probably be better to make them both use
[CLS_MAX_TRIES] but that's only allowed if the struct's definition is
visible at the point of the prototype, which it's not.  Instead of
moving the definition, this commit just changes both usages to *.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
2021-05-07 12:14:31 -07:00
Ben Pfaff
b96d2756f2 tnl-neigh-cache: Include expected array sizes in prototypes.
GCC 11 warned that the prototypes and the definitions of these
functions were different: the prototypes used [], the definitions used
[IFNAMSIZ].  This is perfectly valid from a C standards perspective, but
it's confusing and providing sizes gives the compiler and the developer
a useful hint as to usage.  Therefore, this commit adds the expected
sizes.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
2021-05-07 12:14:28 -07:00
Ben Pfaff
f09182a3b4 ofp-group: Use big-enough buffer in ofputil_format_group().
GCC 11 pointed out that ofputil_group_to_string()'s prototype asks for
a buffer with one byte more than supplied.  This fixes the problem.

This wasn't a buffer overflow because ofputil_group_to_string() honors
the buffer size passed in, which was correct.  The worst that could
happen was truncating the last byte of a group name.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
2021-05-07 12:14:13 -07:00
Ben Pfaff
09fe18af2d xml2nroff: Fix formatting of action headers in ovs-actions(7) manpage.
The action headings were coming out all smashed together, like
"Theoutputaction".  This fixes them so that they appear correctly, like
"The output action".

The previous code stripped starting and ending spaces on a per-node
basis, so that "The ", "<code>output</code>", and " action" each got
stripped down to "The", "output", "action" after processing.  This
commit changes it so that stripping happens after concatenation, fixing
the problem.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-By: Timothy Redaelli <tredaelli@redhat.com>
Tested-By: Timothy Redaelli <tredaelli@redhat.com>
2021-05-06 09:56:42 -07:00
Gavin Li
a019868a62 ofproto/ofproto-dpif-sflow: Check sflow agent in case of race
sFlow agent may not exist while calling dpif_sflow_received. The sFlow
may be deleted in another thread, eg, by user, which will cause crash.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2021-04-21 11:53:50 +02:00
Mark Gray
5dce24d04d ipsec: Fix race in system tests.
This patch fixes an issue where, depending on timing fluctuations,
each node has not fully loaded all connections before the other
node begins to establish a connection. In this failure case, the
"ovs-monitor-ipsec" instance on the "left" node may `ipsec auto --start`
a connection which then gets rejected by the "right" side. Almost,
simulaneously, the "right" side may initiate a connection that gets
rejected by the "left" side. This can happen as, for all tunnels except
for GRE, each node has two connections (an "in" connection and an "out"
connection) that get added one after the other. If the "in" connection
"starts" on both sides, the "out" connection from the other node
may not be available causing the connection to fail. At this point,
"Libreswan" will wait to retry the connection. In the interim, the
OVS system test times out. This race manifests itself more frequently
in a virtualized environment.

This patch resolves this issue by waiting for the "left" node to load
all connections before starting the "right" side. This will cause
the "left" side to fail to establish a connection with the "right"
side (as the "right" side connections have not been loaded) but will
cause the "right" side to succeed to establish a connection as all
connections will have been loaded on the "left" side.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-April/381857.html
Fixes: 8fc62df8b1 ("ipsec: Introduce IPsec system tests for Libreswan.")
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Tested-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-20 00:00:22 +02:00
Ilya Maximets
fd1114e967 AUTHORS: Add Michal Kazior.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-20 00:00:22 +02:00
Michal Kazior
d90b4f2928 rtnetlink: ignore IFLA_WIRELESS events.
Some older wireless drivers - ones relying on the
old and long deprecated wireless extension ioctl
system - can generate quite a bit of IFLA_WIRELESS
events depending on their configuration and
runtime conditions. These are delivered as
RTNLGRP_LINK via RTM_NEWLINK messages.

These tend to be relatively easily identifiable
because they report the change mask being 0. This
isn't guaranteed but in practice it shouldn't be a
problem. None of the wireless events that I ever
observed actually carry any unique information
about netdev states that ovs-vswitchd is
interested in. Hence ignoring these shouldn't
cause any problems.

These events can be responsible for a significant
CPU churn as ovs-vswitchd attempts to do plenty of
work for each and every netlink message regardless
of what that message carries. On low-end devices
such as consumer-grade routers these can lead to a
lot of CPU cycles being wasted, adding up to heat
output and reducing performance.

It could be argued that wireless drivers in
question should be fixed, but that isn't exactly a
trivial thing to do. Patching ovs seems far more
viable while still making sense.

Signed-off-by: Michal Kazior <michal@plume.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-20 00:00:22 +02:00
Ilya Maximets
f9d3039039 dpif: Fix use of uninitialized execute hash.
'dpif_execute_helper_cb' doesn't initilalize the 'hash' field that
may be passed down to datapath and might cause execution of a different
set of actions, e.g. on recirculation.

 Thread 6 handler27:
 Conditional jump or move depends on uninitialised value(s)
    at 0x53A2C2: dpif_netlink_encode_execute (dpif-netlink.c:1841)
    by 0x53A2C2: dpif_netlink_operate__ (dpif-netlink.c:1919)
    by 0x53A82D: dpif_netlink_operate_chunks (dpif-netlink.c:2238)
    by 0x53A82D: dpif_netlink_operate (dpif-netlink.c:2297)
    by 0x48135F: dpif_operate (dpif.c:1366)
    by 0x481923: dpif_execute.part.24 (dpif.c:1320)
    by 0x481C46: dpif_execute (dpif.c:1312)
    by 0x481C46: dpif_execute_helper_cb (dpif.c:1243)
    by 0x4AE943: odp_execute_actions (odp-execute.c:865)
    by 0x47F272: dpif_execute_with_help (dpif.c:1296)
    by 0x4812FF: dpif_operate (dpif.c:1422)
    by 0x442226: handle_upcalls (ofproto-dpif-upcall.c:1617)
    by 0x442226: recv_upcalls.isra.36 (ofproto-dpif-upcall.c:855)
    by 0x442351: udpif_upcall_handler (ofproto-dpif-upcall.c:755)
    by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383)
    by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so)
    by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so)
  Uninitialised value was created by a stack allocation
    at 0x481966: dpif_execute_helper_cb (dpif.c:1159)

Additionally added a missing comment to the 'struct dpif_execute'.

Fixes: 0442bfb11d ("ofproto-dpif-upcall: Echo HASH attribute back to datapath.")
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-20 00:00:22 +02:00
Ilya Maximets
4a6be85c8d odp-util: Fix use of uninitialized erspan metadata.
'struct erspan_metadata' contains union with fields of different
sizes, hence not all the memory initiliazed.  This memory goes
to syscalls and also used to compare ukeys with memcmp which may
cause unexpected behavior.

 Thread 15 revalidator13:
 Conditional jump or move depends on uninitialised value(s)
    at 0x4C377B6: bcmp (vg_replace_strmem.c:1111)
    by 0x43F844: ofpbuf_equal (ofpbuf.h:273)
    by 0x43F844: revalidate_ukey__ (ofproto-dpif-upcall.c:2227)
    by 0x43F9C9: revalidate_ukey (ofproto-dpif-upcall.c:2294)
    by 0x4425C2: revalidate.isra.33 (ofproto-dpif-upcall.c:2734)
    by 0x4434B8: udpif_revalidator (ofproto-dpif-upcall.c:943)
    by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383)
    by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so)
    by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so)
  Uninitialised value was created by a stack allocation
    at 0x4B1CE0: tun_key_to_attr (odp-util.c:3129)

Fixes: 98514eea21 ("erspan: add kernel datapath support")
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-20 00:00:22 +02:00
Ilya Maximets
3311ca0d44 AUTHORS: Add Ariel Levkovich.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-20 00:00:22 +02:00
Ariel Levkovich
ea71a9d443 netdev-offload-tc: Add support for ct_state flag rel.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-19 23:57:05 +02:00
Yunjian Wang
0c147fb4e5 dpif-netlink: Fix using uninitialized info.tc_modify_flow_deleted in out label.
Before info.tc_modify_flow_deleted is assigned a value, error
processing of other statements goes to the out label. In the
out label, the uninitialized variant is used for condition
determination, which may cause uncertain behavior.

Fixes: 65b84d4a32 ("dpif-netlink: avoid netlink modify flow put op failed after tc modify flow put op failed.")
Signed-off-by: Mengfan Lv <lvmengfan@huawei.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-19 23:17:46 +02:00
Tianyu Yuan
44ea24427e Add test cases for ingress_policing parameters
tests/ovs-vsctl.at: Add ingress_policing test in ovs-vsctl unit test
tests/system-offloads-traffic.at: Check ingress_policing with offloads enabled and disabled

Exercise OVS setting of ingress_policing parameters using ovs-vsctl and verify that the correct values are stored on OVSDB.
Verify the ingress_policing parameters with tc command. Also check offload and non-offload in tc software datapath based on tc filter type (matchall and basic).

Example invocation:
make check TESTSUITEFLAGS='-k ingress_policing'
make check-offloads TESTSUITEFLAGS='-k ingress_policing'

Signed-off-by: Tianyu Yuan <tianyu.yuan@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
2021-04-13 15:36:44 +02:00
Yong.Xu
67e0e0bc15 netdev-linux: correct unit of burst parameter
Correct calculation of burst parameter used when configuring TC policer
action for ingress port-based policing in the case where TC offload is in
use. This now matches the value calculated for the case where TC offload is
not in use.

The division by 8 is to convert from bits to bytes.
Its unclear why 64 was previously used.

Fixes: e7f6ba220 ("lib/tc: add ingress ratelimiting support for tc-offload")
Signed-off-by: Yong Xu <yong.xu@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
2021-04-13 15:36:35 +02:00
Ben Pfaff
d2d0513106 AUTHORS: Add Aidan Shribman.
Signed-off-by: Ben Pfaff <blp@ovn.org>
2021-04-08 08:57:40 -07:00
Aidan Shribman
23f9ec9eb0 make: don't prompt during build
When running repeated builds using `make build` you get prompts in cases
the `mv` command is about to overwrite a file which is write-protect.
This patch forced the `mv` w/o prompting for approval.

Signed-off-by: Aidan Shribman <aidan.shribman@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2021-04-08 08:57:37 -07:00
Ben Pfaff
332b8a3e4e util: Add allocation wrappers that don't increment coverage counters.
The thread-local data allocators can't increment coverage counters
because this can cause reentrancy.  Until now, this code has used
explicit calls to malloc().  This code replaces them by calls to the
new functions.  This will make it easier in an upcoming patch to update
all the code that can run out of memory.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-07 09:44:30 -07:00
Ben Pfaff
4cb38cd222 ovs-lldp: Get rid of pointless null pointer check.
lldpd_alloc_hardware() always returns nonnull.

At the same time, there's no reason that lldpd_alloc_hardware() doesn't
take a const char *, so change that.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-07 09:34:24 -07:00
Mark Gray
f8be30acf2 ipsec: Update ordering of imports.
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 19:13:31 +02:00
Mark Gray
8fc62df8b1 ipsec: Introduce IPsec system tests for Libreswan.
This patch adds system tests for OVS IPsec using Libreswan.
If Libreswan is not present on the system, the tests will
be skipped.

These tests set up an underlay switch with bridge 'br0'
to carry encrypted traffic between two emulated "nodes".
Each "node" is a separate network namespace ('left' and
'right') and runs an instance of the Libreswan "pluto"
daemon, ovs-monitor-ipsec, ovs-vswitch and ovsdb-server.

Each test sets up IPsec between the two emulated "nodes"
using various configurations (currently tunnel
type, IPv6/IPv6, authentication method, local_ip). After
configuration, connectivity between the two nodes is
tested and the underlay traffic is also inspected to
ensure the traffic is encrypted.

All IPsec system tests can be run by using the ipsec
keyword:

sudo make check-kernel TESTSUITEFLAGS='-k ipsec'

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 19:13:31 +02:00
Mark Gray
d6afbc00d5 ipsec: Allow custom file locations.
"ovs_monitor_ipsec" assumes certain file locations for a number
of Libreswan objects. This patch allows these locations to be
configurable at startup in the Libreswan case.

This additional flexibility enables system testing for
OVS IPsec.

Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 19:13:31 +02:00
Mark Gray
4ce8bb159e system-common-macros: clean up veth device on test failure.
'on_exit' should be run directly after creation
of veth device.

Fixes: 119db2cb18 ("kmod-macros: Move some code to traffic-common-macros.")
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 19:13:31 +02:00
Mark Gray
58b4146e0b ipsec: Fix IPv6 default route support for Libreswan.
When configuring IPsec, "ovs-monitor-ipsec" honours
the 'local_ip' option in the 'Interface' table by configuring
the 'left' side of the Libreswan connection with 'local_ip'.
If 'local_ip' is not specified, "ovs-monitor-ipsec" sets
'left' to '%defaultroute' which is interpreted as the IP
address of the default gateway interface.

However, when 'remote_ip' is an IPv6 address, Libreswan
still interprets '%defaultroute' as the IPv4 address on the
default gateway interface (see:
https://github.com/libreswan/libreswan/issues/416) giving
an "address family inconsistency" error.

This patch resolves this issue by specifying the
connection as IPv6 when the 'remote_ip' is IPv6 and
'local_ip' has not been set.

Fixes: 22c5eafb6e ("ipsec: reintroduce IPsec support for tunneling")
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 19:12:52 +02:00
Dumitru Ceara
ac85cdb38c ovsdb-idl: Mark arc sources as updated when destination is deleted.
Considering two DB rows, 'a' from table A and 'b' from table B (with
column 'ref_a' a reference to table A):
a = {A._uuid=<U1>}
b = {B._uuid=<U2>, B.ref_a=<U1>}

When the IDL client processes an update that deletes row 'a', row 'b'
is also marked as 'updated' if change tracking is enabled for table B.

Fixes: 102781cc02 ("ovsdb-idl: Track changes for table references.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 14:15:49 +02:00
Dumitru Ceara
95689f1668 ovsdb-idl: Preserve references for deleted rows.
Considering two DB rows, 'a' from table A and 'b' from table B (with
column 'ref_a' a reference to table A):
a = {A._uuid=<U1>}
b = {B._uuid=<U2>, B.ref_a=<U1>}

Assuming both records are present in the IDL client's in-memory view of
the database, depending whether row 'b' is also deleted in the same
transaction or not, deletion of row 'a' should generate the following
tracked changes:

1. only row 'a' is deleted:
- for table A:
  - deleted records: a = {A._uuid=<U1>}
- for table B:
  - updated records: b = {B._uuid=<U2>, B.ref_a=[]}

2. row 'a' and row 'b' are deleted in the same update:
- for table A:
  - deleted records: a = {A._uuid=<U1>}
- for table B:
  - deleted records: b = {B._uuid=<U2>, B.ref_a=<U1>}

To ensure this, we now delay reparsing row backrefs for deleted rows
until all updates in the current run have been processed.

Without this change, in scenario 2 above, the tracked changes for table
B would be:
- deleted records: b = {B._uuid=<U2>, B.ref_a=[]}

In particular, for strong references, row 'a' can never be deleted in
a transaction that happens strictly before row 'b' is deleted.  In some
cases [0] both rows are deleted in the same transaction and having
B.ref_a=[] would violate the integrity of the database from client
perspective.  This would force the client to always validate that
strong reference fields are non-NULL.  This is not really an option
because the information in the original reference is required for
incrementally processing the record deletion.

[0] with ovn-monitor-all=true, the following command triggers a crash
    in ovn-controller because a strong reference field becomes NULL:
    $ ovn-nbctl --wait=hv -- lr-add r -- lrp-add r rp 00:00:00:00:00:01 1.0.0.1/24
    $ ovn-nbctl lr-del r

Reported-at: https://bugzilla.redhat.com/1932642
Fixes: 72aeb243a5 ("ovsdb-idl: Tracking - preserve data for deleted rows.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 14:15:49 +02:00
Dumitru Ceara
4c0d093b17 ovsdb-idl.at: Make test outputs more predictable.
IDL tests need predictable output from test-ovsdb.

This used to be done by first sorting the output of test-ovsdb and then
applying uuidfilt to predictably translate UUIDs.  This was not
reliable enough in case test-ovsdb processes two or more insert/delete
operations in the same iteration because the order of lines in the
output depends on the automatically generated UUID values.

To fix this we change the way test-ovsdb and test-ovsdb.py generate
outputs and prepend the table name and tracking information before
printing the contents of a row.

All existing ovsdb-idl.at and ovsdb-cluster.at tests are updated to
expect the new output format.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 13:53:20 +02:00
Flavio Fernandes
31629b5383 ovs-ofctl: Fix segfault due to bad meter n_bands.
Meter commands internally use ofctl_meter_mod__() and
ofctl_meter_request__() functions, which have an optional parameter
called str.  When str is NULL, these 2 functions initialize a struct
with meter bands set as NULL. It also needs to set meter n_bands to 0.

Once del-meters change in test dpif-netdev.at is added, the valgrind
report on test '992: dpif-netdev - meters' shows this issue:

   Conditional jump or move depends on uninitialised value(s)
      at 0x473534: ofputil_put_bands (ofp-meter.c:207)
      by 0x473534: ofputil_encode_meter_mod (ofp-meter.c:557)
      by 0x40FBA2: ofctl_meter_mod__ (ovs-ofctl.c:4038)
      by 0x417BD3: ovs_cmdl_run_command__ (command-line.c:247)
      by 0x4078BA: main (ovs-ofctl.c:179)
    Uninitialised value was created by a stack allocation
      at 0x409350: ofctl_del_meters (ovs-ofctl.c:4088)

Fixes: 3200ed5805 ("ovs-ofctl: Add meter support.")
Signed-off-by: Flavio Fernandes <flavio@flaviof.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-01 01:42:28 +02:00
Ansis Atteka
b4a9c9cd84 netlink: make Netlink socket receive buffer 4x larger
Under high load I observed that Netlink socket buffer constantly
fills up for daemons listening for Conntrack Table notifications:

netlink_notifier|WARN|netlink receive buffer overflowed

This patch mitigates the problem by increasing socket
receive buffer size.  Ideally we should try to calculate
buffer size required, but it would be more sophisticated
solution than simply increasing buffer size.

Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ansis Atteka <aatteka@ovn.org>
VMware-BZ: #2724821
2021-03-31 09:32:05 -05:00
Alexey Roytman
50b0b4d866 db-ctl-base: Warn if "destroy" command lacks --all or record argument.
Signed-off-by: Alexey Roytman <roytman@il.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2021-03-30 15:47:49 -07:00
Ilya Maximets
9801d7c4b8 github: Fix handling of python packages.
GitHub Actions doesn't have python locations in PATH and different
runners might have different configuration for default python
location and versions.  For example, on some runners python2 might
be installed or not.

Missing PATH causes weird situations on older branches where during
one run our scripts can locate just installed flake8 and can't do
that on a different run.  But this might also create other
unpredictable issues on all branches.

It's required to use actions/setup-python@v2 in order to have
predictable version of python installed and paths correctly configured.
Due to some bugs in GHA itself it doesn't set $HOME/.local/bin into
PATH, so we have to do that manually for now in order to use '--user'.
This might be fixed later in actions/setup-python or in base runners.
We already setting it for DPDK 20.11 (I think the issue was spotted
but not fully investigated).  Moving PATH updates to a separate step
to make them more explicit and available for all steps of the job.

Unfortunately actions/setup-python@v2 also makes invisible python
packages installed from Ubuntu repositories.  Switching them to
'pip3 install'.

Fixes: 6cb2f5a630 ("github: Add GitHub Actions workflow.")
Reported-by: Numan Siddique <numans@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-03-30 21:16:27 +02:00
Tonghao Zhang
c3690ccbce dpif-netdev: Refactor and fix the buckets calculation.
The way that "burst_size" was used as total buckets is
very strange. If user set the "burst_size" too smaller while
"rate" larger, that may affect the meter normal work.
This patch refactor the buckets calculation, and start
with a full buckets. It also makes the calculation equal
to what kernel datapath does.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-03-30 21:16:27 +02:00
Tonghao Zhang
759aaa8513 dpif-netdev: Fix the meter buckets overflow.
When setting the meter rate to 4.3+Gbps, there is an overflow, the
meters don't work as expected.

$ ovs-ofctl -O OpenFlow13 add-meter br-int "meter=1 kbps stats bands=type=drop rate=4294968"

It was overflow when we set the rate to 4294968, because "burst_size" in
the ofputil_meter_band is uint32_t type. This patch remove the "up"
in the dp_meter_band struction, and introduce "rate", "burst_size" and
"bucket" (uint64_t) to userspace datapath's meter band. This patch don't
change the public API and structure.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-03-30 20:17:54 +02:00
Ben Pfaff
af0ce38666 AUTHORS: Add Shahar Klein.
Signed-off-by: Ben Pfaff <blp@ovn.org>
2021-03-25 18:21:56 -07:00