2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-29 21:38:13 +00:00

3402 Commits

Author SHA1 Message Date
Darrell Ball
a7f33fdbfb conntrack: Support zone limits.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 10:11:13 -08:00
Martin Varghese
df711aae1f Datapath: Change in openvswitch kernel module to support MPLS label depth of 3 in ingress direction.
Upstream commit:
    commit fbdcdd78da7c95f1b970d371e1b23cbd3aa990f3
    Author: Martin Varghese <martin.varghese@nokia.com>
    Date:   Mon Nov 4 07:27:44 2019 +0530

    Change in Openvswitch to support MPLS label depth of 3 in ingress
    direction

    The openvswitch was supporting a MPLS label depth of 1 in the
    ingress direction though the userspace OVS supports a max depth
    of 3 labels.This change enables openvswitch module to support a
    max depth of 3 labels in the ingress.

    Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-22 09:26:47 -08:00
Flavio Leitner
d0d571493c ofproto-dpif: Allow IPv6 ND Extensions only if supported
The IPv6 ND Extensions is only implemented in userspace datapath,
but nothing prevents that to be used with other datapaths.

This patch probes the datapath and only allows if the support
is available.

Fixes: 9b2b84973 ("Support for match & set ICMPv6 reserved and options type fields")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-21 17:15:47 -08:00
Han Zhou
9bfb280a17 ovsdb raft: Fix election timer parsing in snapshot RPC.
Commit a76ba825 took care of saving and restoring election timer in
file header snapshot, but it didn't handle the parsing of election
timer in install_snapshot_request/reply RPC, which results in problems,
e.g. when election timer change log is compacted in snapshot and then a
new node join the cluster, the new node will use the default timer
instead of the new value.  This patch fixed it by parsing election
timer in snapshot RPC.

At the same time the patch updates the test case to cover the DB compact and
join senario. The test reveals another 2 problems related to clustered DB
compact, as commented in the test case's XXX, which need to be addressed
separately.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-21 10:51:10 -08:00
William Tu
27501802d0 ofproto-dpif: Expose datapath capability to ovsdb.
The patch adds support for fetching the datapath's capabilities
from the result of 'check_support()', and write the supported capability
to a new database column, called 'capabilities' under Datapath table.

To see how it works, run:
 # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
 # ovs-vsctl -- --id=@m create Datapath datapath_version=0 \
     'ct_zones={}' 'capabilities={}' \
     -- set Open_vSwitch . datapaths:"netdev"=@m

 # ovs-vsctl list-dp-cap netdev
 ufid=true sample_nesting=true clone=true tnl_push_pop=true \
 ct_orig_tuple=true ct_eventmask=true ct_state=true \
 ct_clear=true max_vlan_headers=1 recirc=true ct_label=true \
 max_hash_alg=1 ct_state_nat=true ct_timeout=true \
 ct_mark=true ct_orig_tuple6=true check_pkt_len=true \
 masked_set_action=true max_mpls_depth=3 trunc=true ct_zone=true

Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
---
v5:
    Add improved documentation from Ben and
    fix checkpatch error (tab and line 79 char)
v4:
    rebase to master
v3:
    fix 32-bit build, reported by Greg
    travis: https://travis-ci.org/williamtu/ovs-travis/builds/599276267
v2:
	rebase to master
2019-11-21 09:19:59 -08:00
Ilya Maximets
e8f5634484 netdev-afxdp: Best-effort configuration of XDP mode.
Until now there was only two options for XDP mode in OVS: SKB or DRV.
i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.

Devices like 'veth' interfaces in Linux supports native XDP, but
doesn't support zero-copy mode.  This case can not be covered by
existing API and we have to use slower generic XDP for such devices.
There are few more issues, e.g. TCP is not supported in generic XDP
mode for veth interfaces due to kernel limitations, however it is
supported in native mode.

This change introduces ability to use native XDP without zero-copy
along with best-effort configuration option that enabled by default.
In best-effort case OVS will sequentially try different modes starting
from the fastest one and will choose the first acceptable for current
interface.  This will guarantee the best possible performance.

If user will want to choose specific mode, it's still possible by
setting the 'options:xdp-mode'.

This change additionally changes the API by renaming the configuration
knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
themselves to be more user-friendly.

The full list of currently supported modes:
  * native-with-zerocopy - former DRV
  * native               - new one, DRV without zero-copy
  * generic              - former SKB
  * best-effort          - new one, chooses the best available from
                           3 above modes

Since 'best-effort' is a default mode, users will not need to
explicitely set 'xdp-mode' in most cases.

TCP related tests enabled back in system afxdp testsuite, because
'best-effort' will choose 'native' mode for veth interfaces
and this mode has no issues with TCP.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
2019-11-20 16:48:26 +01:00
Ilya Maximets
0c489cc54a tests: Wait up to OVS_CTL_TIMEOUT seconds.
While running tests under valgrind, it could take more than 10 seconds
for process to disappear after successful 'ovs-appctl exit' command.

Same applies to some other events that tests are waiting for with
OVS_WAIT macro.  This makes tests to fail frequently under valgrind.

Using OVS_CTL_TIMEOUT variable instead of constant 10 seconds seems
reasonable to avoid this issue because it controls timeouts of all
control utilities and needs to be adjusted while running under valgrind
anyway.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-11-06 15:33:37 -08:00
Ashish Varma
3e613cd81c ofp-monitor: Fixed the usage of 'usable_protocols' variable in 'parse_flow_monitor_request' function.
'usable_protocols' is now getting set to OFPUTIL_P_OF10_ANY on return from
'parse_flow_monitor_request' function. The calling function now checks for the
value in this variable against the 'allowed_protocols' variable.
Also a check is added for a match field which is not supported in OpenFlow 1.0
and return an error.
Modified the man page of ovs-ofctl to reflect Flow Monitor support as
OpenFlow 1.0 Nicira extension only.

Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-01 14:35:00 -07:00
Ilya Maximets
54e2baec09 flow: Fix crash on vlan packets with partial offloading.
parse_tcp_flags() does not care about vlan tags in a packet thus
not able to parse them.  As a result, if partial offloading is
enabled in userspace datapath vlan packets are not parsed, i.e.
has no initialized offsets.  This causes OVS crash on any attempt
to access/modify packet header fields.

For example, having the flow with following actions:
  in_port=1,ip,actions=mod_nw_src:192.168.0.7,output:IN_PORT

will lead to OVS crash on vlan packet handling:

 Process terminating with default action of signal 11 (SIGSEGV)
 Invalid read of size 4
    at 0x785657: get_16aligned_be32 (unaligned.h:249)
    by 0x785657: odp_set_ipv4 (odp-execute.c:82)
    by 0x785657: odp_execute_masked_set_action (odp-execute.c:527)
    by 0x785657: odp_execute_actions (odp-execute.c:894)
    by 0x74CDA9: dp_netdev_execute_actions (dpif-netdev.c:7355)
    by 0x74CDA9: packet_batch_per_flow_execute (dpif-netdev.c:6339)
    by 0x74CDA9: dp_netdev_input__ (dpif-netdev.c:6845)
    by 0x74DB6E: dp_netdev_input (dpif-netdev.c:6854)
    by 0x74DB6E: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
    by 0x74E863: dpif_netdev_run (dpif-netdev.c:5264)
    by 0x703F57: type_run (ofproto-dpif.c:370)
    by 0x6EC8B8: ofproto_type_run (ofproto.c:1760)
    by 0x6DA52B: bridge_run__ (bridge.c:3188)
    by 0x6E083F: bridge_run (bridge.c:3252)
    by 0x1642E4: main (ovs-vswitchd.c:127)
  Address 0xc is not stack'd, malloc'd or (recently) free'd

Fix that by properly parsing vlan tags first.  Function 'parse_dl_type'
transformed for that purpose as it had no users anyway.

Added unit test for packet modification with partial offloading that
triggers above crash.

Fixes: aab96ec4d81e ("dpif-netdev: retrieve flow directly from the flow mark")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-10-25 18:06:50 +02:00
Ilya Maximets
51528c2994 tests: Fix indentation in userspace packet type aware test.
CC: Ben Pfaff <blp@ovn.org>
Fixes: 7be29a47576d ("ofproto-dpif: Remove tabs from output.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-10-25 15:50:29 +00:00
Numan Siddique
cec7005bde ovsdb-server: Allow replication from older schema version servers.
Presently, replication is not allowed if there is a schema version mismatch between
the schema returned by the active ovsdb-server and the local db schema. This is
causing failures in OVN DB HA deployments during uprades.

In the case of OpenStack tripleo deployment with OVN, OVN DB ovsdb-servers are
deployed on a multi node controller cluster in active/standby mode. During
minor updates or major upgrades, the cluster is updated one at a time. If
a node A is running active OVN DB ovsdb-servers and when it is updated, another
node B becomes active. After the update when OVN DB ovsdb-servers in A are started,
these ovsdb-servers fail to replicate from the active if there is a schema
version mismatch.

This patch addresses this issue by allowing replication even if there is a
schema version mismatch only if all the active db schema tables and its colums are
present in the local db schema.

This should not result in any data loss.

Signed-off-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-24 14:25:40 -07:00
Ben Pfaff
49df3c0fe7 docs: DPDK isn't a datapath, so don't use the term.
The DPDK library allows OVS fast access to packet I/O in userspace.  It
is not a datapath.  This commit avoids using that term.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-23 12:38:19 -07:00
Bhargava Shastry
0d8fb0211c ossfuzz: Simplify miniflow fuzzer harness.
Google's oss-fuzz builder bots were complaining that miniflow_target is
too slow to fuzz in that some tests take longer than a second to
complete. This patch fixes this by replacing the random flow generation
within the harness to a more simpler scenario.

Signed-off-by: Bhargava Shastry <bshas3@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:58:53 -07:00
Ilya Maximets
15394e0ff7 tests: Get rid of timeout options for control utilities.
'OVS_CTL_TIMEOUT' environment variable is exported in tests/atlocal.in
and controls timeouts for all OVS utilities in testsuite.

There should be no manual tweaks for each single command.

This helps with running tests under valgrind where commands could
take really long time as you only need to change 'OVS_CTL_TIMEOUT'
in a single place.

Few manual timeouts were left in places where they make sense.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-10-16 12:11:32 +02:00
Ben Pfaff
36e5d97f9b ovs-vlan-bug-workaround: Remove.
This workaround only applied to kernels earlier than 2.6.37, but OVS
only supports 3.10 and later.

As the original author of this code, I won't miss it.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-14 15:34:53 -07:00
Ilya Maximets
1e9ec310b8 tests: Allow valgrind check for afxdp testsuite.
New 'make' target 'check-afxdp-valgrind'.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-10-09 16:09:26 -07:00
Greg Rose
ae05d68139 tests: Only run test on kernel datapath
The recently added test to check for the correct L3 L4 protocol
information after conntrack reassembles a packet should not run
in the userspace datapath.  It is specific to a kernel datapath
regression.

Also change the name of the test to make it more informative and
less redundant and add comments with a short explanation.

Fixes: d7fd61a ("tests: Add check for correct l3l4 conntrack frag reassembly")
Suggested-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-04 10:26:36 -07:00
Greg Rose
d7fd61ae33 tests: Add check for correct l3l4 conntrack frag reassembly
Two commits recently fixed an issue with setting the corrrect l3 and l4
flow information when conntrack reassembles packet fragments.

c98f776 datapath: Clear the L4 portion of the key for "later" fragments
2609173 datapath: Properly set L4 keys on "later" IP fragments

This test checks for regressions that might break this feature.  It
counts on the fact that when the bug is present the udp src port
will not be correct.  It will either be zero or else some other
garbage value.  So the test feeds some fragments through for
reassembly and then checks to make sure that the udp srce port
is actually the correct value of 5001.

Tested by reverting the above commits and observing that the test
then fails.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-03 11:03:33 -07:00
Ben Pfaff
05bf1dbb98 ovn: Remove remaining pieces.
A preceding commit removed the last remaining dependencies on OVN code,
so remove the OVN code.

Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-30 13:13:26 -07:00
Ben Pfaff
817db73019 ovsdb-cluster: Use ovs-vsctl instead of ovn-nbctl and ovn-sbctl.
This removes a dependency on OVN from the tests.

This adds some options to ovs-vsctl to allow it to be used for testing
the clustering feature.  The new options are undocumented because
they're really just useful for testing clustering.

Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-30 13:13:26 -07:00
Yi-Hung Wei
47d76e9f03 datapath: Fix conntrack cache with timeout
This patch is from the following upstream net-next commit along with
an updated system traffic test to avoid regression.

Upstream commit:
    commit 7177895154e6a35179d332f4a584d396c50d0612
    Author: Yi-Hung Wei <yihung.wei@gmail.com>
    Date:   Thu Aug 22 13:17:50 2019 -0700

        openvswitch: Fix conntrack cache with timeout

        This patch addresses a conntrack cache issue with timeout policy.
        Currently, we do not check if the timeout extension is set properly in the
        cached conntrack entry.  Thus, after packet recirculate from conntrack
        action, the timeout policy is not applied properly.  This patch fixes the
        aforementioned issue.

        Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
        Reported-by: kbuild test robot <lkp@intel.com>
        Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
        Acked-by: Pravin B Shelar <pshelar@ovn.org>
        Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
2019-09-30 13:17:38 -07:00
Ben Pfaff
1ca0323e7c Require Python 3 and remove support for Python 2.
Python 2 reaches end-of-life on January 1, 2020, which is only
a few months away.  This means that OVS needs to stop depending
on in the next release that should occur roughly that same time.
Therefore, this commit removes all support for Python 2.  It
also makes Python 3 a mandatory build dependency.

Some of the interesting consequences:

- HAVE_PYTHON, HAVE_PYTHON2, and HAVE_PYTHON3 conditionals have
  been removed, since we now know that Python3 is available.

- $PYTHON and $PYTHON2 are removed, and $PYTHON3 is always
  available.

- Many tests for Python 2 support have been removed, and the ones
  that depended on Python 3 now run unconditionally.  This allowed
  several macros in the testsuite to be removed, making the code
  clearer.  This does make some of the changes to the testsuite
  files large due to indentation level changes.

- #! lines for Python now use /usr/bin/python3 instead of
  /usr/bin/python.

- Packaging depends on Python 3 packages.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Tested-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-27 09:23:50 -07:00
Yi-Hung Wei
187bb41fbf ofproto-dpif-xlate: Translate timeout policy in ct action
This patch derives the timeout policy based on ct zone from the
internal data structure that we maintain on dpif layer.

It also adds a system traffic test to verify the zone-based conntrack
timeout feature.  The test uses ovs-vsctl commands to configure
the customized ICMP and UDP timeout on zone 5 to a shorter period.
It then injects ICMP and UDP traffic to conntrack, and checks if the
corresponding conntrack entry expires after the predefined timeout.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>

ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT

This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With this
check, ofproto-dpif-xlate can use this information to decide whether to
translate the ct timeout policy.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:51:04 -07:00
Yi-Hung Wei
ebe62ec1b9 datapath: Add support for conntrack timeout policy
This patch adds support for specifying a timeout policy for a
connection in connection tracking system in kernel datapath.
The timeout policy will be attached to a connection when the
connection is committed to conntrack.

This patch introduces a new odp field OVS_CT_ATTR_TIMEOUT in the
ct action that specifies the timeout policy in the datapath.
In the following patch, during the upcall process, the vswitchd will use
the ct_zone to look up the corresponding timeout policy and fill
OVS_CT_ATTR_TIMEOUT if it is available.

The datapath code is from the following two net-next upstream commits.

Upstream commit:
commit 06bd2bdf19d2f3d22731625e1a47fa1dff5ac407
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date:   Tue Mar 26 11:31:14 2019 -0700

    openvswitch: Add timeout support to ct action

    Add support for fine-grain timeout support to conntrack action.
    The new OVS_CT_ATTR_TIMEOUT attribute of the conntrack action
    specifies a timeout to be associated with this connection.
    If no timeout is specified, it acts as is, that is the default
    timeout for the connection will be automatically applied.

    Example usage:
    $ nfct timeout add timeout_1 inet tcp syn_sent 100 established 200
    $ ovs-ofctl add-flow br0 in_port=1,ip,tcp,action=ct(commit,timeout=timeout_1)

    CC: Pravin Shelar <pshelar@ovn.org>
    CC: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 6d670497e01803b486aa72cc1a718401ab986896
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Tue Apr 2 09:53:14 2019 +0300

    openvswitch: use after free in __ovs_ct_free_action()

    We free "ct_info->ct" and then use it on the next line when we pass it
    to nf_ct_destroy_timeout().  This patch swaps the order to avoid the use
    after free.

    Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:50:17 -07:00
William Tu
45339539f6 ovs-vsctl: Add conntrack zone commands.
The patch adds commands creating/deleting/listing conntrack zone
timeout policies:
  $ ovs-vsctl {add,del,list}-zone-tp dp zone=zone_id ...

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:50:17 -07:00
Yifeng Sun
dc0bd12f5b userspace: Enable non-bridge port as tunnel endpoint.
For userspace datapath, currently only the bridge itself, the LOCAL port,
can be the tunnel endpoint to encap/decap tunnel packets.  This patch
enables non-bridge port as tunnel endpoint.  One use case is for users to
create a bridge and a vtep port as tap, and configure underlay IP at vtep
port as the tunnel endpoint.

This patch causes failure for test "ptap - L3 over patch port". This is
because this test is already using non-bridge port gre1 as tunnel endpoint.
In this test, a flow is added to redirect tunnel packets to gre1 port,
as shown below:
  ovs-ofctl add-flow br1 in_port=p1,actions=output=gre1

It later generates a datapath flow which matches an extra eth field:
  - recirc_id(0),...,eth_type(0x0800),...
  + recirc_id(0),...,eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),...

With this patch, this flow need only a NORMAL action.

Signed-off-by: William Tu <u9012063@gmail.com>
Co-authored-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 12:18:14 -07:00
Darrell Ball
64207120c8 conntrack: Add option to disable TCP sequence checking.
This may be needed in some special cases, such as to support some hardware
offload implementations.  Note that disabling TCP sequence number
verification is not an optimization in itself, but supporting some
hardware offload implementations may offer better performance.  TCP
sequence number verification is enabled by default.  This option is only
available for the userspace datapath.  Access to this option is presently
provided via 'dpctl' commands as the need for this option is quite node
specific, by virtue of which nics are in use on a given node.  A test is
added to verify this option.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359188.html
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 12:11:32 -07:00
Darrell Ball
594570ea1c conntrack: Optimize recirculations.
Cache the 'conn' context and use it when it is valid.  The cached 'conn'
context will get reset if it is not expected to be valid; the cost to do
this is negligible.  Besides being most optimal, this also handles corner
cases, such as decapsulation leading to the same tuple, as in tunnel VPN
cases.  A negative test is added to check the resetting of the cached
'conn'.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 08:58:11 -07:00
Aliasgar Ginwala
00de46f9ee ovsdb-tool: Convert clustered db to standalone db.
Add support in ovsdb-tool for migrating clustered dbs to standalone dbs.
E.g. usage to migrate nb/sb db to standalone db from raft:
ovsdb-tool cluster-to-standalone ovnnb_db.db ovnnb_db_cluster.db

Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-23 15:38:44 -07:00
Ben Pfaff
cf0e423946 db-ctl-base: Give better error messages for ambiguous abbreviations.
Tables and columns may be abbreviated to unique prefixes, but until
now the error messages have just said there's more than one match.
This commit makes the error messages list the possibilities.

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-18 08:42:34 -07:00
Ben Pfaff
c73d632a4b tests: Fix ovs-vsctl unit test failure regression.
This worked fine as long as there was only one table whose name started
with "C", but now we have three of them.

Acked-by: Justin Pettit <jpettit@ovn.org>
Fixes: 61a5264d60d0 ("ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-17 14:38:23 -07:00
Mark Michelson
f3e24610ea Remove OVN.
OVN is separated into its own repo. This commit removes the OVN source,
OVN tests, and OVN documentation. It also removes mentions of OVN from
most documentation. The only place where OVN has been left is in
changelogs/NEWS, since we shouldn't mess with the history of the
project.

There is an exception here. The ovsdb-cluster tests rely on ovn-nbctl
and ovn-sbctl to run. Therefore those ovn utilities, as well as their
dependencies remain in the repo with this commit.

Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-06 14:54:58 -07:00
Ilya Maximets
9b0064a3ca tests: Add track-origins flag to valgrind.
Useful for tracking where the uninitialized memory came from.
Report example:

  Thread 13 revalidator11:
  Conditional jump or move depends on uninitialised value(s)
     at 0x4C35D96: __memcmp_sse4_1 (in vgpreload_memcheck.so)
     by 0x9D4404: ofpbuf_equal (ofpbuf.h:273)
     by 0x9D4404: revalidate_ukey__ (ofproto-dpif-upcall.c:2219)
     <...>
     by 0x6AF488E: clone (clone.S:95)
   Uninitialised value was created by a stack allocation
     at 0x9D4450: compose_slow_path (ofproto-dpif-upcall.c:1062)

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-09-06 11:45:39 +03:00
Flavio Leitner
ea41c034db tnl-neigh: Use outgoing ofproto version.
When a packet needs to be encapsulated in userspace, the endpoint
address needs to be resolved to fill in the headers. If it is not,
then currently OvS sends either a Neighbor Solicitation (IPv6)
or an ARP Query (IPv4) to resolve it.

The problem is that the NS/ARP packet will go through the flow
rules in the new bridge, but inheriting the ofproto table version
from the original packet to be encapsulated. When those versions
don't match, the result is unexpected because no flow rules might
be visible, which would cause the default table rule to be used
to drop the packet. Or only part of the flow rules would be visible
and so on.

Since the NS/ARP packet is created by OvS and will be injected in
the outgoing bridge, use the corresponding ofproto version instead.

Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-By: Vasu Dasari <vdasari@gmail.com>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-28 11:23:42 -07:00
Lorenzo Bianconi
93482cb0da test: do not require python2 for CHECK_CONNTRACK macro
Do not strictly require python2 for CHECK_CONNTRACK macro definitions in
system-{kmod,userspace}-macros.at

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-22 15:40:18 -07:00
Lorenzo Bianconi
9bac5e78a4 OVN: fix DNAT/SNAT system-ovn unit tests
Fix conntrack checks in the following tests in tests/system-ovn.at:
- ovn -- DNAT and SNAT on distributed router - N/S
- ovn -- DNAT and SNAT on distributed router - E/W

Fixes: a6ee09882283 ("OVN: run local logical flows first in S_ROUTER_OUT_SNAT table")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-22 15:35:54 -07:00
Han Zhou
c194367cbf ovsdb monitor: Fix crash when using non-zero last-id with standalone DB.
When a client uses monitor-cond-since with a non-zero last-id but the
server is not in cluster mode for the DB being monitored, it leads to
segmentation fault because the txn_history list is not initialized in
this case.

Program terminated with signal SIGSEGV, Segmentation fault.
1536            struct ovsdb_txn *txn = h_node->txn;
(gdb) bt
0  ovsdb_monitor_get_changes_after (txn_uuid=txn_uuid@entry=0x7ffe8605b7e0, dbmon=0x17c1b40, p_mcs=p_mcs@entry=0x17c4900) at ovsdb/monitor.c:1536
1  0x000000000040da2d in ovsdb_jsonrpc_monitor_create (request_id=0x1804630, version=<optimized out>, params=0x17ad330, db=0x18015b0, s=<optimized out>) at ovsdb/jsonrpc-server.c:1469
2  ovsdb_jsonrpc_session_got_request (request=0x17ad520, s=<optimized out>) at ovsdb/jsonrpc-server.c:1002
3  ovsdb_jsonrpc_session_run (s=<optimized out>) at ovsdb/jsonrpc-server.c:556
...

Although it doesn't happen in normal use cases, no one can prevent a
client to send this on purpose or in a corner case when a client firstly
connected to a clustered DB but later the server restarted with a
non-clustered DB.

This patch fixes it by always initialize the txn_history list to avoid
the undefined behavior in this case. It adds a test case to cover it, too.

Fixes: 695e815 ("ovsdb-server: Transaction history tracking.")
Reported-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 14:47:06 -07:00
Han Zhou
8e35461419 ovsdb raft: Support leader election time change online.
A new unixctl command cluster/change-election-timer is implemented to
change leader election timeout base value according to the scale needs.

The change takes effect upon consensus of the cluster, implemented through
the append-request RPC.  A new field "election-timer" is added to raft log
entry for this purpose.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:08 -07:00
Han Zhou
923f01cad6 raft.c: Set candidate_retrying if no leader elected since last election.
candiate_retrying is used to determine if the current node is disconnected
from the cluster when the node is in candiate role. However, a node
can flap between candidate and follower role before a leader is elected
when majority of the cluster is down, so is_connected() will flap, too, which
confuses clients.

This patch avoids the flapping with the help of a new member had_leader,
so that if no leader was elected since last election, we know we are
still retrying, and keep as disconnected from the cluster.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:07 -07:00
Han Zhou
89771c1e65 raft.c: Stale leader should disconnect from cluster.
As mentioned in RAFT paper, section 6.2:

Leaders: A server might be in the leader state, but if it isn’t the current
leader, it could be needlessly delaying client requests. For example, suppose a
leader is partitioned from the rest of the cluster, but it can still
communicate with a particular client. Without additional mechanism, it could
delay a request from that client forever, being unable to replicate a log entry
to any other servers. Meanwhile, there might be another leader of a newer term
that is able to communicate with a majority of the cluster and would be able to
commit the client’s request. Thus, a leader in Raft steps down if an election
timeout elapses without a successful round of heartbeats to a majority of its
cluster; this allows clients to retry their requests with another server.

Reported-by: Aliasgar Ginwala <aginwala@ebay.com>
Tested-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:05 -07:00
Han Zhou
ca367fa5f8 ovsdb-idl.c: Allows retry even when using a single remote.
When clustered mode is used, the client needs to retry connecting
to new servers when certain failures happen. Today it is allowed to
retry new connection only if multiple remotes are used, which prevents
using LB VIP with clustered nodes. This patch makes sure the retry
logic works when using LB VIP: although same IP is used for retrying,
the LB can actually redirect the connection to a new node.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-21 11:30:03 -07:00
Ilya Maximets
aacad8685e checkpatch: Fix regexp for if, while, etc inside macros.
This allows to use a one-character expression inside the 'if'
statement and multiple spaces before the line continuation character.

Fixes false positive in case like this:

  #define MACRO(ARG)     \
      if (a) {           \
          do_work(ARG);  \
      }

Fixes: 16770c6d9179 ("checkpatch: support macro continuation")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
2019-08-12 10:56:35 +03:00
Ilya Maximets
29004db273 ovsdb-data: Don't put strings with digits in quotes.
No need to use quotes for strings like "br0".

Keeping UUIDs always in quotes to avoid different treatment of those
that starts with digits and those that starts with letters.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-07-25 19:43:46 +03:00
Yifeng Sun
df0ecb2e13 test: Fix fragment-related tests that fail on 4.19+ due to small-sized packets
These fragment-related tests are failing on later kernels (4.19.x)
because kernel quietly drops any packet fragment that is not the last
but has a size smaller than IPV6_MIN_MTU. This patch fixes
them by increasing their sizes to IPV6_MIN_MTU.

Reviewed-by: Darrell Ball <dlu998@gmail.com>
Reivewed-at: https://github.com/openvswitch/ovs/pull/278
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-07-22 10:22:48 -07:00
Vasu Dasari
8416e50f8d tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted
Say an ARP entry is learnt on a OVS port and when such a port is deleted,
learnt entry should be removed from the port. It would have be aged out after
ARP ageout time. This code will clean up immediately.

Added test case(tunnel - neighbor entry add and deletion) in tunnel.at, to
verify neighbors are added and removed on deletion of a ports and bridges.

Discussion for this addition is at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html

Signed-off-by: Vasu Dasari <vdasari@gmail.com>
Reviewed-by: Flavio Fernandes <flavio@flaviof.com>
Reviewed-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-07-22 09:20:43 -07:00
William Tu
0c5a65f20e system-traffic: Make nsh test more robust.
The patch adds '-n' to tcpdump to avoid address coverting. Add '-l' for rhel8
to avoid buffering. Since '-U' is used to output to stdout, simply use 'cat'
to search result.  Use OVS_WAIT_UNTIL instead of sleep, and also remove/add
some newlines. Finally, move tcpdump captured interface into the namespace,
(capture p1 instead of ovs-p1), and tested using af_xdp.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
2019-07-22 13:29:51 +03:00
William Tu
0de1b42596 netdev-afxdp: add new netdev type for AF_XDP.
The patch introduces experimental AF_XDP support for OVS netdev.
AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
type built upon the eBPF and XDP technology.  It is aims to have comparable
performance to DPDK but cooperate better with existing kernel's networking
stack.  An AF_XDP socket receives and sends packets from an eBPF/XDP program
attached to the netdev, by-passing a couple of Linux kernel's subsystems
As a result, AF_XDP socket shows much better performance than AF_PACKET
For more details about AF_XDP, please see linux kernel's
Documentation/networking/af_xdp.rst. Note that by default, this feature is
not compiled in.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
2019-07-19 17:42:06 +03:00
Dumitru Ceara
605535f9ad OVN: Add ovn-northd IGMP support
New IP Multicast Snooping Options are added to the Northbound DB
Logical_Switch:other_config column. These allow enabling IGMP snooping and
querier on the logical switch and get translated by ovn-northd to rows in
the IP_Multicast Southbound DB table.

ovn-northd monitors for changes done by ovn-controllers in the Southbound DB
IGMP_Group table. Based on the entries in IGMP_Group ovn-northd creates
Multicast_Group entries in the Southbound DB, one per IGMP_Group address X,
containing the list of logical switch ports (aggregated from all controllers)
that have IGMP_Group entries for that datapath and address X. ovn-northd
also creates a logical flow that matches on IP multicast traffic destined
to address X and outputs it on the tunnel key of the corresponding
Multicast_Group entry.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-07-16 14:46:28 -07:00
Dumitru Ceara
c89f5f171a OVN: Add IGMP SB definitions and ovn-controller support
A new IP_Multicast table is added to Southbound DB. This table stores the
multicast related configuration for each datapath. Each row will be
populated by ovn-northd and will control:
- if IGMP Snooping is enabled or not, the snooping table size and multicast
  group idle timeout.
- if IGMP Querier is enabled or not (only if snooping is enabled too), query
  interval, query source addresses (Ethernet and IP) and the max-response
  field to be stored in outgoing queries.
- an additional "seq_no" column is added such that ovn-sbctl or if needed a
  CMS can flush currently learned groups. This can be achieved by incrementing
  the "seq_no" value.

A new IGMP_Group table is added to Southbound DB. This table stores all the
multicast groups learned by ovn-controllers. The table is indexed by
datapath, group address and chassis. For a learned multicast group on a
specific datapath each ovn-controller will store its own row in this table.
Each row contains the list of chassis-local ports on which the group was
learned. Rows in the IGMP_Group table are updated or deleted only by the
ovn-controllers that created them.

A new action ("igmp") is added to punt IGMP packets on a specific logical
switch datapath to ovn-controller if IGMP snooping is enabled.

Per datapath IGMP multicast snooping support is added to pinctrl:
- incoming IGMP reports are processed and multicast groups are maintained
  (using the OVS mcast-snooping library).
- each OVN controller syncs its in-memory IGMP groups to the Southbound DB
  in the IGMP_Group table.
- pinctrl also sends periodic IGMPv3 general queries for all datapaths where
  querier is enabled.

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Co-authored-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-07-16 14:46:27 -07:00
Vasu Dasari
c99d14775f ovs-macros: An option to suspend test execution on error
Origins for this patch are captured at
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048923.html.

Summarizing here, when a test fails, it would be good to pause test execution
and let the developer poke around the system to see current status of system.

As part of this patch, made a small tweaks to ovs-macros.at, so that when test
suite fails, ovs_on_exit() function will be called. And in this function, a check
is made to see if an environment variable to OVS_PAUSE_TEST is set. If it is
set, then test suite is paused and will continue to wait for user input
Ctrl-D. Meanwhile user can poke around the system to see why test case has
failed. Once done with investigation, user can press ctrl-d to cleanup the
test suite.

For example, to re-run test case 139:

export OVS_PAUSE_TEST=1
cd tests/system-userspace-testsuite.dir/139
sudo -E ./run

When error occurs, above command would display something like this:
=====================================================
Set environment variable to use various ovs utilities
export OVS_RUNDIR=/opt/vdasari/Developer/ovs/_build-gcc/tests/system-userspace-testsuite.dir/139
Press ENTER to continue:

=====================================================
And from another window, one can execute ovs-xxx commands like:
export OVS_RUNDIR=/opt/vdasari/Developer/ovs/_build-gcc/tests/system-userspace-testsuite.dir/139
$ ovs-ofctl dump-ports br0
.
.

To be able to pause while performing `make check`, one can do:
$ OVS_PAUSE_TEST=1 make check TESTSUITEFLAGS='-v'

Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Vasu Dasari <vdasari@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-07-16 09:53:44 -07:00