Upstream commit:
commit fbdcdd78da7c95f1b970d371e1b23cbd3aa990f3
Author: Martin Varghese <martin.varghese@nokia.com>
Date: Mon Nov 4 07:27:44 2019 +0530
Change in Openvswitch to support MPLS label depth of 3 in ingress
direction
The openvswitch module supported an MPLS label depth of only 1 in the
ingress direction, although userspace OVS supports a max depth
of 3 labels. This change enables the openvswitch module to support a
max depth of 3 labels in the ingress direction.
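As a rough illustration of what a label depth limit means, the sketch
below walks an MPLS label stack (4-byte entries, bottom-of-stack bit in
bit 8) and gives up past a maximum depth of 3. The names
count_mpls_labels, make_lse and MAX_LABEL_DEPTH are hypothetical; this is
not the kernel module's code.

```python
import struct

MAX_LABEL_DEPTH = 3  # assumed limit, mirroring the depth named above


def make_lse(label, bottom_of_stack, ttl=64):
    """Build one 4-byte label stack entry (label, S bit, TTL; TC left 0)."""
    return struct.pack("!I", (label << 12) | (int(bottom_of_stack) << 8) | ttl)


def count_mpls_labels(payload, max_depth=MAX_LABEL_DEPTH):
    """Return the stack depth, or None if truncated or deeper than max_depth."""
    for depth in range(1, max_depth + 1):
        offset = (depth - 1) * 4
        if len(payload) < offset + 4:
            return None  # stack ends before a bottom-of-stack entry
        (lse,) = struct.unpack_from("!I", payload, offset)
        if lse & 0x100:  # bottom-of-stack (S) bit
            return depth
    return None  # more labels than this sketch accepts
```

A stack of three entries whose last entry has the S bit set yields 3; a
fourth label makes the function return None.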
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The IPv6 ND Extensions are only implemented in the userspace datapath,
but nothing prevents them from being used with other datapaths.
This patch probes the datapath and only allows them if the support
is available.
Fixes: 9b2b84973 ("Support for match & set ICMPv6 reserved and options type fields")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Commit a76ba825 took care of saving and restoring the election timer in
the file header snapshot, but it didn't handle parsing of the election
timer in the install_snapshot_request/reply RPC, which causes problems.
For example, when the election timer change log is compacted in a
snapshot and then a new node joins the cluster, the new node will use
the default timer instead of the new value. This patch fixes it by
parsing the election timer in the snapshot RPC.
At the same time the patch updates the test case to cover the DB compact and
join scenario. The test reveals another 2 problems related to clustered DB
compaction, as commented in the test case's XXX, which need to be addressed
separately.
Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The patch adds support for fetching the datapath's capabilities
from the result of 'check_support()' and writing the supported capabilities
to a new database column, called 'capabilities', in the Datapath table.
To see how it works, run:
# ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
# ovs-vsctl -- --id=@m create Datapath datapath_version=0 \
'ct_zones={}' 'capabilities={}' \
-- set Open_vSwitch . datapaths:"netdev"=@m
# ovs-vsctl list-dp-cap netdev
ufid=true sample_nesting=true clone=true tnl_push_pop=true \
ct_orig_tuple=true ct_eventmask=true ct_state=true \
ct_clear=true max_vlan_headers=1 recirc=true ct_label=true \
max_hash_alg=1 ct_state_nat=true ct_timeout=true \
ct_mark=true ct_orig_tuple6=true check_pkt_len=true \
masked_set_action=true max_mpls_depth=3 trunc=true ct_zone=true
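For scripts that consume this output, a minimal sketch of turning the
key=value list into a dictionary follows. The helper name parse_dp_caps is
hypothetical and the parsing rules (booleans, then integers, then raw
strings) are an assumption based only on the sample output above.

```python
def parse_dp_caps(output):
    """Parse 'key=value' tokens, ignoring line-continuation backslashes."""
    caps = {}
    for token in output.replace("\\", " ").split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip anything that is not key=value
        if value in ("true", "false"):
            caps[key] = value == "true"
        elif value.isdigit():
            caps[key] = int(value)
        else:
            caps[key] = value  # keep unknown value types as strings
    return caps
```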
Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
---
v5:
Add improved documentation from Ben and
fix checkpatch error (tab and line 79 char)
v4:
rebase to master
v3:
fix 32-bit build, reported by Greg
travis: https://travis-ci.org/williamtu/ovs-travis/builds/599276267
v2:
rebase to master
Until now there were only two options for XDP mode in OVS: SKB or DRV,
i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.
Devices like 'veth' interfaces in Linux support native XDP, but
don't support zero-copy mode. This case cannot be covered by the
existing API, so we have to use the slower generic XDP for such devices.
There are a few more issues, e.g. TCP is not supported in generic XDP
mode for veth interfaces due to kernel limitations, however it is
supported in native mode.
This change introduces the ability to use native XDP without zero-copy
along with a best-effort configuration option that is enabled by default.
In the best-effort case OVS will sequentially try different modes starting
from the fastest one and will choose the first one acceptable for the
current interface. This selects the fastest mode the interface supports.
If a user wants to choose a specific mode, it's still possible by
setting 'options:xdp-mode'.
This change additionally changes the API by renaming the configuration
knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
themselves to be more user-friendly.
The full list of currently supported modes:
* native-with-zerocopy - former DRV
* native - new one, DRV without zero-copy
* generic - former SKB
* best-effort - new one, chooses the best available from
the 3 modes above
Since 'best-effort' is the default mode, users will not need to
explicitly set 'xdp-mode' in most cases.
TCP related tests are enabled again in the system afxdp testsuite, because
'best-effort' will choose 'native' mode for veth interfaces
and this mode has no issues with TCP.
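The best-effort selection can be pictured with the sketch below. The
function choose_xdp_mode and its try_configure callback are hypothetical
stand-ins for whatever the netdev code does to test a mode; only the mode
names and the fastest-first ordering come from the description above.

```python
XDP_MODES_FASTEST_FIRST = ["native-with-zerocopy", "native", "generic"]


def choose_xdp_mode(requested, try_configure):
    """Return the mode that was configured, or None if none could be.

    try_configure(mode) is a caller-supplied callable that returns True
    when the interface accepts the given mode.
    """
    if requested == "best-effort":
        candidates = XDP_MODES_FASTEST_FIRST
    else:
        candidates = [requested]  # explicit mode: no fallback
    for mode in candidates:
        if try_configure(mode):
            return mode
    return None
```

With a veth-like device that accepts only 'native' and 'generic', a
'best-effort' request ends up on 'native', while an explicit
'native-with-zerocopy' request fails instead of silently falling back.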
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
While running tests under valgrind, it could take more than 10 seconds
for a process to disappear after a successful 'ovs-appctl exit' command.
The same applies to some other events that tests wait for with the
OVS_WAIT macro. This makes tests fail frequently under valgrind.
Using the OVS_CTL_TIMEOUT variable instead of a constant 10 seconds seems
reasonable to avoid this issue because it controls timeouts of all
control utilities and needs to be adjusted while running under valgrind
anyway.
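The idea of a wait loop bounded by a single environment variable can be
sketched in Python as follows. This is only an analogy to the shell-based
OVS_WAIT macro, not its implementation; wait_until is a hypothetical name
and the fallback of 30 seconds is an assumption made for this sketch.

```python
import os
import time


def wait_until(condition, poll_interval=0.1):
    """Poll condition() until it is true or OVS_CTL_TIMEOUT seconds pass."""
    # The fallback value is an assumption for this sketch only.
    timeout = float(os.environ.get("OVS_CTL_TIMEOUT", "30"))
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False  # timed out
        time.sleep(poll_interval)
```

Raising OVS_CTL_TIMEOUT in one place then lengthens every such wait, which
is what makes slow valgrind runs manageable.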
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: William Tu <u9012063@gmail.com>
'usable_protocols' is now set to OFPUTIL_P_OF10_ANY on return from the
'parse_flow_monitor_request' function. The calling function now checks the
value of this variable against the 'allowed_protocols' variable.
Also, a check is added for match fields that are not supported in
OpenFlow 1.0; such fields now return an error.
Modified the man page of ovs-ofctl to reflect Flow Monitor support as an
OpenFlow 1.0 Nicira extension only.
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
parse_tcp_flags() does not handle vlan tags in a packet and thus is
not able to parse them. As a result, if partial offloading is
enabled in the userspace datapath, vlan packets are not parsed, i.e.
they have no initialized offsets. This causes an OVS crash on any attempt
to access/modify packet header fields.
For example, a flow with the following actions:
in_port=1,ip,actions=mod_nw_src:192.168.0.7,output:IN_PORT
will lead to an OVS crash on vlan packet handling:
Process terminating with default action of signal 11 (SIGSEGV)
Invalid read of size 4
at 0x785657: get_16aligned_be32 (unaligned.h:249)
by 0x785657: odp_set_ipv4 (odp-execute.c:82)
by 0x785657: odp_execute_masked_set_action (odp-execute.c:527)
by 0x785657: odp_execute_actions (odp-execute.c:894)
by 0x74CDA9: dp_netdev_execute_actions (dpif-netdev.c:7355)
by 0x74CDA9: packet_batch_per_flow_execute (dpif-netdev.c:6339)
by 0x74CDA9: dp_netdev_input__ (dpif-netdev.c:6845)
by 0x74DB6E: dp_netdev_input (dpif-netdev.c:6854)
by 0x74DB6E: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
by 0x74E863: dpif_netdev_run (dpif-netdev.c:5264)
by 0x703F57: type_run (ofproto-dpif.c:370)
by 0x6EC8B8: ofproto_type_run (ofproto.c:1760)
by 0x6DA52B: bridge_run__ (bridge.c:3188)
by 0x6E083F: bridge_run (bridge.c:3252)
by 0x1642E4: main (ovs-vswitchd.c:127)
Address 0xc is not stack'd, malloc'd or (recently) free'd
Fix that by properly parsing vlan tags first. Function 'parse_dl_type'
was transformed for that purpose as it had no users anyway.
Added a unit test for packet modification with partial offloading that
triggers the above crash.
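Why the vlan tags must be consumed first can be seen in a small sketch
that locates the L3 header of an Ethernet frame. The function l3_offset is
hypothetical and is not the OVS parser; it only illustrates that each
802.1Q/802.1ad tag shifts the L3 offset by 4 bytes.

```python
import struct

ETH_HEADER_LEN = 14
VLAN_HEADER_LEN = 4
VLAN_TPIDS = (0x8100, 0x88A8)  # 802.1Q and 802.1ad tag protocol IDs


def l3_offset(frame):
    """Return (eth_type, offset of the L3 header), or None if truncated."""
    if len(frame) < ETH_HEADER_LEN:
        return None
    (eth_type,) = struct.unpack_from("!H", frame, 12)
    offset = ETH_HEADER_LEN
    while eth_type in VLAN_TPIDS:
        if len(frame) < offset + VLAN_HEADER_LEN:
            return None  # tag is cut short
        # Skip the 2-byte TCI; the encapsulated ethertype follows it.
        (eth_type,) = struct.unpack_from("!H", frame, offset + 2)
        offset += VLAN_HEADER_LEN
    return eth_type, offset
```

A parser that ignores the tag would look for the IP header at offset 14 of
a tagged frame and read the wrong bytes, or record no offsets at all.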
Fixes: aab96ec4d81e ("dpif-netdev: retrieve flow directly from the flow mark")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Presently, replication is not allowed if there is a schema version mismatch
between the schema returned by the active ovsdb-server and the local db
schema. This causes failures in OVN DB HA deployments during upgrades.
In the case of an OpenStack tripleo deployment with OVN, OVN DB ovsdb-servers
are deployed on a multi node controller cluster in active/standby mode. During
minor updates or major upgrades, the cluster is updated one node at a time. If
node A is running the active OVN DB ovsdb-servers, then when it is updated,
another node B becomes active. After the update, when the OVN DB ovsdb-servers
on A are started, these ovsdb-servers fail to replicate from the active if
there is a schema version mismatch.
This patch addresses this issue by allowing replication even if there is a
schema version mismatch, but only if all the active db schema tables and
their columns are present in the local db schema.
This should not result in any data loss.
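The compatibility rule amounts to a subset check, sketched below. The
function can_replicate and the dict-of-sets schema representation are
assumptions for illustration; the real check operates on ovsdb schema
structures.

```python
def can_replicate(active_schema, local_schema):
    """Return True if every active table and column exists locally.

    Both arguments map a table name to the set of its column names.
    """
    for table, columns in active_schema.items():
        local_columns = local_schema.get(table)
        if local_columns is None:
            return False  # table missing from the local schema
        if not set(columns) <= set(local_columns):
            return False  # at least one column missing locally
    return True
```

Extra tables or columns on the local side are harmless under this rule,
which is why a newer standby can still follow an older active server.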
Signed-off-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The DPDK library allows OVS fast access to packet I/O in userspace. It
is not a datapath. This commit avoids using that term.
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Google's oss-fuzz builder bots were complaining that miniflow_target is
too slow to fuzz in that some tests take longer than a second to
complete. This patch fixes this by replacing the random flow generation
within the harness with a simpler scenario.
Signed-off-by: Bhargava Shastry <bshas3@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The 'OVS_CTL_TIMEOUT' environment variable is exported in tests/atlocal.in
and controls timeouts for all OVS utilities in the testsuite.
There should be no manual tweaks for each single command.
This helps with running tests under valgrind, where commands could
take a really long time, as you only need to change 'OVS_CTL_TIMEOUT'
in a single place.
A few manual timeouts were left in places where they make sense.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
This workaround only applied to kernels earlier than 2.6.37, but OVS
only supports 3.10 and later.
As the original author of this code, I won't miss it.
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
New 'make' target 'check-afxdp-valgrind'.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
The recently added test to check for the correct L3 L4 protocol
information after conntrack reassembles a packet should not run
in the userspace datapath. It is specific to a kernel datapath
regression.
Also change the name of the test to make it more informative and
less redundant and add comments with a short explanation.
Fixes: d7fd61a ("tests: Add check for correct l3l4 conntrack frag reassembly")
Suggested-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Two commits recently fixed an issue with setting the correct l3 and l4
flow information when conntrack reassembles packet fragments.
c98f776 datapath: Clear the L4 portion of the key for "later" fragments
2609173 datapath: Properly set L4 keys on "later" IP fragments
This test checks for regressions that might break this feature. It
counts on the fact that when the bug is present the udp src port
will not be correct. It will either be zero or else some other
garbage value. So the test feeds some fragments through for
reassembly and then checks to make sure that the udp src port
is actually the correct value of 5001.
Tested by reverting the above commits and observing that the test
then fails.
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A preceding commit removed the last remaining dependencies on OVN code,
so remove the OVN code.
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This removes a dependency on OVN from the tests.
This adds some options to ovs-vsctl to allow it to be used for testing
the clustering feature. The new options are undocumented because
they're really just useful for testing clustering.
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch is from the following upstream net-next commit along with
an updated system traffic test to avoid regression.
Upstream commit:
commit 7177895154e6a35179d332f4a584d396c50d0612
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date: Thu Aug 22 13:17:50 2019 -0700
openvswitch: Fix conntrack cache with timeout
This patch addresses a conntrack cache issue with timeout policy.
Currently, we do not check if the timeout extension is set properly in the
cached conntrack entry. Thus, after packet recirculate from conntrack
action, the timeout policy is not applied properly. This patch fixes the
aforementioned issue.
Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Python 2 reaches end-of-life on January 1, 2020, which is only
a few months away. This means that OVS needs to stop depending
on it in the next release, which should occur at roughly the same time.
Therefore, this commit removes all support for Python 2. It
also makes Python 3 a mandatory build dependency.
Some of the interesting consequences:
- HAVE_PYTHON, HAVE_PYTHON2, and HAVE_PYTHON3 conditionals have
been removed, since we now know that Python3 is available.
- $PYTHON and $PYTHON2 are removed, and $PYTHON3 is always
available.
- Many tests for Python 2 support have been removed, and the ones
that depended on Python 3 now run unconditionally. This allowed
several macros in the testsuite to be removed, making the code
clearer. This does make some of the changes to the testsuite
files large due to indentation level changes.
- #! lines for Python now use /usr/bin/python3 instead of
/usr/bin/python.
- Packaging depends on Python 3 packages.
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Tested-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch derives the timeout policy based on ct zone from the
internal data structure that we maintain on dpif layer.
It also adds a system traffic test to verify the zone-based conntrack
timeout feature. The test uses ovs-vsctl commands to configure
the customized ICMP and UDP timeout on zone 5 to a shorter period.
It then injects ICMP and UDP traffic to conntrack, and checks if the
corresponding conntrack entry expires after the predefined timeout.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT
This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With this
check, ofproto-dpif-xlate can use this information to decide whether to
translate the ct timeout policy.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
This patch adds support for specifying a timeout policy for a
connection in connection tracking system in kernel datapath.
The timeout policy will be attached to a connection when the
connection is committed to conntrack.
This patch introduces a new odp field OVS_CT_ATTR_TIMEOUT in the
ct action that specifies the timeout policy in the datapath.
In the following patch, during the upcall process, the vswitchd will use
the ct_zone to look up the corresponding timeout policy and fill
OVS_CT_ATTR_TIMEOUT if it is available.
The datapath code is from the following two net-next upstream commits.
Upstream commit:
commit 06bd2bdf19d2f3d22731625e1a47fa1dff5ac407
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date: Tue Mar 26 11:31:14 2019 -0700
openvswitch: Add timeout support to ct action
Add fine-grained timeout support to the conntrack action.
The new OVS_CT_ATTR_TIMEOUT attribute of the conntrack action
specifies a timeout to be associated with this connection.
If no timeout is specified, it acts as is, that is the default
timeout for the connection will be automatically applied.
Example usage:
$ nfct timeout add timeout_1 inet tcp syn_sent 100 established 200
$ ovs-ofctl add-flow br0 in_port=1,ip,tcp,action=ct(commit,timeout=timeout_1)
CC: Pravin Shelar <pshelar@ovn.org>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 6d670497e01803b486aa72cc1a718401ab986896
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date: Tue Apr 2 09:53:14 2019 +0300
openvswitch: use after free in __ovs_ct_free_action()
We free "ct_info->ct" and then use it on the next line when we pass it
to nf_ct_destroy_timeout(). This patch swaps the order to avoid the use
after free.
Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
For the userspace datapath, currently only the bridge itself, the LOCAL port,
can be the tunnel endpoint to encap/decap tunnel packets. This patch
enables a non-bridge port as the tunnel endpoint. One use case is for users
to create a bridge and a vtep port as tap, and configure the underlay IP at
the vtep port as the tunnel endpoint.
This patch causes failure for test "ptap - L3 over patch port". This is
because this test is already using non-bridge port gre1 as tunnel endpoint.
In this test, a flow is added to redirect tunnel packets to gre1 port,
as shown below:
ovs-ofctl add-flow br1 in_port=p1,actions=output=gre1
It later generates a datapath flow which matches an extra eth field:
- recirc_id(0),...,eth_type(0x0800),...
+ recirc_id(0),...,eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),...
With this patch, this flow needs only a NORMAL action.
Signed-off-by: William Tu <u9012063@gmail.com>
Co-authored-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This may be needed in some special cases, such as to support some hardware
offload implementations. Note that disabling TCP sequence number
verification is not an optimization in itself, but supporting some
hardware offload implementations may offer better performance. TCP
sequence number verification is enabled by default. This option is only
available for the userspace datapath. Access to this option is presently
provided via 'dpctl' commands as the need for this option is quite node
specific, by virtue of which nics are in use on a given node. A test is
added to verify this option.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359188.html
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Cache the 'conn' context and use it when it is valid. The cached 'conn'
context will get reset if it is not expected to be valid; the cost to do
this is negligible. Besides being the most efficient approach, this also handles corner
cases, such as decapsulation leading to the same tuple, as in tunnel VPN
cases. A negative test is added to check the resetting of the cached
'conn'.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add support in ovsdb-tool for migrating clustered dbs to standalone dbs.
E.g. usage to migrate nb/sb db to standalone db from raft:
ovsdb-tool cluster-to-standalone ovnnb_db.db ovnnb_db_cluster.db
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tables and columns may be abbreviated to unique prefixes, but until
now the error messages have just said there's more than one match.
This commit makes the error messages list the possibilities.
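A minimal sketch of unique-prefix resolution with a helpful error follows.
The function resolve_abbreviation, its case-insensitive matching rule and
the error wording are all assumptions for illustration; the real matching
in the database tools may treat names differently.

```python
def resolve_abbreviation(name, candidates):
    """Resolve name to the single candidate it abbreviates.

    Raises ValueError listing the possibilities when the prefix is
    ambiguous, or saying so when nothing matches.
    """
    if name in candidates:
        return name  # an exact match always wins
    prefix = name.lower()
    matches = sorted(c for c in candidates if c.lower().startswith(prefix))
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError('no table named "%s"' % name)
    raise ValueError(
        '"%s" matches multiple table names: %s' % (name, ", ".join(matches))
    )
```

With the sample names Controller, CT_Zone and Datapath, the prefix "d"
resolves to Datapath, while "c" fails with a message naming both
Controller and CT_Zone.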
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This worked fine as long as there was only one table whose name started
with "C", but now we have three of them.
Acked-by: Justin Pettit <jpettit@ovn.org>
Fixes: 61a5264d60d0 ("ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
OVN is separated into its own repo. This commit removes the OVN source,
OVN tests, and OVN documentation. It also removes mentions of OVN from
most documentation. The only place where OVN has been left is in
changelogs/NEWS, since we shouldn't mess with the history of the
project.
There is an exception here. The ovsdb-cluster tests rely on ovn-nbctl
and ovn-sbctl to run. Therefore those ovn utilities, as well as their
dependencies remain in the repo with this commit.
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Useful for tracking where the uninitialized memory came from.
Report example:
Thread 13 revalidator11:
Conditional jump or move depends on uninitialised value(s)
at 0x4C35D96: __memcmp_sse4_1 (in vgpreload_memcheck.so)
by 0x9D4404: ofpbuf_equal (ofpbuf.h:273)
by 0x9D4404: revalidate_ukey__ (ofproto-dpif-upcall.c:2219)
<...>
by 0x6AF488E: clone (clone.S:95)
Uninitialised value was created by a stack allocation
at 0x9D4450: compose_slow_path (ofproto-dpif-upcall.c:1062)
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
When a packet needs to be encapsulated in userspace, the endpoint
address needs to be resolved to fill in the headers. If it is not,
then currently OvS sends either a Neighbor Solicitation (IPv6)
or an ARP Query (IPv4) to resolve it.
The problem is that the NS/ARP packet will go through the flow
rules in the new bridge, but inheriting the ofproto table version
from the original packet to be encapsulated. When those versions
don't match, the result is unexpected because no flow rules might
be visible, which would cause the default table rule to be used
to drop the packet. Or only part of the flow rules would be visible
and so on.
Since the NS/ARP packet is created by OvS and will be injected in
the outgoing bridge, use the corresponding ofproto version instead.
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-By: Vasu Dasari <vdasari@gmail.com>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Do not strictly require python2 for CHECK_CONNTRACK macro definitions in
system-{kmod,userspace}-macros.at
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Fix conntrack checks in the following tests in tests/system-ovn.at:
- ovn -- DNAT and SNAT on distributed router - N/S
- ovn -- DNAT and SNAT on distributed router - E/W
Fixes: a6ee09882283 ("OVN: run local logical flows first in S_ROUTER_OUT_SNAT table")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When a client uses monitor-cond-since with a non-zero last-id but the
server is not in cluster mode for the DB being monitored, it leads to
a segmentation fault because the txn_history list is not initialized in
this case.
Program terminated with signal SIGSEGV, Segmentation fault.
1536 struct ovsdb_txn *txn = h_node->txn;
(gdb) bt
0 ovsdb_monitor_get_changes_after (txn_uuid=txn_uuid@entry=0x7ffe8605b7e0, dbmon=0x17c1b40, p_mcs=p_mcs@entry=0x17c4900) at ovsdb/monitor.c:1536
1 0x000000000040da2d in ovsdb_jsonrpc_monitor_create (request_id=0x1804630, version=<optimized out>, params=0x17ad330, db=0x18015b0, s=<optimized out>) at ovsdb/jsonrpc-server.c:1469
2 ovsdb_jsonrpc_session_got_request (request=0x17ad520, s=<optimized out>) at ovsdb/jsonrpc-server.c:1002
3 ovsdb_jsonrpc_session_run (s=<optimized out>) at ovsdb/jsonrpc-server.c:556
...
Although it doesn't happen in normal use cases, nothing prevents a
client from sending this on purpose, or in a corner case where a client
first connected to a clustered DB but the server later restarted with a
non-clustered DB.
This patch fixes it by always initializing the txn_history list to avoid
the undefined behavior in this case. It adds a test case to cover it, too.
Fixes: 695e815 ("ovsdb-server: Transaction history tracking.")
Reported-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A new unixctl command cluster/change-election-timer is implemented to
change the base value of the leader election timeout according to scale needs.
The change takes effect upon consensus of the cluster, implemented through
the append-request RPC. A new field "election-timer" is added to the raft log
entry for this purpose.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
candidate_retrying is used to determine if the current node is disconnected
from the cluster when the node is in the candidate role. However, a node
can flap between the candidate and follower roles before a leader is elected
when a majority of the cluster is down, so is_connected() will flap, too, which
confuses clients.
This patch avoids the flapping with the help of a new member had_leader,
so that if no leader was elected since the last election, we know we are
still retrying, and stay disconnected from the cluster.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
As mentioned in the Raft paper, section 6.2:
Leaders: A server might be in the leader state, but if it isn’t the current
leader, it could be needlessly delaying client requests. For example, suppose a
leader is partitioned from the rest of the cluster, but it can still
communicate with a particular client. Without additional mechanism, it could
delay a request from that client forever, being unable to replicate a log entry
to any other servers. Meanwhile, there might be another leader of a newer term
that is able to communicate with a majority of the cluster and would be able to
commit the client’s request. Thus, a leader in Raft steps down if an election
timeout elapses without a successful round of heartbeats to a majority of its
cluster; this allows clients to retry their requests with another server.
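The step-down condition in the quoted passage can be sketched as a simple
majority check. The function leader_should_step_down and its arguments are
hypothetical and do not reflect how the ovsdb raft code tracks heartbeat
replies.

```python
def leader_should_step_down(now, election_timeout, last_reply, cluster_size):
    """Return True if the leader has not heard from a majority recently.

    last_reply maps each follower id to the time of its last successful
    heartbeat reply; the leader always counts itself as reachable.
    """
    reachable = 1 + sum(
        1 for replied_at in last_reply.values()
        if now - replied_at < election_timeout
    )
    # A majority means strictly more than half of the cluster.
    return reachable * 2 <= cluster_size
```

In a 3-node cluster the leader keeps its role as long as one follower has
replied within the election timeout, and steps down once both are silent.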
Reported-by: Aliasgar Ginwala <aginwala@ebay.com>
Tested-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When clustered mode is used, the client needs to retry connecting
to new servers when certain failures happen. Today a new connection
is retried only if multiple remotes are used, which prevents
using an LB VIP with clustered nodes. This patch makes sure the retry
logic works when using an LB VIP: although the same IP is used for retrying,
the LB can actually redirect the connection to a new node.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This allows using a one-character expression inside the 'if'
statement and multiple spaces before the line continuation character.
Fixes a false positive in cases like this:
#define MACRO(ARG) \
if (a) { \
do_work(ARG); \
}
Fixes: 16770c6d9179 ("checkpatch: support macro continuation")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
No need to use quotes for strings like "br0".
Keep UUIDs always in quotes to avoid different treatment of those
that start with digits and those that start with letters.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
These fragment-related tests are failing on later kernels (4.19.x)
because the kernel quietly drops any packet fragment that is not the last
but has a size smaller than IPV6_MIN_MTU. This patch fixes
them by increasing their sizes to IPV6_MIN_MTU.
Reviewed-by: Darrell Ball <dlu998@gmail.com>
Reviewed-at: https://github.com/openvswitch/ovs/pull/278
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Say an ARP entry is learnt on an OVS port. When such a port is deleted, the
learnt entry should be removed from the port. Previously it would have aged
out only after the ARP ageout time. This code cleans it up immediately.
Added a test case (tunnel - neighbor entry add and deletion) in tunnel.at to
verify that neighbors are added and then removed on deletion of ports and bridges.
Discussion for this addition is at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html
Signed-off-by: Vasu Dasari <vdasari@gmail.com>
Reviewed-by: Flavio Fernandes <flavio@flaviof.com>
Reviewed-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The patch adds '-n' to tcpdump to avoid address conversion. Add '-l' for rhel8
to avoid buffering. Since '-U' is used to output to stdout, simply use 'cat'
to search result. Use OVS_WAIT_UNTIL instead of sleep, and also remove/add
some newlines. Finally, move tcpdump captured interface into the namespace,
(capture p1 instead of ovs-p1), and tested using af_xdp.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
The patch introduces experimental AF_XDP support for OVS netdev.
AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
type built upon the eBPF and XDP technology. It aims to have comparable
performance to DPDK but cooperate better with the existing kernel networking
stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program
attached to the netdev, bypassing a couple of the Linux kernel's subsystems.
As a result, an AF_XDP socket shows much better performance than AF_PACKET.
For more details about AF_XDP, please see the Linux kernel's
Documentation/networking/af_xdp.rst. Note that by default, this feature is
not compiled in.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
New IP Multicast Snooping Options are added to the Northbound DB
Logical_Switch:other_config column. These allow enabling IGMP snooping and
querier on the logical switch and get translated by ovn-northd to rows in
the IP_Multicast Southbound DB table.
ovn-northd monitors for changes done by ovn-controllers in the Southbound DB
IGMP_Group table. Based on the entries in IGMP_Group ovn-northd creates
Multicast_Group entries in the Southbound DB, one per IGMP_Group address X,
containing the list of logical switch ports (aggregated from all controllers)
that have IGMP_Group entries for that datapath and address X. ovn-northd
also creates a logical flow that matches on IP multicast traffic destined
to address X and outputs it on the tunnel key of the corresponding
Multicast_Group entry.
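The aggregation step described above can be sketched as a group-by over
(datapath, address). The row layout with "datapath", "address", "chassis"
and "ports" keys and the function aggregate_igmp_groups are assumptions
for illustration, not the ovn-northd data structures.

```python
from collections import defaultdict


def aggregate_igmp_groups(igmp_groups):
    """Merge per-chassis IGMP_Group rows into one port list per group.

    Returns a dict keyed by (datapath, address) whose values are the
    sorted ports collected from every chassis that learned the group.
    """
    merged = defaultdict(set)
    for row in igmp_groups:
        merged[(row["datapath"], row["address"])].update(row["ports"])
    return {key: sorted(ports) for key, ports in merged.items()}
```

Each resulting entry corresponds to one Multicast_Group row: the same
group learned on two chassis collapses into a single port list.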
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A new IP_Multicast table is added to Southbound DB. This table stores the
multicast related configuration for each datapath. Each row will be
populated by ovn-northd and will control:
- if IGMP Snooping is enabled or not, the snooping table size and multicast
group idle timeout.
- if IGMP Querier is enabled or not (only if snooping is enabled too), query
interval, query source addresses (Ethernet and IP) and the max-response
field to be stored in outgoing queries.
- an additional "seq_no" column is added such that ovn-sbctl or if needed a
CMS can flush currently learned groups. This can be achieved by incrementing
the "seq_no" value.
A new IGMP_Group table is added to Southbound DB. This table stores all the
multicast groups learned by ovn-controllers. The table is indexed by
datapath, group address and chassis. For a learned multicast group on a
specific datapath each ovn-controller will store its own row in this table.
Each row contains the list of chassis-local ports on which the group was
learned. Rows in the IGMP_Group table are updated or deleted only by the
ovn-controllers that created them.
A new action ("igmp") is added to punt IGMP packets on a specific logical
switch datapath to ovn-controller if IGMP snooping is enabled.
Per datapath IGMP multicast snooping support is added to pinctrl:
- incoming IGMP reports are processed and multicast groups are maintained
(using the OVS mcast-snooping library).
- each OVN controller syncs its in-memory IGMP groups to the Southbound DB
in the IGMP_Group table.
- pinctrl also sends periodic IGMPv3 general queries for all datapaths where
querier is enabled.
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Co-authored-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Origins for this patch are captured at
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048923.html.
Summarizing here: when a test fails, it would be good to pause test execution
and let the developer poke around the system to see its current status.
As part of this patch, made small tweaks to ovs-macros.at so that when a test
suite fails, the ovs_on_exit() function is called. In this function, a check
is made to see if the environment variable OVS_PAUSE_TEST is set. If it is
set, then the test suite is paused and will wait for user input
Ctrl-D. Meanwhile the user can poke around the system to see why the test case
has failed. Once done with the investigation, the user can press Ctrl-D to
clean up the test suite.
For example, to re-run test case 139:
export OVS_PAUSE_TEST=1
cd tests/system-userspace-testsuite.dir/139
sudo -E ./run
When an error occurs, the above command displays something like this:
=====================================================
Set environment variable to use various ovs utilities
export OVS_RUNDIR=/opt/vdasari/Developer/ovs/_build-gcc/tests/system-userspace-testsuite.dir/139
Press ENTER to continue:
=====================================================
And from another window, one can execute ovs-xxx commands like:
export OVS_RUNDIR=/opt/vdasari/Developer/ovs/_build-gcc/tests/system-userspace-testsuite.dir/139
$ ovs-ofctl dump-ports br0
.
.
To be able to pause while performing `make check`, one can do:
$ OVS_PAUSE_TEST=1 make check TESTSUITEFLAGS='-v'
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Vasu Dasari <vdasari@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>