2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-29 21:38:13 +00:00

3402 Commits

Author SHA1 Message Date
Ilya Maximets
275f78f95a dpif-netdev.at: Wait for miss upcall log.
Some tests checks for 'miss upcall' log in a log file immediately
after sending the packet, this causes test failures while running
them under valgrind or on the overloaded system.

Fix that by waiting for appearance of the actual string in the log
file.  Some other tests uses 'sleep 1' to fix that, but it's better
to wait for event than sleep for a specific amount of time.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-07-23 10:20:18 -07:00
Ilya Maximets
9e11517e6c ovs-router: Fix flushing of local routes.
Since commit 8e4e45887ec3, priority of 'local' route entries no
longer matches with 'plen'.  This should be taken into account
while flushing cached routes, otherwise they will remain in OVS
even after removing them from the system:

  # ifconfig eth0 11.0.0.1
  # ovs-appctl ovs/route/show
    --- A new route synchronized from kernel route table ---
    Cached: 11.0.0.1/32 dev eth0 SRC 11.0.0.1 local
  # ifconfig eth0 0
  # ovs-appctl ovs/route/show
    -- the new route entry is still in ovs route table ---
    Cached: 11.0.0.1/32 dev eth0 SRC 11.0.0.1 local

CC: wenxu <wenxu@ucloud.cn>
Fixes: 8e4e45887ec3 ("ofproto-dpif-xlate: makes OVS native tunneling honor tunnel-specified source addresses")
Reported-by: Zheng Jingzhou <glovejmm@163.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-July/373093.html
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-07-21 06:23:49 -07:00
Jeff Squyres
b4e50218a0 bond: Add 'primary' interface concept for active-backup mode.
In AB bonding, if the current active slave becomes disabled, a
replacement slave is arbitrarily picked from the remaining set of
enabled slaves.  This commit adds the concept of a "primary" slave: an
interface that will always be (or become) the current active slave if
it is enabled.

The rationale for this functionality is to allow the designation of a
preferred interface for a given bond.  For example:

1. Bond is created with interfaces p1 (primary) and p2, both enabled.
2. p1 becomes the current active slave (because it was designated as
   the primary).
3. Later, p1 fails/becomes disabled.
4. p2 is chosen to become the current active slave.
5. Later, p1 becomes re-enabled.
6. p1 is chosen to become the current active slave (because it was
   designated as the primary)

Note that p1 becomes the active slave once it becomes re-enabled, even
if nothing has happened to p2.

This "primary" concept exists in Linux kernel network interface
bonding, but did not previously exist in OVS bonding.

Only one primary slave interface is supported per bond, and is only
supported for active/backup bonding.

The primary slave interface is designated via
"other_config:bond-primary" when creating a bond.

Also, while adding tests for the "primary" concept, make a few small
improvements to the non-primary AB bonding test.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-17 01:42:18 +02:00
Eli Britstein
77057965cb dpif-netdev: Don't use zero flow mark.
Zero flow mark is used to indicate the HW to remove the mark. A packet
marked with zero mark is received in SW without a mark at all, so it
cannot be used as a valid mark. Change the pool range to fix it.

Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roni Bar Yanai <roniba@mellanox.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-08 17:50:36 +02:00
Eli Britstein
9ac365a8ed dpif-netdev: Add mega ufid in flow add/del log.
As offload is done using the mega ufid of a flow, for better
debugability, add it in the log message.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roni Bar Yanai <roniba@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-08 17:50:36 +02:00
Adrian Moreno
82a106ebfb ofproto: Delete buckets when lb_output is false.
When lb-output-action is toggled back to "false" buckets are not being
deleted. Delete them as they will no longer be used.

Add unit test to verify buckets are correctly deleted.

Cc: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-07 01:40:32 +02:00
Yi-Hung Wei
98670b77ff bridge: Fix null dereference on ct_timeout_policy record
Accoridng to vswitch.ovsschema, each CT_Zone record may have
zero or one associcated CT_Timeout_policy.  Thus, this patch
checks if ovsrec_ct_timeout_policy exist before accesses the
record.

VMWare-BZ: 2585825
Fixes: 45339539f69d ("ovs-vsctl: Add conntrack zone commands.")
Fixes: 993cae678bca ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables")
Reported-by: Yang Song <yangsong@vmware.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-06-27 16:40:04 -07:00
Matteo Croce
3950e350d2 ofproto-dpif.at: Add unit test for lb_output action.
Extend the balance-tcp one so it tests lb-output action too.
The test checks that that the option is shown in bond/show,
and that the lb_output action is programmed in the datapath.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-06-22 13:11:51 +02:00
Vishal Deep Ajmera
9df65060cf userspace: Avoid dp_hash recirculation for balance-tcp bond mode.
Problem:

In OVS, flows with output over a bond interface of type “balance-tcp”
gets translated by the ofproto layer into "HASH" and "RECIRC" datapath
actions. After recirculation, the packet is forwarded to the bond
member port based on 8-bits of the datapath hash value computed through
dp_hash. This causes performance degradation in the following ways:

1. The recirculation of the packet implies another lookup of the
packet’s flow key in the exact match cache (EMC) and potentially
Megaflow classifier (DPCLS). This is the biggest cost factor.

2. The recirculated packets have a new “RSS” hash and compete with the
original packets for the scarce number of EMC slots. This implies more
EMC misses and potentially EMC thrashing causing costly DPCLS lookups.

3. The 256 extra megaflow entries per bond for dp_hash bond selection
put additional load on the revalidation threads.

Owing to this performance degradation, deployments stick to “balance-slb”
bond mode even though it does not do active-active load balancing for
VXLAN- and GRE-tunnelled traffic because all tunnel packet have the
same source MAC address.

Proposed optimization:

This proposal introduces a new load-balancing output action instead of
recirculation.

Maintain one table per-bond (could just be an array of uint16's) and
program it the same way internal flows are created today for each
possible hash value (256 entries) from ofproto layer. Use this table to
load-balance flows as part of output action processing.

Currently xlate_normal() -> output_normal() ->
bond_update_post_recirc_rules() -> bond_may_recirc() and
compose_output_action__() generate 'dp_hash(hash_l4(0))' and
'recirc(<RecircID>)' actions. In this case the RecircID identifies the
bond. For the recirculated packets the ofproto layer installs megaflow
entries that match on RecircID and masked dp_hash and send them to the
corresponding output port.

Instead, we will now generate action as
    'lb_output(<bond id>)'

This combines hash computation (only if needed, else re-use RSS hash)
and inline load-balancing over the bond. This action is used *only* for
balance-tcp bonds in userspace datapath (the OVS kernel datapath
remains unchanged).

Example:
Current scheme:

With 8 UDP flows (with random UDP src port):

  flow-dump from pmd on cpu core: 2
  recirc_id(0),in_port(7),<...> actions:hash(hash_l4(0)),recirc(0x1)

  recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0xb236c260/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0x7d89eb18/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0xa78d75df/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0xb58d846f/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0x24534406/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0x3cf32550/0xff),<...> actions:1

New scheme:
We can do with a single flow entry (for any number of new flows):

  in_port(7),<...> actions:lb_output(1)

A new CLI has been added to dump datapath bond cache as given below.

 # ovs-appctl dpif-netdev/bond-show [dp]

   Bond cache:
     bond-id 1 :
       bucket 0 - slave 2
       bucket 1 - slave 1
       bucket 2 - slave 2
       bucket 3 - slave 1

Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-06-22 13:11:51 +02:00
Roi Dayan
1fe4297563 netdev-offload-tc: Revert tunnel src/dst port masks handling
The cited commit intended to add tc support for masking tunnel src/dst
ips and ports. It's not possible to do tunnel ports masking with
openflow rules and the default mask for tunnel ports set to 0 in
tnl_wc_init(), unlike tunnel ports default mask which is full mask.
So instead of never passing tunnel ports to tc, revert the changes
to tunnel ports to always pass the tunnel port.
In sw classification is done by the kernel, but for hw we must match
the tunnel dst port.

Fixes: 5f568d049130 ("netdev-offload-tc: Allow to match the IP and port mask of tunnel")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-19 08:51:11 +02:00
Dumitru Ceara
d072d2de01 ofproto-dpif-trace: Improve NAT tracing.
When ofproto/trace detects a recirc action it resumes execution at the
specified next table. However, if the ct action performs SNAT/DNAT,
e.g., ct(commit,nat(src=1.1.1.1:4000),table=42), the src/dst IPs and
ports in the oftrace_recirc_node->flow field are not updated. This leads
to misleading outputs from ofproto/trace as real packets would actually
first get NATed and might match different flows when recirculated.

Assume the first IP/port from the NAT src/dst action will be used by
conntrack for the translation and update the oftrace_recirc_node->flow
accordingly. This is not entirely correct as conntrack might choose a
different IP/port but the result is more realistic than before.

This fix covers new connections. However, for reply traffic that executes
actions of the form ct(nat, table=42) we still don't update the flow as
we don't have any information about conntrack state when tracing.

Also move the oftrace_recirc_node processing out of ofproto_trace()
and to its own function, ofproto_trace_recirc_node() for better
readability/

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-06-16 15:07:31 -07:00
Dumitru Ceara
ae25f8c8ff ovsdb-idl: Avoid inconsistent IDL state with OVSDB_MONITOR_V3.
Assuming an ovsdb client connected to a database using OVSDB_MONITOR_V3
(i.e., "monitor_cond_since" method) with the initial monitor condition
MC1.

Assuming the following two transactions are executed on the
ovsdb-server:
TXN1: "insert record R1 in table T1"
TXN2: "insert record R2 in table T2"

If the client's monitor condition MC1 for table T2 matches R2 then the
client will receive the following update3 message:
method="update3", "insert record R2 in table T2", last-txn-id=TXN2

At this point, if the presence of the new record R2 in the IDL triggers
the client to update its monitor condition to MC2 and add a clause for
table T1 which matches R1, a monitor_cond_change message is sent to the
server:
method="monitor_cond_change", "clauses from MC2"

In normal operation the ovsdb-server will reply with a new update3
message of the form:
method="update3", "insert record R1 in table T1", last-txn-id=TXN2

However, if the connection drops in the meantime, this last update might
get lost.

It might happen that during the reconnect a new transaction happens
that modifies the original record R1:
TXN3: "modify record R1 in table T1"

When the client reconnects, it will try to perform a fast resync by
sending:
method="monitor_cond_since", "clauses from MC2", last-txn-id=TXN2

Because TXN2 is still in the ovsdb-server transaction history, the
server replies with the changes from the most recent transactions only,
i.e., TXN3:
result="true", last-txbb-id=TXN3, "modify record R1 in table T1"

This causes the IDL on the client in to end up in an inconsistent
state because it has never seen the update that created R1.

Such a scenario is described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1808580#c22

To avoid this issue, the IDL will now maintain (up to) 3 different
types of conditions for each DB table:
- new_cond: condition that has been set by the IDL client but has
  not yet been sent to the server through monitor_cond_change.
- req_cond: condition that has been sent to the server but the reply
  acknowledging the change hasn't been received yet.
- ack_cond: condition that has been acknowledged by the server.

Whenever the IDL FSM is restarted (e.g., voluntary or involuntary
disconnect):
- if there is a known last_id txn-id the code ensures that new_cond
  will contain the most recent condition set by the IDL client
  (either req_cond if there was a request in flight, or new_cond
  if the IDL client set a condition while the IDL was disconnected)
- if there is no known last_id txn-id the code ensures that ack_cond will
  contain the most recent conditions set by the IDL client regardless
  whether they were acked by the server or not.

When monitor_cond_since/monitor_cond requests are sent they will
always include ack_cond and if new_cond is not NULL a follow up
monitor_cond_change will be generated afterwards.

On the other hand ovsdb_idl_db_set_condition() will always modify new_cond.

This ensures that updates of type "insert" that happened before the last
transaction known by the IDL but didn't match old monitor conditions are
sent upon reconnect if the monitor condition has changed to include them
in the meantime.

Fixes: 403a6a0cb003 ("ovsdb-idl: Fast resync from server when connection reset.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-06-15 14:07:29 +02:00
Tonghao Zhang
5f568d0491 netdev-offload-tc: Allow to match the IP and port mask of tunnel
This patch allows users to offload the TC flower rules with
tunnel mask. This patch allows masked match of the following,
where previously supported an exact match was supported:
* Remote (dst) tunnel endpoint address
* Local (src) tunnel endpoint address
* Remote (dst) tunnel endpoint UDP port

And also allows masked match of the following, where previously
no match was supported:
* Local (src) tunnel endpoint UDP port

In some case, mask is useful as wildcards. For example, DDOS,
in that case, we don’t want to allow specified hosts IPs or
only source Ports to access the targeted host. For example:

$ ovs-appctl dpctl/add-flow "tunnel(dst=2.2.2.100,src=2.2.2.0/255.255.255.0,tp_dst=4789),\
  recirc_id(0),in_port(3),eth(),eth_type(0x0800),ipv4()" ""

$ tc filter show dev vxlan_sys_4789 ingress
  ...
  eth_type ipv4
  enc_dst_ip 2.2.2.100
  enc_src_ip 2.2.2.0/24
  enc_dst_port 4789
  enc_ttl 64
  in_hw in_hw_count 2
	action order 1: gact action drop
    ...

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-03 09:56:07 +02:00
Eiichi Tsukata
a611705990 classifier: Prevent tries vs n_tries race leading to NULL dereference.
Currently classifier tries and n_tries can be updated not atomically,
there is a race condition which can lead to NULL dereference.
The race can happen when main thread updates a classifier tries and
n_tries in classifier_set_prefix_fields() and at the same time revalidator
or handler thread try to lookup them in classifier_lookup__(). Such race
can be triggered when user changes prefixes of flow_table.

Race(user changes flow_table prefixes: ip_dst,ip_src => none):

  [main thread]             [revalidator/handler thread]
  ===========================================================
                            /* cls->n_tries == 2 */
                            for (int i = 0; i < cls->n_tries; i++) {
  trie_init(cls, i, NULL);
  /* n_tries == 0 */
  cls->n_tries = n_tries;
                            /* cls->tries[i]->feild is NULL */
                            trie_ctx_init(&trie_ctx[i],&cls->tries[i]);
                            /* trie->field is NULL */
                            ctx->be32ofs = trie->field->flow_be32ofs;

To prevent the race, instead of re-introducing internal mutex
implemented in the commit fccd7c092e09 ("classifier: Remove internal
mutex."), this patch makes trie field RCU protected and checks it after
read.

Fixes: fccd7c092e09 ("classifier: Remove internal mutex.")
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-05-28 11:46:39 +02:00
William Tu
fae687c85e oss-fuzz: Fix miniflow_target.c.
Clang reports:
tests/oss-fuzz/miniflow_target.c:209:26: error: suggest braces around \
initialization of subobject
      [-Werror,-Wmissing-braces]
          struct flow flow2 = {0};

Fix it by using memset.

Cc: Bhargava Shastry <bshastry@sect.tu-berlin.de>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-05-14 08:38:38 -07:00
Yi-Hung Wei
1740aaf49d metaflow: Fix maskable conntrack orig tuple fields
From man ovs-fields(7), the conntrack origin tuple fields
ct_nw_src/dst, ct_ipv6_src/dst, and ct_tp_src/dst are supposed
to be bitwise maskable, but they are not.  This patch enables
those fields to be maskable, and adds a regression test.

Fixes: daf4d3c18da4 ("odp: Support conntrack orig tuple key.")
Reported-by: Wenying Dong <wenyingd@vmware.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-05-14 07:42:18 -07:00
William Tu
5f01a9019f tests: Add tests using tap device.
Similar to using veth across namespaces, this patch creates
tap devices, assigns to namespaces, and allows traffic to
go through different test cases.

Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-05-14 07:22:35 -07:00
William Tu
29bb3093eb userspace: Enable TSO support for non-DPDK.
This patch enables TSO support for non-DPDK use cases, and
also add check-system-tso testsuite. Before TSO, we have to
disable checksum offload, allowing the kernel to calculate the
TCP/UDP packet checsum. With TSO, we can skip the checksum
validation by enabling checksum offload, and with large packet
size, we see better performance.

Consider container to container use cases:
  iperf3 -c (ns0) -> veth peer -> OVS -> veth peer -> iperf3 -s (ns1)
And I got around 6Gbps, similar to TSO with DPDK-enabled.

Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-05-14 07:21:34 -07:00
William Tu
2078901a4c userspace: Add conntrack timeout policy support.
Commit 1f1613183733 ("ct-dpif, dpif-netlink: Add conntrack timeout
policy support") adds conntrack timeout policy for kernel datapath.
This patch enables support for the userspace datapath.  I tested
using the 'make check-system-userspace' which checks the timeout
policies for ICMP and UDP cases.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
2020-05-01 08:22:45 -07:00
Yi-Hung Wei
81f71381ff ofp-actions: Add delete field action
This patch adds a new OpenFlow action, delete field, to delete a
field in packets.  Currently, only the tun_metadata fields are
supported.

One use case to add this action is to support multiple versions
of geneve tunnel metadatas to be exchanged among different versions
of networks.  For example, we may introduce tun_metadata2 to
replace old tun_metadata1, but still want to provide backward
compatibility to the older release.  In this case, in the new
OpenFlow pipeline, we would like to support the case to receive a
packet with tun_metadata1, do some processing.  And if the packet
is going to a switch in the newer release, we would like to delete
the value in tun_metadata1 and set a value into tun_metadata2.

Currently, ovs does not provide an action to remove a value in
tun_metadata if the value is present.  This patch fulfills the gap
by adding the delete_field action.  For example, the OpenFlow
syntax to delete tun_metadata1 is:

    actions=delete_field:tun_metadata1

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
2020-04-29 09:00:54 -07:00
William Tu
5bfc519fee netdev-afxdp: Add interrupt mode netdev class.
The patch adds a new netdev class 'afxdp-nonpmd' to enable afxdp
interrupt mode. This is similar to 'type=afxdp', except that the
is_pmd field is set to false. As a result, the packet processing
is handled by main thread, not pmd thread. This avoids burning
the CPU to always 100% when there is no traffic.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-04-28 17:58:31 +02:00
Yifeng Sun
134e6831ac system-traffic: Check frozen state handling with TLV map change
This patch enhances a system traffic test to prevent regression on
the tunnel metadata table (tun_table) handling with frozen state.
Without a proper fix this test can crash ovs-vswitchd due to a
use-after-free bug on tun_table.

These are the timed sequence of how this bug is triggered:

- Adds an OpenFlow rule in OVS that matches Geneve tunnel metadata that
contains a controller action.
- When the first packet matches the aforementioned OpenFlow rule,
during the miss upcall, OVS stores a pointer to the tun_table (that
decodes the Geneve tunnel metadata) in a frozen state and pushes down
a datapath flow into kernel datapath.
- Issues a add-tlv-map command to reprogram the tun_table on OVS.
OVS frees the old tun_table and create a new tun_table.
- A subsequent packet hits the kernel datapath flow again. Since
there is a controller action associated with that flow, it triggers
slow path controller upcall.
- In the slow path controller upcall, OVS derives the tun_table
from the frozen state, which points to the old tun_table that is
already being freed at this time point.
- In order to access the tunnel metadata, OVS uses the invalid
pointer that points to the old tun_table and triggers the core dump.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Co-authored-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-04-10 08:20:45 -07:00
Malvika Gupta
804ef1a702 tests/testsuite: Skip failing UT cases on aarch64
The following test cases are failing inconsistently on aarch64 platforms and
have been skipped until further investigation can be made on how to fix them:

20: bfd.at:268           bfd - bfd decay
2104: ovsdb-idl.at:1815  Check Python IDL connects to leader - Python3 (leader only)
2105: ovsdb-idl.at:1816  Check Python IDL reconnects to leader - Python3 (leader only)

Suggested-by: Yanqin Wei <Yanqin.Wei@arm.com>
Suggested-by: Lance Yang <Lance.Yang@arm.com>
Signed-off-by: Malvika Gupta <malvika.gupta@arm.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-04-09 14:35:40 -07:00
Malvika Gupta
9e44424204 tests/atlocal.in: Add check for aarch64 Architecture
This patch adds a condition to check if the CPU architecture is aarch64. If the
condition evaluates to true, $IS_ARM64 variable is set to 'yes'. For all other
architectures, this variable is set to 'no'.

Reviewed-by: Yanqin Wei <Yanqin.wei@arm.com>
Signed-off-by: Malvika Gupta <malvika.gupta@arm.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-04-09 14:35:17 -07:00
William Tu
3c6d05a02e userspace: Add GTP-U support.
GTP, GPRS Tunneling Protocol, is a group of IP-based communications
protocols used to carry general packet radio service (GPRS) within
GSM, UMTS and LTE networks.  GTP protocol has two parts: Signalling
(GTP-Control, GTP-C) and User data (GTP-User, GTP-U). GTP-C is used
for setting up GTP-U protocol, which is an IP-in-UDP tunneling
protocol. Usually GTP is used in connecting between base station for
radio, Serving Gateway (S-GW), and PDN Gateway (P-GW).

This patch implements GTP-U protocol for userspace datapath,
supporting only required header fields and G-PDU message type.
See spec in:
https://tools.ietf.org/html/draft-hmm-dmm-5g-uplane-analysis-00

Tested-at: https://travis-ci.org/github/williamtu/ovs-travis/builds/666518784
Signed-off-by: Feng Yang <yangfengee04@gmail.com>
Co-authored-by: Feng Yang <yangfengee04@gmail.com>
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Co-authored-by: Yi Yang <yangyi01@inspur.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2020-03-25 20:26:51 -07:00
Terry Wilson
9435b0b8e6 Handle refTable values with setkey()
For columns like QoS.queues where we have a map containing refTable
values, assigning w/ __setattr__ e.g. qos.queues={1: $queue_row}
works, but using using qos.setkey('queues', 1, $queue_row) results
in an Exception. The opdat argument can essentially just be the
JSON representation of the map column instead of trying to build
it.

Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-20 08:47:53 -07:00
Ben Pfaff
323ae1e808 ofproto-dpif-xlate: Fix recirculation when in_port is OFPP_CONTROLLER.
Recirculation usually requires finding the pre-recirculation input port.
Packets sent by the controller, with in_port of OFPP_CONTROLLER or
OFPP_NONE, do not have a real input port data structure, only a port
number.  The code in xlate_lookup_ofproto_() mishandled this case,
failing to return the ofproto data structure.  This commit fixes the
problem and adds a test to guard against regression.

Reported-by: Numan Siddique <numans@ovn.org>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-March/368642.html
Tested-by: Numan Siddique <numans@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-20 08:27:17 -07:00
Han Zhou
93ee420935 raft: Fix the problem of stuck in candidate role forever.
Sometimes a server can stay in candidate role forever, even if the server
already see the new leader and handles append-requests normally. However,
because of the wrong role, it appears as disconnected from cluster and
so the clients are disconnected.

This problem happens when 2 servers become candidates in the same
term, and one of them is elected as leader in that term. It can be
reproduced by the test cases added in this patch.

The root cause is that the current implementation only changes role to
follower when a bigger term is observed (in raft_receive_term__()).
According to the RAFT paper, if another candidate becomes leader with
the same term, the candidate should change to follower.

This patch fixes it by changing the role to follower when leader
is being updated in raft_update_leader().

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-06 14:32:32 -08:00
Han Zhou
2833885f7a raft: Fix raft_is_connected() when there is no leader yet.
If there is never a leader known by the current server, it's status
should be "disconnected" to the cluster. Without this patch, when
a server in cluster is restarted, before it successfully connecting
back to the cluster it will appear as connected, which is wrong.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-06 13:47:15 -08:00
Han Zhou
bda1f6b605 ovsdb-server: Don't disconnect clients after raft install_snapshot.
When "schema" field is found in read_db(), there can be two cases:
1. There is a schema change in clustered DB and the "schema" is the new one.
2. There is a install_snapshot RPC happened, which caused log compaction on the
server and the next log is just the snapshot, which always constains "schema"
field, even though the schema hasn't been changed.

The current implementation doesn't handle case 2), and always assume the schema
is changed hence disconnect all clients of the server. It can cause stability
problem when there are big number of clients connected when this happens in
a large scale environment.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-06 13:46:10 -08:00
Ilya Maximets
82c9d9993d netdev-dpdk: Remove deprecated ring port type.
'dpdkr' ring ports was deprecated in 2.13 release and was not
actually used for a long time.  Remove support now.

More details in
commit b4c5f00c339b ("netdev-dpdk: Deprecate ring ports.")

Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-03-06 12:41:04 +01:00
Eli Britstein
1baa102abb dpif-netdev.at: VLAN id modification test for ARP partial HW offloading.
Follow up to commit eb540c0f5fc8 ("flow: Fix parsing l3_ofs with partial
offloading") that fixed the issue, add a unit-test for it.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-02-28 13:55:14 +01:00
Yanqin Wei
f7995da00b dpif-netdev.at: Fix partial offloading test cases failure.
Some partial offloading test cases are failing inconsistently. The root
cause is that dummy netdev is assigned with "linux_tc" offloading API.
dpif-netdev - partial hw offload - dummy
dpif-netdev - partial hw offload - dummy-pmd
dpif-netdev - partial hw offload with packet modifications - dummy
dpif-netdev - partial hw offload with packet modifications - dummy-pmd

This patch fixes this issue by changing 'options:ifindex=1' to some big
value. It is a workaround to make "linux_tc" init flow api failure. All
above cases can pass consistently after applying this patch.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Reviewed-by: Lijian Zhang <Lijian.Zhang@arm.com>
Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-02-28 13:55:13 +01:00
Yi-Hung Wei
a867c010ee conntrack: Fix conntrack new state
In connection tracking system, a connection is established if we
see packets from both directions.  However, in userspace datapath's
conntrack, if we send a connection setup packet in one direction
twice, it will make the connection to be in established state.

This patch fixes the aforementioned issue, and adds a system traffic
test for UDP and TCP traffic to avoid regression.

Fixes: a489b16854b59 ("conntrack: New userspace connection tracker.")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-01-29 10:41:43 -08:00
Ben Pfaff
f2c7be2389 Typo fix: vswtch -> vswitch.
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-01-17 10:29:42 -08:00
Ben Pfaff
a529e3cd1f ovsdb-server: Allow OVSDB clients to specify the UUID for inserted rows.
Acked-by: Han Zhou <hzhou@ovn.org>
Requested-by: Leonid Ryzhyk <lryzhyk@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-01-16 16:02:05 -08:00
Damijan Skvarc
c3428f4399 tests: introduced tests for adding/deleting logical routers in VTEP database
New tests were introduced based on lcov report, which reveals apparent code
is not covered by ovs test suites.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-01-07 12:13:38 -08:00
Anju Thomas
a13a020975 userspace: Improved packet drop statistics.
Currently OVS maintains explicit packet drop/error counters only on port
level.  Packets that are dropped as part of normal OpenFlow processing
are counted in flow stats of “drop” flows or as table misses in table
stats. These can only be interpreted by controllers that know the
semantics of the configured OpenFlow pipeline.  Without that knowledge,
it is impossible for an OVS user to obtain e.g. the total number of
packets dropped due to OpenFlow rules.

Furthermore, there are numerous other reasons for which packets can be
dropped by OVS slow path that are not related to the OpenFlow pipeline.
The generated datapath flow entries include a drop action to avoid
further expensive upcalls to the slow path, but subsequent packets
dropped by the datapath are not accounted anywhere.

Finally, the datapath itself drops packets in certain error situations.
Also, these drops are today not accounted for.This makes it difficult
for OVS users to monitor packet drop in an OVS instance and to alert a
management system in case of a unexpected increase of such drops.
Also OVS trouble-shooters face difficulties in analysing packet drops.

With this patch we implement following changes to address the issues
mentioned above.

1. Identify and account all the silent packet drop scenarios
2. Display these drops in ovs-appctl coverage/show

Co-authored-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Co-authored-by: Keshav Gupta <keshugupta1@gmail.com>
Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
Signed-off-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Signed-off-by: Keshav Gupta <keshugupta1@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-07 17:01:42 +01:00
Ilya Maximets
af683565ba tests: Allow running offloads testsuite under valgrind.
This helps a lot with finding memory leaks and uninitialized
data usage.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-01-07 10:12:46 +01:00
Han Zhou
7efc698014 ovsdb raft: Fix the problem when cluster restarted after DB compaction.
Cluster doesn't work after all nodes restarted after DB compaction,
unless there is any transaction after DB compaction before the restart.

Error log is like:
raft|ERR|internal error: deferred vote_request message completed but not ready
to send because message index 9 is past last synced index 0: s2 vote_request:
term=6 last_log_index=9 last_log_term=4

The root cause is that the log_synced member is not initialized when
reading the raft header. This patch fixes it and remove the XXX
from the test case.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-20 12:57:38 -08:00
Han Zhou
6d03405394 ovsdb-cluster.at: Wait until leader is elected before DB compact.
In test case "election timer change", before testing DB compact,
we had to insert some data. Otherwise, inserting data after DB
compact will cause busy loop as mentioned in the XXX comment.

The root cause of the busy loop is still not clear, but the test
itself didn't wait until the leader election finish before initiating
DB compact. This patch adds the wait to make sure the test continue
after leader is elected so that the following tests are based on
a clean state. While this wait is added, the busy loop problem is
gone even without inserting the data, so the additional data insertion
is also removed by this patch.

A separate patch will address the busy loop problem in the scenario:
1. Restart cluster
2. DB compact before the cluster is ready
3. Insert data

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-20 12:57:36 -08:00
Timothy Redaelli
0c4d144a98 Remove dependency on python3-six
Since Python 2 support was removed in 1ca0323e7c29 ("Require Python 3 and
remove support for Python 2."), python3-six is not needed anymore.

Moreover python3-six is not available on RHEL/CentOS7 without using EPEL
and so this patch is needed in order to release OVS 2.13 on RHEL7.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-20 12:23:06 -08:00
Ilya Maximets
138d30a9c0 dpctl: Fix referencing DPDK in a flow dump.
Few reasons to replace 'non-dpdk interfaces' with 'the main thread':

* Flows are dumped from threads (from per thread flow tables) not from
  the interfaces.

* 'non-dpdk' here sounds like all other flows (dumped from PMDs) has
  some relation with DPDK which is not true at least because we have
  afxdp and dummy ports that could be polled by PMD threads.

* 'main thread' is the same term as we're using in the output of
  ovs-appctl dpif-netdev/pmd-stats-show.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-12-19 08:44:27 +01:00
Ilya Maximets
2620f056a9 system-afxdp.at: Add test for infinite re-addition of failed ports.
New file created for AF_XDP specific tests.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-18 02:02:52 +01:00
Damijan Skvarc
bd6da4ab37 tests: introduced test for checking "ovs-vsctl emer-reset"
Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-17 13:50:18 -08:00
Ben Pfaff
c77f3a1625 tests: Log commands being executed for async message control test.
The "ofproto - asynchronous message control (OpenFlow 1.4)" test fails
from time to time when I'm running tests in parallel locally.  So far,
I've not been able to determine the root cause, but logging the
commands as they're executed should help.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:18 -08:00
Ben Pfaff
2235035119 tests: Improve logging for async message control test.
The "ofproto - asynchronous message control (OpenFlow 1.4)" test fails
from time to time when I'm running tests in parallel locally.  So far,
I've not been able to determine the root cause, but logging the
difference between expected and actual output should help.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:14 -08:00
Ben Pfaff
e6eef11145 tests: Better document OVS_WAIT_UNTIL, OVS_WAIT_WHILE macros.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:12 -08:00
Ben Pfaff
95a5454c51 ofp-print: Abbreviate lists of fields in table features output.
This makes the output both shorter and easier to read.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:05 -08:00
Damijan Skvarc
822c8923f0 ovs-vsctl: unit test for checking fail-mode related
unit test is introduced which checks fail-mode related commands.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 11:00:57 -08:00