2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-31 06:15:47 +00:00
Commit Graph

2707 Commits

Author SHA1 Message Date
Martin Varghese
ebe0e518b0 tunnel: Bareudp Tunnel Support.
There are various L3 encapsulation standards using UDP being discussed to
leverage the UDP based load balancing capability of different networks.
MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.

The Bareudp tunnel provides a generic L3 encapsulation support for
tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
tunnel.

An example to create bareudp device to tunnel MPLS traffic is
given

$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
             type=bareudp options:remote_ip=2.1.1.3
             options:local_ip=2.1.1.2 \
             options:payload_type=0x8847 options:dst_port=6635

The bareudp device supports special handling for MPLS & IP as
they can have multiple ethertypes. MPLS procotcol can have ethertypes
ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).

The bareudp device to tunnel L3 traffic with multiple ethertypes
(MPLS & IP) can be created by passing the L3 protocol name as string in
the field payload_type. An example to create bareudp device to tunnel
MPLS unicast & multicast traffic is given below.::

$ ovs-vsctl add-port  br_mpls udp_port -- set interface
            udp_port \
            type=bareudp options:remote_ip=2.1.1.3
            options:local_ip=2.1.1.2 \
            options:payload_type=mpls options:dst_port=6635

Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-By: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-22 12:51:22 +01:00
Ilya Maximets
55f2b065ac odp-util: Fix netlink message overflow with userdata.
Too big userdata could overflow netlink message leading to out-of-bound
memory accesses or assertion while formatting nested actions.

Fix that by checking the size and returning correct error code.

Credit to OSS-Fuzz.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27640
Fixes: e995e3df57 ("Allow OVS_USERSPACE_ATTR_USERDATA to be variable length.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
2020-12-22 00:25:04 +01:00
Jianbo Liu
c5b4b0ce95 dpif-netlink: Fix issues of the offloaded flows counter.
The n_offloaded_flows counter is saved in dpif, and this is the first
one when ofproto is created. When flow operation is done by ovs-appctl
commands, such as, dpctl/add-flow, a new dpif is opened, and the
n_offloaded_flows in it can't be used. So, instead of using counter,
the number of offloaded flows is queried from each netdev, then sum
them up. To achieve this, a new API is added in netdev_flow_api to get
how many flows assigned to a netdev.

In order to get better performance, this number is calculated directly
from tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc
takes much time if there are many flows offloaded.

Fixes: af06184705 ("dpif-netlink: Count the number of offloaded rules")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 20:25:59 +01:00
XiaoXiong Ding
ced0d8fb92 ofproto-dpif-xlate: Stop forwarding MLD reports to group ports.
According with rfc4541 section 2.1.1, a snooping switch
should forward membership reports only to ports with
routers attached.The current code violates the RFC
forwarding membership reports to group ports as well.
The same issue doesn't exist with IPv4.

Fixes: 06994f879c ("mcast-snooping: Add Multicast Listener Discovery support")
Signed-off-by: XiaoXiong Ding <dingxiaoxiong@huawei.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 18:45:36 +01:00
Jianbo Liu
af06184705 dpif-netlink: Count the number of offloaded rules
Add a counter for the offloaded rules, and display it in the command
of "ovs-appctl upcall/show".

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-12-07 07:30:15 +01:00
Ben Pfaff
91fc374a9c Eliminate use of term "slave" in bond, LACP, and bundle contexts.
The new term is "member".

Most of these changes should not change user-visible behavior.  One
place where they do is in "ovs-ofctl dump-flows", which will now output
"members:..." inside "bundle" actions instead of "slaves:...".  I don't
expect this to cause real problems in most systems.  The old syntax
is still supported on input for backward compatibility.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
2020-10-21 11:28:24 -07:00
Ben Pfaff
807152a4dd Use primary/secondary, not master/slave, as names for OpenFlow roles.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
2020-10-16 19:10:02 -07:00
Flavio Leitner
b0672f4ba2 ofproto-dpif-upcall: Log the emergency flow flush.
When the number of flows in the datapath reaches twice the
maximum, revalidators will delete all flows as an emergency
action to recover. In that case, log a message with values
and increase a coverage counter.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-10-08 16:44:44 +02:00
Flavio Leitner
cade1c4642 ofproto-dpif-upcall: Log the value of flow limit.
The datapath flow limit is calculated by revalidators so
log the value as well.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-10-08 16:43:21 +02:00
Greg Rose
90c1cb3f0f python: Fixup python shebangs to python3.
Builds on RHEL 8.2 systems are failing due to this issue.

See [1] as to why this is necessary.

I used the following command to identify files that need this fix:
find . -type f -executable | /usr/lib/rpm/redhat/brp-mangle-shebangs

I also updated the copyright notices as needed.

1. https://fedoraproject.org/wiki/Changes/Make_ambiguous_python_shebangs_error

Fixes: 1ca0323e7c ("Require Python 3 and remove support for Python 2.")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-08-26 13:05:01 +02:00
Aaron Conole
9688b4c771 connmgr: Support changing openflow versions without restarting.
When commit a0baa7dfa4 ("connmgr: Make treatment of active and passive
connections more uniform") was applied, it did not take into account
that a reconfiguration of the allowed_versions setting would require a
reload of the ofservice object (only accomplished via a restart of OvS).

For now, during the reconfigure cycle, we delete the ofservice object and
then recreate it immediately.  A new test is added to ensure we do not
break this behavior again.

Fixes: a0baa7dfa4 ("connmgr: Make treatment of active and passive connections more uniform")
Suggested-by: Ben Pfaff <blp@ovn.org>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1782834
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Numan Siddique <numans@ovn.org>
Tested-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-08-17 21:38:47 +02:00
Jeff Squyres
b4e50218a0 bond: Add 'primary' interface concept for active-backup mode.
In AB bonding, if the current active slave becomes disabled, a
replacement slave is arbitrarily picked from the remaining set of
enabled slaves.  This commit adds the concept of a "primary" slave: an
interface that will always be (or become) the current active slave if
it is enabled.

The rationale for this functionality is to allow the designation of a
preferred interface for a given bond.  For example:

1. Bond is created with interfaces p1 (primary) and p2, both enabled.
2. p1 becomes the current active slave (because it was designated as
   the primary).
3. Later, p1 fails/becomes disabled.
4. p2 is chosen to become the current active slave.
5. Later, p1 becomes re-enabled.
6. p1 is chosen to become the current active slave (because it was
   designated as the primary)

Note that p1 becomes the active slave once it becomes re-enabled, even
if nothing has happened to p2.

This "primary" concept exists in Linux kernel network interface
bonding, but did not previously exist in OVS bonding.

Only one primary slave interface is supported per bond, and is only
supported for active/backup bonding.

The primary slave interface is designated via
"other_config:bond-primary" when creating a bond.

Also, while adding tests for the "primary" concept, make a few small
improvements to the non-primary AB bonding test.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-17 01:42:18 +02:00
Yunjian Wang
240207eef8 ofproto: Remove duplicated includes
Remove duplicated includes.

Acked-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-07-14 06:46:01 -07:00
Gowrishankar Muthukrishnan
e3ca911fc1 ofproto: report coverage on hitting datapath flow limit
Whenever the number of flows in the datapath crosses above
the flow limit set/autoconfigured, it is helpful to report
this event through coverage counter for an operator/devops
engineer to know and take proactive corrections in the
switch configuration.

Today, these events are reported in ovs vswitch log when
a new flow can not be inserted in upcall processing in which
case ovs writes a warning, otherwise an auto correction
made by ovs to flush old flows without any intimation at all.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-07-13 16:15:13 -07:00
Ilya Maximets
8842fdf1b3 netdev-offload: Use dpif type instead of class.
There is no real difference between the 'class' and 'type' in the
context of common lookup operations inside netdev-offload module
because it only checks the value of pointers without using the
value itself.  However, 'type' has some meaning and can be used by
offload provides on the initialization phase to check if this type
of Flow API in pair with the netdev type could be used in particular
datapath type.  For example, this is needed to check if Linux flow
API could be used for current tunneling vport because it could be
used only if tunneling vport belongs to system datapath, i.e. has
backing linux interface.

This is needed to unblock tunneling offloads in userspace datapath
with DPDK flow API.

Acked-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Roni Bar Yanai <roniba@mellanox.com>
Acked-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-08 19:07:21 +02:00
Adrian Moreno
82a106ebfb ofproto: Delete buckets when lb_output is false.
When lb-output-action is toggled back to "false" buckets are not being
deleted. Delete them as they will no longer be used.

Add unit test to verify buckets are correctly deleted.

Cc: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-07 01:40:32 +02:00
Vishal Deep Ajmera
9df65060cf userspace: Avoid dp_hash recirculation for balance-tcp bond mode.
Problem:

In OVS, flows with output over a bond interface of type “balance-tcp”
gets translated by the ofproto layer into "HASH" and "RECIRC" datapath
actions. After recirculation, the packet is forwarded to the bond
member port based on 8-bits of the datapath hash value computed through
dp_hash. This causes performance degradation in the following ways:

1. The recirculation of the packet implies another lookup of the
packet’s flow key in the exact match cache (EMC) and potentially
Megaflow classifier (DPCLS). This is the biggest cost factor.

2. The recirculated packets have a new “RSS” hash and compete with the
original packets for the scarce number of EMC slots. This implies more
EMC misses and potentially EMC thrashing causing costly DPCLS lookups.

3. The 256 extra megaflow entries per bond for dp_hash bond selection
put additional load on the revalidation threads.

Owing to this performance degradation, deployments stick to “balance-slb”
bond mode even though it does not do active-active load balancing for
VXLAN- and GRE-tunnelled traffic because all tunnel packet have the
same source MAC address.

Proposed optimization:

This proposal introduces a new load-balancing output action instead of
recirculation.

Maintain one table per-bond (could just be an array of uint16's) and
program it the same way internal flows are created today for each
possible hash value (256 entries) from ofproto layer. Use this table to
load-balance flows as part of output action processing.

Currently xlate_normal() -> output_normal() ->
bond_update_post_recirc_rules() -> bond_may_recirc() and
compose_output_action__() generate 'dp_hash(hash_l4(0))' and
'recirc(<RecircID>)' actions. In this case the RecircID identifies the
bond. For the recirculated packets the ofproto layer installs megaflow
entries that match on RecircID and masked dp_hash and send them to the
corresponding output port.

Instead, we will now generate action as
    'lb_output(<bond id>)'

This combines hash computation (only if needed, else re-use RSS hash)
and inline load-balancing over the bond. This action is used *only* for
balance-tcp bonds in userspace datapath (the OVS kernel datapath
remains unchanged).

Example:
Current scheme:

With 8 UDP flows (with random UDP src port):

  flow-dump from pmd on cpu core: 2
  recirc_id(0),in_port(7),<...> actions:hash(hash_l4(0)),recirc(0x1)

  recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0xb236c260/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0x7d89eb18/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0xa78d75df/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0xb58d846f/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0x24534406/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0x3cf32550/0xff),<...> actions:1

New scheme:
We can do with a single flow entry (for any number of new flows):

  in_port(7),<...> actions:lb_output(1)

A new CLI has been added to dump datapath bond cache as given below.

 # ovs-appctl dpif-netdev/bond-show [dp]

   Bond cache:
     bond-id 1 :
       bucket 0 - slave 2
       bucket 1 - slave 1
       bucket 2 - slave 2
       bucket 3 - slave 1

Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-06-22 13:11:51 +02:00
Dumitru Ceara
d072d2de01 ofproto-dpif-trace: Improve NAT tracing.
When ofproto/trace detects a recirc action it resumes execution at the
specified next table. However, if the ct action performs SNAT/DNAT,
e.g., ct(commit,nat(src=1.1.1.1:4000),table=42), the src/dst IPs and
ports in the oftrace_recirc_node->flow field are not updated. This leads
to misleading outputs from ofproto/trace as real packets would actually
first get NATed and might match different flows when recirculated.

Assume the first IP/port from the NAT src/dst action will be used by
conntrack for the translation and update the oftrace_recirc_node->flow
accordingly. This is not entirely correct as conntrack might choose a
different IP/port but the result is more realistic than before.

This fix covers new connections. However, for reply traffic that executes
actions of the form ct(nat, table=42) we still don't update the flow as
we don't have any information about conntrack state when tracing.

Also move the oftrace_recirc_node processing out of ofproto_trace()
and to its own function, ofproto_trace_recirc_node() for better
readability/

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-06-16 15:07:31 -07:00
Vlad Buslov
191536574e netdev-offload: Implement terse dump support
In order to improve revalidator performance by minimizing unnecessary
copying of data, extend netdev-offloads to support terse dump mode. Extend
netdev_flow_api->flow_dump_create() with 'terse' bool argument. Implement
support for terse dump in functions that convert netlink to flower and
flower to match. Set flow stats "used" value based on difference in number
of flow packets because lastuse timestamp is not included in TC terse dump.

Kernel API support is implemented in following patch.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2020-06-05 10:14:27 +02:00
Ilya Maximets
c36bba351b ofproto: Fix statistics of removed flow.
'fr' is a new variable on the stack.  '+=' here adds the real statistics
to a random stack memory.

Fixes: 164413156c ("Add offload packets statistics")
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-05-15 20:14:00 +02:00
William Tu
2078901a4c userspace: Add conntrack timeout policy support.
Commit 1f16131837 ("ct-dpif, dpif-netlink: Add conntrack timeout
policy support") adds conntrack timeout policy for kernel datapath.
This patch enables support for the userspace datapath.  I tested
using the 'make check-system-userspace' which checks the timeout
policies for ICMP and UDP cases.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
2020-05-01 08:22:45 -07:00
Yi-Hung Wei
81f71381ff ofp-actions: Add delete field action
This patch adds a new OpenFlow action, delete field, to delete a
field in packets.  Currently, only the tun_metadata fields are
supported.

One use case to add this action is to support multiple versions
of geneve tunnel metadatas to be exchanged among different versions
of networks.  For example, we may introduce tun_metadata2 to
replace old tun_metadata1, but still want to provide backward
compatibility to the older release.  In this case, in the new
OpenFlow pipeline, we would like to support the case to receive a
packet with tun_metadata1, do some processing.  And if the packet
is going to a switch in the newer release, we would like to delete
the value in tun_metadata1 and set a value into tun_metadata2.

Currently, ovs does not provide an action to remove a value in
tun_metadata if the value is present.  This patch fulfills the gap
by adding the delete_field action.  For example, the OpenFlow
syntax to delete tun_metadata1 is:

    actions=delete_field:tun_metadata1

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
2020-04-29 09:00:54 -07:00
Yifeng Sun
02f641e2c5 tun_metadata: Fix coredump caused by use-after-free bug
Tun_metadata can be referened by flow and frozen_state at the same
time. When ovs-vswitchd handles TLV table mod message, the involved
tun_metadata gets freed. The call trace to free tun_metadata is
shown as below:

ofproto_run
- handle_openflow
  - handle_single_part_openflow
    - handle_tlv_table_mod
      - tun_metadata_table_mod
        - tun_metadata_postpone_free

Unfortunately, this tun_metadata can be still used by some frozen_state,
and later on when frozen_state tries to access its tun_metadata table,
ovs-vswitchd crashes. The call trace to access tun_metadata from
frozen_state is shown as below:

udpif_upcall_handler
- recv_upcalls
  - process_upcall
    - frozen_metadata_to_flow

It is unsafe for frozen_state to reference tun_table because tun_table
is protected by RCU while the lifecycle of frozen_state can span several
RCU quiesce states. Current code violates OVS's RCU protection mechanism.

This patch fixes it by simply stopping frozen_state from referencing
tun_table. If frozen_state needs tun_table, the latest valid tun_table
can be found through ofproto_get_tun_tab() efficiently.

A previous commit seems fixing the samiliar issue:
254878c188 (ofproto-dpif-xlate: Fix segmentation fault caused by tun_table)

VMware-BZ: #2526222
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-04-10 07:08:27 -07:00
William Tu
3c6d05a02e userspace: Add GTP-U support.
GTP, GPRS Tunneling Protocol, is a group of IP-based communications
protocols used to carry general packet radio service (GPRS) within
GSM, UMTS and LTE networks.  GTP protocol has two parts: Signalling
(GTP-Control, GTP-C) and User data (GTP-User, GTP-U). GTP-C is used
for setting up GTP-U protocol, which is an IP-in-UDP tunneling
protocol. Usually GTP is used in connecting between base station for
radio, Serving Gateway (S-GW), and PDN Gateway (P-GW).

This patch implements GTP-U protocol for userspace datapath,
supporting only required header fields and G-PDU message type.
See spec in:
https://tools.ietf.org/html/draft-hmm-dmm-5g-uplane-analysis-00

Tested-at: https://travis-ci.org/github/williamtu/ovs-travis/builds/666518784
Signed-off-by: Feng Yang <yangfengee04@gmail.com>
Co-authored-by: Feng Yang <yangfengee04@gmail.com>
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Co-authored-by: Yi Yang <yangyi01@inspur.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2020-03-25 20:26:51 -07:00
Ben Pfaff
e86ec1e7e0 ofproto: Fix typo in manpage fragment.
There was a missing ] and an extra space.

Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-20 08:50:58 -07:00
Ben Pfaff
323ae1e808 ofproto-dpif-xlate: Fix recirculation when in_port is OFPP_CONTROLLER.
Recirculation usually requires finding the pre-recirculation input port.
Packets sent by the controller, with in_port of OFPP_CONTROLLER or
OFPP_NONE, do not have a real input port data structure, only a port
number.  The code in xlate_lookup_ofproto_() mishandled this case,
failing to return the ofproto data structure.  This commit fixes the
problem and adds a test to guard against regression.

Reported-by: Numan Siddique <numans@ovn.org>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-March/368642.html
Tested-by: Numan Siddique <numans@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-20 08:27:17 -07:00
Vishal Deep Ajmera
44810e6d41 ofproto: Add support to watch controller port liveness in fast-failover group
Currently fast-failover group does not support checking liveness of controller
port (OFPP_CONTROLLER). However this feature can be useful for selecting
alternate pipeline when controller connection itself is down for e.g.
by using local DHCP server to reply for any DHCP request originating from VMs.

This patch adds the support for watching controller port liveness in fast-
failover group. Controller port is considered live when atleast one
of-connection is alive.

Example usage:

ovs-ofctl add-group br-int 'group_id=1234,type=ff,
          bucket=watch_port:CONTROLLER,actions:<A>,
          bucket=watch_port:1,actions:<B>

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-03-06 13:25:23 -08:00
Ben Pfaff
7cc77b301f ofproto-dpif: Only delete tunnel backer ports along with the dpif.
The admin can choose whether or not to delete flows from datapaths when
they stop ovs-vswitchd.  The goal of not deleting flows it to allow
existing traffic to continue being forwarded until ovs-vswitchd is
restarted.  Until now, regardless of this choice, ovs-vswitchd has
always deleted tunnel ports from the datapath.  When flows are not
deleted, this nevertheless prevents tunnel traffic from being forwarded.

With this patch, ovs-vswitchd no longer deletes tunnel ports in the
case where it does not delete flows, allowing tunnel traffic to continue
being forwarded.

Reported-by: Antonin Bas <abas@vmware.com>
Tested-by: Antonin Bas <abas@vmware.com>
Tested-by: txfh2007 <txfh2007@aliyun.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-02-29 11:01:57 -08:00
Kirill A. Kornilov
8e371aa497 vswitchd: Add serial number configuration.
Signed-off-by: Kirill A. Kornilov <kornilov@zelax.ru>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-01-31 09:17:29 -08:00
Ilya Maximets
342b8904ab dpif: Fix dp_extra_info leak by reworking the allocation scheme.
dpctl module leaks the 'dp_extra_info' in case the dumped flow doesn't
fit the dump filter while executing dpctl/dump-flows and also while
executing dpctl/get-flow.

This is already a 3rd attempt to fix all the leaks and incorrect usage
of this string that definitely indicates poor initial design of the
feature.

Flow dump/get documentation clearly states that the caller does not own
the data provided in dpif_flow.  Datapath still owns all the data and
promises to not free/modify it until the next quiescent period, however
we're requesting the caller to free 'dp_extra_info' and this obviously
breaks the rules.

This patch fixes the issue by by storing 'dp_extra_info' within
'struct dp_netdev_flow' making datapath to own it.  'dp_netdev_flow'
is RCU-protected, so it will be valid until the next quiescent period.

Fixes: 0e8f5c6a38 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Tested-by: Emma Finn <emma.finn@intel.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-27 21:20:01 +01:00
Ben Pfaff
79eadafeb1 ofproto: Do not delete datapath flows on exit by default.
Commit e96a5c24e8 ("upcall: Remove datapath flows when setting
n-threads.") caused OVS to delete datapath flows when it exits through
any graceful means.  This is not necessarily desirable, especially when
OVS is being stopped as part of an upgrade.  This commit changes OVS so
that it only removes datapath flows when requested, via "ovs-appctl
exit --cleanup".

Acked-by: Numan Siddique <numans@ovn.org>
Tested-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-01-24 13:00:52 -08:00
Ben Pfaff
586cd3101e ofproto-dpif-upcall: Get rid of udpif_synchronize().
RCU provides the semantics we want from udpif_synchronize() and it
should be much more lightweight than killing and restarting all the
upcall threads.  It looks like udpif_synchronize() was written before
the OVS tree had RCU support, which is probably why we didn't use it
here from the beginning.  So we can just change udpif_synchronize()
to a single ovsrcu_synchronize() call.

However, udpif_synchronize() only has a single caller, which calls
ovsrcu_synchronize() anyway just beforehand, via xlate_txn_commit().
So we can get rid of udpif_synchronize() entirely, which this patch
does.

As a side effect, this eliminates one reason why terminating OVS cleanly
clears the datapath flow table.  An upcoming patch will eliminate
other reasons.

Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2020-01-24 13:00:52 -08:00
Damijan Skvarc
dbbd0cf644 dpif: Fix memory leak while dumping dpif flows.
Leak was detected by running test: "ofproto-dpif - balance-tcp bonding"

Fixes: 0e8f5c6a38 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-22 21:05:58 +01:00
Ilya Maximets
d7b55c5c94 dpif: Fix leak and usage of uninitialized dp_extra_info.
'dpif_probe_feature'/'revalidate' doesn't free the 'dp_extra_info'
string.  Also, all the implementations of dpif_flow_get() should
initialize the value to avoid printing/freeing of random memory.

 30 bytes in 1 blocks are definitely lost in loss record 323 of 889
    at 0x483AD19: realloc (vg_replace_malloc.c:836)
    by 0xDDAD89: xrealloc (util.c:149)
    by 0xCE1609: ds_reserve (dynamic-string.c:63)
    by 0xCE1A90: ds_put_format_valist (dynamic-string.c:161)
    by 0xCE19B9: ds_put_format (dynamic-string.c:142)
    by 0xCCCEA9: dp_netdev_flow_to_dpif_flow (dpif-netdev.c:3170)
    by 0xCCD2DD: dpif_netdev_flow_get (dpif-netdev.c:3278)
    by 0xCCEA0A: dpif_netdev_operate (dpif-netdev.c:3868)
    by 0xCDF81B: dpif_operate (dpif.c:1361)
    by 0xCDEE93: dpif_flow_get (dpif.c:1002)
    by 0xCDECF9: dpif_probe_feature (dpif.c:962)
    by 0xC635D2: check_recirc (ofproto-dpif.c:896)
    by 0xC65C02: check_support (ofproto-dpif.c:1567)
    by 0xC63274: open_dpif_backer (ofproto-dpif.c:818)
    by 0xC65E3E: construct (ofproto-dpif.c:1605)
    by 0xC4D436: ofproto_create (ofproto.c:549)
    by 0xC3931A: bridge_reconfigure (bridge.c:877)
    by 0xC3FEAC: bridge_run (bridge.c:3324)
    by 0xC4551D: main (ovs-vswitchd.c:127)

CC: Emma Finn <emma.finn@intel.com>
Fixes: 0e8f5c6a38 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
2020-01-20 17:51:16 +01:00
Ilya Maximets
7a5e0ee7cc dpif: Turn dpif_flow_hash function into generic odp_flow_key_hash.
Current implementation of dpif_flow_hash() doesn't depend on datapath
interface and only complicates the callers by forcing them to figure
out what is their current 'dpif'.  If we'll need different hashing
for different 'dpif's we'll implement an API for dpif-providers
and each dpif implementation will be able to use their local function
directly without calling it via dpif API.

This change will allow us to not store 'dpif' pointer in the userspace
datapath implementation which is broken and will be removed in next
commits.

This patch moves dpif_flow_hash() to odp-util module and replaces
unused odp_flow_key_hash() by it, along with removing of unused 'dpif'
argument.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2020-01-08 16:02:37 +01:00
Anju Thomas
a13a020975 userspace: Improved packet drop statistics.
Currently OVS maintains explicit packet drop/error counters only on port
level.  Packets that are dropped as part of normal OpenFlow processing
are counted in flow stats of “drop” flows or as table misses in table
stats. These can only be interpreted by controllers that know the
semantics of the configured OpenFlow pipeline.  Without that knowledge,
it is impossible for an OVS user to obtain e.g. the total number of
packets dropped due to OpenFlow rules.

Furthermore, there are numerous other reasons for which packets can be
dropped by OVS slow path that are not related to the OpenFlow pipeline.
The generated datapath flow entries include a drop action to avoid
further expensive upcalls to the slow path, but subsequent packets
dropped by the datapath are not accounted anywhere.

Finally, the datapath itself drops packets in certain error situations.
Also, these drops are today not accounted for.This makes it difficult
for OVS users to monitor packet drop in an OVS instance and to alert a
management system in case of a unexpected increase of such drops.
Also OVS trouble-shooters face difficulties in analysing packet drops.

With this patch we implement following changes to address the issues
mentioned above.

1. Identify and account all the silent packet drop scenarios
2. Display these drops in ovs-appctl coverage/show

Co-authored-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Co-authored-by: Keshav Gupta <keshugupta1@gmail.com>
Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
Signed-off-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Signed-off-by: Keshav Gupta <keshugupta1@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-07 17:01:42 +01:00
Ilya Maximets
924d94a695 ofproto-dpif-upcall: Fix using uninitialized upcall hash.
upcalls are allocated on stack and 'hash' field must be initialized
regardless of attribute existence because it will be used later.

 Conditional jump or move depends on uninitialised value(s)
    at 0xFA74A7: dpif_netlink_encode_execute (dpif-netlink.c:1828)
    by 0xFA6DE8: dpif_netlink_operate__ (dpif-netlink.c:1906)
    by 0xFA612F: dpif_netlink_operate_chunks (dpif-netlink.c:2219)
    by 0xFA0E36: dpif_netlink_operate (dpif-netlink.c:2275)
    by 0xE5AFAC: dpif_operate (dpif.c:1376)
    by 0xDF3922: handle_upcalls (ofproto-dpif-upcall.c:1615)
    by 0xDF269B: recv_upcalls (ofproto-dpif-upcall.c:857)
    by 0xDF1C49: udpif_upcall_handler (ofproto-dpif-upcall.c:759)
    by 0xF3A3FE: ovsthread_wrapper (ovs-thread.c:383)
    by 0x565F6DA: start_thread (pthread_create.c:463)
    by 0x615988E: clone (clone.S:95)
  Uninitialised value was created by a stack allocation
    at 0xDF2258: recv_upcalls (ofproto-dpif-upcall.c:773)

Fixes: 0442bfb11d ("ofproto-dpif-upcall: Echo HASH attribute back to datapath.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
2020-01-07 15:35:08 +01:00
Ilya Maximets
c4d8a4e039 ofproto-dpif: Fix using uninitialized execute hash.
Most of callers doesn't initialize dpif_execute.hash leaving random
value from the stack.  And this random value used later while encoding
netlink message and might produce unwanted kernel behavior.

Fix that by fully initializing dpif_execute structure.  Using
designated initializers to avoid such issues in the future.

Fixes: 0442bfb11d ("ofproto-dpif-upcall: Echo HASH attribute back to datapath.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2020-01-07 15:34:34 +01:00
zhaozhanxu
164413156c Add offload packets statistics
Add argument '--offload-stats' for command ovs-appctl bridge/dump-flows
to display the offloaded packets statistics.

The commands display as below:

orignal command:

ovs-appctl bridge/dump-flows br0

duration=574s, n_packets=1152, n_bytes=110768, priority=0,actions=NORMAL
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x3,actions=drop

new command with argument '--offload-stats'

Notice: 'n_offload_packets' are a subset of n_packets and 'n_offload_bytes' are
a subset of n_bytes.

ovs-appctl bridge/dump-flows --offload-stats br0

duration=582s, n_packets=1152, n_bytes=110768, n_offload_packets=1107, n_offload_bytes=107992, priority=0,actions=NORMAL
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x3,actions=drop

Signed-off-by: zhaozhanxu <zhaozhanxu@163.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-06 10:57:16 +01:00
Dmytro Linkin
84dd881bb5 ofproto-dpif-xlate: Prevent duplicating of traffic to a mirror port
Currently ofproto design disallow duplicating output packet on forwarding
and mirroring to/from same ovs port. Next scenario reveal lack of design:
1. Send ping between regular ovs ports (VFs, for ex.), stop it.
2. While rule still exist, make mirror for one of the ports.
Prevent duplicating of traffic to a mirror port.

Fixes: 86e2dcddce ("dpif-xlate: Snoop multicast packets and send them properly")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 10:31:54 -08:00
William Tu
551c72854e ofproto-dpif: Refactor the get capability code.
Make the code simpler by removing the use of
xasprintf and free, and use smap_add_format.

Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-12-02 16:19:25 -08:00
Ben Pfaff
7b950521f5 ofproto-dpif-xlate: Restore table ID on error in xlate_table_action().
Found by inspection.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 12:41:07 -08:00
Linhaifeng
c9ee7c0e83 ofproto: fix stack-buffer-overflow
Should use flow->actions not &flow->actions.

here is ASAN report:
=================================================================
==57189==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffff428fa0e8 at pc 0xffff7f61a520 bp 0xffff428f9420 sp 0xffff428f9498 READ of size 196 at 0xffff428fa0e8 thread T150 (revalidator22)
    #0 0xffff7f61a51f in __interceptor_memcpy (/lib64/libasan.so.4+0xa251f)
    #1 0xaaaad26a3b2b in ofpbuf_put lib/ofpbuf.c:426
    #2 0xaaaad26a30cb in ofpbuf_clone_data_with_headroom lib/ofpbuf.c:248
    #3 0xaaaad26a2e77 in ofpbuf_clone_with_headroom lib/ofpbuf.c:218
    #4 0xaaaad26a2dc3 in ofpbuf_clone lib/ofpbuf.c:208
    #5 0xaaaad23e3993 in ukey_set_actions ofproto/ofproto-dpif-upcall.c:1640
    #6 0xaaaad23e3f03 in ukey_create__ ofproto/ofproto-dpif-upcall.c:1696
    #7 0xaaaad23e553f in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1806
    #8 0xaaaad23e65fb in ukey_acquire ofproto/ofproto-dpif-upcall.c:1984
    #9 0xaaaad23eb583 in revalidate ofproto/ofproto-dpif-upcall.c:2625
    #10 0xaaaad23dee5f in udpif_revalidator ofproto/ofproto-dpif-upcall.c:1076
    #11 0xaaaad26b84ef in ovsthread_wrapper lib/ovs-thread.c:708
    #12 0xffff7e74a8bb in start_thread (/lib64/libpthread.so.0+0x78bb)
    #13 0xffff7e0665cb in thread_start (/lib64/libc.so.6+0xd55cb)

Address 0xffff428fa0e8 is located in stack of thread T150 (revalidator22) at offset 328 in frame
    #0 0xaaaad23e4cab in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1762

  This frame has 4 object(s):
    [32, 96) 'actions'
    [128, 192) 'buf'
    [224, 328) 'full_flow'
    [384, 2432) 'stub' <== Memory access at offset 328 partially underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported) Thread T150 (revalidator22) created by T0 here:
    #0 0xffff7f5b0f7f in __interceptor_pthread_create (/lib64/libasan.so.4+0x38f7f)
    #1 0xaaaad26b891f in ovs_thread_create lib/ovs-thread.c:792
    #2 0xaaaad23dc62f in udpif_start_threads ofproto/ofproto-dpif-upcall.c:639
    #3 0xaaaad23daf87 in ofproto_set_flow_table ofproto/ofproto-dpif-upcall.c:446
    #4 0xaaaad230ff7f in dpdk_evs_cfg_set vswitchd/bridge.c:1134
    #5 0xaaaad2310097 in bridge_reconfigure vswitchd/bridge.c:1148
    #6 0xaaaad23279d7 in bridge_run vswitchd/bridge.c:3944
    #7 0xaaaad23365a3 in main vswitchd/ovs-vswitchd.c:240
    #8 0xffff7dfb1adf in __libc_start_main (/lib64/libc.so.6+0x20adf)
    #9 0xaaaad230a3d3  (/usr/sbin/ovs-vswitchd-2.7.0-1.1.RC5.001.asan+0x26f3d3)

SUMMARY: AddressSanitizer: stack-buffer-overflow (/lib64/libasan.so.4+0xa251f) in __interceptor_memcpy Shadow bytes around the buggy address:
  0x200fe851f3c0: 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2 00 00 00 00
  0x200fe851f3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3f0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x200fe851f400: f2 f2 f2 f2 f8 f8 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2
=>0x200fe851f410: 00 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2
  0x200fe851f420: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==57189==ABORTING

Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Linhaifeng <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 11:19:26 -08:00
Ilya Maximets
89ee730a60 ofproto-provider: Move datapath capabilities callback to correct section.
'get_datapath_cap' callback was mistakenly placed in
'Connection tracking' section of the 'struct dpif_class'
while belongs to the 'Datapath information'.

CC: William Tu <u9012063@gmail.com>
Fixes: 27501802d0 ("ofproto-dpif: Expose datapath capability to ovsdb.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-11-28 16:51:44 +01:00
Ilya Maximets
47f8743e84 ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.
Handling of OpenFlow PACKET_OUT implies pushing the packet through
the datapath and packet processing inside the datapath could trigger
an upcall.  In case of system datapath, 'dpif_execute()' sends packet
to the kernel module and returns.  If any, upcall  will be triggered
inside the kernel and handled by a separate thread in userspace.
But in case of userspace datapath full processing of the packet happens
inside the 'dpif_execute()' in the same thread that handled PACKET_OUT.
This causes an issue if upcall will lead to modification of flow rules.
For example, it could happen while processing of 'learn' actions.
Since whole handling of PACKET_OUT is protected by 'ofproto_mutex',
OVS will assert on attempt to take it recursively while processing
'learn' actions:

   0 __GI_raise (sig=sig@entry=6)
   1 __GI_abort ()
   2 ovs_abort_valist ()
   3 ovs_abort ()
   4 ovs_mutex_lock_at (where=where@entry=0xad4199 "ofproto/ofproto.c:5391")
                <Trying to acquire ofproto_mutex again>
   5 ofproto_flow_mod_learn ()       at ofproto/ofproto.c:5391
                <Trying to modify flows according to 'learn' action>
   6 xlate_learn_action ()           at ofproto/ofproto-dpif-xlate.c:5378
                <'learn' action found>
   7 do_xlate_actions ()             at ofproto/ofproto-dpif-xlate.c:6893
   8 xlate_recursively ()            at ofproto/ofproto-dpif-xlate.c:4233
   9 xlate_table_action ()           at ofproto/ofproto-dpif-xlate.c:4361
  10 in xlate_ofpact_resubmit ()     at ofproto/ofproto-dpif-xlate.c:4672
  11 do_xlate_actions ()             at ofproto/ofproto-dpif-xlate.c:6773
  12 xlate_actions ()                at ofproto/ofproto-dpif-xlate.c:7570
                 <Translating actions>
  13 upcall_xlate ()                 at ofproto/ofproto-dpif-upcall.c:1197
  14 process_upcall ()               at ofproto/ofproto-dpif-upcall.c:1413
  15 upcall_cb ()                    at ofproto/ofproto-dpif-upcall.c:1315
  16 dp_netdev_upcall (DPIF_UC_MISS) at lib/dpif-netdev.c:6236
                 <Flow cache miss. Making upcall>
  17 handle_packet_upcall ()         at lib/dpif-netdev.c:6591
  18 fast_path_processing ()         at lib/dpif-netdev.c:6709
  19 dp_netdev_input__ ()            at lib/dpif-netdev.c:6797
  20 dp_netdev_recirculate ()        at lib/dpif-netdev.c:6842
  21 dp_execute_cb ()                at lib/dpif-netdev.c:7158
  22 odp_execute_actions ()          at lib/odp-execute.c:794
  23 dp_netdev_execute_actions ()    at lib/dpif-netdev.c:7332
  24 dpif_netdev_execute ()          at lib/dpif-netdev.c:3725
  25 dpif_netdev_operate ()          at lib/dpif-netdev.c:3756
                 <Packet pushed to userspace datapath for processing>
  26 dpif_operate ()                 at lib/dpif.c:1367
  27 dpif_execute ()                 at lib/dpif.c:1321
  28 packet_execute ()               at ofproto/ofproto-dpif.c:4760
  29 ofproto_packet_out_finish ()    at ofproto/ofproto.c:3594
                 <Taking ofproto_mutex>
  30 handle_packet_out ()            at ofproto/ofproto.c:3635
  31 handle_single_part_openflow (OFPTYPE_PACKET_OUT) at ofproto/ofproto.c:8400
  32 handle_openflow ()                               at ofproto/ofproto.c:8587
  33 ofconn_run ()
  34 connmgr_run ()
  35 ofproto_run ()
  36 bridge_run__ ()
  37 bridge_run ()
  38 main ()

Fix that by splitting the 'ofproto_packet_out_finish()' in two parts.
First one that translates side-effects and requires holding 'ofproto_mutex'
and the second that only pushes packet to datapath.  The second part moved
out of 'ofproto_mutex' because 'ofproto_mutex' is not required and actually
should not be taken in order to avoid recursive locking.

Reported-by: Anil Kumar Koli <anilkumar.k@altencalsoftlabs.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048494.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-11-26 09:48:36 +01:00
Flavio Leitner
aa453e3199 ofproto-dpif: Expose datapath ND Extensions capability to ovsdb
Document and expose datapath ND Extensions capability to ovsdb.

Fixes: d0d571493 ("ofproto-dpif: Allow IPv6 ND Extensions only if supported")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-22 14:03:36 -08:00
Tonghao Zhang
0442bfb11d ofproto-dpif-upcall: Echo HASH attribute back to datapath.
The kernel datapath may sent upcall with hash info,
ovs-vswitchd should get it from upcall and then send
it back.

The reason is that:
| When using the kernel datapath, the upcall don't
| include skb hash info relatived. That will introduce
| some problem, because the hash of skb is important
| in kernel stack. For example, VXLAN module uses
| it to select UDP src port. The tx queue selection
| may also use the hash in stack.
|
| Hash is computed in different ways. Hash is random
| for a TCP socket, and hash may be computed in hardware,
| or software stack. Recalculation hash is not easy.
|
| There will be one upcall, without information of skb
| hash, to ovs-vswitchd, for the first packet of a TCP
| session. The rest packets will be processed in Open vSwitch
| modules, hash kept. If this tcp session is forward to
| VXLAN module, then the UDP src port of first tcp packet
| is different from rest packets.
|
| TCP packets may come from the host or dockers, to Open vSwitch.
| To fix it, we store the hash info to upcall, and restore hash
| when packets sent back.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=bd1903b7c4596ba6f7677d0dfefd05ba5876707d
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-22 09:51:57 -08:00
Flavio Leitner
d0d571493c ofproto-dpif: Allow IPv6 ND Extensions only if supported
The IPv6 ND Extensions is only implemented in userspace datapath,
but nothing prevents that to be used with other datapaths.

This patch probes the datapath and only allows if the support
is available.

Fixes: 9b2b84973 ("Support for match & set ICMPv6 reserved and options type fields")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-21 17:15:47 -08:00
William Tu
27501802d0 ofproto-dpif: Expose datapath capability to ovsdb.
The patch adds support for fetching the datapath's capabilities
from the result of 'check_support()', and write the supported capability
to a new database column, called 'capabilities' under Datapath table.

To see how it works, run:
 # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
 # ovs-vsctl -- --id=@m create Datapath datapath_version=0 \
     'ct_zones={}' 'capabilities={}' \
     -- set Open_vSwitch . datapaths:"netdev"=@m

 # ovs-vsctl list-dp-cap netdev
 ufid=true sample_nesting=true clone=true tnl_push_pop=true \
 ct_orig_tuple=true ct_eventmask=true ct_state=true \
 ct_clear=true max_vlan_headers=1 recirc=true ct_label=true \
 max_hash_alg=1 ct_state_nat=true ct_timeout=true \
 ct_mark=true ct_orig_tuple6=true check_pkt_len=true \
 masked_set_action=true max_mpls_depth=3 trunc=true ct_zone=true

Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
---
v5:
    Add improved documentation from Ben and
    fix checkpatch error (tab and line 79 char)
v4:
    rebase to master
v3:
    fix 32-bit build, reported by Greg
    travis: https://travis-ci.org/williamtu/ovs-travis/builds/599276267
v2:
	rebase to master
2019-11-21 09:19:59 -08:00
Ben Pfaff
afc0c379f6 ovs-actions: Clarify documentation for stack usage with group buckets.
This should be less confusing now.

Reported-by: Han Zhou <hzhou@ovn.org>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-20 15:19:50 -08:00