mirror of https://github.com/openvswitch/ovs synced 2025-08-31 06:15:47 +00:00
Commit Graph

281 Commits

Author SHA1 Message Date
Eelco Chaudron
252ee0f182 dpif: Fix flow put debug message match content.
The odp_flow_format() function applies a wildcard mask when a
mask for a given key was not present. However, in the context of
installing flows in the datapath, the absence of a mask actually
indicates that the key should be ignored, meaning it should not
be masked at all.

To address this inconsistency, odp_flow_format() now includes an
option to skip formatting keys that are missing a mask.

This was found during a debug session of the ‘datapath - ping between two
ports on cvlan’ test case. The log message was showing the following:

  put[create] ufid:XX recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(3),
    skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),
    eth(src=12:f6:8b:52:f9:75,dst=6e:48:c8:77:d3:8c),eth_type(0x88a8),
    vlan(vid=4094,pcp=0/0x0),encap(eth_type(0x8100),
    vlan(vid=100/0x0,pcp=0/0x0),encap(eth_type(0x0800),
    ipv4(src=10.2.2.2,dst=10.2.2.1,proto=1,tos=0,ttl=64,frag=no),
    icmp(type=0,code=0))), actions:2

Where it should have shown the below:

  put[create] ufid:XX recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(3),
    skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),
    eth(src=12:f6:8b:52:f9:75,dst=6e:48:c8:77:d3:8c),eth_type(0x88a8),
    vlan(vid=4094,pcp=0/0x0),encap(eth_type(0x8100)), actions:2
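
The formatting rule above can be sketched in Python as follows. This is a simplified model, not the odp_flow_format() code: the `format_flow` name, the `skip_no_mask` flag, the `EXACT` sentinel, and the list/dict representation of keys and masks are all assumptions made for illustration.

```python
EXACT = object()  # Hypothetical sentinel: a mask is present and fully exact.


def format_flow(keys, masks, skip_no_mask=False):
    """Format (name, value) keys; `masks` maps a key name to its mask."""
    parts = []
    for name, value in keys:
        mask = masks.get(name)
        if mask is None:
            if skip_no_mask:
                continue  # No mask: the datapath ignores this key.
            parts.append(f"{name}({value})")  # Printed anyway (old behaviour).
        elif mask is EXACT:
            parts.append(f"{name}({value})")
        else:
            parts.append(f"{name}({value}/{mask})")
    return ",".join(parts)


keys = [("recirc_id", "0"), ("dp_hash", "0"), ("eth_type", "0x0800")]
masks = {"recirc_id": EXACT, "dp_hash": "0"}  # eth_type has no mask.

old_output = format_flow(keys, masks)
new_output = format_flow(keys, masks, skip_no_mask=True)
```

With the flag set, the key that has no mask no longer appears in the formatted flow, which mirrors the shorter log line shown above.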

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2024-09-09 10:22:16 +02:00
Eelco Chaudron
c903624885 dpif: Fix potential NULL pointer access in log_flow_message().
When actions is NULL and action_len is not zero, it would
potentially allow format_odp_actions() to dereference the
NULL pointer. The fix will check for valid actions pointer
only, as format_odp_actions() will handle a zero length.

Fixes: cdee00fd63 ("datapath: Replace "struct odp_action" by Netlink attributes.")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
2024-08-29 11:18:07 +02:00
Adrian Moreno
d0afbf0944 ofproto_dpif: Check for psample support.
Only kernel datapath supports this action so add a function in dpif.c
that checks for that.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-07-15 11:15:30 +02:00
Adrian Moreno
1a3bd96b4f odp-util: Add support OVS_ACTION_ATTR_PSAMPLE.
Add support for parsing and formatting the new action.

Also, flag OVS_ACTION_ATTR_SAMPLE as requiring datapath assistance if it
contains a nested OVS_ACTION_ATTR_PSAMPLE. The reason is that the
sampling rate from the parent "sample" is made available to the nested
"psample" by the kernel.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-07-14 17:19:52 +02:00
Eric Garver
3c8d069b9b dpif: Probe support for OVS_ACTION_ATTR_DROP.
Kernel support has been added for this action. As such, we need to probe
the datapath for support.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Eric Garver <eric@garver.life>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-04-05 23:10:17 +02:00
Eric Garver
8bb065961e dpif: Stub out unimplemented action OVS_ACTION_ATTR_DEC_TTL.
This is prep for adding a different OVS_ACTION_ATTR_ enum value. This
action, OVS_ACTION_ATTR_DEC_TTL, is not actually implemented. However,
to make -Werror happy we must add a case to all existing switches.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Eric Garver <eric@garver.life>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-04-05 23:10:17 +02:00
Ilya Maximets
47520b33bd ofproto-dpif: Fix removal of renamed datapath ports.
OVS configuration is based on port names and OpenFlow port numbers.
Names are stored in the database and translated later to OF ports.
On the datapath level, each port has a name and a datapath port number.
Port name in the database has to match datapath port name, unless it's
a tunnel port.

If a datapath port is renamed with 'ip link set DEV name NAME',
ovs-vswitchd will wake up, destroy all the OpenFlow-related structures
and clean other things up.  This is because the port no longer
represents the port from a database due to a name difference.

However, ovs-vswitchd will not actually remove the port from the
datapath, because it thinks that this port is no longer there.  This
is happening because lookup is performed by name and the name has
changed.  As a result we have a port in a datapath that is not related
to any port known to ovs-vswitchd and ovs-vswitchd can't remove it.
This port also occupies a datapath port number and prevents the port
from being added back with a new name.

Fix that by performing lookup by a datapath port number during the port
destruction.  The name was used only to avoid spurious warnings in a
normal case where the port was successfully deleted by other parts of
OVS.  Adding an extra flag to avoid these warnings instead.
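
A toy Python model of why the lookup key matters, assuming a datapath that stores ports by number. The `Datapath` class and its methods are hypothetical and only illustrate the failure mode; they are not the ofproto-dpif code.

```python
class Datapath:
    """Toy datapath: ports keyed by datapath port number (hypothetical)."""

    def __init__(self):
        self.ports = {}  # port_no -> current name

    def add(self, port_no, name):
        self.ports[port_no] = name

    def rename(self, port_no, new_name):
        # Models an external 'ip link set DEV name NAME'.
        self.ports[port_no] = new_name

    def del_by_name(self, name):
        for port_no, current in list(self.ports.items()):
            if current == name:
                del self.ports[port_no]
                return True
        return False  # Stale name: nothing is removed.

    def del_by_number(self, port_no):
        return self.ports.pop(port_no, None) is not None


dp = Datapath()
dp.add(1, "eth0")
dp.rename(1, "uplink0")
removed_by_name = dp.del_by_name("eth0")   # Lookup by the old name fails.
removed_by_number = dp.del_by_number(1)    # Lookup by number succeeds.
```

Deleting by number frees the port number even though the name known to the caller is stale.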

Fixes: 02f8d6460a ("ofproto-dpif: Query port existence by name to prevent warnings.")
Reported-at: https://github.com/openvswitch/ovs-issues/issues/284
Tested-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-07-31 13:57:57 +02:00
Eelco Chaudron
903294cde6 dpif: Add coverage counters for dpif_operate() failures.
Add additional error coverage counters for dpif operation failures.
This could help to quickly identify netlink problems when communicating
with the OVS kernel module.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2070630
Reviewed-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-06-23 15:04:55 +02:00
Eelco Chaudron
4d69c19000 ofproto-dpif-upcall: Reset ukey's last stats value if the datapath changed.
When the ukey's action set changes, it could cause the flow to use a
different datapath, for example, when it moves from tc to kernel.
This will cause the cached previous datapath statistics to be used.

This change will reset the cached statistics when a change in
datapath is discovered.
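
A minimal Python sketch of the idea, assuming that statistics are credited as the difference between the current and the previously cached counter. The `Ukey` class, its fields, and the clamping of negative deltas to zero are illustrative assumptions, not the ofproto-dpif-upcall code.

```python
class Ukey:
    """Hypothetical stand-in for a ukey's cached datapath statistics."""

    def __init__(self):
        self.datapath = None   # For example "tc" or "kernel".
        self.last_packets = 0  # Counter value seen at the previous dump.

    def push_stats(self, datapath, packets):
        """Return the packet delta to credit for this dump."""
        if datapath != self.datapath:
            # Counters from the old datapath are not comparable: reset.
            self.last_packets = 0
            self.datapath = datapath
        delta = max(packets - self.last_packets, 0)
        self.last_packets = packets
        return delta


ukey = Ukey()
first = ukey.push_stats("tc", 1000)    # Flow handled by tc.
moved = ukey.push_stats("kernel", 10)  # Flow moved to the kernel datapath.
later = ukey.push_stats("kernel", 25)
```

Without the reset, the second dump would compare 10 against the stale value 1000 and credit nothing until the new counter caught up.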

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-03-03 22:27:37 +01:00
Eelco Chaudron
182b9cb352 dpif: Fix tunnel key set for IPv6 tunnels with SLOW_ACTION.
The dpif_execute_helper_cb() function is supposed to add the
OVS_ACTION_ATTR_SET(OVS_KEY_ATTR_TUNNEL()) action to the
list of actions when passing it down to the kernel.

This function was only checking whether the IPv4 destination
address was set, not the IPv6 one as well. This patch fixes this,
including a datapath testcase.
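
The corrected condition can be sketched in Python as below. The `tunnel_dst_is_set` helper and its string arguments are assumptions for illustration; the real check operates on the tunnel fields of the flow.

```python
import ipaddress


def tunnel_dst_is_set(ipv4_dst, ipv6_dst):
    """Return True if either tunnel destination address is non-zero."""
    v4 = ipaddress.IPv4Address(ipv4_dst)
    v6 = ipaddress.IPv6Address(ipv6_dst)
    return not v4.is_unspecified or not v6.is_unspecified


ipv4_tunnel = tunnel_dst_is_set("10.0.0.1", "::")
ipv6_tunnel = tunnel_dst_is_set("0.0.0.0", "fc00::1")  # Missed by an IPv4-only check.
no_tunnel = tunnel_dst_is_set("0.0.0.0", "::")
```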

Fixes: 076caa2fb0 ("ofproto: Meter translation.")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-01-06 15:42:15 +01:00
Gaetan Rivet
9ab104718b dpctl: Add function to read hardware offload statistics.
Expose a function to query datapath offload statistics.
This function is separate from the current one in netdev-offload
as it exposes more detailed statistics from the datapath, instead of
only from the netdev-offload provider.

Each datapath is meant to use the custom counters as it sees fit for its
handling of hardware offloads.

Call the new API from dpctl.

Signed-off-by: Gaetan Rivet <grive@u256.net>
Reviewed-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-18 15:12:01 +01:00
Eelco Chaudron
51ec98635e utilities: Add upcall USDT probe and associated script.
Added the dpif_recv:recv_upcall USDT probe, which is used by the
included upcall_monitor.py script. This script receives all upcall
packets sent by the kernel to ovs-vswitchd. By default, it will
show all upcall events, which looks something like this:

 TIME               CPU  COMM      PID      DPIF_NAME          TYPE PKT_LEN FLOW_KEY_LEN
 5952147.003848809  2    handler4  1381158  system@ovs-system  0    98      132
 5952147.003879643  2    handler4  1381158  system@ovs-system  0    70      160
 5952147.003914924  2    handler4  1381158  system@ovs-system  0    98      152

It can also dump the packet and NetLink content, and if required,
the packets can also be written to a pcap file.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-18 00:46:30 +01:00
Martin Varghese
1917ace893 Encap & Decap actions for MPLS packet type.
The encap & decap actions are extended to support MPLS packet type.
Encap & decap actions add and remove MPLS header at start of the
packet.

The existing PUSH MPLS & POP MPLS actions insert & remove MPLS
header between ethernet header and the IP header. Though this behaviour
is fine for L3 VPN where an IP packet is encapsulated inside a MPLS
tunnel, it does not satisfy the L2 VPN requirements. In L2 VPN the
ethernet packets must be encapsulated inside MPLS tunnel.

In this change the encap & decap actions are extended to support MPLS
packet type. The encap & decap adds and removes MPLS header at the
start of packet as depicted below.

Encapsulation:

Actions - encap(mpls),encap(ethernet)

Incoming packet -> | ETH | IP | Payload |

1 Actions - encap(mpls) [Datapath action - ADD_MPLS:0x8847]

        Outgoing packet -> | MPLS | ETH | IP | Payload|

2 Actions - encap(ethernet) [ Datapath action - push_eth ]

        Outgoing packet -> | ETH | MPLS | ETH | IP | Payload|

Decapsulation:

Incoming packet -> | ETH | MPLS | ETH | IP | Payload |

Actions - decap(),decap(packet_type(ns=0,type=0))

1 Actions - decap() [Datapath action - pop_eth]

        Outgoing packet -> | MPLS | ETH | IP | Payload|

2 Actions - decap(packet_type(ns=0,type=0)) [Datapath action - POP_MPLS:0x6558]

        Outgoing packet -> | ETH  | IP | Payload|
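
The header manipulation can be modelled in Python by treating a packet as a list of header names, with the inner IP header kept in place. The `encap` and `decap` helpers are hypothetical and only mimic adding or removing a header at the start of the packet.

```python
def encap(headers, new_header):
    """Add a header at the start of the packet."""
    return [new_header] + headers


def decap(headers):
    """Remove the outermost header from the packet."""
    return headers[1:]


packet = ["ETH", "IP", "Payload"]

# encap(mpls),encap(ethernet)
encapsulated = encap(encap(packet, "MPLS"), "ETH")

# decap(),decap(packet_type(ns=0,type=0))
decapsulated = decap(decap(encapsulated))
```

Two decap steps undo the two encap steps and return the original L2 frame.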

Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-17 02:04:20 +01:00
Ilya Maximets
11441385c2 bridge: Fix incorrect configuration of netdev's dpif type.
netdev_set_dpif_type() can only be used with a normalized dpif type
as an argument, which is a constant static string derived from a type
of a dpif_class or a constant string "system".  Usage of the same
constant string allows netdev-offload module to compare types by
simply comparing pointers.

OTOH, 'br->ofproto->type' is a dynamic string that:
a. Can be NULL.
b. Even if not NULL and equal, can be a different dynamically
   allocated string.

Both these qualities break assumptions made by all other modules
related to HW offload, breaking the functionality.

Fix that by moving netdev_set_dpif_type() to dpif.c and calling with
a correct constant string as an argument.

The call moved from bridge.c to dpif.c, because we need to have access
to the dpif class, but bridge.c should not.

Not trying to set the dpif_type inside the netdev_ports_insert(),
because it's used now outside the offloading context.  So, it's
cleaner to move the netdev_set_dpif_type() call outside of the
netdev-offload module.

Additionally removed the redundant call from the netdev_ports_insert()
and refactored the function, since it doesn't need an extra argument
anymore.

Fixes: 4f19a78a61 ("netdev-vport: Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs.")
Reported-by: Roi Dayan <roid@nvidia.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-December/390117.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Lin Huang <linhuang@ruijie.com.cn>
Acked-by: Roi Dayan <roid@nvidia.com>
2021-12-17 21:31:55 +01:00
Eelco Chaudron
9b20df73a6 dpctl: dpif: Allow viewing and configuring dp cache sizes.
This patch adds a general way of viewing/configuring datapath
cache sizes. With an implementation for the netlink interface.

The ovs-dpctl/ovs-appctl show commands will display the
current cache sizes configured:

 $ ovs-dpctl show
 system@ovs-system:
   lookups: hit:25 missed:63 lost:0
   flows: 0
   masks: hit:282 total:0 hit/pkt:3.20
   cache: hit:4 hit-rate:4.54%
   caches:
     masks-cache: size:256
   port 0: ovs-system (internal)
   port 1: br-int (internal)
   port 2: genev_sys_6081 (geneve: packet_type=ptap)
   port 3: br-ex (internal)
   port 4: eth2
   port 5: sw0p1 (internal)
   port 6: sw0p3 (internal)

A specific cache can be configured as follows:

 $ ovs-appctl dpctl/cache-set-size DP CACHE SIZE
 $ ovs-dpctl cache-set-size DP CACHE SIZE

For example to disable the cache do:

 $ ovs-dpctl cache-set-size system@ovs-system masks-cache 0
 Setting cache size successful, new size 0.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-11-08 21:48:05 +01:00
Somnath Chatterjee
42c348184b dpif: Fix function pointer check for bond_add.
There was a typo in the function pointer check in
dpif_bond_add() before calling bond_add().

Fixes: 9df65060cf ("userspace: Avoid dp_hash recirculation for balance-tcp bond mode.")
Signed-off-by: Somnath Chatterjee <somnath.b.chatterjee@ericsson.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-11-03 16:38:45 +01:00
Mark Gray
b1e517bd2f dpif-netlink: Introduce per-cpu upcall dispatch.
The Open vSwitch kernel module uses the upcall mechanism to send
packets from kernel space to user space when it misses in the kernel
space flow table. The upcall sends packets via a Netlink socket.
Currently, a Netlink socket is created for every vport. In this way,
there is a 1:1 mapping between a vport and a Netlink socket.
When a packet is received by a vport, if it needs to be sent to
user space, it is sent via the corresponding Netlink socket.

This mechanism, with various iterations of the corresponding user
space code, has seen some limitations and issues:

* On systems with a large number of vports, there is correspondingly
a large number of Netlink sockets which can limit scaling.
(https://bugzilla.redhat.com/show_bug.cgi?id=1526306)
* Packet reordering on upcalls.
(https://bugzilla.redhat.com/show_bug.cgi?id=1844576)
* A thundering herd issue.
(https://bugzilla.redhat.com/show_bug.cgi?id=1834444)

This patch introduces an alternative, feature-negotiated, upcall
mode using a per-cpu dispatch rather than a per-vport dispatch.

In this mode, the Netlink socket to be used for the upcall is
selected based on the CPU of the thread that is executing the upcall.
In this way, it resolves the issues above as:

a) The number of Netlink sockets scales with the number of CPUs
rather than the number of vports.
b) Ordering per-flow is maintained as packets are distributed to
CPUs based on mechanisms such as RSS and flows are distributed
to a single user space thread.
c) Packets from a flow can only wake up one user space thread.
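
The difference between the two dispatch modes can be sketched in Python as follows. Both helper functions are hypothetical, and the wrap-around when there are fewer sockets than CPUs is an assumption of this sketch rather than something stated in the commit message.

```python
def select_socket_per_vport(vport_sockets, vport):
    """Per-vport dispatch: one socket for every vport."""
    return vport_sockets[vport]


def select_socket_per_cpu(cpu_sockets, cpu_id):
    """Per-cpu dispatch: the socket depends only on the executing CPU."""
    # Assumption: wrap around if fewer sockets than CPUs were registered.
    return cpu_sockets[cpu_id % len(cpu_sockets)]


cpu_sockets = ["sock0", "sock1", "sock2", "sock3"]
on_cpu2 = select_socket_per_cpu(cpu_sockets, 2)
on_cpu5 = select_socket_per_cpu(cpu_sockets, 5)
```

The number of sockets depends on the CPU count only, regardless of how many vports exist.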

Reported-at: https://bugzilla.redhat.com/1844576
Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-07-16 20:05:03 +02:00
Ilya Maximets
f9d3039039 dpif: Fix use of uninitialized execute hash.
'dpif_execute_helper_cb' doesn't initialize the 'hash' field that
may be passed down to datapath and might cause execution of a different
set of actions, e.g. on recirculation.

 Thread 6 handler27:
 Conditional jump or move depends on uninitialised value(s)
    at 0x53A2C2: dpif_netlink_encode_execute (dpif-netlink.c:1841)
    by 0x53A2C2: dpif_netlink_operate__ (dpif-netlink.c:1919)
    by 0x53A82D: dpif_netlink_operate_chunks (dpif-netlink.c:2238)
    by 0x53A82D: dpif_netlink_operate (dpif-netlink.c:2297)
    by 0x48135F: dpif_operate (dpif.c:1366)
    by 0x481923: dpif_execute.part.24 (dpif.c:1320)
    by 0x481C46: dpif_execute (dpif.c:1312)
    by 0x481C46: dpif_execute_helper_cb (dpif.c:1243)
    by 0x4AE943: odp_execute_actions (odp-execute.c:865)
    by 0x47F272: dpif_execute_with_help (dpif.c:1296)
    by 0x4812FF: dpif_operate (dpif.c:1422)
    by 0x442226: handle_upcalls (ofproto-dpif-upcall.c:1617)
    by 0x442226: recv_upcalls.isra.36 (ofproto-dpif-upcall.c:855)
    by 0x442351: udpif_upcall_handler (ofproto-dpif-upcall.c:755)
    by 0x4FDE2C: ovsthread_wrapper (ovs-thread.c:383)
    by 0x5E19159: start_thread (in /usr/lib64/libpthread-2.28.so)
    by 0x69ECF72: clone (in /usr/lib64/libc-2.28.so)
  Uninitialised value was created by a stack allocation
    at 0x481966: dpif_execute_helper_cb (dpif.c:1159)

Additionally added a missing comment to the 'struct dpif_execute'.

Fixes: 0442bfb11d ("ofproto-dpif-upcall: Echo HASH attribute back to datapath.")
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-04-20 00:00:22 +02:00
Jianbo Liu
c5b4b0ce95 dpif-netlink: Fix issues of the offloaded flows counter.
The n_offloaded_flows counter is saved in dpif, and this is the first
one when ofproto is created. When flow operation is done by ovs-appctl
commands, such as, dpctl/add-flow, a new dpif is opened, and the
n_offloaded_flows in it can't be used. So, instead of using counter,
the number of offloaded flows is queried from each netdev, then sum
them up. To achieve this, a new API is added in netdev_flow_api to get
how many flows assigned to a netdev.

In order to get better performance, this number is calculated directly
from tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc
takes much time if there are many flows offloaded.

Fixes: af06184705 ("dpif-netlink: Count the number of offloaded rules")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 20:25:59 +01:00
Ben Pfaff
91fc374a9c Eliminate use of term "slave" in bond, LACP, and bundle contexts.
The new term is "member".

Most of these changes should not change user-visible behavior.  One
place where they do is in "ovs-ofctl dump-flows", which will now output
"members:..." inside "bundle" actions instead of "slaves:...".  I don't
expect this to cause real problems in most systems.  The old syntax
is still supported on input for backward compatibility.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
2020-10-21 11:28:24 -07:00
Ben Pfaff
8205fbc8f5 Eliminate "whitelist" and "blacklist" terms.
There is one remaining use under datapath.  That change should happen
upstream in Linux first according to our usual policy.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
2020-10-16 19:22:24 -07:00
Ilya Maximets
8842fdf1b3 netdev-offload: Use dpif type instead of class.
There is no real difference between the 'class' and 'type' in the
context of common lookup operations inside netdev-offload module
because it only checks the value of pointers without using the
value itself.  However, 'type' has some meaning and can be used by
offload providers on the initialization phase to check if this type
of Flow API in pair with the netdev type could be used in particular
datapath type.  For example, this is needed to check if Linux flow
API could be used for current tunneling vport because it could be
used only if tunneling vport belongs to system datapath, i.e. has
backing linux interface.

This is needed to unblock tunneling offloads in userspace datapath
with DPDK flow API.

Acked-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Roni Bar Yanai <roniba@mellanox.com>
Acked-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-07-08 19:07:21 +02:00
Vishal Deep Ajmera
9df65060cf userspace: Avoid dp_hash recirculation for balance-tcp bond mode.
Problem:

In OVS, flows with output over a bond interface of type “balance-tcp”
get translated by the ofproto layer into "HASH" and "RECIRC" datapath
actions. After recirculation, the packet is forwarded to the bond
member port based on 8-bits of the datapath hash value computed through
dp_hash. This causes performance degradation in the following ways:

1. The recirculation of the packet implies another lookup of the
packet’s flow key in the exact match cache (EMC) and potentially
Megaflow classifier (DPCLS). This is the biggest cost factor.

2. The recirculated packets have a new “RSS” hash and compete with the
original packets for the scarce number of EMC slots. This implies more
EMC misses and potentially EMC thrashing causing costly DPCLS lookups.

3. The 256 extra megaflow entries per bond for dp_hash bond selection
put additional load on the revalidation threads.

Owing to this performance degradation, deployments stick to “balance-slb”
bond mode even though it does not do active-active load balancing for
VXLAN- and GRE-tunnelled traffic because all tunnel packets have the
same source MAC address.

Proposed optimization:

This proposal introduces a new load-balancing output action instead of
recirculation.

Maintain one table per-bond (could just be an array of uint16's) and
program it the same way internal flows are created today for each
possible hash value (256 entries) from ofproto layer. Use this table to
load-balance flows as part of output action processing.

Currently xlate_normal() -> output_normal() ->
bond_update_post_recirc_rules() -> bond_may_recirc() and
compose_output_action__() generate 'dp_hash(hash_l4(0))' and
'recirc(<RecircID>)' actions. In this case the RecircID identifies the
bond. For the recirculated packets the ofproto layer installs megaflow
entries that match on RecircID and masked dp_hash and send them to the
corresponding output port.

Instead, we will now generate action as
    'lb_output(<bond id>)'

This combines hash computation (only if needed, else re-use RSS hash)
and inline load-balancing over the bond. This action is used *only* for
balance-tcp bonds in userspace datapath (the OVS kernel datapath
remains unchanged).

Example:
Current scheme:

With 8 UDP flows (with random UDP src port):

  flow-dump from pmd on cpu core: 2
  recirc_id(0),in_port(7),<...> actions:hash(hash_l4(0)),recirc(0x1)

  recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0xb236c260/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0x7d89eb18/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0xa78d75df/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0xb58d846f/0xff),<...> actions:2
  recirc_id(0x1),dp_hash(0x24534406/0xff),<...> actions:1
  recirc_id(0x1),dp_hash(0x3cf32550/0xff),<...> actions:1

New scheme:
We can do with a single flow entry (for any number of new flows):

  in_port(7),<...> actions:lb_output(1)
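
A minimal Python sketch of the per-bond table lookup, assuming a 256-entry table indexed by the low 8 bits of the packet hash. The round-robin fill in `build_bond_table` is an illustrative policy only; in OVS the table is programmed from the ofproto layer.

```python
BOND_BUCKETS = 256  # One entry per possible 8-bit hash value.


def build_bond_table(members):
    """Spread member ports round-robin over the buckets (illustrative)."""
    return [members[i % len(members)] for i in range(BOND_BUCKETS)]


def lb_output(table, dp_hash):
    """Pick the output port from the low 8 bits of the packet hash."""
    return table[dp_hash & 0xFF]


table = build_bond_table([2, 1])
port = lb_output(table, 0xF8E02B7E)  # Bucket 0x7e selects a member port.
```

A single table lookup replaces the recirculation and the per-hash megaflow entries.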

A new CLI has been added to dump datapath bond cache as given below.

 # ovs-appctl dpif-netdev/bond-show [dp]

   Bond cache:
     bond-id 1 :
       bucket 0 - slave 2
       bucket 1 - slave 1
       bucket 2 - slave 2
       bucket 3 - slave 1

Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-06-22 13:11:51 +02:00
Ilya Maximets
342b8904ab dpif: Fix dp_extra_info leak by reworking the allocation scheme.
dpctl module leaks the 'dp_extra_info' in case the dumped flow doesn't
fit the dump filter while executing dpctl/dump-flows and also while
executing dpctl/get-flow.

This is already a 3rd attempt to fix all the leaks and incorrect usage
of this string that definitely indicates poor initial design of the
feature.

Flow dump/get documentation clearly states that the caller does not own
the data provided in dpif_flow.  Datapath still owns all the data and
promises to not free/modify it until the next quiescent period, however
we're requesting the caller to free 'dp_extra_info' and this obviously
breaks the rules.

This patch fixes the issue by storing 'dp_extra_info' within
'struct dp_netdev_flow' making datapath to own it.  'dp_netdev_flow'
is RCU-protected, so it will be valid until the next quiescent period.

Fixes: 0e8f5c6a38 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Tested-by: Emma Finn <emma.finn@intel.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-27 21:20:01 +01:00
Ilya Maximets
d7b55c5c94 dpif: Fix leak and usage of uninitialized dp_extra_info.
'dpif_probe_feature'/'revalidate' doesn't free the 'dp_extra_info'
string.  Also, all the implementations of dpif_flow_get() should
initialize the value to avoid printing/freeing of random memory.

 30 bytes in 1 blocks are definitely lost in loss record 323 of 889
    at 0x483AD19: realloc (vg_replace_malloc.c:836)
    by 0xDDAD89: xrealloc (util.c:149)
    by 0xCE1609: ds_reserve (dynamic-string.c:63)
    by 0xCE1A90: ds_put_format_valist (dynamic-string.c:161)
    by 0xCE19B9: ds_put_format (dynamic-string.c:142)
    by 0xCCCEA9: dp_netdev_flow_to_dpif_flow (dpif-netdev.c:3170)
    by 0xCCD2DD: dpif_netdev_flow_get (dpif-netdev.c:3278)
    by 0xCCEA0A: dpif_netdev_operate (dpif-netdev.c:3868)
    by 0xCDF81B: dpif_operate (dpif.c:1361)
    by 0xCDEE93: dpif_flow_get (dpif.c:1002)
    by 0xCDECF9: dpif_probe_feature (dpif.c:962)
    by 0xC635D2: check_recirc (ofproto-dpif.c:896)
    by 0xC65C02: check_support (ofproto-dpif.c:1567)
    by 0xC63274: open_dpif_backer (ofproto-dpif.c:818)
    by 0xC65E3E: construct (ofproto-dpif.c:1605)
    by 0xC4D436: ofproto_create (ofproto.c:549)
    by 0xC3931A: bridge_reconfigure (bridge.c:877)
    by 0xC3FEAC: bridge_run (bridge.c:3324)
    by 0xC4551D: main (ovs-vswitchd.c:127)

CC: Emma Finn <emma.finn@intel.com>
Fixes: 0e8f5c6a38 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
2020-01-20 17:51:16 +01:00
Ilya Maximets
7a5e0ee7cc dpif: Turn dpif_flow_hash function into generic odp_flow_key_hash.
Current implementation of dpif_flow_hash() doesn't depend on datapath
interface and only complicates the callers by forcing them to figure
out what is their current 'dpif'.  If we'll need different hashing
for different 'dpif's we'll implement an API for dpif-providers
and each dpif implementation will be able to use their local function
directly without calling it via dpif API.

This change will allow us to not store 'dpif' pointer in the userspace
datapath implementation which is broken and will be removed in next
commits.

This patch moves dpif_flow_hash() to odp-util module and replaces
unused odp_flow_key_hash() by it, along with removing of unused 'dpif'
argument.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2020-01-08 16:02:37 +01:00
Anju Thomas
a13a020975 userspace: Improved packet drop statistics.
Currently OVS maintains explicit packet drop/error counters only on port
level.  Packets that are dropped as part of normal OpenFlow processing
are counted in flow stats of “drop” flows or as table misses in table
stats. These can only be interpreted by controllers that know the
semantics of the configured OpenFlow pipeline.  Without that knowledge,
it is impossible for an OVS user to obtain e.g. the total number of
packets dropped due to OpenFlow rules.

Furthermore, there are numerous other reasons for which packets can be
dropped by OVS slow path that are not related to the OpenFlow pipeline.
The generated datapath flow entries include a drop action to avoid
further expensive upcalls to the slow path, but subsequent packets
dropped by the datapath are not accounted anywhere.

Finally, the datapath itself drops packets in certain error situations.
Also, these drops are today not accounted for. This makes it difficult
for OVS users to monitor packet drop in an OVS instance and to alert a
management system in case of an unexpected increase of such drops.
Also OVS trouble-shooters face difficulties in analysing packet drops.

With this patch we implement following changes to address the issues
mentioned above.

1. Identify and account all the silent packet drop scenarios
2. Display these drops in ovs-appctl coverage/show

Co-authored-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Co-authored-by: Keshav Gupta <keshugupta1@gmail.com>
Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
Signed-off-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Signed-off-by: Keshav Gupta <keshugupta1@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-07 17:01:42 +01:00
Paul Blakey
dcdcad68c6 dpif: Add support to set user features
This enables user features on the kernel datapath via the DP_CMD_SET
command, and also retrieves them to check for actual support and
not just an older kernel ignoring the requested features.

This will be used in next patch to enable recirc_id sharing with tc.
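
The set-then-read-back pattern can be sketched in Python as below. The feature bit names and the `negotiate_features` helper are hypothetical, and modelling the kernel's answer as a bitwise AND is an assumption of this sketch.

```python
FEATURE_A = 1 << 0  # Hypothetical feature bits, for illustration only.
FEATURE_B = 1 << 1


def negotiate_features(requested, kernel_supported):
    """Return the feature bits that are enabled after set + read-back."""
    # An older kernel may silently ignore bits it does not know about,
    # so only the value read back says what is really enabled.
    return requested & kernel_supported


enabled = negotiate_features(FEATURE_A | FEATURE_B, FEATURE_A)
has_b = bool(enabled & FEATURE_B)
```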

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:40 +01:00
Ilya Maximets
f87c135706 vswitchd: Always cleanup userspace datapath.
'netdev' datapath is implemented within ovs-vswitchd process and can
not exist without it, so it should be gracefully terminated with a
full cleanup of resources upon ovs-vswitchd exit.

This change forces dpif cleanup for 'netdev' datapath regardless of
passing '--cleanup' to 'ovs-appctl exit'. Such solution allows to
not pass this additional option every time for userspace datapath
installations and also allows to not terminate system datapath in
setups where both datapaths run at the same time.

The main part is that dpif_port_del() will lead to netdev_close()
and subsequent netdev_class->destroy(dev) which will stop HW NICs
and free their resources. For vhost-user interfaces it will invoke
vhost driver unregistering with a properly closed vhost-user
connection. For upcoming AF_XDP netdev this will allow to gracefully
destroy xdp sockets and unload xdp programs from linux interfaces.
Another important thing is that port deletion will also trigger
flushing of flows offloaded to HW NICs.

Exception made for 'internal' ports that could have user ip/route
configuration. These ports will not be removed without '--cleanup'.
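
The resulting exit policy can be summarized with a small Python sketch. The `ports_to_remove` function and the (name, type) tuples are assumptions for illustration; they only restate the rules given in this commit message.

```python
def ports_to_remove(ports, dp_type, cleanup):
    """Return the names of the ports to delete on ovs-vswitchd exit.

    `ports` is a list of (name, type) tuples, `dp_type` is "netdev" or
    "system", and `cleanup` tells whether '--cleanup' was passed.
    """
    if dp_type != "netdev" and not cleanup:
        return []  # The system datapath is left alone without '--cleanup'.
    return [name for name, port_type in ports
            if cleanup or port_type != "internal"]


ports = [("br0", "internal"), ("dpdk0", "dpdk"), ("vhost0", "dpdkvhostuser")]
netdev_default = ports_to_remove(ports, "netdev", cleanup=False)
netdev_cleanup = ports_to_remove(ports, "netdev", cleanup=True)
system_default = ports_to_remove(ports, "system", cleanup=False)
```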

This change fixes OVS disappearing from the DPDK point of view
(keeping HW NICs improperly configured, sudden closing of vhost-user
connections) and will help with linux devices clearing with upcoming
AF_XDP netdev support.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-07-02 12:24:47 +03:00
Numan Siddique
5b34f8fc3b Add a new OVS action check_pkt_larger
This patch adds a new action 'check_pkt_larger' which checks if the
packet is larger than the given size and stores the result in the
destination register.

Usage: check_pkt_larger(len)->REGISTER
Eg. match=...,actions=check_pkt_larger(1442)->NXM_NX_REG0[0],next;

This patch makes use of the new datapath action - 'check_pkt_len'
which was recently added in the commit [1].
At the start of ovs-vswitchd, datapath is probed for this action.
If the datapath action is present, then 'check_pkt_larger'
makes use of this datapath action.

Datapath action 'check_pkt_len' takes these nlattrs
      * OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
      * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER (optional) - Nested actions
        to apply if the packet length is greater than the specified 'pkt_len'
      * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL (optional) - Nested
        actions to apply if the packet length is less than or equal to the
        specified 'pkt_len'.

Let's say we have these flows added to an OVS bridge br-int

table=0, priority=100,in_port=1,ip,actions=check_pkt_larger(100)->NXM_NX_REG0[0],resubmit(,1)
table=1, priority=200,in_port=1,ip,reg0=0x1/0x1 actions=output:3
table=1, priority=100,in_port=1,ip,actions=output:4

Then the action 'check_pkt_larger' will be translated as
  - check_pkt_len(size=100,gt(3),le(4))

The datapath checks the packet length: if it is greater than 100, the
packet is output to port 3; otherwise it is output to port 4.

If the datapath doesn't support the 'check_pkt_len' action, the OVS action
'check_pkt_larger' sets SLOW_ACTION so that the datapath flow is not added.

This OVS action is intended to be used by OVN to check the packet length
and generate an ICMP packet with type 3, code 4 and next hop mtu
in the logical router pipeline if the MTU of the physical interface
is less than the packet length. More information can be found here [2].

[1] - 4d5ec89fc8
[2] - https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html

Reported-at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Ben Pfaff <blp@ovn.org>
CC: Gregory Rose <gvrose8192@gmail.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-22 12:56:50 -07:00
John Hurley
608ff46aaf ovs-tc: offload datapath rules matching on internal ports
Rules applied to OvS internal ports are not represented in TC datapaths.
However, it is possible to support rules matching on internal ports in TC.
The start_xmit ndo of OvS internal ports directs packets back into the OvS
kernel datapath where they are rematched with the ingress port now being
that of the internal port. Due to this, rules matching on an internal port
can be added as TC filters to an egress qdisc for these ports.

Allow rules applied to internal ports to be offloaded to TC as egress
filters. Rules redirecting to an internal port are also offloaded. These
are supported by the redirect ingress functionality applied in an earlier
patch.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-04-10 13:55:59 +02:00
Flavio Leitner
9da3207af7 Revert "ofproto-dpif: Let the dpif report when a port is a duplicate."
This reverts commit 7521e0cf9e.

This patch introduced a regression in OSP environments using internal
ports in other netns. Their networking configuration is lost when
the service is restarted because the ports are recreated now.

Before the patch it checked using netlink if the port with a specific
"name" was already there. The check is a lookup in all ports attached
to the DP regardless of the port's netns.

After the patch it relies on the kernel to identify that situation.
Unfortunately the only protection there is register_netdevice() which
fails only if the port with that name exists in the current netns.

If the port is in another netns, it will get a new dp_port and because
of that userspace will delete the old port. At this point the original
port is gone from the other netns and there is a fresh port in the current
netns.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-01-25 13:20:13 -08:00
Ilya Maximets
1270b6e52c treewide: Wider use of packet batch APIs.
This patch replaces most of direct accesses to the dp_packet_batch
internal components by appropriate APIs.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2018-12-11 20:18:26 +00:00
Sriharsha Basavapatna
e6530a8d5d dpif: Restore a few lines with form feed characters
A few lines with form feed characters (ASCII: ^L) were accidentally
deleted by a recent commit to support rebalancing of offloaded flows.
This patch reverts those lines.

Fixes: 57924fc91c ("revalidator: Rebalance offloaded flows")
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-10-31 13:30:39 -07:00
Sriharsha Basavapatna via dev
57924fc91c revalidator: Rebalance offloaded flows based on the pps rate
This is the third patch in the patch-set to support dynamic rebalancing
of offloaded flows.

The dynamic rebalancing functionality is implemented in this patch. The
ukeys that are not scheduled for deletion are obtained and passed as input
to the rebalancing routine. The rebalancing is done in the context of
revalidation leader thread, after all other revalidator threads are
done with gathering rebalancing data for flows.

For each netdev that is in OOR (out-of-resources) state, a list of flows,
both offloaded and non-offloaded (pending), is obtained using the ukeys.
These flows are then grouped and sorted into offloaded and pending flows.
The offloaded flows are sorted in descending order of pps-rate, while
pending flows are sorted in ascending order of pps-rate.

The rebalancing is done in two phases. In the first phase, we try to
offload all pending flows and if that succeeds, the OOR state on the device
is cleared. If some or all of the pending flows could not be offloaded,
then we start replacing an offloaded flow that has a lower pps-rate than
a pending flow, until there are no more pending flows with a higher rate
than an offloaded flow. The flows that are replaced from the device are
added into the kernel datapath.
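
The replacement phase can be sketched as follows. This is a simplified
model and not the revalidator code: flows are reduced to bare pps rates,
the function names are hypothetical, and the first phase (which depends on
whether the device accepts the flow) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swaps two pps-rate entries. */
static void
swap_rates(uint64_t *a, uint64_t *b)
{
    uint64_t tmp = *a;
    *a = *b;
    *b = tmp;
}

/* 'offloaded' is sorted in descending pps order and 'pending' in ascending
 * order, so the slowest offloaded flow and the fastest pending flow both
 * sit at the tail of their arrays.  Swaps tail entries for as long as the
 * pending flow has the higher rate.  Returns the number of flows replaced.
 * The arrays are not re-sorted afterwards. */
static size_t
rebalance_replace(uint64_t *offloaded, size_t n_offloaded,
                  uint64_t *pending, size_t n_pending)
{
    size_t n_replaced = 0;

    assert(offloaded != NULL || n_offloaded == 0);
    assert(pending != NULL || n_pending == 0);

    while (n_replaced < n_offloaded && n_replaced < n_pending) {
        uint64_t *slowest = &offloaded[n_offloaded - 1 - n_replaced];
        uint64_t *fastest = &pending[n_pending - 1 - n_replaced];

        if (*fastest <= *slowest) {
            break;              /* No pending flow beats an offloaded one. */
        }
        swap_rates(slowest, fastest);
        n_replaced++;
    }
    return n_replaced;
}
```

With offloaded rates {900, 500, 20, 10} and pending rates {5, 30, 700},
two flows are replaced: 700 displaces 10 and 30 displaces 20, after which
the remaining pending flow (5) is slower than every offloaded flow.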

A new OVS configuration parameter "offload-rebalance", is added to ovsdb.
The default value of this is "false". To enable this feature, set the
value of this parameter to "true", which provides packets-per-second
rate based policy to dynamically offload and un-offload flows.

Note: This option can be enabled only when 'hw-offload' policy is enabled.
It also requires 'tc-policy' to be set to 'skip_sw'; otherwise, flow
offload errors (specifically the ENOSPC error this feature depends on)
reported by an offloaded device are suppressed by the TC-Flower kernel
module.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Sathya Perla <sathya.perla@broadcom.com>
Reviewed-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-10-19 11:27:52 +02:00
Ben Pfaff
769b50349f dpif: Remove support for multiple queues per port.
Commit 69c51582ff ("dpif-netlink: don't allocate per thread netlink
sockets") removed dpif-netlink support for multiple queues per port.
No remaining dpif provider supports multiple queues per port, so
remove infrastructure for the feature.

CC: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
2018-09-26 15:57:06 -07:00
Gavi Teitz
a692410af0 dpctl: Expand the flow dump type filter
Added new types to the flow dump filter, and allowed multiple filter
types to be passed at once, as a comma separated list. The new types
added are:
 * tc - specifies flows handled by the tc dp
 * non-offloaded - specifies flows not offloaded to the HW
 * all - specifies flows of all types

The type list is now fully parsed by the dpctl, and a new struct was
added to dpif which enables dpctl to define which types of dumps to
provide, rather than passing the type string and having dpif parse it.
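
A minimal sketch of parsing such a comma-separated type list into a struct
of flags is shown here. The struct and function names are hypothetical, not
the ones added to dpif, and only the three types listed above are
recognized.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag set, one flag per dump type named in this message. */
struct dump_types_sketch {
    bool tc;
    bool non_offloaded;
    bool all;
};

/* Returns true if the 'len'-byte token at 's' equals 'name'. */
static bool
token_is(const char *s, size_t len, const char *name)
{
    return strlen(name) == len && !memcmp(s, name, len);
}

/* Parses a comma-separated 'list' into 'types'.  Returns false if any
 * token is empty or unknown. */
static bool
parse_dump_types(const char *list, struct dump_types_sketch *types)
{
    assert(list != NULL && types != NULL);
    memset(types, 0, sizeof *types);

    while (*list) {
        size_t len = strcspn(list, ",");

        if (token_is(list, len, "tc")) {
            types->tc = true;
        } else if (token_is(list, len, "non-offloaded")) {
            types->non_offloaded = true;
        } else if (token_is(list, len, "all")) {
            types->all = true;
        } else {
            return false;
        }
        list += len;
        if (*list == ',') {
            list++;             /* Skip the separator. */
        }
    }
    return true;
}
```

Parsing once into flags keeps string handling in the caller, so the layer
that performs the dump only has to test booleans.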

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-09-13 16:56:25 +02:00
Justin Pettit
8101f03fcd dpif: Don't pass in '*meter_id' to meter_set commands.
The original intent of the API appears to be that the underlying DPIF
implementation would choose a local meter id.  However, neither of the
existing datapath meter implementations (userspace or Linux) implemented
that; they expected a valid meter id to be passed in, otherwise they
returned an error.  This commit follows the existing implementations and
makes the API somewhat cleaner.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2018-08-16 10:20:52 -07:00
Justin Pettit
6508c845ad dpif: Move common meter checks into the dpif layer.
Another dpif provider will soon add support for meters, so move
some of the common sanity checks up into the dpif layer so that each
provider doesn't need to re-implement them.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2018-07-30 13:00:49 -07:00
Justin Pettit
494a74557a Revert "dpctl: Expand the flow dump type filter"
Commit ab15e70eb5 ("dpctl: Expand the flow dump type filter") had a
number of issues with style, build breakage, and failing unit tests.
The patch is being reverted so that they can be addressed.

This reverts commit ab15e70eb5.

CC: Gavi Teitz <gavi@mellanox.com>
CC: Simon Horman <simon.horman@netronome.com>
CC: Roi Dayan <roid@mellanox.com>
CC: Aaron Conole <aconole@redhat.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2018-07-25 14:17:36 -07:00
Gavi Teitz
ab15e70eb5 dpctl: Expand the flow dump type filter
Added new types to the flow dump filter, and allowed multiple filter
types to be passed at once, as a comma separated list. The new types
added are:
 * tc - specifies flows handled by the tc dp
 * non-offloaded - specifies flows not offloaded to the HW
 * all - specifies flows of all types

The type list is now fully parsed by the dpctl, and a new struct was
added to dpif which enables dpctl to define which types of dumps to
provide, rather than passing the type string and having dpif parse it.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-07-25 18:16:27 +02:00
Ben Pfaff
7521e0cf9e ofproto-dpif: Let the dpif report when a port is a duplicate.
The port_add() function checks whether the port about to be added to the
dpif is already present and adds it only if it is not.  This duplicates a
check also present (and necessary) in each dpif and races with it as well.
When a dpif has a large number of ports, the check can be expensive (it is
not efficiently implemented).  It would be nice to make the check cheaper,
but it also seems reasonable to do as done in this patch and just let the
dpif report the duplication.

Reported-by: Haifeng Lin <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-07-05 14:59:53 -07:00
Ben Pfaff
5a0e4aec1a treewide: Convert leading tabs to spaces.
It's always been OVS coding style to use spaces rather than tabs for
indentation, but some tabs have snuck in over time.  This commit converts
them to spaces.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2018-06-11 15:32:00 -07:00
Ben Pfaff
fa37affad3 Embrace anonymous unions.
Several OVS structs contain embedded named unions, like this:

struct {
    ...
    union {
        ...
    } u;
};

C11 standardized a feature that many compilers already implemented
anyway, where an embedded union may be unnamed, like this:

struct {
    ...
    union {
        ...
    };
};

This is more convenient because it allows the programmer to omit "u."
in many places.  OVS already used this feature in several places.  This
commit embraces it in several others.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
2018-05-25 13:36:05 -07:00
Darrell Ball
7d7ded7af7 odp-execute: Rename 'may_steal' to 'should_steal'.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-05-23 11:36:47 -07:00
William Tu
c6d8720137 tunnel: make tun_key_to_attr aware of tunnel type.
When there is a flow rule which forwards a packet from geneve
port to another tunnel port, ex: gre, the tun_metadata carried
from the geneve port might affect the outgoing port.  For example,
the datapath action from geneve port output to gre port (1) shows:
  set(tunnel(tun_id=0x7b,dst=2.2.2.2,ttl=64,
    geneve({class=0xffff,type=0,len=4,0x123}),flags(df|key))),1
Where the geneve(...) should not exist.

When using kernel's tunnel port, this triggers an error saying:
"Multiple metadata blocks provided", when there is a rule forwarding
the geneve packet to vxlan/erspan tunnel port.  A userspace test case
using geneve and gre also demonstrates the issue.

The patch makes the tun_key_to_attr aware of the tunnel type. So only
the relevant output tunnel's options are set.

Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-05-14 16:21:03 -07:00
Ben Pfaff
0d71302e36 ofp-util, ofp-parse: Break up into many separate modules.
ofp-util had been far too large and monolithic for a long time.  This
commit breaks it up into units that make some logical sense.  It also
moves the pieces of ofp-parse that were specific to each unit into the
relevant unit.

Most of this commit is just moving code around.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
2018-02-13 10:43:13 -08:00
Eric Garver
1fe178d251 dpif: Add support for OVS_ACTION_ATTR_CT_CLEAR
This supports using the ct_clear action in the kernel datapath. To
preserve compatibility with current ct_clear behavior on old kernels, we
only pass this action down to the datapath if a probe reveals the
datapath actually supports it.

Signed-off-by: Eric Garver <e@erig.me>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2018-01-20 11:16:37 -08:00
Yi Yang
f59cb331c4 nsh: rework NSH netlink keys and actions
This patch changes OVS_KEY_ATTR_NSH to a nested attribute and adds
three new NSH sub attribute keys:

    OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
    OVS_NSH_KEY_ATTR_MD1:  for length-fixed MD type 1 context
    OVS_NSH_KEY_ATTR_MD2:  for length-variable MD type 2 metadata

The intention is to align with the NSH kernel implementation.

NSH match fields, set and PUSH_NSH action all use the below
nested attribute format:

OVS_KEY_ATTR_NSH begin
    OVS_NSH_KEY_ATTR_BASE
    OVS_NSH_KEY_ATTR_MD1
OVS_KEY_ATTR_NSH end

or

OVS_KEY_ATTR_NSH begin
    OVS_NSH_KEY_ATTR_BASE
    OVS_NSH_KEY_ATTR_MD2
OVS_KEY_ATTR_NSH end

In addition, NSH encap and decap actions are renamed as push_nsh
and pop_nsh to meet the action naming convention.

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-01-08 13:19:14 -08:00
Yifeng Sun
962044bf24 dpif: Fix memory leak
Valgrind complains in test 2322 (ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR):

31,584 (26,496 direct, 5,088 indirect) bytes in 48 blocks are definitely
lost in loss record 422 of 427
   by 0x5165F4: xmalloc (util.c:120)
   by 0x466194: dp_packet_new (dp-packet.c:138)
   by 0x466194: dp_packet_new_with_headroom (dp-packet.c:148)
   by 0x46621B: dp_packet_clone_data_with_headroom (dp-packet.c:210)
   by 0x46621B: dp_packet_clone_with_headroom (dp-packet.c:170)
   by 0x49DD46: dp_packet_batch_clone (dp-packet.h:789)
   by 0x49DD46: odp_execute_clone (odp-execute.c:616)
   by 0x49DD46: odp_execute_actions (odp-execute.c:795)
   by 0x471663: dpif_execute_with_help (dpif.c:1296)
   by 0x473795: dpif_operate (dpif.c:1411)
   by 0x473E20: dpif_execute.part.21 (dpif.c:1320)
   by 0x428D38: packet_execute (ofproto-dpif.c:4682)
   by 0x41EB51: ofproto_packet_out_finish (ofproto.c:3540)
   by 0x41EB51: handle_packet_out (ofproto.c:3581)
   by 0x4233DA: handle_openflow__ (ofproto.c:8044)
   by 0x4233DA: handle_openflow (ofproto.c:8219)
   by 0x4514AA: ofconn_run (connmgr.c:1437)
   by 0x4514AA: connmgr_run (connmgr.c:363)
   by 0x41C8B5: ofproto_run (ofproto.c:1813)
   by 0x40B103: bridge_run__ (bridge.c:2919)
   by 0x4103B3: bridge_run (bridge.c:2977)
   by 0x406F14: main (ovs-vswitchd.c:119)

The parameter dp_packet_batch is leaked when 'may_steal' is true.

When dpif_execute_helper_cb is passed with a true 'may_steal', it
is supposed to take the ownership of dp_packet_batch and release
it when done.
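
The ownership rule can be sketched with a simplified model. The batch type
and function names here are hypothetical stand-ins, not the OVS
dp_packet_batch API; the point is only that a callee given a true
'may_steal' must release the packets on every path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet batch; each entry is malloc()'d. */
struct batch_sketch {
    size_t count;
    void **packets;
};

/* Frees every packet in 'batch' and leaves it empty. */
static void
batch_free_packets(struct batch_sketch *batch)
{
    for (size_t i = 0; i < batch->count; i++) {
        free(batch->packets[i]);
        batch->packets[i] = NULL;
    }
    batch->count = 0;
}

/* Callback sketch: when 'may_steal' is true the callee owns the packets
 * and must release them before returning; otherwise the caller keeps
 * ownership and the batch is left untouched. */
static void
execute_cb_sketch(struct batch_sketch *batch, bool may_steal)
{
    assert(batch != NULL);

    /* ... act on the packets here ... */

    if (may_steal) {
        batch_free_packets(batch);
    }
}
```

A leak of the kind reported above appears when a path through such a
callback returns without the final release even though ownership was
transferred.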

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-29 13:59:15 -08:00