This is the third patch in the patch-set to support dynamic rebalancing
of offloaded flows.
The dynamic rebalancing functionality is implemented in this patch. The
ukeys that are not scheduled for deletion are obtained and passed as input
to the rebalancing routine. The rebalancing is done in the context of
revalidation leader thread, after all other revalidator threads are
done with gathering rebalancing data for flows.
For each netdev that is in OOR state, a list of flows - both offloaded
and non-offloaded (pending) - is obtained using the ukeys. For each netdev
that is in OOR state, the flows are grouped and sorted into offloaded and
pending flows. The offloaded flows are sorted in descending order of
pps-rate, while pending flows are sorted in ascending order of pps-rate.
The rebalancing is done in two phases. In the first phase, we try to
offload all pending flows and if that succeeds, the OOR state on the device
is cleared. If some (or none) of the pending flows could not be offloaded,
then we start replacing an offloaded flow that has a lower pps-rate than
a pending flow, until there are no more pending flows with a higher rate
than an offloaded flow. The flows that are replaced from the device are
added into kernel datapath.
A new OVS configuration parameter "offload-rebalance", is added to ovsdb.
The default value of this is "false". To enable this feature, set the
value of this parameter to "true", which provides packets-per-second
rate based policy to dynamically offload and un-offload flows.
Note: This option can be enabled only when 'hw-offload' policy is enabled.
It also requires 'tc-policy' to be set to 'skip_sw'; otherwise, flow
offload errors (specifically ENOSPC error this feature depends on) reported
by an offloaded device are supressed by TC-Flower kernel module.
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Sathya Perla <sathya.perla@broadcom.com>
Reviewed-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Commit 69c51582ff78 ("dpif-netlink: don't allocate per thread netlink
sockets") removed dpif-netlink support for multiple queues per port.
No remaining dpif provider supports multiple queues per port, so
remove infrastructure for the feature.
CC: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Added new types to the flow dump filter, and allowed multiple filter
types to be passed at once, as a comma separated list. The new types
added are:
* tc - specifies flows handled by the tc dp
* non-offloaded - specifies flows not offloaded to the HW
* all - specifies flows of all types
The type list is now fully parsed by the dpctl, and a new struct was
added to dpif which enables dpctl to define which types of dumps to
provide, rather than passing the type string and having dpif parse it.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
The original intent of the API appears to be that the underlying DPIF
implementaion would choose a local meter id. However, neither of the
existing datapath meter implementations (userspace or Linux) implemented
that; they expected a valid meter id to be passed in, otherwise they
returned an error. This commit follows the existing implementations and
makes the API somewhat cleaner.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Another dpif provider will soon add support for meters, so move
some of the common sanity checks up into the dpif layer so that each
provider doesn't need to re-implement them.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Commit ab15e70eb587 ("dpctl: Expand the flow dump type filter") had a
number of issues with style, build breakage, and failing unit tests.
The patch is being reverted so that they can addressed.
This reverts commit ab15e70eb5878b46f8f84da940ffc915b6d74cad.
CC: Gavi Teitz <gavi@mellanox.com>
CC: Simon Horman <simon.horman@netronome.com>
CC: Roi Dayan <roid@mellanox.com>
CC: Aaron Conole <aconole@redhat.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Added new types to the flow dump filter, and allowed multiple filter
types to be passed at once, as a comma separated list. The new types
added are:
* tc - specifies flows handled by the tc dp
* non-offloaded - specifies flows not offloaded to the HW
* all - specifies flows of all types
The type list is now fully parsed by the dpctl, and a new struct was
added to dpif which enables dpctl to define which types of dumps to
provide, rather than passing the type string and having dpif parse it.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
The port_add() function checks whether the port about to be added to the
dpif is already present and adds it only if it is not. This duplicates a
check also present (and necessary) in each dpif and races with it as well.
When a dpif has a large number of ports, the check can be expensive (it is
not efficiently implemented). It would be nice to made the check cheaper,
but it also seems reasonable to do as done in this patch and just let the
dpif report the duplication.
Reported-by: Haifeng Lin <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
It's always been OVS coding style to use spaces rather than tabs for
indentation, but some tabs have snuck in over time. This commit converts
them to spaces.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Several OVS structs contain embedded named unions, like this:
struct {
...
union {
...
} u;
};
C11 standardized a feature that many compilers already implemented
anyway, where an embedded union may be unnamed, like this:
struct {
...
union {
...
};
};
This is more convenient because it allows the programmer to omit "u."
in many places. OVS already used this feature in several places. This
commit embraces it in several others.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
When there is a flow rule which forwards a packet from geneve
port to another tunnel port, ex: gre, the tun_metadata carried
from the geneve port might affect the outgoing port. For example,
the datapath action from geneve port output to gre port (1) shows:
set(tunnel(tun_id=0x7b,dst=2.2.2.2,ttl=64,
geneve({class=0xffff,type=0,len=4,0x123}),flags(df|key))),1
Where the geneve(...) should not exist.
When using kernel's tunnel port, this triggers an error saying:
"Multiple metadata blocks provided", when there is a rule forwarding
the geneve packet to vxlan/erspan tunnel port. A userspace test case
using geneve and gre also demonstrates the issue.
The patch makes the tun_key_to_attr aware of the tunnel type. So only
the relevant output tunnel's options are set.
Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
ofp-util had been far too large and monolithic for a long time. This
commit breaks it up into units that make some logical sense. It also
moves the pieces of ofp-parse that were specific to each unit into the
relevant unit.
Most of this commit is just moving code around.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
This supports using the ct_clear action in the kernel datapath. To
preserve compatibility with current ct_clear behavior on old kernels, we
only pass this action down to the datapath if a probe reveals the
datapath actually supports it.
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
This patch changes OVS_KEY_ATTR_NSH
to nested attribute and adds three new NSH sub attribute keys:
OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
OVS_NSH_KEY_ATTR_MD1: for length-fixed MD type 1 context
OVS_NSH_KEY_ATTR_MD2: for length-variable MD type 2 metadata
Its intention is to align to NSH kernel implementation.
NSH match fields, set and PUSH_NSH action all use the below
nested attribute format:
OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD1
OVS_KEY_ATTR_NSH end
or
OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD2
OVS_KEY_ATTR_NSH end
In addition, NSH encap and decap actions are renamed as push_nsh
and pop_nsh to meet action naming convention.
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Valgrind complains in test 2322 (ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR):
31,584 (26,496 direct, 5,088 indirect) bytes in 48 blocks are definitely
lost in loss record 422 of 427
by 0x5165F4: xmalloc (util.c:120)
by 0x466194: dp_packet_new (dp-packet.c:138)
by 0x466194: dp_packet_new_with_headroom (dp-packet.c:148)
by 0x46621B: dp_packet_clone_data_with_headroom (dp-packet.c:210)
by 0x46621B: dp_packet_clone_with_headroom (dp-packet.c:170)
by 0x49DD46: dp_packet_batch_clone (dp-packet.h:789)
by 0x49DD46: odp_execute_clone (odp-execute.c:616)
by 0x49DD46: odp_execute_actions (odp-execute.c:795)
by 0x471663: dpif_execute_with_help (dpif.c:1296)
by 0x473795: dpif_operate (dpif.c:1411)
by 0x473E20: dpif_execute.part.21 (dpif.c:1320)
by 0x428D38: packet_execute (ofproto-dpif.c:4682)
by 0x41EB51: ofproto_packet_out_finish (ofproto.c:3540)
by 0x41EB51: handle_packet_out (ofproto.c:3581)
by 0x4233DA: handle_openflow__ (ofproto.c:8044)
by 0x4233DA: handle_openflow (ofproto.c:8219)
by 0x4514AA: ofconn_run (connmgr.c:1437)
by 0x4514AA: connmgr_run (connmgr.c:363)
by 0x41C8B5: ofproto_run (ofproto.c:1813)
by 0x40B103: bridge_run__ (bridge.c:2919)
by 0x4103B3: bridge_run (bridge.c:2977)
by 0x406F14: main (ovs-vswitchd.c:119)
the parameter dp_packet_batch is leaked when 'may_steal' is true.
When dpif_execute_helper_cb is passed with a true 'may_steal', it
is supposed to take the ownership of dp_packet_batch and release
it when done.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
a crash is seen in "netdev_ports_remove" when an interface is deleted and added
back in the system and when the interface is part of a bridge configuration.
e.g. steps:
create a tap0 interface using "ip tuntap add.."
add the tap0 interface to br0 using "ovs-vsctl add-port.."
delete the tap0 interface from system using "ip tuntap del.."
add the tap0 interface back in system using "ip tuntap add.."
(this changes the ifindex of the interface)
delete tap0 from br0 using "ovs-vsctl del-port.."
In the function "netdev_ports_insert", two hmap entries were created for
mapping "portnum -> netdev" and "ifindex -> portnum".
When the interface is deleted from the system, the "netdev_ports_remove"
function is not getting called and the old ifindex entry is not getting
cleaned up from the "ifindex_to_port" hmap.
As part of the fix, added function "dpif_port_remove" which will call
"netdev_ports_remove" in the path where the interface deletion from the system
is detected.
Also, in "netdev_ports_remove", added the code where the "ifindex_to_port_data"
(ifindex -> portnum map node) is getting freed when the ifindex is not
available any more. (as the interface is already deleted.)
VMware-BZ: #1975788
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Poll-loop is the core to implement main loop. It should be available in
libopenvswitch.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Executing dpctl commands from userspace also calls to
dpif_open()/dpif_close() but not really creating another dpif
but using a clone.
As for netdev_ports map is global we avoid adding duplicate entries
but also need to make sure we are not removing needed entries.
With this commit we make sure only the last dpif close should clean
the netdev_ports map.
Fixes: 6595cb95a4a9 ("dpif: Clean up netdev_ports map on dpif_close().")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Commit 32b77c316d9982("dpif: Save added ports in a port map.")
introduced tracking of all dpif ports by taking a reference on each
available netdev when the dpif is opened, but it failed to clear out and
release references to these netdevs when the dpif is closed.
One of the problems introduced by this was that upon clean exit of
ovs-vswitchd via "ovs-appctl exit --cleanup", the "ovs-netdev" device
was not deleted. This which could cause problems in subsequent start up.
Commit 5119e258da92 ("dpif: Fix cleanup of userspace datapath.") fixed
this particular problem by not adding such devices to the netdev_ports
map, but the referencing/unreferencing upon dpif_open()/dpif_close() is
still not balanced.
Balance the referencing of netdevs by clearing these during dpif_close().
Fixes: 32b77c316d9982("dpif: Save added ports in a port map.")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
This commit adds translation and netdev datapath support for generic
encap and decap actions for the NSH MD1 header. The generic encap and
decap actions are mapped to specific encap_nsh and decap_nsh actions
in the datapath.
The translation follows that general scheme that decap() of an NSH
packet triggers recirculation after decapsulation, while encap(nsh)
just modifies struct flow and sets the ctx->pending_encap flag to
generate the encap_nsh action at the next commit to be able to include
subsequent set_field actions for NSH headers.
Support for the flexible MD2 format using TLV properties is foreseen
in encap(nsh), but not yet fully implemented.
The CLI syntax for encap of NSH is
encap(nsh(md_type=1))
encap(nsh(md_type=2[,tlv(<tlv_class>,<tlv_type>,<hex_string>),...]))
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
It's basically what is being passed today and passing a specific
type adds a compiler type check.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Currently it is using the datapath name/type but what has actually
failed was the netdev.
Fix it by using netdev name/type instead and also log why it failed.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Hardware offload introduced extra tracking of netdev ports. This
included ovs-netdev, which is really for internal infra usage for
the userpace datapath. This breaks cleanup of the userspace
datapath. One effect is that all userspace datapath system tests
fail except for the first one run. There is no need to do this
extra tracking of tap devices for the hardware offload effort.
Hence, the approach taken is to filter both internal device
and tap device types for hardware offload. Internal devices are
'internal' from the kernel datapath perspective and tap devices
are 'internal' from the userpace datapath perspective.
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Until now, ODP output only showed port names for in_port matches. This
commit shows them in other places port numbers appear.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
Tested-by: Jan Scheurich <jan.scheurich@ericsson.com>
To be reused by other modules.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Usage:
# to dump all datapath flows (default):
ovs-dpctl dump-flows
# to dump only flows that in kernel datapath:
ovs-dpctl dump-flows type=ovs
# to dump only flows that are offloaded:
ovs-dpctl dump-flows type=offloaded
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
To use netdev flow offloading api, dpifs needs to iterate over
added ports. This addition inserts the added dpif ports in a hash map,
The map will also be used to translate dpif ports to netdevs.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Ports have a new layer3 attribute if they send/receive L3 packets.
The packet_type included in structs dp_packet and flow is considered in
ofproto-dpif. The classical L2 match fields (dl_src, dl_dst, dl_type, and
vlan_tci, vlan_vid, vlan_pcp) now have Ethernet as pre-requisite.
A dummy ethernet header is pushed to L3 packets received from L3 ports
before the the pipeline processing starts. The ethernet header is popped
before sending a packet to a L3 port.
For datapath ports that can receive L2 or L3 packets, the packet_type
becomes part of the flow key for datapath flows and is handled
appropriately in dpif-netdev.
In the 'else' branch in flow_put_on_pmd() function, the additional check
flow_equal(&match.flow, &netdev_flow->flow) was removed, as a) the dpcls
lookup is sufficient to uniquely identify a flow and b) it caused false
negatives because the flow in netdev->flow may not properly masked.
In dpif_netdev_flow_put() we now use the same method for constructing the
netdev_flow_key as the one used when adding the flow to the dplcs to make sure
these always match. The function netdev_flow_key_from_flow() used so far was
not only inefficient but sometimes caused mismatches and subsequent flow
update failures.
The kernel datapath does not support the packet_type match field.
Instead it encodes the packet type implictly by the presence or absence of
the Ethernet attribute in the flow key and mask.
This patch filters the PACKET_TYPE attribute out of netlink flow key and
mask to be sent to the kernel datapath.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as prepration for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.
The packet_type is a big-endian 32 bit integer with the encoding as
specified in OpenFlow verion 1.5.
The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:
enum ofp_header_type_namespaces {
OFPHTN_ONF = 0, /* ONF namespace. */
OFPHTN_ETHERTYPE = 1, /* ns_type is an Ethertype. */
OFPHTN_IP_PROTO = 2, /* ns_type is a IP protocol number. */
OFPHTN_UDP_TCP_PORT = 3, /* ns_type is a TCP or UDP port. */
OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */
};
The lower 16 bits specify the actual type in the context of the name space.
Only name spaces 0 and 1 will be supported for now.
For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.
In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet
whith the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.
The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the the dl_type field of struct flow.
This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.
New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Debug log output for execute operations is missing the packet
metadata, which can be instrumental in tracing what the datapath
should be executing. No reason to not have the metadata on the debug
output, so add it there.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Add logics to detect the max level of nesting allowed by the
sample action implemented in the datapath.
Future patch allows xlate code to generate different odp actions
based on this information.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Allow actions to be part of the probe. No functional changes.
Future patch will make use this new API.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Translate OpenFlow METER instructions to datapath meter actions.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Andy Zhou <azhou@ovn.org>
Add DPIF-level infrastructure for meters. Allow meter_set to modify
the meter configuration (e.g. set the burst size if unspecified).
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Andy Zhou <azhou@ovn.org>
Upstream commit:
commit 91820da6ae85904d95ed53bf3a83f9ec44a6b80a
Author: Jiri Benc <jbenc@redhat.com>
Date: Thu Nov 10 16:28:23 2016 +0100
openvswitch: add Ethernet push and pop actions
It's not allowed to push Ethernet header in front of another Ethernet
header.
It's not allowed to pop Ethernet header if there's a vlan tag. This
preserves the invariant that L3 packet never has a vlan tag.
Based on previous versions by Lorand Jakab and Simon Horman.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Committer notes]
Fix build with the upstream commit by folding in the required switch
case enum handlers.
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Currently we parse the 'other_config' column in Openvswitch table in
bridge.c. We extract the values (just 'pmd-cpu-mask' for now) and we
pass them down to the datapath, via different layers.
If we want to pass other values to dpif-netdev.c (like we recently
discussed) we would have to touch ofproto.c, ofproto-dpif.c and dpif.c.
This patch sends the entire other_config column to dpif-netdev, so that
dpif-netdev can extract the values it's interested in.
No functional change.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
One common use case of 'struct dp_packet_batch' is to process all
packets in the batch in order. Add an iterator for this use case
to simplify the logic of calling sites,
Another common use case is to drop packets in the batch, by reading
all packets, but writing back pointers of fewer packets. Add macros
to support this use case.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
Add support for userspace datapath clone action. The clone action
provides an action envelope to enclose an action list.
For example, with actions A, B, C and D, and an action list:
A, clone(B, C), D
The clone action will ensure that:
- D will see the same packet, and any meta states, such as flow, as
action B.
- D will be executed regardless whether B, or C drops a packet. They
can only drop a clone.
- When B drops a packet, clone will skip all remaining actions
within the clone envelope. This feature is useful when we add
meter action later: The meter action can be implemented as a
simple action without its own envolop (unlike the sample action).
When necessary, the flow translation layer can enclose a meter action
in clone.
The clone action is very similar with the OpenFlow clone action.
This is by design to simplify vswitchd flow translation logic.
Without datapath clone, vswitchd simulate the effect by inserting
datapath actions to "undo" clone actions. The above flow will be
translated into A, B, C, -C, -B, D.
However, there are two issues:
- The resulting datapath action list may be longer without using
clone.
- Some actions, such as NAT may not be possible to reverse.
This patch implements clone() simply with packet copy. The performance
can be improved with later patches, for example, to delay or avoid
packet copy if possible. It seems datapath should have enough context
to carry out such optimization without the userspace context.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
Currently dpctl depends on ovs-numa module to delete and create flows on
different pmd threads for pmd devices.
The next commits will move away the pmd threads state from ovs-numa to
dpif-netdev, so the ovs-numa interface will not be supported.
Also, the assignment between ports and thread is an implementation
detail of dpif-netdev, dpctl shouldn't know anything about it.
This commit changes the dpif_flow_put() and dpif_flow_del() calls to
iterate over all the pmd threads, if pmd_id is PMD_ID_NULL.
A simple test is added.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
The may_steal flag is now used, Remove OVS_UNUSED.
Since dp_packet_delete() handles the NULL pointer properly, we can
drop a few tracking variables, and make the code easier to follow.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
bridge_delete_or_reconfigure() deletes every interface that's not dumped
by OFPROTO_PORT_FOR_EACH(). ofproto_dpif.c:port_dump_next(), used by
OFPROTO_PORT_FOR_EACH, checks if the ofport is in the datapath by
calling port_query_by_name(). If port_query_by_name() returns an error,
the dump is interrupted. If port_query_by_name() returns ENODEV, the
device doesn't exist and the dump can continue.
port_query_by_name() for the userspace datapath returns ENOENT instead
of ENODEV. This is expected by dpif_port_query_by_name(), but it's not
handled correctly by port_dump_next().
dpif-netdev handles reconfiguration errors for an interface by deleting
it from the datapath, so it's possible that a device is missing. When this
happens we must make sure that port_dump_next() continues to dump other
devices, otherwise they will be deleted and the two layers will have an
inconsistent view.
This commit fixes the problem by returning ENODEV from the userspace
datapath if the port doesn't exist, and by documenting this clearly in
the dpif interfaces.
The problem was found while developing new code.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
This commit adds functionality to pass value of 'other_config' column
of 'Interface' table to datapath.
This may be used to pass not directly connected with netdev options and
configure behaviour of the datapath for different ports.
For example: pinning of rx queues to polling threads in dpif-netdev.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
To easily allow both in- and out-of-tree building of the Python
wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to
include/openvswitch. This also requires moving lib/{hmap,shash}.h.
Both hmap.h and shash.h were #include-ing "util.h" even though the
headers themselves did not use anything from there, but rather from
include/openvswitch/util.h. Fixing that required including util.h
in several C files mostly due to OVS_NOT_REACHED and things like
xmalloc.
Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The patch adds a new action to support packet truncation. The new action
is formatted as 'output(port=n,max_len=m)', as output to port n, with
packet size being MIN(original_size, m).
One use case is to enable port mirroring to send smaller packets to the
destination port so that only useful packet information is mirrored/copied,
saving some performance overhead of copying entire packet payload. Example
use case is below as well as shown in the testcases:
- Output to port 1 with max_len 100 bytes.
- The output packet size on port 1 will be MIN(original_packet_size, 100).
# ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)'
- The scope of max_len is limited to output action itself. The following
packet size of output:1 and output:2 will be intact.
# ovs-ofctl add-flow br0 \
'actions=output(port=1,max_len=100),output:1,output:2'
- The Datapath actions shows:
# Datapath actions: trunc(100),1,1,2
Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
All the callers of the function already have a copy of the extracted
flow in their stack (or a few frames before).
This is useful for different resons:
* It forces the callers to also call flow_extract() on the packet, which
is necessary to initialize the l2,l3,l4 pointers.
* It will be used in the userspace datapath to generate the RSS hash by
a following commit
* It can be used by the userspace connection tracker to avoid extracting
the l3 type again.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Remove "attempted to unregister a datapath provider that is not registered"
warning. It's normal for --enabled-dummy=system with userland-only build.
ovn-controller-vtep.at tests use the flag and fail on the extra warning.
Alternatively, we can make the tests ignore this specific warning.
But currently it doesn't make much sense as dp_unregister_provider
is only used for --enabled-dummy.
Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
DPDK datapath operate on batch of packets. To pass the batch of
packets around we use packets array and count. Next patch needs
to associate meta-data with each batch of packets. So Introducing
a batch structure to make handling the metadata easier.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Commit f2d105b5 (ofproto-dpif-xlate: xlate ct_{mark, label} correctly.)
introduced the ovs_u128_and() function. It directly takes ovs_u128
values as arguments instead of pointers to them. As this is a bit more
direct way to deal with 128-bit values, modify the other utility
functions to do the same.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>