2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

3 Commits

Author SHA1 Message Date
Eli Britstein
635cb95e0c dpif-netdev: Keep orig_in_port as a field of the flow.
A flow may be modified after its initial offload failed. In this case,
according to [1], the modification is handled as a flow add.
For a vport flow "add", the orig_in_port should be provided.
Keep that field in the flow struct, so it can be provided in the flow
modification use case.

[1] 0d25621e4d9f ("dpif-netdev: Fix flow modification after failure.")

Fixes: b5e6f6f6bfbe ("dpif-netdev: Provide orig_in_port in metadata for tunneled packets.")
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-22 22:12:24 +01:00
Ilya Maximets
e7e9973b80 dpif-netdev: Forwarding optimization for flows with a simple match.
There are cases where users might want simple forwarding or drop rules
for all packets received from a specific port, e.g ::

  "in_port=1,actions=2"
  "in_port=2,actions=IN_PORT"
  "in_port=3,vlan_tci=0x1234/0x1fff,actions=drop"
  "in_port=4,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:3"

There are also cases where complex OpenFlow rules can be simplified
down to datapath flows with very simple match criteria.

In theory, for very simple forwarding, OVS doesn't need to parse
packets at all in order to follow these rules.  "Simple match" lookup
optimization is intended to speed up packet forwarding in these cases.

Design:

Due to various implementation constraints userspace datapath has
following flow fields always in exact match (i.e. it's required to
match at least these fields of a packet even if the OF rule doesn't
need that):

  - recirc_id
  - in_port
  - packet_type
  - dl_type
  - vlan_tci (CFI + VID) - in most cases
  - nw_frag - for ip packets

Not all of these fields are related to packet itself.  We already
know the current 'recirc_id' and the 'in_port' before starting the
packet processing.  It also seems safe to assume that we're working
with Ethernet packets.  So, for the simple OF rule we need to match
only on 'dl_type', 'vlan_tci' and 'nw_frag'.

'in_port', 'dl_type', 'nw_frag' and 13 bits of 'vlan_tci' can be
combined in a single 64bit integer (mark) that can be used as a
hash in hash map.  We are using only VID and CFI form the 'vlan_tci',
flows that need to match on PCP will not qualify for the optimization.
Workaround for matching on non-existence of vlan updated to match on
CFI and VID only in order to qualify for the optimization.  CFI is
always set by OVS if vlan is present in a packet, so there is no need
to match on PCP in this case.  'nw_frag' takes 2 bits of PCP inside
the simple match mark.

New per-PMD flow table 'simple_match_table' introduced to store
simple match flows only.  'dp_netdev_flow_add' adds flow to the
usual 'flow_table' and to the 'simple_match_table' if the flow
meets following constraints:

  - 'recirc_id' in flow match is 0.
  - 'packet_type' in flow match is Ethernet.
  - Flow wildcards contains only minimal set of non-wildcarded fields
    (listed above).

If the number of flows for current 'in_port' in a regular 'flow_table'
equals number of flows for current 'in_port' in a 'simple_match_table',
we may use simple match optimization, because all the flows we have
are simple match flows.  This means that we only need to parse
'dl_type', 'vlan_tci' and 'nw_frag' to perform packet matching.
Now we make the unique flow mark from the 'in_port', 'dl_type',
'nw_frag' and 'vlan_tci' and looking for it in the 'simple_match_table'.
On successful lookup we don't need to run full 'miniflow_extract()'.

Unsuccessful lookup technically means that we have no suitable flow
in the datapath and upcall will be required.  So, in this case EMC and
SMC lookups are disabled.  We may optimize this path in the future by
bypassing the dpcls lookup too.

Performance improvement of this solution on a 'simple match' flows
should be comparable with partial HW offloading, because it parses same
packet fields and uses similar flow lookup scheme.
However, unlike partial HW offloading, it works for all port types
including virtual ones.

Performance results when compared to EMC:

Test setup:

             virtio-user   OVS    virtio-user
  Testpmd1  ------------>  pmd1  ------------>  Testpmd2
  (txonly)       x<------  pmd2  <------------ (mac swap)

Single stream of 64byte packets.  Actions:
  in_port=vhost0,actions=vhost1
  in_port=vhost1,actions=vhost0

Stats collected from pmd1 and pmd2, so there are 2 scenarios:
Virt-to-Virt   :     Testpmd1 ------> pmd1 ------> Testpmd2.
Virt-to-NoCopy :     Testpmd2 ------> pmd2 --->x   Testpmd1.
Here the packet sent from pmd2 to Testpmd1 is always dropped, because
the virtqueue is full since Testpmd1 is in txonly mode and doesn't
receive any packets.  This should be closer to the performance of a
VM-to-Phy scenario.

Test performed on machine with Intel Xeon CPU E5-2690 v4 @ 2.60GHz.
Table below represents improvement in throughput when compared to EMC.

 +----------------+------------------------+------------------------+
 |                |    Default (-g -O2)    | "-Ofast -march=native" |
 |   Scenario     +------------+-----------+------------+-----------+
 |                |     GCC    |   Clang   |     GCC    |   Clang   |
 +----------------+------------+-----------+------------+-----------+
 | Virt-to-Virt   |    +18.9%  |   +25.5%  |    +10.8%  |   +16.7%  |
 | Virt-to-NoCopy |    +24.3%  |   +33.7%  |    +14.9%  |   +22.0%  |
 +----------------+------------+-----------+------------+-----------+

For Phy-to-Phy case performance improvement should be even higher, but
it's not the main use-case for this functionality.  Performance
difference for the non-simple flows is within a margin of error.

Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-07 20:32:20 +01:00
Harry van Haaren
5930dfeeb2 dpif-netdev: Refactor to multiple header files.
Split the very large file dpif-netdev.c and the datastructures
it contains into multiple header files. Each header file is
responsible for the datastructures of that component.

This logical split allows better reuse and modularity of the code,
and reduces the very large file dpif-netdev.c to be more managable.

Due to dependencies between components, it is not possible to
move component in smaller granularities than this patch.

To explain the dependencies better, eg:

DPCLS has no deps (from dpif-netdev.c file)
FLOW depends on DPCLS (struct dpcls_rule)
DFC depends on DPCLS (netdev_flow_key) and FLOW (netdev_flow_key)
THREAD depends on DFC (struct dfc_cache)

DFC_PROC depends on THREAD (struct pmd_thread)

DPCLS lookup.h/c require only DPCLS
DPCLS implementations require only dpif-netdev-lookup.h.
- This change was made in 2.12 release with function pointers
- This commit only refactors the name to "private-dpcls.h"

netdev_flow_key_equal_mf() is renamed to emc_flow_key_equal_mf().

Rename functions specific to dpcls from netdev_* namespace to the
dpcls_* namespace, as they are only used by dpcls code.

'inline' is added to the dp_netdev_flow_hash() when it is moved
definition to fix a compiler error.

One valid checkpatch issue with the use of the
EMC_FOR_EACH_POS_WITH_HASH() macro was fixed.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Co-authored-by: Cian Ferriter <cian.ferriter@intel.com>
Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2021-07-09 17:12:45 +01:00