This patch adds a signature match cache (SMC) after the exact match
cache (EMC). The difference between SMC and EMC is that SMC stores only
a signature of a flow, so it is much more memory efficient. With the
same memory space, EMC can store 8k flows while SMC can store 1M
flows. It is generally beneficial to turn on SMC but turn off EMC
when the traffic flow count is much larger than the EMC size.
SMC maps a signature to a dp_netdev_flow index in
flow_table. Thus, we add two new APIs in cmap: one to look up a key by
index and one to look up an index by key.
For now, SMC is an experimental feature and is turned off by
default. One can turn it on using ovsdb options.
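As a rough illustration of the idea above (not the dpif-netdev
implementation), the sketch below caches only a short signature and an
index into a flow table, then confirms the hit against the full flow key.
The 16-bit signature width, the one-entry buckets, and every class and
function name here are assumptions made for this example.

```python
class SignatureMatchCache:
    """Toy cache: stores (signature, flow_index) instead of a full flow key."""

    def __init__(self, n_buckets=1024):
        self.n_buckets = n_buckets
        # Each bucket holds one (signature, flow_index) pair or None.
        self.buckets = [None] * n_buckets

    def _locate(self, flow_hash):
        # Assumption: low bits choose the bucket, bits 16..31 are the signature.
        bucket = flow_hash % self.n_buckets
        signature = (flow_hash >> 16) & 0xFFFF
        return bucket, signature

    def insert(self, flow_hash, flow_index):
        bucket, signature = self._locate(flow_hash)
        self.buckets[bucket] = (signature, flow_index)

    def lookup(self, flow_hash, key, flow_table):
        bucket, signature = self._locate(flow_hash)
        entry = self.buckets[bucket]
        if entry is None or entry[0] != signature:
            return None
        # Signatures can collide, so verify against the full key.
        flow = flow_table[entry[1]]
        return flow if flow["key"] == key else None


flow_table = [{"key": ("10.0.0.1", 80), "actions": "output:1"}]
smc = SignatureMatchCache()
smc.insert(0x12345678, 0)
hit = smc.lookup(0x12345678, ("10.0.0.1", 80), flow_table)
```

Because each entry is only a signature plus an index, many more entries
fit in the same memory than with a cache that stores full flow keys.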
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Co-authored-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Support was added in commit 9e638f223f ("ofproto: Support action
upcall meters").
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
This commit re-introduces the concept of shared mempools as the default
memory model for DPDK devices. Per port mempools are still available but
must be enabled explicitly by a user.
OVS previously used a shared mempool model for ports with the same MTU
and socket configuration. This was replaced by a per port mempool model
to address issues flagged by users such as:
https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/042560.html
However, the per port model may require more memory to support the
same number of ports and the same configuration as the shared mempool
model.
This is considered a blocking factor for current deployments of OVS
when upgrading to future OVS releases as a user may have to redimension
memory for the same deployment configuration. This may not be possible for
users.
This commit resolves the issue by re-introducing shared mempools as
the default memory behaviour in OVS DPDK but also refactors the memory
configuration code to allow for per port mempools.
This patch adds a new global config option, per-port-memory, that
controls the enablement of per port mempools for DPDK devices.
ovs-vsctl set Open_vSwitch . other_config:per-port-memory=true
This value defaults to false; to enable per port memory support,
this field should be set to true when setting other global parameters
on init (such as "dpdk-socket-mem", for example). Changing the value at
runtime is not supported, and requires restarting the vswitch
daemon.
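The difference between the two memory models can be sketched as a
grouping rule. This is only an illustration of the description above:
the function name and the (name, MTU, socket) tuple layout are
assumptions for this example, not code from netdev-dpdk.

```python
def assign_mempools(ports, per_port_memory=False):
    """Group ports by the mempool they would use.

    ports: iterable of (name, mtu, socket_id) tuples (hypothetical layout).
    Shared model: ports with the same MTU and socket share one mempool.
    Per port model: every port gets its own mempool.
    """
    pools = {}
    for name, mtu, socket_id in ports:
        key = (name,) if per_port_memory else (mtu, socket_id)
        pools.setdefault(key, []).append(name)
    return pools


ports = [("dpdk0", 1500, 0), ("dpdk1", 1500, 0), ("dpdk2", 9000, 0)]
shared = assign_mempools(ports)
per_port = assign_mempools(ports, per_port_memory=True)
```

With these three ports, the shared model needs two mempools while the
per port model needs three, which is why the per port model can increase
memory requirements for the same deployment.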
The mempool sweep functionality is also replaced with the
sweep functionality from OVS 2.9 found in commits
c77f692 (netdev-dpdk: Free mempool only when no in-use mbufs.)
a7fb0a4 (netdev-dpdk: Add mempool reuse/free debug.)
A new document to discuss the specifics of the memory models and example
memory requirement calculations is also added.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Tiago Lam <tiago.lam@intel.com>
Tested-by: Tiago Lam <tiago.lam@intel.com>
The design of the compound index feature in the C OVSDB IDL was unusual.
Indexes were generally referenced only by name rather than by pointer, and
could be obtained only from the top-level ovsdb_idl object. Iterating or
otherwise searching an index required explicitly creating a special
ovsdb_idl_cursor object, which seemed somewhat heavy-weight given
that it required a string lookup in a table of indexes.
This commit redesigns the compound index interface. It discards the use of
names for indexes, instead having clients pass in a pointer to the index
object itself. It simplifies how indexes are created, gets rid of the need
for explicit cursor objects, and updates all of the users to the new
interface.
The underlying reason for this commit is to make it easier in
ovn-controller to keep track of the dependencies for a given function, by
making the indexes explicit arguments to any function that needs to use
them.
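The dependency-tracking benefit can be shown with a small sketch. The
classes and functions below are hypothetical stand-ins written for this
example; they are not the C IDL API.

```python
class Index:
    """Hypothetical stand-in for an IDL index over one column."""

    def __init__(self, column):
        self.column = column
        self.rows = {}

    def insert(self, row):
        self.rows[row[self.column]] = row

    def find(self, value):
        return self.rows.get(value)


class Idl:
    """Hypothetical top-level object holding indexes by name (old style)."""

    def __init__(self):
        self.indexes = {}


def find_port_old(idl, name):
    # Old style: the index is fetched by string name from the top-level
    # object, so the signature does not show which index is needed.
    return idl.indexes["port-by-name"].find(name)


def find_port_new(port_by_name, name):
    # New style: the index object is an explicit argument, so the
    # dependency is visible to every caller.
    return port_by_name.find(name)


port_by_name = Index("name")
port_by_name.insert({"name": "eth0", "tag": 10})
idl = Idl()
idl.indexes["port-by-name"] = port_by_name
```

Both functions return the same row; only the second makes clear from its
parameters which index it relies on.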
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <hzhou8@ebay.com>
It is possible to set LSC detection mode to polling or interrupt mode
for DPDK interfaces. The default is polling mode. To set interrupt mode,
option dpdk-lsc-interrupt has to be set to true.
For a detailed description and usage, see the DPDK install documentation.
Signed-off-by: Robert Mulik <robert.mulik@ericsson.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Modify travis linux build script to use the latest
DPDK stable release 17.11.2. Update docs for latest
DPDK stable releases.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Misc. fixes to the Proof of Concepts section to help render the
information a bit nicer.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Ansis Atteka <aatteka@ovn.org>
A new OVS-DPDK testsuite, which can be launched via `make check-dpdk`,
tests OVS using a DPDK datapath. The testsuite already contains these
initial tests:
1. EAL init
2. Add standard DPDK PHY port
3. Add vhost-user-client port
Signed-off-by: Marcin Rybka <marcinx.rybka@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Two mistakes here:
- Automatic assignment of Rx queues to PMD threads has always existed -
it was simply switched from round-robin allocation to
utilization-based allocation
- The above, along with the 'pmd-rxq-rebalance' command, was added in
OVS 2.9.0 - not OVS 2.8.0 - while the 'pmd-rxq-show' command was added
in OVS 2.6.0 and modified in OVS 2.9.0
Correct both of these and modify the NEWS entry for this to clarify
things a little (it took a bit of git spelunking and bothering people on
IRC to figure out).
Signed-off-by: Stephen Finucane <stephen@that.guru>
Cc: Kevin Traynor <ktraynor@redhat.com>
Cc: Ian Stokes <ian.stokes@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
We include references from the physical and vhost-user interface guides.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Yet another section that's far too detailed for someone getting started
with DPDK in OVS. Split it out.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
This details configuration steps that apply to the entire bridge, rather
than individual ports.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Again, this stuff is too detailed for a high-level howto.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
These are separate things from physical, ring and vhost-user interfaces
and deserve their own documents. A couple of small typos are fixed along
the way.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
The "hotplugging", "flow control", and "Rx checksum offload" sections
only apply to 'dpdk' ports and are too detailed to include in a
high-level howto. Move them, reworking some aspects of this in the
process.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
This continues the breakup of the huge DPDK "howto" into smaller
components. There are a couple of related changes included, such as
using "Rx queue" instead of "rxq" and noting how Tx queues cannot be
configured.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
These ports are used to allow ingress/egress from the host and are
therefore _reasonably_ important. However, there is no clear overview of
what these ports actually are or why things are done the way they are.
Start closing this gap by providing a standalone example of using these
ports along with a little more detailed overview of the binding process.
There is additional cleanup to be done for the DPDK howto, but that will
be done separately.
We enable the TODO directive so we can actually start calling out some
TODOs.
Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
When explaining how to add vhost-user ports to a guest using
libvirt, the following piece of configuration is used:
<disk type='dir' device='disk'>
<driver name='qemu' type='fat'/>
<source dir='/usr/src/dpdk-stable-17.11.1'/>
<target dev='vdb' bus='virtio'/>
<readonly/>
</disk>
This is used to facilitate sharing of a DPDK directory between the host
and the guest. However, for this to work SELinux also needs to be
configured (or disabled). Furthermore, if one is using Ubuntu, libvirtd
would need to be set to complain-only mode in AppArmor. Instead, in [1] it
is advised to use wget to get the DPDK sources over the internet, which
avoids this differentiation. Thus, we drop this piece of configuration
here as well and keep the example configuration as simple as possible.
This has been verified on both a Fedora 27 image and a Ubuntu 16.04 LTS
image.
[1] http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/#dpdk-in-the-guest
Signed-off-by: Tiago Lam <tiago.lam@intel.com>
Acked-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
When explaining how to add vhost-user ports to a guest using
libvirt, point to the qemu-system-x86_64 binary by default, instead of
using qemu-kvm. The latter has been made obsolete and dropped from a
number of distributions (although it is still available on Fedora).
This has been verified on both a Fedora 27 image and a Ubuntu 16.04 LTS
image.
Signed-off-by: Tiago Lam <tiago.lam@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Create a document to describe how it works and its known
limitations, and update NEWS accordingly.
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Modify docs and travis linux build script to use the DPDK 17.11.1
release branch to benefit from most recent bug fixes.
There are no new features introduced in the DPDK release, only back
ported bug fixes. For completeness these bug fixes have been documented
under the 17.11.1 section in the link below.
http://dpdk.org/doc/guides-17.11/rel_notes/release_17_11.html#id1
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
The docs describe IOMMU support for dpdkvhostuserclient ports,
but it is not mentioned in the section about dpdkvhostuser
ports. Add an explicit note to say IOMMU is not supported for
dpdkvhostuser ports.
CC: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Bundles are implemented for both OF1.3 and OF1.4+, so no need to keep it
in the list. Packet type aware pipeline is now implemented too.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Also implements the backports of relevant errors to OF1.3 as specified in
ONF extension pack 1 for OF1.3.
ONF-JIRA: EXT-237
ONF-JIRA: EXT-230
ONF-JIRA: EXT-264
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
ONF extension pack 1 for OpenFlow 1.3 defines how to implement the OpenFlow
1.4 "role status" message in OpenFlow 1.3. This commit implements that
feature.
ONF-JIRA: EXT-191
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
ofp-util had been far too large and monolithic for a long time. This
commit breaks it up into units that make some logical sense. It also
moves the pieces of ofp-parse that were specific to each unit into the
relevant unit.
Most of this commit is just moving code around.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
This patch sets up foundations for Proof of Concepts that
simply materialize documentation into Ansible instructions
executed in virtualized Vagrant environment.
This Proof of Concept makes it easy to build:
1. *.deb packages on Ubuntu 16.04; AND
2. *.rpm packages on CentOS 7.4.
It also sets up DEB and RPM repositories over HTTP that can
be used to pull these openvswitch packages with apt-get
or yum from another host.
This particular Proof of Concept is intended to address
the following use cases:
1. for new OVS users to see how debian and rpm packages are
built;
2. for developers to easily check for packaging build
regressions;
3. for developers to easily share their sandbox builds
into QE setups (as opposed to manually copying binaries);
4. for developers to add other Proof of Concepts
that may require full end-to-end integration
with other third-party projects (e.g. DPI, libvirt, IPsec)
and need Open vSwitch packages.
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ansis Atteka <aatteka@ovn.org>
Zero copy is disabled by default. To enable it, set the 'dq-zero-copy'
option to 'true' when configuring the Interface:
ovs-vsctl set Interface dpdkvhostuserclient0 \
    options:vhost-server-path=/tmp/dpdkvhostuserclient0 \
    options:dq-zero-copy=true
When packets from a vHost device with zero copy enabled are destined for
a single 'dpdk' port, the number of tx descriptors on that 'dpdk' port
must be set to a smaller value. 128 is recommended. This can be achieved
like so:
ovs-vsctl set Interface dpdkport options:n_txq_desc=128
Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send
to should not exceed 128. Due to this requirement, the feature is
considered 'experimental'.
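The descriptor constraint in the note above can be expressed as a simple
check. This is a sketch only: the helper name and the dictionary of
per-port values are assumptions for this example, and nothing here
queries OVS.

```python
ZERO_COPY_TXQ_DESC_LIMIT = 128  # limit stated in the note above


def txq_desc_within_limit(n_txq_desc_by_port, limit=ZERO_COPY_TXQ_DESC_LIMIT):
    """Return True if the summed tx descriptors of all 'dpdk' ports that
    the VM sends to stay within the limit."""
    return sum(n_txq_desc_by_port.values()) <= limit


single_port_ok = txq_desc_within_limit({"dpdkport": 128})
two_ports_too_many = txq_desc_within_limit({"dpdk0": 128, "dpdk1": 128})
two_ports_ok = txq_desc_within_limit({"dpdk0": 64, "dpdk1": 64})
```

A VM that sends to two 'dpdk' ports would therefore need smaller
n_txq_desc values on each, for example 64 and 64.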
Testing of the patch showed a ~8% improvement when switching 512B
packets between vHost devices on different VMs on the same host when
zero copy was enabled on the transmitting device.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
With this patch, the kernel datapath testsuite can be run under valgrind by
using the "check-kernel-valgrind" target, and the results can be found under
the directory "tests/system-kmod-testsuite.dir/".
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
DPDK v17.11 introduces support for the vHost IOMMU feature.
This is a security feature, which restricts the vhost memory
that a virtio device may access.
This feature also enables the vhost REPLY_ACK protocol, the
implementation of which is known to work in newer versions of
QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
v2.9.0, inclusive). As such, the feature is disabled by default
(and should remain so for the aforementioned older QEMU
versions). Starting with QEMU v2.9.1, vhost-iommu-support can
safely be enabled, even without having an IOMMU device, with
no performance penalty.
This patch adds a new global config option, vhost-iommu-support,
that controls enablement of the vhost IOMMU feature:
ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
This value defaults to false; to enable IOMMU support, this field
should be set to true when setting other global parameters on init
(such as "dpdk-socket-mem", for example). Changing the value at
runtime is not supported, and requires restarting the vswitch daemon.
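The QEMU version guidance above can be captured in a small helper. This
is a sketch under the version ranges stated in this message; the function
names are hypothetical, and versions older than v2.7.0 are deliberately
not classified because the text does not cover them.

```python
def parse_version(version):
    """Turn 'v2.10.0' or '2.10.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def reply_ack_known_buggy(qemu_version):
    # Stated range: v2.7.0 through v2.9.0, inclusive.
    return (2, 7, 0) <= parse_version(qemu_version) <= (2, 9, 0)


def iommu_support_known_safe(qemu_version):
    # Stated as safe starting with v2.9.1.
    return parse_version(qemu_version) >= (2, 9, 1)


buggy = [v for v in ("2.8.0", "2.9.0", "2.9.1", "2.10.0")
         if reply_ack_known_buggy(v)]
```

Comparing integer tuples rather than strings matters here: as strings,
"2.10.0" would sort before "2.9.0".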
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
This is adapted from a talk I gave at OpenStack Summit Sydney on Nov. 6.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Miguel Angel Ajo <majopela@redhat.com>
The Userspace datapath section of the Open vSwitch Testing documentation
shows an incorrect directory name for the test results. Moreover, the
check-system-userspace test fails if another OVS instance is running.
This patch corrects the directory name and adds a note about other
running instances.
Signed-off-by: László Sürü <laszlo.suru@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Role based access control is a relatively new addition to OVS/OVN, and
aside from the database documentation in ovn-sb(5), there is not much
explaining what RBAC is, how to use it, and the available roles. This
document remedies that situation.
Hopefully, any new roles will be added to this document in the
future.
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Modify docs and travis linux build script to use the DPDK 17.05.2
release branch to benefit from most recent bug fixes.
There are no new features introduced in the DPDK release, only back
ported bug fixes. For completeness these bug fixes have been documented
under the 17.05.2 section in the link below.
http://dpdk.org/doc/guides-17.05/rel_notes/release_17_05.html
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
This patch fixes a trivial typo in vhost-user documentation:
the path to the second socket should be /tmp/dpdkvhostclient1
and not /tmp/dpdkvhostclient0.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
In our projects, we found a need for faster access to the rows
contained in tables in the replica, as well as the ability to loop
over a subset of rows that meet some specified criteria.
Those needs led us to design and implement functionality that
satisfies those requirements, so we implemented special indexes.
In order to keep the OVSDB server implementation unmodified and avoid
extra processing load, the indexes are created as part of the IDL.
The indexes are created as part of the initialization of the replica request
and are maintained automatically when there are changes in the replica.
This document explains the design rationale of the compound indexes feature.
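To make the idea concrete, here is a minimal sketch of a client-side
sorted index that is updated as rows change and supports iterating over a
key range. It uses Python's bisect module rather than the IDL's actual
data structure, and the class, method, and column names are assumptions
for this example.

```python
import bisect


class CompoundIndex:
    """Toy sorted index over several columns of replica rows."""

    def __init__(self, columns):
        self.columns = columns
        self._entries = []  # sorted list of (key, seq, row)
        self._seq = 0       # unique tie-breaker so rows are never compared

    def _key(self, row):
        return tuple(row[c] for c in self.columns)

    def row_added(self, row):
        self._seq += 1
        bisect.insort(self._entries, (self._key(row), self._seq, row))

    def row_deleted(self, row):
        self._entries = [e for e in self._entries if e[2] is not row]

    def range(self, low, high):
        """Rows whose compound key lies in [low, high], in key order."""
        start = bisect.bisect_left(self._entries, (low, 0))
        end = bisect.bisect_right(self._entries, (high, float("inf")))
        return [e[2] for e in self._entries[start:end]]


index = CompoundIndex(("bridge", "port_no"))
rows = [{"bridge": "br1", "port_no": 1},
        {"bridge": "br0", "port_no": 2},
        {"bridge": "br0", "port_no": 1}]
for row in rows:
    index.row_added(row)
br0_ports = index.range(("br0", 0), ("br0", 99))
```

The server is not involved at any point: the index is built and kept
current purely from the rows the client already holds in its replica.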
Signed-off-by: Javier Albornoz <javier.albornoz@hpe.com>
Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com>
Signed-off-by: Jorge Arturo Sauma Vargas <jorge.sauma@hpe.com>
Co-authored-by: Javier Albornoz <javier.albornoz@hpe.com>
Co-authored-by: Esteban Rodriguez Betancourt <estebarb@hpe.com>
Co-authored-by: Jorge Arturo Sauma Vargas <jorge.sauma@hpe.com>
Co-authored-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Upgrading to the DPDK 17.05.1 stable release adds
significant new features relevant to OVS, including,
but not limited to:
- tun/tap PMD,
- VFIO hotplug support,
- Generic flow API.
The following changes are applied:
- netdev-dpdk: Changes required by DPDK API modifications.
- doc: Because of DPDK API changes, backward compatibility
with previous DPDK releases will be broken, thus all
relevant documentation entries are updated.
- .travis: DPDK version change from 16.11.1 to 17.05.1.
- rhel/openvswitch-fedora.spec.in: DPDK version change
from 16.11 to 17.05.1
Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The OVN gateway HA design document is very useful in its current form.
It describes a range of options OVN could take to provide gateway HA.
Leave all the useful discussion in place and add a note to indicate
how the current implementation lines up with the options described.
I plan to follow up with an additional patch to describe the current L3
gateway HA implementation in the ovn-architecture document.
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Miguel Angel Ajo <majopela@redhat.com>
Modify docs and travis linux build script to use the DPDK 16.11.2 stable
branch to benefit from most recent bug fixes.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
A few trivial fixes to vhost-user documentation including a syntax
error in the included xml file.
Signed-off-by: William Stevenson <yhvh2000@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
OpenFlow 1.1 and 1.2 support is complete. Simon Horman is not known to
be working on flow entry notifications.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Greg Rose <gvrose8192@gmail.com>