'dpdkr' ring ports was deprecated in 2.13 release and was not
actually used for a long time. Remove support now.
More details in
commit b4c5f00c33 ("netdev-dpdk: Deprecate ring ports.")
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Ideally SCTP checksum offload needs be advertised by the
NIC when userspace TSO is enabled. However, very few drivers
do that and it's not a widely used protocol. So, this patch
enables SCTP checksum offload if available, otherwise userspace
TSO can still be enabled but SCTP packets will be dropped on
NICs without support.
Fixes: 29cf9c1b3b ("userspace: Add TCP Segmentation Offload support")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The 'XDP_RING_NEED_WAKEUP' and related flags are defined if_xdp.h, so if
users are building their own kernel, users have to update the kernel's
header files, by doing:
$ make headers_install INSTALL_HDR_PATH=/usr
Otherwise the following error shows:
/usr/local/include/bpf/xsk.h: In function 'xsk_ring_prod__needs_wakeup':
/usr/local/include/bpf/xsk.h:82:21: error: 'XDP_RING_NEED_WAKEUP' undeclared \
(first use in this function)
return *r->flags & XDP_RING_NEED_WAKEUP;
Reported-by: Tomek Osinski <osinstom@gmail.com>
Reported-at: https://osinstom.github.io/en/tutorial/ovs-afxdp-installation/
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Current implementation of ovs on windows only allows LocalSystem and
Administrators to access the named pipe created with API of ovs.
Thus any service that needs to invoke the API to create named pipe
has to run as System account to interactive with ovs. It causes the
system more vulnerable if one of those services was break into.
The patch adds the creator owner account to allowed ACLs.
Signed-off-by: Ning Wu <nwu@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Anand Kumar <kumaranand@vmware.com>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
When using TSO in OVS-DPDK with an i40e device, the following
patch is required for DPDK, which fixes an issue on the TSO path:
https://patches.dpdk.org/patch/64136/
Document this as a limitation until a DPDK release with the fix
included is supported by OVS.
Also, document best known methods for performance tuning when
testing TSO with the tool iperf.
Fixes: 29cf9c1b3b ("userspace: Add TCP Segmentation Offload support")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Currently, the AF_XDP socket (XSK) related memory are allocated by main
thread in the main thread's NUMA domain. With the patch that detects
netdev-linux's NUMA node id, the PMD thread of AF_XDP port will be run on
the AF_XDP netdev's NUMA domain. If the net device's NUMA domain
is different from the main thread's NUMA domain, we will have two
cross-NUMA memory accesses (netdev <-> memory, memory <-> CPU).
This patch addresses the aforementioned issue by allocating
the memory in the net device's NUMA domain.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The patch detects the numa node id from the name of the netdev,
by reading the '/sys/class/net/<devname>/device/numa_node'.
If not available, ex: virtual device, or any error happens,
return numa id 0. Currently only the afxdp netdev type uses it,
other linux netdev types are disabled due to no use case.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Abbreviated as TSO, TCP Segmentation Offload is a feature which enables
the network stack to delegate the TCP segmentation to the NIC reducing
the per packet CPU overhead.
A guest using vhostuser interface with TSO enabled can send TCP packets
much bigger than the MTU, which saves CPU cycles normally used to break
the packets down to MTU size and to calculate checksums.
It also saves CPU cycles used to parse multiple packets/headers during
the packet processing inside virtual switch.
If the destination of the packet is another guest in the same host, then
the same big packet can be sent through a vhostuser interface skipping
the segmentation completely. However, if the destination is not local,
the NIC hardware is instructed to do the TCP segmentation and checksum
calculation.
It is recommended to check if NIC hardware supports TSO before enabling
the feature, which is off by default. For additional information please
check the tso.rst document.
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Tested-by: Ciara Loftus <ciara.loftus.intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
open(), io.open(), codecs.open() and fileinput.FileInput no longer accept 'U'
("universal newline") in the file mode.
This flag was deprecated since Python 3.3.
In Python 3, the "universal newline" is used by default when a file is open
in text mode.
Acked-by: Flavio Leitner <fbl@sysclose.org>
Reported-at: https://bugzilla.redhat.com/1791681
Reported-by: Miro Hrončok <mhroncok@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Support offload of output action, also configuring count action for
allowing query statistics of HW offloaded flows.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This patch adds a new policer to the DPDK datapath based on RFC 4115's
Two-Rate, Three-Color marker. It's a two-level hierarchical policer
which first does a color-blind marking of the traffic at the queue
level, followed by a color-aware marking at the port level. At the end
traffic marked as Green or Yellow is forwarded, Red is dropped. For
details on how traffic is marked, see RFC 4115.
This egress policer can be used to limit traffic at different rated
based on the queues the traffic is in. In addition, it can also be used
to prioritize certain traffic over others at a port level.
For example, the following configuration will limit the traffic rate at a
port level to a maximum of 2000 packets a second (64 bytes IPv4 packets).
100pps as CIR (Committed Information Rate) and 1000pps as EIR (Excess
Information Rate). High priority traffic is routed to queue 10, which marks
all traffic as CIR, i.e. Green. All low priority traffic, queue 20, is
marked as EIR, i.e. Yellow.
ovs-vsctl --timeout=5 set port dpdk1 qos=@myqos -- \
--id=@myqos create qos type=trtcm-policer \
other-config:cir=52000 other-config:cbs=2048 \
other-config:eir=52000 other-config:ebs=2048 \
queues:10=@dpdk1Q10 queues:20=@dpdk1Q20 -- \
--id=@dpdk1Q10 create queue \
other-config:cir=41600000 other-config:cbs=2048 \
other-config:eir=0 other-config:ebs=0 -- \
--id=@dpdk1Q20 create queue \
other-config:cir=0 other-config:cbs=0 \
other-config:eir=41600000 other-config:ebs=2048 \
This configuration accomplishes that the high priority traffic has a
guaranteed bandwidth egressing the ports at CIR (1000pps), but it can also
use the EIR, so a total of 2000pps at max. These additional 1000pps is
shared with the low priority traffic. The low priority traffic can use at
maximum 1000pps.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
This reverts commit 9e334d91b3.
This commit is not needed since OVS doesn't use six anymore.
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Since Python 2 support was removed in 1ca0323e7c ("Require Python 3 and
remove support for Python 2."), python3-six is not needed anymore.
Moreover python3-six is not available on RHEL/CentOS7 without using EPEL
and so this patch is needed in order to release OVS 2.13 on RHEL7.
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
'dpdkr' a.k.a. DPDK ring ports has really poor support in OVS and not
tested on a regular basis. These ports are intended to work via
shared memory with another DPDK secondary process, but there are lots
of limitations for using this functionality in practice. Most of them
connected with running secondary DPDK application and memory layout
issues. More details are available in DPDK guide:
https://doc.dpdk.org/guides-18.11/prog_guide/multi_proc_support.html#multi-process-limitations
Beside the functional limitations it's also hard to use this
functionality correctly. User must be sure that OVS and secondary DPDK
application are running on different CPU cores, which is hard because
non-PMD threads could float over available CPU cores. This or any
other misconfiguration will likely lead to crash of OVS.
Another problem is that the user must actually build the secondary
application with the same version of DPDK that was used for OVS build.
Above issues are same as we have while using DPDK pdump.
Beside that, current implementation in OVS is not able to free
allocated rings that could lead to memory exhausting.
Initially these ports was added to use with IVSHMEM for a fast
zero-copy HOST<-->VM communication. However, IVSHMEM is not used
anymore. IVSHMEM support was removed from DPDK in 16.11 release
(instructions for IVSHMEM were removed from the OVS docs almost 3 years
ago by commit 90ca71dd31 ("doc: Remove ivshmem instructions.")) and
the patch for QEMU for using regular files as a device backend is no
longer available. That makes DPDK ring ports barely useful in real
virtualization environment.
This patch adds a deprecation warnings for run-time port creation
and documentation. Claiming to completely remove this functionality
from OVS in one of the next releases.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Modify travis linux build script to use the latest DPDK stable release
18.11.5. Update docs for latest DPDK stable releases.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Until now there was only two options for XDP mode in OVS: SKB or DRV.
i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.
Devices like 'veth' interfaces in Linux supports native XDP, but
doesn't support zero-copy mode. This case can not be covered by
existing API and we have to use slower generic XDP for such devices.
There are few more issues, e.g. TCP is not supported in generic XDP
mode for veth interfaces due to kernel limitations, however it is
supported in native mode.
This change introduces ability to use native XDP without zero-copy
along with best-effort configuration option that enabled by default.
In best-effort case OVS will sequentially try different modes starting
from the fastest one and will choose the first acceptable for current
interface. This will guarantee the best possible performance.
If user will want to choose specific mode, it's still possible by
setting the 'options:xdp-mode'.
This change additionally changes the API by renaming the configuration
knob from 'xdpmode' to 'xdp-mode' and also renaming the modes
themselves to be more user-friendly.
The full list of currently supported modes:
* native-with-zerocopy - former DRV
* native - new one, DRV without zero-copy
* generic - former SKB
* best-effort - new one, chooses the best available from
3 above modes
Since 'best-effort' is a default mode, users will not need to
explicitely set 'xdp-mode' in most cases.
TCP related tests enabled back in system afxdp testsuite, because
'best-effort' will choose 'native' mode for veth interfaces
and this mode has no issues with TCP.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
The conventional way for packet dumping in OVS is to use ovs-tcpdump
that works via traffic mirroring. DPDK pdump could probably be used
for some lower level debugging, but it is not commonly used for
various reasons.
There are lots of limitations for using this functionality in practice.
Most of them connected with running secondary pdump process and
memory layout issues like requirement to disable ASLR in kernel.
More details are available in DPDK guide:
https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
Beside the functional limitations it's also hard to use this
functionality correctly. User must be sure that OVS and pdump utility
are running on different CPU cores, which is hard because non-PMD
threads could float over available CPU cores. This or any other
misconfiguration will likely lead to crash of the pdump utility
or/and OVS.
Another problem is that the user must actually have this special pdump
utility in a system and it might be not available in distributions.
This change disables pdump support by default introducing special
configuration option '--enable-dpdk-pdump'. Deprecation warnings will
be shown to users on configuration and in runtime.
Claiming to completely remove this functionality from OVS in one
of the next releases.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
OVS may be unable to transmit packets for multiple reasons on
the userspace datapath and today there is a single counter to
track packets dropped due to any of those reasons. This patch
adds custom software stats for the different reasons packets
may be dropped during tx/rx on the userspace datapath in OVS.
- MTU drops : drops that occur due to a too large packet size
- Qos drops : drops that occur due to egress/ingress QOS
- Tx failures: drops as returned by the DPDK PMD send function
Note that the reason for tx failures is not specified in OVS.
In practice for vhost ports it is most common that tx failures
are because there are not enough available descriptors,
which is usually caused by misconfiguration of the guest queues
and/or because the guest is not consuming packets fast enough
from the queues.
These counters are displayed along with other stats in
"ovs-vsctl get interface <iface> statistics" command and are
available for dpdk and vhostuser/vhostuserclient ports.
Also the existing "tx_retries" counter for vhost ports has been
renamed to "ovs_tx_retries", so that all the custom statistics
that OVS accumulates itself will have the prefix "ovs_". This
will prevent any custom stats names overlapping with
driver/HW stats.
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
At the same time disambiguate some feature descriptions.
'Meters' is changed to 'Meter action' to clarify that the entry
describes the Openflow meter action rather than port based meters.
'NAT' is changed to 'Conntrack NAT' to indicate that this entry
represents NAT done in 'conntrack', rather than basic Openflow
IP address and L4 port modifications.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The patch adds support for using need_wakeup flag in AF_XDP rings.
A new option, use-need-wakeup, is added. When this option is used,
it means that OVS has to explicitly wake up the kernel RX, using poll()
syscall and wake up TX, using sendto() syscall. This feature improves
the performance by avoiding unnecessary sendto syscalls for TX.
For RX, instead of kernel always busy-spinning on fille queue, OVS wakes
up the kernel RX processing when fill queue is replenished.
The need_wakeup feature is merged into Linux kernel bpf-next tee with commit
77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings") and
OVS enables it by default, if libbpf supports it. If users enable it but
runs in an older version of libbpf, then the need_wakeup feature has no effect,
and a warning message is logged.
For virtual interface, it's better set use-need-wakeup=false, since
the virtual device's AF_XDP xmit is synchronous: the sendto syscall
enters kernel and process the TX packet on tx queue directly.
On Intel Xeon E5-2620 v3 2.4GHz system, performance of physical port
to physical port improves from 6.1Mpps to 7.3Mpps.
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Since Python 3 is now mandatory, Extra Packages for Enterprise Linux
(EPEL) repository is needed in order to build OVS on RHEL7.
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The DPDK library allows OVS fast access to packet I/O in userspace. It
is not a datapath. This commit avoids using that term.
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Some users would find it useful to know the particular OVS version that
introduced a feature to the OVS tree kernel module or to the OVS
userspace (DPDK) datapath implementation. This patch updates the FAQ
to include that information.
This information is primarily gleaned from the top-level NEWS file.
For most of these, I did not verify them by looking carefully through
the history, so some of them may be inaccurate, although a few people
made corrections in review.
Requested-by: Jianjun Shen <shenj@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This workaround only applied to kernels earlier than 2.6.37, but OVS
only supports 3.10 and later.
As the original author of this code, I won't miss it.
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
OVS supports OVS Extensions as various vendor messages or as vendor
types in stats or multipart messages. Added a document to describe the
extensions as currently supported by OVS.
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Move back the dpdk-testpmd reference to the right section of this
document so that the link in howto/dpdk does not point to
"vhost-user-client tx retries config".
Fixes: 080f080c3b ("netdev-dpdk: Enable tx-retries-max config.")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Someone else wrote this script originally, I think, but I've extended
it quite a bit.
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit fixes potential unintended mistake in userspace-tunneling guide: for the example in userspace-tunneling guide, there is no bridge named "br-eth1", but only a bridge name "br-phy" which has a port named "eth1"
Signed-off-by: Liu Chang <liuchang@cmss.chinamobile.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A preceding commit removed the last remaining dependencies on OVN code,
so remove the OVN code.
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Python 2 reaches end-of-life on January 1, 2020, which is only
a few months away. This means that OVS needs to stop depending
on in the next release that should occur roughly that same time.
Therefore, this commit removes all support for Python 2. It
also makes Python 3 a mandatory build dependency.
Some of the interesting consequences:
- HAVE_PYTHON, HAVE_PYTHON2, and HAVE_PYTHON3 conditionals have
been removed, since we now know that Python3 is available.
- $PYTHON and $PYTHON2 are removed, and $PYTHON3 is always
available.
- Many tests for Python 2 support have been removed, and the ones
that depended on Python 3 now run unconditionally. This allowed
several macros in the testsuite to be removed, making the code
clearer. This does make some of the changes to the testsuite
files large due to indentation level changes.
- #! lines for Python now use /usr/bin/python3 instead of
/usr/bin/python.
- Packaging depends on Python 3 packages.
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Tested-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch first defines the dpif interface for a datapath to support
adding, deleting, getting and dumping conntrack timeout policy.
The timeout policy is identified by a 4 bytes unsigned integer in
datapath, and it currently support timeout for TCP, UDP, and ICMP
protocols.
Moreover, this patch provides the implementation for Linux kernel
datapath in dpif-netlink.
In Linux kernel, the timeout policy is maintained per L3/L4 protocol,
and it is identified by 32 bytes null terminated string. On the other
hand, in vswitchd, the timeout policy is a generic one that consists of
all the supported L4 protocols. Therefore, one of the main task in
dpif-netlink is to break down the generic timeout policy into 6
sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp),
and push down the configuration using the netlink API in
netlink-conntrack.c.
This patch also adds missing symbols in the windows datapath so
that the build on windows can pass.
Appveyor CI:
* https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
We don't own ovs.org, and I doubt Ojai Valley School would enjoy
receiving our email.
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add support in ovsdb-tool for migrating clustered dbs to standalone dbs.
E.g. usage to migrate nb/sb db to standalone db from raft:
ovsdb-tool cluster-to-standalone ovnnb_db.db ovnnb_db_cluster.db
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>