mirror of https://github.com/openvswitch/ovs synced 2025-08-22 01:51:26 +00:00

17919 Commits

Author SHA1 Message Date
Timothy Redaelli
0c4d144a98 Remove dependency on python3-six
Since Python 2 support was removed in 1ca0323e7c29 ("Require Python 3 and
remove support for Python 2."), python3-six is not needed anymore.

Moreover python3-six is not available on RHEL/CentOS7 without using EPEL
and so this patch is needed in order to release OVS 2.13 on RHEL7.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-20 12:23:06 -08:00
Timothy Redaelli
24e6970809 travis: Use pip3 to install the python packages on linux
Currently pip is used to install the python packages on linux by travis,
but pip3 should be used since pip is a symlink of pip2.

Fixes: 1ca0323e7c29 ("Require Python 3 and remove support for Python 2.")
Cc: blp@ovn.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-19 07:49:01 -08:00
Ilya Maximets
138d30a9c0 dpctl: Fix referencing DPDK in a flow dump.
A few reasons to replace 'non-dpdk interfaces' with 'the main thread':

* Flows are dumped from threads (from per thread flow tables) not from
  the interfaces.

* 'non-dpdk' here sounds like all other flows (dumped from PMDs) have
  some relation to DPDK, which is not true, at least because we have
  afxdp and dummy ports that could be polled by PMD threads.

* 'main thread' is the same term as we're using in the output of
  ovs-appctl dpif-netdev/pmd-stats-show.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-12-19 08:44:27 +01:00
Ilya Maximets
28d0501623 ovs-thread: Avoid huge alignment on a base spinlock structure.
Marking the structure as 64-byte aligned forces the compiler to produce
big holes in the containing structures in order to fulfill this
requirement.  Also, any structure that contains this one as a member
automatically inherits this huge alignment, making the resulting memory
layout inefficient.  For example, 'struct umem_pool' currently
uses 3 full cache lines (192 bytes) with only 32 bytes of actual data:

  struct umem_pool {
    int                        index;                /*  0   4 */
    unsigned int               size;                 /*  4   4 */

    /* XXX 56 bytes hole, try to pack */

    /* --- cacheline 1 boundary (64 bytes) --- */
    struct ovs_spin lock __attribute__((__aligned__(64))); /* 64  64 */

    /* XXX last struct has 48 bytes of padding */

    /* --- cacheline 2 boundary (128 bytes) --- */
    void * *                   array;                /* 128  8 */

    /* size: 192, cachelines: 3, members: 4 */
    /* sum members: 80, holes: 1, sum holes: 56 */
    /* padding: 56 */
    /* paddings: 1, sum paddings: 48 */
    /* forced alignments: 1, forced holes: 1, sum forced holes: 56 */
  } __attribute__((__aligned__(64)));

Actual alignment of a spin lock is required only for Tx queue locks
inside netdev-afxdp to avoid false sharing, in all other cases
alignment only produces inefficient memory usage.

Also, CACHE_LINE_SIZE macro should be used instead of 64 as different
platforms may have different cache line sizes.

Using PADDED_MEMBERS to avoid alignment inheritance.
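
The alignment inheritance described above can be reproduced with a small
standalone sketch.  All 'demo_*' names are hypothetical and the 64-byte
cache line is an assumption; the union below only loosely imitates the idea
of padding a member without aligning it and is not the OVS PADDED_MEMBERS
macro itself:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64   /* Assumed cache line size for this sketch. */

/* Lock type that forces 64-byte alignment on itself and, as a side
 * effect, on every structure that embeds it. */
struct demo_spin_aligned {
    alignas(DEMO_CACHE_LINE) int locked;
};

/* Lock type with natural alignment only. */
struct demo_spin {
    int locked;
};

/* Same member order as the 'struct umem_pool' report above. */
struct demo_pool_aligned {
    int index;
    unsigned int size;
    struct demo_spin_aligned lock;   /* Starts at offset 64. */
    void **array;                    /* Starts at offset 128. */
};

struct demo_pool_plain {
    int index;
    unsigned int size;
    struct demo_spin lock;
    void **array;
};

/* Opt-in padding: the lock occupies a full cache line, but the
 * alignment requirement of the enclosing structure is unchanged. */
struct demo_padded_lock {
    union {
        struct demo_spin lock;
        unsigned char pad[DEMO_CACHE_LINE];
    };
};

/* The aligned variant needs three cache lines, as in the report. */
static_assert(sizeof(struct demo_pool_aligned) == 3 * DEMO_CACHE_LINE,
              "aligned lock inflates the containing structure");
static_assert(alignof(struct demo_padded_lock) < DEMO_CACHE_LINE,
              "padding alone does not raise the alignment");
```

With the plain lock the whole pool fits well inside one cache line, so
padding can be requested only where false sharing actually matters.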

Fixes: ae36d63d7e3c ("ovs-thread: Make struct spin lock cache aligned.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-19 01:06:54 +01:00
Eelco Chaudron
3d56e4ac44 netdev-dpdk: Add coverage counter to count vhost IRQs.
When the dpdk vhost library executes an eventfd_write() call,
i.e. waking up the guest, a new callback will be called.

This patch adds the callback to count the number of
interrupts sent to the VM.

This might be of interest to find out whether system calls were
made in the DPDK fast path.

The coverage counter is called "vhost_notification" and
can be read with:

  $ ovs-appctl coverage/read-counter vhost_notification
  13238319

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-19 01:03:52 +01:00
Kevin Traynor
6d77abf4f7 netdev-dpdk: Fix sw stats perf drop.
Accessing the sw stats in the vhost datapath of a PVP test
can incur a performance drop of ~2%.

Most of the time these stats will just be getting zero added
to them. By checking if there is a non-zero update first, we
can avoid accessing them when they won't be updated and avoid
the performance drop.
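
A minimal sketch of the "check for a non-zero update first" idea follows.
The structure and function names are hypothetical and do not reproduce the
actual netdev-dpdk code; the return value exists only to make the effect
visible:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the software stats kept per device. */
struct demo_sw_stats {
    uint64_t tx_retries;
    uint64_t tx_failure_drops;
};

/* Adds per-burst counts to the shared stats, touching a counter only
 * when its update is non-zero.  Returns the number of counters that
 * were actually written. */
int
demo_update_sw_stats(struct demo_sw_stats *stats,
                     unsigned int retries, unsigned int drops)
{
    int writes = 0;

    assert(stats);
    if (retries) {
        stats->tx_retries += retries;
        writes++;
    }
    if (drops) {
        stats->tx_failure_drops += drops;
        writes++;
    }
    return writes;
}
```

In the common case where both counts are zero, the function returns
without reading or writing the shared counters at all.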

Fixes: 2f862c712e52 ("netdev-dpdk: Detailed packet drop statistics.")
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-18 13:06:27 +01:00
William Tu
a0152c1164 Documentation: Fix ovs-tcpdump options.
Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-18 12:50:18 +01:00
Ilya Maximets
2620f056a9 system-afxdp.at: Add test for infinite re-addition of failed ports.
New file created for AF_XDP specific tests.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-18 02:02:52 +01:00
Ilya Maximets
37a2465523 netdev-afxdp: Avoid removing of XDP program if not loaded.
'bpf_set_link_xdp_fd' generates netlink event regardless of actual
changes it does, so if-notifier will receive link update even if
there was no XDP program previously loaded on the interface.

OVS tries to remove XDP program if device configuration was not
successful triggering if-notifier that triggers bridge reconfiguration
and another attempt to add failed port.  And so on in the infinite
loop.

This patch avoids the issue by not removing XDP program if it wasn't
loaded.  Since loading of the XDP program is one of the last steps
of port configuration, this should help to avoid infinite re-addition
for most types of misconfiguration.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-18 01:39:56 +01:00
Ilya Maximets
96e744046a dpif-netdev: Avoid infinite re-addition of misconfigured ports.
Infinite re-addition of failed ports happens if the device in userspace
datapath has a linux network interface and it's not able to be
configured.  For example, if the first reconfiguration fails because of
misconfiguration or bad initial device state.
In current code victims are afxdp ports and the Mellanox NIC ports
opened by the DPDK due to their bifurcated drivers (It's unlikely for
usual netdev-linux ports to fail).

The root cause: Every change in the state of the network interface
of a linux kernel device generates if-notifier event and if-notifier
event triggers the OVS code to re-apply the configuration of ports,
i.e. add broken ports back. The most obvious part is that dpif-netdev
changes the device flags before trying to configure it:

   1. add_port()
   2. set_flags() --> if-notifier event
   3. reconfigure() --> port removal from the datapath due to
                        misconfiguration or any other issue in
                        the underlying device.
   4. setting flags back --> another if-notifier event.
   5. There was new if-notifier event?
      yes --> re-apply all settings. --> goto step 1.

An easy way to reproduce is to add an afxdp port with n_rxq=N, where N is
bigger than the device supports.

This patch fixes the most obvious case for this issue by moving
enabling of a promisc mode later to the place where we already know
that device could be added to datapath without errors, i.e. after
its first successful reconfiguration.
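
The loop from steps 1-5 can be modeled with a toy simulation.  This is only
an illustration under simplifying assumptions (a single port, a
reconfiguration result that never changes, hypothetical names), not the
dpif-netdev logic:

```c
#include <assert.h>
#include <stdbool.h>

/* Returns how many times the port is added before the system settles,
 * capped at 'max_rounds' so that the endless case terminates here. */
int
demo_count_add_attempts(bool promisc_before_reconfigure,
                        bool reconfigure_ok, int max_rounds)
{
    int rounds = 0;

    assert(max_rounds > 0);
    while (rounds < max_rounds) {
        bool notifier_event = false;

        rounds++;                          /* 1. add_port() */
        if (promisc_before_reconfigure) {
            notifier_event = true;         /* 2. set_flags() */
        }
        if (reconfigure_ok) {              /* 3. reconfigure() */
            /* The port stays in the datapath, so later events
             * have nothing to re-add. */
            break;
        }
        /* Reconfiguration failed and the port was removed. */
        if (promisc_before_reconfigure) {
            notifier_event = true;         /* 4. Flags set back. */
        }
        if (!notifier_event) {             /* 5. No event, no retry. */
            break;
        }
    }
    return rounds;
}
```

With the flags changed before reconfiguration, a broken port is retried
until the cap is hit; with the change deferred until after a successful
reconfiguration, a broken port is tried only once.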

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363038.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-18 01:39:56 +01:00
Eelco Chaudron
988fd46391 netdev-dpdk: add support for the RTE_ETH_EVENT_INTR_RESET event.
Currently, OVS does not register for, and therefore does not handle, the
interface reset event from the DPDK framework. This would cause a
problem in cases where a VF is used as an interface, and its
configuration changes.

As an example, in the following scenario, without this patch applied the
MAC change is not detected/acted upon until OVS is restarted:

  $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
  $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
            set Interface dpdk0 type=dpdk -- \
            set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0

  $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-18 01:23:55 +01:00
Ilya Maximets
db54e96720 bridge: Allow manual notifications about interfaces' updates.
Sometimes interface updates could happen in a way the if-notifier is not
able to catch.  For example, some heavy operations (device reset) in
netdev-dpdk could require re-applying the bridge configuration.

For this purpose a new manual notifier is introduced.  Its function
'if_notifier_manual_report()' could be called directly by the code
that is aware of the changes.  This new notifier is thread-safe.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
2019-12-18 01:23:50 +01:00
Damijan Skvarc
bd6da4ab37 tests: introduced test for checking "ovs-vsctl emer-reset"
Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-17 13:50:18 -08:00
Yi-Hung Wei
391b52f3c3 rhel: Support RHEL 7.8 kernel module rpm build
This patch supports RHEL 7.8 kernel module rpm package building.

$ make rpm-fedora-kmod \
RPMBUILD_OPT='-D "kversion 3.10.0-1101.el7.x86_64"'

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-13 15:58:26 -08:00
Ilya Maximets
ac9daaa6ab cirrus: Use FreeBSD 12.1 stable release.
The freebsd-12-0-snap image family was suddenly removed from gCloud,
so it cannot be used anymore.  Updating to the more recent 12.1 release.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-12-13 19:21:56 +01:00
Ilya Maximets
cfc3c50c3b AUTHORS: Add Lance Yang.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-13 18:49:00 +01:00
Lance Yang
bdfab59371 travis: Move x86-only add-on packages to linux-prepare script.
To enable multiple CPU architectures support, it is necessary to move
the x86-only add-on packages from .travis.yml file. Otherwise, the
x86-only add-on packages will break the builds on some other CPU
architectures.

Reviewed-by: Yanqin Wei <Yanqin.Wei@arm.com>
Reviewed-by: Malvika Gupta <Malvika.Gupta@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Reviewed-by: Ruifeng Wang <Ruifeng.Wang@arm.com>
Acked-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Lance Yang <Lance.Yang@arm.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-13 18:45:10 +01:00
Lance Yang
96445fa4d9 dpif-netdev-perf: Fix using of uninitialized last_tsc.
When compiling Open vSwitch on aarch64, the compiler will warn about an
uninitialized variable in lib/dpif-netdev-perf.c. If the clock_gettime
function in rdtsc_syscall fails, the member last_tsc of the
uninitialized struct will be returned. In order to avoid the warnings,
it is necessary to initialize the variable before using.
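
The failure path described above can be sketched as follows.  The names are
hypothetical and the clock is passed in as a function pointer purely so the
failing case can be exercised; the real code calls clock_gettime directly:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the per-thread perf stats. */
struct demo_perf_stats {
    uint64_t last_tsc;
};

typedef int (*demo_clock_fn)(struct timespec *ts);

/* Initializing 'last_tsc' up front makes the fallback below well
 * defined even if the very first clock read fails. */
void
demo_perf_stats_init(struct demo_perf_stats *s)
{
    s->last_tsc = 0;
}

/* Returns a nanosecond counter, or the last good value if the clock
 * source reports an error. */
uint64_t
demo_read_counter(struct demo_perf_stats *s, demo_clock_fn clock_fn)
{
    struct timespec ts;

    assert(s && clock_fn);
    if (clock_fn(&ts) != 0) {
        return s->last_tsc;
    }
    s->last_tsc = (uint64_t) ts.tv_sec * UINT64_C(1000000000)
                  + (uint64_t) ts.tv_nsec;
    return s->last_tsc;
}

/* Clock sources used only for demonstration. */
int
demo_clock_fail(struct timespec *ts)
{
    (void) ts;
    return -1;
}

int
demo_clock_fixed(struct timespec *ts)
{
    ts->tv_sec = 2;
    ts->tv_nsec = 5;
    return 0;
}
```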

Reviewed-by: Yanqin Wei <Yanqin.Wei@arm.com>
Reviewed-by: Malvika Gupta <Malvika.Gupta@arm.com>
Signed-off-by: Lance Yang <Lance.Yang@arm.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-13 18:45:10 +01:00
Ilya Maximets
e7cb123ffc dpif-netdev: Hold global port mutex while calling offload API.
We changed datapath port lookup to use the netdev-offload API, but
forgot that the port mutex was there not only to protect the datapath
port hash map.  It was also there as a workaround for the complete
lack of thread safety in netdev-offload-dpdk functions.

Bringing it back to fix the behaviour and adding a comment to prevent
removing it in the future unless netdev-offload-dpdk is fixed.

For the thread safety notice see the top of netdev-offload-dpdk.c.

Fixes: 30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while offloading")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eli Britstein <elibr@mellanox.com>
2019-12-13 13:43:35 +01:00
Ben Pfaff
c77f3a1625 tests: Log commands being executed for async message control test.
The "ofproto - asynchronous message control (OpenFlow 1.4)" test fails
from time to time when I'm running tests in parallel locally.  So far,
I've not been able to determine the root cause, but logging the
commands as they're executed should help.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:18 -08:00
Ben Pfaff
2235035119 tests: Improve logging for async message control test.
The "ofproto - asynchronous message control (OpenFlow 1.4)" test fails
from time to time when I'm running tests in parallel locally.  So far,
I've not been able to determine the root cause, but logging the
difference between expected and actual output should help.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:14 -08:00
Ben Pfaff
e6eef11145 tests: Better document OVS_WAIT_UNTIL, OVS_WAIT_WHILE macros.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:12 -08:00
Ben Pfaff
dd7467a01d ofp-monitor: Make OFP_FLOW_REMOVED_REASON_BUFSIZE public.
This constant is needed to use ofp_flow_removed_reason_to_string(),
which is itself public.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:09 -08:00
Ben Pfaff
95a5454c51 ofp-print: Abbreviate lists of fields in table features output.
This makes the output both shorter and easier to read.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2019-12-12 10:16:05 -08:00
Ilya Maximets
9802fafa96 checkpatch: Check spelling in commit messages.
This seems useful as I'm usually making a lot of typing mistakes.
Also, a few commonly used words were added to the extended dictionary.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-09 21:14:26 +01:00
Ilya Maximets
37c6dfab10 checkpatch: Skip words containing numbers.
Words like 'br0' are common and usually reference some code or
database objects that should not be targets for spell checking.
So, it's better to skip all the words that have digits inside instead
of only the ones that start with numbers.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-09 21:14:26 +01:00
Ilya Maximets
98a411b32e checkpatch: Allow common abbreviations for spell checking.
Abbreviations of Latin expressions like 'i.e.' or 'e.g.' are common
and known by the dictionary.  However, our spell checker is not able
to recognize them because it strips dots out of them.  To avoid this
issue we could pass non-stripped version of the word to the dictionary
checker too.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-09 21:14:26 +01:00
Jinjun Gao
40570384bc datapath-windows: Do not delete internal port on OID_SWITCH_NIC_DISCONNECT
According to the Microsoft documentation:
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/hyper-v-extensible-switch-port-and-network-adapter-states
the following OID request sequence is valid:
         OID_SWITCH_NIC_CONNECT -> OID_SWITCH_NIC_DISCONNECT
                  ^                           |
                  |                           V
         OID_SWITCH_NIC_CREATE  <- OID_SWITCH_NIC_DELETE

In the above sequence, the Windows extensible switch interface assumes that
OID_SWITCH_PORT_CREATE has been issued and the port has been created
successfully. If the internal port is deleted in HvDisconnectNic(),
HvCreateNic() will fail when OID_SWITCH_NIC_CREATE is received later because
there is no corresponding port.

Signed-off-by: Jinjun Gao <jinjung@vmware.com>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
2019-12-09 13:33:44 +02:00
Ophir Munk
1061dc7c85 dpif-netdev: Retrieve dpif_class from struct dp_netdev.
In case a pmd pointer (struct dp_netdev_pmd_thread *) needs to retrieve
the dpif_class it points at - it can access it as:  pmd->dp->class.  A
second option is to access it as: pmd->dp->dpif->dpif_class. The first
option is safe since there is one dp netdev with a constant pointer to
the dpif class. The second option is not safe since the pointer
pmd->dp->dpif may be changed under the hood, for example, in case there
is a call to dpif_open(). One such scenario is when a netdev bridge is
running while dumping flows statistics with dpctl in parallel:
ovs-appctl dpctl/dump-flows. This commit makes usage of the first
safe option instead of the second option.

Fixes: 30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while offloading")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-08 15:48:23 +01:00
zhaozhanxu
164413156c Add offload packets statistics
Add argument '--offload-stats' for command ovs-appctl bridge/dump-flows
to display the offloaded packets statistics.

The commands display as below:

original command:

ovs-appctl bridge/dump-flows br0

duration=574s, n_packets=1152, n_bytes=110768, priority=0,actions=NORMAL
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x3,actions=drop

new command with argument '--offload-stats'

Notice: 'n_offload_packets' are a subset of n_packets and 'n_offload_bytes' are
a subset of n_bytes.

ovs-appctl bridge/dump-flows --offload-stats br0

duration=582s, n_packets=1152, n_bytes=110768, n_offload_packets=1107, n_offload_bytes=107992, priority=0,actions=NORMAL
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x3,actions=drop

Signed-off-by: zhaozhanxu <zhaozhanxu@163.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-06 10:57:16 +01:00
Malvika Gupta
3843208ee0 dpif-netdev-perf: Accurate cycle counter update
The accurate timing implementation in this patch gets the wall clock counter via
cntvct_el0 register access. This call is portable to all aarch64 architectures
and has been verified on a 64-bit Arm server.

Suggested-by: Yanqin Wei <yanqin.wei@arm.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Malvika Gupta <malvika.gupta@arm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-05 11:17:31 -08:00
Ian Stokes
127b6a6eea dpdk: Update to use DPDK 19.11.
This commit adds support for DPDK v19.11, it includes the following
changes.

1. travis: Enable compilation and linkage with dpdk 19.11.

2. sparse: Remove dpdk network headers copies.

   https://patchwork.ozlabs.org/patch/1185256/

3. dpdk: Migrate to new PDUMP API.

   https://patchwork.ozlabs.org/patch/1192971/

4. netdev-dpdk: Prefix network structures with rte_.

   https://patchwork.ozlabs.org/patch/1109733/

5. netdev-dpdk: Update by new color definitions.

   https://patchwork.ozlabs.org/patch/1086089/

6. docs: Update docs to reference 19.11.

7. docs: Add note regarding hotplug and igb_uio requirements.

To give credit, all authors of the original commits to 'dpdk-latest' with the
above changes have been added as co-authors for this commit.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Co-authored-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Co-authored-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Co-authored-by: Ophir Munk <ophirmu@mellanox.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-12-04 20:51:57 +00:00
William Tu
9bf871c401 trivial: Fix erspan coding style.
Fix indentation and whitespace.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-12-03 16:24:25 -08:00
William Tu
4d80f503ff AUTHORS: Add Yi Yang.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-12-03 16:24:16 -08:00
Damijan Skvarc
822c8923f0 ovs-vsctl: unit test for checking fail-mode related
A unit test is introduced which checks fail-mode related commands.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 11:00:57 -08:00
Dmytro Linkin
84dd881bb5 ofproto-dpif-xlate: Prevent duplicating of traffic to a mirror port
Currently the ofproto design disallows duplicating an output packet when
forwarding and mirroring to/from the same OVS port. The following scenario
reveals a gap in the design:
1. Send ping between regular OVS ports (VFs, for example), then stop it.
2. While the rule still exists, create a mirror for one of the ports.
Prevent duplicating of traffic to a mirror port.

Fixes: 86e2dcddce85 ("dpif-xlate: Snoop multicast packets and send them properly")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 10:31:54 -08:00
Darrell Ball
a7f33fdbfb conntrack: Support zone limits.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 10:11:13 -08:00
William Tu
551c72854e ofproto-dpif: Refactor the get capability code.
Make the code simpler by removing the use of
xasprintf and free, and use smap_add_format.

Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-12-02 16:19:25 -08:00
Yanqin Wei
3343f8d6cf netdev: use acquire-release semantics for change_seq in netdev
"rxq_enabled" of netdev is written in the vhost thread and read by the pmd
thread once it observes that 'change_seq' is updated. This patch keeps the
ordering on aarch64 or other weak memory model CPUs to ensure 'rxq_enabled' is
observed before 'change_seq'.
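
The publish/observe pattern described above can be sketched with C11
atomics.  This is a hedged illustration with hypothetical names, not the
OVS atomics wrappers or the actual netdev code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant netdev fields. */
struct demo_netdev {
    bool rxq_enabled;              /* Plain data, written by one thread. */
    _Atomic uint64_t change_seq;   /* Publication counter. */
};

void
demo_netdev_init(struct demo_netdev *dev)
{
    dev->rxq_enabled = false;
    atomic_init(&dev->change_seq, 0);
}

/* Writer side (the vhost thread in the description above): store the
 * data first, then bump the sequence with release semantics so the data
 * store cannot be reordered after the sequence update. */
void
demo_publish(struct demo_netdev *dev, bool enabled)
{
    dev->rxq_enabled = enabled;
    atomic_fetch_add_explicit(&dev->change_seq, 1, memory_order_release);
}

/* Reader side (the pmd thread): load the sequence with acquire
 * semantics; if it changed, the matching 'rxq_enabled' value is
 * guaranteed to be visible.  Returns true if a new value was read. */
bool
demo_poll(struct demo_netdev *dev, uint64_t *last_seq, bool *enabled)
{
    uint64_t seq = atomic_load_explicit(&dev->change_seq,
                                        memory_order_acquire);

    assert(last_seq && enabled);
    if (seq == *last_seq) {
        return false;
    }
    *last_seq = seq;
    *enabled = dev->rxq_enabled;
    return true;
}
```

On strongly ordered CPUs the relaxed and acquire/release variants often
compile to the same instructions, which is why such bugs tend to show up
only on weakly ordered architectures.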

Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 14:48:14 -08:00
Greg Rose
e515e66a19 datapath: make generic netlink group const
Upstream commit:
    commit 48e48a70c08a8a68f8697f8b30cb83775bda8001
    Author: stephen hemminger <stephen@networkplumber.org>
    Date:   Wed Jul 16 11:25:52 2014 -0700

    openvswitch: make generic netlink group const

    Generic netlink tables can be const.

    Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

The equivalent tables in meter.c and conntrack.c are constified so
it should be safe to do the same for these and will improve
security as well.

Original patch slightly modified for out of tree module.

Passes check-kmod.
Passes Travis.
https://travis-ci.org/gvrose8192/ovs-experimental/builds/616880002

Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 14:36:49 -08:00
Darrell Ball
5623ed2d45 faq: Correct fragment reassembly release.
Correct fragment reassembly release for the userspace datapath.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 14:35:48 -08:00
Ben Pfaff
7b950521f5 ofproto-dpif-xlate: Restore table ID on error in xlate_table_action().
Found by inspection.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 12:41:07 -08:00
Ben Pfaff
9e9b4f5bb6 debian: Update list of copyright holders.
The list of copyright holders was incomplete and out of date.  This
updates it based on a "grep" for copyright notices, which I reviewed by
hand.

CC: 942056@bugs.debian.org
Reported-by: Chris Lamb <lamby@debian.org>
Reported-at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=942056
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 12:38:43 -08:00
Ben Pfaff
39b5e46312 Documentation: Convert multiple manpages to ReST.
Tested-by: Numan Siddique <numans@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 12:35:42 -08:00
David Marchand
aa34a87e7e sparse: Get rid of obsolete rte_flow header.
This header had been copied to cope with issues on the dpdk side.
Now that the problems have been fixed [1], let's drop this file as it is
now out of sync with dpdk.

1: https://git.dpdk.org/dpdk/commit/?id=fbb25a3878cc

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-12-02 19:41:09 +00:00
Linhaifeng
c9ee7c0e83 ofproto: fix stack-buffer-overflow
Should use flow->actions not &flow->actions.
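
The difference between the two expressions can be illustrated with a small
sketch.  The structure and helpers are hypothetical simplifications, not
the dpif or ofpbuf API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flow with a pointer to action bytes stored elsewhere. */
struct demo_flow {
    const unsigned char *actions;
    size_t actions_len;
};

/* Copies 'len' bytes from 'data' into 'dst'.  Returns the number of
 * bytes copied, or 0 if they do not fit. */
size_t
demo_clone_data(unsigned char *dst, size_t dst_size,
                const void *data, size_t len)
{
    assert(dst && data);
    if (len > dst_size) {
        return 0;
    }
    memcpy(dst, data, len);
    return len;
}

size_t
demo_copy_actions(const struct demo_flow *flow,
                  unsigned char *dst, size_t dst_size)
{
    /* Correct: 'flow->actions' is the address of the action bytes.
     *
     * Passing '&flow->actions' instead would be the address of the
     * pointer field itself, which is only sizeof(void *) bytes long,
     * so reading 'actions_len' bytes from it would run past the field
     * into neighbouring memory. */
    return demo_clone_data(dst, dst_size, flow->actions,
                           flow->actions_len);
}
```

Both expressions convert silently to 'const void *', so the compiler
cannot flag the mistake; a sanitizer run is what exposes it.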

Here is the ASAN report:
=================================================================
==57189==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffff428fa0e8 at pc 0xffff7f61a520 bp 0xffff428f9420 sp 0xffff428f9498
READ of size 196 at 0xffff428fa0e8 thread T150 (revalidator22)
    #0 0xffff7f61a51f in __interceptor_memcpy (/lib64/libasan.so.4+0xa251f)
    #1 0xaaaad26a3b2b in ofpbuf_put lib/ofpbuf.c:426
    #2 0xaaaad26a30cb in ofpbuf_clone_data_with_headroom lib/ofpbuf.c:248
    #3 0xaaaad26a2e77 in ofpbuf_clone_with_headroom lib/ofpbuf.c:218
    #4 0xaaaad26a2dc3 in ofpbuf_clone lib/ofpbuf.c:208
    #5 0xaaaad23e3993 in ukey_set_actions ofproto/ofproto-dpif-upcall.c:1640
    #6 0xaaaad23e3f03 in ukey_create__ ofproto/ofproto-dpif-upcall.c:1696
    #7 0xaaaad23e553f in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1806
    #8 0xaaaad23e65fb in ukey_acquire ofproto/ofproto-dpif-upcall.c:1984
    #9 0xaaaad23eb583 in revalidate ofproto/ofproto-dpif-upcall.c:2625
    #10 0xaaaad23dee5f in udpif_revalidator ofproto/ofproto-dpif-upcall.c:1076
    #11 0xaaaad26b84ef in ovsthread_wrapper lib/ovs-thread.c:708
    #12 0xffff7e74a8bb in start_thread (/lib64/libpthread.so.0+0x78bb)
    #13 0xffff7e0665cb in thread_start (/lib64/libc.so.6+0xd55cb)

Address 0xffff428fa0e8 is located in stack of thread T150 (revalidator22) at offset 328 in frame
    #0 0xaaaad23e4cab in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1762

  This frame has 4 object(s):
    [32, 96) 'actions'
    [128, 192) 'buf'
    [224, 328) 'full_flow'
    [384, 2432) 'stub' <== Memory access at offset 328 partially underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Thread T150 (revalidator22) created by T0 here:
    #0 0xffff7f5b0f7f in __interceptor_pthread_create (/lib64/libasan.so.4+0x38f7f)
    #1 0xaaaad26b891f in ovs_thread_create lib/ovs-thread.c:792
    #2 0xaaaad23dc62f in udpif_start_threads ofproto/ofproto-dpif-upcall.c:639
    #3 0xaaaad23daf87 in ofproto_set_flow_table ofproto/ofproto-dpif-upcall.c:446
    #4 0xaaaad230ff7f in dpdk_evs_cfg_set vswitchd/bridge.c:1134
    #5 0xaaaad2310097 in bridge_reconfigure vswitchd/bridge.c:1148
    #6 0xaaaad23279d7 in bridge_run vswitchd/bridge.c:3944
    #7 0xaaaad23365a3 in main vswitchd/ovs-vswitchd.c:240
    #8 0xffff7dfb1adf in __libc_start_main (/lib64/libc.so.6+0x20adf)
    #9 0xaaaad230a3d3  (/usr/sbin/ovs-vswitchd-2.7.0-1.1.RC5.001.asan+0x26f3d3)

SUMMARY: AddressSanitizer: stack-buffer-overflow (/lib64/libasan.so.4+0xa251f) in __interceptor_memcpy
Shadow bytes around the buggy address:
  0x200fe851f3c0: 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2 00 00 00 00
  0x200fe851f3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3f0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x200fe851f400: f2 f2 f2 f2 f8 f8 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2
=>0x200fe851f410: 00 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2
  0x200fe851f420: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==57189==ABORTING

Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Linhaifeng <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 11:19:26 -08:00
Ilya Maximets
30115809da dpif-netdev: Use netdev-offload API for port lookup while offloading.
Currently, while offloading, the userspace datapath tries to look up the netdev
in a local port list of the datapath interface instance.  However,
there is no guarantee that these netdevs are the same netdevs that the
netdev-offload module operates with and, as a result, there is no
guarantee that these netdev instances have an initialized flow API.

dpif-netdev should request ports from the netdev-offload module as
intended by the flow offloading API, in the same way as dpif-netlink does.
This will also give us performance benefits because we don't need to
hold global port mutex anymore.

We're not noticing any significant issues with current code, but
it will become a serious issue in the future, e.g. with offloading
for virtual tunneling ports.

Reported-by: Ophir Munk <ophirmu@mellanox.com>
Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Eli Britstein <elibr@mellanox.com>
2019-12-02 16:06:57 +01:00
Ilya Maximets
89ee730a60 ofproto-provider: Move datapath capabilities callback to correct section.
The 'get_datapath_cap' callback was mistakenly placed in the
'Connection tracking' section of the 'struct dpif_class'
while it belongs to the 'Datapath information' section.

CC: William Tu <u9012063@gmail.com>
Fixes: 27501802d09f ("ofproto-dpif: Expose datapath capability to ovsdb.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-11-28 16:51:44 +01:00
Ilya Maximets
9965fef8db dp-packet: Fix clearing/copying of memory layout flags.
'ol_flags' of DPDK mbuf could contain bits responsible for external
or indirect buffers which are not actually offload flags in a common
sense.  Clearing/copying of these flags could lead to memory leaks of
external memory chunks and crashes due to access to wrong memory.

OVS should not clear these flags while resetting offloads and also
should not copy them to the newly allocated packets.

This change is required to support DPDK 19.11, as some drivers may
return mbufs with these flags set.  However, it might be good to do
the same for DPDK 18.11, because these flags are present and should
be taken into account.

Fixes: 03f3f9c0faf8 ("dpdk: Update to use DPDK 18.11.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2019-11-28 16:50:10 +01:00
Ilya Maximets
b4c5f00c33 netdev-dpdk: Deprecate ring ports.
'dpdkr', a.k.a. DPDK ring ports, have really poor support in OVS and are not
tested on a regular basis.  These ports are intended to work via
shared memory with another DPDK secondary process, but there are lots
of limitations for using this functionality in practice.  Most of them
are connected with running a secondary DPDK application and memory layout
issues.  More details are available in the DPDK guide:
https://doc.dpdk.org/guides-18.11/prog_guide/multi_proc_support.html#multi-process-limitations

Besides the functional limitations, it's also hard to use this
functionality correctly.  The user must be sure that OVS and the secondary
DPDK application are running on different CPU cores, which is hard because
non-PMD threads could float over available CPU cores.  This or any
other misconfiguration will likely lead to a crash of OVS.

Another problem is that the user must actually build the secondary
application with the same version of DPDK that was used for OVS build.

The above issues are the same as those we have while using DPDK pdump.

Besides that, the current implementation in OVS is not able to free
allocated rings, which could lead to memory exhaustion.

Initially these ports were added for use with IVSHMEM for fast
zero-copy HOST<-->VM communication.  However, IVSHMEM is not used
anymore.  IVSHMEM support was removed from DPDK in 16.11 release
(instructions for IVSHMEM were removed from the OVS docs almost 3 years
ago by commit 90ca71dd317f ("doc: Remove ivshmem instructions.")) and
the patch for QEMU for using regular files as a device backend is no
longer available.  That makes DPDK ring ports barely useful in a real
virtualization environment.

This patch adds deprecation warnings for run-time port creation
and to the documentation.  The intention is to completely remove this
functionality from OVS in one of the next releases.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2019-11-28 16:35:30 +01:00