mirror of https://github.com/openvswitch/ovs synced 2025-08-30 22:05:19 +00:00
Commit Graph

17836 Commits

Author SHA1 Message Date
Tomasz Konieczny
c2c84474d4 netdev-dpdk: Fix flow control not configuring.
Currently, OVS is unable to change the flow control configuration in DPDK
because new settings are overwritten by the current settings read with
rte_eth_dev_flow_ctrl_get(). The fix restores the correct order of
operations and, at the same time, does not trigger an error on devices
without flow control support when flow control is not requested.

Fixes: 7e1de65e8d ("netdev-dpdk: Fix failure to configure flow control at netdev-init.")
Signed-off-by: Tomasz Konieczny <tomaszx.konieczny@intel.com>
Co-authored-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-11-04 17:23:47 +01:00
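Note on c2c84474d4: the ordering problem described above can be sketched as follows. This is only an illustration under stated assumptions: the toy_* types and functions are hypothetical stand-ins, not the DPDK API, and the real netdev-dpdk code may be structured differently. The point is that the current settings are read first, the requested values are overlaid afterwards, and a device without flow control support is an error only when flow control was actually requested.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a NIC and its flow control state.  These are
 * not the DPDK types or functions. */
struct toy_fc_conf {
    bool rx_pause;
    bool tx_pause;
    bool autoneg;
    unsigned int high_water;    /* A setting the request does not touch. */
};

struct toy_fc_request {
    bool rx_pause;
    bool tx_pause;
    bool autoneg;
};

struct toy_dev {
    bool fc_supported;
    struct toy_fc_conf hw;
};

static int toy_fc_get(const struct toy_dev *dev, struct toy_fc_conf *conf)
{
    if (!dev->fc_supported) {
        return -ENOTSUP;
    }
    *conf = dev->hw;
    return 0;
}

static int toy_fc_set(struct toy_dev *dev, const struct toy_fc_conf *conf)
{
    if (!dev->fc_supported) {
        return -ENOTSUP;
    }
    dev->hw = *conf;
    return 0;
}

/* Read the current settings, overlay the request, then write them back.
 * Reading after the overlay would silently discard the request. */
static int toy_configure_fc(struct toy_dev *dev,
                            const struct toy_fc_request *req)
{
    struct toy_fc_conf conf;
    int err;

    assert(dev && req);
    err = toy_fc_get(dev, &conf);
    if (err) {
        /* No flow control support: fail only if it was asked for. */
        bool wanted = req->rx_pause || req->tx_pause || req->autoneg;
        return wanted ? err : 0;
    }
    conf.rx_pause = req->rx_pause;
    conf.tx_pause = req->tx_pause;
    conf.autoneg = req->autoneg;
    return toy_fc_set(dev, &conf);
}
```

In this sketch, settings that the request does not mention (high_water here) survive because they come from the initial read.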
Darrell Ball
13ede8c112 faq: Fix meter action releases.
At the same time disambiguate some feature descriptions.
'Meters' is changed to 'Meter action' to clarify that the entry
describes the OpenFlow meter action rather than port-based meters.
'NAT' is changed to 'Conntrack NAT' to indicate that this entry
represents NAT done in 'conntrack', rather than basic OpenFlow
IP address and L4 port modifications.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-03 11:30:40 -08:00
Dmytro Linkin
36e50679a6 lib/tc: Fix flow dump for tunnel id equal zero
Tunnel id 0 is not printed unless tunnel flag FLOW_TNL_F_KEY is set.
Fix that by always setting FLOW_TNL_F_KEY when tunnel id is valid.

Fixes: 0227bf092e ("lib/tc: Support optional tunnel id")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-11-02 07:41:49 +01:00
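Note on 36e50679a6: a minimal sketch of why tunnel id 0 went missing from the dump. The names below (TOY_TNL_F_KEY, struct toy_tunnel, the toy_* functions) are hypothetical and only mirror the idea in the commit message: the printer keys off a flag, so the flag has to be set whenever the id is valid, including when the id is zero.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_TNL_F_KEY (1u << 0)     /* Hypothetical "tunnel id present". */

struct toy_tunnel {
    uint64_t id;
    uint16_t flags;
};

/* Marks the id as present whenever it is valid, even if it is zero. */
static void toy_set_tunnel_id(struct toy_tunnel *tnl, bool id_valid,
                              uint64_t id)
{
    if (id_valid) {
        tnl->id = id;
        tnl->flags = (uint16_t) (tnl->flags | TOY_TNL_F_KEY);
    }
}

/* Prints the id only if the flag is set, as a flow dump would. */
static void toy_format_tunnel(const struct toy_tunnel *tnl, char *buf,
                              size_t size)
{
    assert(size > 0);
    if (tnl->flags & TOY_TNL_F_KEY) {
        snprintf(buf, size, "tun_id=0x%" PRIx64, tnl->id);
    } else {
        buf[0] = '\0';
    }
}
```

Testing "id != 0" instead of "id is valid" when setting the flag would reproduce the reported behaviour for tunnel id 0.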
Ilya Maximets
ac4b6e83de travis: Fix skipping of python SSL tests.
After this change we'll have only one Windows-related skipped test
in the default build.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-11-01 23:47:07 +01:00
Ilya Maximets
5ae6f976d2 travis: Workaround skipping of IPv6 tests.
IPv6 support is disabled in TravisCI images but is supported by the kernel.
So, we can enable it in order to not skip unit tests.
We are not trying to communicate over the network with IPv6, so this
should not do any harm.

Related issue: https://github.com/travis-ci/travis-ci/issues/8891

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-11-01 23:46:47 +01:00
Ashish Varma
3e613cd81c ofp-monitor: Fixed the usage of 'usable_protocols' variable in 'parse_flow_monitor_request' function.
'usable_protocols' is now set to OFPUTIL_P_OF10_ANY on return from the
'parse_flow_monitor_request' function. The calling function now checks the
value of this variable against the 'allowed_protocols' variable.
Also, a check is added that returns an error for a match field that is not
supported in OpenFlow 1.0.
The man page of ovs-ofctl is modified to reflect Flow Monitor support as an
OpenFlow 1.0 Nicira extension only.

Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-11-01 14:35:00 -07:00
Greg Rose
d5ac962823 Revert "ip_gre: Remove even more unused code"
This reverts commit 42a059e02b.

Not all the necessary ipgre prefixed code was removed that
should have been.  Another patch will follow with the correct
removed code.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-11-01 19:02:10 +01:00
Ilya Maximets
9793898efb travis: Enable pdump for DPDK build.
OVS has support for DPDK pdump that is checked in the configure script.
Enable it to increase OVS build test coverage by the code guarded
by the DPDK_PDUMP macro.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-11-01 12:35:45 +01:00
Greg Rose
42a059e02b ip_gre: Remove even more unused code
There is a confusing mix of ipgre and gretap functions with some
needed for gretap still having ipgre_ prefixes.  This time though
I think I got the rest of the unused ipgre code.

Fixes: d5822f4288 ("gre: Remove dead ipgre code")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-31 15:57:34 -07:00
Greg Rose
adf9ac69ae ip_gre: Removed unused ipgre netdev ops
When cleaning up unused ipgre code the ipgre_netdev_ops structure
was missed. Get rid of it now.

Fixes: d5822f4288 ("gre: Remove dead ipgre code")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-31 14:08:02 -07:00
Ben Pfaff
75ad1cd6e9 Avoid indeterminate statistics in offload implementations.
A lot of the offload implementations didn't bother to initialize the
statistics they were supposed to return.  I don't know whether any of
the callers actually use them, but it looked wrong.

Found by inspection.

Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-31 10:06:14 -07:00
Ben Pfaff
cd7ea52172 AUTHORS: Add Gowrishankar Muthukrishnan.
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-30 10:51:44 -07:00
Gowrishankar Muthukrishnan
a773a5c90c lacp: warn transmit failure of lacp pdu
It can be difficult to tell whether an LACP PDU tx (as in a
response) was successful. The PDU may not be transmitted by the
egress slave for various reasons (including resource contention
within the NIC), and the only way to trace its fate is by looking at
ofproto->stats.tx_[packets/bytes] and the slave port stats.

Adding a warning when there is a tx failure could help the user
debug the root of this problem.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-30 10:50:27 -07:00
Damijan Skvarc
7139ca5521 ovsdb-execute: Remove unused variable from ovsdb_execute_mutate().
Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-30 10:33:52 -07:00
William Tu
e50547b51a netdev-afxdp: Add need_wakeup support.
The patch adds support for using the need_wakeup flag in AF_XDP rings.
A new option, use-need-wakeup, is added.  When this option is used,
it means that OVS has to explicitly wake up the kernel RX, using the poll()
syscall, and wake up TX, using the sendto() syscall. This feature improves
performance by avoiding unnecessary sendto syscalls for TX.
For RX, instead of the kernel always busy-spinning on the fill queue, OVS
wakes up the kernel RX processing when the fill queue is replenished.

The need_wakeup feature is merged into the Linux kernel bpf-next tree with
commit 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
and OVS enables it by default, if libbpf supports it.  If users enable it but
run with an older version of libbpf, then the need_wakeup feature has no
effect, and a warning message is logged.

For a virtual interface, it's better to set use-need-wakeup=false, since
the virtual device's AF_XDP xmit is synchronous: the sendto syscall
enters the kernel and processes the TX packet on the tx queue directly.

On Intel Xeon E5-2620 v3 2.4GHz system, performance of physical port
to physical port improves from 6.1Mpps to 7.3Mpps.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-10-29 19:26:59 +01:00
Roi Dayan
ed1617406c rhel: openvswitch-fedora.spec.in: Fix output redirect to null device
Add missing slash.

Fixes: 0447019df7 ("fedora-spec: added systemd post/postun/pre/preun sections")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-10-29 09:45:55 +01:00
Ilya Maximets
acc5df0e3c dpif-netdev: Fix time delta overflow in case of race for meter lock.
There is a race window between getting the time and getting the meter
lock.  This could lead to a situation where the thread with the larger
current time (this thread called time_{um}sec() later than others)
acquires the meter lock first and updates meter->used to the large
value.  The next threads will try to calculate the time delta by
subtracting the large meter->used from their lower time, getting a
negative value which will be converted to a big unsigned delta.

Fix that by assuming that all these threads received packets at the
same time in this case, i.e. clamping a negative delta to 0.

CC: Jarno Rajahalme <jarno@ovn.org>
Fixes: 4b27db644a ("dpif-netdev: Simple DROP meter implementation.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363126.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-10-28 13:38:37 +01:00
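Note on acc5df0e3c: the overflow and the clamp can be shown in a few lines. This is a sketch, not the dpif-netdev code; toy_naive_delta() and toy_clamped_delta() are hypothetical names, and timestamps are modelled as plain long long values.

```c
#include <assert.h>
#include <stdint.h>

/* Naive version: a timestamp taken before 'used' was updated by another
 * thread yields a negative difference, which wraps to a huge unsigned
 * value when converted. */
static uint64_t toy_naive_delta(long long now, long long used)
{
    return (uint64_t) (now - used);
}

/* Clamped version: a negative difference is treated as "no time has
 * passed", i.e. the packets are assumed to have arrived together. */
static uint64_t toy_clamped_delta(long long now, long long used)
{
    long long delta = now - used;

    return delta > 0 ? (uint64_t) delta : 0;
}
```

With now = 90 and used = 100, the naive version returns a value close to 2^64 while the clamped version returns 0; for now = 100 and used = 90 both return 10.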
Ilya Maximets
18ae34ae1f dpif-netdev: Do not mix recirculation depth into RSS hash itself.
Mixing the RSS hash with the recirculation depth is useful for flow lookup
because the same packet after recirculation should match a different
datapath rule.  Setting the mixed value back to the packet is
completely unnecessary because the recirculation depth is different on
each recirculation, i.e. we will have a different packet hash for
flow lookup anyway.

This should fix the issue that packets from the same flow could be
directed to different buckets based on a dp_hash or different ports of
a balanced bonding in case they were recirculated different number of
times (e.g. due to conntrack rules).
With this change, the original RSS hash will remain the same making
it possible to calculate equal dp_hash values for such packets.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363127.html
Fixes: 048963aa85 ("dpif-netdev: Reset RSS hash when recirculating.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
2019-10-28 13:38:37 +01:00
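Note on 18ae34ae1f: a sketch of the idea, assuming a toy packet structure and an arbitrary mixing function (neither is taken from OVS). The lookup hash depends on the recirculation depth, but the hash stored in the packet is left untouched, so later consumers such as dp_hash or bond balancing still see the original RSS hash.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet carrying only the NIC-provided RSS hash. */
struct toy_packet {
    uint32_t rss_hash;
};

/* Arbitrary mixer used only for illustration.  Each step is a bijection
 * on 32-bit values, so different depths give different results for the
 * same input hash. */
static uint32_t toy_mix(uint32_t hash, uint32_t depth)
{
    uint32_t h = hash ^ (depth * 0x9e3779b9u);

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    return h;
}

/* Hash used for flow lookup at the given recirculation depth.  The
 * packet is const: the mixed value is never written back. */
static uint32_t toy_lookup_hash(const struct toy_packet *pkt, uint32_t depth)
{
    assert(pkt);
    return depth ? toy_mix(pkt->rss_hash, depth) : pkt->rss_hash;
}
```

Two packets of the same flow that were recirculated a different number of times get different lookup hashes here, yet both keep the same rss_hash field.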
Aliasgar Ginwala
6e26ff6337 command-line: New function ovs_cmdl_env_parse_all().
This function allows an environment variable to be included in
command-line parsing.  It will receive its first user in an
upcoming commit.

Signed-off-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-25 12:47:20 -07:00
Damijan Skvarc
dee6478d4a ovsdb-server: fix memory leak while converting database
Memory leak happens while converting existing database into new
database according to the specified schema (ovsdb-client convert
new-schema). Memory leak was detected by valgrind while executing
functional test "schema conversion online - clustered"

==16202== 96 bytes in 6 blocks are definitely lost in loss record 326 of 399
==16202==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16202==    by 0x44A5D4: xmalloc (util.c:138)
==16202==    by 0x4377A6: alloc_default_atoms (ovsdb-data.c:315)
==16202==    by 0x437F18: ovsdb_datum_init_default (ovsdb-data.c:918)
==16202==    by 0x413D82: ovsdb_row_create (row.c:59)
==16202==    by 0x40AA53: ovsdb_convert_table (file.c:220)
==16202==    by 0x40AA53: ovsdb_convert (file.c:275)
==16202==    by 0x416BE1: ovsdb_trigger_try (trigger.c:255)
==16202==    by 0x40D29E: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1119)
==16202==    by 0x40D29E: ovsdb_jsonrpc_session_got_request (jsonrpc-server.c:986)
==16202==    by 0x40D29E: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556)
==16202==    by 0x40D29E: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586)
==16202==    by 0x40D29E: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401)
==16202==    by 0x40682E: main_loop (ovsdb-server.c:209)
==16202==    by 0x40682E: main (ovsdb-server.c:460)

The problem was in ovsdb_datum_convert() function, which overrides
pointers to datum memory allocated in ovsdb_row_create() function.
Fix was done by freeing this memory before ovsdb_datum_convert()
is called.

Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-25 10:50:34 -07:00
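Note on dee6478d4a: the leak pattern described above (a conversion routine overwriting a pointer that already owns memory) can be sketched like this. The toy_datum type and functions are hypothetical simplifications, not the ovsdb data structures; the sketch only shows why freeing before the overwrite removes the leak.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified datum: an owned array of ints. */
struct toy_datum {
    int *values;
    size_t n;
};

/* Allocates default contents, as row creation would. */
static void toy_datum_init_default(struct toy_datum *d)
{
    d->n = 1;
    d->values = calloc(d->n, sizeof *d->values);
    assert(d->values);
}

static void toy_datum_destroy(struct toy_datum *d)
{
    free(d->values);
    d->values = NULL;
    d->n = 0;
}

/* Overwrites dst->values without looking at its previous value.  Calling
 * this on an initialized datum leaks the old allocation. */
static void toy_datum_convert(struct toy_datum *dst,
                              const struct toy_datum *src)
{
    assert(src->n > 0);
    dst->n = src->n;
    dst->values = malloc(src->n * sizeof *src->values);
    assert(dst->values);
    memcpy(dst->values, src->values, src->n * sizeof *src->values);
}

/* Leak-free use: release the default contents before converting. */
static void toy_convert_row(struct toy_datum *dst,
                            const struct toy_datum *src)
{
    toy_datum_destroy(dst);
    toy_datum_convert(dst, src);
}
```

Running the version without toy_datum_destroy() under valgrind would be expected to report the default allocation as definitely lost, similar in spirit to the trace in the commit message.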
Timothy Redaelli
9e334d91b3 docs: To build OVS on RHEL7 EPEL is needed
Since Python 3 is now mandatory, Extra Packages for Enterprise Linux
(EPEL) repository is needed in order to build OVS on RHEL7.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-25 10:28:28 -07:00
Ilya Maximets
54e2baec09 flow: Fix crash on vlan packets with partial offloading.
parse_tcp_flags() does not care about vlan tags in a packet and thus is
not able to parse them.  As a result, if partial offloading is
enabled in the userspace datapath, vlan packets are not parsed, i.e.
they have no initialized offsets.  This causes an OVS crash on any attempt
to access/modify packet header fields.

For example, having the flow with following actions:
  in_port=1,ip,actions=mod_nw_src:192.168.0.7,output:IN_PORT

will lead to OVS crash on vlan packet handling:

 Process terminating with default action of signal 11 (SIGSEGV)
 Invalid read of size 4
    at 0x785657: get_16aligned_be32 (unaligned.h:249)
    by 0x785657: odp_set_ipv4 (odp-execute.c:82)
    by 0x785657: odp_execute_masked_set_action (odp-execute.c:527)
    by 0x785657: odp_execute_actions (odp-execute.c:894)
    by 0x74CDA9: dp_netdev_execute_actions (dpif-netdev.c:7355)
    by 0x74CDA9: packet_batch_per_flow_execute (dpif-netdev.c:6339)
    by 0x74CDA9: dp_netdev_input__ (dpif-netdev.c:6845)
    by 0x74DB6E: dp_netdev_input (dpif-netdev.c:6854)
    by 0x74DB6E: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
    by 0x74E863: dpif_netdev_run (dpif-netdev.c:5264)
    by 0x703F57: type_run (ofproto-dpif.c:370)
    by 0x6EC8B8: ofproto_type_run (ofproto.c:1760)
    by 0x6DA52B: bridge_run__ (bridge.c:3188)
    by 0x6E083F: bridge_run (bridge.c:3252)
    by 0x1642E4: main (ovs-vswitchd.c:127)
  Address 0xc is not stack'd, malloc'd or (recently) free'd

Fix that by properly parsing vlan tags first.  Function 'parse_dl_type'
was transformed for that purpose as it had no users anyway.

Added unit test for packet modification with partial offloading that
triggers above crash.

Fixes: aab96ec4d8 ("dpif-netdev: retrieve flow directly from the flow mark")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-10-25 18:06:50 +02:00
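Note on 54e2baec09: "parsing vlan tags first" can be illustrated with a small standalone parser. This is a sketch under assumptions, not the OVS parse_dl_type() implementation: toy_l3_offset() is a hypothetical function that walks over any 802.1Q/802.1ad tags and reports where the L3 header starts, which is the offset a later header modification would rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_HDR_LEN   14        /* dst MAC + src MAC + EtherType. */
#define TOY_VLAN_TAG_LEN  4         /* TCI + encapsulated EtherType. */
#define TOY_ETH_TYPE_8021Q  0x8100
#define TOY_ETH_TYPE_8021AD 0x88a8

static uint16_t toy_read_be16(const uint8_t *p)
{
    return (uint16_t) ((p[0] << 8) | p[1]);
}

/* Returns the offset of the L3 header and stores the final EtherType in
 * '*eth_type'.  Returns 0 if the frame is too short to parse. */
static size_t toy_l3_offset(const uint8_t *frame, size_t len,
                            uint16_t *eth_type)
{
    size_t ofs = TOY_ETH_HDR_LEN;
    uint16_t type;

    assert(frame && eth_type);
    if (len < TOY_ETH_HDR_LEN) {
        return 0;
    }
    type = toy_read_be16(frame + 12);
    while (type == TOY_ETH_TYPE_8021Q || type == TOY_ETH_TYPE_8021AD) {
        if (len < ofs + TOY_VLAN_TAG_LEN) {
            return 0;
        }
        /* The encapsulated EtherType follows the 2-byte TCI. */
        type = toy_read_be16(frame + ofs + 2);
        ofs += TOY_VLAN_TAG_LEN;
    }
    *eth_type = type;
    return ofs;
}
```

For an untagged IPv4 frame the function returns 14; for a single-tagged one it returns 18. A parser that ignores the tag would leave the L3 offset unset for the tagged frame, which matches the crash scenario in the commit message.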
Ilya Maximets
51528c2994 tests: Fix indentation in userspace packet type aware test.
CC: Ben Pfaff <blp@ovn.org>
Fixes: 7be29a4757 ("ofproto-dpif: Remove tabs from output.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-10-25 15:50:29 +00:00
Yifeng Sun
454dc17c75 rhel: Support RHEL7.7 build and packaging
This patch provides essential fixes for OVS to support
RHEL7.7's new kernel.

make rpm-fedora-kmod \
RPMBUILD_OPT='-D "kversion 3.10.0-1062.1.2.el7.x86_64"'

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-24 14:26:21 -07:00
Numan Siddique
cec7005bde ovsdb-server: Allow replication from older schema version servers.
Presently, replication is not allowed if there is a schema version mismatch between
the schema returned by the active ovsdb-server and the local db schema. This is
causing failures in OVN DB HA deployments during upgrades.

In the case of OpenStack tripleo deployment with OVN, OVN DB ovsdb-servers are
deployed on a multi node controller cluster in active/standby mode. During
minor updates or major upgrades, the cluster is updated one at a time. If
a node A is running active OVN DB ovsdb-servers and when it is updated, another
node B becomes active. After the update when OVN DB ovsdb-servers in A are started,
these ovsdb-servers fail to replicate from the active if there is a schema
version mismatch.

This patch addresses this issue by allowing replication even if there is a
schema version mismatch, but only if all the active db schema tables and their
columns are present in the local db schema.

This should not result in any data loss.

Signed-off-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-24 14:25:40 -07:00
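Note on cec7005bde: the compatibility rule stated above (every table and column of the active schema must exist in the local schema) is a subset check. The sketch below uses hypothetical toy_* types rather than the ovsdb schema structures, and it ignores column types, which the real check may or may not compare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified schema model: tables with named columns. */
struct toy_table {
    const char *name;
    const char *const *columns;
    size_t n_columns;
};

struct toy_schema {
    const struct toy_table *tables;
    size_t n_tables;
};

static const struct toy_table *
toy_find_table(const struct toy_schema *schema, const char *name)
{
    for (size_t i = 0; i < schema->n_tables; i++) {
        if (!strcmp(schema->tables[i].name, name)) {
            return &schema->tables[i];
        }
    }
    return NULL;
}

static bool toy_has_column(const struct toy_table *table, const char *name)
{
    for (size_t i = 0; i < table->n_columns; i++) {
        if (!strcmp(table->columns[i], name)) {
            return true;
        }
    }
    return false;
}

/* True if every table and column of 'active' also exists in 'local'. */
static bool toy_can_replicate(const struct toy_schema *active,
                              const struct toy_schema *local)
{
    assert(active && local);
    for (size_t i = 0; i < active->n_tables; i++) {
        const struct toy_table *at = &active->tables[i];
        const struct toy_table *lt = toy_find_table(local, at->name);

        if (!lt) {
            return false;
        }
        for (size_t j = 0; j < at->n_columns; j++) {
            if (!toy_has_column(lt, at->columns[j])) {
                return false;
            }
        }
    }
    return true;
}
```

A standby with a newer schema that only adds columns can replicate from an older active server under this rule, while the reverse direction is rejected.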
Surya Rudra
f04977508f lldp: Fix for OVS crashes when a LLDP-enabled port is deleted
Issue:
When LLDP is enabled on a port, a structure to hold LLDP related state
is created and that structure has a reference to the port. The ofproto
monitor thread accesses the LLDP structure to periodically send packets
over the associated port. When the port is deleted, the LLDP structure
is not cleaned up and it continues to refer to the deleted port.

When the monitor thread attempts to access the deleted port OVS crashes.
Crash can happen with bridge delete and bond delete also.

Fix:
Remove all references to the LLDP structure and free it when
the port is deleted.

Signed-off-by: Surya Rudra <rudrasurya.r@altencalsoftlabs.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-24 12:49:57 -07:00
Ben Pfaff
49df3c0fe7 docs: DPDK isn't a datapath, so don't use the term.
The DPDK library allows OVS fast access to packet I/O in userspace.  It
is not a datapath.  This commit avoids using that term.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-23 12:38:19 -07:00
Ben Pfaff
653eedff20 faq: Give specific versions that introduced various features.
Some users would find it useful to know the particular OVS version that
introduced a feature to the OVS tree kernel module or to the OVS
userspace (DPDK) datapath implementation.  This patch updates the FAQ
to include that information.

This information is primarily gleaned from the top-level NEWS file.
For most of these, I did not verify them by looking carefully through
the history, so some of them may be inaccurate, although a few people
made corrections in review.

Requested-by: Jianjun Shen <shenj@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-23 11:52:15 -07:00
Gowrishankar Muthukrishnan
423416f587 lacp: report desync in ovs threads enabling slave
It is helpful to report when the main thread has yet to enable a bond slave
although the link state was already brought up by the lacp thread, and to
capture this desync between ovs threads for debugging.

Fixes: a8448cb170 ("lacp: Avoid packet drop on LACP bond after link up")
Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-23 10:24:39 -07:00
Aaron Conole
1051576cf2 ovs-tcpundump: allow multiple packet lengths
The tcpundump tool expects all packets to be a length which aligns to
exactly a 4-nibble boundary.  This means packets like DNS requests will be
stripped before being correctly processed.  Fix this by allowing at least
two nibbles (or one byte) alignment.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-23 10:22:11 -07:00
Aaron Conole
c691cffb03 ovs-tcpundump: exit when getting version
Running 'ovs-tcpundump -V' will cause ovs-tcpundump to start processing on
stdin.  Instead, print the version and exit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-23 10:22:11 -07:00
Ilya Maximets
15ba075d39 netdev-dpdk: Fix Tx queue false sharing.
The 'tx_q' array is allocated for each DPDK netdev.  'struct dpdk_tx_queue'
is 8 bytes long, so 8 tx queues share the same cache line in
case of a 64B cacheline size.  This causes a 'false sharing' issue in
the multiqueue case because taking the spinlock implies a write to memory,
i.e. cache invalidation.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
2019-10-22 19:29:24 +02:00
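Note on 15ba075d39: the commit message as shown describes the problem but not the remedy. One common way to avoid this kind of false sharing, assumed here purely for illustration, is to align each queue structure to a cache line so that no two queues share one. The struct names and the 64-byte constant below are hypothetical and do not come from netdev-dpdk.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define TOY_CACHE_LINE_SIZE 64      /* Assumed, as in the 64B case above. */

/* 8-byte queue: eight of these fit into a single 64-byte cache line, so
 * locking one queue invalidates the line holding seven others. */
struct toy_packed_txq {
    int32_t lock;
    int32_t map;
};

/* Cache-line-aligned queue: the alignment of the first member raises the
 * alignment of the whole struct, and sizeof is rounded up to match. */
struct toy_padded_txq {
    alignas(TOY_CACHE_LINE_SIZE) int32_t lock;
    int32_t map;
};

static_assert(sizeof(struct toy_packed_txq) == 8,
              "eight packed queues share one cache line");
static_assert(sizeof(struct toy_padded_txq) == TOY_CACHE_LINE_SIZE,
              "each padded queue occupies its own cache line");
```

The trade-off is memory: an array of padded queues is eight times larger than the packed one, in exchange for each spinlock living on its own line.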
Ilya Maximets
d5ad58044a travis: Test build with afxdp.
We can't easily update the kernel on TravisCI to run system tests
with AF_XDP, but we could run build tests with libbpf and headers
from newer kernels.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
2019-10-21 22:07:00 +02:00
Numan Siddique
a6028d3abb rhel: Remove the cond 'build_python3'
A previous patch removed python2 support from ovs. So we can remove
this condition and make python3 mandatory for builds. Without this
patch, make rpm-fedora on centos 7 fails unless we pass
RPMBUILD_OPT="--with build_python3".

Signed-off-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-21 09:05:17 -07:00
William Tu
7e0c91eb07 debian and rhel: Add libunwind dev package.
This patch adds the libunwind dev package to debian and rhel.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
2019-10-18 17:15:23 -07:00
Bhargava Shastry
0d8fb0211c ossfuzz: Simplify miniflow fuzzer harness.
Google's oss-fuzz builder bots were complaining that miniflow_target is
too slow to fuzz in that some tests take longer than a second to
complete. This patch fixes this by replacing the random flow generation
within the harness with a simpler scenario.

Signed-off-by: Bhargava Shastry <bshas3@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:58:53 -07:00
Yi-Hung Wei
b8c57b098d datapath: Allow attaching helper in later commit
Upstream commit:
commit 248d45f1e1934f7849fbdc35ef1e57151cf063eb
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date:   Fri Oct 4 09:26:44 2019 -0700

    openvswitch: Allow attaching helper in later commit

    This patch allows to attach conntrack helper to a confirmed conntrack
    entry.  Currently, we can only attach alg helper to a conntrack entry
    when it is in the unconfirmed state.  This patch enables an use case
    that we can firstly commit a conntrack entry after it passed some
    initial conditions.  After that the processing pipeline will further
    check a couple of packets to determine if the connection belongs to
    a particular application, and attach alg helper to the connection
    in a later stage.

    Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
337da25b5a datapath: Fix log message in ovs conntrack
Upstream commit:
commit 12c6bc38f99bb168b7f16bdb5e855a51a23ee9ec
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date:   Wed Aug 21 17:16:10 2019 -0700

    openvswitch: Fix log message in ovs conntrack

    Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
    Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
0b26ee1b47 datapath: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)
Backports the following upstream commit with some backward compatibility
change.

commit f319ca6557c10a711facc4dd60197470796d3ec1
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Wed May 8 08:52:32 2019 +0200

    openvswitch: Replace removed NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)

    Commit 4806e975729f99c7 ("netfilter: replace NF_NAT_NEEDED with
    IS_ENABLED(CONFIG_NF_NAT)") removed CONFIG_NF_NAT_NEEDED, but a new user
    popped up afterwards.

    Fixes: fec9c271b8f1bde1 ("openvswitch: load and reference the NAT helper.")
    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Acked-by: Florian Westphal <fw@strlen.de>
    Acked-by: Flavio Leitner <fbl@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Colin Ian King
5a6069e3f4 datapath: Check for null pointer return from nla_nest_start_noflag
upstream commit:

commit ca96534630e2edfd73121c487c957b17eca3b7d7
Author: Colin Ian King <colin.king@canonical.com>
Date:   Wed May 1 14:41:58 2019 +0100

    openvswitch: check for null pointer return from nla_nest_start_noflag

    The call to nla_nest_start_noflag can return null in the unlikely
    event that nla_put returns -EMSGSIZE.  Check for this condition to
    avoid a null pointer dereference on pointer nla_reply.

    Addresses-Coverity: ("Dereference null return value")
    Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
4c941202f7 datapath: Load and reference the NAT helper.
This commit backports the following upstream commit, and two functions
in nf_conntrack_helper.h.

Upstream commit:
commit fec9c271b8f1bde1086be5aa415cdb586e0dc800
Author: Flavio Leitner <fbl@redhat.com>
Date:   Wed Apr 17 11:46:17 2019 -0300

    openvswitch: load and reference the NAT helper.

    This improves the original commit 17c357efe5ec ("openvswitch: load
    NAT helper") where it unconditionally tries to load the module for
    every flow using NAT, so not efficient when loading multiple flows.
    It also doesn't hold any references to the NAT module while the
    flow is active.

    This change fixes those problems. It will try to load the module
    only if it's not present. It grabs a reference to the NAT module
    and holds it while the flow is active. Finally, an error message
    shows up if either actions above fails.

    Fixes: 17c357efe5ec ("openvswitch: load NAT helper")
    Signed-off-by: Flavio Leitner <fbl@redhat.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
f1e9590e81 datapath: genetlink: optionally validate strictly/dumps
This patch backports the following upstream commit within the
openvswitch kernel module with some checks so that it also works
in the older kernel.

Upstream commit:
commit ef6243acb4782df587a4d7d6c310fa5b5d82684b
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Fri Apr 26 14:07:31 2019 +0200

    genetlink: optionally validate strictly/dumps

    Add options to strictly validate messages and dump messages,
    sometimes perhaps validating dump messages non-strictly may
    be required, so add an option for that as well.

    Since none of this can really be applied to existing commands,
    set the options everywhere using the following spatch:

        @@
        identifier ops;
        expression X;
        @@
        struct genl_ops ops[] = {
        ...,
         {
                .cmd = X,
        +       .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
                ...
         },
        ...
        };

    For new commands one should just not copy the .validate 'opt-out'
    flags and thus get strict validation.

    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
09c3399616 datapath: Use nla_nest_start_noflag()
This patch backports the openvswitch changes and updates the compat layer
for the following upstream patch.

commit ae0be8de9a53cda3505865c11826d8ff0640237c
Author: Michal Kubecek <mkubecek@suse.cz>
Date:   Fri Apr 26 11:13:06 2019 +0200

    netlink: make nla_nest_start() add NLA_F_NESTED flag

    Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
    netlink based interfaces (including recently added ones) are still not
    setting it in kernel generated messages. Without the flag, message parsers
    not aware of attribute semantics (e.g. wireshark dissector or libmnl's
    mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
    the structure of their contents.

    Unfortunately we cannot just add the flag everywhere as there may be
    userspace applications which check nlattr::nla_type directly rather than
    through a helper masking out the flags. Therefore the patch renames
    nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
    as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
    are rewritten to use nla_nest_start().

    Except for changes in include/net/netlink.h, the patch was generated using
    this semantic patch:

    @@ expression E1, E2; @@
    -nla_nest_start(E1, E2)
    +nla_nest_start_noflag(E1, E2)

    @@ expression E1, E2; @@
    -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
    +nla_nest_start(E1, E2)

    Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
    Acked-by: Jiri Pirko <jiri@mellanox.com>
    Acked-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
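Note on 09c3399616: the rename described in the upstream message can be modelled without any kernel headers. The toy_* functions below are hypothetical and operate on a bare 16-bit attribute type instead of an skb; only the flag bit values mirror the kernel's NLA_F_NESTED and NLA_F_NET_BYTEORDER. The sketch shows the wrapper relationship and why a reader that compares the raw type field, without masking the flags, would be confused by the new flag.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NLA_F_NESTED        (1u << 15)
#define TOY_NLA_F_NET_BYTEORDER (1u << 14)
#define TOY_NLA_TYPE_MASK \
    ((uint16_t) ~(TOY_NLA_F_NESTED | TOY_NLA_F_NET_BYTEORDER))

static_assert(TOY_NLA_F_NESTED == 0x8000, "flag lives in the top bit");

/* Old behaviour: the type is emitted as given, with no nested flag. */
static uint16_t toy_nest_start_noflag(uint16_t attrtype)
{
    return attrtype;
}

/* New behaviour: a thin wrapper that adds the nested flag. */
static uint16_t toy_nest_start(uint16_t attrtype)
{
    return toy_nest_start_noflag((uint16_t) (attrtype | TOY_NLA_F_NESTED));
}

/* What a well-behaved parser does: mask out the flag bits. */
static uint16_t toy_nla_type(uint16_t raw_type)
{
    return (uint16_t) (raw_type & TOY_NLA_TYPE_MASK);
}
```

Existing call sites that must stay wire-compatible keep using the _noflag variant, which is what this backport does for the openvswitch module.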
Yi-Hung Wei
d42fb06d76 datapath: Handle NF_NAT_NEEDED replacement
Starting from the following upstream commit, NF_NAT_NEEDED is replaced
by IS_ENABLED(CONFIG_NF_NAT) in the upstream kernel. This patch makes
some changes so that our in-tree ovs kernel module is compatible with
both old and new kernels.

Upstream commit:
commit 4806e975729f99c7908d1688a143f1e16d464e6c
Author: Florian Westphal <fw@strlen.de>
Date:   Wed Mar 27 09:22:26 2019 +0100

    netfilter: replace NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)

    NF_NAT_NEEDED is true whenever nat support for either ipv4 or ipv6 is
    enabled.  Now that the af-specific nat configuration switches have been
    removed, IS_ENABLED(CONFIG_NF_NAT) has the same effect.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Flavio Leitner
0e34479d2b datapath: add seqadj extension when NAT is used.
upstream patch:

commit fa7e428c6b7ed3281610511a2b2ec716d9894be8
Author: Flavio Leitner <fbl@sysclose.org>
Date:   Mon Mar 25 15:58:31 2019 -0300

    openvswitch: add seqadj extension when NAT is used.

    When the conntrack is initialized, there is no helper attached
    yet so the nat info initialization (nf_nat_setup_info) skips
    adding the seqadj ext.

    A helper is attached later when the conntrack is not confirmed
    but is going to be committed. In this case, if NAT is needed then
    adds the seqadj ext as well.

    Fixes: 16ec3d4fbb96 ("openvswitch: Fix cached ct with helper.")
    Signed-off-by: Flavio Leitner <fbl@sysclose.org>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
9ea96dce45 datapath: Detect upstream nf_nat change
The following two upstream commits merge nf_nat_ipv4 and nf_nat_ipv6
into nf_nat core, and move some header files around.  To handle
these modifications, this patch detects the upstream changes, uses
the header files and config symbols properly.

Ideally, we should replace CONFIG_NF_NAT_IPV4 and CONFIG_NF_NAT_IPV6 with
CONFIG_NF_NAT and CONFIG_IPV6.  In order to keep backward compatibility,
we keep the checking of CONFIG_NF_NAT_IPV4/6 as is for the old kernel,
and replace them with macros for the new kernel.

upstream commits:
3bf195ae6037 ("netfilter: nat: merge nf_nat_ipv4,6 into nat core")
d2c5c103b133 ("netfilter: nat: remove nf_nat_l3proto.h and nf_nat_core.h")

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
c1d728dbde datapath: Replace nf_ct_invert_tuplepr() with nf_ct_invert_tuple()
After upstream net-next commit 303e0c558959 ("netfilter: conntrack:
avoid unneeded nf_conntrack_l4proto lookups") nf_ct_invert_tuplepr()
is no longer available in the kernel.

Ideally, we should be in sync with the upstream kernel by calling
nf_ct_invert_tuple() directly in conntrack.c.  However,
nf_ct_invert_tuple() has a different function signature in older
kernels, and it would be hard to replace that in the compat layer.
Thus, we use rpl_nf_ct_invert_tuple() in conntrack.c and maintain
compatibility in the compat layer so that the OVS kernel module builds
and runs on both new and old kernels.
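
A minimal sketch of this wrapper pattern, written as standalone C so it can
be compiled outside the kernel.  Everything named sketch_* or SKETCH_* is a
hypothetical stand-in (the tuple type, the protocol handle, the feature
macro and both mock "kernel" functions); it is not the code added by this
patch, only an illustration of how one rpl_ entry point can hide two
different signatures from the caller.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's conntrack types; not the real
 * struct nf_conntrack_tuple or struct nf_conntrack_l4proto. */
struct sketch_tuple {
    unsigned int src;
    unsigned int dst;
};

struct sketch_l4proto {
    int id;
};

/* Hypothetical feature macro.  A build-time check would define it when
 * the kernel already provides the newer, two-argument function. */
/* #define SKETCH_HAVE_NEW_INVERT_TUPLE 1 */

#ifdef SKETCH_HAVE_NEW_INVERT_TUPLE
/* Mock of the newer kernel function: no protocol argument. */
static bool sketch_kernel_invert_tuple(struct sketch_tuple *inverse,
                                       const struct sketch_tuple *orig)
{
    inverse->src = orig->dst;
    inverse->dst = orig->src;
    return true;
}

/* Wrapper: forwards directly to the kernel function. */
static inline bool rpl_sketch_invert_tuple(struct sketch_tuple *inverse,
                                           const struct sketch_tuple *orig)
{
    return sketch_kernel_invert_tuple(inverse, orig);
}
#else
/* Mock of the older kernel function: needs a protocol handle. */
static bool sketch_kernel_invert_tuple(struct sketch_tuple *inverse,
                                       const struct sketch_tuple *orig,
                                       const struct sketch_l4proto *l4proto)
{
    if (!l4proto) {
        return false;
    }
    inverse->src = orig->dst;
    inverse->dst = orig->src;
    return true;
}

/* Wrapper: supplies the extra argument so callers use one signature. */
static inline bool rpl_sketch_invert_tuple(struct sketch_tuple *inverse,
                                           const struct sketch_tuple *orig)
{
    /* Hypothetical protocol lookup, reduced to a constant here. */
    static const struct sketch_l4proto generic = { 0 };

    return sketch_kernel_invert_tuple(inverse, orig, &generic);
}
#endif
```

With this shape the caller only ever uses the two-argument
rpl_sketch_invert_tuple(), and the choice between signatures stays in one
place in the compat layer.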

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Arnd Bergmann
719197e73b datapath: Fix linking without CONFIG_NF_CONNTRACK_LABELS
upstream commit:
commit a277d516de5f498c91d91189717ef7e01102ad27
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Fri Nov 2 16:36:55 2018 +0100

    openvswitch: fix linking without CONFIG_NF_CONNTRACK_LABELS

    When CONFIG_CC_OPTIMIZE_FOR_DEBUGGING is enabled, the compiler
    fails to optimize out a dead code path, which leads to a link failure:

    net/openvswitch/conntrack.o: In function `ovs_ct_set_labels':
    conntrack.c:(.text+0x2e60): undefined reference to `nf_connlabels_replace'

    In this configuration, we can take a shortcut, and completely
    remove the conntrack label code. This may also help the regular
    optimization.

    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
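
The shortcut described in the upstream message can be illustrated with a
small standalone sketch.  The names below (SKETCH_CONNTRACK_LABELS,
sketch_set_labels, sketch_connlabels_replace) are hypothetical and the
disabled branch simply returns 0; the actual upstream change may be
structured differently.  The point is only that a preprocessor guard
removes the reference to the missing symbol at every optimization level,
instead of relying on the compiler to discard a dead branch.

```c
#include <stddef.h>

/* Hypothetical switch standing in for CONFIG_NF_CONNTRACK_LABELS.
 * It is left undefined here, i.e. label support is "disabled". */
/* #define SKETCH_CONNTRACK_LABELS 1 */

#ifdef SKETCH_CONNTRACK_LABELS
/* Hypothetical external helper, only linked in when labels are enabled. */
int sketch_connlabels_replace(const unsigned char *labels, size_t len);

/* Enabled: apply the labels if any byte is non-zero. */
static int sketch_set_labels(const unsigned char *labels, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (labels[i]) {
            return sketch_connlabels_replace(labels, len);
        }
    }
    return 0;
}
#else
/* Disabled: the label code is compiled out entirely, so no reference to
 * sketch_connlabels_replace() is ever emitted and linking succeeds. */
static int sketch_set_labels(const unsigned char *labels, size_t len)
{
    (void) labels;
    (void) len;
    return 0;
}
#endif
```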

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:39 -07:00
Greg Rose
efcae7df6c datapath: compat: drop bridge nf reset from nf_reset
Upstream commit:
    commit 895b5c9f206eb7d25dc1360a8ccfc5958895eb89
    Author: Florian Westphal <fw@strlen.de>
    Date:   Sun Sep 29 20:54:03 2019 +0200

    netfilter: drop bridge nf reset from nf_reset

    commit 174e23810cd31
    ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi
    recycle always drop skb extensions.  The additional skb_ext_del() that is
    performed via nf_reset on napi skb recycle is not needed anymore.

    Most nf_reset() calls in the stack are there so queued skb won't block
    'rmmod nf_conntrack' indefinitely.

    This removes the skb_ext_del from nf_reset, and renames it to a more
    fitting nf_reset_ct().

    In a few selected places, add a call to skb_ext_reset to make sure that
    no active extensions remain.

    I am submitting this for "net", because we're still early in the release
    cycle.  The patch applies to net-next too, but I think the rename causes
    needless divergence between those trees.

    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Added some compat layer fixups for nf_reset_ct.  This is just a portion
of the upstream commit that applies to openvswitch.

Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-10-18 16:22:12 +02:00
Pablo Neira Ayuso
a2bdad676f datapath: rename flow_stats to sw_flow_stats
Upstream commit:
    commit aef833c58d321f09ae4ce4467723542842ba9faf
    Author: Pablo Neira Ayuso <pablo@netfilter.org>
    Date:   Fri Jul 19 18:20:13 2019 +0200

    net: openvswitch: rename flow_stats to sw_flow_stats

    There is a flow_stats structure defined in include/net/flow_offload.h
    and a follow up patch adds #include <net/flow_offload.h> to
    net/sch_generic.h.

    This breaks compilation since OVS codebase includes net/sock.h which
    pulls in linux/filter.h which includes net/sch_generic.h.

    In file included from ./include/net/sch_generic.h:18:0,
                     from ./include/linux/filter.h:25,
                     from ./include/net/sock.h:59,
                     from ./include/linux/tcp.h:19,
                     from net/openvswitch/datapath.c:24

    This definition takes precedence on OVS since it is placed in the
    networking core, so rename flow_stats in OVS to sw_flow_stats since
    this structure is contained in sw_flow.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Acked-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-10-18 16:22:12 +02:00