2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-29 21:38:13 +00:00

405 Commits

Author SHA1 Message Date
Roi Dayan
c0a1df2e3f compat: Add compat fix for old kernels
In kernels older than 4.8, struct tcf_t didn't have the firstuse.
If openvswitch is compiled with the compat pkt_cls.h then there is
a struct size mismatch between openvswitch and the kernel which cause
parsing netlink actions to fail.
After this commit parsing the netlink actions pass even if compiled with
the compat pkt_cls.h.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-11-11 11:46:53 +01:00
William Tu
e50547b51a netdev-afxdp: Add need_wakeup support.
The patch adds support for using need_wakeup flag in AF_XDP rings.
A new option, use-need-wakeup, is added.  When this option is used,
it means that OVS has to explicitly wake up the kernel RX, using poll()
syscall and wake up TX, using sendto() syscall. This feature improves
the performance by avoiding unnecessary sendto syscalls for TX.
For RX, instead of kernel always busy-spinning on fille queue, OVS wakes
up the kernel RX processing when fill queue is replenished.

The need_wakeup feature is merged into Linux kernel bpf-next tee with commit
77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings") and
OVS enables it by default, if libbpf supports it.  If users enable it but
runs in an older version of libbpf, then the need_wakeup feature has no effect,
and a warning message is logged.

For virtual interface, it's better set use-need-wakeup=false, since
the virtual device's AF_XDP xmit is synchronous: the sendto syscall
enters kernel and process the TX packet on tx queue directly.

On Intel Xeon E5-2620 v3 2.4GHz system, performance of physical port
to physical port improves from 6.1Mpps to 7.3Mpps.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-10-29 19:26:59 +01:00
Ben Pfaff
49df3c0fe7 docs: DPDK isn't a datapath, so don't use the term.
The DPDK library allows OVS fast access to packet I/O in userspace.  It
is not a datapath.  This commit avoids using that term.

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-23 12:38:19 -07:00
Yi-Hung Wei
4c941202f7 datapath: Load and reference the NAT helper.
This commit backports the following upstream commit, and two functions
in nf_conntrack_helper.h.

Upstream commit:
commit fec9c271b8f1bde1086be5aa415cdb586e0dc800
Author: Flavio Leitner <fbl@redhat.com>
Date:   Wed Apr 17 11:46:17 2019 -0300

    openvswitch: load and reference the NAT helper.

    This improves the original commit 17c357efe5ec ("openvswitch: load
    NAT helper") where it unconditionally tries to load the module for
    every flow using NAT, so not efficient when loading multiple flows.
    It also doesn't hold any references to the NAT module while the
    flow is active.

    This change fixes those problems. It will try to load the module
    only if it's not present. It grabs a reference to the NAT module
    and holds it while the flow is active. Finally, an error message
    shows up if either actions above fails.

    Fixes: 17c357efe5ec ("openvswitch: load NAT helper")
    Signed-off-by: Flavio Leitner <fbl@redhat.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
f1e9590e81 datapath: genetlink: optionally validate strictly/dumps
This patch backports the following upstream commit within the
openvswitch kernel module with some checks so that it also works
in the older kernel.

Upstream commit:
commit ef6243acb4782df587a4d7d6c310fa5b5d82684b
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Fri Apr 26 14:07:31 2019 +0200

    genetlink: optionally validate strictly/dumps

    Add options to strictly validate messages and dump messages,
    sometimes perhaps validating dump messages non-strictly may
    be required, so add an option for that as well.

    Since none of this can really be applied to existing commands,
    set the options everwhere using the following spatch:

        @@
        identifier ops;
        expression X;
        @@
        struct genl_ops ops[] = {
        ...,
         {
                .cmd = X,
        +       .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
                ...
         },
        ...
        };

    For new commands one should just not copy the .validate 'opt-out'
    flags and thus get strict validation.

    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
09c3399616 datapath: Use nla_nest_start_noflag()
This patch backports the openvswitch changes and update the compat layer
for the following upstream patch.

commit ae0be8de9a53cda3505865c11826d8ff0640237c
Author: Michal Kubecek <mkubecek@suse.cz>
Date:   Fri Apr 26 11:13:06 2019 +0200

    netlink: make nla_nest_start() add NLA_F_NESTED flag

    Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
    netlink based interfaces (including recently added ones) are still not
    setting it in kernel generated messages. Without the flag, message parsers
    not aware of attribute semantics (e.g. wireshark dissector or libmnl's
    mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
    the structure of their contents.

    Unfortunately we cannot just add the flag everywhere as there may be
    userspace applications which check nlattr::nla_type directly rather than
    through a helper masking out the flags. Therefore the patch renames
    nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
    as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
    are rewritten to use nla_nest_start().

    Except for changes in include/net/netlink.h, the patch was generated using
    this semantic patch:

    @@ expression E1, E2; @@
    -nla_nest_start(E1, E2)
    +nla_nest_start_noflag(E1, E2)

    @@ expression E1, E2; @@
    -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
    +nla_nest_start(E1, E2)

    Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
    Acked-by: Jiri Pirko <jiri@mellanox.com>
    Acked-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
d42fb06d76 datapath: Handle NF_NAT_NEEDED replacement
Starting from the following upstream commit, NF_NAT_NEEDED is replaced
by IS_ENABLED(CONFIG_NF_NAT) in the upstream kernel. This patch makes
some changes so that our in tree ovs kernel module is compatible to
both old and new kernels.

Upstream commit:
commit 4806e975729f99c7908d1688a143f1e16d464e6c
Author: Florian Westphal <fw@strlen.de>
Date:   Wed Mar 27 09:22:26 2019 +0100

    netfilter: replace NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)

    NF_NAT_NEEDED is true whenever nat support for either ipv4 or ipv6 is
    enabled.  Now that the af-specific nat configuration switches have been
    removed, IS_ENABLED(CONFIG_NF_NAT) has the same effect.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
9ea96dce45 datapath: Detect upstream nf_nat change
The following two upstream commits merge nf_nat_ipv4 and nf_nat_ipv6
into nf_nat core, and move some header files around.  To handle
these modifications, this patch detects the upstream changes, uses
the header files and config symbols properly.

Ideally, we should replace CONFIG_NF_NAT_IPV4 and CONFIG_NF_NAT_IPV6 with
CONFIG_NF_NAT and CONFIG_IPV6.  In order to keep backward compatibility,
we keep the checking of CONFIG_NF_NAT_IPV4/6 as is for the old kernel,
and replace them with marco for the new kernel.

upstream commits:
3bf195ae6037 ("netfilter: nat: merge nf_nat_ipv4,6 into nat core")
d2c5c103b133 ("netfilter: nat: remove nf_nat_l3proto.h and nf_nat_core.h")

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Yi-Hung Wei
c1d728dbde datapath: Replace nf_ct_invert_tuplepr() with nf_ct_invert_tuple()
After upstream net-next commit 303e0c558959 ("netfilter: conntrack:
avoid unneeded nf_conntrack_l4proto lookups") nf_ct_invert_tuplepr()
is no longer available in the kernel.

Ideally, we should be in sync with upstream kernel by calling
nf_ct_invert_tuple() directly in conntrack.c.  However,
nf_ct_invert_tuple() has different function signature in older kernel,
and it would be hard to replace that in the compat layer. Thus, we
use rpl_nf_ct_invert_tuple() in conntrack.c and maintain compatibility
in the compat layer so that ovs kernel module runs smoothly in both
new and old kernel.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-18 10:55:47 -07:00
Greg Rose
efcae7df6c datapath: compat: drop bridge nf reset from nf_reset
Upstream commmit:
    commit 895b5c9f206eb7d25dc1360a8ccfc5958895eb89
    Author: Florian Westphal <fw@strlen.de>
    Date:   Sun Sep 29 20:54:03 2019 +0200

    netfilter: drop bridge nf reset from nf_reset

    commit 174e23810cd31
    ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi
    recycle always drop skb extensions.  The additional skb_ext_del() that is
    performed via nf_reset on napi skb recycle is not needed anymore.

    Most nf_reset() calls in the stack are there so queued skb won't block
    'rmmod nf_conntrack' indefinitely.

    This removes the skb_ext_del from nf_reset, and renames it to a more
    fitting nf_reset_ct().

    In a few selected places, add a call to skb_ext_reset to make sure that
    no active extensions remain.

    I am submitting this for "net", because we're still early in the release
    cycle.  The patch applies to net-next too, but I think the rename causes
    needless divergence between those trees.

    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Added some compat layer fixups for nf_reset_ct.  This is just a portion
of the upstream commit that applies to openvswitch.

Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-10-18 16:22:12 +02:00
Ben Pfaff
36e5d97f9b ovs-vlan-bug-workaround: Remove.
This workaround only applied to kernels earlier than 2.6.37, but OVS
only supports 3.10 and later.

As the original author of this code, I won't miss it.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-14 15:34:53 -07:00
Ben Pfaff
e5273084d2 Fix "the the" typo in two places.
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-09 15:41:53 -07:00
Greg Rose
9048ab600b acinclude: Fix false positive search for prandom_u32
Searching random.h for prandom_u32 will also match when prandom_u32_max
is present and cause a false positive HAVE_PRANDOM_U32.  Fix this up
by looking for the parenthesis following prandom_u32 so it won't
match on prandom_u32_max.

Passes Travis:
https://travis-ci.org/gvrose8192/ovs-experimental/builds/595171808

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-10-08 10:48:06 -07:00
Yi-Hung Wei
2fc8309bd6 datapath: compat: Backport nf_conntrack_timeout support
This patch brings in nf_ct_timeout_put() and nf_ct_set_timeout()
when it is not available in the kernel.

Three symbols are created in acinclude.m4.

* HAVE_NF_CT_SET_TIMEOUT is used to determine if upstream net-next commit
717700d183d65 ("netfilter: Export nf_ct_{set,destroy}_timeout()") is
availabe.  If it is defined, the kernel should have all the
nf_conntrack_timeout support that OVS needs.

* HAVE_NF_CT_TIMEOUT is used to check if upstream net-next commit
6c1fd7dc489d9 ("netfilter: cttimeout: decouple timeout policy from
nfnetlink_cttimeout object") is there.  If it is not defined, we
will use the old ctnl_timeout interface rather than the nf_ct_timeout
interface that is introduced in this commit.

* HAVE_NF_CT_TIMEOUT_FIND_GET_HOOK_NET is used to check if upstream
commit 19576c9478682 ("netfilter: cttimeout: add netns support") is
there, so that we pass different arguement based on whether the kernel
has netns support.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:50:17 -07:00
Yifeng Sun
4bfdefea09 datapath: compat: Backports bugfixes for nf_conncount
This patch backports several critical bug fixes related to
locking and data consistency in nf_conncount code.

This backport is based on the following upstream net-next upstream commits.
a007232 ("netfilter: nf_conncount: fix argument order to find_next_bit")
c80f10b ("netfilter: nf_conncount: speculative garbage collection on empty lists")
2f971a8 ("netfilter: nf_conncount: move all list iterations under spinlock")
df4a902 ("netfilter: nf_conncount: merge lookup and add functions")
e8cfb37 ("netfilter: nf_conncount: restart search when nodes have been erased")
f7fcc98 ("netfilter: nf_conncount: split gc in two phases")
4cd273b ("netfilter: nf_conncount: don't skip eviction when age is negative")
c78e781 ("netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS")
d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()")
53ca0f2 ("netfilter: nf_conncount: remove wrong condition check routine")
3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of list.")
31568ec ("netfilter: nf_conncount: fix list_del corruption in conn_free")
fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock")

This patch adds additional compat code so that it can build on
all supported kernel versions.

In addition, this patch helps OVS datapath to always choose bug-fixed
nf_conncount code. If kernel already has these fixes, then kernel's
nf_conncount is being used. Otherwise, OVS falls back to use compat
nf_conncount functions.

Travis tests are at
https://travis-ci.org/yifsun/ovs-travis/builds/569056850
On latest RHEL kernel, 'make check-kmod' runs good.

VMware-BZ: #2396471

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-28 17:49:49 -07:00
John Hurley
abef79598c compat: add compatibility headers for tc mpls action
OvS includes compat code for several TC actions including vlan, mirred and
tunnel key. MPLS actions have recently been added to TC in the kernel. In
preparation for adding TC offload code for MPLS, add the MPLS compat code.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-08-01 18:09:42 +02:00
William Tu
0de1b42596 netdev-afxdp: add new netdev type for AF_XDP.
The patch introduces experimental AF_XDP support for OVS netdev.
AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
type built upon the eBPF and XDP technology.  It is aims to have comparable
performance to DPDK but cooperate better with existing kernel's networking
stack.  An AF_XDP socket receives and sends packets from an eBPF/XDP program
attached to the netdev, by-passing a couple of Linux kernel's subsystems
As a result, AF_XDP socket shows much better performance than AF_PACKET
For more details about AF_XDP, please see linux kernel's
Documentation/networking/af_xdp.rst. Note that by default, this feature is
not compiled in.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
2019-07-19 17:42:06 +03:00
Greg Rose
6d97adeea9 compat: Clean up gre_calc_hlen
It's proliferated throughout three .c files so let's pull them all
together in gre.h where the inline function belongs. This requires
some adjustments to the compat layer so that the various iterations
of gre_calc_hlen and ip_gre_calc_hlen since the 3.10 kernel are
handled correctly.

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-07-03 11:16:23 -07:00
Yifeng Sun
2adada0e3d datapath: Support kernel version 5.0.x
This patch updated acinclude.m4 so that OVS can be compiled on
5.0.x kernels.
This patch also updated travis files so that 5.0.x kernel versions
are used during travis test builds.
Besides, NEWS and releases.rst are also updated to reflect this
new support.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-13 10:03:11 -07:00
Petr Machata
140c8971c3 net: core: dev: Add extack argument to dev_change_flags()
Upstream commit:
    commit 567c5e13be5cc74d24f5eb54cf353c2e2277189b
    Author: Petr Machata <petrm@mellanox.com>
    Date:   Thu Dec 6 17:05:42 2018 +0000

    net: core: dev: Add extack argument to dev_change_flags()

    In order to pass extack together with NETDEV_PRE_UP notifications, it's
    necessary to route the extack to __dev_open() from diverse (possibly
    indirect) callers. One prominent API through which the notification is
    invoked is dev_change_flags().

    Therefore extend dev_change_flags() with and extra extack argument and
    update all users. Most of the calls end up just encoding NULL, but
    several sites (VLAN, ipvlan, VRF, rtnetlink) do have extack available.

    Since the function declaration line is changed anyway, name the other
    function arguments to placate checkpatch.

    Signed-off-by: Petr Machata <petrm@mellanox.com>
    Acked-by: Jiri Pirko <jiri@mellanox.com>
    Reviewed-by: Ido Schimmel <idosch@mellanox.com>
    Reviewed-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

This patch backports the above upstream patch and also adds fixes
in compat code.

Cc: Petr Machata <petrm@mellanox.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-13 10:03:11 -07:00
Michał Mirosław
9feb5bda27 OVS: remove use of VLAN_TAG_PRESENT
Upstream commits:
    (1) commit 9df46aefafa6dee81a27c2a9d8ba360abd8c5fe3
    Author: Michał Mirosław <mirq-linux@rere.qmqm.pl>
    Date:   Thu Nov 8 18:44:50 2018 +0100

    OVS: remove use of VLAN_TAG_PRESENT

    This is a minimal change to allow removing of VLAN_TAG_PRESENT.
    It leaves OVS unable to use CFI bit, as fixing this would need
    a deeper surgery involving userspace interface.

    Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
    Signed-off-by: David S. Miller <davem@davemloft.net>

    (2) commit 6083e28aa02d7c9e6b87f8b944e92793094ae047
    Author: Michał Mirosław <mirq-linux@rere.qmqm.pl>
    Date:   Sat Nov 10 19:55:34 2018 +0100

    OVS: remove VLAN_TAG_PRESENT - fixup

    It turns out I missed one VLAN_TAG_PRESENT in OVS code while rebasing.
    This fixes it.

    Fixes: 9df46aefafa6 ("OVS: remove use of VLAN_TAG_PRESENT")
    Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
    Signed-off-by: David S. Miller <davem@davemloft.net>

This patch backports the above upstream patch to OVS and adds
extra checking in kernel module's compat code.

Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-13 10:03:11 -07:00
Yifeng Sun
8b7cc75261 datapath: Check extack argument of rtnl_create_link()
Upstream commit d0522f1cd25edb796548f91e04766fa3cbc3b6df ("net:
Add extack argument to rtnl_create_link") added new argument
to rtnl_create_link(). This introduced compiling errors in
the code of kernel datapath.

This patch fixes this issue.

Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-13 10:03:11 -07:00
Ilya Maximets
346c653823 acinclude: Add vector defines to sparse.
By adding compiler default flags for vector instructions to
cgcc we'll be able to check the same sources that we're building.
Also, this will allow to avoid re-defining these flags and
types specifically for "sparse" includes.

"sparse" headers "bmi2intrin.h" and "emmintrin.h" dropped as
not needed anymore.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-06-06 18:54:57 +03:00
Aaron Conole
8c7130da98 compat: add SCTP netfilter states for older kernels
Bake in the SCTP states from the kernel UAPI.  This means an older
revision of the kernel headers won't interfere with the SCTP display
enhancement.  Additionally, if a newer version is available, or if
x-compiling the datapath module we defer to that version (since this
is just meant to provide the missing definitions).

This will be used in a future commit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-24 09:50:53 -07:00
Yifeng Sun
d58b59c17c datapath: Support kernel version 4.19.x and 4.20.x
This patch updated acinclude.m4 so that OVS can be compiled on 4.19.x
and 4.20.x kernels.
This patch also updated travis files so that latest kernel versions
are used during travis test builds.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-10 12:58:15 -07:00
Gao Feng
7685ce31cf netfilter: Remove useless param helper of nf_ct_helper_ext_add
Upstream commit:
    commit 440534d3c56be04abfb26850ee882d19d223557a
    Author: Gao Feng <gfree.wind@vip.163.com>
    Date:   Mon Jul 9 18:06:33 2018 +0800

    netfilter: Remove useless param helper of nf_ct_helper_ext_add

    The param helper of nf_ct_helper_ext_add is useless now, then remove
    it now.

    Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

This patch backports the above upstream patch to OVS.

Cc: Gao Feng <gfree.wind@vip.163.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-10 12:58:08 -07:00
Florian Westphal
7857a9b4fc datapath: Use new header file net/ipv6_frag.h
Upstream commit:
    commit 70b095c84326640eeacfd69a411db8fc36e8ab1a
    Author: Florian Westphal <fw@strlen.de>
    Date:   Sat Jul 14 01:14:01 2018 +0200

    ipv6: remove dependency of nf_defrag_ipv6 on ipv6 module

    IPV6=m
    DEFRAG_IPV6=m
    CONNTRACK=y yields:

    net/netfilter/nf_conntrack_proto.o: In function `nf_ct_netns_do_get':
    net/netfilter/nf_conntrack_proto.c:802: undefined reference to `nf_defrag_ipv6_enable'
    net/netfilter/nf_conntrack_proto.o:(.rodata+0x640): undefined reference to `nf_conntrack_l4proto_icmpv6'

    Setting DEFRAG_IPV6=y causes undefined references to ip6_rhash_params
    ip6_frag_init and ip6_expire_frag_queue so it would be needed to force
    IPV6=y too.

    This patch gets rid of the 'followup linker error' by removing
    the dependency of ipv6.ko symbols from netfilter ipv6 defrag.

    Shared code is placed into a header, then used from both.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

This patch backports the above upstream patch to OVS.

Cc: Florian Westphal <fw@strlen.de>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-10 12:57:51 -07:00
Florian Westphal
4fdec8986a datapath: Pass nf_hook_state to nf_conntrack_in()
Upstream Commit:
    commit 93e66024b0249cec81e91328c55a754efd3192e0
    Author: Florian Westphal <fw@strlen.de>
    Date:   Wed Sep 12 15:19:07 2018 +0200

    netfilter: conntrack: pass nf_hook_state to packet and error handlers

    nf_hook_state contains all the hook meta-information: netns, protocol family,
    hook location, and so on.

    Instead of only passing selected information, pass a pointer to entire
    structure.

    This will allow to merge the error and the packet handlers and remove
    the ->new() function in followup patches.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

This patch backports the above upstream patch to OVS and fixes compiling
errors on RHEL kernels.

Cc: Florian Westphal <fw@strlen.de>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-10 12:57:44 -07:00
Yifeng Sun
945d6d1c11 datapath: Handle removal of nf_conntrack_l3proto.h
Upstream kernel commit a0ae2562 ("netfilter: conntrack: remove l3proto
abstraction") removed header file net/netfilter/nf_conntrack_l3proto.h.
This patch detects it and fixes compilation errors of OVS on 4.19+ kernels.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-10 12:57:36 -07:00
Yifeng Sun
8f6d230fcd datapath: Fix compiling error for 4.14.111+ kernel
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Fixes: f72469405eec9 ("datapath: meter: Use struct_size() in kzalloc()")
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-27 22:15:00 -07:00
Ben Pfaff
0cdd5b13de sparse: Configure target operating system and fix fallout.
cgcc, the "sparse" wrapper that OVS uses, can be told the host architecture
or the host OS or both.  Until now, OVS has told it the host architecture
because it is fairly common that it doesn't guess it automatically.  Until
now, OS has not told it the host OS, assuming that it would get it right.
However, it doesn't--if you tell it the host OS or the host architecture,
it doesn't really have a default for the other.  This means that on Linux
(presumably the only OS where sparse works properly for OVS), it was not
defining __linux__, which caused some weird behavior.

This commit adds a flag to the cgcc invocation to make it define __linux__
on Linux, and it fixes some errors that this would otherwise cause.

Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-24 08:37:08 -07:00
Gustavo A. R. Silva
f72469405e datapath: meter: Use struct_size() in kzalloc()
Upstream commit:
    commit c5c3899de09e307e3a0999ab8d620ab0ede05aa1
    Author: Gustavo A. R. Silva <gustavo@embeddedor.com>
    Date:   Tue Jan 15 15:19:17 2019 -0600

    openvswitch: meter: Use struct_size() in kzalloc()

    One of the more common cases of allocation size calculations is finding the
    size of a structure that has a zero-sized array at the end, along with
    memory for some number of elements for that array. For example:

    struct foo {
        int stuff;
        struct boo entry[];
    };

    instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);

    Instead of leaving these open-coded and prone to type mistakes, we can now
    use the new struct_size() helper:

    instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

    This code was detected with the help of Coccinelle.

    Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Use of struct_size() needed some compat layer adjustments to make use
of this new macro.  This patch pulls in some of the needed support
from the linux mm.h and overflow.h header files.  This new header
file support is also necessary for the following patch that converts
to use of kvmalloc().

Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-16 13:58:19 -07:00
Ilya Maximets
e2f3751186 acinclude: Use AC_SEARCH_LIBS for linking with dl.
DPDK uses dlopen to load plugins and we need to search for
library containing this function. But we should not do this
in a loop because 'AC_SEARCH_LIBS' could do this for us.
Also, 'AC_SEARCH_LIBS' prints user-visible messages that are
useful for debuging.
Also added the new 'checking' message and code normalized to
be more readable.

With this change we'll have following additional messages:

  checking for library containing dlopen... -ldl
  checking whether linking with dpdk works... yes

Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-12 10:51:37 -07:00
Ilya Maximets
f0d1ead8b1 acinclude: Transparent checking for DPDK dependencies.
'AC_CHECK_DECL' makes almost same thing as 'AC_COMPILE_IFELSE', but
looks more pretty. Additionally it prints checking results in a
user-visible way making it easy to understand which configs checked
and why we need one or another dependency.

For exmaple, with this patch, configure log may look like this:

  checking whether dpdk datapath is enabled... yes
  checking for rte_config.h... yes
  checking whether RTE_LIBRTE_VHOST_NUMA is declared... no
  checking whether RTE_EAL_NUMA_AWARE_HUGEPAGES is declared... yes
  checking for library containing get_mempolicy... -lnuma
  checking whether RTE_LIBRTE_VHOST_NUMA is declared... (cached) no
  checking whether RTE_LIBRTE_PMD_PCAP is declared... yes
  checking for library containing pcap_dump... -lpcap
  checking whether RTE_LIBRTE_PDUMP is declared... yes
  checking whether RTE_LIBRTE_MLX5_PMD is declared... no
  checking whether RTE_LIBRTE_MLX4_PMD is declared... yes
  checking whether RTE_LIBRTE_MLX4_DLOPEN_DEPS is declared... yes

Instead of just:

  checking whether dpdk datapath is enabled... yes
  checking for rte_config.h... yes
  checking for library containing get_mempolicy... -lnuma
  checking for library containing pcap_dump... -lpcap

Anyway, code looks more clean and easier to understand. Also, with
this change we're defining VHOST_NUMA only if RTE_LIBRTE_VHOST_NUMA
defined. This costs nothing as all the checks with 'AC_CHECK_DECL'
are cached.

Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-12 10:51:32 -07:00
John Hurley
7808b2b971 compat: add compatibility headers for tc skbedit action
OvS includes compat code for several TC actions including vlan, mirred and
tunnel key. Add support for using skbedit actions when compiling
user-space code against older kernel headers.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-04-09 17:33:14 +02:00
Eli Britstein
3f7a64db48 acinclude: Omit unnecessary define
Commit fc3b425fa02f ("acinclude: Include libmnl when needed") added
unnecessary include of DPDK_MNL. Omit it.

Fixes: fc3b425fa02f ("acinclude: Include libmnl when needed")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-18 16:10:13 +00:00
Eli Britstein
84844e3f22 acinclude: Include libverbs and libmlx4 when needed
DPDK 18.11 uses libverbs and libmlx4 when MLX4 PMD is enabled.

This commit makes OVS to link to libverbs and libmlx4 when MLX4 PMD is
enabled on DPDK.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-18 16:09:31 +00:00
Eli Britstein
939bad9ada acinclude: Include libverbs and libmlx5 when needed
DPDK 18.11 uses libverbs and libmlx5 when MLX5 PMD is enabled.

This commit makes OVS to link to libverbs and libmlx5 when MLX5 PMD is
enabled on DPDK.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-18 16:07:39 +00:00
Christian Ehrhardt
5b5aa2d8a5 acinclude: Also use LIBS from dpkg pkg-config
DPDK 18.11 builds using the more modern meson build system no more
provide the -ldpdk linker script. Instead it is expected to use
pkgconfig for linker options as well.

This change will set DPDK_LIB from pkg-config (if pkg-config was
available) and since that already carries the whole-archive flags
around the PMDs skips the further wrapping in more whole-archive
if that is already part of DPDK_LIB.

To work reliable in all environments this needs pkg-config 0.29.1.
We want to be able to use PKG_CHECK_MODULES_STATIC which
is not yet available in 0.24. Therefore update pkg.m4
to pkg-config 0.29.1.

This should be backport-safe as these macro files are all versioned.
autoconf is smart enough to check the version if you have it locally,
and if the system's is higher, it will use that one instead.

Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-13 20:15:11 +00:00
Ilya Maximets
865cee2baa acinclude: Check for rte_config.h before checking dependencies.
Current ./configure script shows misleading errors in case of wrong
DPDK path:

  # ./configure --with-dpdk=/wrong/path
  ...
  checking whether dpdk datapath is enabled... yes
  checking for library containing get_mempolicy... -lnuma
  checking for library containing pcap_dump... -lpcap
  checking for library containing mnl_attr_put... no
  configure: error: unable to find libmnl, install the dependency package

This happens because we're not checking for headers before checking
for dependencies. All the compile attempts fails and script thinks
that we need more dependencies.

With this change script will check for 'rte_config.h' availability
and produce sane error message:

  # ./configure --with-dpdk=/wrong/path
  ...
  checking for rte_config.h... no
  configure: error: unable to find rte_config.h in /wrong/path

'AC_INCLUDES_DEFAULT' passed explicitly to avoid preprocessor test.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-12 16:47:47 +00:00
Ilya Maximets
4b3c58f36c acinclude: Use NUMA_AWARE_HUGEPAGES too for libnuma check.
This fixes build with NUMA_AWARE_HUGEPAGES enabled and VHOST_NUMA
disabled. This should not be a usual case. But it's possible to
configure DPDK this way.

Fixes: 5e925ccc2a6f ("netdev-dpdk: DPDK v17.11 upgrade")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-12 11:58:00 +00:00
Ilya Maximets
47ab42a0b5 acinclude: Drop DPDK_EXTRA_LIB variable.
AC_SEARCH_LIBS enables the libraries itself:

  checking for library containing get_mempolicy... -lnuma
  checking for library containing pcap_dump... -lpcap

So, they are available in LIBS. No need to add them twice.

Also, DPDK_EXTRA_LIB doesn't even work, because each check overwrites
the variable instead of appending the new library. It was first time
misused while making libnuma optional and copy-pasted to several places
after that.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-07 14:50:19 +00:00
Yifeng Sun
7c84d7f481 datapath: Add support for kernel 4.18.x
No code changing is necessary to support 4.18.x.

Only one kernel test failed and it is in the process of being fixed.

Updated .travis.yml to include 4.18.x and also use latest 4.17 version.
Updated test files to test 4.18 kernel.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:42:02 -08:00
Greg Rose
4a90b277ba compat: Fixup ipv6 fragmentation on 4.9.135+ kernels
Upstream commit 648700f76b03 ("inet: frags: use rhashtables...") changed
how ipv6 fragmentation is implemented.  This patch was backported to
the upstream stable 4.9.x kernel starting at 4.9.135.

This patch creates the compatibility layer changes required to both
compile and also operate correctly with ipv6 fragmentation on these
kernels. Check if the inet_frags 'rnd' field is present to key on
whether the upstream patch is present.  Also update Travis to the
latest 4.9 kernel release so that this patch is compile tested.

Passes Travis:
https://travis-ci.org/gvrose8192/ovs-experimental/builds/478033409

Cc: William Tu <u9012063@gmail.com>
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
Cc: Yifeng Sun <pkusunyifeng@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-01-22 17:08:17 -08:00
Yifeng Sun
eca4cde025 nf_conntrack_proto: Fix HAVE_NET_NS_GET macro for nf_conntrack
In previous code, macro HAVE_NET_NS_SET is used in code but
never generated by config. This patch fixes it.

Fixes: 179fccce34db ("compat: Backport nf_ct_netns_{get, put}()")
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-12-27 11:13:27 -08:00
Yi-Hung Wei
8aef057be0 datapath: compat: Fix static key backport
The original static key backport is based on the upstream
net-next commit 11276d5306b8
("locking/static_keys: Add a new static_key interface").

However, in Canonical's Trusty kernel, it introduced partial static
support which have different definition of some of the macros that
breaks the compatibility code.

For example, in net-next git tree commit 11276d5306b8
("locking/static_keys: Add a new static_key interface").
+ #define DEFINE_STATIC_KEY_TRUE(name)   \
+       struct static_key_true name = STATIC_KEY_TRUE_INIT

On the other hand, in Canonical's Trusty git tree commit 13f5d5d1cccb6
("x86/KVM/VMX: Add module argument for L1TF mitigation")
+ #define DEFINE_STATIC_KEY_TRUE(name)   \
+       struct static_key name = STATIC_KEY_INIT_TRUE

This commit resolves the ovs kernel module compatibility issue on
Trusty kernel.

VMware-BZ: #2251101
Fixes: 6660a9597a49 ("datapath: compat: Introduce static key support")
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-12-18 12:16:32 -08:00
Timothy Redaelli
fc3b425fa0 acinclude: Include libmnl when needed
DPDK 18.11 uses libmnl when MLX5 PMD is enabled.

This commit makes OVS to link to libmnl when MLX5 PMD is enabled on
DPDK.

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2018-12-17 10:22:10 +00:00
Greg Rose
3f3b76f980 compat: Fixup ip_tunnel_info_opts_set
A new flags parameter has been added in 4.19 so add compat fixup.

Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-12-15 08:13:12 -08:00
Ben Pfaff
64ed99ffbc netdev-linux: Don't include <net/if_packet.h>.
This header only defines sockaddr_pkt, which this source file doesn't use.

This was the only user of net/if_packet.h, so also remove the
configure-time test for it (which netdev-linux wasn't using anyway).

Reported-by: Andre McCurdy <armccurdy@gmail.com>
Reported-at: https://github.com/openvswitch/ovs/pull/253
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-10-03 16:55:55 -07:00
Ben Pfaff
4c1e8cb942 acinclude.m4: Really check whether GCC support -Wno-null-pointer-arithmetic.
I've noticed recently an annoying quantity of error messages like the
following in builds in various places:

    gcc: error: unrecognized command line option ‘-Wunknown-warning-option’

This didn't really make sense because OVS checks whether the compiler
supports warning options before it uses them.  Looking closer, the GCC
manual has a note that explains the issue:

     When an unrecognized warning option is requested (e.g.,
    '-Wunknown-warning'), GCC emits a diagnostic stating that the
    option is not recognized.  However, if the '-Wno-' form is used,
    the behavior is slightly different: no diagnostic is produced for
    '-Wno-unknown-warning' unless other diagnostics are being
    produced.  This allows the use of new '-Wno-' options with old
    compilers, but if something goes wrong, the compiler warns that
    an unrecognized option is present.

Thus, we can properly check only for the *positive* version of a warning
option, so this commit makes the OVS tests do that.

Fixes: a7021b08b0d5 ("configure: Disable -Wnull-pointer-arithmetic Clang warning.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
2018-09-26 13:19:00 -07:00