mirror of https://github.com/openvswitch/ovs synced 2025-08-29 21:38:13 +00:00

405 Commits

Author SHA1 Message Date
Pravin B Shelar
f2252c6105 datapath: compat: Update Geneve and VxLAN modules.
This patch brings in various updates from the upstream Geneve and VxLAN
modules. For Geneve it adds IPv6 support; for VxLAN the major new
feature is VXLAN GPE.
This should bring the OVS compat tunnel implementation in sync with the
current net branch.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:49 -07:00
Pravin B Shelar
d7e5891346 datapath: backport: udp: Add socket based GRO and config
Upstream commit:
    commit 38fd2af24fcfda93f9fea3e53f26e48775ae9e09
    Author: Tom Herbert <tom@herbertland.com>

    udp: Add socket based GRO and config

    Add gro_receive and  gro_complete to struct udp_tunnel_sock_cfg.

    Signed-off-by: Tom Herbert <tom@herbertland.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:49 -07:00
Pravin B Shelar
b7ebebcdd7 datapath: compat: Update udp_sock_create
Update udp_sock_create to create the IPv6 socket correctly.

Partially backports commit fd384412e199b ("udp_tunnel: Seperate ipv6
functions into its own file.")

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:49 -07:00
Pravin B Shelar
1c95839fee datapath: compat: rename HAVE_METADATA_DST to USE_UPSTREAM_TUNNEL
To better represent the meaning of the symbol.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:49 -07:00
Pravin B Shelar
3259c4ff75 datapath: backport: ip_tunnel: add support for setting flow label via collect metadata
Update udp_tunnel6_xmit_skb(). Specifically, the changes relate to
setting the IPv6 flow label.

Upstream commit:
    commit 134611446dc657e1bbc73ca0e4e6b599df687db0
    Author: Daniel Borkmann <daniel@iogearbox.net>

    ip_tunnel: add support for setting flow label via collect metadata

    This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label
    from call sites. Currently, there's no such option and it's always set to
    zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so
    that flow-based tunnels via collect metadata frontends can make use of it.
    vxlan and geneve will be converted to add flow label support separately.

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:49 -07:00
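
The flow label mentioned in the commit above is the low 20 bits of the first 32-bit word of the IPv6 header, next to the version and traffic class. The sketch below packs that word in host byte order to show where a non-zero label ends up. The helper name `ipv6_first_word` and the mask macro are illustrative assumptions, not kernel code; the kernel's ip6_flow_hdr() works on network-order fields.

```c
#include <stdint.h>

/* The IPv6 flow label occupies the low 20 bits of the first header word. */
#define FLOW_LABEL_MASK 0x000FFFFFu

/* Hypothetical helper: build the first 32-bit word of an IPv6 header in
 * host byte order: version (4 bits), traffic class (8 bits), flow label
 * (20 bits). Bits of flow_label above the low 20 are discarded. */
static uint32_t ipv6_first_word(uint8_t tclass, uint32_t flow_label)
{
    return (6u << 28) | ((uint32_t)tclass << 20) |
           (flow_label & FLOW_LABEL_MASK);
}
```

With a label of zero, as was always the case before the upstream change, the word reduces to the version and traffic class bits only.
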
Pravin B Shelar
5a109898bd datapath: backport: tunnels: Remove encapsulation offloads on decap.
The following patch backports the updated iptunnel pull function.
It also brings in the following upstream fix:

    commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168
    Author: Jesse Gross <jesse@kernel.org>

    tunnels: Remove encapsulation offloads on decap.

    If a packet is either locally encapsulated or processed through GRO
    it is marked with the offloads that it requires. However, when it is
    decapsulated these tunnel offload indications are not removed. This
    means that if we receive an encapsulated TCP packet, aggregate it with
    GRO, decapsulate, and retransmit the resulting frame on a NIC that does
    not support encapsulation, we won't be able to take advantage of hardware
    offloads even though it is just a simple TCP packet at this point.

    This fixes the problem by stripping off encapsulation offload indications
    when packets are decapsulated.

    The performance impacts of this bug are significant. In a test where a
    Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
    and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
    result of avoiding unnecessary segmentation at the VM tap interface.

    Reported-by: Ramu Ramamurthy <sramamur@linux.vnet.ibm.com>
    Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
    Signed-off-by: Jesse Gross <jesse@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:48 -07:00
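
The fix described in the commit above amounts to clearing the tunnel-related offload indications when a packet is decapsulated. A minimal sketch of that idea follows, assuming a simple bitmask of offload flags. The flag names and values here are hypothetical stand-ins; the kernel's SKB_GSO_* constants are different and this is not the backported code.

```c
#include <stdint.h>

/* Hypothetical offload flags; the kernel's SKB_GSO_* values differ. */
enum {
    GSO_TCPV4      = 1u << 0,  /* plain TCP segmentation */
    GSO_GRE        = 1u << 1,  /* GRE encapsulation offload */
    GSO_UDP_TUNNEL = 1u << 2,  /* UDP tunnel encapsulation offload */
};

/* All flags that only make sense while the packet is encapsulated. */
#define GSO_ENCAP_ALL (GSO_GRE | GSO_UDP_TUNNEL)

/* On decapsulation, drop the tunnel-related flags so that only the inner
 * packet's offload requirements remain. */
static uint32_t strip_encap_offloads(uint32_t gso_type)
{
    return gso_type & ~(uint32_t)GSO_ENCAP_ALL;
}
```

After the call, a packet that was marked as TCP inside a UDP tunnel is marked as plain TCP, so a NIC without encapsulation support can still apply its TCP offloads.
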
Pravin B Shelar
c6e13fccb9 datapath: backport: iptunnel: scrub packet in iptunnel_pull_header
Upstream Commit:
    commit 7f290c94352e59b1d720055fce760a69a63bd0a1
    Author: Jiri Benc <jbenc@redhat.com>

    iptunnel: scrub packet in iptunnel_pull_header

    Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call
    skb_scrub_packet directly instead.

    Signed-off-by: Jiri Benc <jbenc@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:48 -07:00
Pravin B Shelar
aad7cb91ef datapath: compat: Refactor egress tunnel info
Upstream, tunnel egress info is retrieved using ndo_fill_metadata_dst.
Since older kernels do not have it, we need to keep the vport operation
that does the same on those kernels.
This patch tries to merge these two operations into one to avoid code
duplication.
This commit backports fc4099f1 ("openvswitch: Fix egress tunnel info.")

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-07-08 19:27:48 -07:00
Eric W. Biederman
0374bcbe29 compat: ipv4: Pass struct net through ip_fragment.
Upstream commit:
    ipv4: Pass struct net through ip_fragment

    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Upstream: 694869b3c544 ("ipv4: Pass struct net through ip_fragment")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-06-27 11:27:37 +02:00
Bhanuprakash Bodireddy
16dfb8fa79 acinclude: check for numa library
The numa library is needed for NUMA-aware vHost User functionality.
If the numa package is missing, the OVS DPDK configuration fails with
"error: Could not find DPDK libraries in <DPDK_LOC>/TARGET/lib" even though
the DPDK library is installed.

This patch fixes this misleading error by checking for the presence of the
numa library and printing an appropriate error message, "error: unable to
find libnuma, install the dependency package", when the package is missing.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-06-24 13:58:19 -07:00
Jarno Rajahalme
7f2ab8cd23 datapath: change nf_connlabels_get bit arg to 'highest used'
Upstream commit:
    commit adff6c65600000ec2bb71840c943ee12668080f5
    Author: Florian Westphal <fw@strlen.de>
    Date:   Tue Apr 12 18:14:25 2016 +0200

    netfilter: connlabels: change nf_connlabels_get bit arg to 'highest used'

    nf_connlabel_set() takes the bit number that we would like to set.
    nf_connlabels_get() however took the number of bits that we want to
    support.

    So e.g. nf_connlabels_get(32) support bits 0 to 31, but not 32.
    This changes nf_connlabels_get() to take the highest bit that we want
    to set.

    Callers then don't have to cope with a potential integer wrap
    when using nf_connlabels_get(bit + 1) anymore.

    Current callers are fine, this change is only to make followup
    nft ct label set support simpler.

    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Jarno Rajahalme <jarno@ovn.org>

OVS compat code defined nf_connlabels_get() if it was missing.  Now we
redefine it if it is missing, or if it has the old signature.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-06-20 18:51:09 -07:00
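
The difference between the two argument conventions in the commit above can be sketched with two small predicates. The function names are illustrative assumptions and do not correspond to the netfilter API; they only model "number of bits supported" versus "highest bit used".

```c
#include <stdbool.h>

/* Old convention: `nbits` is the number of bits to support, so the valid
 * bit indices are 0 .. nbits - 1. */
static bool supported_by_count(unsigned int nbits, unsigned int bit)
{
    return bit < nbits;
}

/* New convention: `highest` is the highest bit index that will be set,
 * so the valid bit indices are 0 .. highest. */
static bool supported_by_highest(unsigned int highest, unsigned int bit)
{
    return bit <= highest;
}
```

Under the old convention a caller that wants to set bit `b` has to pass `b + 1`, which wraps to 0 when `b` is the largest unsigned value. Under the new convention the caller passes `b` directly and no addition is needed.
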
Jarno Rajahalme
0d330e4299 datapath: compat for NAT.
Compat code required to make the NAT code in the following patch
compile with Linux 3.10 - 4.6.

Some compat code applies to conntrack.c itself; it is added after the
main NAT backport for conntrack.c later in the series.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-06-20 18:51:06 -07:00
Jarno Rajahalme
71ce9eddaf acinclude: Add OVS_FIND_PARAM_IFELSE.
OVS_FIND_PARAM_IFELSE is a more robust macro for checking function
parameters, as it does not require the parameter to be on the same
line as the function name like OVS_GREP_IFELSE does.

Use this to fix the check for struct conntrack_zone parameter, which
is on a different line on Linux 4.3 and higher.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-06-20 18:51:06 -07:00
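
The limitation of a line-based check described in the commit above can be illustrated in C. The sketch below is not the m4 macro and does not reproduce how OVS_FIND_PARAM_IFELSE is implemented; all function names are hypothetical. It contrasts a check that requires the function name and parameter on one line with a check that scans the whole argument list up to the closing parenthesis.

```c
#include <stdbool.h>
#include <string.h>

/* Find `needle` inside the half-open range [start, end); NULL if absent. */
static const char *find_in(const char *start, const char *end,
                           const char *needle)
{
    size_t n = strlen(needle);
    for (const char *p = start; (size_t)(end - p) >= n; p++) {
        if (memcmp(p, needle, n) == 0)
            return p;
    }
    return NULL;
}

/* Line-based check: `func` and `param` must appear on the same line. */
static bool param_same_line(const char *src, const char *func,
                            const char *param)
{
    const char *src_end = src + strlen(src);
    for (const char *line = src; line < src_end; ) {
        const char *eol = memchr(line, '\n', (size_t)(src_end - line));
        if (!eol)
            eol = src_end;
        const char *f = find_in(line, eol, func);
        if (f && find_in(f, eol, param))
            return true;
        line = (eol < src_end) ? eol + 1 : src_end;
    }
    return false;
}

/* Argument-list check: look for `param` between the first occurrence of
 * `func` and the next ')' regardless of line breaks. */
static bool param_in_arglist(const char *src, const char *func,
                             const char *param)
{
    const char *src_end = src + strlen(src);
    const char *f = find_in(src, src_end, func);
    if (!f)
        return false;
    const char *close = memchr(f, ')', (size_t)(src_end - f));
    if (!close)
        return false;
    return find_in(f, close, param) != NULL;
}
```

For a declaration whose parameter list is wrapped onto a second line, the line-based check misses the parameter while the argument-list check finds it. The sketch only inspects the first occurrence of the function name and ignores nested parentheses.
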
Ciara Loftus
db8f13b020 netdev-dpdk: NUMA Aware vHost User
This commit allows for vHost User memory from QEMU, DPDK and OVS, as
well as the servicing PMD, to all come from the same socket.

The socket id of a vhost-user port used to be set to that of the master
lcore. Now it is possible to update the socket id if it is detected
(during VM boot) that the vhost device memory is not on this node. If
this is the case, a new mempool is created from the new node, and the
PMD thread currently servicing the port stops doing so, in favour of a
thread from the new node (if enabled in the pmd-cpu-mask).

To avail of this functionality, one must enable the
CONFIG_RTE_LIBRTE_VHOST_NUMA DPDK configuration option.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-06-17 17:11:55 -07:00
Mauricio Vasquez B
4c16ee484e acinclude: fix issue when configuring with --with-dpdk
When an empty path is given to the --with-dpdk option
(--with-dpdk="" or --with-dpdk=$NON_SET_ENV_VARIABLE), the configure
script does not show any error and configures OvS without DPDK support,
which can cause confusion.

This patch changes that behavior to show an explicit error in that case.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-06-01 10:28:28 -07:00
Markos Chandras
c253c2e322 acinclude.m4: Fix skb_get_hash function detection
Commit e2f3178f0582 ("datapath: Add support for kernel 3.14.") added
support for 3.14 kernels and, in the process, a new OVS_GREP_IFELSE check
for the "skb_get_hash" function. "skb_get_hash" was introduced
in the Linux kernel commit 3958afa1b272 ("net: Change skb_get_rxhash to
skb_get_hash") which exists in >=3.14, but the OVS_GREP_IFELSE macro
also matches the "skb_get_hash_raw" function which exists in older
kernels. As a result, the check makes the build system
behave as if the "skb_get_hash" function is available in these older
kernels, leading to build failures. We fix this by explicitly checking
for "skb_get_hash(" which matches the function definition.

Signed-off-by: Markos Chandras <mchandras@suse.de>
Signed-off-by: Jesse Gross <jesse@kernel.org>
2016-05-16 11:33:12 -07:00
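
The false positive described in the commit above comes from plain substring matching. A minimal C sketch of the same effect, using strstr() in place of grep, is shown here. The helper name and the header snippets used with it are made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a grep-style check: true if `pattern` occurs
 * anywhere in `header_text`. */
static bool header_mentions(const char *header_text, const char *pattern)
{
    return strstr(header_text, pattern) != NULL;
}
```

Given header text that only declares `skb_get_hash_raw(...)`, the pattern `skb_get_hash` matches because it is a prefix of the longer name, while `skb_get_hash(` does not match. Appending the opening parenthesis is what lets the check tell the two functions apart.
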
Joe Stringer
54529e94e9 compat: Remove skbuff header helper backports.
These have existed largely since v2.6.22, so it's well overdue.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-03 15:46:44 -07:00
Joe Stringer
a0c6adfbde compat: Remove unused ipv[46] backports.
These pieces are conditional (#if) on kernel versions that have not been
supported since commit f2ab1536ddbc ("compat: Backport conntrack strictly
to v3.10+.")

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-03 15:46:40 -07:00
Joe Stringer
39c0ff2217 compat: ipv4: Pass struct net into ip_defrag.
Upstream commit:
    ipv4: Pass struct net into ip_defrag and ip_check_defrag

    The function ip_defrag is called on both the input and the output
    paths of the networking stack.  In particular conntrack when it is
    tracking outbound packets from the local machine calls ip_defrag.

    So add a struct net parameter and stop making ip_defrag guess which
    network namespace it needs to defragment packets in.

    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Upstream: 19bcf9f203c8 ("ipv4: Pass struct net into ip_defrag and ip_check_defrag")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-02 17:06:36 -07:00
Joe Stringer
fa67f8e070 compat: Add a struct net parameter to l4_pkt_to_tuple.
Upstream commit:
    netfilter: nf_conntrack: Add a struct net parameter to l4_pkt_to_tuple

    As gre does not have the srckey in the packet gre_pkt_to_tuple
    needs to perform a lookup in its per network namespace tables.

    Pass in the proper network namespace to all pkt_to_tuple
    implementations to ensure gre (and any similar protocols) can get this
    right.

    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Upstream: a31f1adc0948 ("netfilter: nf_conntrack: Add a struct net
parameter to l4_pkt_to_tuple")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-02 17:06:36 -07:00
Bhanuprakash Bodireddy
40b5ea8631 acinclude: Autodetect DPDK location when configuring OVS
When using the DPDK datapath, the OVS configure script requires the DPDK
build directory to be passed via --with-dpdk. This can be avoided if the
DPDK library and headers are in the standard compiler search paths.

This patch fixes the problem by searching for the DPDK libraries in standard
locations and configuring the OVS sources for the DPDK datapath.

If the install location is manually specified in "--with-dpdk",
autodiscovery is skipped.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-04-15 14:46:52 -07:00
Jesse Gross
2b64e8fc1b datapath: Check for sock argument to v6ops->fragment.
Ubuntu 3.13.0-83-generic has backported a patch that adds an intermediate
version of the v6ops->fragment function that doesn't seem to have ever been
part of a released upstream kernel. This version is missing the sock
argument to the fragment function.

Since we already have a backported version of the function from a newer
kernel, this simply ignores the version that Ubuntu is now making available
and continues to use the OVS version, similar to what it was doing before.

Reported-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
Reported-by: Aaron Rosen <aaronorosen@gmail.com>
Reported-by: Russell Bryant <russell@ovn.org>
Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Russell Bryant <russell@ovn.org>
2016-03-21 12:53:19 -07:00
Pravin B Shelar
3cdc5697c7 datapath: Remove OVS_FRAGMENT_BACKPORT
This macro is no longer required now that support for older kernel
versions has been dropped.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-03-14 09:53:57 -07:00
Pravin B Shelar
8063e09587 datapath: Drop support for kernel older than 3.10
Currently the OVS out-of-tree datapath supports a large number of kernel
versions, from 2.6.32 to 4.3, plus various distribution-specific
kernels. But at this point major features are only available on more
recent kernels.  For example, stateful services are only available
starting in kernel 3.10 and STT is only available starting with 3.5.

Since these features are becoming essential to many OVS deployments,
and the effort of maintaining the backports is high, we have decided
to drop support for older kernels. The following patch drops support
for kernels older than 3.10.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-03-14 09:53:51 -07:00
David Wragg
06f1a61a87 datapath: Set a large MTU on tunnel devices.
Upstream commit:
    Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
    transmit vxlan packets of any size, constrained only by the ability to
    send out the resulting packets.  4.3 introduced netdevs corresponding
    to tunnel vports.  These netdevs have an MTU, which limits the size of
    a packet that can be successfully encapsulated.  The default MTU
    values are low (1500 or less), which is awkwardly small in the context
    of physical networks supporting jumbo frames, and leads to a
    conspicuous change in behaviour for userspace.

    Instead, set the MTU on openvswitch-created netdevs to be the relevant
    maximum (i.e. the maximum IP packet size minus any relevant overhead),
    effectively restoring the behaviour prior to 4.3.

    Signed-off-by: David Wragg <david@weave.works>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Upstream: 7e059158d57b ("vxlan, gre, geneve: Set a large MTU on ovs-created
tunnel devices")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-02-19 11:11:47 -08:00
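
The "maximum IP packet size minus any relevant overhead" rule from the commit above is simple arithmetic. The sketch below assumes the 65535-byte limit that follows from the 16-bit IP total-length field; the helper name is hypothetical and the per-tunnel overhead is left as a parameter, since the actual values used by the tunnel drivers are not given in the commit message.

```c
/* Largest IP packet: limit of the 16-bit total-length field. */
#define MAX_IP_PACKET_SIZE 65535u

/* Hypothetical helper: largest inner packet that still fits once
 * `overhead` bytes of encapsulation headers are added. Returns 0 if the
 * overhead alone reaches or exceeds the limit. */
static unsigned int tunnel_max_mtu(unsigned int overhead)
{
    return overhead < MAX_IP_PACKET_SIZE ? MAX_IP_PACKET_SIZE - overhead : 0;
}
```

For example, with an assumed 50 bytes of encapsulation overhead (an example figure, not one taken from the commit), the resulting MTU would be 65485 bytes, far above the 1500-byte default the commit describes as awkwardly small.
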
Joe Stringer
f5284933c8 datapath: Re-designate OVS_FRAGMENT_BACKPORT.
Typically the way that we include backported code is by testing for
existence of the feature in the upstream codebase via header checks,
then attempt to use the upstream code as much as possible. However, for
the IP fragmentation handling backport we have an additional constraint
which is that we cannot support kernels older than Linux-3.10.

To date, OVS_FRAGMENT_BACKPORT has been defined to include the backport
of the IP fragmentation code for all kernels from 3.10 to 4.2, rather
than attempting to use the upstream code as much as possible. This patch
relaxes OVS_FRAGMENT_BACKPORT to only check the lower bound so that the
upstream code may be used in more circumstances.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-04 10:31:52 -08:00
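
The change from a closed version range to a lower bound, as described in the commit above, can be sketched as two predicates over an encoded kernel version. This is only a model of the condition: the function names are assumptions, and the real OVS_FRAGMENT_BACKPORT is a preprocessor macro evaluated at build time rather than a runtime function.

```c
/* Same encoding as the classic kernel KERNEL_VERSION() macro. */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Before: backport enabled only for kernels from 3.10 up to 4.2. */
static int backport_by_range(int version)
{
    return version >= KVER(3, 10, 0) && version < KVER(4, 3, 0);
}

/* After: only the lower bound is checked. Whether a given piece of
 * upstream code is used is then left to separate feature checks. */
static int backport_by_lower_bound(int version)
{
    return version >= KVER(3, 10, 0);
}
```

The two predicates agree for kernels 3.10 through 4.2 and differ only from 4.3 onward, where the second one no longer depends on an upper version limit.
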
Joe Stringer
0f09d6e30d compat: Detect and use upstream ip_fragment().
Previously a version check was used to determine whether the upstream
ip_fragment() should be used or the backported version. The actual test
is for whether upstream commit d6b915e29f4a ("ip_fragment: don't forward
defragmented DF packet") is present, so test for that instead.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-04 10:31:28 -08:00
Joe Stringer
e0d45da35e compat: Detect and use inet_frag_queue->list_evictor.
Kernels 3.17 to 4.2 have a work queue to evict old fragments, but do not
track these fragments in an eviction list. On these kernels, we detect
the absence of the list_evictor and provide one. This commit fixes the
reliance on kernel versions in the case that this functionality is
backported.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-04 10:30:44 -08:00
Joe Stringer
34c3c2b755 compat: Detect and use nf_ct_frag6_gather().
This function is a likely candidate for backporting, and currently
relies on version checks to include the source or not. Grep for the
appropriate functions instead, and include the backport based on that.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:46 -08:00
Joe Stringer
401da7b982 compat: Detect and use inet_getpeer_v4().
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:46 -08:00
Joe Stringer
1dba169037 compat: Detect and use __skb_dst_copy().
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:46 -08:00
Joe Stringer
459064e7a0 compat: Detect and use nf_connlabels_get().
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:46 -08:00
Joe Stringer
e4e04c3b6b compat: Detect and use nf_ipv6_ops->fragment.
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:46 -08:00
Joe Stringer
ac9cd0d28a compat: Detect and use struct nf_conntrack_zone.
Rather than relying on version checks, detect the presence of this
structure and use it if available.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:46 -08:00
Joe Stringer
d1c390e679 compat: Detect and use inet_frags->lock.
Prior to ab1c724f6330 ("inet: frag: use seqlock for hash rebuild")
upstream, a rwlock was used when rebuilding inet_frags. Rather than
using a version check to detect this, search for it in the header and
enable the code based on whether it exists.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:45 -08:00
Joe Stringer
91408ae003 compat: Detect and use inet_frags->frags_work.
Kernels 3.17 and newer have a work queue to evict old fragments, while
older kernel versions use an LRU in the fast path; see upstream commit
b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue").
This commit fixes the version checking so that rather than enabling the
code for either of these approaches using version checks, it is
triggered based on the presence of the work queue in "struct inet_frags".

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:45 -08:00
Joe Stringer
8f00ece904 compat: Detect and use inet_frag_queue->last_in.
Kernels 3.17 and older have this field, while newer kernels use the
'flags' field. Detect this in the build in case anyone backports this
change to an older kernel.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-02-02 14:26:45 -08:00
Simon Horman
6dbd98e6ca datapath: test for netlink_set_err returning void
In v2.6.33 netlink_set_err returns void. However, 1a50307ba182 ("netlink:
fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()") was backported and
included in v2.6.33.2 and in that and subsequent v2.6.33 stable releases
netlink_set_err returns an int.

It seems plausible that there are other backports floating around. So check
for netlink_set_err returning void rather than including compatibility code
based on the version of the kernel.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-01-28 19:02:54 -08:00
Ilya Maximets
ab2a3154f0 acinclude.m4: Fix dpdk build if -mssse3 not supported.
On arm/arm64:
	gcc: error: unrecognized command line option '-mssse3'

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-01-12 08:26:58 -08:00
Joe Stringer
a94ebc3999 datapath: Add conntrack action
Expose the kernel connection tracker via OVS. Userspace components can
make use of the CT action to populate the connection state (ct_state)
field for a flow. This state can be subsequently matched.

Exposed connection states are OVS_CS_F_*:
- NEW (0x01) - Beginning of a new connection.
- ESTABLISHED (0x02) - Part of an existing connection.
- RELATED (0x04) - Related to an established connection.
- INVALID (0x20) - Could not track the connection for this packet.
- REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
- TRACKED (0x80) - This packet has been sent through conntrack.

When the CT action is executed by itself, it will send the packet
through the connection tracker and populate the ct_state field with one
or more of the connection state flags above. The CT action will always
set the TRACKED bit.

When the COMMIT flag is passed to the conntrack action, this specifies
that information about the connection should be stored. This allows
subsequent packets for the same (or related) connections to be
correlated with this connection. Sending subsequent packets for the
connection through conntrack allows the connection tracker to consider
the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.

The CT action may optionally take a zone to track the flow within. This
allows connections with the same 5-tuple to be kept logically separate
from connections in other zones. If the zone is specified, then the
"ct_zone" match field will be subsequently populated with the zone id.

IP fragments are handled by transparently assembling them as part of the
CT action. The maximum received unit (MRU) size is tracked so that
refragmentation can occur during output.

IP frag handling contributed by Andy Zhou.

Based on original design by Justin Pettit.

Upstream: 7f8a436 "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:17:25 -08:00
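
The connection state flags listed in the conntrack commit above form a bitmask, so a packet's ct_state can carry several of them at once. The sketch below reuses the flag values from the commit message; the helper `ct_state_has` is a hypothetical name introduced here and is not part of the OVS datapath.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as listed in the commit message. */
#define OVS_CS_F_NEW         0x01
#define OVS_CS_F_ESTABLISHED 0x02
#define OVS_CS_F_RELATED     0x04
#define OVS_CS_F_INVALID     0x20
#define OVS_CS_F_REPLY_DIR   0x40
#define OVS_CS_F_TRACKED     0x80

/* Hypothetical helper: true if every flag in `required` is set in
 * `ct_state`. */
static bool ct_state_has(uint8_t ct_state, uint8_t required)
{
    return (ct_state & required) == required;
}
```

A reply packet on a committed connection would, for instance, carry TRACKED, ESTABLISHED and REPLY_DIR together, and a match on TRACKED plus ESTABLISHED would succeed for it while a match on NEW would not.
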
Joe Stringer
2d3ef70b74 compat: Backport IPv6 fragmentation.
IPv6 fragmentation functionality is not exported by most kernels, so
backport this code from the upstream 4.3 development tree.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:17:24 -08:00
Joe Stringer
595e069a06 compat: Backport IPv4 reassembly.
Backport IPv4 reassembly from the upstream commit caaecdd3d3f8 ("inet:
frags: remove INET_FRAG_EVICTED and use list_evictor for the test").

This is necessary because kernels prior to upstream commit d6b915e29f4a
("ip_fragment: don't forward defragmented DF packet") would not always
track the maximum received unit size during ip_defrag(). Without the
MRU, refragmentation cannot occur so reassembled packets are dropped.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
213e1f54b4 compat: Wrap IPv4 fragmentation.
Most kernels provide some form of ip fragmentation. However, until
recently many of them would always send ICMP responses for over_MTU
packets, even when operating in bridge mode. Backport the check to
ensure this doesn't occur.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
cfda4537ec compat: Backport ip_skb_dst_mtu().
From upstream f87c10a8aa1e ("ipv4: introduce ip_dst_mtu_maybe_forward
and protect forwarding path against pmtu spoofing")

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
3f506f0761 compat: Backport dev_recursion_level().
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
f000c8793d compat: Backport prandom_u32_max().
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
0d03e51ca9 compat: Backport 'dst' functions.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
057772cf24 compat: Backport nf_ct_tmpl_alloc().
Loosely based upon Linux commit 0838aa7fcfcd "netfilter: fix netns
dependencies with conntrack templates" and commit 5e8018fc6142
"netfilter: nf_conntrack: add efficient mark to zone mapping".

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:15 -08:00
Pravin B Shelar
e23775f20e datapath: Add support for lwtunnel
The following patch adds support for lwtunnel to the OVS datapath.
With this change the OVS datapath detects lwtunnel support and
makes use of the new APIs if available. On older kernels where the
support is not there, the backported tunnel modules are used.
These backported tunnel devices act as lwtunnel devices.
I tried to keep the backported modules the same as upstream for easier
bug-fix backports. Since STT and LISP are not upstream, OVS
always needs to use the respective modules from the tunnel compat layer.
To make this work on kernel 4.3 I have converted the STT and LISP
modules to the lwtunnel API model.

lwtunnel makes use of skb-dst to pass tunnel information to the
tunnel module. On older kernels this is not possible, so in the
case of an old kernel the metadata ref is stored in OVS_CB and a direct
call to the tunnel transmit function is made by the respective tunnel
vport modules. Similarly, on the receive side the tunnel recv path
directly calls netdev-vport-receive to pass the skb to OVS.

Major backported components include:
Geneve, GRE, VXLAN, ip_tunnel, udp-tunnels GRO.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2015-12-03 16:30:21 -08:00
Pravin B Shelar
ccb75a285b datapath: Fix compilation on kernel 2.6.32
Fixes the following compilation error:

CC [M]  /home/travis/build/openvswitch/ovs/datapath/linux/actions.o

In file included from
/home/travis/build/openvswitch/ovs/datapath/linux/actions.c:21:0:

/home/travis/build/openvswitch/ovs/datapath/linux/compat/include/linux/skbuff.h:
In function ‘rpl_skb_postpull_rcsum’:

/home/travis/build/openvswitch/ovs/datapath/linux/compat/include/linux/skbuff.h:384:4:
error: implicit declaration of function ‘skb_checksum_start_offset’
[-Werror=implicit-function-declaration]

cc1: some warnings being treated as errors

Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
2015-10-09 14:23:21 -07:00