2
0
mirror of https://github.com/openvswitch/ovs synced 2025-10-29 15:28:56 +00:00
Commit Graph

166 Commits

Author SHA1 Message Date
Joe Stringer
a94ebc3999 datapath: Add conntrack action
Expose the kernel connection tracker via OVS. Userspace components can
make use of the CT action to populate the connection state (ct_state)
field for a flow. This state can be subsequently matched.

Exposed connection states are OVS_CS_F_*:
- NEW (0x01) - Beginning of a new connection.
- ESTABLISHED (0x02) - Part of an existing connection.
- RELATED (0x04) - Related to an established connection.
- INVALID (0x20) - Could not track the connection for this packet.
- REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
- TRACKED (0x80) - This packet has been sent through conntrack.

When the CT action is executed by itself, it will send the packet
through the connection tracker and populate the ct_state field with one
or more of the connection state flags above. The CT action will always
set the TRACKED bit.

When the COMMIT flag is passed to the conntrack action, this specifies
that information about the connection should be stored. This allows
subsequent packets for the same (or related) connections to be
correlated with this connection. Sending subsequent packets for the
connection through conntrack allows the connection tracker to consider
the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.

The CT action may optionally take a zone to track the flow within. This
allows connections with the same 5-tuple to be kept logically separate
from connections in other zones. If the zone is specified, then the
"ct_zone" match field will be subsequently populated with the zone id.

IP fragments are handled by transparently assembling them as part of the
CT action. The maximum received unit (MRU) size is tracked so that
refragmentation can occur during output.

IP frag handling contributed by Andy Zhou.

Based on original design by Justin Pettit.

Upstream: 7f8a436 "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:17:25 -08:00
Joe Stringer
2d3ef70b74 compat: Backport IPv6 fragmentation.
IPv6 fragmentation functionality is not exported by most kernels, so
backport this code from the upstream 4.3 development tree.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:17:24 -08:00
Joe Stringer
595e069a06 compat: Backport IPv4 reassembly.
Backport IPv4 reassembly from the upstream commit caaecdd3d3f8 ("inet:
frags: remove INET_FRAG_EVICTED and use list_evictor for the test").

This is necessary because kernels prior to upstream commit d6b915e29f4a
("ip_fragment: don't forward defragmented DF packet") would not always
track the maximum received unit size during ip_defrag(). Without the
MRU, refragmentation cannot occur so reassembled packets are dropped.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
213e1f54b4 compat: Wrap IPv4 fragmentation.
Most kernels provide some form of ip fragmentation. However, until
recently many of them would always send ICMP responses for over_MTU
packets, even when operating in bridge mode. Backport the check to
ensure this doesn't occur.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
cfda4537ec compat: Backport ip_skb_dst_mtu().
>From upstream f87c10a8aa1e ("ipv4: introduce ip_dst_mtu_maybe_forward
and protect forwarding path against pmtu spoofing")

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
3f506f0761 compat: Backport dev_recursion_level().
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
f000c8793d compat: Backport prandom_u32_max().
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
0d03e51ca9 compat: Backport 'dst' functions.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:16 -08:00
Joe Stringer
057772cf24 compat: Backport nf_ct_tmpl_alloc().
Loosely based upon Linux commit 0838aa7fcfcd "netfilter: fix netns
dependencies with conntrack templates" and commit 5e8018fc6142
"netfilter: nf_conntrack: add efficient mark to zone mapping".

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-12-03 17:08:15 -08:00
Pravin B Shelar
e23775f20e datapath: Add support for lwtunnel
Following patch adds support for lwtunnel to OVS datapath.
With this change OVS datapath detect lwtunnel support and
make use of new APIs if available. On older kernel where the
support is not there the backported tunnel modules are used.
These backported tunnel devices acts as lwtunnel devices.
I tried to keep backported module same as upstream for easier
bug-fix backport. Since STT and LISP are not upstream OVS
always needs to use respective modules from tunnel compat layer.
To make it work on kernel 4.3 I have converted STT and LISP
modules to lwtunnel API model.

lwtunnel make use of skb-dst to pass tunnel information to the
tunnel module. On older kernel this is not possible. So the in
case of old kernel metadata ref is stored in OVS_CB and direct
call to tunnel transmit function is made by respective tunnel
vport modules. Similarly on receive side tunnel recv directly
call netdev-vport-receive to pass the skb to OVS.

Major backported components include:
Geneve, GRE, VXLAN, ip_tunnel, udp-tunnels GRO.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2015-12-03 16:30:21 -08:00
Pravin B Shelar
ccb75a285b datapath: Fix compilation on kernel 2.6.32
Fixes following compilation error:

CC [M]  /home/travis/build/openvswitch/ovs/datapath/linux/actions.o

In file included from
/home/travis/build/openvswitch/ovs/datapath/linux/actions.c:21:0:

/home/travis/build/openvswitch/ovs/datapath/linux/compat/include/linux/skbuff.h:
In function ‘rpl_skb_postpull_rcsum’:

/home/travis/build/openvswitch/ovs/datapath/linux/compat/include/linux/skbuff.h:384:4:
error: implicit declaration of function ‘skb_checksum_start_offset’
[-Werror=implicit-function-declaration]

cc1: some warnings being treated as errors

Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
2015-10-09 14:23:21 -07:00
Pravin B Shelar
fdce83a304 datapath: Add support for 4.2 kernel. 2015-09-23 19:41:19 -07:00
Pravin B Shelar
ec96e66376 datapath: Fix compilation on kernel 3.18
Fixes following compilation error:
In file included from ovs/datapath/linux/actions.c:30: ovs/datapath/linux/compat/include/linux/if_vlan.h:65:
error: redefinition of ‘__vlan_hwaccel_push_inside’ include/linux/if_vlan.h:353: note: previous definition of
‘__vlan_hwaccel_push_inside’ was here ovs/datapath/linux/compat/include/linux/if_vlan.h:83:

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-09-22 14:21:25 -07:00
Joe Stringer
c0cddcec39 datapath: Add support for 4.1 kernel.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-09-18 13:27:24 -07:00
Ciara Loftus
613750abeb configure: Fix DPDK linking when using a relative path
When linking with DPDK, if a relative path is used with the
'--with-dpdk' flag, then OVS will always be compiled with vHost Cuse
support, even if it is not enabled in the DPDK build.
This patch fixes this problem, and enables the correct version of
vHost despite whether or not a relative or absolute path is used.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-09-16 10:19:18 -07:00
Jiri Benc
dd693f9bf2 datapath: Use netlink ipv4 API to handle the ipv4 addr attributes.
upstream: ("netlink: implement nla_put_in_addr and nla_put_in6_addr")
upstream: ("netlink: implement nla_get_in_addr and nla_get_in6_addr")
IP addresses are often stored in netlink attributes. Add generic functions
to do that.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-09-14 10:44:57 -07:00
Timo Puha
18f777b287 dpdk: add support for v2.1.0
Update relevant artifacts to add support for DPDK v2.1.0
 - INSTALL.DPDK.md
 - acinclude.m4: Change DPDK library name
 - netdev-dpdk: Limit minimum mbuf size to to adapt to DPDK bug fix that
   changes the treatment of the requested mbuf size
 - build.sh: Change DPDK version number

Note that this breaks compatibility with DPDK v2.0.0 although only
for the library name change.

Note that throughput for vhost ports with mergeable buffers is reduced
about 10% due to a necessary bug fix in DPDK vhost code.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Timo Puha <timox.puha@intel.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2015-09-08 16:20:18 +01:00
Flavio Leitner
572e54faff datapath: check for rx handler register
Red Hat Enterprise Linux 6 has backported the netdev RX
handler facility so use the netdev_rx_handler_register as
an indicator.

The handler prototype changed between 2.6.36 and 2.6.39
since there could be backports in any stage, don't look
at the kernel version, but at the prototype.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2015-08-31 12:48:16 -07:00
Flavio Leitner
21a719d658 datapath: check for el6 kernels for per_cpu
The OVS hook has been backported so it doesn't work to
decide per_cpu work arounds.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2015-08-28 09:47:54 -07:00
Flavio Leitner
920fc09189 datapath: check for backported ip_is_fragment
Red Hat Enterprise Linux 6 has backported it from upstream,
so check for ip_is_fragment instead of kernel version.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2015-08-28 09:45:34 -07:00
Flavio Leitner
46d69d187e datapath: check for backported proto_ports_offset
Red Hat Enterprise Linux 6 has backported it from upstream,
so check for proto_ports_offset instead of kernel version.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2015-08-28 09:44:48 -07:00
Flavio Leitner
75e93287db datapath: improve l4_rxhash regex
Red Hat Enterprise Linux 6 has a comment saying
that it doesn't support l4_rxhash which matches
the current grep regex.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2015-08-28 09:43:22 -07:00
Gary Mussar
693e113840 dpdk: Fix detection of vhost_cuse in dpdk rte_config.h
Dpdk allows users to create a config that includes other config files and
then override values.

Eg.
defconfig_x86_64-native_vhost_cuse-linuxapp-gcc:

CONFIG_RTE_BUILD_COMBINE_LIBS=y
CONFIG_RTE_BUILD_SHARED_LIB=n
CONFIG_RTE_LIBRTE_VHOST=y
CONFIG_RTE_LIBRTE_VHOST_USER=n

This allows you to have both a vhostuser and vhostcuse config in the same
source tree without the need to replicate everything in those config files
just to change a couple of settings. The resultant .config file has all of
the settings from the included files with the updated settings at the end.
The resultant rte_config.h contains multiple undefs and defines for the
overridden settings.

Eg.
  > grep RTE_LIBRTE_VHOST_USER x86_64-native_vhost_cuse-linuxapp-gcc/include/rte_config.h
  #undef RTE_LIBRTE_VHOST_USER
  #define RTE_LIBRTE_VHOST_USER 1
  #undef RTE_LIBRTE_VHOST_USER

The current mechanism to detect the RTE_LIBRTE_VHOST_USER setting merely
greps the rte_config.h file for the string "define RTE_LIBRTE_VHOST_USER 1"
rather than the final setting of RTE_LIBRTE_VHOST_USER. The following patch
changes this test to detect the final setting of RTE_LIBRTE_VHOST_USER.

Signed-off-by: Gary Mussar <gmussar@ciena.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com
2015-07-28 14:00:54 -07:00
Joe Stringer
83a278847d acinclude: Silence OVS_FIND_FIELD_IFELSE.
Fields found using OVS_FIND_FIELD_IFELSE would previously be printed in
the console during configure. Clean up the output.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-07-20 11:29:49 -07:00
Daniele Di Proietto
10d01ad215 acinclude: Require libfuse only for DPDK with vhost-cuse.
DPDK with vhost-user doesn't require libfuse, so we shouldn't link OVS
with libfuse unless DPDK is built with vhost-cuse support.

CC: Rapelly, Varun <vrapelly@sonusnet.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-07-16 13:34:18 -07:00
Ciara Loftus
7d1ced0177 netdev-dpdk: add dpdk vhost-user ports
This patch adds support for a new port type to the userspace
datapath called dpdkvhostuser.

A new dpdkvhostuser port will create a unix domain socket which
when provided to QEMU is used to facilitate communication between
the virtio-net device on the VM and the OVS port on the host.

vhost-cuse ('dpdkvhost') ports are still available as 'dpdkvhostcuse'
ports and will be enabled if vhost-cuse support is detected in the
DPDK build specified during compilation of the switch. Otherwise,
vhost-user ports are enabled.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2015-06-14 20:36:52 -07:00
Pravin B Shelar
d23239a29f datapath: backport kfree_skb_list()
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-05-15 21:10:22 -07:00
Joe Stringer
3afcde4381 datapath: Add support for 4.0 kernel.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-05-08 10:17:17 -07:00
Pravin B Shelar
cd7330d06f datapath: stt compatibility for RHEL7
RHEL7 backported nf_hookfn from newer kernel. Handle compatibility
by checking nf_hookfn declaration.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-05-04 13:21:24 -07:00
Mark Kavanagh
543342a41c DPDK: add support for v2.0.0
Update relevant artifacts to add support for DPDK v2.0.0
 - INSTALL.DPDK.md
 - travis build script
 - acinclude.m4: add 'mssse3' flag to OVS_CFLAGS
 - netdev-dpdk: fix build with unified offload types in DPDK v2.0.0

Note that this breaks compatibility with DPDK v1.8.0

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-29 20:49:37 -07:00
Alex Wang
2985381423 datapath: Remove linux/compat/include/linux/log2.h.
No longer need this compat file, we can use the upstream version
of the function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-04-26 09:32:41 -07:00
YAMAMOTO Takashi
c2deac545b configure: Fix -Werror build for NetBSD + clang
On NetBSD, clang (clang-3.5.0 from pkgsrc) complains
when "clang -g" is used for linking.  Specify -Qunused-arguments
to suppress the warning.

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-04-20 11:20:58 +09:00
Ben Pfaff
416e71322f acinclude: Always assume buggy strtok_r() for glibc < 2.8.
Lately our internal build system has been seeing intermittent failures that
I can't explain.  With old glibc versions, the "configure" time check will
pass, but the equivalent (almost identical) "make check" test will fail.
One possibility, I guess, is that occasionally address space randomization
will put valid data at the 0xc0ffee address that the test assumes will
segfault, and another is that some change in compiler optimization flags
is making a difference.  At any rate, I think it's safe to just always
assume that this strtok_r() bug is present whenever glibc before 2.8 is
in use.

Specifically we've seen this happen intermittently when building against
the XenServer DDK 5.6.100 build 39265, which uses glibc 2.5.

Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
2015-04-03 17:19:20 -07:00
Joe Stringer
13dd4a9738 compat: Fix RHEL7 build.
Tested against 3.10.0-229.el7.x86_64.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-03-25 17:07:01 -07:00
Daniele Di Proietto
3c94676c37 acinclude.m4: Restore --whole-archive option for DPDK.
The --whole-archive option is required to link vswitchd with DPDK,
otherwise the PMD drivers are not going to be included.  Omitting the
option is not going to cause build failures, but OVS won't be able to
use most physical NICs.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-03-20 09:55:20 -07:00
Kevin Traynor
58397e6c1e netdev-dpdk: add dpdk vhost-cuse ports
This patch adds support for a new port type to userspace datapath
called dpdkvhost. This allows KVM (QEMU) to offload the servicing
of virtio-net devices to its associated dpdkvhost port. Instructions
for use are in INSTALL.DPDK.

This has been tested on Intel multi-core platforms and with clients
that have virtio-net interfaces.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2015-03-19 20:26:03 -07:00
Joe Stringer
c623ba497a compat: Add genlmsg_parse() helper function.
The first user will be the next patch.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 10:58:37 -08:00
Jesse Gross
9ffdbf4119 datapath: Enable OVS GSO to be used up to 3.18 if necessary.
There are two important GSO tunnel features that were introduced
after the 3.12 cutoff for our current out of tree GSO implementation:
 * 3.16 introduced support for outer UDP checksums.
 * 3.18 introduced support for verifying hardware support for protocols
   other than VXLAN.

In cases where these features are used, we should use OVS GSO to
ensure correct behavior. However, we also want to continue to use
kernel GSO or hardware TSO in existing situations. Therefore, this
extends the range of kernels where OVS GSO is available to 3.18 and
makes it easier to select which one to use.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-02-20 15:34:25 -08:00
Thomas Graf
adfaaeaced datapath: Allow building against 3.19.x
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-02-06 21:10:44 +01:00
Thomas Graf
6233a1bdf1 datapath: Account for "genetlink: pass only network namespace to genl_has_listeners()"
Upstream commit:
    genetlink: pass only network namespace to genl_has_listeners()

    There's no point to force the caller to know about the internal
    genl_sock to use inside struct net, just have them pass the network
    namespace. This doesn't really change code generation since it's
    an inline, but makes the caller less magic - there's never any
    reason to pass another socket.

    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Upstream: f8403a2 ("genetlink: pass only network namespace to genl_has_listeners()")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-02-03 22:31:20 +01:00
Thomas Graf
3174a818a1 datapath: Account for "vxlan: Group Policy extension"
Upstream commit:
    vxlan: Group Policy extension

    Implements supports for the Group Policy VXLAN extension [0] to provide
    a lightweight and simple security label mechanism across network peers
    based on VXLAN. The security context and associated metadata is mapped
    to/from skb->mark. This allows further mapping to a SELinux context
    using SECMARK, to implement ACLs directly with nftables, iptables, OVS,
    tc, etc.

    The group membership is defined by the lower 16 bits of skb->mark, the
    upper 16 bits are used for flags.

    SELinux allows to manage label to secure local resources. However,
    distributed applications require ACLs to implemented across hosts. This
    is typically achieved by matching on L2-L4 fields to identify the
    original sending host and process on the receiver. On top of that,
    netlabel and specifically CIPSO [1] allow to map security contexts to
    universal labels.  However, netlabel and CIPSO are relatively complex.
    This patch provides a lightweight alternative for overlay network
    environments with a trusted underlay. No additional control protocol
    is required.

               Host 1:                       Host 2:

          Group A        Group B        Group B     Group A
          +-----+   +-------------+    +-------+   +-----+
          | lxc |   | SELinux CTX |    | httpd |   | VM  |
          +--+--+   +--+----------+    +---+---+   +--+--+
          \---+---/                     \----+---/
              |                              |
          +---+---+                      +---+---+
          | vxlan |                      | vxlan |
          +---+---+                      +---+---+
              +------------------------------+

    Backwards compatibility:
    A VXLAN-GBP socket can receive standard VXLAN frames and will assign
    the default group 0x0000 to such frames. A Linux VXLAN socket will
    drop VXLAN-GBP  frames. The extension is therefore disabled by default
    and needs to be specifically enabled:

       ip link add [...] type vxlan [...] gbp

    In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket
    must run on a separate port number.

    Examples:
     iptables:
      host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200
      host2# iptables -I INPUT -m mark --mark 0x200 -j DROP

     OVS:
      # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL'
      # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop'

    [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy
    [1] http://lwn.net/Articles/204905/

    Signed-off-by: Thomas Graf <tgraf@suug.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Upstream: 351149 ("vxlan: Group Policy extension")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-02-03 21:56:31 +01:00
Thomas Graf
1dfb9f31f3 datapath: replace remaining users of arch_fast_hash with jhash
This patch effectively reverts commit 500f80872645 ("net: ovs: use CRC32
accelerated flow hash if available"), and other remaining arch_fast_hash()
users such as from nfsd via commit 6282cd565553 ("NFSD: Don't hand out
delegations for 30 seconds after recalling them.") where it has been used
as a hash function for bloom filtering.

While we think that these users are actually not much of concern, it has
been requested to remove the arch_fast_hash() library bits that arose
from [1] entirely as per recent discussion [2]. The main argument is that
using it as a hash may introduce bias due to its linearity (see avalanche
criterion) and thus makes it less clear (though we tried to document that)
when this security/performance trade-off is actually acceptable for a
general purpose library function.

Lets therefore avoid any further confusion on this matter and remove it to
prevent any future accidental misuse of it. For the time being, this is
going to make hashing of flow keys a bit more expensive in the ovs case,
but future work could reevaluate a different hashing discipline.

  [1] https://patchwork.ozlabs.org/patch/299369/
  [2] https://patchwork.ozlabs.org/patch/418756/

Upstream: 8754589 ("net: replace remaining users of arch_fast_hash with jhash")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-01-07 12:55:49 +01:00
Thomas Graf
97894370f5 datapath: move vlan pop/push functions into common code
So it can be used from out of openvswitch code.
Did couple of cosmetic changes on the way, namely variable naming and
adding support for 8021AD proto.

Note on backwards compatability:
Unlike the upstream version, the backport of skb_vlan_push() does not
support translating a hardware accelerated 8021AD tag to software.
This is not a problem though as it preserves existing behaviour.

Upstream: 93515d53 ("net: move vlan pop/push functions into common code")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-01-07 12:55:49 +01:00
Thomas Graf
5cce04b6f6 datapath: move make_writable helper into common code
note that skb_make_writable already exists in net/netfilter/core.c
but does something slightly different.

Upstream: e219512 ("net: move make_writable helper into common code")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-01-07 12:55:49 +01:00
Thomas Graf
17e3889fd1 datapath: Add __vlan_insert_tag() compat helper if not available
Since older kernels do not have skb->vlan_proto, it is assumed that
kernels which don't provide their own __vlan_insert_tag() will also
not have skb->vlan_proto. The backwards compat function therefore
only supports ETH_P_8021Q as the protocol type.

Upstream: 15255a43 ("vlan: introduce __vlan_insert_tag helper which does not free skb")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-01-07 12:55:49 +01:00
Thomas Graf
1f649f1c8d datapath: Account for rename to vlan_insert_tag_set_proto()
__vlan_put_tag() was renamed to vlan_insert_tag_set_proto() with
the argument list kept intact.

Upstream: 62749e ("vlan: rename __vlan_put_tag to vlan_insert_tag_set_proto")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-01-07 12:55:49 +01:00
Thomas Graf
4510f85327 datapath: Mark compatible with kernels up to 3.18.x
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-12-09 17:09:28 -08:00
Thomas Graf
0fcc086dc7 datapath: Check if nla_is_last() is available in <net/netlink.h>
nla_is_last() is not available in 3.18, it's only in net-next.
Convert to grep based to check to account for distribution backports.

Fixes: 684b5f ("datapath: Rename last_action() as nla_is_last() and move to netlink.h")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-12-09 17:09:26 -08:00
Pravin B Shelar
8abaa53cac datapath: Use upstream ipv6_find_hdr().
ipv6_find_hdr() already fixed in newer upstram kernel by Ansis, we
can start using this API safely.
This patch also backports fix (ipv6: ipv6_find_hdr restore prev
functionality) to compat ipv6_find_hdr().

CC: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2014-10-23 19:09:23 -07:00
maryam.tahhan
b35839f385 netdev-dpdk: Move to DPDK 1.7.1
This patch updates the documentation to reflect that DPDK 1.7.1
is supported. Travis scripts have also been updated to reflect
this. DPDK phy and ring ports were validated against DPDK 1.7.1.

Reviewed-by: Mark D. Gray <mark.d.gray@intel.com>
Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-10-20 11:19:42 -07:00