mirror of https://github.com/openvswitch/ovs synced 2025-08-29 05:18:13 +00:00

17132 Commits

Author SHA1 Message Date
Timothy Redaelli
fce20b8b73 ovs-ctl: Permit to specify additional options
Currently, it is not possible to pass additional options to
ovs-vswitchd and ovsdb-server through ovs-ctl (for example, to set a
different log level during daemon startup).

This patch adds --ovs-vswitchd-options and --ovsdb-server-options
options to ovs-ctl in order to specify the additional options.

Due to word splitting, it may not be possible to specify an option that
includes whitespace.

Reported-at: https://bugzilla.redhat.com/1664794
Reported-by: Matt Flusche <mflusche@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 17:52:17 -08:00
Daniel Alvarez
612f80fa8e ovn: change load balancer references to weak in NB schema
When a load balancer is added to multiple logical switches
and routers, it has to be removed from all of them before
it can be deleted, due to the current 'strong' reference
in the NB schema.

By changing it to 'weak', users can simply remove the load
balancer without having to remove all the references manually.
In particular, this will make things easier for networking-ovn,
the OpenStack integration project as it'll save some
calculations upon load balancer deletion.

The update path has been successfully tested from the previous version
of the schema.

Acked-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:45:39 -08:00
Timothy Redaelli
1f6daa7311 ovs-lib.in: Cleanup old socket and pidfiles in stop_daemon
Currently if a client crashes (signal 11) the unix socket (.ctl) and the
pidfile may not be deleted when you use ovs-ctl stop or restart.

Moreover since ovs-appctl is used on a closed socket some warnings are
printed.

This commit deletes the pidfile and the unix socket, then returns without
running ovs-appctl, if the pidfile points to a nonexistent pid.

Reported-at: https://bugzilla.redhat.com/1667845
Reported-by: Candido Campos <ccamposr@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
a8b12ea59c travis: Drop redundant DPDK build check.
This check is covered by 'TESTSUITE=1 DPDK=1'.
No need to run it separately.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
0fdc054fd1 travis: Use parallel jobs for DPDK and sparse builds.
This saves a few minutes.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
13429e3299 travis: Enable printing of executed commands.
This increases the output by a few lines, but gives important
information regarding commands and their exact arguments.
Very useful for debugging.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
ad68ace08e travis: Dump config.log on configure failures.
Useful for debugging.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
8e7d5b6434 travis: Run testsuite with desired options.
'make distcheck' executes its own './configure' without any of the options
provided to the script. This means that in the current configuration
Travis CI always re-builds and runs the testsuite on default binaries,
i.e. we're not checking the testsuite with DPDK, not checking it
with '--enable-shared' and not checking it with '-ljemalloc'.
We are just running the testsuite 8 times without arguments. Only the
compiler changes (gcc or clang) because CC is exported by Travis.

This patch reorders the commands in the build script and provides
'DISTCHECK_CONFIGURE_FLAGS' to force 'make distcheck' using our
desired configuration.

Another issue addressed here is that we will no longer build
twice in case of TESTSUITE.

For linking inside the distcheck we also need to provide the absolute path
to the DPDK libraries.

'configure' is executed before 'distcheck' to have a Makefile target.
It's executed without arguments because 'configure' inside the
'distcheck' will fail if we use a sparse-wrapped CC.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
914ea15d85 automake: Clean up cxxtest.cc.
'distcheck' complains on some configurations:

  ERROR: files left in build directory after distclean:
  ./include/openvswitch/cxxtest.cc

CC: Ben Pfaff <blp@ovn.org>
Fixes: 994bfc298502 ("Automatically verify that OVS header files work OK in C++ also.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
b95cd0e505 datapath: Clean up some gcov, tmp and cache files.
'distcheck' complains about these files while building --with-linux.

  ERROR: files left in build directory after distclean:
  ./datapath/linux/.tmp_ip6_gre.gcno
  ./datapath/linux/.tmp_ip_tunnels_core.gcno
  ./datapath/linux/.tmp_genetlink-openvswitch.gcno
  ./datapath/linux/.tmp_stt.gcno
  <..>
  ./datapath/linux/.tmp_versions/vport-gre.mod
  ./datapath/linux/.tmp_versions/vport-geneve.mod
  ./datapath/linux/.tmp_versions/vport-vxlan.mod
  ./datapath/linux/.tmp_versions/vport-lisp.mod
  ./datapath/linux/.tmp_versions/vport-stt.mod
  <..>
  ./datapath/linux/.dev-openvswitch.o.d
  ./datapath/linux/.ip_tunnels_core.o.d
  ./datapath/linux/.vport.o.d
  ./datapath/linux/.udp_tunnel.o.d
  ./datapath/linux/.cache.mk

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:04 -08:00
Ilya Maximets
b6b4f5b8cf travis: Fix building datapath instead of userspace with DPDK_SHARED.
The current script does not check the build of OVS with DPDK.
It builds datapath instead.

CC: Ian Stokes <ian.stokes@intel.com>
Fixes: edfe8d263d2e ("travis: Add dpdk shared library build.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 08:44:03 -08:00
Adi Nissim
0227bf092e lib/tc: Support optional tunnel id
Currently the TC tunnel_key action is always
initialized with the given tunnel id value. However,
some tunneling protocols define the tunnel id as an optional field.

This patch initializes the id field of tunnel_key:set and tunnel_key:unset
only if a value is provided.

In the case that a tunnel key value is not provided by the user
the key flag will not be set.

Signed-off-by: Adi Nissim <adin@mellanox.com>
Acked-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-02-11 11:02:43 +01:00
Ilya Maximets
47ab42a0b5 acinclude: Drop DPDK_EXTRA_LIB variable.
AC_SEARCH_LIBS enables the libraries itself:

  checking for library containing get_mempolicy... -lnuma
  checking for library containing pcap_dump... -lpcap

So, they are available in LIBS. No need to add them twice.

Also, DPDK_EXTRA_LIB doesn't even work, because each check overwrites
the variable instead of appending the new library. It was first
misused while making libnuma optional and was copy-pasted to several
places after that.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-07 14:50:19 +00:00
Ian Stokes
a41b2aea4c AUTHORS: Add Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-06 12:26:16 +00:00
Ian Stokes
1d6139a487 AUTHORS: Add Asaf Penso <asafp@mellanox.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-06 12:06:33 +00:00
Asaf Penso
fa2353af7e netdev-dpdk: Memset rte_flow_item on a need basis.
In the netdev_dpdk_add_rte_flow_offload function, different rte_flow_items
are created as part of the pattern matching.

For most of them, there is a check for whether the wildcards are nonzero.
If they are zero, nothing is done with the rte_flow_item.

Before the wildcard check, and regardless of its result, the
rte_flow_item is memset.

The patch performs the memset only if the wildcard condition is
true, thus saving the cycles spent on memset when it is not needed.
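
The idea can be sketched as follows. This is a minimal illustration, not
the netdev-dpdk code: 'struct ipv4_spec' and 'prepare_ipv4_item' are
hypothetical stand-ins for an rte_flow_item spec and the code that fills it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the spec of one flow item. */
struct ipv4_spec {
    uint32_t src;
    uint32_t dst;
};

/* Fills 'spec' only when the wildcard mask selects at least one bit.
 * Returns 1 if the item was prepared, 0 if it was skipped. */
static int
prepare_ipv4_item(struct ipv4_spec *spec, uint32_t src, uint32_t src_mask)
{
    assert(spec != NULL);
    if (!src_mask) {
        return 0;                   /* Nothing to match: skip the memset too. */
    }
    memset(spec, 0, sizeof *spec);  /* Zero the item only when it is used. */
    spec->src = src & src_mask;
    return 1;
}
```

With a zero mask the caller's 'spec' is left untouched, so no cycles are
spent clearing an item that never becomes part of the pattern.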

Signed-off-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-06 11:41:21 +00:00
Ben Pfaff
49d6aafb9c ofproto: Don't always treat passive controllers as "equal".
If a passive controller chooses to configure itself as a slave controller,
I don't know a reason why it should be considered "equal" for some
purposes.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-05 13:44:11 -08:00
Ben Pfaff
c66be90bd9 vswitchd: Allow user to configure controllers as "primary" or "service".
Normally it makes sense for an active connection to be primary and a
passive connection to be a service connection, but I've run into a corner
case where it is better for a passive connection to be a primary
connection.  This specific case is for use with OFtest, which expects to be
a primary controller.  However, it also wants to reconnect frequently,
which is slow for active connections because of the backoff; by
configuring a passive, primary controller, OFtest can reconnect as
frequently and as quickly as it wants, making the overall test much faster.

Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-05 13:44:07 -08:00
Ben Pfaff
29718ad49d Remove support for OpenFlow 1.6 (draft).
ONF abandoned the OpenFlow specification, so that OpenFlow 1.6 will never
be completed.  It did not contain much in the way of useful features, so
remove what support Open vSwitch already had.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2019-02-05 09:21:19 -08:00
Ben Pfaff
a0baa7dfa4 connmgr: Make treatment of active and passive connections more uniform.
Until now, connmgr has handled active and passive OpenFlow connections in
quite different ways.  Any active connection, whether it was currently
connected or not, was always maintained as an ofconn.  Whenever such a
connection (re)connected, its settings were cleared.  On the other hand,
passive connections had a separate listener which created an ofconn when
a new connection came in, and these ofconns would be deleted when such a
connection was closed.  This approach is inelegant and has occasionally
led to bugs when reconnection didn't clear all of the state that it
should have.

There's another motivation here.  Currently, active connections are
always primary controllers and passive connections are always service
controllers (as documented in ovs-vswitchd.conf.db(5)).  Sometimes it would
be useful to have passive primary controllers (maybe active service
controllers too but I haven't personally run into that use case).  As is,
this is difficult to implement because there is so much different code in
use between active and passive connections.  This commit will make it
easier.

Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-05 08:46:59 -08:00
Ilya Maximets
ae6e4f12fc travis: Speed up linux kernel downloads.
CDN links are much faster on average. https://www.kernel.org/
links usually show less than 10 MB/s, while https://cdn.kernel.org/
can give up to 200 MB/s and usually shows speeds much higher than
10 MB/s. Also, 'xz' archives are 30-50 MB smaller than gzip ones.
It takes a bit more time to unpack them, but that is negligible
compared with the download time.

For example,
  linux-3.16.54.tar.gz - 122064395 (116M)
  linux-3.16.54.tar.xz -  81057528 (77M)

Downloading the 'xz' archive via the CDN link is the default way of
downloading a kernel that kernel.org provides.

Some examples from Travis builds:
Before:

  100%[==========================>] 122,064,395 3.11MB/s   in 36s
  (3.23 MB/s) - 'linux-3.16.54.tar.gz' saved [122064395/122064395]

  100%[==========================>] 157,764,715 7.16MB/s   in 24s
  (6.28 MB/s) - 'linux-4.17.14.tar.gz' saved [157764715/157764715]

After:

  100%[==========================>] 81,057,528  95.0MB/s   in 0.8s
  (95.0 MB/s) - 'linux-3.16.54.tar.xz' saved [81057528/81057528]

  100%[==========================>] 102,195,552  218MB/s   in 0.4s
  (218 MB/s) - 'linux-4.17.14.tar.xz' saved [102195552/102195552]

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-05 06:45:54 -08:00
psiyengar
731dbbbe04 Fix OpenFlow v1.3.4 Conf test failures: 430.500, 430.510
This commit adds additional verification to nx_pull_header__()
in lib/nx-match.c to distinguish between bad match and bad action
header conditions and return the appropriate error type/code.

Signed-off-by: Prashanth Iyengar <prashanth_iyengar@alliedtelesis.com>
Reviewed-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz>
Reviewed-by: Rahul Gupta <Rahul_Gupta@alliedtelesis.com>
2019-02-04 16:33:30 -08:00
Ilya Maximets
7f289e0212 skiplist: Drop data comparison in skiplist_delete.
Current version of 'skiplist_delete' uses data comparator to check
if the node that we're removing exists on current level. i.e. our
node 'x' is the next of update[i] on the level i.
But it's enough to just check pointers for that purpose.

Here is a small example of how the data structures look at
this moment:

        i   a   b   c        x        d   e   f
        0  [ ]>[ ]>[*] ---> [ ] ---> [#]>[ ]>[ ]
        1  [ ]>[*] -------> [ ] -------> [#]>[ ]
        2  [ ]>[*] -------> [ ] -----------> [#]
        3  [ ]>[*] ------------------------> [ ]
        4  [*] ----------------------------> [ ]

                        0  1  2  3  4
           update[] = { c, b, b, b, a }
        x.forward[] = { d, e, f }

        c.forward[0] = x
        b.forward[1] = x
        b.forward[2] = x
        b.forward[3] = f
        a.forward[4] = f

  Target:

        i   a   b   c                 d   e   f
        0  [ ]>[ ]>[*] ------------> [#]>[ ]>[ ]
        1  [ ]>[*] --------------------> [#]>[ ]
        2  [ ]>[*] ------------------------> [#]
        3  [ ]>[*] ------------------------> [ ]
        4  [*] ----------------------------> [ ]

        c.forward[0] = x.forward[0] = d
        b.forward[1] = x.forward[1] = e
        b.forward[2] = x.forward[2] = f
        b.forward[3] = f
        a.forward[4] = f

i.e. we're updating forward pointers while update[i].forward[i] == x.
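
The pointer-only check can be sketched as follows. This is a minimal
illustration, not the OVS skiplist code: 'struct node', 'MAX_LEVELS' and
'unlink_node' are hypothetical names, and the fixed level count is an
assumption chosen to match the five-level diagram above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 5    /* Assumed cap, matching the diagram above. */

struct node {
    int key;
    struct node *forward[MAX_LEVELS];
};

/* Unlinks 'x' given update[i], the last node before 'x' on level i.
 * Levels are walked bottom-up and the loop stops at the first level
 * where update[i] no longer points at 'x'. */
static void
unlink_node(struct node *update[MAX_LEVELS], struct node *x)
{
    assert(x != NULL);
    for (int i = 0; i < MAX_LEVELS; i++) {
        if (update[i]->forward[i] != x) {   /* Pointer check, no comparator. */
            break;
        }
        update[i]->forward[i] = x->forward[i];
    }
}
```

Applied to the example, levels 0 to 2 are relinked to d, e and f, and the
loop stops at level 3 because b.forward[3] already points at f.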

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 16:22:04 -08:00
Ben Pfaff
29e41156d6 skiplist: Remove 'height' from skiplist_node.
This member was write-only: it was initialized and never used later on.

Thanks to Esteban Rodriguez Betancourt <estebarb@hpe.com> for the
following additional rationale:

    In this case you are right, the "height" member is not only not
    used, it is in fact not required, and can be safely removed,
    without causing security issues.

    The code can't read past the end of the 'forward' array because
    of the skiplist "level" member, which specifies the maximum height
    of the whole skiplist.

    The "level" field is updated in insertions and deletions, so that
    in insertion the root node will point to the newly created item
    (if there isn't a list there yet). At the deletions, if the
    deleted item is the last one at that height then the root is
    modified to point to NULL at that height, and the whole skiplist
    height is decremented.

    For the forward_to case:

    - If a node is found in a list of level/height N, then it has
      height N (that's why it was inserted in that list)

    - forward_to travels through nodes in the same level, so it is
      safe, as it doesn't go up.

    - If a node has height N, then it belongs to all the lists
      initiated at root->forward[n, n-1 ,n-2, ..., 0]

    - forward_to may go to lower levels, but that is safe, because of
      previous point.

    So, the protection is given by the "level" field in skiplist root
    node, and it is enough to guarantee that the code won't go off
    limits at 'forward' array. But yes, the height field is unused,
    unneeded, and can be removed safely.

CC: Esteban Rodriguez Betancourt <estebarb@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 16:22:01 -08:00
Darrell Ball
c3f6bae258 conntrack: Fix possibly uninitialized memory.
There are a few cases where struct 'conn_key' padding may be unspecified
according to the C standard.  Practically, it seems implementations don't
have an issue, but it is better to be safe. The code paths modified are not
hot ones.  Fix this by doing a memcpy in these cases in lieu of a
structure copy.
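
The difference can be sketched as follows. This is a minimal illustration
with a hypothetical 'struct key', not the real 'conn_key': a structure
assignment is not guaranteed to carry padding bytes across, while memcpy()
copies every byte of the object.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key with interior padding after 'proto' on common ABIs. */
struct key {
    uint8_t proto;
    uint32_t addr;
};

/* Copies every byte of '*src', padding included, so that a later
 * byte-wise hash or memcmp() of the two objects agrees. */
static void
key_copy(struct key *dst, const struct key *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

This matters when keys are hashed or compared as raw bytes, which is the
situation the commit guards against.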

Found by inspection.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 16:19:17 -08:00
Li RongQing
11e4765329 flow: fix a possible memory leak in parse_ct_state
state_s should always be freed before exiting parse_ct_state.
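
A common way to guarantee this is a single cleanup label shared by all
exit paths. The sketch below is illustrative only: 'count_tokens' is a
hypothetical parser, not parse_ct_state, and 'copy' plays the role of
state_s.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: counts '|'-separated tokens in 'input', rejecting
 * empty tokens.  Returns the token count, or -1 on error. */
static int
count_tokens(const char *input)
{
    assert(input != NULL);
    size_t len = strlen(input);
    char *copy = malloc(len + 1);   /* Working copy that gets modified. */
    int n = 0;

    if (!copy) {
        return -1;
    }
    memcpy(copy, input, len + 1);

    for (char *tok = copy; ; ) {
        char *end = strchr(tok, '|');
        if (end) {
            *end = '\0';
        }
        if (*tok == '\0') {
            n = -1;                 /* Error: use the shared cleanup. */
            goto out;
        }
        n++;
        if (!end) {
            break;
        }
        tok = end + 1;
    }

out:
    free(copy);                     /* Reached on success and on error. */
    return n;
}
```

Because the error path jumps to the same label as the success path, the
working copy cannot be leaked by an early return.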

Fixes: b4293a336d8d ("conntrack: Move ct_state parsing to lib/flow.c")
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 16:13:08 -08:00
Ashish Varma
3782d8efb5 ofproto-dpif-trace: Fix for the segmentation fault in ofproto_trace().
Add a NULL check for the "next_ct_states" argument passed to the
"ofproto_trace()" function. In the normal scenario, this is non-NULL. A NULL
"next_ct_states" argument is passed from the "upcall_xlate()" function on
encountering an XLATE_RECURSION_TOO_DEEP or XLATE_TOO_MANY_RESUBMITS error.

VMware-BZ: #2282287
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 15:54:56 -08:00
Yi-Hung Wei
ada5efce10 datapath: Fix IPv6 later frags parsing
Upstream commit:
    commit 41e4e2cd75346667b0c531c07dab05cce5b06d15
    Author: Yi-Hung Wei <yihung.wei@gmail.com>
    Date:   Thu Jan 3 09:51:57 2019 -0800

    openvswitch: Fix IPv6 later frags parsing

    The previous commit fa642f08839b
    ("openvswitch: Derive IP protocol number for IPv6 later frags")
    introduces IP protocol number parsing for IPv6 later frags that can mess
    up the network header length calculation logic, i.e. nh_len < 0.
    However, the network header length calculation is mainly for deriving
    the transport layer header in the key extraction process which the later
    fragment does not apply.

    Therefore, this commit skips the network header length calculation to
    fix the issue.

    Reported-by: Chris Mi <chrism@mellanox.com>
    Reported-by: Greg Rose <gvrose8192@gmail.com>
    Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags")
    Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Fixes: 9a4ab6da01f7 ("datapath: Derive IP protocol number for IPv6 later frags")
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:45:23 -08:00
Yi-Hung Wei
0b37a7e19e datapath: Derive IP protocol number for IPv6 later frags
Upstream commit:
    commit fa642f08839bf2ff35b2f6c6a6c062aee8121ba8
    Author: Yi-Hung Wei <yihung.wei@gmail.com>
    Date:   Tue Sep 4 15:33:41 2018 -0700

    openvswitch: Derive IP protocol number for IPv6 later frags

    Currently, OVS only parses the IP protocol number for the first
    IPv6 fragment, but sets the IP protocol number for the later fragments
    to be NEXTHDR_FRAGMENT.  This patch tries to derive the IP protocol
    number for the IPv6 later frags so that we can match that.

    Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

CC: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:45:22 -08:00
Ross Lagerwall
29478e6c72 datapath: Avoid OOB read when parsing flow nlattrs
Upstream commit:
    commit 04a4af334b971814eedf4e4a413343ad3287d9a9
    Author: Ross Lagerwall <ross.lagerwall@citrix.com>
    Date:   Mon Jan 14 09:16:56 2019 +0000

    openvswitch: Avoid OOB read when parsing flow nlattrs

    For nested and variable attributes, the expected length of an attribute
    is not known and marked by a negative number.  This results in an OOB
    read when the expected length is later used to check if the attribute is
    all zeros. Fix this by using the actual length of the attribute rather
    than the expected length.

    Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:45:18 -08:00
Yifeng Sun
7c84d7f481 datapath: Add support for kernel 4.18.x
No code change is necessary to support 4.18.x.

Only one kernel test failed and it is in the process of being fixed.

Updated .travis.yml to include 4.18.x and also use latest 4.17 version.
Updated test files to test 4.18 kernel.

Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:42:02 -08:00
Yifeng Sun
7c5793e62a dpif-netlink: Fix a bug that causes duplicate key error in datapath
Kmod tests 122 and 123 failed and the kernel reported a "Duplicate key of
type 6" error. Further debugging revealed that nl_attr_find__() should
start looking for OVS_KEY_ATTR_ETHERTYPE from the offset returned by
a previously called nl_msg_start_nested(). This patch fixes it.

Tests 122 and 123 were skipped by kernel 4.15 and older versions.
Kernel 4.16 and later kernels start showing this failure.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:37:48 -08:00
Yifeng Sun
fcfd14ce3a test: Fix failed test "flow resume with geneve tun_metadata"
Test "flow resume with geneve tun_metadata" failed because there is
no controller running to handle the continuation message. A previous
commit deleted the line that starts ovs-ofctl as a controller in
order to avoid a race condition on monitor log. This patch adds
back this line but omits the log file because this test doesn't
depend on the log file.

Fixes: e8833217914f9c071c49 ("system-traffic.at: avoid a race condition on monitor log")
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
CC: David Marchand <david.marchand@redhat.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:35:55 -08:00
Vishal Deep Ajmera
9b2b84973d Support for match & set ICMPv6 reserved and options type fields
Currently OVS supports all ARP protocol fields as OXM match fields to
implement the relevant ARP procedures for IPv4. This includes support
for matching, copying and setting ARP fields. In IPv6, ARP has been
replaced by ICMPv6 neighbor discovery (ND) procedures, neighbor
advertisement and neighbor solicitation.

The support for ICMPv6 fields in OVS is not complete for the use cases
equivalent to ARP in IPv4. OVS lacks support for matching, copying and
setting the "ND option type" and "ND reserved" fields. Without these, users
cannot implement all ICMPv6 ND procedures for IPv6 support.

This commit adds additional OXM fields to OVS for ICMPv6 "ND option type"
and ICMPv6 "ND reserved" using the OXM extension mechanism. This allows
support for parsing these fields from an ICMPv6 packet header and extending
the OpenFlow protocol with specifications for these new OXM fields for
matching, copying and setting.

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 13:34:41 -08:00
Yifeng Sun
401eacfb22 odp-util: Stop parse odp actions if nlattr is overflow
`encap = nl_msg_start_nested(key, OVS_KEY_ATTR_ENCAP)` ensures that
key->size >= (encap + NLA_HDRLEN), so the `if` statement is safe.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11306
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 12:41:42 -08:00
Yifeng Sun
1f886f070f ofp-actions: Set an action depth limit to prevent stackoverflow by ofpacts_parse
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12557
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 12:34:14 -08:00
Ben Pfaff
561ac8382e AUTHORS: Add Hyong Youb Kim <hyonkim@cisco.com>.
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 12:30:07 -08:00
Hyong Youb Kim
6012a42000 ovs-tcpdump: Fix an undefined variable
Run ovs-tcpdump without --span, and it throws the following
exception. Define mirror_select_all to avoid the error.

Traceback (most recent call last):
  File "/usr/local/bin/ovs-tcpdump", line 488, in <module>
    main()
  File "/usr/local/bin/ovs-tcpdump", line 454, in main
    mirror_select_all)
UnboundLocalError: local variable 'mirror_select_all' referenced before assignment

Fixes: 0475db71c650 ("ovs-tcpdump: Add --span to mirror all ports on bridge.")
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 12:28:09 -08:00
Numan Siddique
8e37d8fbb5 ovn-controller: Fix chassisredirect port flapping when ovs-vswitchd crashes
On a chassis where ovs-vswitchd crashes for some reason, the BFD status
doesn't get updated in the ovs db. ovn-controller will keep reading the old
BFD status even though ovs-vswitchd has crashed. This results in the
chassisredirect port claim flapping between the master chassis and the
chassis with the next higher priority if ovs-vswitchd crashes on the master
chassis.

All the other chassis notice that the BFD status with the master chassis is
down, and hence the chassis with the next higher priority claims the port.
But according to the master chassis, the BFD status is fine, and it claims
back the chassisredirect port. This results in flapping. The issue gets
resolved when ovs-vswitchd comes back, but until then it leads to a lot of
SB DB transactions and high CPU usage in the ovn-controllers.

This patch fixes the issue by checking the OF connection status of the
ovn-controller with ovs-vswitchd and calculating the active bfd tunnels
only if it's connected.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 11:36:35 -08:00
Ilya Maximets
83f93ef305 system-dpdk-macros.at: Drop dpdk-socket-mem configuration.
There are two reasons:
1. OVS provides same default itself.
2. socket-mem is not necessary with dynamic memory model in DPDK 18.11.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-04 17:35:41 +00:00
Darrell Ball
298530b878 conntrack: Fix max size for inet_ntop() call.
The call to inet_ntop() in repl_ftp_v6_addr() is 1 short to handle
the maximum possible V6 address size for v4 mapping case.
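
For reference, POSIX defines INET6_ADDRSTRLEN (46) as the buffer size for
the longest textual IPv6 address, including the v4-mapped notation and
the terminating NUL. The sketch below only illustrates sizing a buffer
that way; 'format_v6' is a hypothetical helper, not the code in
repl_ftp_v6_addr().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/* Formats 'addr' into 'buf', which must hold INET6_ADDRSTRLEN bytes.
 * Returns the string length, or -1 on failure. */
static int
format_v6(const struct in6_addr *addr, char buf[INET6_ADDRSTRLEN])
{
    assert(addr != NULL && buf != NULL);
    if (!inet_ntop(AF_INET6, addr, buf, INET6_ADDRSTRLEN)) {
        return -1;
    }
    return (int) strlen(buf);
}
```

A buffer one byte shorter can make inet_ntop() fail with ENOSPC for the
longest v4-mapped addresses, which is the kind of off-by-one the commit
describes.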

Found by inspection.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 09:42:50 -08:00
Darrell Ball
cd7c99a6aa conntrack: fix ftp ipv4 address substitution.
When replacing the ipv4 address in repl_ftp_v4_addr(), the remaining size
was incorrectly calculated, which could lead to the wrong replacement
adjustment.

This goes unnoticed most of the time, unless you carefully choose your
initial and replacement addresses.

An example of a failing address combination is 10.1.1.200 DNAT'd to
10.1.100.1.

Fix this by doing something similar to V6 and also splicing out common
code for better coverage and maintainability.

A test is updated to exercise different initial and replacement addresses
and another test is added.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Reported-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 09:42:49 -08:00
Ilya Maximets
8411b6ccec dpdk: Limit DPDK memory usage.
Since the 18.05 release, DPDK has used a dynamic memory model in which
hugepages can be allocated on demand. At the same time, the '--socket-mem'
option was re-defined as the size of pre-allocated memory, i.e. memory
that should be allocated at startup and cannot be freed.
So, DPDK with the new memory model can allocate more hugepage memory
than specified in the '--socket-mem' or '-m' options.

This change adds a new configurable, 'other_config:dpdk-socket-limit',
which can be used to limit the amount of memory DPDK can use.
It uses the new DPDK option '--socket-limit'.
Ex.:

  ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-limit="1024,1024"

Also, in order to preserve the old behaviour, if '--socket-limit' is not
specified, it defaults to the amount of memory specified by the
'--socket-mem' option, i.e. OVS will not be able to allocate more.
This is needed, for example, to prevent OVS from allocating more memory
than Nova has reserved for it in OpenStack installations.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-02-01 12:57:17 +00:00
Pieter Jansen van Vuuren
dbcb014d1f lib/tc: add set ipv6 traffic class action offload via pedit
Extend ovs-tc translation by allowing non-byte-aligned fields
for set actions. Use new boundary shifts and add set ipv6 traffic
class action offload via pedit.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:25 +01:00
Pieter Jansen van Vuuren
95431229b9 lib/tc: add set ipv4 dscp and ecn action offload via pedit
Add setting of ipv4 dscp and ecn fields in tc offload using pedit.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:20 +01:00
Pieter Jansen van Vuuren
0d9f0cd44b lib/tc: fix 32 bits shift for pedit offset calculation
pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words. One mask for leading bytes that should not be
updated and another mask for trailing bytes that should not be
updated. The masks are created using bit shifts.

In the case of the mask to omit trailing bytes a right bit shift
is used. Currently the code can produce shifts of 1, 2, 3 or 4
bytes (8, 16, 24 or 32 bits) based on the alignment of the end
of field being set.

However, a shift of 32 bits on a 32bit value is undefined.
As it stands the code relies on the result of UINT32_MAX >> 32
being UINT32_MAX. Or in other words a mask that results in the
pedit action setting all bytes of the word under operation.

This patch adjusts the code to use a shift of 0 for this case,
which gives the same result as the undefined behaviour that was
relied on, and appears logically correct as the desire is for no
trailing bytes (or bits!) to be omitted from the set action.
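
The adjustment can be sketched as follows. This is a minimal illustration
under assumptions: 'trailing_mask' and its 'end_bytes' parameter are
hypothetical names that mirror the 1 to 4 byte shifts described above,
not the actual lib/tc code.

```c
#include <assert.h>
#include <stdint.h>

/* Returns UINT32_MAX shifted right by 'end_bytes' bytes, except that a
 * full-word shift (4 bytes, 32 bits) is replaced by a shift of 0.
 * Shifting a 32-bit value by 32 is undefined behaviour in C. */
static uint32_t
trailing_mask(unsigned int end_bytes)
{
    assert(end_bytes >= 1 && end_bytes <= 4);
    unsigned int shift = end_bytes == 4 ? 0 : end_bytes * 8;
    return UINT32_MAX >> shift;
}
```

The 4-byte case now yields UINT32_MAX by construction instead of by
whatever the compiler and CPU happen to do with a 32-bit shift.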

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:18 +01:00
Pieter Jansen van Vuuren
f8b63e5930 lib/tc: make pedit mask calculations byte order agnostic
pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words.

The struct tc_pedit_key structure, which is part of the kernel
ABI, uses host byte order fields to store the mask and value for
a pedit action, however, these fields contain values in network
byte order.

In order to allow static analysis tools to check for endianness
problems this patch adds a local version of struct tc_pedit_key
which uses big endian types and refactors the relevant code as
appropriate.

In the course of making this change it became apparent that the
calculation of masks was occurring using host byte order although
the values are in network byte order. This patch also fixes that
problem by shifting values in host byte order and then converting
them to network byte order. It is believed that this fixes a bug on big
endian systems, although we are not in a position to test that.
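
The "shift in host order, then convert" rule can be sketched as follows.
This is a minimal illustration, not the lib/tc code:
'leading_bytes_mask_be' is a hypothetical helper that builds a mask over
the first bytes of a word as they appear on the wire.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Returns, in network byte order, a mask covering the first 'n' bytes
 * (in wire order) of a 32-bit word. */
static uint32_t
leading_bytes_mask_be(unsigned int n)
{
    assert(n >= 1 && n <= 4);
    uint32_t host = UINT32_MAX << ((4 - n) * 8);    /* Shift in host order. */
    return htonl(host);                             /* Then convert once. */
}
```

Because the shift happens before the conversion, the bytes in memory are
the same on little endian and big endian hosts. Shifting a value that is
already in network byte order would select different bytes depending on
the host.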

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-01-31 10:53:10 +01:00
Ilya Maximets
68c00e3eed dpdk: Use svec instead of re-inventing.
No need to implement dynamic vector to store arguments.
'svec' perfectly covers all the needed functionality.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-01-30 16:09:01 +00:00
Han Zhou
509c92dcdc raft.c: Remove noisy INFO log
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-01-28 21:15:51 -08:00
Ilya Maximets
9c68ca3432 dpdk: Use dynamic string for socket-mem construction.
No need to allocate memory and use 'strcat' directly.
'dynamic-string' could do this for us.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-01-28 20:46:40 +00:00