Currently it is not possible to use ovs-ctl to specify additional options
for ovs-vswitchd and ovsdb-server (for example, to specify a different
log level during daemon startup).
This patch adds --ovs-vswitchd-options and --ovsdb-server-options
options to ovs-ctl in order to specify the additional options.
Due to word splitting, it may not be possible to specify an option that
includes whitespace.
Reported-at: https://bugzilla.redhat.com/1664794
Reported-by: Matt Flusche <mflusche@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When a load balancer is added to multiple logical switches
and routers, it has to be removed from all of them before
it can be deleted, due to the current 'strong' reference
in the NB schema.
By changing it to 'weak', users can simply remove the load
balancer without having to remove all the references manually.
In particular, this will make things easier for networking-ovn,
the OpenStack integration project as it'll save some
calculations upon load balancer deletion.
The update path has been successfully tested from the previous version
of the schema.
Acked-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Currently if a client crashes (signal 11) the unix socket (.ctl) and the
pidfile may not be deleted when you use ovs-ctl stop or restart.
Moreover since ovs-appctl is used on a closed socket some warnings are
printed.
If the pidfile points to a nonexistent pid, this commit deletes the
pidfile and the unix socket and then returns without running ovs-appctl.
Reported-at: https://bugzilla.redhat.com/1667845
Reported-by: Candido Campos <ccamposr@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This check is covered by 'TESTSUITE=1 DPDK=1'.
No need to run it separately.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This increases the output by a few lines, but gives important
information regarding commands and their exact arguments.
Very useful for debugging.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
'make distcheck' executes its own './configure' without any options
provided to the script. This means that in the current configuration
Travis CI always rebuilds and runs the testsuite on default binaries,
i.e. we're not checking the testsuite with DPDK, not checking it
with '--enable-shared' and not checking it with '-ljemalloc'.
We're just running the testsuite 8 times without arguments. Only the
compiler changes (gcc or clang) because CC is exported by Travis.
This patch reorders the commands in the build script and provides
'DISTCHECK_CONFIGURE_FLAGS' to force 'make distcheck' using our
desired configuration.
Another issue addressed here is that we will no longer build
twice in case of TESTSUITE.
For linking inside the distcheck we also need to provide the absolute
path to the DPDK libraries.
'configure' is executed before 'distcheck' to have a Makefile target.
It's executed without arguments because 'configure' inside the
'distcheck' will fail if we use a sparse-wrapped CC.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
'distcheck' complains on some configurations:
ERROR: files left in build directory after distclean:
./include/openvswitch/cxxtest.cc
CC: Ben Pfaff <blp@ovn.org>
Fixes: 994bfc298502 ("Automatically verify that OVS header files work OK in C++ also.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
'distcheck' complains about these files while building --with-linux.
ERROR: files left in build directory after distclean:
./datapath/linux/.tmp_ip6_gre.gcno
./datapath/linux/.tmp_ip_tunnels_core.gcno
./datapath/linux/.tmp_genetlink-openvswitch.gcno
./datapath/linux/.tmp_stt.gcno
<..>
./datapath/linux/.tmp_versions/vport-gre.mod
./datapath/linux/.tmp_versions/vport-geneve.mod
./datapath/linux/.tmp_versions/vport-vxlan.mod
./datapath/linux/.tmp_versions/vport-lisp.mod
./datapath/linux/.tmp_versions/vport-stt.mod
<..>
./datapath/linux/.dev-openvswitch.o.d
./datapath/linux/.ip_tunnels_core.o.d
./datapath/linux/.vport.o.d
./datapath/linux/.udp_tunnel.o.d
./datapath/linux/.cache.mk
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The current script does not check the build of OVS with DPDK.
It builds the datapath instead.
CC: Ian Stokes <ian.stokes@intel.com>
Fixes: edfe8d263d2e ("travis: Add dpdk shared library build.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Currently the TC tunnel_key action is always
initialized with the given tunnel id value. However,
some tunneling protocols define the tunnel id as an optional field.
This patch initializes the id field of tunnel_key:set and tunnel_key:unset
only if a value is provided.
If the user does not provide a tunnel key value,
the key flag will not be set.
Signed-off-by: Adi Nissim <adin@mellanox.com>
Acked-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
AC_SEARCH_LIBS enables the libraries itself:
checking for library containing get_mempolicy... -lnuma
checking for library containing pcap_dump... -lpcap
So, they are available in LIBS. No need to add them twice.
Also, DPDK_EXTRA_LIB doesn't even work, because each check overwrites
the variable instead of appending the new library. It was first
misused while making libnuma optional and was copy-pasted to several
places after that.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
In the netdev_dpdk_add_rte_flow_offload() function, different
rte_flow_items are created as part of the pattern matching.
For most of them, there is a check whether the wildcards are nonzero.
If they are zero, nothing is done with the rte_flow_item.
Before the wildcard check, and regardless of its result, the
rte_flow_item is memset.
The patch moves the memset call under the wildcard condition, so that it
runs only when the condition is true, saving memset cycles when not needed.
Signed-off-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
If a passive controller chooses to configure itself as a slave controller,
I don't know a reason why it should be considered "equal" for some
purposes.
Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Normally it makes sense for an active connection to be primary and a
passive connection to be a service connection, but I've run into a corner
case where it is better for a passive connection to be a primary
connection. This specific case is for use with OFtest, which expects to be
a primary controller. However, it also wants to reconnect frequently,
which is slow for active connections because of the backoff; by
configuring a passive, primary controller, OFtest can reconnect as
frequently and as quickly as it wants, making the overall test much faster.
Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
ONF abandoned the OpenFlow specification, so that OpenFlow 1.6 will never
be completed. It did not contain much in the way of useful features, so
remove what support Open vSwitch already had.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Until now, connmgr has handled active and passive OpenFlow connections in
quite different ways. Any active connection, whether it was currently
connected or not, was always maintained as an ofconn. Whenever such a
connection (re)connected, its settings were cleared. On the other hand,
passive connections had a separate listener which created an ofconn when
a new connection came in, and these ofconns would be deleted when such a
connection was closed. This approach is inelegant and has occasionally
led to bugs when reconnection didn't clear all of the state that it
should have.
There's another motivation here. Currently, active connections are
always primary controllers and passive connections are always service
controllers (as documented in ovs-vswitchd.conf.db(5)). Sometimes it would
be useful to have passive primary controllers (maybe active service
controllers too but I haven't personally run into that use case). As is,
this is difficult to implement because there is so much different code in
use between active and passive connections. This commit will make it
easier.
Signed-off-by: Ben Pfaff <blp@ovn.org>
CDN links are much faster on average. https://www.kernel.org/
links usually show less than 10 MB/s, while https://cdn.kernel.org/
can give up to 200 MB/s and usually shows speeds much higher than
10 MB/s. Also, 'xz' archives are 30-50 MB smaller than gzip ones.
It takes a bit more time to unpack them, but it's negligible
compared with the download time.
For example,
linux-3.16.54.tar.gz - 122064395 (116M)
linux-3.16.54.tar.xz - 81057528 (77M)
Downloading the 'xz' archive via the CDN link is the default way of
downloading the kernel that kernel.org provides.
Some examples from Travis builds:
Before:
100%[==========================>] 122,064,395 3.11MB/s in 36s
(3.23 MB/s) - 'linux-3.16.54.tar.gz' saved [122064395/122064395]
100%[==========================>] 157,764,715 7.16MB/s in 24s
(6.28 MB/s) - 'linux-4.17.14.tar.gz' saved [157764715/157764715]
After:
100%[==========================>] 81,057,528 95.0MB/s in 0.8s
(95.0 MB/s) - 'linux-3.16.54.tar.xz' saved [81057528/81057528]
100%[==========================>] 102,195,552 218MB/s in 0.4s
(218 MB/s) - 'linux-4.17.14.tar.xz' saved [102195552/102195552]
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit adds additional verification to nx_pull_header__()
in lib/nx-match.c to distinguish between bad match and bad action
header conditions and return the appropriate error type/code.
Signed-off-by: Prashanth Iyengar <prashanth_iyengar@alliedtelesis.com>
Reviewed-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz>
Reviewed-by: Rahul Gupta <Rahul_Gupta@alliedtelesis.com>
The current version of 'skiplist_delete' uses the data comparator to check
if the node that we're removing exists on the current level, i.e. that our
node 'x' is the next of update[i] on the level i.
But it's enough to just check pointers for that purpose.
Here is a small example of how the data structures look at
this moment:
i   a   b   c        x        d   e   f
0  [ ]>[ ]>[*] ---> [ ] ---> [#]>[ ]>[ ]
1  [ ]>[*] -------> [ ] -------> [#]>[ ]
2  [ ]>[*] -------> [ ] -----------> [#]
3  [ ]>[*] ------------------------> [ ]
4  [*] ----------------------------> [ ]
                0  1  2  3  4
update[]    = { c, b, b, b, a }
x.forward[] = { d, e, f }
c.forward[0] = x
b.forward[1] = x
b.forward[2] = x
b.forward[3] = f
a.forward[4] = f
Target:
i   a   b   c                 d   e   f
0  [ ]>[ ]>[*] ------------> [#]>[ ]>[ ]
1  [ ]>[*] --------------------> [#]>[ ]
2  [ ]>[*] ------------------------> [#]
3  [ ]>[*] ------------------------> [ ]
4  [*] ----------------------------> [ ]
c.forward[0] = x.forward[0] = d
b.forward[1] = x.forward[1] = e
b.forward[2] = x.forward[2] = f
b.forward[3] = f
a.forward[4] = f
i.e. we're updating forward pointers while update[i].forward[i] == x.
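The pointer check can be illustrated with a minimal sketch. It assumes a
simplified node layout; 'struct node', 'MAX_LEVELS' and 'unlink_node' are
hypothetical names used only for illustration, not the actual OVS skiplist
code:

```c
#include <stddef.h>

#define MAX_LEVELS 5

/* Hypothetical, simplified node; not the real OVS skiplist layout. */
struct node {
    int key;
    struct node *forward[MAX_LEVELS];
};

/* Unlinks 'x', where update[i] is the last node before 'x' on level i.
 * Only pointers are compared; no data comparator is involved. */
static void
unlink_node(struct node *update[], struct node *x, int n_levels)
{
    for (int i = 0; i < n_levels; i++) {
        if (update[i]->forward[i] != x) {
            break;      /* 'x' is not linked on level i or above. */
        }
        update[i]->forward[i] = x->forward[i];
    }
}
```

With the layout from the diagram above, the loop rewrites levels 0-2 and
stops at level 3, where b.forward[3] is already f.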
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This member was write-only: it was initialized and never used later on.
Thanks to Esteban Rodriguez Betancourt <estebarb@hpe.com> for the
following additional rationale:
In this case you are right, the "height" member is not only not
used, it is in fact not required, and can be safely removed,
without causing security issues.
The code can't read past the end of the 'forward' array because
of the skiplist "level" member, which specifies the maximum height of
the whole skiplist.
The "level" field is updated in insertions and deletions, so that
in insertion the root node will point to the newly created item
(if there isn't a list there yet). At the deletions, if the
deleted item is the last one at that height then the root is
modified to point to NULL at that height, and the whole skiplist
height is decremented.
For the forward_to case:
- If a node is found in a list of level/height N, then it has
height N (that's why it was inserted in that list)
- forward_to travels through nodes in the same level, so it is
safe, as it doesn't go up.
- If a node has height N, then it belongs to all the lists
initiated at root->forward[n, n-1 ,n-2, ..., 0]
- forward_to may go to lower levels, but that is safe, because of
previous point.
So, the protection is given by the "level" field in skiplist root
node, and it is enough to guarantee that the code won't go off
limits at 'forward' array. But yes, the height field is unused,
unneeded, and can be removed safely.
CC: Esteban Rodriguez Betancourt <estebarb@hpe.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
There are a few cases where struct 'conn_key' padding may be unspecified
according to the C standard. Practically, it seems implementations don't
have issues, but it is better to be safe. The code paths modified are not
hot ones. Fix this by doing a memcpy in these cases in lieu of a
structure copy.
Found by inspection.
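A minimal sketch of the difference, assuming a made-up key type
('struct example_key' and 'example_key_copy' are hypothetical and much
simpler than the real struct 'conn_key'):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical key with padding after 'proto' on typical ABIs. */
struct example_key {
    uint8_t proto;
    uint32_t addr;
};

/* Copies every byte of 'src', padding included, so that a later memcmp()
 * or byte-wise hash of 'dst' sees the same bytes as 'src'.  A structure
 * assignment '*dst = *src' may leave the padding bytes unspecified. */
static void
example_key_copy(struct example_key *dst, const struct example_key *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

This matters only when the key is later compared or hashed as raw bytes.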
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
state_s should always be freed before exiting parse_ct_state().
Fixes: b4293a336d8d ("conntrack: Move ct_state parsing to lib/flow.c")
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add a check for NULL in the "next_ct_states" argument passed to the
"ofproto_trace()" function. Under normal circumstances, this is non-NULL. A NULL
"next_ct_states" argument is passed from the "upcall_xlate()" function on
encountering XLATE_RECURSION_TOO_DEEP or XLATE_TOO_MANY_RESUBMITS error.
VMware-BZ: #2282287
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Upstream commit:
commit 41e4e2cd75346667b0c531c07dab05cce5b06d15
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date: Thu Jan 3 09:51:57 2019 -0800
openvswitch: Fix IPv6 later frags parsing
The previous commit fa642f08839b
("openvswitch: Derive IP protocol number for IPv6 later frags")
introduces IP protocol number parsing for IPv6 later frags that can mess
up the network header length calculation logic, i.e. nh_len < 0.
However, the network header length calculation is mainly for deriving
the transport layer header in the key extraction process which the later
fragment does not apply.
Therefore, this commit skips the network header length calculation to
fix the issue.
Reported-by: Chris Mi <chrism@mellanox.com>
Reported-by: Greg Rose <gvrose8192@gmail.com>
Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 9a4ab6da01f7 ("datapath: Derive IP protocol number for IPv6 later frags")
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Upstream commit:
commit fa642f08839bf2ff35b2f6c6a6c062aee8121ba8
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date: Tue Sep 4 15:33:41 2018 -0700
openvswitch: Derive IP protocol number for IPv6 later frags
Currently, OVS only parses the IP protocol number for the first
IPv6 fragment, but sets the IP protocol number for the later fragments
to be NEXTHDR_FRAGMENT. This patch tries to derive the IP protocol
number for the IPv6 later frags so that we can match that.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Upstream commit:
commit 04a4af334b971814eedf4e4a413343ad3287d9a9
Author: Ross Lagerwall <ross.lagerwall@citrix.com>
Date: Mon Jan 14 09:16:56 2019 +0000
openvswitch: Avoid OOB read when parsing flow nlattrs
For nested and variable attributes, the expected length of an attribute
is not known and marked by a negative number. This results in an OOB
read when the expected length is later used to check if the attribute is
all zeros. Fix this by using the actual length of the attribute rather
than the expected length.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
No code changes are necessary to support 4.18.x.
Only one kernel test failed and it is in the process of being fixed.
Updated .travis.yml to include 4.18.x and to use the latest 4.17 version.
Updated test files to test the 4.18 kernel.
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Kmod tests 122 and 123 failed and the kernel reported a "Duplicate key of
type 6" error. Further debugging revealed that nl_attr_find__() should
start looking for OVS_KEY_ATTR_ETHERTYPE from the offset returned by
a previous call to nl_msg_start_nested(). This patch fixes it.
Tests 122 and 123 were skipped on kernel 4.15 and older versions.
Kernel 4.16 and later show this failure.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Test "flow resume with geneve tun_metadata" failed because there is
no controller running to handle the continuation message. A previous
commit deleted the line that starts ovs-ofctl as a controller in
order to avoid a race condition on monitor log. This patch adds
back this line but omits the log file because this test doesn't
depend on the log file.
Fixes: e8833217914f9c071c49 ("system-traffic.at: avoid a race condition on monitor log")
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
CC: David Marchand <david.marchand@redhat.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Currently OVS supports all ARP protocol fields as OXM match fields to
implement the relevant ARP procedures for IPv4. This includes support
for matching, copying and setting ARP fields. In IPv6, ARP has been
replaced by ICMPv6 neighbor discovery (ND) procedures, neighbor
advertisement and neighbor solicitation.
The support for ICMPv6 fields in OVS is not complete for the use cases
equivalent to ARP in IPv4. OVS lacks support for matching, copying and
setting the “ND option type” and “ND reserved” fields. Without these, users
cannot implement all ICMPv6 ND procedures for IPv6 support.
This commit adds additional OXM fields to OVS for ICMPv6 “ND option type”
and ICMPv6 “ND reserved” using the OXM extension mechanism. This allows
support for parsing these fields from an ICMPv6 packet header and extending
the OpenFlow protocol with specifications for these new OXM fields for
matching, copying and setting.
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
`encap = nl_msg_start_nested(key, OVS_KEY_ATTR_ENCAP)` ensures that
key->size >= (encap + NLA_HDRLEN), so the `if` statement is safe.
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11306
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Run ovs-tcpdump without --span, and it throws the following
exception. Define mirror_select_all to avoid the error.
Traceback (most recent call last):
File "/usr/local/bin/ovs-tcpdump", line 488, in <module>
main()
File "/usr/local/bin/ovs-tcpdump", line 454, in main
mirror_select_all)
UnboundLocalError: local variable 'mirror_select_all' referenced before assignment
Fixes: 0475db71c650 ("ovs-tcpdump: Add --span to mirror all ports on bridge.")
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
On a chassis, when ovs-vswitchd crashes for some reason, the BFD status doesn't
get updated in the OVS DB. ovn-controller keeps reading the old BFD status
even though ovs-vswitchd has crashed. This results in the chassisredirect port
claim flapping between the master chassis and the chassis with the next higher
priority if ovs-vswitchd crashes on the master chassis.
All the other chassis notice that the BFD status of the master chassis is
down, and hence the chassis with the next higher priority claims the port.
But according to the master chassis, the BFD status is fine, so it claims the
chassisredirect port back. This results in flapping. The issue gets resolved
when ovs-vswitchd comes back, but until then it leads to a lot of SB DB
transactions and high CPU usage in ovn-controller.
This patch fixes the issue by checking the OF connection status of
ovn-controller with ovs-vswitchd and calculating the active BFD tunnels
only if it is connected.
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
There are two reasons:
1. OVS provides the same default itself.
2. socket-mem is not necessary with dynamic memory model in DPDK 18.11.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
The size passed to inet_ntop() in repl_ftp_v6_addr() is 1 byte too short
to handle the maximum possible V6 address size in the v4-mapped case.
Found by inspection.
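For reference, a minimal sketch of sizing the buffer, assuming the POSIX
constant INET6_ADDRSTRLEN (46) is used; it covers the longest text form,
the 45 characters of "ffff:ffff:ffff:ffff:ffff:ffff:255.255.255.255", plus
the terminating NUL. 'format_v6' is a hypothetical helper, not the code in
repl_ftp_v6_addr():

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Formats 'addr' into 'buf' and returns the string length, or -1 on
 * failure.  The full buffer size is passed to inet_ntop(), not size - 1. */
static int
format_v6(const struct in6_addr *addr, char buf[INET6_ADDRSTRLEN])
{
    if (!inet_ntop(AF_INET6, addr, buf, INET6_ADDRSTRLEN)) {
        return -1;
    }
    return (int) strlen(buf);
}
```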
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When replacing the ipv4 address in repl_ftp_v4_addr(), the remaining size
was incorrectly calculated which could lead to the wrong replacement
adjustment.
This goes unnoticed most of the time, unless you carefully choose your
initial and replacement addresses.
An example of a failing address combination is 10.1.1.200 DNAT'd to 10.1.100.1.
Fix this by doing something similar to V6 and also splicing out common
code for better coverage and maintainability.
A test is updated to exercise different initial and replacement addresses
and another test is added.
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Reported-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Since the 18.05 release, DPDK has used a dynamic memory model in which
hugepages can be allocated on demand. At the same time, the '--socket-mem'
option was redefined as the size of pre-allocated memory, i.e. memory
that is allocated at startup and cannot be freed.
So, DPDK with the new memory model can allocate more hugepage memory
than specified in the '--socket-mem' or '-m' options.
This change adds a new configurable, 'other_config:dpdk-socket-limit',
which can be used to limit the amount of memory DPDK can use.
It uses the new DPDK option '--socket-limit'.
Ex.:
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-limit="1024,1024"
Also, in order to preserve the old behaviour, if '--socket-limit' is not
specified, it defaults to the amount of memory specified by the
'--socket-mem' option, i.e. OVS will not be able to allocate more.
This is needed, for example, to prevent OVS from allocating more memory
than is reserved for it by Nova in OpenStack installations.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Extend ovs-tc translation by allowing non-byte-aligned fields
for set actions. Use new boundary shifts and add set ipv6 traffic
class action offload via pedit.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Add setting of ipv4 dscp and ecn fields in tc offload using pedit.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words. One mask for leading bytes that should not be
updated and another mask for trailing bytes that should not be
updated. The masks are created using bit shifts.
In the case of the mask to omit trailing bytes a right bit shift
is used. Currently the code can produce shifts of 1, 2, 3 or 4
bytes (8, 16, 24 or 32 bits) based on the alignment of the end
of field being set.
However, a shift of 32 bits on a 32bit value is undefined.
As it stands the code relies on the result of UINT32_MAX >> 32
being UINT32_MAX. Or in other words a mask that results in the
pedit action setting all bytes of the word under operation.
This patch adjusts the code to use a shift of 0 for this case,
which gives the same result as the undefined behaviour that was
relied on, and appears logically correct as the desire is for no
trailing bytes (or bits!) to be omitted from the set action.
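A minimal sketch of the idea, assuming the shift is expressed in bytes;
'trailing_mask' and its parameter are hypothetical and do not mirror the
actual helper in the OVS tc offload code:

```c
#include <stdint.h>

/* Returns UINT32_MAX shifted right by 'shift_bytes' bytes, where
 * 'shift_bytes' is in the range 1..4.  A full-word shift (4 bytes) is
 * mapped to a shift of 0, because 'UINT32_MAX >> 32' is undefined. */
static uint32_t
trailing_mask(unsigned int shift_bytes)
{
    unsigned int bits = (shift_bytes % 4) * 8;  /* 4 bytes -> 0 bits. */

    return UINT32_MAX >> bits;
}
```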
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
pedit allows setting entire words with an optional mask and OVS
makes use of such masks to allow setting fields that do not span
entire words.
The struct tc_pedit_key structure, which is part of the kernel
ABI, uses host byte order fields to store the mask and value for
a pedit action, however, these fields contain values in network
byte order.
In order to allow static analysis tools to check for endianness
problems this patch adds a local version of struct tc_pedit_key
which uses big endian types and refactors the relevant code as
appropriate.
In the course of making this change it became apparent that the
calculation of masks was occurring using host byte order although
the values are in network byte order. This patch also fixes that
problem by shifting values in host byte order and then converting
them to network byte order. It is believed this fixes a bug on big
endian systems although we are not in a position to test that.
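A minimal sketch of "shift in host byte order, then convert", assuming a
mask that covers all but the first 'skip_bytes' bytes of a word as laid
out on the wire; 'leading_mask_be' is a hypothetical name, not a function
from this patch:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Returns, in network byte order, a mask that covers every byte of a
 * 32-bit word except the first 'skip_bytes' bytes on the wire.
 * 'skip_bytes' must be in the range 0..3. */
static uint32_t
leading_mask_be(unsigned int skip_bytes)
{
    uint32_t host_mask = UINT32_MAX >> (skip_bytes * 8);  /* Host order. */

    return htonl(host_mask);    /* Convert only after shifting. */
}
```

Shifting a value that is already in network byte order would select
different bytes on little endian and big endian hosts; shifting first and
converting afterwards yields the same in-memory layout on both.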
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
No need to implement a dynamic vector to store arguments.
'svec' perfectly covers all the needed functionality.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
No need to allocate memory and use 'strcat' directly.
'dynamic-string' could do this for us.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>