In the post-install section of the kmod fedora spec file, the variables
storing different parts of the kernel version number were renamed. The
condition check to run ovs-kmod-manage.sh for RHEL 7.2 and 7.4 still uses
the older variables.
Fixes: c3570519ec (rhel: add 4.4 kernel in kmod build with mulitple versions, fedora)
Signed-off-by: Martin Xu <martinxu9.ovs@gmail.com>
CC: Greg Rose <gvrose8192@gmail.com>
CC: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The call to ovsdb_datum_diff() initializes 'new', so it's not necessary to
also do it in ovsdb_datum_apply_diff().
Found by inspection.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
I've noticed recently an annoying quantity of error messages like the
following in builds in various places:
gcc: error: unrecognized command line option ‘-Wunknown-warning-option’
This didn't really make sense because OVS checks whether the compiler
supports warning options before it uses them. Looking closer, the GCC
manual has a note that explains the issue:
When an unrecognized warning option is requested (e.g.,
'-Wunknown-warning'), GCC emits a diagnostic stating that the
option is not recognized. However, if the '-Wno-' form is used,
the behavior is slightly different: no diagnostic is produced for
'-Wno-unknown-warning' unless other diagnostics are being
produced. This allows the use of new '-Wno-' options with old
compilers, but if something goes wrong, the compiler warns that
an unrecognized option is present.
Thus, we can properly check only for the *positive* version of a warning
option, so this commit makes the OVS tests do that.
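The rule above can be sketched as follows. This is only a Python illustration of the idea, not the actual configure macro; the function name option_to_probe is hypothetical.

```python
def option_to_probe(option):
    """Return the warning option a configure-style check should try.

    GCC silently accepts unknown '-Wno-foo' options, so the check has to
    probe the positive form '-Wfoo' instead.
    """
    prefix = "-Wno-"
    if option.startswith(prefix):
        return "-W" + option[len(prefix):]
    return option


print(option_to_probe("-Wno-null-pointer-arithmetic"))
print(option_to_probe("-Wstrict-prototypes"))
```

If the positive form is accepted by the compiler, the '-Wno-' form can then be added to the flags.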
Fixes: a7021b08b0 ("configure: Disable -Wnull-pointer-arithmetic Clang warning.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
This patch fixes the path for ovs-kmod-manage.sh script in the
openvswitch-kmod RPM in fedora spec file. Currently the path prefix is
hard coded to /usr/share. Use %{_datadir} instead.
Fixes: 22c33c3039 (rhel: support kmod build against mulitple kernel versions, fedora)
Signed-off-by: Martin Xu <martinxu9.ovs@gmail.com>
CC: Greg Rose <gvrose8192@gmail.com>
CC: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
When using the kernel datapath, OVS allocates a pool of sockets to handle
netlink events. The number of sockets is: ports * n-handler-threads, where
n-handler-threads is user configurable and defaults to 3/4*number of cores.
This is because vswitchd starts n-handler-threads threads, each one with a
netlink socket for every port of the switch. Each thread then starts
listening for events on its set of sockets with epoll().
On setups with a lot of CPUs and ports, the number of sockets easily hits
the process file descriptor limit, and ovs-vswitchd will exit with -EMFILE.
Change the number of allocated sockets to just one per port by moving
the socket array from a per handler structure to a per datapath one,
and let all the handlers share the same sockets by using EPOLLEXCLUSIVE
epoll flag which avoids duplicate events, on systems that support it.
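The arithmetic behind the change can be sketched as below. This is a back-of-the-envelope Python illustration, not OVS code; the function names, the rounding of the 3/4 default, and the example figure of 2000 ports are assumptions.

```python
def default_handler_threads(n_cores):
    # Default described above: 3/4 of the cores (rounding down is an assumption).
    return max(1, n_cores * 3 // 4)


def sockets_per_handler_model(n_ports, n_handlers):
    # Old model: every handler thread owns one netlink socket per port.
    return n_ports * n_handlers


def sockets_per_datapath_model(n_ports):
    # New model: one socket per port, shared by all handlers.
    return n_ports


handlers = default_handler_threads(56)
print(handlers)
print(sockets_per_handler_model(2000, handlers))
print(sockets_per_datapath_model(2000))
```

With these example numbers the old model needs tens of thousands of file descriptors, while the new one needs one per port.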
The patch was tested on a 56 core machine running Linux 4.18 and latest
Open vSwitch. A bridge was created with 2000+ ports, some of them being
veth interfaces with the peer outside the bridge. The latency of the upcall
is measured by setting a single 'action=controller,local' OpenFlow rule to
force all the packets going to the slow path and then to the local port.
A tool[1] injects some packets to the veth outside the bridge, and measures
the delay until the packet is captured on the local port. The rx timestamp
is taken from the socket ancillary data in the attribute SO_TIMESTAMPNS, to
avoid including the scheduler delay in the measured time.
The first test measures the average latency for an upcall generated from
a single port. To measure it 100k packets, one every msec, are sent to a
single port and the latencies are measured.
The second test is meant to check latency fairness among ports, namely if
latency is equal between ports or if some ports have lower priority.
The previous test is repeated for every port; the average of the average
latencies and the standard deviation between the averages are measured.
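The statistic used in the second test can be sketched like this. The port names and latency samples are made up for illustration, and using the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical per-port latency samples in microseconds.
per_port_latencies_us = {
    "veth0": [6.0, 7.0, 8.0],
    "veth1": [5.0, 6.0, 7.0],
    "veth2": [7.0, 8.0, 9.0],
}

# One average per port, then the average and spread of those averages.
port_averages = [mean(samples) for samples in per_port_latencies_us.values()]
average_of_averages = mean(port_averages)
deviation_between_averages = pstdev(port_averages)

print(average_of_averages, deviation_between_averages)
```

A small deviation between the per-port averages means that no port is served noticeably slower than the others.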
The third test serves to measure responsiveness under load. Heavy traffic
is sent through all ports; latency and packet loss are measured
on a single idle port.
The fourth test is all about fairness. Heavy traffic is injected into all
ports but one; latency and packet loss are measured on the single idle port.
This is the test setup:
# nproc
56
# ovs-vsctl show |grep -c Port
2223
# ovs-ofctl dump-flows ovs_upc_br
cookie=0x0, duration=4.827s, table=0, n_packets=0, n_bytes=0, actions=CONTROLLER:65535,LOCAL
# uname -a
Linux fc28 4.18.7-200.fc28.x86_64 #1 SMP Mon Sep 10 15:44:45 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
And these are the results of the tests:
                                  Stock OVS           Patched
netlink sockets
in use by vswitchd
lsof -p $(pidof ovs-vswitchd) \
    |grep -c GENERIC              91187               2227
Test 1
one port latency
min/avg/max/mdev (us)             2.7/6.6/238.7/1.8   1.6/6.8/160.6/1.7
Test 2
all port
avg latency/mdev (us)             6.51/0.97           6.86/0.17
Test 3
single port latency
under load
avg/mdev (us)                     7.5/5.9             3.8/4.8
packet loss                       95 %                62 %
Test 4
idle port latency
under load
min/avg/max/mdev (us)             0.8/1.5/210.5/0.9   1.0/2.1/344.5/1.2
packet loss                       94 %                4 %
CPU and RAM usage do not seem to be affected; the resource usage of vswitchd
idle with 2000+ ports is unchanged:
# ps u $(pidof ovs-vswitchd)
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
openvsw+ 5430 54.3 0.3 4263964 510968 pts/1 RLl+ 16:20 0:50 ovs-vswitchd
Additionally, to check if vswitchd is thread safe with this patch, the
following test was run for circa 48 hours: on a 56 core machine, a
bridge with kernel datapath is filled with 2200 dummy interfaces and 22
veths, then 22 traffic generators are run in parallel, piping traffic into
the veth peers outside the bridge.
To generate as many upcalls as possible, all packets were forced to the
slow path with an OpenFlow rule like 'action=controller,local' and packet
size was set to 64 bytes. Also, to avoid overflowing the FDB early and
slowing down the upcall processing, generated MAC addresses were restricted
to a small interval. vswitchd ran without problems for 48+ hours,
with all the handler threads at almost 99% CPU usage, as expected.
[1] https://github.com/teknoraver/network-tools/blob/master/weed.c
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
This skips including floatn-common.h if it's not available, since it
was introduced in glibc 2.27 and OVS doesn't actually require
it to work with previous glibc versions.
Fixes: 07aec2ac1 sparse: Support newer GCC/glibc versions.
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Determine installation location of ovs-lib using runtime location
of script, rather than build-time parameters.
Signed-off-by: James Page <james.page@ubuntu.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The test 'truncate and output to gre tunnel' is broken on certain kernels
where OVS kernel module and upstream GRE module can't co-exist. This
patch creates a test that doesn't depend on upstream GRE module but
provides the same testing.
The replaced test is skipped on problematic kernel versions.
On CentOS, this test may fail due to the default iptables rules.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Currently, OVS does not update the flow stats after a packet is
restarted by an NXT_RESUME message. This patch fixes the aforementioned
issue and adds a unit test to prevent regression.
Fixes: 77ab5fd2a9 ("Implement serializing the state of packet traversal in "continuations".")
VMware-BZ: #2198435
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add the CT_LB action to the ovn-trace utility in order to fix the
following ovn-trace error, which occurs if a load balancer rule is added
to the OVN configuration:
ct_next(ct_state=est|trk /* default (use --ct to customize) */) {
*** ct_lb action not implemented;
};
Add the '--lb_dst' option in order to specify the IP address to use
from the VIP pool. If --lb_dst is not provided, the destination IP will be
randomly chosen.
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The commit "6f01617442" added the documentation for the DHCPv4 option
252 in the wrong section. This patch fixes it.
Fixes: 6f01617442 ("ovn: Add DHCP support for option 252.")
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
Since the time that OVS introduced support for IP fragments, the OVS
functions that format flows have used "nw_frag", but the ones that parse
flows have expected "ip_frag". Obviously this is a bug and it's a surprise
that it's gone so long without anyone reporting the problem. This fixes
it and adds a test.
Reported-by: Gurucharan Shetty <guru@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Gurucharan Shetty <guru@ovn.org>
The payload calculation in OvsGetTcpHeader is wrong: it should be
`ntohs(ipHdr->tot_len) - expr`, not `ntohs((ipHdr->tot_len) - expr)`.
We already have a macro for that calculation defined in NetProto.h, so use it.
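The effect of the misplaced parenthesis can be sketched as follows. This Python snippet models ntohs() on a little-endian host with an explicit byte swap; the 40-byte header length and the total length of 60 are hypothetical example values, and swap16 is an illustrative helper, not the NetProto.h macro.

```python
def swap16(value):
    """Byte-swap a 16-bit value, which is what ntohs() does on a little-endian host."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


HEADERS_LEN = 40  # hypothetical: 20-byte IP header plus 20-byte TCP header

# A total length of 60, as a little-endian host sees the raw network-order field.
tot_len_raw = swap16(60)

# Convert first, then subtract: the intended payload length.
correct = swap16(tot_len_raw) - HEADERS_LEN

# Subtract from the raw field, then convert: a meaningless value.
wrong = swap16((tot_len_raw - HEADERS_LEN) & 0xFFFF)

print(correct, wrong)
```

The byte-order conversion has to be applied to the field itself before any arithmetic is done on it.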
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Acked-by: Anand Kumar <kumaranand@vmware.com>
This patch implements limiting conntrack entries
per zone using dpctl commands.
Example:
ovs-appctl dpctl/ct-set-limits default=5 zone=1,limit=2 zone=1,limit=3
ovs-appctl dpctl/ct-del-limits zone=4
ovs-appctl dpctl/ct-get-limits zone=1,2,3
- Also update the netlink-socket.c to support netlink family
'OVS_WIN_NL_CTLIMIT_FAMILY_ID' for conntrack zone limit.
Signed-off-by: Anand Kumar <kumaranand@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
On certain kernel versions, when the openvswitch kernel module creates
a gre0 interface, the kernel's gre module also tries to take control
of the gre0 interface. This causes loading of the openvswitch kernel
module to fail.
This fix renames fallback devices by adding a prefix "ovs-".
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
VMware Issue: #2162866
The present code resets the database when it is in the state
'RPL_S_SCHEMA_REQUESTED' and repopulates the database when it
receives the monitor reply in the state
'RPL_S_MONITOR_REQUESTED'. If, however, it goes to active mode
before it processes the monitor reply, all the data is lost.
This patch alleviates the problem by resetting the database when it
receives the monitor reply (before processing it), so that the
reset and repopulation of the db happen in the same state.
This approach still has a window for data loss if
process_notification() fails for some reason while processing the
monitor reply, or if ovsdb-server crashes during
process_notification().
Reported-by: Han Zhou <zhouhan@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html
Tested-by: aginwala <aginwala@ebay.com>
Acked-by: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The *_index_init_row() function casts a generic ovsdb_idl_row pointer to
a specific type of row pointer. This can cause an increase in required
alignment with some kinds of data on some architectures. GCC complains,
e.g.:
lib/vswitch-idl.c: In function 'ovsrec_flow_sample_collector_set_index_init_row'
lib/vswitch-idl.c:9277:12: warning: cast increases required alignment of target
However, rows are always allocated with malloc(), which returns memory
suitably aligned for any type, so this is a false positive warning and this
commit suppresses it.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <zhouhan@gmail.com>
This patch extends 4886d4d249 (debian, rhel: Ship ovs shared libraries
and header files) to fedora, by packaging the shared libraries in
openvswitch and openvswitch-devel RPM. These files are always packaged in
the RPMs built with rhel6 spec file.
VMware-BZ: #2036847
CC: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Martin Xu <martinxu9.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
With this change, we can remove one case of freeing in the error
path.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
This reverts commit a94f9524db.
This is a revert of a previously reverted commit
2bdd1f3d96.
When we originally added commit 2bdd1f3d96 it was part of an
effort to work around gre module conflicts found while enabling
the ERSPAN feature. Testing at the time did not show any benefit
so in commit a94f9524db we reverted it. However, further
developments showed that in some corner cases it did have a
benefit and it did not do any harm so we reverted the original
revert to restore the code.
Signed-off-by: Greg Rose <roseg@vmware.com>
Tested-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Prior to OVS 2.9 automatic assignment of Rxqs to PMDs
(i.e. CPUs) was done by round-robin.
That was changed in OVS 2.9 to ordering the Rxqs based on
their measured processing cycles. This was to assign the
busiest Rxqs to different PMDs, improving aggregate
throughput.
For the most part the new scheme should be better, but
there could be situations where a user prefers a simple
round-robin scheme because Rxqs from a single port are
more likely to be spread across multiple PMDs, and/or
traffic is very bursty/unpredictable.
Add 'pmd-rxq-assign' config to allow a user to select
round-robin based assignment.
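The difference between the two schemes can be sketched as below. This is a simplified Python illustration, not the dpif-netdev implementation: the queue names and cycle counts are made up, and the real cycles-based scheme may walk the PMDs in a different order.

```python
def assign_roundrobin(rxqs, n_pmds):
    # Assign Rxqs to PMDs in the order they are listed (port/queue order).
    return {name: i % n_pmds for i, (name, _cycles) in enumerate(rxqs)}


def assign_by_cycles(rxqs, n_pmds):
    # Sort by measured processing cycles, busiest first, then distribute.
    ordered = sorted(rxqs, key=lambda rxq: rxq[1], reverse=True)
    return {name: i % n_pmds for i, (name, _cycles) in enumerate(ordered)}


def pmd_load(rxqs, assignment, n_pmds):
    # Sum the cycles of the Rxqs assigned to each PMD.
    load = [0] * n_pmds
    for name, cycles in rxqs:
        load[assignment[name]] += cycles
    return load


# Hypothetical (name, measured cycles) pairs for two ports with two queues each.
rxqs = [("p0q0", 90), ("p0q1", 10), ("p1q0", 80), ("p1q1", 20)]

print(pmd_load(rxqs, assign_roundrobin(rxqs, 2), 2))
print(pmd_load(rxqs, assign_by_cycles(rxqs, 2), 2))
```

In this example the cycles-based scheme balances the load better, while round-robin keeps the two queues of each port on different PMDs without relying on past measurements.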
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Currently, the default flow (actions=NORMAL) is present in the flow table after
the flow table is restored, even if the default flow had been removed.
This commit changes the behaviour of the "ovs-save save-flows" command to use
"replace-flows" instead of "add-flows" to restore the flows. This is needed so
that the restored flow table is always the same as it was before the restore.
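The difference between the two restore strategies can be sketched with sets. This is a simplified Python model in which flows are plain strings; real flow tables compare flows by match and priority, and the saved flow shown is a made-up example.

```python
def add_flows(table, saved):
    # Adding keeps whatever is already in the table.
    return table | saved


def replace_flows(table, saved):
    # Replacing makes the table equal to the saved set.
    return set(saved)


DEFAULT_FLOW = "actions=NORMAL"

# The user removed the default flow before saving; this is what was saved.
saved = {"priority=10,ip,actions=output:2"}

# After a restart the bridge comes up with the default flow again.
table_after_restart = {DEFAULT_FLOW}

print(sorted(add_flows(table_after_restart, saved)))
print(sorted(replace_flows(table_after_restart, saved)))
```

Only the replace variant gets rid of the default flow that reappeared after the restart.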
Reported-by: Flavio Leitner <fbl@sysclose.org>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1626096
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Added new types to the flow dump filter, and allowed multiple filter
types to be passed at once, as a comma separated list. The new types
added are:
* tc - specifies flows handled by the tc dp
* non-offloaded - specifies flows not offloaded to the HW
* all - specifies flows of all types
The type list is now fully parsed by the dpctl, and a new struct was
added to dpif which enables dpctl to define which types of dumps to
provide, rather than passing the type string and having dpif parse it.
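Parsing such a comma separated type list can be sketched as follows. This is a hypothetical Python illustration, not the dpctl code: the function name is made up, the pre-existing type names "ovs" and "offloaded" are assumed, and expanding "all" into every other name is a simplification.

```python
KNOWN_TYPES = ("ovs", "tc", "offloaded", "non-offloaded", "all")


def parse_dump_types(type_list):
    """Turn a string such as 'tc,non-offloaded' into a set of type names."""
    selected = set()
    for name in type_list.split(","):
        name = name.strip()
        if name not in KNOWN_TYPES:
            raise ValueError("unknown flow dump type: %r" % name)
        selected.add(name)
    if "all" in selected:
        # 'all' selects every other type.
        return set(KNOWN_TYPES) - {"all"}
    return selected


print(sorted(parse_dump_types("tc,non-offloaded")))
print(sorted(parse_dump_types("all")))
```

Parsing once in the caller means the lower layer only receives an already validated selection instead of a raw string.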
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
In a previous commit, the dpif_flow struct was expanded, with the
'offloaded' field being moved into a new struct which also includes a
field for the dp layer the flow is handled on. The initialization of
these fields was only done in dpif-netlink.
This completes that commit, by initializing the fields in dpif-netdev
as well. As the 'offloaded' field was previously ignored by
dpif-netdev, the attrs are initialized to the default values of
'false' for the offloaded state, and 'ovs' for the dp layer.
Fixes: d63ca5329f ("dpctl: Properly reflect a rule's offloaded to HW state")
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
The tunnel_cfg had the gro_receive and gro_complete fields uninitialized
in function lisp_open(). This caused an uninitialized memory read.
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Previously the key was used to check the presence of vlan id and
prio fields instead of using the mask. Additionally the vlan id
field was considered to be present if only the prio field was set,
and vice versa. For example, setting the following:
ovs-ofctl -OOpenFlow13,OpenFlow15 add-flow br0 \
priority=10,cookie=1,table=0,ip,dl_vlan_pcp=2,actions=output:2
Resulted in the following (instead of wildcarding vlan_id, the filter matches vlan_id 0):
filter protocol 802.1Q pref 1 flower chain 0
filter protocol 802.1Q pref 1 flower chain 0 handle 0x1
vlan_id 0
vlan_prio 2
vlan_ethtype ip
eth_type ipv4
ip_flags nofrag
in_hw
action order 1: mirred (Egress Redirect to device eth1) stolen
index 2 ref 1 bind 1 installed 5 sec used 5 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
cookie 47040ae7a94fff6afd7ed8aa04b11ba4
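The mask-based presence check can be sketched as below. This is a Python illustration using the 802.1Q TCI layout (12-bit VLAN ID, 3-bit priority); the function name is hypothetical and treating any non-zero mask as "present" is a simplification.

```python
VLAN_VID_MASK = 0x0FFF   # low 12 bits of the TCI: VLAN ID
VLAN_PCP_MASK = 0xE000   # high 3 bits of the TCI: priority
VLAN_PCP_SHIFT = 13


def vlan_match_fields(tci_key, tci_mask):
    """Return only the VLAN fields that the mask actually asks to match."""
    fields = {}
    if tci_mask & VLAN_VID_MASK:
        fields["vlan_id"] = tci_key & VLAN_VID_MASK
    if tci_mask & VLAN_PCP_MASK:
        fields["vlan_prio"] = (tci_key & VLAN_PCP_MASK) >> VLAN_PCP_SHIFT
    return fields


# dl_vlan_pcp=2 with the VLAN ID wildcarded: only the priority bits are masked.
print(vlan_match_fields(2 << VLAN_PCP_SHIFT, VLAN_PCP_MASK))

# An exact match on VLAN ID 0 is still expressible, because the mask says so.
print(vlan_match_fields(0, VLAN_VID_MASK))
```

Checking the mask rather than the key distinguishes "match VLAN ID 0" from "do not care about the VLAN ID".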
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
This fixes leaks on the error path in parse_intel_port_custom_property().
ofp_print_ofpst_port_reply() failed to free the custom_stats in decoded
port stats. This fixes the problem.
parse_intel_port_custom_property() had a memory leak if there was more than
one custom stats property (which there shouldn't be, but still). This
fixes the problem.
There was a function netdev_free_custom_stats_counters() meant for freeing
custom_stats, but hardly anything used it. This adopts it consistently.
It wasn't safe to free the custom stats if ofputil_decode_port_stats()
returned an error. Using netdev_free_custom_stats_counters() avoids this
pitfall.
Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9972
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
This patch increases coverage of `lib/flow.c` from 39% to 43%, covers three
additional files and increases coverage in five other source/header files.
Signed-off-by: Bhargava Shastry <bshastry at sect.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch makes sure that VXLAN tunnels with and without the group
based policy (GBP) option enabled cannot coexist on the same
destination UDP port.
In theory, VXLAN tunnels with and without GBP enabled can be
multiplexed on the same UDP port as long as different VNIs are
used. However, OVS currently does not support this, hence this patch
checks for this condition.
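The condition being checked can be sketched as follows. This is a hypothetical Python illustration, not the OVS tunnel configuration code; the dictionary layout and function name are assumptions.

```python
def gbp_conflicts(existing_tunnels, new_tunnel):
    """Return True if a tunnel on the same UDP port has a different GBP setting."""
    for tunnel in existing_tunnels:
        same_port = tunnel["dst_port"] == new_tunnel["dst_port"]
        if same_port and tunnel["gbp"] != new_tunnel["gbp"]:
            return True
    return False


existing = [{"name": "vxlan0", "dst_port": 4789, "gbp": False}]

print(gbp_conflicts(existing, {"name": "vxlan1", "dst_port": 4789, "gbp": True}))
print(gbp_conflicts(existing, {"name": "vxlan2", "dst_port": 4790, "gbp": True}))
print(gbp_conflicts(existing, {"name": "vxlan3", "dst_port": 4789, "gbp": False}))
```

A new tunnel is rejected only when it shares the destination port with an existing tunnel and differs in the GBP setting.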
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch invokes parse_tcp_flags() in flow_extract_target.c after doing a
basic sanity check (that the packet contains at least an ETH header).
A cursory evaluation shows that the patch improves line coverage of
lib/flow.c from 37% to 39%.
Signed-off-by: Bhargava Shastry <bshastry at sect.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Since OpenSSL upstream commit 201b305a2409
("apps/dsaparam.c generates code that is intended to be pasted or included into
an existing source file: the function is static, and the code doesn't include
dsa.h. Match the generated C source style of dsaparam.") "openssl dhparam -C"
generates the get_dh functions as static, but the functions are used inside
stream-ssl.c and so the static keyword cannot be used.
This commit removes the static keyword from the get_dh functions during
dhparams.c file generation, thereby restoring the previous behaviour.
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>