mirror of https://github.com/openvswitch/ovs synced 2025-08-31 14:25:26 +00:00
Commit Graph

15394 Commits

Author SHA1 Message Date
Ben Pfaff
fca1b42abe ovsdb-idl: Fix assertion failure on error path parsing server reply.
If the database server sent an error reply to a monitor_cond request, and
the error was not a JSON string, then passing the error to json_string()
caused an assertion failure.
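
The failure mode described above can be sketched as follows. This is a minimal illustration, not the OVS code: `struct demo_json`, `enum demo_json_type`, and `demo_error_text()` are hypothetical stand-ins. The idea is to check the value's type before treating it as a string, rather than asserting.

```c
#include <stddef.h>

/* Hypothetical stand-in for a JSON value; not the OVS 'struct json'. */
enum demo_json_type { DEMO_JSON_NULL, DEMO_JSON_STRING, DEMO_JSON_OBJECT };

struct demo_json {
    enum demo_json_type type;
    const char *string;     /* Valid only when type is DEMO_JSON_STRING. */
};

/* Returns the error text if 'error' is a JSON string, otherwise a generic
 * fallback, so a malformed server reply cannot trip an assertion. */
static const char *
demo_error_text(const struct demo_json *error)
{
    if (error && error->type == DEMO_JSON_STRING && error->string) {
        return error->string;
    }
    return "<non-string error>";
}
```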

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
2017-12-11 14:31:03 -08:00
Ben Pfaff
1ac62a0e09 ovsdb-idl: Fix indentation in a couple of places.
White space changes only.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
2017-12-11 14:30:52 -08:00
Ben Pfaff
f5a12b82b8 ovsdb-idl: Improve comments.
This change documents the IDL state machine, adds other comments,
and fixes a spelling error in a comment.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
2017-12-11 14:30:34 -08:00
Darrell Ball
a81da08057 conntrack: Fix icmp error address sanity check.
An address sanity check is done on icmp error packets to
check that the icmp error payload makes sense w.r.t. the
packet itself.

The sanity check was partially incorrect since it tried
to verify the source address of the error packet against the
original destination, which does not make sense since the error
can be generated by any intermediate node.

Reported-by: wangzhike <wangzhike@jd.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341609.html
Fixes: a489b1685 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: wangzhike <wangzhike@jd.com>
Co-authored-by: wangzhike <wangzhike@jd.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-12-11 14:18:18 -08:00
Darrell Ball
3a2a425b4c conntrack: Disable algs by default.
Presently, alg processing is enabled by default to better exercise code.
This is similar to kernels before 4.7 as well.  The recommended default
behavior in the newer kernels is to only process algs if a helper is
supplied in a conntrack rule.  The behavior is changed to match the
later kernels.

A test is extended to check that the control connection is still
created in such a case.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
2017-12-11 14:14:24 -08:00
Darrell Ball
bd7d93f8b4 conntrack: Allow specified alg port numbers.
Algs can use variable control port numbers for servers.
The main use case is a kind of feeble security measure; some think
that it obscures the alg traffic.
It is really not very effective, but the kernel has this
capability. This patch mimics the capability.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
2017-12-11 14:14:11 -08:00
Darrell Ball
94e711433c conntrack: Refactor algs.
Upcoming requirements for new algs make it desirable to split out
alg helpers more cleanly.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
2017-12-11 14:13:58 -08:00
Ben Pfaff
f0aa3801f1 dpif-netdev: Avoid "sparse" warning.
"sparse" warns when odp_port_t is used directly in an inequality
comparison.  This avoids the warning.
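
A rough sketch of the usual way to avoid this kind of warning: convert the annotated type to a plain integer before comparing. The names `demo_odp_port_t`, `demo_odp_to_u32()`, and `demo_port_cmp()` are hypothetical; the annotation that sparse checks is omitted here because an ordinary compiler ignores it.

```c
#include <stdint.h>

/* Hypothetical stand-in for a port-number type that a static checker
 * treats as opaque (no ordering comparisons allowed on it directly). */
typedef uint32_t demo_odp_port_t;

/* Explicit conversion to a plain integer that may be compared freely. */
static inline uint32_t
demo_odp_to_u32(demo_odp_port_t port)
{
    return port;
}

/* Three-way comparison done on the converted values, never on the
 * annotated type itself. */
static int
demo_port_cmp(demo_odp_port_t a, demo_odp_port_t b)
{
    uint32_t ua = demo_odp_to_u32(a);
    uint32_t ub = demo_odp_to_u32(b);

    return ua < ub ? -1 : ua > ub;
}
```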

CC: Kevin Traynor <ktraynor@redhat.com>
Fixes: a130f1a89b ("dpif-netdev: Add port/queue tiebreaker to rxq_cycle_sort.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
2017-12-11 13:42:53 -08:00
Ben Pfaff
836476ce24 ofproto: Keep inserting buckets into a group from changing group type.
The "insert buckets" and "delete buckets" operations on a group should not
change the group's type or properties, but the implementation did this by
mistake.  This fixes the problem.

Reported-by: shivani dommeti <shivani.dommeti@gmail.com>
Tested-by: shivani dommeti <shivani.dommeti@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-December/045830.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-12-11 11:44:12 -08:00
Aaron Conole
c66caed404 daemon-unix: include missing help information
These options have existed for a while, but were not expressed in the
help information.  Inform the user that these options exist, and give
some basic help.

Reported-by: Saravanan KR <skramaja@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Markos Chandras <mchandras@suse.de>
2017-12-11 10:59:25 -08:00
Ben Pfaff
8c087cecff Merge branch 'dpdk_merge' of https://github.com/istokes/ovs into HEAD 2017-12-11 10:24:09 -08:00
Anand Kumar
433695320a datapath-windows: Add support for deleting conntrack entry by 5-tuple.
To delete a conntrack entry specified by 5-tuple pass an additional
conntrack 5-tuple parameter to flush-conntrack.

Signed-off-by: Anand Kumar <kumaranand@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
2017-12-11 18:15:45 +02:00
Justin Pettit
55616d9a61 doc: Correct path of kernel system tests results directory.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2017-12-08 15:30:16 -08:00
Ben Pfaff
5cb92182ab ofproto-dpif-xlate: Change assertion to log message.
Until now, compose_output_action__() has asserted that a packet output to
a patch port is not to be truncated.  This commit changes this to an error
that will be included in trace output, for two reasons.  First, this sounds
like only a minor problem to me which doesn't warrant killing the process.
Second, it will be easier to track down the actual problem (if any) if we
can get a trace instead of a segfault.

Reported-by: Kevin Lin <kevin@kelda.io>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-December/045832.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-12-08 15:20:56 -08:00
Ben Pfaff
8b496c72c7 ofproto-dpif-xlate: Correctly decide whether truncating.
xlate_output_action() must tell some of the functions it calls whether the
packet is being truncated.  Until now, it has inferred that based on
whether its max_len argument is nonzero.

Unfortunately, max_len conflates two different purposes.  Historically it
was used only to limit the number of bytes of packets sent to an OpenFlow
controller in packet_in messages.  When packet truncation was introduced,
it was then also used to specify the truncation length.  This meant that,
for example, when xlate_output_reg_action() called into
xlate_output_action() passing along for max_len an OpenFlow controller byte
limit (which ovs-ofctl by default sets to 65535), xlate_output_action()
interpreted that as a truncation request and told the functions it called
that the packet was being truncated, which in the worst case led to
assertion failures.

This commit disentangles these two meanings of max_len, separating them into
two separate parameters, and updates the callers.
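
The conflation can be illustrated with a small sketch. Everything here is hypothetical (`struct demo_output_plan`, `demo_plan_conflated()`, `demo_plan_separate()`); it only models the reasoning above, not the real xlate functions or their signatures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what an output step was told to do. */
struct demo_output_plan {
    uint16_t controller_len;    /* Byte limit for packet_in messages. */
    bool truncate;              /* Is the packet being truncated? */
};

/* Old style: one argument serves both purposes, so any nonzero
 * controller byte limit is mistaken for a truncation request. */
static struct demo_output_plan
demo_plan_conflated(uint16_t max_len)
{
    struct demo_output_plan plan = { max_len, max_len != 0 };
    return plan;
}

/* New style: the caller states each intent separately. */
static struct demo_output_plan
demo_plan_separate(uint16_t controller_len, bool truncate)
{
    struct demo_output_plan plan = { controller_len, truncate };
    return plan;
}
```

With a controller limit of 65535 and no truncation intended, the conflated variant reports truncation while the separate variant does not.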

Reported-by: Kevin Lin <kevin@kelda.io>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-December/045841.html
Tested-by: Kevin Lin <kevin@kelda.io>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-12-08 15:15:31 -08:00
Ben Pfaff
391ce8049c ovsdb-server: Document monitor_cond_change behavior for unmentioned tables.
It seems best to be explicit about this.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-08 14:53:48 -08:00
Ben Pfaff
a01ca4ee69 jsonrpc-server: Report monitor session ID properly in error message.
The error message in question is about the monitor session ID but it
actually reports the JSON-RPC request ID instead, which is surprising.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-08 14:53:21 -08:00
Ben Pfaff
f0e65e6702 tests: Always ignore "Broken pipe" and "Connection reset" log messages.
Until now, the ovn-controller-vtep, ovn-nbctl, and ovn-sbctl tests have
ignored "Broken pipe" and "Connection reset" messages.  The same rationale
that applies to them also applies to ovs-vsctl and other utilities.  It
seems easier to just always ignore them.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-08 14:53:15 -08:00
Ben Pfaff
14347f3e82 stream-unix: Give accepted sockets distinct names for log messages.
At least on Linux, when process A connects to process B over a Unix
domain socket, unless process A bound its socket to a name before
it made the connection, process B gets an empty peer name.  Until
now, OVS has just reported the name of the connection as "unix".
This is not meaningful, of course.  I do not know of a good general
solution to this problem, but this commit attempts a step in the
right direction by at least giving each connection of this kind a
number: "unix#1", "unix#2", and so on.  That way, in log messages
one can at least see which messages are related to a particular
connection.
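
The numbering scheme might look roughly like the sketch below. The helper `demo_name_accepted()` and its counter are hypothetical, the counter is not thread-safe, and the "unix:" prefix used for named peers is an assumption made for this example.

```c
#include <stdio.h>

/* Next number to hand out to an unnamed peer (not thread-safe). */
static unsigned int demo_next_id = 1;

/* Writes a log-friendly connection name into 'buf'.  A peer that bound its
 * socket keeps its path; an unnamed peer gets "unix#1", "unix#2", ... */
static void
demo_name_accepted(const char *peer_name, char *buf, size_t size)
{
    if (peer_name && peer_name[0]) {
        snprintf(buf, size, "unix:%s", peer_name);
    } else {
        snprintf(buf, size, "unix#%u", demo_next_id++);
    }
}
```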

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-08 14:21:19 -08:00
Ben Pfaff
1cae21eece test-ovsdb: Triggers should wake up other triggers immediately.
When a trigger executes, it can make changes to the database that fulfill
the conditions for some other trigger to execute.  ovsdb-server implements
this properly, but the code in test-ovsdb for testing triggers outside
ovsdb-server did not.  This fixes the problem.

Found by inspection.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-08 14:13:49 -08:00
Ben Pfaff
9bc3966ce2 test-ovsdb: Simplify code in do_trigger().
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-08 14:13:42 -08:00
Michal Weglicki
3eb8d4fa0d netdev-dpdk: extend netdev_dpdk_get_status to include if_type and if_descr
This commit extends netdev_dpdk_get_status API to include additional
driver-related information: if_type and if_descr.

v2->v3: Code rebase.
v3->v4: Minor comments applied.
v5->v6: Adds DPDK port specific description in documentation.

Co-authored-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Przemyslaw Szczerbik <przemyslawx.szczerbik@intel.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Ilya Maximets
d9d73f84ea Revert "dpif_netdev: Refactor dp_netdev_pmd_thread structure."
This reverts commit a807c15796.

Padding and aligning of dp_netdev_pmd_thread structure members
is useless, broken in several ways, and only greatly degrades
maintainability and extensibility of the structure.

Issues:

    1. It's not working because all the instances of struct
       dp_netdev_pmd_thread allocated only by usual malloc. All the
       memory is not aligned to cachelines -> structure almost never
       starts at aligned memory address. This means that any further
       paddings and alignments inside the structure are completely
       useless. For example:

       Breakpoint 1, pmd_thread_main
       (gdb) p pmd
       $49 = (struct dp_netdev_pmd_thread *) 0x1b1af20
       (gdb) p &pmd->cacheline1
       $51 = (OVS_CACHE_LINE_MARKER *) 0x1b1af60
       (gdb) p &pmd->cacheline0
       $52 = (OVS_CACHE_LINE_MARKER *) 0x1b1af20
       (gdb) p &pmd->flow_cache
       $53 = (struct emc_cache *) 0x1b1afe0

       All of the above addresses shifted from cacheline start by 32B.

       Can we fix it properly? NO.
       OVS currently doesn't have appropriate API to allocate aligned
       memory. The best candidate is 'xmalloc_cacheline()' but it
       clearly states that "The memory returned will not be at the
       start of a cache line, though, so don't assume such alignment".
       And also, this function will never return aligned memory on
       Windows or MacOS.

    2. CACHE_LINE_SIZE is not constant. Different architectures have
       different cache line sizes, but the code assumes that
       CACHE_LINE_SIZE is always equal to 64 bytes. All the structure
       members are grouped by 64 bytes and padded to CACHE_LINE_SIZE.
       This leads to huge holes in structures if CACHE_LINE_SIZE
       differs from 64. This is opposite to portability. If I want
       good performance of cmap I need to have CACHE_LINE_SIZE equal
       to the real cache line size, but I will have huge holes in the
       structures. If you take a look at struct rte_mbuf from DPDK
       you'll see that it uses 2 defines: RTE_CACHE_LINE_SIZE and
       RTE_CACHE_LINE_MIN_SIZE to avoid holes in mbuf structure.

    3. Sizes of system/libc defined types are not constant for all the
       systems. For example, sizeof(pthread_mutex_t) == 48 on my
       ARMv8 machine, but only 40 on x86. The difference could be
       much bigger on Windows or MacOS systems. But the code assumes
       that sizeof(struct ovs_mutex) is always 48 bytes. This may lead
       to broken alignment/big holes in case of padding/wrong comments
       about amount of free pad bytes.

    4. Sizes of many fields in the structure depend on defines like
       DP_N_STATS, PMD_N_CYCLES, EM_FLOW_HASH_ENTRIES and so on.
       Any change in these defines or any change in any structure
       contained by thread should lead to the not so simple
       refactoring of the whole dp_netdev_pmd_thread structure. This
       greatly reduces maintainability and complicates development of
       new features.

    5. There is no reason to align flow_cache member because it's
       too big and we usually access random entries by single thread
       only.
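
Issue 1 can be checked with a short sketch. The helpers below are hypothetical and assume a 64-byte cache line. `aligned_alloc()` is standard C11 and may be unavailable on some platforms; it is shown only to contrast with plain malloc, and is not an API that OVS provided at the time.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_CACHE_LINE_SIZE 64     /* Assumed; real sizes vary by CPU. */

/* Returns how many bytes 'p' lies past the start of its cache line. */
static size_t
demo_cacheline_offset(const void *p)
{
    return (size_t) ((uintptr_t) p % DEMO_CACHE_LINE_SIZE);
}

/* Allocates at least 'size' bytes starting on a cache-line boundary.
 * aligned_alloc() wants a size that is a multiple of the alignment, so
 * round up.  The caller releases the block with free(). */
static void *
demo_alloc_cacheline_aligned(size_t size)
{
    size_t rounded = (size + DEMO_CACHE_LINE_SIZE - 1)
                     / DEMO_CACHE_LINE_SIZE * DEMO_CACHE_LINE_SIZE;

    if (!rounded) {
        rounded = DEMO_CACHE_LINE_SIZE;
    }
    return aligned_alloc(DEMO_CACHE_LINE_SIZE, rounded);
}
```

Applied to the addresses in the gdb output above (0x1b1af20, 0x1b1af60, 0x1b1afe0), `demo_cacheline_offset()` gives 32 in each case, which matches the 32B shift noted there.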

So, the padding/alignment only creates some visibility of performance
optimization but does nothing useful in reality. It only complicates
maintenance and adds huge holes for non-x86 architectures and non-Linux
systems. Performance improvement stated in the original commit message
should be random and not valuable. I see no performance difference.

Most of the above issues are also true for some other padded/aligned
structures like 'struct netdev_dpdk'. They will be treated separately.

CC: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
CC: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Mark Kavanagh
a14d1cc8a7 netdev-dpdk: vHost IOMMU support
DPDK v17.11 introduces support for the vHost IOMMU feature.
This is a security feature, which restricts the vhost memory
that a virtio device may access.

This feature also enables the vhost REPLY_ACK protocol, the
implementation of which is known to work in newer versions of
QEMU (i.e. v2.10.0), but is buggy in older versions (v2.7.0 -
v2.9.0, inclusive). As such, the feature is disabled by default
(and should remain so) for the aforementioned older QEMU
versions. Starting with QEMU v2.9.1, vhost-iommu-support can
safely be enabled, even without having an IOMMU device, with
no performance penalty.

This patch adds a new global config option, vhost-iommu-support,
that controls enablement of the vhost IOMMU feature:

    ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true

This value defaults to false; to enable IOMMU support, this field
should be set to true when setting other global parameters on init
(such as "dpdk-socket-mem", for example). Changing the value at
runtime is not supported, and requires restarting the vswitch daemon.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Mark Kavanagh
5e925ccc2a netdev-dpdk: DPDK v17.11 upgrade
This commit adds support for DPDK v17.11:
- minor updates to accommodate DPDK API changes
- update references to DPDK version in Documentation
- update DPDK version in travis' linux-build script
- document DPDK v17.11 virtio driver bug

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
Tested-by: Jan Scheurich <jan.scheurich@ericsson.com>
Tested-by: Guoshuai Li <ligs@dtdream.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Yifeng Sun
d1ce9c2033 dpif-netdev: Fix memory leak
Valgrind complains in test 1019 (dpctl - add-if set-if del-if):

4,850,896 (4,850,240 direct, 656 indirect) bytes in 1 blocks are
definitely lost in loss record 364 of 364
   by 0x517062: xcalloc (util.c:103)
   by 0x46CBBC: dp_netdev_set_nonpmd (dpif-netdev.c:4498)
   by 0x46CBBC: create_dp_netdev (dpif-netdev.c:1299)
   by 0x46CBBC: dpif_netdev_open (dpif-netdev.c:1337)
   by 0x472CB0: do_open (dpif.c:350)
   by 0x472E6F: dpif_create (dpif.c:404)
   by 0x472E6F: dpif_create_and_open (dpif.c:417)
   by 0x430EBC: open_dpif_backer (ofproto-dpif.c:727)
   by 0x430EBC: construct (ofproto-dpif.c:1411)
   by 0x41B714: ofproto_create (ofproto.c:539)
   by 0x40C84E: bridge_reconfigure (bridge.c:647)
   by 0x4104C5: bridge_run (bridge.c:2998)
   by 0x406FA4: main (ovs-vswitchd.c:119)

The reference count wasn't released at this earlier return.

This fix passes the test 'make check'.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Kevin Traynor
8368866efb dpif-netdev: Calculate rxq cycles prior to compare_rxq_cycles calls.
compare_rxq_cycles sums the latest cycles from each queue for
comparison with each other. While each comparison correctly
gets the latest cycles, the cycles could change between calls
to compare_rxq_cycles. In order to use consistent values through
each call of compare_rxq_cycles, sum the cycles before qsort is
called.
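
A minimal sketch of the idea, with hypothetical types and names (`struct demo_rxq`, `demo_sort_rxqs()`): each queue's cycles are summed once into a snapshot field before sorting, so the comparator sees stable values for the whole sort. Sorting the busiest queue first is an assumption of this example.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_N_INTERVALS 3

/* Hypothetical rx queue record. */
struct demo_rxq {
    int port_no;
    int queue_id;
    uint64_t intervals[DEMO_N_INTERVALS];   /* May change while running. */
    uint64_t total_cycles;                  /* Snapshot used for sorting. */
};

/* Compares the precomputed snapshots only; busiest queue first. */
static int
demo_compare_rxq_cycles(const void *a_, const void *b_)
{
    const struct demo_rxq *a = a_;
    const struct demo_rxq *b = b_;

    if (a->total_cycles != b->total_cycles) {
        return a->total_cycles > b->total_cycles ? -1 : 1;
    }
    return 0;
}

/* Sums every queue's cycles once, then sorts on the stored sums. */
static void
demo_sort_rxqs(struct demo_rxq *rxqs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t sum = 0;

        for (size_t j = 0; j < DEMO_N_INTERVALS; j++) {
            sum += rxqs[i].intervals[j];
        }
        rxqs[i].total_cycles = sum;
    }
    qsort(rxqs, n, sizeof *rxqs, demo_compare_rxq_cycles);
}
```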

Requested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Kevin Traynor
cc131ac184 dpif-netdev: Rename rxq_cycle_sort to compare_rxq_cycles.
This function is used for comparison between queues
as part of the sort. It does not do the sort itself.
As such, give it a more appropriate name.

Suggested-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Billy O'Mahony
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Kevin Traynor
a130f1a89b dpif-netdev: Add port/queue tiebreaker to rxq_cycle_sort.
rxq_cycle_sort is used to compare rx queues by their measured number
of cycles. In the event that they are equal, 0 could be returned.
However, it is observed that returning 0 results in a different sort
order on Windows/Linux. This is ok in practice but it causes a unit
test failure for
"1007: PMD - pmd-cpu-mask/distribution of rx queues" when running
on different OS's.

In order to have a consistent sort result across multiple OS's,
introduce a tiebreaker of port/queue.
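
Because qsort() makes no promise about the relative order of elements that compare equal, a comparator that never returns 0 for distinct queues gives the same result on every platform. A sketch with hypothetical names (`struct demo_tb_rxq`, `demo_tb_compare()`); the ascending port and queue order used for the tiebreak is an assumption of this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical rx queue record. */
struct demo_tb_rxq {
    uint64_t cycles;
    int port_no;
    int queue_id;
};

/* Busiest queue first; equal cycle counts fall back to port number and
 * then queue id, so distinct queues never compare equal. */
static int
demo_tb_compare(const void *a_, const void *b_)
{
    const struct demo_tb_rxq *a = a_;
    const struct demo_tb_rxq *b = b_;

    if (a->cycles != b->cycles) {
        return a->cycles > b->cycles ? -1 : 1;
    }
    if (a->port_no != b->port_no) {
        return a->port_no > b->port_no ? 1 : -1;
    }
    if (a->queue_id != b->queue_id) {
        return a->queue_id > b->queue_id ? 1 : -1;
    }
    return 0;
}

/* Sorts 'n' queues with the deterministic comparator above. */
static void
demo_tb_sort(struct demo_tb_rxq *rxqs, size_t n)
{
    qsort(rxqs, n, sizeof *rxqs, demo_tb_compare);
}
```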

Fixes: 655856ef39 ("dpif-netdev: Change rxq_scheduling to use rxq processing cycles.")
Reported-by: Alin Gabriel Serdean <aserdean@ovn.org>
Tested-by: Alin Gabriel Serdean <aserdean@ovn.org>
Co-authored-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Kevin Traynor
255b7bda98 netdev-dpdk: Remove unneeded call to rte_eth_dev_count().
The call to rte_eth_dev_count() was added as a workaround
for rte_eth_dev_get_port_by_name() not handling cases
when there were no DPDK ports.

In versions of DPDK >= 17.02 rte_eth_dev_get_port_by_name()
does handle this case (DPDK commit f9ae888b1e19).
rte_eth_dev_count() is no longer needed so remove it.

Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Ilya Maximets
b2e72a9c9d netdev-dpdk: Add comment about variables naming convention.
It'll be nice to document the current naming convention for variables of
the following types used in netdev-dpdk:

	* netdev
	* netdev_dpdk
	* netdev_rxq
	* netdev_rxq_dpdk

to be sure that we will not return to the chaos that existed before
commit d46285a220 ("netdev-dpdk: Consistent variable naming.").

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Ilya Maximets
3d0d5ab153 netdev-dpdk: Fix variables naming in set_admin_state function.
Function 'netdev_dpdk_set_admin_state()' was missed while fixing
variables naming according to the following convention:

    'struct netdev':'netdev'
    'struct netdev_dpdk':'dev'
    'struct netdev_rxq':'rxq'
    'struct netdev_rxq_dpdk':'rx'

Fixes: d46285a220 ("netdev-dpdk: Consistent variable naming.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2017-12-08 21:42:54 +00:00
Ben Pfaff
65d9759c4f ovsdb-data: Add OVS_WARN_UNUSED_RESULT annotations to function definitions.
The function prototypes in ovsdb-data.h already have these, but it seems
more complete to have the annotation on the definitions too.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-12-08 13:39:29 -08:00
Ben Pfaff
ed4ef16a22 AUTHORS: Update email address for Thadeu Lima de Souza Cascardo.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@cascardo.eti.br>
2017-12-08 10:30:34 -08:00
Justin Pettit
159cc1f4e5 datapath-windows: Correct endianness for deleting zone.
The zone Netlink attribute is supposed to be in network-byte order, but
the Windows code for deleting conntrack entries was treating it as
host-byte order.
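
The class of bug can be sketched as follows. `demo_zone_from_attr()` is a hypothetical helper that reads a 16-bit value stored in network byte order; `ntohs()` comes from the POSIX header `<arpa/inet.h>`, so a Windows build would need a different header.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Reads a 16-bit zone stored in network byte order at 'payload' and
 * returns it in host byte order.  memcpy() avoids unaligned access. */
static uint16_t
demo_zone_from_attr(const void *payload)
{
    uint16_t be_zone;

    memcpy(&be_zone, payload, sizeof be_zone);
    return ntohs(be_zone);
}
```

On a little-endian host, skipping the `ntohs()` call would turn zone 5 (bytes 0x00 0x05) into 1280.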

Found by inspection.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
2017-12-07 15:06:00 -08:00
Yi-Hung Wei
c43a133198 dpctl: Support flush conntrack by conntrack 5-tuple
With this patch, "flush-conntrack" in ovs-dpctl and ovs-appctl accept
a conntrack 5-tuple to delete the conntrack entry specified by the 5-tuple.
For example, a user can use the following command to flush a conntrack entry
in zone 5.

$ ovs-dpctl flush-conntrack zone=5 \
  'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'

$ ovs-appctl dpctl/flush-conntrack zone=5 \
  'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'

VMWare-BZ: #1983178
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2017-12-07 13:50:29 -08:00
Yi-Hung Wei
817a76577f ct-dpif,dpif-netlink: Support conntrack flush by ct 5-tuple
This patch adds support for flushing a conntrack entry specified by the
conntrack 5-tuple, and provides the implementation in dpif-netlink.
The implementation of dpif-netlink in the linux datapath utilizes the
NFNL_SUBSYS_CTNETLINK netlink subsystem to delete a conntrack entry in
nf_conntrack.  Future patches will add support for the userspace and
Windows datapaths.

VMWare-BZ: #1983178
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2017-12-07 13:49:40 -08:00
Justin Pettit
0bd28b0bcd dpctl: Fix comment describing get_one_dp().
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2017-12-07 11:04:39 -08:00
Ben Pfaff
07754b23ee tests: Use $(MKDIR_P) instead of mkdir -p.
It is more portable.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-12-06 10:23:21 -08:00
Ben Pfaff
0cc54252c9 tests: Use $(MKDIR_P) to avoid races.
"test -d x || mkdir x" has a race when invoked in parallel: it is possible
for two processes to both see that 'x' does not exist and both try to
create it, and if that happens then one of them will fail.  This avoids
the problem.
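
The same race exists in any check-then-create sequence. As a C analogue (not the Makefile change itself), the sketch below creates the directory unconditionally and treats "already exists" as success. `demo_ensure_dir()` is a hypothetical helper and, unlike `mkdir -p`, it does not create missing parent directories.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Creates directory 'path' if needed.  Returns 0 if the directory exists
 * when the call returns, -1 otherwise.  There is no separate existence
 * test beforehand, so two concurrent callers can both succeed. */
static int
demo_ensure_dir(const char *path)
{
    struct stat st;

    if (mkdir(path, 0777) == 0) {
        return 0;
    }
    if (errno == EEXIST && stat(path, &st) == 0 && S_ISDIR(st.st_mode)) {
        return 0;   /* Another process created it first; that is fine. */
    }
    return -1;
}
```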

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-12-06 10:23:21 -08:00
Numan Siddique
081afa70fd OVN pacemaker: Add the monitor action for Master role
The Pacemaker resource agent periodically calls the OVN OCF script's
"monitor" action to check the status. But the OVN OCF script doesn't add
the "monitor" action for the "Master" role, so the Pacemaker resource agent
does not call the "monitor" action at all for the master. If the OVN db
servers exit for some reason, this goes completely undetected and no
standby node is promoted to master.

This patch adds the monitor action for the "Master" role. In addition, the
monitor action did not check the status of ovn-northd (if manage_northd is
yes), so this patch also checks the status of ovn-northd in the monitor
action for the "Master" role. If any ovsdb-server or ovn-northd is not
running, the monitor action will return OCF_NOT_RUNNING, which causes
Pacemaker to restart the OVN OCF resource.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1512568
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Russell Bryant <russell@ovn.org>
Signed-off-by: Russell Bryant <russell@ovn.org>
2017-12-05 10:35:10 -05:00
Justin Pettit
cbdf9440f4 ovn/TODO: Remove some completed items.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2017-12-04 13:22:51 -08:00
Yunjian Wang
995daf4c6a datapath: Fix kernel panic for uninitialized tun_dst of ovs_gso_cb.
The variable tun_dst in struct ovs_gso_cb isn't necessarily all-zeros when
the skb came from the Netlink layer. When a netdev port is deleted and a
vxlan port is immediately added, they may use the same port_no. In that
case, tun_dst in struct ovs_gso_cb has not been set when the skb is sent to
the vxlan port, and the panic is triggered.

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000052
	IP: [<ffffffffa07954f4>] rpl_vxlan_xmit+0x34/0x60 [openvswitch]
	PGD 1f9f374067 PUD 1f9f375067 PMD 0
	Oops: 0000 [#1] SMP
	RIP: 0010:[<ffffffffa07954f4>]  [<ffffffffa07954f4>] rpl_vxlan_xmit+0x34/0x60 [openvswitch]
	RSP: 0018:ffff881fff483898  EFLAGS: 00010202
	RAX: 0000000000000040 RBX: ffff881ff2d59f00 RCX: ffff881f742016b0
	RDX: 0000000000000001 RSI: ffff881f9f5f0000 RDI: ffff881ff2d59f00
	RBP: ffff881fff483898 R08: 000000000000002e R09: 0000000000000000
	R10: 0000000000000000 R11: ffff881fff483a50 R12: ffff881f74201680
	R13: 000000000000ffbe R14: 0000000000000000 R15: ffff881ff2d59f00
	FS:  00007f8b6f7fe700(0000) GS:ffff881fff480000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 0000000000000052 CR3: 0000001f9f373000 CR4: 00000000000027e0
	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
	Call Trace:
	 <IRQ>
	[<ffffffffa0786480>] ovs_vport_send+0xa0/0x180 [openvswitch]
	[<ffffffffa077414e>] do_output+0x4e/0xf0 [openvswitch]
	[<ffffffffa07758ae>] do_execute_actions+0xa6e/0xa90 [openvswitch]
	[<ffffffff815b654f>] ? netlink_unicast+0x16f/0x1b0
	[<ffffffff815732bb>] ? skb_zerocopy+0x1fb/0x380
	[<ffffffffa07847ca>] ? flow_lookup.isra.8+0x4a/0xc0 [openvswitch]
	[<ffffffffa0775b2d>] ovs_execute_actions+0x4d/0x140 [openvswitch]
	[<ffffffffa077c604>] ovs_dp_process_packet+0x94/0x140 [openvswitch]
	[<ffffffffa07762c4>] ? ovs_ct_update_key+0xc4/0x150 [openvswitch]
	[<ffffffffa078637b>] ovs_vport_receive+0x7b/0xe0 [openvswitch]
	[<ffffffffa077c604>] ? ovs_dp_process_packet+0x94/0x140 [openvswitch]
	[<ffffffff816062d6>] ? __fib_validate_source.isra.13+0x2b6/0x400
	[<ffffffff8158da15>] ? dst_init+0xe5/0xf0
	[<ffffffffa021a2af>] ? generic_packet+0x1f/0x30 [nf_conntrack]
	[<ffffffffa02160d0>] ? nf_conntrack_in+0x350/0x5f0 [nf_conntrack]
	[<ffffffffa0787047>] netdev_port_receive+0xa7/0x100 [openvswitch]
	[<ffffffffa07870be>] netdev_frame_hook+0x1e/0x30 [openvswitch]
	[<ffffffff81581a52>] __netif_receive_skb_core+0x1e2/0x800
	[<ffffffff81582088>] __netif_receive_skb+0x18/0x60
	[<ffffffff81582110>] netif_receive_skb_internal+0x40/0xc0
	[<ffffffff81583228>] napi_gro_receive+0xd8/0x130
	[<ffffffffa04ef634>] ixgbe_clean_rx_irq+0x7c4/0xa60 [ixgbe]
	[<ffffffffa04f0930>] ixgbe_poll+0x2e0/0x6c0 [ixgbe]
	[<ffffffff815828b0>] net_rx_action+0x170/0x380
	[<ffffffff81090b0f>] __do_softirq+0xef/0x280
	[<ffffffff816ac15c>] call_softirq+0x1c/0x30
	[<ffffffff8102e47d>] do_softirq+0x5d/0xb0
	[<ffffffff81090ebd>] irq_exit+0x12d/0x140
	[<ffffffff816accf8>] do_IRQ+0x58/0xf0
	[<ffffffff816a1ced>] common_interrupt+0x6d/0x6d
	<EOI>

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-12-04 09:44:36 -08:00
Lucas Alvares Gomes
c4c1e1a5af OVN: Add external_ids to NAT and Logical_Router_Static_Route tables.
The external_ids column is missing from the NAT and
Logical_Router_Static_Route tables.

Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Daniel Alvarez <dalvarez@redhat.com>
Acked-by: Miguel Angel Ajo <majopela@redhat.com>
2017-12-04 09:06:55 -08:00
Ben Pfaff
791efb3673 sflow: Correctly document setup command.
Reported-by: Shivaram Mysore <shivaram.mysore@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-12-04 08:39:39 -08:00
Ben Pfaff
2450cdabf6 coding-style: Explain when to break lines before or after binary operators.
The coding style has never been explicit about this.  This commit adds some
explanation of why one position or the other might be favored in a given
situation.

Suggested-by: Flavio Leitner <fbl@sysclose.org>
Suggested-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341091.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Tiago Lam <tiagolam@gmail.com>
2017-12-04 08:33:49 -08:00
Ben Pfaff
99cf99597f odp-util: Fix another hang in NSH action parsing.
Found by libfuzzer.

Reported-by: Bhargava Shastry <bshastry@sec.t-labs.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
2017-12-01 13:28:30 -08:00
Yi-Hung Wei
9131abe223 lib, ovsdb: Adapt headers for C++ usage
This patch adds 'extern "C"' in a couple of header files so that
they can be compiled with C++ compilers.
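
The usual pattern looks like the sketch below. The header guard `DEMO_API_H` and the function `demo_add()` are hypothetical; the definition would normally live in a separate .c file and is included here only to keep the example self-contained.

```c
#ifndef DEMO_API_H
#define DEMO_API_H 1

/* A C++ compiler defines __cplusplus.  Wrapping the declarations in
 * extern "C" gives them C linkage, so C++ code can link against the
 * C implementation.  A C compiler skips both guarded lines. */
#ifdef __cplusplus
extern "C" {
#endif

int demo_add(int a, int b);

#ifdef __cplusplus
}
#endif

#endif /* DEMO_API_H */

/* Definition; normally placed in the corresponding .c file. */
int
demo_add(int a, int b)
{
    return a + b;
}
```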

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-12-01 13:22:18 -08:00
Ben Pfaff
936cca1792 fedora.rst, rhel.rst: Fix broken build.
This fixes several "ERROR: Unexpected indentation" messages from the
docs-check target.

Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-12-01 12:55:21 -08:00
Flavio Leitner
6806b9629c RPM: Improve doc to use builddep tool.
Instead of listing all the dependencies, use the RPM group
'Development Tools' and the builddep tool to find specific
ones.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
2017-12-01 11:44:21 -08:00