ovs_flow_cmd_del() now allocates reply (if needed) after the flow has
already been removed from the flow table. If the reply allocation
fails, a netlink error is signaled with netlink_set_err(), as is
already done in ovs_flow_cmd_new_or_set() in the similar situation.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reduce and clarify locking requirements for ovs_flow_cmd_alloc_info(),
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info().
A datapath pointer is available only when holding a lock. Change
ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info() to take a
dp_ifindex directly, rather than a datapath pointer that is then
(only) used to get the dp_ifindex. This is useful, since the
dp_ifindex is available even when the datapath pointer is not, both
before and after taking a lock, which makes further critical section
reduction possible.
Make ovs_flow_cmd_alloc_info() take an 'acts' argument instead a
'flow' pointer. This allows some future patches to do the allocation
before acquiring the flow pointer.
The locking requirements after this patch are:
ovs_flow_cmd_alloc_info(): May be called without locking, must not be
called while holding the RCU read lock (due to memory allocation).
If 'acts' belong to a flow in the flow table, however, then the
caller must hold ovs_mutex.
ovs_flow_cmd_fill_info(): Either ovs_mutex or RCU read lock must be held.
ovs_flow_cmd_build_info(): This calls both of the above, so the caller
must hold ovs_mutex.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
For ovs_flow_stats_get() using ovsl_dereference() was wrong, since
flow dumps call this with RCU read lock.
ovs_flow_stats_clear() is always called with ovs_mutex, so can use
ovsl_dereference().
Also, make the ovs_flow_stats_get() 'flow' argument const to make
later patches cleaner.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Incorrect struct name was confusing, even though otherwise
inconsequental.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
The OVS implementation of BFD allows configuration of the source and
destination IP addresses of BFD packets. This patch adds the same
configuration option to the VTEP schema.
Signed-off-by: Bruce Davie <bsd@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
Bump timeout differences, because timeouts different by 1s might end up
to have the same position in the heap as rule_eviction_priority() uses
1024ms as a unit.
Also, use time/stop to avoid relying on how long an add-flow would take.
These tests were introduced by commit 6d56c1f1.
("ofproto: Update rule's priority in eviction group.")
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Kmindg G <kmindg@gmail.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Depending on how output buffers are flushed, there might
be still races. But this at least makes the race windows smaller.
Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Changing one of the files in the Open vSwitch ``lib'' directory
causes 43 binaries to be relinked, which takes a lot of time even with
parallel ``make''. 31 of those binaries are in the ``tests''
directory. ovs-test attemps to combine most of those binaries into a
single test program that just takes a subcommand name as its first
command-line argument.
The following patch makes use of this infrastructure.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Every other table has an external_ids column, which can be useful to
controller writers for integration purposes, so add one to Flow_Table also.
Reported-by: Ariel Tubaltsev <atubaltsev@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
Windows does not have a sleep(seconds). But it does have
a Sleep(milliseconds). Sleep() in windows does not have a
return value. Since we are not using the return value for xsleep()
anywhere as of now, don't return any.
Introduced by commit 275eebb9 (utils: Introduce xsleep for RCU quiescent state)
CC: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This patch shrinks the struct ofpbuf from 104 to 48 bytes on 64-bit
systems, or from 52 to 36 bytes on 32-bit systems (counting in the
'l7' removal from an earlier patch). This may help contribute to
cache efficiency, and will speed up initializing, copying and
manipulating ofpbufs. This is potentially important for the DPDK
datapath, but the rest of the code base may also see a little benefit.
Changes are:
- Remove 'l7' pointer (previous patch).
- Use offsets instead of layer pointers for l2_5, l3, and l4 using
'l2' as basis. Usually 'data' is the same as 'l2', but this is not
always the case (e.g., when parsing or constructing a packet), so it
can not be easily used as the offset basis. Also, packet parsing is
faster if we do not need to maintain the offsets each time we pull
data from the ofpbuf.
- Use uint32_t for 'allocated' and 'size', as 2^32 is enough even for
largest possible messages/packets.
- Use packed enum for 'source'.
- Rearrange to avoid unnecessary padding.
- Remove 'private_p', which was used only in two cases, both of which
had the invariant ('l2' == 'data'), so we can temporarily use 'l2'
as a private pointer.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Move most memory allocations away from the ovs_mutex critical
sections. vport allocations still happen while the lock is taken, as
changing that would require major refactoring. Also, vports are
created very rarely so it should not matter.
Change ovs_dp_cmd_get() now only takes the rcu_read_lock(), rather
than ovs_lock(), as nothing need to be changed. This was done by
ovs_vport_cmd_get() already.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Masks are inserted when flows are inserted to the table, so it is
logical to correspondingly remove masks when flows are removed from
the table, in ovs_flow_table_remove().
This allows ovs_flow_free() to be called without locking, which will
be used by later patches.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>Summary:
Use netlink_has_listeners() and NLM_F_ECHO flag to determine if a
reply is needed or not for OVS_FLOW_CMD_NEW, OVS_FLOW_CMD_SET, or
OVS_FLOW_CMD_DEL. Currently, OVS userspace does not request a reply
for OVS_FLOW_CMD_NEW, but usually does for OVS_FLOW_CMD_DEL, as stats
may have changed.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Jesse helped to clarify how to maintain the ABI. Making the
adjustment accordingly and add some comments.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)
The rcu_assign_pointer() ensures that the initialization of a structure
is carried out before storing a pointer to that structure.
And in the case of the NULL pointer, there is no structure to initialize.
So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL)
Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
sparse emits the following warning:
lib/dpif-netdev.c:1755:15: warning: Initializer entry defined twice
lib/dpif-netdev.c:1755:15: also defined here
due to a bug in sparse which doesn't like inlined functions which
expands a #define within it. This commit removes inline to make
sparse happy.
Signed-off-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This fix restores the order of include path such that the local include/ comes
before the system /usr/include in the #include path. Thus by making sure that
include/linux/types.h and include/linux/openvswitch.h take precedence over the
similar files in /usr/include/ directory.
Signed-off-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Wrap long lines, fix whitespaces, and fix a typo in a comment.
No functional changes are intended.
Cc: Andy Zhou <azhou@nicira.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Number of bytes in 2 words should be 8, instead of 4 bytes,
to better follow the mhash_finish() API. It is unlikely the fix
will improve the quality of hashing results.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
For Windows, use a kernel assigned localhost TCP port to listen for
runtime management connections and then write it into a file
so that a client can read it and then make a TCP connection.
Since we do not have the infrastructure to create pidfiles on
windows as of now, we create the *.ctl file without a pid. This
should be okay since we use different OVS_RUNDIR when we run
multiple copies of a daemon.
We do not generate man pages on Windows. But we still update them
for Windows so that anyone can read it elsewhere. Since we do not
generate it directly, we cannot dynamically show the configured
OVS_RUNDIR in windows. So, I have a not so nice \fIOVS_RUNDIR\fR
in the man page.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
We should use closesocket() while closing sockets so that
closing sockets work fine on both POSIX and Windows.
(In POSIX, we #define closesocket close)
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
When we send a wrong command to a Open vSwitch daemon, it returns
an error. In that case, close the connection to the daemon.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Commit 61057e884ca9c(ofproto-dpif-upcall: Slightly simplify
udpif_upcall_handler().) restructured the main loop in
udpif_upcall_handler() and discarded the check for the
'exit_latch' after acquiring the mutex. This makes it
possible for the following race:
- main thread sets the 'exit_latch' after the handler thread
checking it.
- main thread acquires the handler thread mutex and signals the
condition variable of handler thread.
- main thread releases the mutex and 'join' the handler thread.
- handler thread acquires the mutex, finds that n_upcalls is 0
and waits on the signal of condition variable.
- then OVS will hang forever.
This commit fixes the above issue by adding a check for the
'exit_latch' after acquiring the mutex.
Bug #1217229
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
ovs-vsctl is a command-line interface to the Open vSwitch database,
and as such it just modifies the desired Open vSwitch configuration in
the database. ovs-vswitchd, on the other hand, monitors the database
and implements the actual configuration specified in the database.
This can lead to surprises when the user makes a change to the
database, with ovs-vsctl, that ovs-vswitchd cannot actually
implement. In such a case, the ovs-vsctl command silently succeeds
(because the database was successfully updated) but its desired
effects don't actually take place. One good example of such a change
is attempting to add a port with a misspelled name (e.g. ``ovs-vsctl
add-port br0 fth0'', where fth0 should be eth0); another is creating
a bridge or a port whose name is longer than supported
(e.g. ``ovs-vsctl add-br'' with a 16-character bridge name on
Linux). It can take users a long time to realize the error, because it
requires looking through the ovs-vswitchd log.
The patch improves the situation by checking whether operations that
ovs executes succeed and report an error when
they do not. This patch only report add-br and add-port
operation errors by examining the `ofport' value that
ovs-vswitchd stores into the database record for the newly created
interface. Until ovs-vswitchd finishes implementing the new
configuration, this column is empty, and after it finishes it is
either -1 (on failure) or a positive number (on success).
Signed-off-by: Andy Zhou <azhou@nicira.com>
Co-authored-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This allows a client to obtain the IDL version of a row given its UUID.
It isn't normally useful, but there's a specialized use case for getting
the IDL version of a row given the UUID returned by
ovsdb_idl_txn_get_insert_uuid() following transaction commit.
An alternative would be to generate table-specific versions of
ovsdb_idl_txn_get_insert_uuid(). That seems reasonable to me too.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Now that we don't need to parse TCP flags from the packet after
extraction, we usually do not need the 'l7' pointer any more. When
needed, ofpbuf_get_tcp|udp|sctp|icmp_payload() or ofpbuf_get_l4_size()
can be used instead.
Removal of 'l7' was requested by Pravin for the DPDK datapath work, as
it simplifies packet parsing a bit.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Inline the most trivial ofpbuf functions to allow for better optimization.
Also inline the most often used ofpbuf_pull() and ofpbuf_try_pull(), which
should help streamline packet parsing.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Using ofpbuf_end() to compute payload length would fail if the ofpbuf
had any tailroom.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
There were more superficial differences between Windows and non-Windows
versions of the code due to naming. This commit reduces those differences.
CC: Gurucharan Shetty <shettyg@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Add basic recirculation infrastructure and user space
data path support for it. The following bond mega flow patch will
make use of this infrastructure.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Recirculation ID needs to be unique per datapath. Its usage will be
tracked by the backer that corresponds to the datapath.
In theory, Recirculation ID can be any uint32_t value, except 0. This
implementation limits to a smaller range just for ease of debugging.
Make the range size 0 effectively disables recirculation.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This fixes regressions from commit f7791740
("netdev: Rename netdev_rx to netdev_rxq")
and commit df1e5a3b.
("netdev: Extend rx_recv to pass multiple packets.")
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Remove unnecessary locking from functions that are always called with
appropriate locking.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Building OVS tree without DPDK produced the following warning message:
lib/dpif-netdev.c:1868:5: error: statement with no effect
This error message is complaining the return value of the following
macro not being used.
#define pmd_thread_setaffinity_cpu(c) (0)
The patch fixed this warnning by making the stub functions
as inline funtions.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Following patch adds DPDK netdev-class to userspace datapath. Now
OVS can use DPDK port for IO by just configuring DPDK port and then
adding dpdk type port to userspace datapath.
Refer to INSTALL.DPDK doc for further info.
This is based a patch from Gerald Rogers.
Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>