mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

158 Commits

Author SHA1 Message Date
Daniele Di Proietto
910885540a dpif-netdev: use dpif_packet structure for packets
This commit introduces a new data structure used for receiving packets from
netdevs and passing them to dpifs.
The purpose of this change is to allow storing some private data for each
packet. The subsequent commits make use of it.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:12 -07:00
Jesse Gross
a5d4fadd00 netdev-vport: Use dpif_port as base for tunnel backing port.
In most cases, tunnel ports specify a dpif name to act as the backing
port in the datapath. However, in the case of UDP tunnels the type is
used with the port number appended. This is potentially a problem for
IPsec tunnels because they have different types but should have the
same backing port. This hasn't been a problem in practice, though, because
no UDP tunnels are currently used with IPsec.

This switches to use the dpif_port in all cases plus a port number if
necessary. It does this by making the names short enough to accommodate
ports, which also makes the naming more consistent.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-06-13 16:28:22 -07:00
Thomas Graf
bbe6109de7 vswitchd: Add error column to Interface table to store error condition
Store the error condition of a failed port configuration in a new
column 'error' in the Interface table.

Example:
$ ovs-vsctl add-port br0 test -- \
     set Interface test type=vxlan options:unknown=1
ovs-vsctl: Error detected while setting up 'test'.  [...]

$ ovs-vsctl list Interface test | grep error
error         : "test: could not set configuration (Invalid argument)"

Fixing the error will clear the error column:
$ ovs-vsctl set Interface test options:remote_ip=1.1.1.1
$ ovs-vsctl list Interface test | grep error
error         : []
$

For now, the high level error messages when opening and configuring
the netdev are used. Further patches can extend passing the error
pointer into the individual netdev implementations to allow for more
fine grained error messages to be stored.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-05-19 15:42:30 -07:00
Ryan Wilson
fe83f81df9 netdev: Remove netdev from global shash when the user is changing interface configuration.
When the user changes port type (e.g. changing p0 from type 'internal' to
'gre'), the netdev must first be deleted, then re-created with the new type.
Deleting the netdev requires that no references to the netdev remain.
However, the xlate cache holds references to netdevs and the cache is only
invalidated by revalidator threads. Thus, if the cache is not invalidated
before the netdev is re-created, re-creation fails and so does the
configuration change.

This patch always removes the netdev from the global netdev shash when the
user changes port type. This ensures that the new netdev can always be created
while handler and revalidator threads can retain references to the old netdev
until they are finished.

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-05-16 11:35:38 -07:00
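The idea described in fe83f81df9 above can be sketched roughly as follows.
This is a single-threaded toy model, not the OVS code: struct toy_netdev,
toy_registry and every toy_* function are hypothetical names made up for
illustration, and a fixed array stands in for the global shash. It only shows
that unlinking a device from the name registry frees the name for a new
device while existing reference holders keep the old one alive.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy device; the real struct netdev is far richer. */
struct toy_netdev {
    char name[16];
    char type[16];
    int ref_cnt;
};

#define TOY_MAX 8
static struct toy_netdev *toy_registry[TOY_MAX];  /* Stand-in for the shash. */
static int toy_freed;                   /* Counts freed devices (for demo). */

static struct toy_netdev *
toy_lookup(const char *name)
{
    for (int i = 0; i < TOY_MAX; i++) {
        if (toy_registry[i] && !strcmp(toy_registry[i]->name, name)) {
            return toy_registry[i];
        }
    }
    return NULL;
}

/* Creates and registers a device.  Fails if 'name' is still registered. */
struct toy_netdev *
toy_create(const char *name, const char *type)
{
    if (toy_lookup(name)) {
        return NULL;
    }
    for (int i = 0; i < TOY_MAX; i++) {
        if (!toy_registry[i]) {
            struct toy_netdev *dev = calloc(1, sizeof *dev);
            if (!dev) {
                return NULL;
            }
            snprintf(dev->name, sizeof dev->name, "%s", name);
            snprintf(dev->type, sizeof dev->type, "%s", type);
            dev->ref_cnt = 1;
            toy_registry[i] = dev;
            return dev;
        }
    }
    return NULL;
}

/* Frees the name for reuse without waiting for other reference holders. */
void
toy_remove_from_registry(struct toy_netdev *dev)
{
    for (int i = 0; i < TOY_MAX; i++) {
        if (toy_registry[i] == dev) {
            toy_registry[i] = NULL;
        }
    }
}

/* Drops one reference; the last reference frees the device. */
void
toy_unref(struct toy_netdev *dev)
{
    assert(dev->ref_cnt > 0);
    if (!--dev->ref_cnt) {
        toy_remove_from_registry(dev);
        free(dev);
        toy_freed++;
    }
}
```

In this model, a caller that changes a port's type would call
toy_remove_from_registry() on the old device and then toy_create() with the
new type; a cache that still holds a reference simply calls toy_unref() when
it is done.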
Joe Stringer
ca94dda64d netdev: Safely increment refcount in netdev_open().
netdev_open() would previously increment a netdev's refcount without
holding a lock for it. This commit shifts the locking to protect it.

Found by inspection.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Reviewed-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-05 11:20:43 +12:00
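A minimal sketch of the pattern ca94dda64d above describes: the reference
count is incremented while the lock that protects the lookup is still held.
The names (toy_netdev, toy_registry_mutex, toy_netdev_open) are hypothetical
and the registry is a static array; this is not the actual netdev_open()
code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a netdev. */
struct toy_netdev {
    const char *name;
    int ref_cnt;                /* Protected by toy_registry_mutex. */
};

static pthread_mutex_t toy_registry_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct toy_netdev toy_registry[] = { { "eth0", 0 }, { "eth1", 0 } };

/* Looks up 'name' and takes a reference before the lock is released, so no
 * other thread can observe the device between lookup and increment. */
struct toy_netdev *
toy_netdev_open(const char *name)
{
    struct toy_netdev *found = NULL;
    size_t i;

    pthread_mutex_lock(&toy_registry_mutex);
    for (i = 0; i < sizeof toy_registry / sizeof toy_registry[0]; i++) {
        if (!strcmp(toy_registry[i].name, name)) {
            found = &toy_registry[i];
            found->ref_cnt++;   /* Still holding the lock here. */
            break;
        }
    }
    pthread_mutex_unlock(&toy_registry_mutex);
    return found;
}

/* Drops a reference taken by toy_netdev_open().  NULL is a no-op. */
void
toy_netdev_close(struct toy_netdev *dev)
{
    if (dev) {
        pthread_mutex_lock(&toy_registry_mutex);
        assert(dev->ref_cnt > 0);
        dev->ref_cnt--;
        pthread_mutex_unlock(&toy_registry_mutex);
    }
}
```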
Joe Stringer
a17ceb1bc4 netdev: Reuse netdev_ref() in netdev_rxq_open().
netdev_rxq_open() open-codes much of netdev_ref(), so re-use that
function instead.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Reviewed-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-05 11:20:25 +12:00
Alex Wang
41ca1e0afb netdev-vport: Checks tunnel status change when route-table is reset.
Commit 3e912ffcbb (netdev: Add 'change_seq' back to netdev.) added per-
netdev change number for indicating status change.  Future commits used
this change number to optimize the netdev status update to database.
However, the work also introduced a bug in the following scenario:

- assume interface eth0 has address 1.2.3.4, eth1 has address 10.0.0.1.
- assume tunnel port p1 is set with remote_ip=10.0.0.5.
- after setup, 'ovs-vsctl list interface p1 status' should show the
  'tunnel_egress_iface="eth1"'.
- now the address of eth1 is changed to 0 via 'ifconfig eth1 0'.
- as expected, after the change, 'ovs-vsctl list interface p1 status'
  should show the 'tunnel_egress_iface="eth0"'

However, 'tunnel_egress_iface' will not be updated on current master.
This is because the 'netdev-vport' module corresponding to p1 does
not react to routing related changes.

To fix the bug, this commit adds a change sequence number in the route-
table module and makes netdev-vport check the sequence number for
tunnel status update.

Bug #1240626

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-02 14:29:18 -07:00
Alex Wang
aaea735bb6 netdev: Fix an use of uninitialized mutex.
Commit 05bf6d3c62e1d (ovs-thread: Add checking for mutex and
rwlock initialization.) helps find a use of an uninitialized
mutex (netdev_class_mutex) during upgrade.  The assertion
check aborts OVS.

This commit fixes the issue by adding the proper initialization.

Bug #1239914.
Bug #1240598.
Bug #1240626.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-29 07:54:49 -07:00
Gurucharan Shetty
a4fdb0f3bd netdev: Initialize netdev_class_mutex.
This code path currently does not initialize
netdev_class_mutex.
dummy_enable
 ->netdev_dummy_register
   ->netdev_register_provider
     ->ovs_mutex_lock(&netdev_class_mutex)

ovsdb-server on windows crashes without it.

This commit adds a new initialization function.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-25 07:48:25 -07:00
Alex Wang
3e912ffcbb netdev: Add 'change_seq' back to netdev.
This commit can be seen as a partial revert of commit
da4a619179d (netdev: Globally track port status changes)
by adding the 'change_seq' to 'struct netdev'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-10 12:55:28 -07:00
Pravin
8a9562d21a dpif-netdev: Add DPDK netdev.
This patch adds a DPDK netdev class to the userspace datapath. OVS can
now use a DPDK port for IO by configuring the DPDK port and then
adding a dpdk type port to the userspace datapath.

Refer to the INSTALL.DPDK doc for further info.

This is based on a patch from Gerald Rogers.

Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
55c955bd8a netdev: Add support multiqueue recv.
New netdev types such as DPDK can support multi-queue IO. This
patch adds support for it.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
f77917408a netdev: Rename netdev_rx to netdev_rxq
Preparation for multi queue netdev IO.  There are no functional changes
in this patch.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
e4cfed38b1 dpif-netdev: Add poll-mode-device thread.
This patch adds a PMD type netdev for network devices with poll-mode
drivers.  Since there is no way to get a signal on packet receive
from these devices, we need to poll them in a busy loop.  To minimize
system call overhead, this patch uses the dpif-thread exclusively
for PMD devices; the rest of the devices, which need system calls to
do IO, are moved to dpif-netdev-run().
PMD devices like DPDK work in userspace, so there is no system call
overhead for them.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
40d26f04b2 netdev: Send ofpbuf directly to netdev.
The DPDK netdev needs to access the ofpbuf while sending a buffer. This
patch changes netdev_send accordingly.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
df1e5a3bc7 netdev: Extend rx_recv to pass multiple packets.
DPDK can receive multiple packets but the current netdev API does
not allow that.  This patch allows dpif-netdev to receive a batch
of packets in a single rx_recv() call for any netdev port.  This will
be used by dpdk-netdev.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-21 11:48:28 -07:00
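The batch receive idea in df1e5a3bc7 above could look roughly like this. It
is a sketch under assumptions: toy_rxq, TOY_BATCH_MAX and toy_rxq_recv() are
invented names, packets are represented by plain integer ids, and the batch
limit of 4 is arbitrary. The real rx_recv() signature is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_BATCH_MAX 4         /* Hypothetical per-call batch limit. */

/* Toy receive queue: a cursor over packet ids waiting to be received. */
struct toy_rxq {
    const int *pending;
    size_t n_pending;
};

/* Moves up to TOY_BATCH_MAX pending packets into 'batch' and stores the
 * number moved in '*cnt'.  Returns 0 if at least one packet was received,
 * otherwise EAGAIN. */
int
toy_rxq_recv(struct toy_rxq *rxq, int batch[TOY_BATCH_MAX], size_t *cnt)
{
    size_t n = rxq->n_pending < TOY_BATCH_MAX ? rxq->n_pending
                                              : TOY_BATCH_MAX;
    size_t i;

    assert(cnt != NULL);
    *cnt = 0;
    if (!n) {
        return EAGAIN;
    }
    for (i = 0; i < n; i++) {
        batch[i] = rxq->pending[i];
    }
    rxq->pending += n;
    rxq->n_pending -= n;
    *cnt = n;
    return 0;
}
```

A caller would loop on toy_rxq_recv() until it returns EAGAIN, processing
'cnt' packets per call instead of one.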
Ben Pfaff
8917f72cbb ovs-atomic: Delete atomic, atomic_flag, ovs_refcount destroy functions.
None of the atomic implementations need a destroy function anymore, so it's
"more standard" and more convenient for users to get rid of them.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2014-03-13 12:45:47 -07:00
Ben Pfaff
2f51a7ebda Use __linux__ instead of LINUX_DATAPATH in C code.
The LINUX_DATAPATH C preprocessor symbol was originally meant to be used as
a signal for whether the Linux datapath module could be used, but it was
used as a proxy for a lot of other stuff that is really just Linux
specific.  This commit switches all of these users to just test for
__linux__, which is more straightforward and should have the same result.

CC: Luigi Rizzo <rizzo@iet.unipi.it>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-03-05 07:51:55 -08:00
Ben Pfaff
da695497b8 netdev: Change netdev_class_rwlock to recursive mutex, for POSIX safety.
With glibc, rwlocks by default allow recursive read-locking even if a
thread is blocked waiting for the write-lock.  POSIX allows such attempts
to deadlock, and it appears that the libc used in NetBSD, at least, does
deadlock.  The netdev_class_rwlock is in fact acquired recursively in this
way, which is a bug.  This commit fixes the problem by switching to a
recursive mutex.  This allows for less parallelism, but according to an
existing comment that doesn't matter here anyway.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
2014-02-19 14:50:57 -08:00
Simon Horman
b73c85181d netdev-linux: Read packet auxdata to obtain vlan_tid
If VLAN acceleration is used when the kernel receives a packet
then the outer-most VLAN tag will not be present in the packet
when it is received by netdev-linux. Rather, it will be present
in auxdata.

This patch uses recvmsg() instead of recv() to read auxdata for
each packet and if the vlan_tid is set then it is added to the packet.

Adding the vlan_tid makes use of headroom available
in the buffer parameter of rx_recv.

Signed-off-by: Simon Horman <horms@verge.net.au>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-16 15:09:14 -08:00
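The headroom step mentioned in b73c85181d above can be illustrated with a
small sketch. It does not perform the recvmsg() call or parse auxdata; it
assumes the tag value has already been obtained and that the caller reserved
4 bytes of headroom in front of the frame. toy_push_vlan() and the TOY_*
constants are hypothetical names, and the TPID is fixed at 0x8100 for
simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ADDRS_LEN 12    /* Destination MAC + source MAC. */
#define TOY_VLAN_HLEN 4         /* TPID (2 bytes) + TCI (2 bytes). */

/* Inserts an 802.1Q header carrying 'tci' after the MAC addresses of the
 * Ethernet frame that starts at 'frame' and is 'len' bytes long.
 *
 * The caller must guarantee TOY_VLAN_HLEN bytes of writable headroom
 * immediately before 'frame'.  Returns the new start of the frame, which is
 * now 'len' + TOY_VLAN_HLEN bytes long. */
uint8_t *
toy_push_vlan(uint8_t *frame, size_t len, uint16_t tci)
{
    uint8_t *start = frame - TOY_VLAN_HLEN;

    assert(len >= TOY_ETH_ADDRS_LEN);

    /* Slide the MAC addresses back into the headroom... */
    memmove(start, frame, TOY_ETH_ADDRS_LEN);

    /* ...and write the tag into the gap, in network byte order. */
    start[12] = 0x81;
    start[13] = 0x00;
    start[14] = (uint8_t) (tci >> 8);
    start[15] = (uint8_t) (tci & 0xff);
    return start;
}
```

Only the 12 address bytes move; the rest of the frame stays in place, which
is why headroom makes the operation cheap.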
Simon Horman
bfd3367b9e netdev_class: Pass a struct ofpbuf * to rx_recv()
Update the netdev_class so that struct ofpbuf * is passed to rx_recv()
to provide both the data and size of the data to read a packet into.

On success, update struct ofpbuf size inside netdev_class rx_recv
implementation and return 0. This moves logic from the caller.
On error a positive error code is returned, whereas previously
a negative error code was returned. This is a more common convention.

This patch should not have any behavioural changes.

This patch is in preparation for the netdev-linux variant of rx_recv()
making use of headroom in the struct ofpbuf * parameter to push a VLAN tag
obtained from auxdata.

Signed-off-by: Simon Horman <horms@verge.net.au>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-16 14:37:43 -08:00
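The return convention described in bfd3367b9e above (0 on success with the
buffer size updated by the callee, a positive errno value on failure) might
be sketched like this. struct toy_buf is a hypothetical, much simplified
stand-in for struct ofpbuf, and toy_rx_recv() copies from a caller-supplied
source instead of reading from a device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified buffer: storage plus the bytes in use. */
struct toy_buf {
    void *data;
    size_t size;                /* Bytes in use; set by the receiver. */
    size_t allocated;           /* Capacity of 'data'. */
};

/* "Receives" the 'src_len' bytes at 'src' into 'buf'.
 *
 * On success, sets buf->size and returns 0.  On failure, leaves 'buf'
 * untouched and returns a positive errno value: EAGAIN if nothing is
 * available, EMSGSIZE if the packet does not fit. */
int
toy_rx_recv(const void *src, size_t src_len, struct toy_buf *buf)
{
    assert(buf != NULL && buf->data != NULL);
    if (!src_len) {
        return EAGAIN;
    }
    if (src_len > buf->allocated) {
        return EMSGSIZE;
    }
    memcpy(buf->data, src, src_len);
    buf->size = src_len;
    return 0;
}
```

With this shape the caller no longer has to decode a signed return value
into either a length or an error; it checks for nonzero and reads buf->size.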
Ben Pfaff
c5f81b20da ovs-atomic: Add atomic_destroy() and use everywhere it is needed.
C11 is able to require that atomics don't need to be destroyed, but some
of the OVS implementations do.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-08 17:13:30 -08:00
Joe Stringer
da4a619179 netdev: Globally track port status changes
Previously, we tracked status changes for ofports on a per-device basis.
Each time in the main thread's loop, we would inspect every ofport
to determine whether the status had changed for corresponding devices.

This patch replaces the per-netdev change_seq with a global 'struct seq'
which tracks status change for all ports. In the average case where
ports are not constantly going up or down, this allows us to check the
sequence once per main loop and not poll any ports. In the worst case,
execution is expected to be similar to how it is currently.

In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces average CPU usage of the main thread from about 40% to
about 35%.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-12-12 15:03:19 -08:00
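The trade-off in da4a619179 above can be shown with a toy model: one global
sequence number is bumped on any port status change, and the main loop skips
per-port inspection entirely when the number has not moved. This sketch uses
a plain integer and no threads; the names toy_global_seq,
toy_port_status_changed() and toy_main_loop_run() are hypothetical and do
not correspond to the OVS 'struct seq' API.

```c
#include <assert.h>
#include <stdint.h>

/* Bumped whenever any port's status changes. */
static uint64_t toy_global_seq = 1;

void
toy_port_status_changed(void)
{
    toy_global_seq++;
}

struct toy_main_loop {
    uint64_t last_seen;         /* Value of toy_global_seq last acted on. */
};

/* Runs one main loop iteration over 'n_ports' ports.  Returns how many
 * ports had to be inspected: all of them if anything changed since the
 * previous iteration, none otherwise. */
int
toy_main_loop_run(struct toy_main_loop *loop, int n_ports)
{
    assert(n_ports >= 0);
    if (loop->last_seen == toy_global_seq) {
        return 0;               /* Common case: a single comparison. */
    }
    loop->last_seen = toy_global_seq;
    return n_ports;             /* Worst case: same cost as polling. */
}
```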
Ben Pfaff
4f6b993481 netdev: Log a warning when netdev_set_config() fails.
This allows its callers to avoid duplicating the code.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2013-12-11 11:08:33 -08:00
Ben Pfaff
89454bf477 netdev: Fix deadlock when netdev_dump_queues() callback calls into netdev.
We have a call chain like this:

    iface_configure_qos() calls
        netdev_dump_queues(), which calls
            netdev_linux_dump_queues(), which calls back through 'cb' to
                qos_unixctl_show_cb(), which calls
                    netdev_delete_queue(), which calls
                        netdev_linux_delete_queue().

Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.

This commit fixes the problem by getting rid of the callback.

netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.

Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-27 21:50:40 -07:00
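One common way to avoid the kind of deadlock described in 89454bf477 above
is to stop running caller code while the device mutex is held: the dump
function copies out a snapshot under the lock, and the caller acts on the
snapshot afterwards. The sketch below illustrates that general approach with
hypothetical names (toy_netdev, toy_dump_queues, toy_delete_queue); it is not
a reproduction of the interface the commit introduced.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 8

struct toy_netdev {
    pthread_mutex_t mutex;              /* Ordinary, non-recursive mutex. */
    unsigned int queues[TOY_MAX_QUEUES];
    size_t n_queues;
};

/* Initializes 'dev' with queue ids 0...n-1. */
void
toy_netdev_init(struct toy_netdev *dev, size_t n)
{
    size_t i;

    assert(n <= TOY_MAX_QUEUES);
    pthread_mutex_init(&dev->mutex, NULL);
    for (i = 0; i < n; i++) {
        dev->queues[i] = (unsigned int) i;
    }
    dev->n_queues = n;
}

/* Copies the current queue ids into 'ids' and returns how many there are.
 * No caller-supplied code runs while the mutex is held. */
size_t
toy_dump_queues(struct toy_netdev *dev, unsigned int ids[TOY_MAX_QUEUES])
{
    size_t i, n;

    pthread_mutex_lock(&dev->mutex);
    n = dev->n_queues;
    for (i = 0; i < n; i++) {
        ids[i] = dev->queues[i];
    }
    pthread_mutex_unlock(&dev->mutex);
    return n;
}

/* Deletes queue 'id'.  Takes the same mutex as toy_dump_queues(), which is
 * safe because the dump has already released it.  Returns 0 or ENOENT. */
int
toy_delete_queue(struct toy_netdev *dev, unsigned int id)
{
    int error = ENOENT;
    size_t i;

    pthread_mutex_lock(&dev->mutex);
    for (i = 0; i < dev->n_queues; i++) {
        if (dev->queues[i] == id) {
            dev->queues[i] = dev->queues[--dev->n_queues];
            error = 0;
            break;
        }
    }
    pthread_mutex_unlock(&dev->mutex);
    return error;
}
```

The snapshot can be stale by the time the caller uses it, so
toy_delete_queue() reports ENOENT rather than assuming the queue still
exists.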
Ethan Jackson
3fd6ef29e7 netdev: Make run and wait functions optional as documented.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Reported-by: Guolin Yang <gyang@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-08-21 13:19:48 -07:00
Ben Pfaff
d72e22c841 netdev: Clean up on "construct" error in netdev_open().
Reported-by: ZhengLingyun <konghuarukhr@163.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-11 10:24:48 -07:00
Ben Pfaff
863838160e netdev: Make netdev access thread-safe.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-09 21:34:02 -07:00
Ben Pfaff
9dc63482bb netdev: Adopt four-step alloc/construct/destruct/dealloc lifecycle.
This is the same lifecycle used in the ofproto provider interface.
Compared to the previous netdev provider interface, it has the
advantage that the netdev top layer can control when any given
netdev becomes visible to the outside world.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-09 21:21:38 -07:00
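A rough sketch of a four-step alloc/construct/destruct/dealloc lifecycle as
described in 9dc63482bb above. All names here (toy_class, toy_dev, toy_open,
toy_eth_*) are hypothetical, and the 'visible' flag is just a stand-in for
"registered where other code can find it". The point is that the top layer,
not the provider, decides when the device becomes visible, and that a failed
construct only needs a dealloc.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct toy_dev;

/* Provider interface: four separate lifecycle steps. */
struct toy_class {
    struct toy_dev *(*alloc)(void);        /* Memory only. */
    int (*construct)(struct toy_dev *);    /* 0 or a positive errno. */
    void (*destruct)(struct toy_dev *);    /* Undoes construct. */
    void (*dealloc)(struct toy_dev *);     /* Undoes alloc. */
};

/* Common part, owned by the top layer. */
struct toy_dev {
    const struct toy_class *class;
    char name[16];
    int visible;        /* Set by the top layer after construct succeeds. */
};

int
toy_open(const struct toy_class *class, const char *name,
         struct toy_dev **devp)
{
    struct toy_dev *dev = class->alloc();
    int error;

    *devp = NULL;
    if (!dev) {
        return ENOMEM;
    }
    dev->class = class;
    snprintf(dev->name, sizeof dev->name, "%s", name);
    dev->visible = 0;

    error = class->construct(dev);
    if (error) {
        class->dealloc(dev);    /* Never constructed, so no destruct. */
        return error;
    }
    dev->visible = 1;           /* Only now does anyone else see it. */
    *devp = dev;
    return 0;
}

void
toy_close(struct toy_dev *dev)
{
    assert(dev->visible);
    dev->visible = 0;           /* Hide first, then tear down. */
    dev->class->destruct(dev);
    dev->class->dealloc(dev);
}

/* Example provider: embeds the common part as its first member. */
struct toy_eth {
    struct toy_dev up;
    int link_speed;
};

static struct toy_dev *
toy_eth_alloc(void)
{
    struct toy_eth *eth = calloc(1, sizeof *eth);
    return eth ? &eth->up : NULL;
}

static int
toy_eth_construct(struct toy_dev *dev)
{
    return strncmp(dev->name, "eth", 3) ? EINVAL : 0;
}

static void
toy_eth_destruct(struct toy_dev *dev)
{
    (void) dev;                 /* Nothing to release in this toy. */
}

static void
toy_eth_dealloc(struct toy_dev *dev)
{
    free((struct toy_eth *) dev);   /* 'up' is the first member. */
}

const struct toy_class toy_eth_class = {
    toy_eth_alloc, toy_eth_construct, toy_eth_destruct, toy_eth_dealloc
};
```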
Ben Pfaff
991e5fae57 netdev: Make netdev_from_name() take a reference to its returned netdev.
This API change is necessary for thread safety, to be added in an upcoming
commit.  Otherwise, the client would not be able to safely use the returned
netdev because it could already have been destroyed.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:52:43 -07:00
Ben Pfaff
2f980d7417 netdev: Make netdev_get_devices() take a reference to each netdev.
This API change is necessary for thread safety, to be added in an upcoming
commit.  Otherwise, the client would not be able to actually use any of
the returned netdevs because they could already have been destroyed.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:52:42 -07:00
Ben Pfaff
15aee11695 netdev: Minor formatting improvements.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-02 12:24:19 -07:00
Ben Pfaff
6dc34a0dab Implement OpenFlow 1.3 queue stats duration feature.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-26 10:20:04 -07:00
Ben Pfaff
10a89ef04d Replace all uses of strerror() by ovs_strerror(), for thread safety.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-28 16:09:38 -07:00
Ethan Jackson
e20ae81136 netdev: Support null netdev argument in netdev_ref().
This will be convenient in future patches.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-06-27 18:23:40 -07:00
YAMAMOTO Takashi
666afb55e8 add minimal NetBSD support
mostly ride on the existing FreeBSD support.

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-22 20:39:14 -07:00
Ben Pfaff
0bb0393a0b netdev: New function netdev_ref().
I suspect that this makes it easier to make sure that a netdev stays open
as long as needed in some cases where a module needs access to a netdev
opened by some higher-level module.

CC: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-22 11:10:30 -07:00
Alex Wang
94a538422d netdev: Prevent using reserved names
This commit adds a function to lib/netdev.c to check that the interface name
is not the same as any of the registered vport providers' dpif_port name
(e.g. gre_system) or the datapath's internal port name (e.g. ovs-system).

Bug #15077.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-16 14:03:17 -07:00
Ben Pfaff
b5d57fc879 netdev: Get rid of netdev_dev.
The distinction between struct netdev_dev and struct netdev has always
been confusing.  Now that previous commits have eliminated all interesting
state from struct netdev, this commit deletes it and renames struct
netdev_dev to take its place.  Now the situation makes much more sense and
I won't have to continue making embarrassed explanations in the future.

Good riddance.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-10 14:46:15 -07:00
Ben Pfaff
796223f5bc netdev: Add new "struct netdev_rx" for capturing packets from a netdev.
Separating packet capture from "struct netdev" means that there is no
remaining per-"struct netdev" state, which will allow us to get rid of
"struct netdev_dev" (by renaming it "struct netdev").

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-05-10 14:39:36 -07:00
Ben Pfaff
4b60911067 netdev: Factor restoring flags into new "struct netdev_saved_flags".
This gets rid of the only per-instance data in "struct netdev", which
will make it possible to merge "struct netdev_dev" into "struct netdev" in
a later commit.

Ed Maste wrote the netdev-bsd changes in this commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Co-authored-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ed Maste <emaste@freebsd.org>
Tested-by: Ed Maste <emaste@freebsd.org>
2013-05-10 11:24:07 -07:00
Ben Pfaff
edfbe9f7bc netdev: Make 'smap' variable const in netdev_set_qos().
This makes this code more obviously thread-safe.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-03 13:29:47 -07:00
Ben Pfaff
c86073a000 netdev: Remove netdev_is_open(), which has no users.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-04-01 15:49:32 -07:00
Ben Pfaff
7e3dfa3bc7 netdev: Remove netdev_exists(), which has no users.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-04-01 15:49:31 -07:00
Ethan Jackson
c060c4cf83 netdev-vport: Build on all platforms.
This patch removes the final bit of linux specific code which
prevents building netdev-vport everywhere.  With this, other
platforms automatically get access to patch ports, and (if their
datapath supports it), flow based tunneling.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-28 19:09:58 -08:00
Ethan Jackson
0a740f4829 ofproto-dpif: Implement patch ports in userspace.
This commit moves responsibility for implementing patch ports from
the datapath to ofproto-dpif.  There are two main reasons to do
this.

The first is a matter of design:  ofproto-dpif both has more
information than the datapath, and is better suited to handle the
complexity required to implement patch ports.

The second is performance.  My setup is a virtual machine with two
basic learning bridges connected by patch ports.  I used
ovs-benchmark to ping the virtual router IP residing outside the
VM.  Over a 60 second run, "ovs-benchmark rate" improves from
14618.1 to 19311.9 transactions per second, or a 32% improvement.
Similarly, "ovs-benchmark latency" improves from 6ms to 4ms.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-24 12:34:07 -08:00
Ben Pfaff
cb22974d77 Replace most uses of assert by ovs_assert.
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-01-16 16:03:37 -08:00
Ethan Jackson
f431bf7d78 netdev: Parse and make available tunnel configuration.
Future patches will need to know the details of a netdev's tunnel
configuration from outside the netdev library.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-15 16:21:09 -08:00
Ethan Jackson
275707c33f netdev: Rename get_drv_info() to get_status().
get_status() is a much more intuitive name since "status" is what
the database column is called.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-03 16:55:43 -08:00
Ben Pfaff
d02a5f8ea4 ofproto: Report 0 Mbps when speed not available instead of 100 Mbps.
When a link is down, or when a link has no speed because it is not a
physical interface, Open vSwitch previously reported that its rate is 100
Mbps as a default.  This is counterintuitive, however, so this commit
changes Open vSwitch behavior to report 0 Mbps when a link is down or its
speed is otherwise unavailable.

Bug #13388.
Reported-by: Hiroshi Tanaka <htanaka@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-11-03 18:00:39 -07:00