Previously, we tracked status changes for ofports on a per-device basis.
Each time in the main thread's loop, we would inspect every ofport
to determine whether the status had changed for corresponding devices.
This patch replaces the per-netdev change_seq with a global 'struct seq'
which tracks status change for all ports. In the average case where
ports are not constantly going up or down, this allows us to check the
sequence once per main loop and not poll any ports. In the worst case,
execution is expected to be similar to how it is currently.
In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces average CPU usage of the main thread from about 40% to
about 35%.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
We have a call chain like this:
iface_configure_qos() calls
netdev_dump_queues(), which calls
netdev_linux_dump_queues(), which calls back through 'cb' to
qos_unixctl_show_cb(), which calls
netdev_delete_queue(), which calls
netdev_linux_delete_queue().
Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.
This commit fixes the problem by getting rid of the callback.
netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.
Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This is the same lifecycle used in the ofproto provider interface.
Compared to the previous netdev provider interface, it has the
advantage that the netdev top layer can control when any given
netdev becomes visible to the outside world.
Signed-off-by: Ben Pfaff <blp@nicira.com>
This API change is necessary for thread safety, to be added in an upcoming
commit. Otherwise, the client would not be able to safely use the returned
netdev because it could already have been destroyed.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
This API change is necessary for thread safety, to be added in an upcoming
commit. Otherwise, the client would not be able to actually use any of
the returned netdevs because they could already have been destroyed.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
I suspect that this makes it easier to make sure that a netdev stays open
as long as needed in some cases where a module needs access to a netdev
opened by some higher-level module.
CC: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit adds a function to lib/netdev.c to check that the interface name
is not the same as any of the registered vport providers' dpif_port name
(e.g. gre_system) or the datapath's internal port name (e.g. ovs-system).
Bug #15077.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The distinction between struct netdev_dev and struct netdev has always
been confusing. Now that previous commits have eliminated all interesting
state from struct netdev, this commit deletes it and renames struct
netdev_dev to take its place. Now the situation makes much more sense and
I won't have to continue making embarrassed explanations in the future.
Good riddance.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Separating packet capture from "struct netdev" means that there is no
remaining per-"struct netdev" state, which will allow us to get rid of
"struct netdev_dev" (by renaming it "struct netdev").
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
This gets rid of the only per-instance data in "struct netdev", which
will make it possible to merge "struct netdev_dev" into "struct netdev" in
a later commit.
Ed Maste wrote the netdev-bsd changes in this commit.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Co-authored-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ed Maste <emaste@freebsd.org>
Tested-by: Ed Maste <emaste@freebsd.org>
This patch removes the final bit of linux specific code which
prevents building netdev-vport everywhere. With this, other
platforms automatically get access to patch ports, and (if their
datapath supports it), flow based tunneling.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
This commit moves responsibility for implementing patch ports from
the datapath to ofproto-dpif. There are two main reasons to do
this.
The first is a matter of design: ofproto-dpif both has more
information than the datapath, and is better suited to handle the
complexity required to implement patch ports.
The second is performance. My setup is a virtual machine with two
basic learning bridges connected by patch ports. I used
ovs-benchmark to ping the virtual router IP residing outside the
VM. Over a 60 second run, "ovs-benchmark rate" improves from
14618.1 to 19311.9 transactions per second, or a 32% improvement.
Similarly, "ovs-benchmark latency" improves from 6ms to 4ms.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Future patches will need to know the details of a netdev's tunnel
configuration from outside the netdev library.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
When a link is down, or when a link has no speed because it is not a
physical interface, Open vSwitch previously reported that its rate is 100
Mbps as a default. This is counterintuitive, however, so this commit
changes Open vSwitch behavior to report 0 Mbps when a link is down or its
speed is otherwise unavailable.
Bug #13388.
Reported-by: Hiroshi Tanaka <htanaka@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The ESX userspace looks quite a bit like linux, but has some key
differences which need to be specially handled in the build. To
distinguish between ESX and systems which use the linux datapath
module, this patch adds two new macros "ESX" and "LINUX_DATAPATH".
It uses these macros to disable building code on ESX which only
applies to a true Linux environment. In addition, it adds a new
route-table-stub implementation which is required for the build to
complete successfully on ESX.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Casts are sometimes necessary. One common reason that they are necessary
is for discarding a "const" qualifier. However, this can impede
maintenance: if the type of the expression being cast changes, then the
presence of the cast can hide a necessary change in the code that does the
cast. Using CONST_CAST, instead of a bare cast, makes these changes
visible.
Inspired by my own work elsewhere:
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/cast.h#n80
Signed-off-by: Ben Pfaff <blp@nicira.com>
This patch adds new netdev classes that implement
"system" and "tap" devices on FreeBSD using the
libpcap library. This enables the use of the
"netdev" datapath_type of Open vSwitch on FreeBSD.
Signed-off-by: Gaetano Catalli <gaetano.catalli@gmail.com>
Signed-off-by: Ed Maste <emaste@adaranet.com>
Signed-off-by: Giuseppe Lettieri <g.lettieri@iet.unipi.it>
Signed-off-by: Ben Pfaff <blp@nicira.com>
A smap is a string to string hash map. It has a cleaner interface
than shash's which were traditionally used for the same purpose.
This patch implements the data structure, and changes netdev and
its providers to use it.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
iface_configure_qos() passes a callback to netdev_dump_queues() that can
delete queues. The netdev-linux implementation of this function was
unprepared for the callback to delete queues, so this could cause a
use-after-free. This fixes the problem in netdev_linux_dump_queues() and
documents that netdev_dump_queues() implementations must support deletions
in the callback.
Found by valgrind:
==1593== Invalid read of size 8
==1593== at 0x4A8C43: netdev_linux_dump_queues (hmap.h:326)
==1593== by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593== by 0x431384: bridge_run (bridge.c:1892)
==1593== by 0x432749: main (ovs-vswitchd.c:96)
==1593== Address 0x632e078 is 8 bytes inside a block of size 32 free'd
==1593== at 0x4C240FD: free (vg_replace_malloc.c:366)
==1593== by 0x4A4D74: hfsc_class_delete (netdev-linux.c:3250)
==1593== by 0x42AA59: iface_delete_queues (bridge.c:3055)
==1593== by 0x4A8C8C: netdev_linux_dump_queues (netdev-linux.c:1881)
==1593== by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593== by 0x431384: bridge_run (bridge.c:1892)
Bug #10164.
Reported-by: Ram Jothikumar <ram@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
netdev linux devices uses mtu ioctl to get and set MTU for a device.
By caching error code from ioctl we can reduce number of ioctl calls
for device which is unregistered from system.
netdev notification is used to update mtu which saves get-mtu-ioctl.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
netdev_get_features() and other functions have always used OpenFlow 1.0
"enum ofp_port_features" bits as part of their interface. This commit
switches over to using an internally defined interface that is not tied
directly to any OpenFlow version, making evolution of each side of the
interface easier in the future.
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
netdev_find_dev_by_in4() appears to no longer be used and thus
can be removed. This also allows netdev_enumerate(), the
enumerate member of struct netdev_class and netdev_linux_enumerate()
to be removed.
I noticed this as netdev_linux_enumerate() makes use of if_nameindex()
and if_freenameindex() which are not available when compiling using
the Android NDK r6b (Android API level 13).
Most netdev provider functions are allowed to be null if the implementation
does not support this feature. This commit adds this feature for get_mtu
and set_mtu, and changes netdev-vport to take advantage of it.
Also, changes netdev_get_mtu() to report an MTU of 0 on error, instead of
leaving the MTU indeterminate.
This allows a command like "test-openflowd --enable-dummy dummy@br0
--ports=dummy@eth0,dummy@eth1,dummy@eth2" to create a dummy datapath with
a number of dummy ports. This is more useful for testing than a dummy
datapath with just an internal port, since output to "flood" and "normal"
has less pathological results.
There is no need to have vport attribute MTU (OVS_VPORT_ATTR_MTU) as
linux net-dev-ioctl can be used to get/set MTU for linux device.
Following patch removes OVS_VPORT_ATTR_MTU from datapath protocol.
This patch also adds netdev_set_mtu interface. So that MTU adjustments
can be done from OVS userspace. get_mtu() interface is also changed, now
get_mtu() returns EOPNOTSUPP rather than returning 0 and setting *pmtu
to INT_MAX in case there is no MTU attribute for given device.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Until now, each call to netdev_open() for a particular network device
had to either specify a set of network device arguments that was either
empty or (for devices that already existed) equal to the existing device's
configuration. Unfortunately, the definition of "equality" in the latter
case was mostly done in terms of strict equality of string-to-string maps,
which caused problems in cases where, for example, one set of arguments
specified the default value of an optional argument explicitly and the
other omitted it.
The netdev interface does have provisions for defining equality other ways,
but this had only been done in one case that was especially problematic in
practice. One way to solve this particular problem would be to carefully
define equality in all the problematic cases.
This commit takes another approach based on the realization that there is
really no need to do any comparisons. Instead, it removes configuration
at netdev_open() time entirely, because almost all of netdev_open()'s
callers are not interested in creating and configuring a netdev. Most of
them just want to open a configured device and use it. Therefore, this
commit stops providing any configuration arguments to netdev_open() and the
provider functions that it calls. Instead, a caller that does want to
configure a device does so after it opens it, by calling
netdev_set_config().
This change allows us to simplify the netdev interface a bit. There is no
longer any need to implement argument comparisons. As a result, there is
also no need for "struct netdev_dev" to keep track of configuration at all.
Instead, the network devices that have configuration keep track of it in
their own internal form.
This new interface does mean that it becomes possible to accidentally
create and try to use an unconfigured netdev that requires configuration.
Bug #6677.
Reported-by: Paul Ingram <paul@nicira.com>