mirror of https://github.com/openvswitch/ovs synced 2025-08-23 02:17:42 +00:00

351 Commits

Author SHA1 Message Date
Pravin B Shelar
2c2ea5a88a netdev: Rename netdev->get_status() to netdev->get_drv_info().
get_status actually returns driver information, so get_drv_info()
is a better name.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-22 10:58:32 -07:00
Ben Pfaff
f486e8405a netdev-linux: Fix use-after-free when netdev_dump_queues() deletes queues.
iface_configure_qos() passes a callback to netdev_dump_queues() that can
delete queues.  The netdev-linux implementation of this function was
unprepared for the callback to delete queues, so this could cause a
use-after-free.  This fixes the problem in netdev_linux_dump_queues() and
documents that netdev_dump_queues() implementations must support deletions
in the callback.

Found by valgrind:

==1593== Invalid read of size 8
==1593==    at 0x4A8C43: netdev_linux_dump_queues (hmap.h:326)
==1593==    by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593==    by 0x431384: bridge_run (bridge.c:1892)
==1593==    by 0x432749: main (ovs-vswitchd.c:96)
==1593==  Address 0x632e078 is 8 bytes inside a block of size 32 free'd
==1593==    at 0x4C240FD: free (vg_replace_malloc.c:366)
==1593==    by 0x4A4D74: hfsc_class_delete (netdev-linux.c:3250)
==1593==    by 0x42AA59: iface_delete_queues (bridge.c:3055)
==1593==    by 0x4A8C8C: netdev_linux_dump_queues (netdev-linux.c:1881)
==1593==    by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593==    by 0x431384: bridge_run (bridge.c:1892)

Bug #10164.
Reported-by: Ram Jothikumar <ram@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-03-19 13:48:24 -07:00
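The requirement documented by this commit (a dump function must tolerate a callback that deletes the queue being visited) can be illustrated with a minimal sketch. It uses a plain singly linked list rather than the hmap that netdev-linux uses, and every name in it (`struct queue`, `dump_queues()`, `delete_odd()`) is hypothetical. The key point is that the iterator reads the `next` pointer before invoking the callback.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical queue record; not the real netdev-linux structure. */
struct queue {
    struct queue *next;
    unsigned int id;
};

struct queue_list {
    struct queue *head;
};

typedef void dump_cb(struct queue_list *, struct queue *, void *aux);

/* Adds a new queue with the given 'id' at the front of 'list'. */
static void queue_list_add(struct queue_list *list, unsigned int id)
{
    struct queue *q = malloc(sizeof *q);
    assert(q != NULL);
    q->id = id;
    q->next = list->head;
    list->head = q;
}

/* Unlinks 'q' from 'list' and frees it. */
static void queue_list_delete(struct queue_list *list, struct queue *q)
{
    struct queue **pp;

    for (pp = &list->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == q) {
            *pp = q->next;
            free(q);
            return;
        }
    }
}

static unsigned int queue_list_count(const struct queue_list *list)
{
    const struct queue *q;
    unsigned int n = 0;

    for (q = list->head; q != NULL; q = q->next) {
        n++;
    }
    return n;
}

/* Calls 'cb' for every queue.  'next' is saved before the call, so 'cb'
 * may delete the queue it was handed without causing a use-after-free. */
static void dump_queues(struct queue_list *list, dump_cb *cb, void *aux)
{
    struct queue *q, *next;

    for (q = list->head; q != NULL; q = next) {
        next = q->next;
        cb(list, q, aux);
    }
}

/* Example callback: deletes queues with odd ids and counts them. */
static void delete_odd(struct queue_list *list, struct queue *q, void *aux)
{
    unsigned int *n_deleted = aux;

    if (q->id % 2) {
        queue_list_delete(list, q);
        ++*n_deleted;
    }
}
```

This sketch only tolerates deletion of the queue currently being visited. A callback that deletes other queues would need a stronger scheme.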
Ethan Jackson
a00ca915ff netdev: Consistently use 'enum netdev_features'.
Without this patch sparse gives me warnings.  At any rate, it's
cleaner to be consistent.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-03-09 13:58:03 -08:00
Pravin B Shelar
c7b1b0a535 netdev-linux: Cache error code from do_get_ifindex().
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-09 13:00:29 -08:00
Pravin B Shelar
51f8745865 netdev-linux: Cache error code from get-features.
This patch adds support for caching the error code from the ETHTOOL_GSET
call. Since an internal device is a virtual device, device features do not
make much sense for it, so the netdev_get_features op is removed for internal
devices.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-09 12:59:58 -08:00
Pravin B Shelar
c9f716683d netdev-linux: Cache error code from set-policing.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-09 12:59:27 -08:00
Pravin B Shelar
44445cac40 netdev-linux: Cache error code from ether-addr ioctl.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-09 12:58:10 -08:00
Pravin B Shelar
90a6637d5e netdev-linux: Cache error code from mtu ioctl.
netdev-linux devices use the MTU ioctl to get and set the MTU of a device.
Caching the error code from the ioctl reduces the number of ioctl calls
for a device that has been unregistered from the system.
Netdev notifications are used to update the MTU, which saves a get-mtu ioctl.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-09 12:57:48 -08:00
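The error-caching idea in this series of commits can be sketched roughly as follows. This is an illustration under assumptions, not the netdev-linux code: `struct dev_cache`, `VALID_MTU`, `cached_get_mtu()` and the `query` callback (which stands in for the real ioctl) are all hypothetical names. The cache stores the errno value alongside the data, so a device that has disappeared costs one failed query instead of one per call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical validity bits; one bit per cached item. */
enum { VALID_MTU = 1 << 0 };

struct dev_cache {
    unsigned int cache_valid;   /* Bitmap of VALID_* bits. */
    int mtu;                    /* Meaningful only if mtu_error == 0. */
    int mtu_error;              /* 0, or positive errno of the last query. */
};

/* Stand-in for the system call: returns 0 and sets '*mtu', or an errno. */
typedef int query_mtu_fn(void *aux, int *mtu);

/* Returns the cached result if there is one, otherwise queries once and
 * caches both success and failure. */
static int cached_get_mtu(struct dev_cache *c, query_mtu_fn *query,
                          void *aux, int *mtu)
{
    assert(c != NULL && query != NULL && mtu != NULL);
    if (!(c->cache_valid & VALID_MTU)) {
        c->mtu_error = query(aux, &c->mtu);
        c->cache_valid |= VALID_MTU;
    }
    if (!c->mtu_error) {
        *mtu = c->mtu;
    }
    return c->mtu_error;
}

/* Called when a change notification arrives for the device. */
static void cache_invalidate(struct dev_cache *c)
{
    c->cache_valid = 0;
}

/* Example query that behaves like an ioctl on an unregistered device.
 * 'aux' points to an int that counts how many times it was called. */
static int failing_query(void *aux, int *mtu)
{
    int *n_calls = aux;

    (void) mtu;
    ++*n_calls;
    return ENODEV;
}
```

Calling `cached_get_mtu()` twice with `failing_query` invokes the query once. After `cache_invalidate()`, the next call queries again.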
Pravin B Shelar
4f925bd39b netdev-linux: Cache drv-info for net device.
netdev-linux calls ETHTOOL_GDRVINFO on every netdev_linux_get_status(),
which is not optimal because drv-info does not change for a given device.
This patch changes netdev_linux_get_status() to read drv-info at
device initialization and cache it.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-09 12:57:13 -08:00
Ben Pfaff
8e8cddf794 netdev-linux: Use "read", not "recv", for tap devices.
"recv" only works for sockets, but tap devices aren't sockets.

Makes the userspace switch work again.

Reported-by: Ravi Kerur <Ravi.Kerur@telekom.com>
Reported-by: 胡靖飞 <hujingfei914@msn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-03-08 14:28:24 -08:00
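The difference between the two calls can be demonstrated with a small sketch. Both helper names (`rx_with_read()`, `rx_with_recv()`) are hypothetical. On Linux, recv() on a descriptor that is not a socket fails with ENOTSOCK, while read() works on any readable descriptor. A pipe can stand in for a tap device for the purpose of this demonstration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads from 'fd' with read().  Returns the byte count, or a negative
 * errno value on failure. */
static ssize_t rx_with_read(int fd, void *buf, size_t size)
{
    ssize_t n;

    assert(fd >= 0);
    do {
        n = read(fd, buf, size);
    } while (n < 0 && errno == EINTR);
    return n < 0 ? -errno : n;
}

/* Same contract, but uses recv(), which requires 'fd' to be a socket. */
static ssize_t rx_with_recv(int fd, void *buf, size_t size)
{
    ssize_t n;

    assert(fd >= 0);
    do {
        n = recv(fd, buf, size, 0);
    } while (n < 0 && errno == EINTR);
    return n < 0 ? -errno : n;
}
```

With a pipe that has three bytes pending, `rx_with_recv()` returns `-ENOTSOCK` and `rx_with_read()` returns 3.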
Ben Pfaff
2a529ead2a netdev-linux: Fix build failure with old kernel headers.
The "speed_hi" member was only introduced in 2.6.27, so builds against
older kernel headers failed.

speed_hi is fully backward compatible with older kernels, because older
kernels always set it to 0, so we could easily introduce a compatibility
layer here, but in fact I don't know of any OVS users who have interfaces
faster than 65.5 Gb/s, so it's hardly urgent.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-03-07 15:48:00 -08:00
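The 65.5 Gb/s figure follows from the width of the legacy field: a 16-bit speed in Mb/s tops out at 65535 Mb/s. A sketch of the arithmetic, assuming the two 16-bit halves are combined as high and low words (the helper names `link_speed_mbps()` and `split_speed()` are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combines the low and high 16-bit halves into a speed in Mb/s. */
static uint32_t link_speed_mbps(uint16_t speed, uint16_t speed_hi)
{
    return ((uint32_t) speed_hi << 16) | speed;
}

/* Splits a speed in Mb/s into its low and high 16-bit halves. */
static void split_speed(uint32_t mbps, uint16_t *lo, uint16_t *hi)
{
    assert(lo != NULL && hi != NULL);
    *lo = mbps & 0xffff;
    *hi = mbps >> 16;
}
```

A 10 Gb/s link fits in the low half alone (10000, with a high half of 0), whereas 100 Gb/s (100000 Mb/s) needs a high half of 1.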
Ben Pfaff
6c0386119d netdev: Abstract "features" interface away from OpenFlow 1.0.
netdev_get_features() and other functions have always used OpenFlow 1.0
"enum ofp_port_features" bits as part of their interface.  This commit
switches over to using an internally defined interface that is not tied
directly to any OpenFlow version, making evolution of each side of the
interface easier in the future.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-03-07 14:05:09 -08:00
Pravin B Shelar
ac8c341227 netdev-linux: Make netdev_set_policing coverage counter consistent with other counters.
Most of the coverage counters in netdev-linux count actual system
calls rather than reads from cached data.
This patch makes netdev_set_policing consistent by incrementing its counter
after the cache check.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-05 15:42:41 -08:00
Pravin B Shelar
bba1e6f3ac netdev-linux: Fix stats for ovs internal device.
There is no need to retrieve Linux system stats for internal devices,
because all relevant stats for a virtual device such as an internal device
are already reported by OVS over vport-stats. As a result, this also fixes
error stats for internal devices, which are no longer counted twice.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-02-29 14:09:05 -08:00
Ben Pfaff
8aa77183a1 netdev-linux: Factor out duplicate ifi_flags update code.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-02-14 16:46:40 -08:00
Ethan Jackson
059e5f4fce netdev-linux: Use 'unsigned int's to track device flags.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-02-14 16:44:21 -08:00
Ethan Jackson
c37d4da4f0 netdev-linux: Cache flags using netlink.
Before this patch, every request for a 'netdev_dev''s flags
required an ioctl call.  This occurred every time
netdev_get_carrier() was called, which theoretically was very often
if there were a large number of devices.  We were already using
netlink to keep track of the IFF_RUNNING flag. This patch
generalizes the code to keep track of all flags using the same
netlink code.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-02-14 16:40:43 -08:00
Ethan Jackson
755be9ea9d netdev-linux: Get carrier from ioctl instead of sysfs.
When a netdev Linux device is created or its netlink cache is
invalidated, it needs an alternative method to update its
carrier status.  Previous patches retrieved this information from a
sysfs file.  This patch switches to ioctl which is significantly
simpler, and likely quite a bit faster as well.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-02-14 16:40:43 -08:00
Pravin B Shelar
153e54814d netdev-linux: Add MTU check before setting MTU.
This patch checks whether the current MTU needs to be changed before
issuing the set-mtu ioctl.

Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-01-25 15:44:26 -08:00
Ben Pfaff
0e15264f96 unixctl: Implement quoting.
The protocol used by ovs-appctl has a long-standing bug that there
is no way to distinguish "ovs-appctl a b c" from "ovs-appctl 'a b c'".
This isn't a big deal because none of the current commands really
want to accept arguments that include spaces, but it's kind of a silly
limitation.

At the same time, the internal API is awkward because every user is
stuck doing its own argument parsing, which is no fun.

This commit fixes both problems, by adding shell-like quoting to the
protocol and modifying the internal API from one that passes a string
to one that passes in an array of pre-parsed strings.  Command
implementations may now specify how many arguments they expect.  This
simplifies some command implementations significantly.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2011-12-19 14:53:34 -08:00
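To illustrate what shell-like quoting buys, here is a minimal tokenizer sketch. It is not the unixctl implementation and does not claim to follow the actual protocol's quoting rules. It assumes whitespace-separated words, single and double quotes that group characters literally, and a backslash that escapes the next character outside quotes. `split_args()` and `MAX_ARGS` are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 16

/* Splits 'line' into at most MAX_ARGS words stored in 'argv', each of
 * which the caller must free().  Returns the number of words, or -1 if a
 * quote is unterminated or there are too many words. */
static int split_args(const char *line, char *argv[MAX_ARGS])
{
    const char *p = line;
    int argc = 0;

    assert(line != NULL && argv != NULL);
    for (;;) {
        char *word;
        char quote = 0;
        size_t n = 0;

        while (isspace((unsigned char) *p)) {
            p++;
        }
        if (!*p) {
            return argc;
        }
        if (argc >= MAX_ARGS) {
            goto error;
        }

        word = malloc(strlen(p) + 1);
        assert(word != NULL);
        for (; *p && (quote || !isspace((unsigned char) *p)); p++) {
            if (quote) {
                if (*p == quote) {
                    quote = 0;          /* Closing quote. */
                } else {
                    word[n++] = *p;     /* Literal, including spaces. */
                }
            } else if (*p == '\'' || *p == '"') {
                quote = *p;             /* Opening quote. */
            } else if (*p == '\\' && p[1]) {
                word[n++] = *++p;       /* Escaped character. */
            } else {
                word[n++] = *p;
            }
        }
        word[n] = '\0';
        argv[argc++] = word;
        if (quote) {
            goto error;                 /* Unterminated quote. */
        }
    }

error:
    while (argc > 0) {
        free(argv[--argc]);
    }
    return -1;
}
```

With this sketch, `a b c` yields three words and `'a b c'` yields one, which is exactly the distinction the old protocol could not express.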
Ben Pfaff
02e83e83b4 netdev-linux: Report error for truncated packets on receive.
Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2011-12-19 14:53:28 -08:00
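One way to detect a truncated packet on Linux is to pass MSG_TRUNC to recv(), which makes the call return the real datagram length even when it exceeds the buffer. The sketch below shows that technique with a hypothetical helper, `recv_whole_packet()`. Whether the commit uses exactly this approach is an assumption, and the behavior is Linux-specific.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Receives one datagram from 'fd' into 'buf'.  Returns its length, or a
 * negative errno value.  A datagram longer than 'size' is reported as
 * -EMSGSIZE instead of being silently truncated. */
static ssize_t recv_whole_packet(int fd, void *buf, size_t size)
{
    ssize_t n;

    assert(fd >= 0);
    do {
        /* With MSG_TRUNC, Linux returns the full datagram length. */
        n = recv(fd, buf, size, MSG_TRUNC);
    } while (n < 0 && errno == EINTR);

    if (n < 0) {
        return -errno;
    }
    return (size_t) n > size ? -EMSGSIZE : n;
}
```

Note that the oversized datagram is still consumed from the socket. The caller only learns that what it received was incomplete.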
Ben Pfaff
a57a8488da netdev-linux: Translate errno value to name in log message.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2011-12-19 14:53:24 -08:00
Justin Pettit
f850000440 netdev-linux: Don't restrict policing to IPv4 and don't call "tc".
Mike Bursell pointed out that our policer only works on IPv4
traffic--and specifically not IPv6.  By using the "basic" filter, we can
enforce policing on all traffic for a particular interface.

Jamal Hadi Salim pointed out that calling "tc" directly with system() is
pretty ugly.  This commit switches our remaining "tc" calls to directly
sending the appropriate netlink messages.

Suggested-by: Mike Bursell <mike.bursell@citrix.com>
Suggested-by: Jamal Hadi Salim <hadi@cyberus.ca>
2011-12-05 14:08:06 -08:00
Ben Pfaff
1f6e0fbd81 netdev-linux: Ref and unref the netdev_linux_cache_notifier for taps too.
netdev-linux uses netdev_linux_cache_notifier to flush its cache when the
kernel notifies userspace that a particular network device's configuration
or status has changed.  This is as applicable to tap devices as to system
and internal devices, so we should create and destroy the notifier for
tap devices also.

I doubt that in practice it's possible to run ovs-vswitchd without having
a non-tap device open, at least with the kernel datapath, because the
local port for a bridge is not a tap device, so there should be no need to
backport this to older versions.

Reported-by: Gaetano Catalli <gaetano.catalli@gmail.com>
2011-12-02 10:39:07 -08:00
Ben Pfaff
025e874a9c vlandev: New library for working with Linux VLAN devices.
2011-11-23 15:32:36 -08:00
Ben Pfaff
aaf2fb1aac netdev-linux: Reorganize slightly.
2011-11-23 13:19:53 -08:00
Ben Pfaff
52e2fbfbab netdev: Remove netdev_get_vlan_vid().
It has no remaining users.
2011-11-23 13:19:53 -08:00
Ethan Jackson
65c3058c22 vswitchd: New column "link_resets".
An interface's 'link_resets' column represents the number of times
Open vSwitch has observed its link_state change.
2011-10-17 15:03:03 -07:00
Ethan Jackson
3a18312428 netdev-linux: Maintain carrier flag constantly.
Before this patch, the carrier of a linux device was only updated
if requested by a caller.  This patch updates it whenever it
changes.
2011-10-17 15:03:03 -07:00
Ben Pfaff
bb7d0e2287 netdev-linux: Fix broken build on RHEL 6.
Commit 00fa9d37c2b "Do not include net/ethernet.h and linux/if_tunnel.h"
introduced a compile error on RHEL 6:

lib/netdev-linux.c: In function 'netdev_linux_listen':
lib/netdev-linux.c:734: error: 'ETH_P_ALL' undeclared (first use in this
function)

This fixes the problem.

I verified that the Android NDK r6b mentioned in the previous commit
contains a file named android-ndk-r6b/platforms/android-3/arch-x86/use/
linux/if_ether.h that defines ETH_P_ALL.  I didn't try building on that
platform.
2011-09-22 11:54:22 -07:00
Simon Horman
ee9bed06cd Remove netdev_find_dev_by_in4
netdev_find_dev_by_in4() appears to no longer be used and thus
can be removed. This also allows netdev_enumerate(), the
enumerate member of struct netdev_class and netdev_linux_enumerate()
to be removed.

I noticed this as netdev_linux_enumerate() makes use of if_nameindex()
and if_freenameindex() which are not available when compiling using
the Android NDK r6b (Android API level 13).
2011-09-22 09:03:03 -07:00
Simon Horman
00fa9d37c2 Do not include net/ethernet.h and linux/if_tunnel.h
net/ethernet.h and linux/if_tunnel.h do not appear to be needed
on lib/netdev-linux.c.

I noticed this while trying to build on the Android NDK r6b (Android API
level 13) as these headers are not present there.
2011-09-22 09:03:02 -07:00
Ethan Jackson
2ee6545f2b notifiers: Create and destroy nln_notifiers.
This patch changes the interface of netlink-notifier and
rtnetlink-link.  Now nln_notifiers are allocated and destroyed by
the module instead of passed in by callers.  This allows the
definition of nln_notifier to be hidden, and generally cleans up
the code.
2011-09-16 11:22:30 -07:00
Ethan Jackson
18a2378164 notifiers: Rename run and wait functions.
It makes more sense to call nln_notifier_run() and
nln_notifier_wait() simply nln_run() and nln_wait() since they
don't operate on notifiers but the entire nln object.  This patch
changes the nln and the rtnetlink-link modules to the new
convention.
2011-09-16 11:22:30 -07:00
Pravin Shelar
f613a0d72c datapath: Always use generic stats for devices (vports)
Currently OVS uses device stats for Linux devices and counts them
itself in other situations. This leads to overlap with hardware stats,
inconsistencies, etc. It's much better to just always count the packets
flowing through the switch and let userspace do any merging that it wants.

This patch removes the vport->get_stats() interface. vport-stat is changed
to use the new `struct ovs_vport_stat` rather than rtnl_link_stats64.
The definition of rtnl_link_stats64 is removed from OVS.  dipf_port->stat is also
removed, as aggregate stats are only available at the netdev layer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-15 19:36:17 -07:00
Pravin Shelar
9b02078077 datapath: Strip down vport interface : OVS_VPORT_ATTR_MTU
There is no need to have a vport MTU attribute (OVS_VPORT_ATTR_MTU), because
the Linux net-dev ioctl can be used to get/set the MTU of a Linux device.
This patch removes OVS_VPORT_ATTR_MTU from the datapath protocol.

This patch also adds the netdev_set_mtu interface, so that MTU adjustments
can be done from OVS userspace. The get_mtu() interface is also changed:
get_mtu() now returns EOPNOTSUPP rather than returning 0 and setting *pmtu
to INT_MAX when there is no MTU attribute for the given device.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-12 17:12:52 -07:00
Ethan Jackson
0a811051ff netlink-notifier: Rename rtnetlink code.
This patch renames the rtnetlink module's code to "nln" for
"netlink notifier".  Callers are now required to pass in the
netlink protocol to the newly renamed nln_create() function.
2011-09-01 17:18:52 -07:00
Ethan Jackson
45c8d3a189 lib: Rename rtnetlink.[ch] files.
The only rtnetlink specific functionality contained in the
rtnetlink module is the use of the NETLINK_ROUTE protocol.  This
can easily be passed in by callers.

In preparation for generalization, this patch renames
rtnetlink.[ch] to netlink-notifier.[ch].  Future patches will
complete the transition.
2011-09-01 17:18:51 -07:00
Justin Pettit
e47bd51a7b netdev-linux: Introduce netdev_linux_ethtool_set_flag().
There will be a caller added soon.
2011-08-28 13:43:38 -07:00
Ben Pfaff
de5cdb90f7 netdev: Decouple creating and configuring network devices.
Until now, each call to netdev_open() for a particular network device
had to either specify a set of network device arguments that was either
empty or (for devices that already existed) equal to the existing device's
configuration.  Unfortunately, the definition of "equality" in the latter
case was mostly done in terms of strict equality of string-to-string maps,
which caused problems in cases where, for example, one set of arguments
specified the default value of an optional argument explicitly and the
other omitted it.

The netdev interface does have provisions for defining equality other ways,
but this had only been done in one case that was especially problematic in
practice.  One way to solve this particular problem would be to carefully
define equality in all the problematic cases.

This commit takes another approach based on the realization that there is
really no need to do any comparisons.  Instead, it removes configuration
at netdev_open() time entirely, because almost all of netdev_open()'s
callers are not interested in creating and configuring a netdev.  Most of
them just want to open a configured device and use it.  Therefore, this
commit stops providing any configuration arguments to netdev_open() and the
provider functions that it calls.  Instead, a caller that does want to
configure a device does so after it opens it, by calling
netdev_set_config().

This change allows us to simplify the netdev interface a bit.  There is no
longer any need to implement argument comparisons.  As a result, there is
also no need for "struct netdev_dev" to keep track of configuration at all.
Instead, the network devices that have configuration keep track of it in
their own internal form.

This new interface does mean that it becomes possible to accidentally
create and try to use an unconfigured netdev that requires configuration.

Bug #6677.
Reported-by: Paul Ingram <paul@nicira.com>
2011-08-08 12:49:17 -07:00
Ben Pfaff
7b6b0ef47e netdev: Clean up and refactor packet receive interface.
The Open vSwitch tree only has one user of the ability for a netdev to
receive packets from a network device.  Thus, this commit simplifies the
common-case use of the netdev interface by replacing the "ethertype" option
from "struct netdev_options" by a new netdev_listen() call.

The only user of netdev_listen() wants to receive all packets from a
network device, so this commit also removes the ability to restrict the
received packets to a particular protocol.  (This ability was once used by
the Open vSwitch integrated DHCP client, but that code has been removed.)

This commit also simplifies and improves the implementation of the code
in netdev-linux that started listening to a network device.  Before, I had
not figured out how to avoid receiving all packets on all devices before
binding to a particular device, but I took a closer look at the kernel code
and figured it out.

I've tested that the userspace datapath (dpif-netdev), the only user of
netdev_recv(), still works after this change.
2011-08-08 10:24:24 -07:00
Ben Pfaff
78857dfb3d Reduce log level for ENODEV errors getting Ethernet address.
Bug #5844 reports several log messages of the form:

    netdev_linux|ERR|ioctl(SIOCGIFHWADDR) on vif426.1 device failed: No
    such device

during migrations.  These are normal and unavoidable, because the vifs
disappear from the kernel before they are removed from the OVS
database.  Reduce the log level to avoid making people worry.

Bug #5844.
2011-06-17 13:38:31 -07:00
Justin Pettit
aebf4235f3 netdev: Add methods to do netdev-specific argument comparisons.
When doing a netdev_open(), a check is first done to make sure the
arguments are equivalent for any open devices with the same name.  In
most cases, a simple shash comparison is sufficient.  However, IPsec
key configuration is handled by an external program, so it is not pushed
down into the kernel module.  Thus, when the "unparse_config" method is
called on an existing IPsec-based vport, a simple comparison with the
returned data will not match the original configuration.  This commit
adds code to allow netdev-specific argument comparisons and has
"ipsec_gre" make use of them.

Bug #5575
2011-06-14 10:28:40 -07:00
Ethan Jackson
943e5afe0b netdev: Remove monitors and notifiers.
Neither of these constructs is used anymore.
2011-05-31 14:34:39 -07:00
Ethan Jackson
ac4d3bcb46 netdev: New Function netdev_change_seq().
This new function will provide a much simpler replacement for
netdev_monitor in the future.
2011-05-31 14:34:38 -07:00
Ethan Jackson
1670c579a8 netdev: Take responsibility for polling MII registers.
This patch moves miimon logic from the bond module to netdev-linux.
This greatly simplifies the bonding code while adding minimal
complexity to netdev-linux.  The bonding code is so high level, it
really has no business worrying about how precisely slave status is
determined.
2011-05-20 12:55:36 -07:00
Ben Pfaff
b2fda3effc Merge 'next' into 'master'.
I know already that this breaks the stats fixes that were implemented by the
following commits:

827ab71c97f "ofproto: Datapath statistics accounted twice."
6f1435fc8f7 "ofproto: Resubmit statistics improperly account during..."

These were already broken in a previous merge.  I will work on a fix.
2011-05-18 14:01:13 -07:00
Ben Pfaff
6506f45c08 Make the source tree sparse clean.
With this commit, the tree compiles clean with sparse commit 87f4a7fda3d
"Teach 'already_tokenized()' to use the stream name hash table" with patch
"evaluate: Allow sizeof(_Bool) to succeed" available at
http://permalink.gmane.org/gmane.comp.parsers.sparse/2461 applied, as long
as the "include/sparse" directory is included for use by sparse (only),
e.g.:
     make CC="CHECK='sparse -I../include/sparse' cgcc"
2011-05-16 13:45:53 -07:00
Ben Pfaff
dbba996be2 Convert remaining network-byte-order "uint<N>_t"s into "ovs_be<N>"s.
I looked at almost every uint<N>_t in the tree to determine whether it was
really in network byte order, and converted the ones that were.

The only remaining ones, modulo my mistakes, are in openflow.h.  I'm not
sure whether we should convert those, because there might be some value
in remaining close to upstream for this header.
2011-05-16 13:40:47 -07:00
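The value of a distinct network-byte-order type is that sparse can flag accidental mixing with host-order integers. A rough sketch of the general technique follows, using sparse's `bitwise` and `force` attributes. The names `be16`, `BITWISE`, `FORCE`, `to_be16()`, `from_be16()` and `be16_to_bytes()` are hypothetical and are not the tree's own definitions. Under an ordinary compiler the attributes expand to nothing and the type is just a `uint16_t`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* sparse defines __CHECKER__; a normal compiler sees empty macros. */
#ifdef __CHECKER__
#define BITWISE __attribute__((bitwise))
#define FORCE __attribute__((force))
#else
#define BITWISE
#define FORCE
#endif

/* A 16-bit value in network byte order. */
typedef uint16_t BITWISE be16;

/* Converts a host-order value to network byte order. */
static be16 to_be16(uint16_t host)
{
    return (FORCE be16) htons(host);
}

/* Converts a network-order value back to host byte order. */
static uint16_t from_be16(be16 net)
{
    return ntohs((FORCE uint16_t) net);
}

/* Copies the in-memory (wire) bytes of 'net' into 'out'. */
static void be16_to_bytes(be16 net, uint8_t out[2])
{
    assert(out != NULL);
    memcpy(out, &net, sizeof net);
}
```

Regardless of host endianness, `to_be16(0x1234)` is laid out in memory as the bytes 0x12, 0x34.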
Ben Pfaff
7afa4f1d98 netdev-linux: Initialize rx_compressed, tx_compressed when converting.
rtnl_link_stats64 has rx_compressed and tx_compressed members that
struct netdev_stats lacks, so we need to initialize them to zero when
converting.

Found by valgrind.
2011-05-16 13:22:05 -07:00
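The class of bug fixed here, destination fields that have no counterpart in the source being left uninitialized, can be sketched with two trimmed-down structures. `struct small_stats` and `struct big_stats` are hypothetical stand-ins for struct netdev_stats and rtnl_link_stats64, and zeroing the whole destination first is just one way to do it. The commit itself may assign the two members explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plays the role of struct netdev_stats (heavily trimmed). */
struct small_stats {
    uint64_t rx_packets;
    uint64_t tx_packets;
};

/* Plays the role of rtnl_link_stats64 (heavily trimmed). */
struct big_stats {
    uint64_t rx_packets;
    uint64_t tx_packets;
    uint64_t rx_compressed;     /* No counterpart in small_stats. */
    uint64_t tx_compressed;     /* No counterpart in small_stats. */
};

/* Converts 'src' to 'dst'.  Zeroing 'dst' first guarantees that members
 * without a counterpart never carry uninitialized data. */
static void small_to_big(struct big_stats *dst, const struct small_stats *src)
{
    assert(dst != NULL && src != NULL);
    memset(dst, 0, sizeof *dst);
    dst->rx_packets = src->rx_packets;
    dst->tx_packets = src->tx_packets;
}
```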