2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-23 02:17:42 +00:00

129 Commits

Author SHA1 Message Date
Justin Pettit
f850000440 netdev-linux: Don't restrict policing to IPv4 and don't call "tc".
Mike Bursell pointed out that our policer only works on IPv4
traffic--and specifically not IPv6.  By using the "basic" filter, we can
enforce policing on all traffic for a particular interface.

Jamal Hadi Salim pointed out that calling "tc" directly with system() is
pretty ugly.  This commit switches our remaining "tc" calls to directly
sending the appropriate netlink messages.

Suggested-by: Mike Bursell <mike.bursell@citrix.com>
Suggested-by: Jamal Hadi Salim <hadi@cyberus.ca>
2011-12-05 14:08:06 -08:00
Ben Pfaff
1f6e0fbd81 netdev-linux: Ref and unref the netdev_linux_cache_notifier for taps too.
netdev-linux uses netdev_linux_cache_notifier to flush its cache when the
kernel notifies userspace that a particular network device's configuration
or status has changed.  This is as applicable to tap devices as to system
and internal devices, so we should create and destroy the notifier for
tap devices also.

I doubt that in practice it's possible to run ovs-vswitchd without having
a non-tap device open, at least with the kernel datapath, because the
local port for a bridge is not a tap device, so there should be no need to
backport this to older versions.

Reported-by: Gaetano Catalli <gaetano.catalli@gmail.com>
2011-12-02 10:39:07 -08:00
Ben Pfaff
025e874a9c vlandev: New library for working with Linux VLAN devices. 2011-11-23 15:32:36 -08:00
Ben Pfaff
aaf2fb1aac netdev-linux: Reorganize slightly. 2011-11-23 13:19:53 -08:00
Ben Pfaff
52e2fbfbab netdev: Remove netdev_get_vlan_vid().
It has no remaining users.
2011-11-23 13:19:53 -08:00
Ethan Jackson
65c3058c22 vswitchd: New column "link_resets".
An interface's 'link_resets' column represents the number of times
Open vSwitch has observed its link_state change.
2011-10-17 15:03:03 -07:00
Ethan Jackson
3a18312428 netdev-linux: Maintain carrier flag constantly.
Before this patch, the carrier of a linux device was only updated
if requested by a caller.  This patch updates it whenever it
changes.
2011-10-17 15:03:03 -07:00
Ben Pfaff
bb7d0e2287 netdev-linux: Fix broken build on RHEL 6.
Commit 00fa9d37c2b "Do not include net/ethernet.h and linux/if_tunnel.h"
introduced a compile error on RHEL 6:

lib/netdev-linux.c: In function 'netdev_linux_listen':
lib/netdev-linux.c:734: error: 'ETH_P_ALL' undeclared (first use in this
function)

This fixes the problem.

I verified that the Android NDK r6b mentioned in the previous commit
contains a file named android-ndk-r6b/platforms/android-3/arch-x86/use/
linux/if_ether.h that defines ETH_P_ALL.  I didn't try building on that
platform.
2011-09-22 11:54:22 -07:00
Simon Horman
ee9bed06cd Remove netdev_find_dev_by_in4
netdev_find_dev_by_in4() appears to no longer be used and thus
can be removed. This also allows netdev_enumerate(), the
enumerate member of struct netdev_class and netdev_linux_enumerate()
to be removed.

I noticed this as netdev_linux_enumerate() makes use of if_nameindex()
and if_freenameindex() which are not available when compiling using
the Android NDK r6b (Android API level 13).
2011-09-22 09:03:03 -07:00
Simon Horman
00fa9d37c2 Do not include net/ethernet.h and linux/if_tunnel.h
net/ethernet.h and linux/if_tunnel.h do not appear to be needed
on lib/netdev-linux.c.

I noticed this while trying to build on the Android NDK r6b (Android API
level 13) as these headers are not present there.
2011-09-22 09:03:02 -07:00
Ethan Jackson
2ee6545f2b notifiers: Create and destroy nln_notifiers.
This patch changes the interface of netlink-notifier and
rtnetlink-link.  Now nln_notifiers are allocated and destroyed by
the module instead of passed in by callers.  This allows the
definition of nln_notifier to be hidden, and generally cleans up
the code.
2011-09-16 11:22:30 -07:00
Ethan Jackson
18a2378164 notifiers: Rename run and wait functions.
It makes more sense to call nln_notifier_run() and
nln_notifier_wait() simply nln_run() and nln_wait() since they
don't operate on notifiers but the entire nln object.  This patch
changes the nln and the rtnetlink-link modules to the new
convention.
2011-09-16 11:22:30 -07:00
Pravin Shelar
f613a0d72c datapath: Always use generic stats for devices (vports)
Currently ovs is using device stats for Linux devices and count them
itself in other situations. This leads to overlap with hardware stats,
inconsistencies, etc. It's much better to just always count the packets
flowing through the switch and let userspace do any merging that it wants.

Following patch removes vport->get_stats() interface. vport-stat is changed
to use new `struct ovs_vport_stat` rather than rtnl_link_stats64.
Definitions of rtnl_link_stats64 is removed from OVS.  dipf_port->stat is also
removed as aggregate stats are only available at netdev layer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-15 19:36:17 -07:00
Pravin Shelar
9b02078077 datapath: Strip down vport interface : OVS_VPORT_ATTR_MTU
There is no need to have vport attribute MTU (OVS_VPORT_ATTR_MTU) as
linux net-dev-ioctl can be used to get/set MTU for linux device.
Following patch removes OVS_VPORT_ATTR_MTU from datapath protocol.

This patch also adds netdev_set_mtu interface. So that MTU adjustments
can be done from OVS userspace. get_mtu() interface is also changed, now
get_mtu() returns EOPNOTSUPP rather than returning 0 and setting *pmtu
to INT_MAX in case there is no MTU attribute for given device.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-12 17:12:52 -07:00
Ethan Jackson
0a811051ff netlink-notifier: Rename rtnetlink code.
This patch renames the rtnetlink module's code to "nln" for
"netlink notifier".  Callers are now required to pass in the
netlink protocol to he newly renamed nln_create() function.
2011-09-01 17:18:52 -07:00
Ethan Jackson
45c8d3a189 lib: Rename rtnetlink.[ch] files.
The only rtnetlink specific functionality contained in the
rtnetlink module is the use of the NETLINK_ROUTE protocol.  This
can easily be passed in by callers.

In preparation for generalization, this patch renames
rtnetlink.[ch] to netlink-notifier.[ch].  Future patches will
complete the transition.
2011-09-01 17:18:51 -07:00
Justin Pettit
e47bd51a7b netdev-linux: Introduce netdev_linux_ethtool_set_flag().
There will be a caller added soon.
2011-08-28 13:43:38 -07:00
Ben Pfaff
de5cdb90f7 netdev: Decouple creating and configuring network devices.
Until now, each call to netdev_open() for a particular network device
had to either specify a set of network device arguments that was either
empty or (for devices that already existed) equal to the existing device's
configuration.  Unfortunately, the definition of "equality" in the latter
case was mostly done in terms of strict equality of string-to-string maps,
which caused problems in cases where, for example, one set of arguments
specified the default value of an optional argument explicitly and the
other omitted it.

The netdev interface does have provisions for defining equality other ways,
but this had only been done in one case that was especially problematic in
practice.  One way to solve this particular problem would be to carefully
define equality in all the problematic cases.

This commit takes another approach based on the realization that there is
really no need to do any comparisons.  Instead, it removes configuration
at netdev_open() time entirely, because almost all of netdev_open()'s
callers are not interested in creating and configuring a netdev.  Most of
them just want to open a configured device and use it.  Therefore, this
commit stops providing any configuration arguments to netdev_open() and the
provider functions that it calls.  Instead, a caller that does want to
configure a device does so after it opens it, by calling
netdev_set_config().

This change allows us to simplify the netdev interface a bit.  There is no
longer any need to implement argument comparisons.  As a result, there is
also no need for "struct netdev_dev" to keep track of configuration at all.
Instead, the network devices that have configuration keep track of it in
their own internal form.

This new interface does mean that it becomes possible to accidentally
create and try to use an unconfigured netdev that requires configuration.

Bug #6677.
Reported-by: Paul Ingram <paul@nicira.com>
2011-08-08 12:49:17 -07:00
Ben Pfaff
7b6b0ef47e netdev: Clean up and refactor packet receive interface.
The Open vSwitch tree only has one user of the ability for a netdev to
receive packets from a network device.  Thus, this commit simplifies the
common-case use of the netdev interface by replacing the "ethertype" option
from "struct netdev_options" by a new netdev_listen() call.

The only user of netdev_listen() wants to receive all packets from a
network device, so this commit also removes the ability to restrict the
received packets to a particular protocol.  (This ability was once used by
the Open vSwitch integrated DHCP client, but that code has been removed.)

This commit also simplifies and improves the implementation of the code
in netdev-linux that started listening to a network device.  Before, I had
not figured out how to avoid receiving all packets on all devices before
binding to a particular device, but I took a closer look at the kernel code
and figured it out.

I've tested that the userspace datapath (dpif-netdev), the only user of
netdev_recv(), still works after this change.
2011-08-08 10:24:24 -07:00
Ben Pfaff
78857dfb3d Reduce log level for ENODEV errors getting Ethernet address.
Bug #5844 reports several log messages of the form:

    netdev_linux|ERR|ioctl(SIOCGIFHWADDR) on vif426.1 device failed: No
    such device

during migrations.  These are normal and unavoidable, because the vifs
disappear from the kernel before they are removed them from the OVS
database.  Reduce the log level to avoid making people worry.

Bug #5844.
2011-06-17 13:38:31 -07:00
Justin Pettit
aebf4235f3 netdev: Add methods to do netdev-specific argument comparisons.
When doing a netdev_open(), a check is first done to make sure the
arguments are equivalent for any open devices with the same name.  In
most cases, a simple shash comparison is sufficient.  However, IPsec
key configuration is handled by an external program, so it is not pushed
down into the kernel module.  Thus, when the "unparse_config" method is
called on an existing IPsec-based vport, a simple comparison with the
returned data will not match the original configuration.  This commit
adds code to allow netdev-specific argument comparisons and has
"ipsec_gre" make use of them.

Bug #5575
2011-06-14 10:28:40 -07:00
Ethan Jackson
943e5afe0b netdev: Remove monitors and notifiers.
Neither of these constructs are used anymore.
2011-05-31 14:34:39 -07:00
Ethan Jackson
ac4d3bcb46 netdev: New Function netdev_change_seq().
This new function will provide a much simpler replacement for
netdev_monitor in the future.
2011-05-31 14:34:38 -07:00
Ethan Jackson
1670c579a8 netdev: Take responsibility for polling MII registers.
This patch moves miimon logic from the bond module to netdev-linux.
This greatly simplifies the bonding code while adding minimal
complexity to netdev-linux.  The bonding code is so high level, it
really has no business worrying about how precisely slave status is
determined.
2011-05-20 12:55:36 -07:00
Ben Pfaff
b2fda3effc Merge 'next' into 'master'.
I know already that this breaks the statsfixes that were implemented by the
following commits:

827ab71c97f "ofproto: Datapath statistics accounted twice."
6f1435fc8f7 "ofproto: Resubmit statistics improperly account during..."

These were already broken in a previous merge.  I will work on a fix.
2011-05-18 14:01:13 -07:00
Ben Pfaff
6506f45c08 Make the source tree sparse clean.
With this commit, the tree compiles clean with sparse commit 87f4a7fda3d
"Teach 'already_tokenized()' to use the stream name hash table" with patch
"evaluate: Allow sizeof(_Bool) to succeed" available at
http://permalink.gmane.org/gmane.comp.parsers.sparse/2461 applied, as long
as the "include/sparse" directory is included for use by sparse (only),
e.g.:
     make CC="CHECK='sparse -I../include/sparse' cgcc"
2011-05-16 13:45:53 -07:00
Ben Pfaff
dbba996be2 Convert remaining network-byte-order "uint<N>_t"s into "ovs_be<N>"s.
I looked at almost every uint<N>_t in the tree to determine whether it was
really in network byte order, and converted the ones that were.

The only remaining ones, modulo my mistakes, are in openflow.h.  I'm not
sure whether we should convert those, because there might be some value
in remaining close to upstream for this header.
2011-05-16 13:40:47 -07:00
Ben Pfaff
7afa4f1d98 netdev-linux: Initialize rx_compressed, tx_compressed when converting.
rtnl_link_stats64 has rx_compressed and tx_compressed members that
struct netdev_stats lacks, so we need to initialize them to zero when
converting.

Found by valgrind.
2011-05-16 13:22:05 -07:00
Ben Pfaff
488d734d6e netdev-linux: Open AF_PACKET socket only when it is needed.
Only a privileged process can open a raw AF_PACKET socket, so netdev-linux
will fail to initialize if run as non-root and you get a cascade of error
messages, like this:

netdev_linux|ERR|failed to create packet socket: Operation not permitted
netdev|ERR|failed to initialize system network device class: Operation not permitted
netdev|ERR|failed to initialize internal network device class: Operation not permitted
netdev|ERR|failed to initialize tap network device class: Operation not permitted

But in fact the AF_PACKET socket is not needed for most operations (only
for sending packets) and it is never needed for testing with the "dummy"
datapath and network device, so we can avoid logging all of these errors
by opening the packet socket only on demand, as this commit does.

Reviewed-by: Simon Horman <horms@verge.net.au>
2011-05-13 14:45:53 -07:00
Ben Pfaff
bd3932826b netdev-linux: Only call set_nonblocking() if socket creation succeeds.
Reviewed-by: Simon Horman <horms@verge.net.au>
2011-05-13 14:45:53 -07:00
Ben Pfaff
0079481775 Merge 'master' into 'next'. 2011-05-12 12:05:42 -07:00
Ben Pfaff
2c360fbb27 Convert remaining network-byte-order "uint<N>_t"s into "ovs_be<N>"s.
I looked at almost every uint<N>_t in the tree to determine whether it was
really in network byte order, and converted the ones that were.

The only remaining ones, modulo my mistakes, are in openflow.h.  I'm not
sure whether we should convert those, because there might be some value
in remaining close to upstream for this header.
2011-05-04 10:12:05 -07:00
Ben Pfaff
d39808227b netdev-linux: New functions for converting netdev stats formats.
An upcoming commit will introduce another function that needs to convert
between rtnl_link_stats64 and netdev_stats, so it seemed best to just add
functions to do the conversion.
2011-05-02 09:33:12 -07:00
Ben Pfaff
f23347eaf5 netdev-linux: Fix netdev_send() to tap device.
Commit 76c308b50d3 "netdev-linux: Support 'send' for netdevs opened with
NETDEV_ETH_TYPE_NONE" broke sending packets to tap devices.  Sending a
packet to a tap device with an AF_PACKET socket causes that packet to be
looped back to be received on the tap device again, which obviously isn't
useful.
2011-04-11 09:36:49 -07:00
Ben Pfaff
b981622b84 netdev-linux: Fix blocking while sending packets.
The AF_PACKET socket needs to be in nonblocking mode or trying to send
a packet can take a long time.
2011-04-11 09:36:48 -07:00
Ben Pfaff
1de0e8ae5c netdev-linux: Avoid "cleverness" in swap_uint64().
Obviously correct code is easier on everyone.  As the C FAQ says:

20.15c:	How can I swap two values without using a temporary?

A:	The standard hoary old assembly language programmer's trick is:

		a ^= b;
		b ^= a;
		a ^= b;

	But this sort of code has little place in modern, HLL
	programming.  Temporary variables are essentially free,
	and the idiomatic code using three assignments, namely

		int t = a;
		a = b;
		b = t;

	is not only clearer to the human reader, it is more likely to be
	recognized by the compiler and turned into the most-efficient
	code (e.g. using a swap instruction, if available).  The latter
	code is obviously also amenable to use with pointers and
	floating-point values, unlike the XOR trick.  See also questions
	3.3b and 10.3.
2011-04-11 09:35:55 -07:00
Ben Pfaff
76c308b50d netdev-linux: Support 'send' for netdevs opened with NETDEV_ETH_TYPE_NONE.
The new implementation of the bonding code expects to be able to
send packets using netdev_send().  This makes it possible for
Linux netdevs.
2011-04-01 15:52:19 -07:00
Ben Pfaff
19993ef3ca netdev: Use sset instead of svec in netdev interface. 2011-03-31 16:42:01 -07:00
Ethan Jackson
24045e3502 netdev-linux: Remove dead assignments. 2011-03-31 16:40:36 -07:00
Ethan Jackson
4f1046117c htb: Set required min-rate to mtu not 1500. 2011-03-15 15:23:10 -07:00
Ethan Jackson
79398bad9e hfsc: min-rate tweaks.
There doesn't appear to be any reason to enforce a minimum min-rate
of 1500Bps on queues. This commit lowers the minimum to 1Bps.  A
min-rate of 0 is not allowed by hfsc in the kernel.
2011-03-15 15:23:10 -07:00
Ethan Jackson
c45ab5e9b7 qos: Remove min-rate requirement for linux-htb and linux-hfsc.
One could quite reasonably desire to create a queue with no
min-rate.  For example, a default queue could be reasonably
configured without a min-rate or a max-rate.  This commit removes
the requirement that min-rate be configured on all queues.  If not
configured, defaults to something very small.
2011-03-15 15:23:10 -07:00
Justin Pettit
f2cc621bac netdev-linux: Zero-out "sin" in netdev_linux_arp_lookup().
Coverity complains that we're copying the unitialized "sin_zero" member
from "sin" into "r".  I don't think this is an actual problem, but
there's no harm in zeroing out the structure, either.

Coverity #10916
2011-02-23 11:08:20 -08:00
Ben Pfaff
71d7c22f54 util: New function ovs_strzcpy().
Static analyzers hate strncpy().  This new function shares its property of
initializing an entire buffer, without its nasty habit of failing to
null-terminate long strings.

Coverity #10697,10696,10695,10694,10693,10692,10691,10690.
2011-02-22 16:33:58 -08:00
Ben Pfaff
f915f1a8ca datapath: Consider tunnels to have no MTU, fixing jumbo frame support.
Until now, tunnel vports have had a specific MTU, in the same way that
ordinary network devices have an MTU, but treating them this way does not
always make sense.  For example, consider a datapath that has three ports:
the local port, a GRE tunnel to another host, and a physical port.  If
the physical port is configured with a jumbo MTU, it should be possible to
send jumbo packets across the tunnel: the tunnel can do fragmentation or
the physical port traversed by the tunnel might have a jumbo MTU.

However, until now, tunnels always had a 1500-byte MTU by default.  It
could be adjusted using ODP_VPORT_MTU_SET, but nothing actually did this.
One alternative would be to make ovs-vswitchd able to set the vport's MTU.
This commit, however, takes a different approach, of dropping the concept
of MTU entirely for tunnel vports.  This also solves the problem described
above, without making any additional work for anyone.

I tested that, without this change, I could not send 1600-byte "pings"
between two machines whose NICs had 2000-byte MTUs that were connected to
vswitches that were in turn connected over GRE tunnels with the default
1500-byte MTU.  With this change, it worked OK, regardless of the MTU of
the network traversed by the GRE tunnel.

This patch also makes "patch" ports MTU-less.

It might make sense to remove vport_set_mtu() and the associated callback
now, since ordinary network devices are the only vports that support it
now.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Suggested-by: Jesse Gross <jesse@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #3728.
2011-02-04 09:46:26 -08:00
Ben Pfaff
6d9e6eb44f netdev: Make netdev arguments fetchable, and implement for netdev-vport.
This gives network device implementations the opportunity to fetch an
existing device's configuration and store it as their arguments, so that
netdev clients can find out how an existing device is configured.

So far netdev-vport is the only implementation that needs to use this.

The next commit will add use by clients.

Reviewed by Justin Pettit.
2011-01-27 21:08:36 -08:00
Ben Pfaff
9fe3b9a2ee datapath: Drop datapath index and port number from Ethtool output.
I introduced this a long time ago as an efficient way for userspace to find
out whether and where an internal device was attached, but I've always
considered it an ugly kluge.  Now that ODP_VPORT_QUERY can fetch a vport's
info regardless of datapath, it is no longer necessary.  This commit
stops using Ethtool for this purpose and drops the feature.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:36 -08:00
Ben Pfaff
cceb11f5b1 netlink-socket: Add functions for joining and leaving multicast groups.
When this library was originally implemented, support for Linux 2.4 was
important.  The Netlink implementation in Linux only added support for
joining and leaving multicast groups after a socket is bound as of Linux
2.6.14, so the library did not support it either.  But the current version
of Open vSwitch targets Linux 2.6.18 and over, so it's fine to add this
support now, and this commit does so.

This will be used more extensively in upcoming commits.

Reviewed by Justin Pettit.
2011-01-27 09:26:05 -08:00
Andrew Evans
e210037edd bridge: Store status of physical network interfaces in Interface table.
New columns in Interface table: admin_state, link_state, link_speed, duplex,
mtu.

New keys in status map in Interface table: driver_name, driver_version,
firmware_version.

Requested-by: Peter Balland <pballand@nicira.com>
Bug #4299.
2011-01-18 21:09:02 -08:00
Andrew Evans
6f2f5cce6c netdev: Make 'netdev' parameter of 'get_features()' const.
Implementations shouldn't need to modify it.
2011-01-17 17:44:07 -08:00