mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

Author	SHA1	Message	Date
Ben Pfaff	dbba996be2	Convert remaining network-byte-order "uint<N>_t"s into "ovs_be<N>"s. I looked at almost every uint<N>_t in the tree to determine whether it was really in network byte order, and converted the ones that were. The only remaining ones, modulo my mistakes, are in openflow.h. I'm not sure whether we should convert those, because there might be some value in remaining close to upstream for this header.	2011-05-16 13:40:47 -07:00
Ben Pfaff	488d734d6e	netdev-linux: Open AF_PACKET socket only when it is needed. Only a privileged process can open a raw AF_PACKET socket, so netdev-linux will fail to initialize if run as non-root and you get a cascade of error messages, like this: netdev_linux|ERR|failed to create packet socket: Operation not permitted netdev|ERR|failed to initialize system network device class: Operation not permitted netdev|ERR|failed to initialize internal network device class: Operation not permitted netdev|ERR|failed to initialize tap network device class: Operation not permitted But in fact the AF_PACKET socket is not needed for most operations (only for sending packets) and it is never needed for testing with the "dummy" datapath and network device, so we can avoid logging all of these errors by opening the packet socket only on demand, as this commit does. Reviewed-by: Simon Horman <horms@verge.net.au>	2011-05-13 14:45:53 -07:00
Ben Pfaff	bd3932826b	netdev-linux: Only call set_nonblocking() if socket creation succeeds. Reviewed-by: Simon Horman <horms@verge.net.au>	2011-05-13 14:45:53 -07:00
Ben Pfaff	d39808227b	netdev-linux: New functions for converting netdev stats formats. An upcoming commit will introduce another function that needs to convert between rtnl_link_stats64 and netdev_stats, so it seemed best to just add functions to do the conversion.	2011-05-02 09:33:12 -07:00
Ben Pfaff	f23347eaf5	netdev-linux: Fix netdev_send() to tap device. Commit 76c308b50d3 "netdev-linux: Support 'send' for netdevs opened with NETDEV_ETH_TYPE_NONE" broke sending packets to tap devices. Sending a packet to a tap device with an AF_PACKET socket causes that packet to be looped back to be received on the tap device again, which obviously isn't useful.	2011-04-11 09:36:49 -07:00
Ben Pfaff	b981622b84	netdev-linux: Fix blocking while sending packets. The AF_PACKET socket needs to be in nonblocking mode or trying to send a packet can take a long time.	2011-04-11 09:36:48 -07:00
Ben Pfaff	1de0e8ae5c	netdev-linux: Avoid "cleverness" in swap_uint64(). Obviously correct code is easier on everyone. As the C FAQ says: 20.15c: How can I swap two values without using a temporary? A: The standard hoary old assembly language programmer's trick is: a ^= b; b ^= a; a ^= b; But this sort of code has little place in modern, HLL programming. Temporary variables are essentially free, and the idiomatic code using three assignments, namely int t = a; a = b; b = t; is not only clearer to the human reader, it is more likely to be recognized by the compiler and turned into the most-efficient code (e.g. using a swap instruction, if available). The latter code is obviously also amenable to use with pointers and floating-point values, unlike the XOR trick. See also questions 3.3b and 10.3.	2011-04-11 09:35:55 -07:00
Ben Pfaff	76c308b50d	netdev-linux: Support 'send' for netdevs opened with NETDEV_ETH_TYPE_NONE. The new implementation of the bonding code expects to be able to send packets using netdev_send(). This makes it possible for Linux netdevs.	2011-04-01 15:52:19 -07:00
Ben Pfaff	19993ef3ca	netdev: Use sset instead of svec in netdev interface.	2011-03-31 16:42:01 -07:00
Ethan Jackson	24045e3502	netdev-linux: Remove dead assignments.	2011-03-31 16:40:36 -07:00
Ethan Jackson	4f1046117c	htb: Set required min-rate to mtu not 1500.	2011-03-15 15:23:10 -07:00
Ethan Jackson	79398bad9e	hfsc: min-rate tweaks. There doesn't appear to be any reason to enforce a minimum min-rate of 1500Bps on queues. This commit lowers the minimum to 1Bps. A min-rate of 0 is not allowed by hfsc in the kernel.	2011-03-15 15:23:10 -07:00
Ethan Jackson	c45ab5e9b7	qos: Remove min-rate requirement for linux-htb and linux-hfsc. One could quite reasonably desire to create a queue with no min-rate. For example, a default queue could be reasonably configured without a min-rate or a max-rate. This commit removes the requirement that min-rate be configured on all queues. If not configured, defaults to something very small.	2011-03-15 15:23:10 -07:00
Justin Pettit	f2cc621bac	netdev-linux: Zero-out "sin" in netdev_linux_arp_lookup(). Coverity complains that we're copying the uninitialized "sin_zero" member from "sin" into "r". I don't think this is an actual problem, but there's no harm in zeroing out the structure, either. Coverity #10916	2011-02-23 11:08:20 -08:00
Ben Pfaff	71d7c22f54	util: New function ovs_strzcpy(). Static analyzers hate strncpy(). This new function shares its property of initializing an entire buffer, without its nasty habit of failing to null-terminate long strings. Coverity #10697,10696,10695,10694,10693,10692,10691,10690.	2011-02-22 16:33:58 -08:00
Ben Pfaff	f915f1a8ca	datapath: Consider tunnels to have no MTU, fixing jumbo frame support. Until now, tunnel vports have had a specific MTU, in the same way that ordinary network devices have an MTU, but treating them this way does not always make sense. For example, consider a datapath that has three ports: the local port, a GRE tunnel to another host, and a physical port. If the physical port is configured with a jumbo MTU, it should be possible to send jumbo packets across the tunnel: the tunnel can do fragmentation or the physical port traversed by the tunnel might have a jumbo MTU. However, until now, tunnels always had a 1500-byte MTU by default. It could be adjusted using ODP_VPORT_MTU_SET, but nothing actually did this. One alternative would be to make ovs-vswitchd able to set the vport's MTU. This commit, however, takes a different approach, of dropping the concept of MTU entirely for tunnel vports. This also solves the problem described above, without making any additional work for anyone. I tested that, without this change, I could not send 1600-byte "pings" between two machines whose NICs had 2000-byte MTUs that were connected to vswitches that were in turn connected over GRE tunnels with the default 1500-byte MTU. With this change, it worked OK, regardless of the MTU of the network traversed by the GRE tunnel. This patch also makes "patch" ports MTU-less. It might make sense to remove vport_set_mtu() and the associated callback now, since ordinary network devices are the only vports that support it now. Signed-off-by: Ben Pfaff <blp@nicira.com> Suggested-by: Jesse Gross <jesse@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Bug #3728.	2011-02-04 09:46:26 -08:00
Ben Pfaff	6d9e6eb44f	netdev: Make netdev arguments fetchable, and implement for netdev-vport. This gives network device implementations the opportunity to fetch an existing device's configuration and store it as their arguments, so that netdev clients can find out how an existing device is configured. So far netdev-vport is the only implementation that needs to use this. The next commit will add use by clients. Reviewed by Justin Pettit.	2011-01-27 21:08:36 -08:00
Ben Pfaff	9fe3b9a2ee	datapath: Drop datapath index and port number from Ethtool output. I introduced this a long time ago as an efficient way for userspace to find out whether and where an internal device was attached, but I've always considered it an ugly kluge. Now that ODP_VPORT_QUERY can fetch a vport's info regardless of datapath, it is no longer necessary. This commit stops using Ethtool for this purpose and drops the feature. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	cceb11f5b1	netlink-socket: Add functions for joining and leaving multicast groups. When this library was originally implemented, support for Linux 2.4 was important. The Netlink implementation in Linux only added support for joining and leaving multicast groups after a socket is bound as of Linux 2.6.14, so the library did not support it either. But the current version of Open vSwitch targets Linux 2.6.18 and over, so it's fine to add this support now, and this commit does so. This will be used more extensively in upcoming commits. Reviewed by Justin Pettit.	2011-01-27 09:26:05 -08:00
Andrew Evans	e210037edd	bridge: Store status of physical network interfaces in Interface table. New columns in Interface table: admin_state, link_state, link_speed, duplex, mtu. New keys in status map in Interface table: driver_name, driver_version, firmware_version. Requested-by: Peter Balland <pballand@nicira.com> Bug #4299.	2011-01-18 21:09:02 -08:00
Andrew Evans	6f2f5cce6c	netdev: Make 'netdev' parameter of 'get_features()' const. Implementations shouldn't need to modify it.	2011-01-17 17:44:07 -08:00
Ethan Jackson	782e611166	netdev-linux: Fix strict aliasing warnings. This commit fixes warnings caused by the miimon code's breakage of strict aliasing rules. Reported-by: Jesse Gross <jesse@nicira.com> Implemented-by: Ben Pfaff <blp@nicira.com>	2011-01-14 12:41:39 -08:00
Ethan Jackson	6333182946	vswitchd: Add miimon support. This commit allows users to check link status in bonded ports using MII instead of carrier.	2011-01-12 15:50:20 -08:00
Ethan Jackson	ea763e0e28	bridge: Move tunnel_egress_iface to status column. This commit removes the tunnel_egress_iface column from the interface table and moves its data to the status column. In the process it reverts the database to version 1.0.0.	2011-01-11 12:33:44 -08:00
Ethan Jackson	ea83a2fcd0	lib: Show tunnel egress interface in ovsdb. This commit parses rtnetlink address notifications from the kernel in order to display the egress interface of tunnels in the database. Bug #4103.	2011-01-04 12:35:59 -08:00
Ethan Jackson	21d6e22eee	rtnetlink: Remove LINK specific messages from rtnetlink. Abstracted rtnetlink so that it may be used for messages other than RTM LINK messages. Created a new rtnetlink-link module which specifically deals with these kinds of messages and follows the old rtnetlink API.	2011-01-04 10:34:55 -08:00
Ben Pfaff	d2bb2799e1	netdev-linux: Fix pairing of rtnetlink register and unregister calls. netdev_linux_create() called rtnetlink_notifier_register() for both system and internal devices, but netdev_linux_destroy() only did the reverse accounting for system devices. This fixes the pairing. This isn't really much of a bug, since it would only cause the notifier to be active unnecessarily (not to be removed even though it was needed). At most it was a missed opportunity for optimization, but I don't think that optimization would ever happen anyway. Found with valgrind --leak-check=full --show-reachable=yes.	2010-12-13 14:29:13 -08:00
Ben Pfaff	2fe27d5ad2	netlink: Split into generic and Linux-specific parts. The parts of the netlink module that are related to sockets are Linux-specific, since only Linux has AF_NETLINK sockets. The rest can be built anywhere. This commit breaks them into two modules, and builds the generic one on all platforms. Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-10 11:13:27 -08:00
Ben Pfaff	98563392db	netdev-linux: Don't treat "system" devices as vports for setting stats. Linux kernel datapath vports have a "set_stats" method. Until now, internal vports have been handled in the userspace netdev library as type "system", so the "system" netdevs would try to use the vport "set_stats" method. Now, however, internal netdevs have been broken out as a separate netdev type, so only that new type of netdev has to be able to call into "set_stats". This commit, therefore, removes it from the "system" netdevs.	2010-12-03 14:43:38 -08:00
Ben Pfaff	c3827f619a	datapath: Make adding and attaching a vport a single step. For some time now, Open vSwitch datapaths have internally made a distinction between adding a vport and attaching it to a datapath. Adding a vport just means to create it, as an entity detached from any datapath. Attaching it gives it a port number and a datapath. Similarly, a vport could be detached and deleted separately. After some study, I think I understand why this distinction exists. It is because ovs-vswitchd tries to open all the datapath ports before it tries to create them. However, changing it to create them before it tries to open them is not difficult, so this commit does this. The bulk of this commit, however, changes the datapath interface to one that always creates a vport and attaches it to a datapath in a single step, and similarly detaches a vport and deletes it in a single step. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-03 14:41:38 -08:00
Ben Pfaff	d76f09ea77	coverage: Make the coverage counters catalog program-specific. Until now, the collection of coverage counters supported by a given OVS program was not specific to that program. That means that, for example, even though ovs-dpctl does not have anything to do with mac_learning, it still has a coverage counter for it. This is confusing, at best. This commit fixes the problem on some systems, in particular on ones that use GCC and the GNU linker. It uses the feature of the GNU linker described in its manual as: If an orphaned section's name is representable as a C identifier then the linker will automatically see PROVIDE two symbols: __start_SECNAME and __end_SECNAME, where SECNAME is the name of the section. These indicate the start address and end address of the orphaned section respectively. Systems that don't support these features retain the earlier behavior. This commit also fixes the annoyance that files that include coverage counters must be listed on COVERAGE_FILES in lib/automake.mk. This commit also fixes the annoyance that modifying any source file that includes a coverage counter caused all programs that link against libopenvswitch.a to relink, even programs that the source file was not linked into. For example, modifying ofproto/ofproto.c (which includes coverage counters) caused tests/test-aes128 to relink, even though test-aes128 does not link against ofproto.o.	2010-11-30 10:30:30 -08:00
Ben Pfaff	f4e2e60be4	netdev-linux: Remove counter double-increments. A few coverage counters were incremented both in netdev generic code and in netdev_linux code. This commit drops the increments from the lower-level code. (This is not an actual bug because these counters are used only for logging.)	2010-11-30 10:30:30 -08:00
Ethan Jackson	a339aa8162	netdev-linux: HFSC in linux. This commit implements the Hierarchical Fair Service Curve queuing discipline in linux. HFSC performs better at high bandwidth and implements min-rate proportional sharing of excess bandwidth. Only a simplified configuration interface is exposed to the user. This can be expanded to allow more tweaking in the future.	2010-11-11 12:32:20 -08:00
Ben Pfaff	d98e600755	vlog: Make client supply semicolon for VLOG_DEFINE_THIS_MODULE. It's kind of odd for VLOG_DEFINE_THIS_MODULE to supply its own semicolon, so this commit switches to the more common form.	2010-10-29 09:48:47 -07:00
Ben Pfaff	23a98ffed7	netdev-linux: Always check tc_make_request() for NULL return value. Bug #3912.	2010-10-22 14:51:50 -07:00
Ben Pfaff	f8da634725	netdev-linux: Remove unused data in htb_tc_load().	2010-10-22 14:51:50 -07:00
Ethan Jackson	4ecf12d501	netdev-linux: Make queue 0 the default QOS policy. This patch defines, by convention, queue 0 as the default queue in a particular QOS. Thus, if queue 0 is defined, all traffic going through the relevant interface will be enqueued in it. If queue 0 is not defined then ovs will send the traffic directly through the interface without applying any policy to it.	2010-10-21 16:03:03 -07:00
Justin Pettit	da3827b551	netdev: Enforce a floor "linux-htb" min-rate	2010-10-08 14:30:31 -07:00
Justin Pettit	015c93a49a	netdev: Don't divide by zero when "linux-htb" zero min-rate is used. A "min-rate" of zero for the "linux-htb" QoS type would cause a divide by zero exception. This patch prevents that by just returning zero. A later patch will try to enforce reasonable values for "min-rate". Bug #3745	2010-10-08 14:22:36 -07:00
Ben Pfaff	b8dcf5e9c5	netdev: Pass class structure, instead of type, to "create" function. This opens up the possibility of storing private data at a relative offset to the class structure, instead of having to keep a separate table.	2010-10-06 13:49:07 -07:00
Ben Pfaff	d5590e7e41	netdev-linux: Fix off-by-one error dumping queue stats. Linux kernel queue numbers are one greater than OpenFlow queue numbers, for HTB anyhow. The code to dump queues wasn't compensating for this, so this commit fixes it up.	2010-10-01 10:40:00 -07:00
Ben Pfaff	4e8e4213a8	Switch many macros from using CONTAINER_OF to using OBJECT_CONTAINING. These macros require one fewer argument by switching, which makes code that uses them shorter and more readable.	2010-10-01 10:25:29 -07:00
Ben Pfaff	93b13be8e6	netdev-linux: Use hash table instead of sparse array for QoS classes. The main advantage of a sparse array over a hash table is that it can be iterated in numerical order. But the OVS implementation of sparse arrays is quite expensive in terms of memory: on a 32-bit system, a sparse array with exactly 1 nonnull element has 512 bytes of overhead. In this case, the sparse array's property of iteration in numerical order is not important, so this commit converts it to a hash table to save memory.	2010-10-01 10:25:10 -07:00
Ben Pfaff	2a022368f4	Avoid shadowing local variable names. All of these changes avoid using the same name for two local variables within the same function. None of them are actual bugs as far as I can tell, but any of them could be confusing to the casual reader. The one in lib/ovsdb-idl.c is particularly brilliant: inner and outer loops both using (different) variables named 'i'. Found with GCC -Wshadow.	2010-09-20 09:39:54 -07:00
Joe Perches	d295e8e97a	treewide: Remove trailing whitespace Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com>	2010-08-30 13:23:08 -07:00
Jesse Gross	d1eb60ccff	datapath: Abstract tunneling implementation from GRE. Much of the code in the GRE implementation is not specific to the GRE protocol but is actually common to all types of tunnels. In order to support future types of tunnels, move this code into a common library. Signed-off-by: Jesse Gross <jesse@nicira.com>	2010-08-24 15:17:29 -04:00
Ben Pfaff	5136ce492c	vlog: Introduce VLOG_DEFINE_THIS_MODULE for declaring vlog module in use. Adding a macro to define the vlog module in use adds a level of indirection, which makes it easier to change how the vlog module must be defined. A followup commit needs to do that, so getting these widespread changes out of the way first should make that commit easier to review.	2010-07-21 15:47:09 -07:00
Ben Pfaff	17ee3c1ffd	netdev-linux: Avoid minor number 0 in traffic control. Linux traffic control handles with minor number 0 refer to qdiscs, not to classes. This commit deals with this by using a conversion function: OpenFlow queue 0 maps to minor 1, queue 1 to minor 2, and so on.	2010-07-20 11:26:58 -07:00
Ben Pfaff	3c4de644d2	netdev-linux: Dump all queues, not just direct children of the root. A netdev-linux traffic control implementation has to dump all of a port's traffic classes in a couple of different situations. start_queue_dump() is supposed to do that. But it was specifying TC_H_ROOT as tcm_parent, which only dumped classes that were direct children of the root. This commit changes tcm_parent to 0, which obtains all traffic classes regardless of parent.	2010-07-20 11:26:58 -07:00
Ben Pfaff	c1c9c9c4b6	Implement QoS framework. ovs-vswitchd doesn't declare its QoS capabilities in the database yet, so the controller has to know what they are. We can add that later. The linux-htb QoS class has been tested to the extent that I can see that it sets up the queues I expect when I run "tc qdisc show" and "tc class show". I haven't tested that the effects on flows are what we expect them to be. I am sure that there will be problems in that area that we will have to fix.	2010-06-17 15:04:12 -07:00

100 Commits