mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

Author	SHA1	Message	Date
Ben Pfaff	9fe3b9a2ee	datapath: Drop datapath index and port number from Ethtool output. I introduced this a long time ago as an efficient way for userspace to find out whether and where an internal device was attached, but I've always considered it an ugly kluge. Now that ODP_VPORT_QUERY can fetch a vport's info regardless of datapath, it is no longer necessary. This commit stops using Ethtool for this purpose and drops the feature. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	51d4d59822	datapath: Make it possible to query vports by name regardless of datapath. Until now it has only been possible to query a vport if you know what datapath it is on. This doesn't really make sense, so this commit removes that restriction. It is a little bigger than one might naturally expect because locking changes are required. This also allows us to get rid of the ETHTOOL_GDRVINFO kluge that has bothered me for a long time. The next commit does that. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	b0ec0f279e	datapath: Change listing ports to use an iterator concept. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to add new features to the kernel vport layer without changing userspace software. In turn, that means that the odp_port structure must become variable-length. This does not, however, fit in well with the ODP_PORT_LIST ioctl in its current form, because that would require userspace to know how much space to allocate for each port in advance, or to allocate as much space as could possibly be needed. Neither choice is very attractive. This commit prepares for a different solution, by replacing ODP_PORT_LIST by a new ioctl ODP_VPORT_DUMP that retrieves information about a single vport from the datapath on each call. It is much cleaner to allocate the maximum amount of space for a single vport than to do so for possibly a large number of vports. It would be faster to retrieve a number of vports in batch instead of just one at a time, but that will naturally happen later when the kernel datapath interface is changed to use Netlink, so this patch does not bother with it. The Netlink version won't need to take the starting port number from userspace, since Netlink sockets can keep track of that state as part of their "dump" feature. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	856081f683	datapath: Report kernel's flow key when passing packets up to userspace. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. This commit takes one step in that direction by making the kernel report its idea of the flow that a packet belongs to whenever it passes a packet up to userspace. This means that userspace can intelligently figure out what to do: - If userspace's notion of the flow for the packet matches the kernel's, then nothing special is necessary. - If the kernel has a more specific notion for the flow than userspace, for example if the kernel decoded IPv6 headers but userspace stopped at the Ethernet type (because it does not understand IPv6), then again nothing special is necessary: userspace can still set up the flow in the usual way. - If userspace has a more specific notion for the flow than the kernel, for example if userspace decoded an IPv6 header but the kernel stopped at the Ethernet type, then userspace can forward the packet manually, without setting up a flow in the kernel. (This case is bad from a performance point of view, but at least it is correct.) This commit does not actually make userspace flexible enough to handle changes in the kernel flow key structure, although userspace does now have enough information to do that intelligently. This will have to wait for later commits. This commit is bigger than it would otherwise be because it is rolled together with changing "struct odp_msg" to a sequence of Netlink attributes. The alternative, to do each of those changes in a separate patch, seemed like overkill because it meant that either we would have to introduce and then kill off Netlink attributes for in_port and tun_id, if Netlink conversion went first, or shove yet another variable-length header into the stuff already after odp_msg, if adding the flow key to odp_msg went first. This commit will slow down performance of checksumming packets sent up to userspace. I'm not entirely pleased with how I did it. I considered a couple of alternatives, but none of them seemed that much better. Suggestions welcome. Not changing anything wasn't an option, unfortunately. At any rate some slowdown will become unavoidable when OVS actually starts using Netlink instead of just Netlink framing. (Actually, I thought of one option where we could avoid that: make userspace do the checksum instead, by passing csum_start and csum_offset as part of what goes to userspace. But that's not perfect either.) Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	704a1e09e9	datapath: Change listing flows to use an iterator concept. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. In turn, that means that flow keys must become variable-length. This does not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form, because that would require userspace to know how much space to allocate for each flow's key in advance, or to allocate as much space as could possibly be needed. Neither choice is very attractive. This commit prepares for a different solution, by replacing ODP_FLOW_LIST by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath on each call. It is much cleaner to allocate the maximum amount of space for a single flow key than to do so for possibly a very large number of flow keys. As a side effect, this patch also fixes a race condition that sometimes made "ovs-dpctl dump-flows" print an error: previously, flows were listed and then their actions were retrieved, which left a window in which ovs-vswitchd could delete the flow. Now dumping a flow and its actions is a single step, closing that window. Dumping all of the flows in a datapath is no longer an atomic step, so now it is possible to miss some flows or see a single flow twice during iteration, if the flow table is modified by another process. It doesn't look like this should be a problem for ovs-vswitchd. It would be faster to retrieve a number of flows in batch instead of just one at a time, but that will naturally happen later when the kernel datapath interface is changed to use Netlink, so this patch does not bother with it. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:35 -08:00
Ethan Jackson	21d6e22eee	rtnetlink: Remove LINK specific messages from rtnetlink. Abstracted rtnetlink so that it may be used for messages other than RTM LINK messages. Created a new rtnetlink-link module which specifically deals with these kinds of messages and follows the old rtnetlink API.	2011-01-04 10:34:55 -08:00
Justin Pettit	e16a28b585	vswitch: Use "ipsec_gre" vport instead of "gre" with "other_config". Previously, a GRE-over-IPsec tunnel was created as an interface with a "type" of "gre" and the "other_config" column with "ipsec_cert" or "ipsec_psk" set. This could lead to a potential security problem if a user intended to create a GRE-over-IPsec tunnel, but misconfigured the "ipsec_*" config and created an unencrypted GRE tunnel. This commit defines an "ipsec_gre" tunnel type, which should prevent users from inadvertently establishing insecure tunnels.	2010-12-28 14:30:36 -08:00
Jesse Gross	cf22f8cba3	vswitchd: Consistently use size_t for action lengths. Currently the type of the datapath action length is a mixture of size_t and unsigned int. However, size_t is really defined as an unsigned long, which causes the build to fail on 64-bit platforms. This consistently uses size_t.	2010-12-13 11:07:15 -08:00
Ben Pfaff	cdee00fd63	datapath: Replace "struct odp_action" by Netlink attributes. In the medium term, we plan to migrate the datapath to use Netlink as its communication channel. In the short term, we need to be able to have actions with 64-bit arguments but "struct odp_action" only has room for 48 bits. So this patch shifts to variable-length arguments using Netlink attributes, which starts in on the Netlink transition and makes 64-bit arguments possible at the same time. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-10 11:13:32 -08:00
Ben Pfaff	c3827f619a	datapath: Make adding and attaching a vport a single step. For some time now, Open vSwitch datapaths have internally made a distinction between adding a vport and attaching it to a datapath. Adding a vport just means to create it, as an entity detached from any datapath. Attaching it gives it a port number and a datapath. Similarly, a vport could be detached and deleted separately. After some study, I think I understand why this distinction exists. It is because ovs-vswitchd tries to open all the datapath ports before it tries to create them. However, changing it to create them before it tries to open them is not difficult, so this commit does this. The bulk of this commit, however, changes the datapath interface to one that always creates a vport and attaches it to a datapath in a single step, and similarly detaches a vport and deletes it in a single step. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-03 14:41:38 -08:00
Ben Pfaff	4a38774146	dpif: Make dpif_class 'open' function take class instead of type name. This makes it easier for dpif_provider implementations to share code but distinguish the class actually in use, because comparing a pointer is easier than comparing a string.	2010-11-18 10:08:05 -08:00
Ben Pfaff	d98e600755	vlog: Make client supply semicolon for VLOG_DEFINE_THIS_MODULE. It's kind of odd for VLOG_DEFINE_THIS_MODULE to supply its own semicolon, so this commit switches to the more common form.	2010-10-29 09:48:47 -07:00
Ben Pfaff	f1588b1fa1	datapath: Remove implementation of port groups. The "port group" concept seems like a good one, but it has not been used very much in userspace so far, so before we commit ourselves to a frozen API that we must maintain forever, remove it. We can always add it back in later as a new kind of vport. Signed-off-by: Ben Pfaff <blp@nicira.com>	2010-10-11 12:40:11 -07:00
Ben Pfaff	4f2226487d	shash: New function shash_steal().	2010-09-23 11:45:34 -07:00
Ben Pfaff	5136364f41	vlog: Add VLOG_WARN_ONCE() and similar macros.	2010-09-23 11:45:34 -07:00
Ben Pfaff	68efcbec41	ofpbuf: Add ofpbuf_new_with_headroom(), ofpbuf_clone_with_headroom(). These new functions simplify an increasingly common usage pattern. Suggested-by: Jesse Gross <jesse@nicira.com>	2010-09-01 12:55:50 -07:00
Ben Pfaff	5136ce492c	vlog: Introduce VLOG_DEFINE_THIS_MODULE for declaring vlog module in use. Adding a macro to define the vlog module in use adds a level of indirection, which makes it easier to change how the vlog module must be defined. A followup commit needs to do that, so getting these widespread changes out of the way first should make that commit easier to review.	2010-07-21 15:47:09 -07:00
Ben Pfaff	17ee3c1ffd	netdev-linux: Avoid minor number 0 in traffic control. Linux traffic control handles with minor number 0 refer to qdiscs, not to classes. This commit deals with this by using a conversion function: OpenFlow queue 0 maps to minor 1, queue 1 to minor 2, and so on.	2010-07-20 11:26:58 -07:00
Ben Pfaff	5b3941ee17	dpif-linux: Translate queues to priorities correctly. The TC_H_MAKE macro does not shift the major number into position.	2010-07-20 11:26:58 -07:00
Ben Pfaff	aae51f5335	dpif: Abstract translation from OpenFlow queue ID into ODP priority value. When the QoS code was integrated, I didn't yet know how to abstract the translation from a queue ID in an OpenFlow OFPAT_ENQUEUE action into a priority value for an ODP ODPAT_SET_PRIORITY action. This commit is a first attempt that works OK for Linux, so far. It's possible that in fact this translation needs the 'netdev' as an argument too, but it's not needed yet.	2010-07-20 11:23:21 -07:00
Ben Pfaff	b90fa799b9	datapath: Make datapath-protocol.h portable to non-Linux systems. datapath-protocol.h is not a very clean interface. I originally intended it to be solely a Linux-kernel specific interface. Over time it became a general-purpose interface to dpifs. This is not a good situation, because clearly the header is still Linux-specific. In the long run, the correct solution is to separate the generic and Linux-specific bits. This is not that patch. Instead, this patch modifies datapath-protocol.h enough that it can be used on non-Linux hosts. In particular I tested that it works OK with FreeBSD 8.0.	2010-05-26 15:32:34 -07:00
Justin Pettit	10dcf8deec	dpif: Include stat.h header	2010-05-20 13:27:55 -07:00
Ben Pfaff	54825e09b3	dpif-linux: Use hash instead of sorted array. With 1000 network devices being added or removed, sorting the array was a profiling hot spot. Using a hash makes it drop off the profile.	2010-05-05 14:00:50 -07:00
Ben Pfaff	4325359529	ofproto: Avoid buffer copy in OFPT_PACKET_IN path. When a dpif passes an odp_msg down to ofproto, and ofproto transforms it into an ofp_packet_in to send to the controller, until now this always involved a full copy of the packet inside ofproto. This commit eliminates this copy by ensuring that there is always enough headroom in the ofpbuf that holds the odp_msg to replace it by an ofp_packet_in in-place. From Jean Tourrilhes <jt@hpl.hp.com>, with some revisions.	2010-04-27 09:40:46 -07:00
Jesse Gross	3abc4a1a6c	dpif-linux: Clean up vports that are no longer in config. If the config changes while ovs-vswitchd is not running it is possible that there could be some vports which are no longer needed but won't be destroyed when closed because they aren't open. This deletes unneeded vports at the same time that we clean up unneeded datapaths.	2010-04-19 09:11:57 -04:00
Jesse Gross	f2459fe7d9	datapath: Add generic virtual port layer. Currently the datapath directly accesses devices through their Linux functions. Obviously this doesn't work for virtual devices that are not backed by an actual Linux device. This creates a new virtual port layer which handles all interaction with devices. The existing support for Linux devices was then implemented on top of this layer as two device types. It splits out and renames dp_dev to internal_dev. There were several places where datapath devices had to be handled in a special manner and this cleans that up by putting all the special casing in a single location.	2010-04-19 09:11:57 -04:00
Tetsuo NAKAGAWA	ed30fb10e1	dpif-linux: Fix file descriptor leak. get_major() opens /proc/devices to get the openvswitch major number but never closes the FD.	2010-03-25 10:54:52 -04:00
Ben Pfaff	c69ee87c10	Merge "master" into "next". The main change here is the need to update all of the uses of UNUSED in the next branch to OVS_UNUSED as it is now spelled on "master".	2010-02-11 11:11:23 -08:00
Ben Pfaff	67a4917b07	Rename UNUSED macro to OVS_UNUSED to avoid naming conflict. Requested by Jean Tourrilhes <jt@hpl.hp.com>.	2010-02-11 10:59:47 -08:00
Jesse Gross	7dab847a19	Fix some regressions from the merge from master.	2010-02-08 13:31:33 -05:00
Justin Pettit	a4af00400a	Merge branch 'master' into next. Conflicts: COPYING datapath/datapath.h lib/automake.mk lib/dpif-provider.h lib/dpif.c lib/hmap.h lib/netdev-provider.h lib/netdev.c lib/stream-ssl.h ofproto/executer.c ofproto/ofproto.c ofproto/ofproto.h tests/automake.mk utilities/ovs-ofctl.c utilities/ovs-vsctl.in vswitchd/ovs-vswitchd.conf.5.in xenserver/etc_init.d_vswitch xenserver/etc_xensource_scripts_vif xenserver/opt_xensource_libexec_interface-reconfigure	2010-02-05 17:14:55 -08:00
Jesse Gross	999401aa9c	dpif: Allow providers to be managed at runtime. The list of datapath providers was previously statically defined at compile time. This allows new providers to be added and removed at runtime.	2010-02-01 12:00:53 -05:00
Jesse Gross	1a6f1e2a6d	dpif: Update dpif interface to match netdev. This brings over some features that were added to the netdev interface, most notably the separation between the name and the type. In addition to being cleaner, this also avoids problems where it is expected that the local port has the same name as the datapath.	2010-01-27 20:03:38 -05:00
Ben Pfaff	8334b477a4	dpif-linux: Always set *fnp in make_openvswitch_device(). Some versions of GCC warn about this. Always initializing it seems like the right thing to do, since we "almost always" initialized it before. Reported-by: Neil McKee <neil.mckee@inmon.com>	2010-01-19 10:10:52 -08:00
Ben Pfaff	72b0630028	Initial implementation of sFlow. Tested very slightly with "ping" and "sflowtool -t | tcpdump -r -".	2010-01-04 13:08:37 -08:00
Jesse Gross	d65349ea28	Merge citrix branch into master.	2009-11-10 15:12:01 -08:00
Justin Pettit	f17d7bd838	dpif-linux: Clarify bad device warning message. The message warning that the device number is wrong for the Open vSwitch devices could have been clearer. Thanks to Ben Pfaff for the suggested wording.	2009-10-02 16:59:28 -07:00
Justin Pettit	57aaff8a99	dpif-linux: Fail earlier if OVS kernel module isn't loaded. When the kernel module isn't loaded, the bridge tries to open all the possible minor devices, regardless. This change first checks that there is a major device number for Open vSwitch and only then tries to open the minor devices. This change also removes the assumption that there's a default Open vSwitch major device number, since the kernel module always attempts to get a dynamic one. Maybe one day we'll have one... Bug #1179	2009-10-02 16:07:32 -07:00
Justin Pettit	be2c418b73	Cleanup isdigit() warnings. NetBSD's gcc complains if isdigit()'s argument is an unadorned char. This provides an appropriate cast.	2009-08-25 14:11:44 -07:00
Ben Pfaff	559843ed53	rtnetlink: Move into separate source and header file. Now that rtnetlink isn't named similarly to netdev_linux, it might as well have its own source and header files to avoid confusing everyone.	2009-07-30 16:07:15 -07:00
Ben Pfaff	46097491e4	netdev-linux: Rename "linux_netdev_" to "rtnetlink_". It was getting to be too confusing to have both netdev_linux_* functions and linux_netdev_* functions. Rename the latter to make the distinction more obvious. "rtnetlink" seems to be a fairly good name because that's what the kernel calls it, so the name will be familiar at least to people who know about rtnetlink.	2009-07-30 16:07:14 -07:00
Ben Pfaff	8b61709d5e	netdev: Implement an abstract interface to network devices. This new abstraction layer allows multiple implementations of network devices in a single running process. This will be useful, for example, to support network devices that are simulated entirely in the running process or that communicate with other processes over Unix domain sockets, etc. The reimplemented tap device support in this commit has not been tested.	2009-07-30 16:07:14 -07:00
Ben Pfaff	d3d22744a7	vswitch: Avoid knowledge of details specific to Linux datapaths. At startup, the vswitch needs to delete datapaths that are not configured by the administrator. Until now this was done by knowing the possible names of Linux datapaths. This commit cleans up by allowing each datapath class to enumerate its existing datapaths and their names.	2009-07-06 11:06:36 -07:00
Ben Pfaff	a165b67e53	dpif-linux: Don't allow arbitrary internal ports to identify a datapath. The userspace tools were allowing the name of any internal port to be used to identify a datapath. This, however, makes it hard to enumerate all the names by which a datapath can be known, and it was never documented or intentional behavior, so this commit disables it.	2009-07-06 11:02:57 -07:00
Ben Pfaff	e9e28be359	Introduce general-purpose ways to wait for dpif and netdev changes. The dpif and netdev code has had various ways to check for changes to dpifs and netdevs over the course of Open vSwitch development. All of these have been thus far fairly specific to the Linux implementation. This commit is the start of a more general API for watching for such changes. The dpif-related parts seem fairly mature and so they are documented, the netdev parts will probably need to change somewhat and so they are not documented yet.	2009-07-06 09:07:24 -07:00
Ben Pfaff	5792c5c64a	dpif: Add new functions dp_run() and dp_wait(). The upcoming netdev-based dpif needs a hook where it can process packets and throw them against the flow table, and this provides a suitable place.	2009-07-06 09:07:24 -07:00
Ben Pfaff	96fba48f52	dpif: Make dpifs abstract, to allow multiple datapath implementations. This commit initially introduces only a single datapath implementation, which is the same as the original one, but it paves the way for additional implementations, such as the upcoming userspace datapath.	2009-07-06 09:07:24 -07:00

47 Commits
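
The two traffic-control commits above (17ee3c1ffd and 5b3941ee17) describe a small piece of arithmetic that is easy to get wrong, so here is a minimal sketch of it. It is an illustration only, not the code from either commit. The mask constants mirror Linux's <linux/pkt_sched.h>; the helper name queue_to_handle and the choice of major number 1 are assumptions made for this example.

```python
# Constants as defined in Linux's <linux/pkt_sched.h>.
TC_H_MAJ_MASK = 0xFFFF0000
TC_H_MIN_MASK = 0x0000FFFF


def tc_h_make(major, minor):
    """Mirror of the TC_H_MAKE macro: it masks both halves but does
    NOT shift `major` into the upper 16 bits."""
    return (major & TC_H_MAJ_MASK) | (minor & TC_H_MIN_MASK)


def queue_to_handle(major, queue_id):
    """Hypothetical helper (name assumed for this sketch).

    Maps an OpenFlow queue ID to a traffic-control class handle:
    - the caller's major number is shifted into position by hand,
      because tc_h_make() will not do it;
    - the minor number is queue_id + 1, because minor 0 refers to
      the qdisc itself rather than to a class.
    """
    return tc_h_make(major << 16, queue_id + 1)


if __name__ == "__main__":
    # Major number 1 is an assumption chosen for illustration.
    print(hex(queue_to_handle(1, 0)))  # 0x10001, i.e. handle 1:1
    print(hex(queue_to_handle(1, 1)))  # 0x10002, i.e. handle 1:2
    # Passing an unshifted major silently loses it:
    print(hex(tc_h_make(1, 1)))        # 0x1
```

The last call shows the pitfall named in 5b3941ee17: an unshifted major number is masked away entirely, so the resulting handle points at the wrong place without raising any error.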