mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

Author	SHA1	Message	Date
Ben Pfaff	640e1b2077	dpif: Improve abstraction by making 'run' and 'wait' functions per-dpif. Until now, the dp_run() and dp_wait() functions had to be called at the top level of the program because they applied to every open dpif. By replacing them by functions that take a specific dpif as an argument, we can call them only from ofproto, which is currently the correct layer to deal with dpifs.	2011-05-11 12:26:07 -07:00
Ben Pfaff	7c66b273a2	packets: New function eth_set_vlan_tci(), from dpif-netdev. This will soon be used in the upcoming bond library.	2011-04-01 15:52:19 -07:00
Ben Pfaff	7ecb095d0b	ofpbuf: Make ofpbufs initialized with ofpbuf_use_stack() not expandable. My original intent for ofpbufs initialized with ofpbuf_use_stack() was that the caller was providing enough space on the stack for the common case, with dynamic allocation as a fallback. But in practice, none of the clients actually do this. Instead, all of them actually know that the stack-allocated buffer is big enough and, since they don't want to bother with having to call ofpbuf_delete(), they instead assert that the buffer wasn't reallocated. Since this is a bit of a pain, this commit changes the semantics of ofpbuf_use_stack() to be that the stack-allocated buffer cannot be reallocated at all. This is more convenient for the existing clients.	2011-03-30 15:08:47 -07:00
Ben Pfaff	19cf40693d	odp-util: Replace ODPUTIL_FLOW_KEY_U32S by new struct odputil_keybuf. This seems to me to better encapsulate the inherent ugliness.	2011-03-30 15:08:47 -07:00
Ben Pfaff	ed4031e467	dpif-netdev: Fix segfault handling packets. Reported-by: Hassan Khan <hassan.khan@seecs.edu.pk>	2011-02-15 10:08:15 -08:00
Ben Pfaff	f915f1a8ca	datapath: Consider tunnels to have no MTU, fixing jumbo frame support. Until now, tunnel vports have had a specific MTU, in the same way that ordinary network devices have an MTU, but treating them this way does not always make sense. For example, consider a datapath that has three ports: the local port, a GRE tunnel to another host, and a physical port. If the physical port is configured with a jumbo MTU, it should be possible to send jumbo packets across the tunnel: the tunnel can do fragmentation or the physical port traversed by the tunnel might have a jumbo MTU. However, until now, tunnels always had a 1500-byte MTU by default. It could be adjusted using ODP_VPORT_MTU_SET, but nothing actually did this. One alternative would be to make ovs-vswitchd able to set the vport's MTU. This commit, however, takes a different approach, of dropping the concept of MTU entirely for tunnel vports. This also solves the problem described above, without making any additional work for anyone. I tested that, without this change, I could not send 1600-byte "pings" between two machines whose NICs had 2000-byte MTUs that were connected to vswitches that were in turn connected over GRE tunnels with the default 1500-byte MTU. With this change, it worked OK, regardless of the MTU of the network traversed by the GRE tunnel. This patch also makes "patch" ports MTU-less. It might make sense to remove vport_set_mtu() and the associated callback now, since ordinary network devices are the only vports that support it now. Signed-off-by: Ben Pfaff <blp@nicira.com> Suggested-by: Jesse Gross <jesse@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Bug #3728.	2011-02-04 09:46:26 -08:00
Justin Pettit	6767a2cce9	lib: Replace IP_TYPE_ references with IPPROTO_. A few common IP protocol types were defined in "lib/packets.h". However, we already assume the existence of <netinet/in.h> which contains a more exhaustive list and should be available on POSIX systems.	2011-02-02 11:50:17 -08:00
Ben Pfaff	7aec165dbc	datapath: s/ODPAT_/ODP_ACTION_ATTR_/ to fit new naming scheme. Jesse suggested this naming scheme, so I'm adjusting existing names to fit it. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-28 15:34:40 -08:00
Ben Pfaff	3d8c95357f	dpif: Remove dpif_get_all_names(). None of the remaining dpif implementations have more than one name per dpif, so there's no need for this function anymore. Suggested-by: Jesse Gross <jesse@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-28 15:34:40 -08:00
Ben Pfaff	82272eded1	Eliminate ODPL_* from userspace-facing interface. Reviewed by Justin Pettit.	2011-01-27 21:08:41 -08:00
Ben Pfaff	693c4a0112	datapath: Eliminate 'flags' member from odp_flow. Nothing was productively using the 'flags' member of odp_flow, so this commit removes it. ODPFF_ZERO_TCP_FLAGS isn't used at all (as of the previous commit). ODPFF_EOF has been replaced by a special case of the 'key_len' member. This will go away, too, once AF_NETLINK starts being used. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:39 -08:00
Ben Pfaff	ba25b8f41f	dpif: Eliminate ODPPF_* constants from client-visible interface. Following this commit, the ODPPF_* constants are only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:39 -08:00
Ben Pfaff	c97fb13280	dpif: Eliminate "struct odp_flow_stats" from client-visible interface. Following this commit, "struct odp_flow_stats" is only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:38 -08:00
Ben Pfaff	feebdea2e5	dpif: Eliminate "struct odp_flow" from client-visible interface. Following this commit, "struct odp_flow" and related data structures are only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:38 -08:00
Ben Pfaff	bc4a05c639	datapath: Change ODP_FLOW_GET to retrieve only a single flow at a time. This brings the code closer to what the Netlink interface will need to implement. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	996c1b3d7a	datapath: Drop port information from odp_stats. As with n_flows, n_ports was used regularly by userspace to determine how much memory to allocate when listing ports, but it is no longer needed for that. max_ports, on the other hand, is necessary but it is also a fixed value for the kernel datapath right now and if we expand it we can also come up with a way to report the expanded value. The remaining members of odp_stats are actually real statistics that I intend to keep. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	1ba530f4b2	datapath: Drop queue information from odp_stats. This queue information will be available through the kernel socket layer once we move over to Netlink sockets as transports, so we might as well get rid of the redundancy. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	ea7bd5973f	datapath: Drop flow information from odp_stats. Userspace used to use the n_flows information here to decide how much memory needed to be allocated to list flows, but that isn't necessary any longer now that listing flows uses an iterator abstraction. The cur_capacity and max_capacity members are just curiosities and don't provide much information; if the implementation ever changes away from the current hash table implementation then they could become meaningless anyhow. But more than anything, these aren't really the kind of statistics that networking people usually care about. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	4c738a8da5	dpif: Eliminate "struct odp_port" from client-visible interface. Following this commit, "struct odp_port" is only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:37 -08:00
Ben Pfaff	b0ec0f279e	datapath: Change listing ports to use an iterator concept. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to add new features to the kernel vport layer without changing userspace software. In turn, that means that the odp_port structure must become variable-length. This does not, however, fit in well with the ODP_PORT_LIST ioctl in its current form, because that would require userspace to know how much space to allocate for each port in advance, or to allocate as much space as could possibly be needed. Neither choice is very attractive. This commit prepares for a different solution, by replacing ODP_PORT_LIST by a new ioctl ODP_VPORT_DUMP that retrieves information about a single vport from the datapath on each call. It is much cleaner to allocate the maximum amount of space for a single vport than to do so for possibly a large number of vports. It would be faster to retrieve a number of vports in batch instead of just one at a time, but that will naturally happen later when the kernel datapath interface is changed to use Netlink, so this patch does not bother with it. The Netlink version won't need to take the starting port number from userspace, since Netlink sockets can keep track of that state as part of their "dump" feature. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	856081f683	datapath: Report kernel's flow key when passing packets up to userspace. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. This commit takes one step in that direction by making the kernel report its idea of the flow that a packet belongs to whenever it passes a packet up to userspace. This means that userspace can intelligently figure out what to do: - If userspace's notion of the flow for the packet matches the kernel's, then nothing special is necessary. - If the kernel has a more specific notion for the flow than userspace, for example if the kernel decoded IPv6 headers but userspace stopped at the Ethernet type (because it does not understand IPv6), then again nothing special is necessary: userspace can still set up the flow in the usual way. - If userspace has a more specific notion for the flow than the kernel, for example if userspace decoded an IPv6 header but the kernel stopped at the Ethernet type, then userspace can forward the packet manually, without setting up a flow in the kernel. (This case is bad from a performance point of view, but at least it is correct.) This commit does not actually make userspace flexible enough to handle changes in the kernel flow key structure, although userspace does now have enough information to do that intelligently. This will have to wait for later commits. This commit is bigger than it would otherwise be because it is rolled together with changing "struct odp_msg" to a sequence of Netlink attributes. The alternative, to do each of those changes in a separate patch, seemed like overkill because it meant that either we would have to introduce and then kill off Netlink attributes for in_port and tun_id, if Netlink conversion went first, or shove yet another variable-length header into the stuff already after odp_msg, if adding the flow key to odp_msg went first. This commit will slow down performance of checksumming packets sent up to userspace. I'm not entirely pleased with how I did it. I considered a couple of alternatives, but none of them seemed that much better. Suggestions welcome. Not changing anything wasn't an option, unfortunately. At any rate some slowdown will become unavoidable when OVS actually starts using Netlink instead of just Netlink framing. (Actually, I thought of one option where we could avoid that: make userspace do the checksum instead, by passing csum_start and csum_offset as part of what goes to userspace. But that's not perfect either.) Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	36956a7d33	datapath: Convert odp_flow_key to use Netlink attributes instead. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. In turn, that means that flow keys must become variable-length. This commit makes that change using Netlink attribute sequences. This commit does not actually make userspace flexible enough to handle changes in the kernel flow key structure, because userspace doesn't yet have enough information to do that intelligently. Upcoming commits will fix that. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:35 -08:00
Ben Pfaff	704a1e09e9	datapath: Change listing flows to use an iterator concept. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. In turn, that means that flow keys must become variable-length. This does not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form, because that would require userspace to know how much space to allocate for each flow's key in advance, or to allocate as much space as could possibly be needed. Neither choice is very attractive. This commit prepares for a different solution, by replacing ODP_FLOW_LIST by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath on each call. It is much cleaner to allocate the maximum amount of space for a single flow key than to do so for possibly a very large number of flow keys. As a side effect, this patch also fixes a race condition that sometimes made "ovs-dpctl dump-flows" print an error: previously, flows were listed and then their actions were retrieved, which left a window in which ovs-vswitchd could delete the flow. Now dumping a flow and its actions is a single step, closing that window. Dumping all of the flows in a datapath is no longer an atomic step, so now it is possible to miss some flows or see a single flow twice during iteration, if the flow table is modified by another process. It doesn't look like this should be a problem for ovs-vswitchd. It would be faster to retrieve a number of flows in batch instead of just one at a time, but that will naturally happen later when the kernel datapath interface is changed to use Netlink, so this patch does not bother with it. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:35 -08:00
Ben Pfaff	ecf9ebff6c	dpif-netdev: Allow for Ethernet and VLAN header in buffer size calculation. This is a long-standing bug--it was present in version 1.0 too. Reported-by: Gaetano Catalli <gaetano.catalli@gmail.com> Solution by Jesse Gross <jesse@nicira.com>	2011-01-24 12:44:19 -08:00
Ben Pfaff	db5ce51427	Fix non-static instances of "struct vlog_rate_limit" and add check. A non-static vlog_rate_limit is not actually going to rate-limit anything.	2011-01-12 09:22:12 -08:00
Ben Pfaff	e70f33fc8a	dpif-netdev: Add missing 'const' qualifiers to function parameters. These functions don't modify their flow key arguments but the prototypes implied that they did. Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-28 22:40:52 -08:00
Jesse Gross	cf22f8cba3	vswitchd: Consistently use size_t for action lengths. Currently the type of the datapath action length is a mixture of size_t and unsigned int. However, size_t is really defined as an unsigned long, which causes the build to fail on 64-bit platforms. This consistently uses size_t.	2010-12-13 11:07:15 -08:00
Ben Pfaff	b9298d3f82	Expand tunnel IDs from 32 to 64 bits. We have a need to identify tunnels with keys longer than 32 bits. This commit adds basic datapath and OpenFlow support for such keys. It doesn't actually add any tunnel protocols that support 64-bit keys, so this is not very useful yet. The 'arg' member of struct odp_msg had to be expanded to 64-bits also, because it sometimes contains a tunnel ID. This member also contains the argument passed to ODPAT_CONTROLLER, so I expanded that action's argument to 64 bits also so that it can use the full width of the expanded 'arg'. Userspace doesn't take advantage of the new space though (it was only using 16 bits anyhow). This commit has been tested only to the extent that it doesn't disrupt basic Open vSwitch operation. I have not tested it with tunnel traffic. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Feature #3976.	2010-12-10 11:14:13 -08:00
Ben Pfaff	cdee00fd63	datapath: Replace "struct odp_action" by Netlink attributes. In the medium term, we plan to migrate the datapath to use Netlink as its communication channel. In the short term, we need to be able to have actions with 64-bit arguments but "struct odp_action" only has room for 48 bits. So this patch shifts to variable-length arguments using Netlink attributes, which starts in on the Netlink transition and makes 64-bit arguments possible at the same time. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-10 11:13:32 -08:00
Jean Tourrilhes	3d785a2421	dpif-netdev: Handle ECN bits when updating IP checksum. When recalculating the checksum after a set ToS action, we were not taking into account the ECN bits copied from the original header.	2010-12-08 22:05:07 -08:00
Ben Pfaff	4b87bea69d	dpif-netdev: Use ofpbuf functions instead of their out-of-line expansions. Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-07 13:44:14 -08:00
Ben Pfaff	b3907fbc6c	queue: Get rid of ovs_queue data structure. ovs_queue doesn't seem very useful; it's just a singly-linked list. It's more generally useful to use a general-purpose "struct list" for lists of packets, so this commit adds such a member to "struct ofpbuf" and shifts the existing users to use it.	2010-12-06 10:03:31 -08:00
Ben Pfaff	c3827f619a	datapath: Make adding and attaching a vport a single step. For some time now, Open vSwitch datapaths have internally made a distinction between adding a vport and attaching it to a datapath. Adding a vport just means to create it, as an entity detached from any datapath. Attaching it gives it a port number and a datapath. Similarly, a vport could be detached and deleted separately. After some study, I think I understand why this distinction exists. It is because ovs-vswitchd tries to open all the datapath ports before it tries to create them. However, changing it to create them before it tries to open them is not difficult, so this commit does this. The bulk of this commit, however, changes the datapath interface to one that always creates a vport and attaches it to a datapath in a single step, and similarly detaches a vport and deletes it in a single step. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2010-12-03 14:41:38 -08:00
Ben Pfaff	614c489203	Add new "dummy" netdev and dpif implementations for use in unit tests.	2010-11-29 16:29:10 -08:00
Ben Pfaff	7fa710e43f	dpif-netdev: Do not log error for EOPNOTSUPP return from netdev_recv(). If a network device does not implement receiving packets, there is no point in logging it as an error.	2010-11-24 12:35:50 -08:00
Ben Pfaff	462278dbfd	dpif-netdev: Simplify code by using shash for names and dropping indexes.	2010-11-24 12:35:22 -08:00
Ben Pfaff	4a38774146	dpif: Make dpif_class 'open' function take class instead of type name. This makes it easier for dpif_provider implementations to share code but distinguish the class actually in use, because comparing a pointer is easier than comparing a string.	2010-11-18 10:08:05 -08:00
Ben Pfaff	d98e600755	vlog: Make client supply semicolon for VLOG_DEFINE_THIS_MODULE. It's kind of odd for VLOG_DEFINE_THIS_MODULE to supply its own semicolon, so this commit switches to the more common form.	2010-10-29 09:48:47 -07:00
Ben Pfaff	27bcf966b4	datapath: Simplify ODPAT_SET_DL_TCI action. There's no need to have a mask in this action, because both parts of the TCI are part of the flow structure. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2010-10-18 11:18:23 -07:00
Ben Pfaff	26233bb461	datapath: Combine dl_vlan and dl_vlan_pcp. This allows eliminating padding from odp_flow_key, although actually doing that is postponed until the next commit. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2010-10-11 13:31:43 -07:00
Ben Pfaff	ae412e7dd8	flow: Get rid of flow_t typedef. When userspace and the kernel were using the same structure for flows, flow_t was a useful way to indicate that a structure was really a userspace flow instead of a kernel one, but now it's better to just write "struct flow" for consistency, since OVS doesn't use typedefs for structs elsewhere. Acked-by: Jesse Gross <jesse@nicira.com>	2010-10-11 13:31:43 -07:00
Ben Pfaff	14608a1539	flow: Separate "flow_t" from "struct odp_flow_key". The "struct odp_flow_key" used in the kernel datapath is conceptually separate from the "flow_t" used in userspace, but until now we have used the latter as a typedef for the former for convenience. This commit separates them. This makes it possible in upcoming commits to change them independently. This is cross-ported from the "wdp" branch, which has had it for months.	2010-10-11 13:31:35 -07:00
Ben Pfaff	f1588b1fa1	datapath: Remove implementation of port groups. The "port group" concept seems like a good one, but it has not been used very much in userspace so far, so before we commit ourselves to a frozen API that we must maintain forever, remove it. We can always add it back in later as a new kind of vport. Signed-off-by: Ben Pfaff <blp@nicira.com>	2010-10-11 12:40:11 -07:00
Ben Pfaff	4e8e4213a8	Switch many macros from using CONTAINER_OF to using OBJECT_CONTAINING. These macros require one fewer argument by switching, which makes code that uses them shorter and more readable.	2010-10-01 10:25:29 -07:00
Ben Pfaff	2a022368f4	Avoid shadowing local variable names. All of these changes avoid using the same name for two local variables within the same function. None of them are actual bugs as far as I can tell, but any of them could be confusing to the casual reader. The one in lib/ovsdb-idl.c is particularly brilliant: inner and outer loops both using (different) variables named 'i'. Found with GCC -Wshadow.	2010-09-20 09:39:54 -07:00
Ben Pfaff	68efcbec41	ofpbuf: Add ofpbuf_new_with_headroom(), ofpbuf_clone_with_headroom(). These new functions simplify an increasingly common usage pattern. Suggested-by: Jesse Gross <jesse@nicira.com>	2010-09-01 12:55:50 -07:00
Joe Perches	d295e8e97a	treewide: Remove trailing whitespace Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com>	2010-08-30 13:23:08 -07:00
Ben Pfaff	ca78c6b69c	datapath: Avoid accesses past the end of skbuff data in actions. Some of the flow actions that modify skbuff data did not check that the skbuff was long enough before doing so. This commit fixes that problem. Previously, the strategy for avoiding this was to only indicate the layer-3 nw_proto field in the flow if the corresponding layer-4 header was fully present, so that if, for example, nw_proto was IPPROTO_TCP, this meant that a TCP header was present. The original motivation for this patch was to add corresponding code to only indicate a layer-2 dl_type if the corresponding layer-3 header was fully present. But I'm now convinced that this approach is conceptually wrong, because the meaning of a layer-N header should not be affected by the meaning of a layer-(N+1) header. This commit switches to a new approach. Now, when a header is missing, its fields in the flow are simply zeroed and have no effect on the "type" field for the outer header. Responsibility for ensuring that a header is fully present is now shifted to the actions that wish to modify that header. Signed-off-by: Ben Pfaff <blp@nicira.com>	2010-08-27 12:42:39 -07:00
Ben Pfaff	2105ccc850	dpif-netdev: Expand tabs.	2010-08-26 10:56:20 -07:00
Ben Pfaff	401eeb92d3	Add Nicira extension to OpenFlow for dropping spoofed ARP packets. "ARP spoofing" is when a host claims an incorrect association between an IP address and a MAC address for deceptive purposes. OpenFlow by itself can prevent a host from sending out ARP replies from an incorrect MAC address in the Ethernet L2 header, but it cannot control the MAC addresses inside the ARP L3 packet. This commit adds a new action that can be used to drop these spoofed packets. CC: Paul Ingram <paul@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2010-08-26 10:56:20 -07:00
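
Commit 3d785a2421 in the list above says the set-ToS path did not account for the ECN bits copied from the original header when recalculating the IP checksum. The sketch below illustrates that general idea and is not the OVS code: it keeps the packet's two ECN bits, replaces only the DSCP bits, and patches the IPv4 header checksum incrementally in the style of RFC 1624. The function names (fold16, ip_checksum, set_tos_keep_ecn) are hypothetical and chosen for this example only.

```python
ECN_MASK = 0x03  # low two bits of the IPv4 ToS byte carry ECN


def fold16(total: int) -> int:
    """Fold carries back into the low 16 bits (ones'-complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ip_checksum(header: bytes) -> int:
    """Ones'-complement checksum over an even-length header.

    Run over a header whose checksum field is already valid, this returns 0.
    """
    total = sum((header[i] << 8) | header[i + 1]
                for i in range(0, len(header), 2))
    return ~fold16(total) & 0xFFFF


def set_tos_keep_ecn(header: bytearray, new_tos: int) -> None:
    """Hypothetical helper: set DSCP from new_tos, keep the existing ECN bits,
    and update the checksum incrementally: HC' = ~(~HC + ~m + m')."""
    old_word = (header[0] << 8) | header[1]
    # The checksum must be computed over the byte actually written,
    # i.e. the new DSCP bits combined with the ECN bits kept from the packet.
    tos = (new_tos & ~ECN_MASK & 0xFF) | (header[1] & ECN_MASK)
    new_word = (header[0] << 8) | tos
    old_csum = (header[10] << 8) | header[11]
    total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    new_csum = ~fold16(total) & 0xFFFF
    header[1] = tos
    header[10], header[11] = new_csum >> 8, new_csum & 0xFF
```

If the checksum were instead updated from the requested ToS value alone while the header kept its ECN bits, the stored checksum would no longer match the header whenever those bits are nonzero, which appears to be the class of bug the commit message describes.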

138 Commits