mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

Author	SHA1	Message	Date
Justin Pettit	530180fd5a	Support matching and modifying IP ECN bits. Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-11-09 10:47:59 -08:00
Pravin B Shelar	abff858b5a	datapath: Convert kernel priority actions into match/set. Following patch adds skb-priority to flow key. So userspace will know what was priority when packet arrived and we can remove the pop/reset priority action. It's no longer necessary to have a special action for pop that is based on the kernel remembering original skb->priority. Userspace can just emit a set priority action with the original value. Since the priority field is a match field with just a normal set action, we can convert it into the new model for actions that are based on matches. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Bug #7715	2011-11-01 10:13:16 -07:00
Ben Pfaff	7257b535ab	Implement new fragment handling policy. Until now, OVS has handled IP fragments more awkwardly than necessary. It has not been possible to match on L4 headers, even in fragments with offset 0 where they are actually present. This means that there was no way to implement ACLs that treat, say, different TCP ports differently, on fragmented traffic; instead, all decisions for fragment forwarding had to be made on the basis of L2 and L3 headers alone. This commit improves the situation significantly. It is still not possible to match on L4 headers in fragments with nonzero offset, because that information is simply not present in such fragments, but this commit adds the ability to match on L4 headers for fragments with zero offset. This means that it becomes possible to implement ACLs that drop such "first fragments" on the basis of L4 headers. In practice, that effectively blocks even fragmented traffic on an L4 basis, because the receiving IP stack cannot reassemble a full packet when the first fragment is missing. This commit works by adding a new "fragment type" to the kernel flow match and making it available through OpenFlow as a new NXM field named NXM_NX_IP_FRAG. Because OpenFlow 1.0 explicitly says that the L4 fields are always 0 for IP fragments, it adds a new OpenFlow fragment handling mode that fills in the L4 fields for "first fragments". It also enhances ovs-ofctl to allow users to configure this new fragment handling mode and to parse the new field. Signed-off-by: Ben Pfaff <blp@nicira.com> Bug #7557.	2011-10-21 15:07:36 -07:00
Pravin B Shelar	4edb9ae90e	datapath: Refactor actions in terms of match fields. Almost all current actions can be expressed in the form of push/pop/set <field>, where field is one of the match fields. We can create three base actions and take a field. This has both a nice symmetry and avoids inconsistencies where we can match on the vlan TPID but not set it. Following patch converts all actions to this new format. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Bug #7115	2011-10-21 14:38:54 -07:00
Ben Pfaff	6bc6002490	dpif: New function dpif_operate() and dpif-linux implementation. This will be used in an upcoming commit.	2011-10-14 14:08:44 -07:00
Ben Pfaff	09ded0ad48	datapath-protocol: Rename enums for consistency. Most of the enum tags in this file are lowercased versions of the uppercase enum prefixes (or slightly less abbreviated versions, e.g. "dp" becomes "datapath"). This commit fixes up the others for consistency. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-10-12 16:27:09 -07:00
Ben Pfaff	98403001ec	datapath: Move Netlink PID for userspace actions from flows to actions. Commit b063d9f06 "datapath: Use unicast Netlink sockets for upcalls" that switched from multicast to unicast Netlink for sending upcalls added a Netlink PID to each kernel flow, used by OVS_ACTION_ATTR_USERSPACE actions within the flow as target. This commit drops this per-flow PID in favor of a per-action PID, because that is more flexible. It does not yet make use of this additional flexibility, so behavior should not change. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Bug #7559.	2011-10-12 16:27:00 -07:00
Ben Pfaff	26c6b6cd2b	dpif-netdev: Implement OVS_ACTION_ATTR_SAMPLE action. OVS_ACTION_ATTR_SAMPLE has never been implemented in dpif-netdev. This commit implements it and adds a cast to enum ovs_action_type in the switch statement that checks the action type, so that GCC complains if we forget to add a case for a new action type. I had to assign the return value of nl_attr_type() to a temporary variable, because "switch ((enum ovs_action_type) nl_attr_type(a))" provoked a GCC warning that I've never seen before: ../lib/dpif-netdev.c:1260: warning: cast from function call of type 'int' to non-matching type 'enum ovs_action_type'	2011-10-11 11:07:14 -07:00
Ben Pfaff	109ee2810d	dpif-netdev: Simplify code by removing dpif_netdev_validate_actions(). dpif_netdev_validate_actions() existed for three reasons. First, it checked that the actions were well-formed and valid. This isn't really necessary, because the actions are built internally by ofproto-dpif and will always be well-formed. (If not, that's a bug in ofproto-dpif.) Second, it checks whether the actions will modify (mutate) the data in the packet and reports that to the caller, which can use it to optimize what it does. However, the only caller that used this was dpif_netdev_execute(), which is not a fast-path (if dpif-netdev can be said to have a fast path at all). Third, dpif_netdev_validate_actions() rejects certain actions that dpif-netdev does not implement: OVS_ACTION_ATTR_SET_TUNNEL, OVS_ACTION_ATTR_SET_PRIORITY, and OVS_ACTION_ATTR_POP_PRIORITY. However, this doesn't really seem necessary to me. First, dpif-netdev can't support tunnels in any case, so OVS_ACTION_ATTR_SET_TUNNEL shouldn't come up. Second, the priority actions just aren't important enough to worry about; they only affect QoS, which isn't really important with dpif-netdev since it's going to be slow anyway. So this commit just drops dpif_netdev_validate_actions() entirely.	2011-10-11 10:37:25 -07:00
Ben Pfaff	a8d9304d12	dpif: Avoid use of "struct ovs_dp_stats" in platform-independent modules. Over time we wish to reduce the number of datapath-protocol.h definitions used directly outside of Linux-specific code. This commit removes use of "struct ovs_dp_stats" from platform-independent code. Bug #7559.	2011-10-05 11:18:13 -07:00
Pravin Shelar	6ff686f2bc	sFlow: Genericize/simplify kernel sFlow implementation Following patch adds sampling action which takes probability and set of actions as arguments. When probability is hit, actions are executed for given packet. USERSPACE action's userdata (u64) is used to store struct user_action_cookie as cookie. CONTROLLER action is fixed accordingly. Now we can remove sFlow code from kernel and implement sFlow generically as SAMPLE action. sFlow is defined as SAMPLE Action with probability (sFlow sampling rate) and USERSPACE action as argument. USERSPACE action's data is used as cookie. sFlow uses this cookie to store output-port, number of output ports and vlan-id. sample-pool is calculated by using vport stats. Signed-off-by: Pravin Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2011-09-28 10:43:07 -07:00
Ben Pfaff	26ce31583b	dpif-netdev: Also allow "dummy" netdevs in a dpif-netdev. I've always intended this to work, but either I never tested it or the support rotted. This will soon be used in some tests that I will add.	2011-09-13 11:46:09 -07:00
Pravin Shelar	9b02078077	datapath: Strip down vport interface : OVS_VPORT_ATTR_MTU There is no need to have vport attribute MTU (OVS_VPORT_ATTR_MTU) as linux net-dev-ioctl can be used to get/set MTU for linux device. Following patch removes OVS_VPORT_ATTR_MTU from datapath protocol. This patch also adds netdev_set_mtu interface. So that MTU adjustments can be done from OVS userspace. get_mtu() interface is also changed, now get_mtu() returns EOPNOTSUPP rather than returning 0 and setting *pmtu to INT_MAX in case there is no MTU attribute for given device. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-09-12 17:12:52 -07:00
Pravin Shelar	d9065a90b6	datapath: VLAN actions should use push/pop semantics Currently the kernel vlan actions mirror those used by OpenFlow 1.0. i.e. MODIFY and STRIP. More flexible approach is to have an action to push a tag and pop a tag off, so that it can handle multiple levels of vlan tags. Plus it aligns with newer version of OpenFlow. As this patch replaces MODIFY with PUSH semantic, action mapping done in userspace is fixed accordingly. GSO handling for multiple levels of vlan tags is also added as Jesse suggested before. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-09-09 18:13:26 -07:00
Ben Pfaff	18886b60bc	datapath: Allow a packet with no input port to omit OVS_KEY_ATTR_IN_PORT. When ovs-vswitchd executes actions on a synthesized packet, that is, on a packet that is not being forwarded from any particular port but is being generated by ovs-vswitchd itself or by an OpenFlow controller (using a OFPT_PACKET_OUT message with an in_port of OFPP_NONE), there is no good choice for the in_port to pass to the kernel in the flow in the OVS_PACKET_CMD_EXECUTE message. This commit allows ovs-vswitchd to omit the in_port entirely in this case. This fixes a bug in OFPT_PACKET_OUT: using an in_port of OFPP_NONE would cause the packet to be dropped by the kernel, since that's an invalid input port. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Reported-by: Aaron Rosen <arosen@clemson.edu>	2011-09-08 16:30:20 -07:00
Justin Pettit	df2c07f433	datapath: Use "OVS_" as opposed to "ODP_" for user<->kernel interactions. The prefix "ODP_" is not overly descriptive in the context of the larger Linux tree. This commit changes the prefix to "OVS_" for the userspace to kernel interactions. The userspace libraries still use "ODP_" in many of their interfaces since it is more descriptive in the OVS oeuvre. Feature #6904 Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-08-19 22:48:23 -07:00
Ben Pfaff	4ad2802695	dpif-netdev: Avoid pointlessly maintaining a port count. 'n_ports' was only used for testing for nonzero, and we can rewrite the code that does that to more straightforwardly use LIST_FOR_EACH_SAFE.	2011-08-15 09:49:15 -07:00
Ben Pfaff	18812dff32	netdev: Get rid of struct netdev_options and netdev_open_default(). Now that netdev_options only has two members, we might as well pass them directly as parameters.	2011-08-08 12:55:43 -07:00
Ben Pfaff	7b6b0ef47e	netdev: Clean up and refactor packet receive interface. The Open vSwitch tree only has one user of the ability for a netdev to receive packets from a network device. Thus, this commit simplifies the common-case use of the netdev interface by replacing the "ethertype" option from "struct netdev_options" by a new netdev_listen() call. The only user of netdev_listen() wants to receive all packets from a network device, so this commit also removes the ability to restrict the received packets to a particular protocol. (This ability was once used by the Open vSwitch integrated DHCP client, but that code has been removed.) This commit also simplifies and improves the implementation of the code in netdev-linux that started listening to a network device. Before, I had not figured out how to avoid receiving all packets on all devices before binding to a particular device, but I took a closer look at the kernel code and figured it out. I've tested that the userspace datapath (dpif-netdev), the only user of netdev_recv(), still works after this change.	2011-08-08 10:24:24 -07:00
Simon Horman	f180c2e2cc	ovs-dpctl: Show number of flows Expose the number of flows present in a datapath to user-space and to users via ovs-dpctl show. e.g.: ovs-dpctl show br3 system@br3: lookups: frags:0, hit:0, missed:0, lost:0 flows: 0 ... Signed-off-by: Simon Horman <horms@verge.net.au> [Jesse: Add same logic to userspace datapath.] Signed-off-by: Jesse Gross <jesse@nicira.com>	2011-08-03 21:38:03 -07:00
pravin shelar	b85d8d61a6	Datapath action should not refer to controller ODP_ACTION_ATTR_CONTROLLER in the kernel actually sends packets to userspace, not the controller. To make it generic rename this action to ODP_ACTION_ATTR_USERSPACE. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2011-07-28 09:05:25 -07:00
Justin Pettit	6c222e55fa	Remove NXAST_DROP_SPOOFED_ARP action. The NXAST_DROP_SPOOFED_ARP action has been deprecated in favor of defining flows using the NXM_NX_ARP_SHA flow match for a while. This commit removes it. Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-06-09 16:19:38 -07:00
Ben Pfaff	80e5eed9c2	datapath: Get packet metadata from userspace in odp_packet_cmd_execute(). Until now, the tun_id and in_port have been lost when a packet is sent from the kernel to userspace and then back to the kernel. I didn't think that this was a problem, but recent behavior made me look closer and see that it makes a difference if sFlow is turned on or if an ODP_ATTR_ACTION_CONTROLLER action is present. We could possibly kluge around those, but for future-proofing it seems better to pass the packet metadata from userspace to the kernel. That is what this commit does. This commit introduces a user-kernel protocol break. We could avoid that, if it is desirable, by making ODP_PACKET_ATTR_KEY optional for ODP_PACKET_CMD_EXECUTE commands. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-06-01 13:39:51 -07:00
Ben Pfaff	b2fda3effc	Merge 'next' into 'master'. I know already that this breaks the stats fixes that were implemented by the following commits: 827ab71c97f "ofproto: Datapath statistics accounted twice." 6f1435fc8f7 "ofproto: Resubmit statistics improperly account during..." These were already broken in a previous merge. I will work on a fix.	2011-05-18 14:01:13 -07:00
Ben Pfaff	d84d4b88d2	Fix incorrect byte order annotations. These are not actual bugs, just deceptive use of the wrong function or type. Found by sparse.	2011-05-16 13:40:47 -07:00
Ben Pfaff	640e1b2077	dpif: Improve abstraction by making 'run' and 'wait' functions per-dpif. Until now, the dp_run() and dp_wait() functions had to be called at the top level of the program because they applied to every open dpif. By replacing them by functions that take a specific dpif as an argument, we can call them only from ofproto, which is currently the correct layer to deal with dpifs.	2011-05-11 12:26:07 -07:00
Ben Pfaff	7c66b273a2	packets: New function eth_set_vlan_tci(), from dpif-netdev. This will soon be used in the upcoming bond library.	2011-04-01 15:52:19 -07:00
Ben Pfaff	7ecb095d0b	ofpbuf: Make ofpbufs initialized with ofpbuf_use_stack() not expandable. My original intent for ofpbufs initialized with ofpbuf_use_stack() was that the caller was providing enough space on the stack for the common case, with dynamic allocation as a fallback. But in practice, none of the clients actually do this. Instead, all of them actually know that the stack-allocated buffer is big enough and, since they don't want to bother with having to call ofpbuf_delete(), they instead assert that the buffer wasn't reallocated. Since this is a bit of a pain, this commit changes the semantics of ofpbuf_use_stack() to be that the stack-allocated buffer cannot be reallocated at all. This is more convenient for the existing clients.	2011-03-30 15:08:47 -07:00
Ben Pfaff	19cf40693d	odp-util: Replace ODPUTIL_FLOW_KEY_U32S by new struct odputil_keybuf. This seems to me to better encapsulate the inherent ugliness.	2011-03-30 15:08:47 -07:00
Ben Pfaff	ed4031e467	dpif-netdev: Fix segfault handling packets. Reported-by: Hassan Khan <hassan.khan@seecs.edu.pk>	2011-02-15 10:08:15 -08:00
Ben Pfaff	f915f1a8ca	datapath: Consider tunnels to have no MTU, fixing jumbo frame support. Until now, tunnel vports have had a specific MTU, in the same way that ordinary network devices have an MTU, but treating them this way does not always make sense. For example, consider a datapath that has three ports: the local port, a GRE tunnel to another host, and a physical port. If the physical port is configured with a jumbo MTU, it should be possible to send jumbo packets across the tunnel: the tunnel can do fragmentation or the physical port traversed by the tunnel might have a jumbo MTU. However, until now, tunnels always had a 1500-byte MTU by default. It could be adjusted using ODP_VPORT_MTU_SET, but nothing actually did this. One alternative would be to make ovs-vswitchd able to set the vport's MTU. This commit, however, takes a different approach, of dropping the concept of MTU entirely for tunnel vports. This also solves the problem described above, without making any additional work for anyone. I tested that, without this change, I could not send 1600-byte "pings" between two machines whose NICs had 2000-byte MTUs that were connected to vswitches that were in turn connected over GRE tunnels with the default 1500-byte MTU. With this change, it worked OK, regardless of the MTU of the network traversed by the GRE tunnel. This patch also makes "patch" ports MTU-less. It might make sense to remove vport_set_mtu() and the associated callback now, since ordinary network devices are the only vports that support it now. Signed-off-by: Ben Pfaff <blp@nicira.com> Suggested-by: Jesse Gross <jesse@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Bug #3728.	2011-02-04 09:46:26 -08:00
Justin Pettit	6767a2cce9	lib: Replace IP_TYPE_ references with IPPROTO_. A few common IP protocol types were defined in "lib/packets.h". However, we already assume the existence of <netinet/in.h> which contains a more exhaustive list and should be available on POSIX systems.	2011-02-02 11:50:17 -08:00
Ben Pfaff	7aec165dbc	datapath: s/ODPAT_/ODP_ACTION_ATTR_/ to fit new naming scheme. Jesse suggested this naming scheme, so I'm adjusting existing names to fit it. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-28 15:34:40 -08:00
Ben Pfaff	3d8c95357f	dpif: Remove dpif_get_all_names(). None of the remaining dpif implementations have more than one name per dpif, so there's no need for this function anymore. Suggested-by: Jesse Gross <jesse@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-28 15:34:40 -08:00
Ben Pfaff	82272eded1	Eliminate ODPL_* from userspace-facing interface. Reviewed by Justin Pettit.	2011-01-27 21:08:41 -08:00
Ben Pfaff	693c4a0112	datapath: Eliminate 'flags' member from odp_flow. Nothing was productively using the 'flags' member of odp_flow, so this commit removes it. ODPFF_ZERO_TCP_FLAGS isn't used at all (as of the previous commit). ODPFF_EOF has been replaced by a special case of the 'key_len' member. This will go away, too, once AF_NETLINK starts being used. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:39 -08:00
Ben Pfaff	ba25b8f41f	dpif: Eliminate ODPPF_* constants from client-visible interface. Following this commit, the ODPPF_* constants are only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:39 -08:00
Ben Pfaff	c97fb13280	dpif: Eliminate "struct odp_flow_stats" from client-visible interface. Following this commit, "struct odp_flow_stats" is only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:38 -08:00
Ben Pfaff	feebdea2e5	dpif: Eliminate "struct odp_flow" from client-visible interface. Following this commit, "struct odp_flow" and related data structures are only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:38 -08:00
Ben Pfaff	bc4a05c639	datapath: Change ODP_FLOW_GET to retrieve only a single flow at a time. This brings the code closer to what the Netlink interface will need to implement. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	996c1b3d7a	datapath: Drop port information from odp_stats. As with n_flows, n_ports was used regularly by userspace to determine how much memory to allocate when listing ports, but it is no longer needed for that. max_ports, on the other hand, is necessary but it is also a fixed value for the kernel datapath right now and if we expand it we can also come up with a way to report the expanded value. The remaining members of odp_stats are actually real statistics that I intend to keep. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	1ba530f4b2	datapath: Drop queue information from odp_stats. This queue information will be available through the kernel socket layer once we move over to Netlink socket as transports, so we might as well get rid of the redundancy. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	ea7bd5973f	datapath: Drop flow information from odp_stats. Userspace used to use the n_flows information here to decide how much memory needed to be allocated to list flows, but that isn't necessary any longer now that listing flows uses an iterator abstraction. The cur_capacity and max_capacity members are just curiosities and don't provide much information; if the implementation ever changes away from the current hash table implementation then they could become meaningless anyhow. But more than anything, these aren't really the kind of statistics that networking people usually care about. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:38 -08:00
Ben Pfaff	4c738a8da5	dpif: Eliminate "struct odp_port" from client-visible interface. Following this commit, "struct odp_port" is only used in Linux-specific parts of OVS userspace code. This allows the actual Linux datapath interface to evolve more freely. Reviewed by Justin Pettit.	2011-01-27 21:08:37 -08:00
Ben Pfaff	b0ec0f279e	datapath: Change listing ports to use an iterator concept. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to add new features to the kernel vport layer without changing userspace software. In turn, that means that the odp_port structure must become variable-length. This does not, however, fit in well with the ODP_PORT_LIST ioctl in its current form, because that would require userspace to know how much space to allocate for each port in advance, or to allocate as much space as could possibly be needed. Neither choice is very attractive. This commit prepares for a different solution, by replacing ODP_PORT_LIST by a new ioctl ODP_VPORT_DUMP that retrieves information about a single vport from the datapath on each call. It is much cleaner to allocate the maximum amount of space for a single vport than to do so for possibly a large number of vports. It would be faster to retrieve a number of vports in batch instead of just one at a time, but that will naturally happen later when the kernel datapath interface is changed to use Netlink, so this patch does not bother with it. The Netlink version won't need to take the starting port number from userspace, since Netlink sockets can keep track of that state as part of their "dump" feature. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	856081f683	datapath: Report kernel's flow key when passing packets up to userspace. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. This commit takes one step in that direction by making the kernel report its idea of the flow that a packet belongs to whenever it passes a packet up to userspace. This means that userspace can intelligently figure out what to do: - If userspace's notion of the flow for the packet matches the kernel's, then nothing special is necessary. - If the kernel has a more specific notion for the flow than userspace, for example if the kernel decoded IPv6 headers but userspace stopped at the Ethernet type (because it does not understand IPv6), then again nothing special is necessary: userspace can still set up the flow in the usual way. - If userspace has a more specific notion for the flow than the kernel, for example if userspace decoded an IPv6 header but the kernel stopped at the Ethernet type, then userspace can forward the packet manually, without setting up a flow in the kernel. (This case is bad from a performance point of view, but at least it is correct.) This commit does not actually make userspace flexible enough to handle changes in the kernel flow key structure, although userspace does now have enough information to do that intelligently. This will have to wait for later commits. This commit is bigger than it would otherwise be because it is rolled together with changing "struct odp_msg" to a sequence of Netlink attributes. The alternative, to do each of those changes in a separate patch, seemed like overkill because it meant that either we would have to introduce and then kill off Netlink attributes for in_port and tun_id, if Netlink conversion went first, or shove yet another variable-length header into the stuff already after odp_msg, if adding the flow key to odp_msg went first. This commit will slow down performance of checksumming packets sent up to userspace. I'm not entirely pleased with how I did it. I considered a couple of alternatives, but none of them seemed that much better. Suggestions welcome. Not changing anything wasn't an option, unfortunately. At any rate some slowdown will become unavoidable when OVS actually starts using Netlink instead of just Netlink framing. (Actually, I thought of one option where we could avoid that: make userspace do the checksum instead, by passing csum_start and csum_offset as part of what goes to userspace. But that's not perfect either.) Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:36 -08:00
Ben Pfaff	36956a7d33	datapath: Convert odp_flow_key to use Netlink attributes instead. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. In turn, that means that flow keys must become variable-length. This commit makes that change using Netlink attribute sequences. This commit does not actually make userspace flexible enough to handle changes in the kernel flow key structure, because userspace doesn't yet have enough information to do that intelligently. Upcoming commits will fix that. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:35 -08:00
Ben Pfaff	704a1e09e9	datapath: Change listing flows to use an iterator concept. One of the goals for Open vSwitch is to decouple kernel and userspace software, so that either one can be upgraded or rolled back independent of the other. To do this in full generality, it must be possible to change the kernel's idea of the flow key separately from the userspace version. In turn, that means that flow keys must become variable-length. This does not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form, because that would require userspace to know how much space to allocate for each flow's key in advance, or to allocate as much space as could possibly be needed. Neither choice is very attractive. This commit prepares for a different solution, by replacing ODP_FLOW_LIST by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath on each call. It is much cleaner to allocate the maximum amount of space for a single flow key than to do so for possibly a very large number of flow keys. As a side effect, this patch also fixes a race condition that sometimes made "ovs-dpctl dump-flows" print an error: previously, flows were listed and then their actions were retrieved, which left a window in which ovs-vswitchd could delete the flow. Now dumping a flow and its actions is a single step, closing that window. Dumping all of the flows in a datapath is no longer an atomic step, so now it is possible to miss some flows or see a single flow twice during iteration, if the flow table is modified by another process. It doesn't look like this should be a problem for ovs-vswitchd. It would be faster to retrieve a number of flows in batch instead of just one at a time, but that will naturally happen later when the kernel datapath interface is changed to use Netlink, so this patch does not bother with it. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2011-01-27 21:08:35 -08:00
Ben Pfaff	ecf9ebff6c	dpif-netdev: Allow for Ethernet and VLAN header in buffer size calculation. This is a long-standing bug--it was present in version 1.0 too. Reported-by: Gaetano Catalli <gaetano.catalli@gmail.com> Solution by Jesse Gross <jesse@nicira.com>	2011-01-24 12:44:19 -08:00
Ben Pfaff	db5ce51427	Fix non-static instances of "struct vlog_rate_limit" and add check. A non-static vlog_rate_limit is not actually going to rate-limit anything.	2011-01-12 09:22:12 -08:00
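
The fragment handling policy described in commit 7257b535ab distinguishes "first fragments" (offset 0, where L4 headers are present) from fragments with nonzero offset (where they are not). A minimal sketch of that distinction follows, written in Python for brevity. It is not the OVS implementation: the names FragType, classify_fragment and l4_headers_matchable are hypothetical, and the only facts assumed are the standard IPv4 header layout (a "more fragments" flag at 0x2000 and a 13-bit offset in the 16-bit flags/offset field, taken here in host byte order).

```python
from enum import Enum

# Standard IPv4 layout of the 16-bit flags/fragment-offset field.
IP_MF = 0x2000           # "more fragments" flag
IP_OFFSET_MASK = 0x1FFF  # 13-bit fragment offset, in units of 8 bytes


class FragType(Enum):
    """Hypothetical labels; not the values OVS uses for NXM_NX_IP_FRAG."""
    NONE = "not a fragment"
    FIRST = "first fragment (offset 0)"
    LATER = "later fragment (nonzero offset)"


def classify_fragment(frag_off: int) -> FragType:
    """Classify an IPv4 packet from its flags/offset field (host byte order)."""
    if frag_off & IP_OFFSET_MASK:
        # Nonzero offset: this packet carries no L4 header.
        return FragType.LATER
    if frag_off & IP_MF:
        # Offset 0 with more fragments to follow: the L4 header is present.
        return FragType.FIRST
    return FragType.NONE


def l4_headers_matchable(frag_off: int) -> bool:
    """L4 fields are available unless the packet is a later fragment."""
    return classify_fragment(frag_off) is not FragType.LATER
```

Under this sketch, an ACL that drops packets classified as FragType.FIRST on the basis of their L4 ports is enough to block the whole fragmented datagram, since, as the commit message notes, the receiver cannot reassemble a packet whose first fragment is missing.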

113 Commits