| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | /*
 | 
					
						
							| 
									
										
										
										
											2011-01-12 10:44:43 -08:00
										 |  |  |  * Copyright (c) 2010, 2011 Nicira Networks. | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  |  * Distributed under the terms of the GNU GPL version 2. | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * Significant portions of this file may be copied from parts of the Linux | 
					
						
							|  |  |  |  * kernel, by Linus Torvalds and others. | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-08-26 23:34:40 -07:00
										 |  |  | #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | #include <linux/if_arp.h>
 | 
					
						
							|  |  |  | #include <linux/if_bridge.h>
 | 
					
						
							|  |  |  | #include <linux/if_vlan.h>
 | 
					
						
							|  |  |  | #include <linux/kernel.h>
 | 
					
						
							|  |  |  | #include <linux/llc.h>
 | 
					
						
							|  |  |  | #include <linux/rtnetlink.h>
 | 
					
						
							|  |  |  | #include <linux/skbuff.h>
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | #include <net/llc.h>
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-11-22 14:17:24 -08:00
										 |  |  | #include "checksum.h"
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | #include "datapath.h"
 | 
					
						
							| 
									
										
										
										
											2010-12-30 20:48:38 -08:00
										 |  |  | #include "vlan.h"
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | #include "vport-internal_dev.h"
 | 
					
						
							|  |  |  | #include "vport-netdev.h"
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
												Support vlan_group workaround implemented in XenServer kernels.
Some Linux network drivers support a feature called "VLAN acceleration",
associated with a data structure called a "vlan_group".  A vlan_group is,
abstractly, a dictionary that maps from a VLAN ID (in the range 0...4095)
to a VLAN device, that is, a Linux network device associated with a
particular VLAN, e.g. "eth0.9" for VLAN 9 on eth0.
Some drivers that support VLAN acceleration have bugs that fall roughly
into the following categories:
    * Some NICs strip VLAN tags on receive if no vlan_group is registered,
      so that the tag is completely lost.
    * Some drivers size their receive buffers based on whether a vlan_group
      is enabled, meaning that a maximum size packet with a VLAN tag will
      not fit if a vlan_group is not configured.
    * On transmit some drivers expect that VLAN acceleration will be used
      if it is available (which can only be done if a vlan_group is
      configured).  In these cases, the driver may fail to parse the packet
      and correctly setup checksum offloading and/or TSO.
The correct long term solution is to fix these driver bugs.  To cope until
then, we have prepared a patch to the Linux kernel network stack that works
around these problems.  This commit adds support for the workaround
implemented by that patch.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
											
										 
											2011-03-16 14:39:17 -07:00
										 |  |  | #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,37) && \
 | 
					
						
							|  |  |  |     !defined(HAVE_VLAN_BUG_WORKAROUND) | 
					
						
							| 
									
										
										
										
											2010-12-30 12:28:10 -08:00
										 |  |  | #include <linux/module.h>
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | static int vlan_tso __read_mostly = 0; | 
					
						
							|  |  |  | module_param(vlan_tso, int, 0644); | 
					
						
							|  |  |  | MODULE_PARM_DESC(vlan_tso, "Enable TSO for VLAN packets"); | 
					
						
							| 
									
										
											  
											
											
										 
											2011-03-16 14:39:17 -07:00
										 |  |  | #else
 | 
					
						
							|  |  |  | #define vlan_tso true
 | 
					
						
							| 
									
										
										
										
											2010-12-30 12:28:10 -08:00
										 |  |  | #endif
 | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | static void netdev_port_receive(struct vport *vport, struct sk_buff *skb); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-04-07 19:43:18 +00:00
										 |  |  | #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,39)
 | 
					
						
							|  |  |  | /* Called with rcu_read_lock and bottom-halves disabled. */ | 
					
						
							|  |  |  | static rx_handler_result_t netdev_frame_hook(struct sk_buff **pskb) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  | 	struct sk_buff *skb = *pskb; | 
					
						
							|  |  |  | 	struct vport *vport; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	if (unlikely(skb->pkt_type == PACKET_LOOPBACK)) | 
					
						
							|  |  |  | 		return RX_HANDLER_PASS; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	vport = netdev_get_vport(skb->dev); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	netdev_port_receive(vport, skb); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	return RX_HANDLER_CONSUMED; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,36)
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | /* Called with rcu_read_lock and bottom-halves disabled. */ | 
					
						
							|  |  |  | static struct sk_buff *netdev_frame_hook(struct sk_buff *skb) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  | 	struct vport *vport; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	if (unlikely(skb->pkt_type == PACKET_LOOPBACK)) | 
					
						
							|  |  |  | 		return skb; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:10 +09:00
										 |  |  | 	vport = netdev_get_vport(skb->dev); | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | 
 | 
					
						
							|  |  |  | 	netdev_port_receive(vport, skb); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	return NULL; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,22)
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | /*
 | 
					
						
							|  |  |  |  * Used as br_handle_frame_hook.  (Cannot run bridge at the same time, even on | 
					
						
							|  |  |  |  * a different set of devices!) | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | /* Called with rcu_read_lock and bottom-halves disabled. */ | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | static struct sk_buff *netdev_frame_hook(struct net_bridge_port *p, | 
					
						
							|  |  |  | 					 struct sk_buff *skb) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | 	netdev_port_receive((struct vport *)p, skb); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	return NULL; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | /*
 | 
					
						
							|  |  |  |  * Used as br_handle_frame_hook.  (Cannot run bridge at the same time, even on | 
					
						
							|  |  |  |  * a different set of devices!) | 
					
						
							|  |  |  |  */ | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | /* Called with rcu_read_lock and bottom-halves disabled. */ | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | static int netdev_frame_hook(struct net_bridge_port *p, struct sk_buff **pskb) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | 	netdev_port_receive((struct vport *)p, *pskb); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	return 1; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | #else
 | 
					
						
							|  |  |  | #error
 | 
					
						
							|  |  |  | #endif
 | 
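The preprocessor ladder above selects one of four `netdev_frame_hook` signatures at compile time, depending on the kernel version. As a rough userspace illustration only (not part of the module), the sketch below mirrors that ladder at run time. `KERNEL_VERSION` is redefined here with the encoding that `<linux/version.h>` used on 2.6-era kernels; `pick_hook_flavor` and the `HOOK_*` names are hypothetical and exist only for this example.

```c
/* Same encoding that <linux/version.h> used on 2.6-era kernels. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical labels for the four hook signatures in the ladder. */
enum hook_flavor {
	/* >= 2.6.39: rx_handler_result_t hook(struct sk_buff **pskb) */
	HOOK_RX_HANDLER_RESULT,
	/* >= 2.6.36: struct sk_buff *hook(struct sk_buff *skb) */
	HOOK_RX_HANDLER_SKB,
	/* >= 2.6.22: struct sk_buff *hook(struct net_bridge_port *p,
	 *                                 struct sk_buff *skb) */
	HOOK_BRIDGE_SKB,
	/* >= 2.6.0:  int hook(struct net_bridge_port *p,
	 *                     struct sk_buff **pskb) */
	HOOK_BRIDGE_INT,
	/* Anything older hits the #error branch. */
	HOOK_UNSUPPORTED
};

/* Walk the same thresholds, newest first, as the #if/#elif chain. */
static enum hook_flavor pick_hook_flavor(int version_code)
{
	if (version_code >= KERNEL_VERSION(2, 6, 39))
		return HOOK_RX_HANDLER_RESULT;
	if (version_code >= KERNEL_VERSION(2, 6, 36))
		return HOOK_RX_HANDLER_SKB;
	if (version_code >= KERNEL_VERSION(2, 6, 22))
		return HOOK_BRIDGE_SKB;
	if (version_code >= KERNEL_VERSION(2, 6, 0))
		return HOOK_BRIDGE_INT;
	return HOOK_UNSUPPORTED;
}
```

Because the thresholds are tested newest first, a 2.6.38 kernel falls through the 2.6.39 test and lands on the 2.6.36 variant, which is the same behavior the `#if`/`#elif` chain has.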
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,36)
 | 
					
						
							|  |  |  | static int netdev_init(void) { return 0; } | 
					
						
							|  |  |  | static void netdev_exit(void) { } | 
					
						
							|  |  |  | #else
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | static int netdev_init(void) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	/* Hook into callback used by the bridge to intercept packets.
 | 
					
						
							|  |  |  | 	 * Parasites we are. */ | 
					
						
							|  |  |  | 	br_handle_frame_hook = netdev_frame_hook; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	return 0; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-29 18:15:14 -07:00
										 |  |  | static void netdev_exit(void) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	br_handle_frame_hook = NULL; | 
					
						
							|  |  |  | } | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | #endif
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-11-04 16:32:57 -07:00
										 |  |  | static struct vport *netdev_create(const struct vport_parms *parms) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	struct vport *vport; | 
					
						
							|  |  |  | 	struct netdev_vport *netdev_vport; | 
					
						
							|  |  |  | 	int err; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-12-03 13:09:26 -08:00
										 |  |  | 	vport = vport_alloc(sizeof(struct netdev_vport), &netdev_vport_ops, parms); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	if (IS_ERR(vport)) { | 
					
						
							|  |  |  | 		err = PTR_ERR(vport); | 
					
						
							|  |  |  | 		goto error; | 
					
						
							|  |  |  | 	} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-11-04 16:32:57 -07:00
										 |  |  | 	netdev_vport->dev = dev_get_by_name(&init_net, parms->name); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	if (!netdev_vport->dev) { | 
					
						
							|  |  |  | 		err = -ENODEV; | 
					
						
							|  |  |  | 		goto error_free_vport; | 
					
						
							|  |  |  | 	} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	if (netdev_vport->dev->flags & IFF_LOOPBACK || | 
					
						
							|  |  |  | 	    netdev_vport->dev->type != ARPHRD_ETHER || | 
					
						
							|  |  |  | 	    is_internal_dev(netdev_vport->dev)) { | 
					
						
							|  |  |  | 		err = -EINVAL; | 
					
						
							|  |  |  | 		goto error_put; | 
					
						
							|  |  |  | 	} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-12-03 15:44:51 -08:00
										 |  |  | 	err = netdev_rx_handler_register(netdev_vport->dev, netdev_frame_hook, | 
					
						
							|  |  |  | 					 vport); | 
					
						
							|  |  |  | 	if (err) | 
					
						
							|  |  |  | 		goto error_put; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	dev_set_promiscuity(netdev_vport->dev, 1); | 
					
						
							| 
									
										
										
										
											2011-08-26 23:34:40 -07:00
										 |  |  | #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,24)
 | 
					
						
							| 
									
										
										
										
											2010-12-03 15:44:51 -08:00
										 |  |  | 	dev_disable_lro(netdev_vport->dev); | 
					
						
							| 
									
										
										
										
											2011-08-26 23:34:40 -07:00
										 |  |  | #endif
 | 
					
						
							| 
									
										
										
										
											2010-12-03 15:44:51 -08:00
										 |  |  | 	netdev_vport->dev->priv_flags |= IFF_OVS_DATAPATH; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	return vport; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | error_put: | 
					
						
							|  |  |  | 	dev_put(netdev_vport->dev); | 
					
						
							|  |  |  | error_free_vport: | 
					
						
							|  |  |  | 	vport_free(vport); | 
					
						
							|  |  |  | error: | 
					
						
							|  |  |  | 	return ERR_PTR(err); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-09-09 19:09:47 -07:00
										 |  |  | static void netdev_destroy(struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:10 +09:00
										 |  |  | 	netdev_vport->dev->priv_flags &= ~IFF_OVS_DATAPATH; | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | 	netdev_rx_handler_unregister(netdev_vport->dev); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	dev_set_promiscuity(netdev_vport->dev, -1); | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-12-03 15:44:51 -08:00
										 |  |  | 	synchronize_rcu(); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	dev_put(netdev_vport->dev); | 
					
						
							|  |  |  | 	vport_free(vport); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | int netdev_set_addr(struct vport *vport, const unsigned char *addr) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	struct sockaddr sa; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	sa.sa_family = ARPHRD_ETHER; | 
					
						
							|  |  |  | 	memcpy(sa.sa_data, addr, ETH_ALEN); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	return dev_set_mac_address(netdev_vport->dev, &sa); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | const char *netdev_get_name(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return netdev_vport->dev->name; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | const unsigned char *netdev_get_addr(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return netdev_vport->dev->dev_addr; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | struct kobject *netdev_get_kobj(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return &netdev_vport->dev->NETDEV_DEV_MEMBER.kobj; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | unsigned netdev_get_dev_flags(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return dev_get_flags(netdev_vport->dev); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | int netdev_is_running(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return netif_running(netdev_vport->dev); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | unsigned char netdev_get_operstate(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return netdev_vport->dev->operstate; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-29 18:15:14 -07:00
										 |  |  | int netdev_get_ifindex(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return netdev_vport->dev->ifindex; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | int netdev_get_mtu(const struct vport *vport) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	const struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							|  |  |  | 	return netdev_vport->dev->mtu; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | /* Must be called with rcu_read_lock. */ | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:06 +09:00
										 |  |  | static void netdev_port_receive(struct vport *vport, struct sk_buff *skb) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2011-05-19 13:17:39 -07:00
										 |  |  | 	if (unlikely(!vport)) { | 
					
						
							|  |  |  | 		kfree_skb(skb); | 
					
						
							|  |  |  | 		return; | 
					
						
							|  |  |  | 	} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	/* Make our own copy of the packet.  Otherwise we will mangle the
 | 
					
						
							|  |  |  | 	 * packet for anyone who came before us (e.g. tcpdump via AF_PACKET). | 
					
						
							|  |  |  | 	 * (No one comes after us, since we tell handle_bridge() that we took | 
					
						
							|  |  |  | 	 * the packet.) */ | 
					
						
							|  |  |  | 	skb = skb_share_check(skb, GFP_ATOMIC); | 
					
						
							| 
									
										
										
										
											2010-07-29 19:01:02 -07:00
										 |  |  | 	if (unlikely(!skb)) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 		return; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	skb_push(skb, ETH_HLEN); | 
					
						
							| 
									
										
										
										
											2011-05-27 15:53:49 -07:00
										 |  |  | 
 | 
					
						
							|  |  |  | 	if (unlikely(compute_ip_summed(skb, false))) { | 
					
						
							|  |  |  | 		kfree_skb(skb); | 
					
						
							|  |  |  | 		return; | 
					
						
							|  |  |  | 	} | 
					
						
							| 
									
										
										
										
											2010-12-30 20:48:38 -08:00
										 |  |  | 	vlan_copy_skb_tci(skb); | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 
 | 
					
						
							|  |  |  | 	vport_receive(vport, skb); | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-08-26 23:34:40 -07:00
										 |  |  | static inline unsigned packet_length(const struct sk_buff *skb) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  | 	unsigned length = skb->len - ETH_HLEN; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	if (skb->protocol == htons(ETH_P_8021Q)) | 
					
						
							|  |  |  | 		length -= VLAN_HLEN; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	return length; | 
					
						
							|  |  |  | } | 
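To make the arithmetic in `packet_length()` concrete, here is a minimal userspace sketch of the same calculation, assuming the frame length is counted from the first byte of the Ethernet header and the EtherType is given in host byte order. The helper names `payload_length` and `over_mtu` are hypothetical; the constants are redefined locally with their standard values so the example does not depend on kernel headers.

```c
#include <stdint.h>

#define ETH_HLEN    14      /* dst MAC + src MAC + EtherType */
#define VLAN_HLEN   4       /* one 802.1Q tag */
#define ETH_P_8021Q 0x8100  /* EtherType of an 802.1Q-tagged frame */

/* Length of the frame without the Ethernet header and, if the frame
 * is tagged, without the 802.1Q tag.  This is the value that gets
 * compared against the device MTU. */
static unsigned int payload_length(unsigned int frame_len, uint16_t ethertype)
{
	unsigned int length = frame_len - ETH_HLEN;

	if (ethertype == ETH_P_8021Q)
		length -= VLAN_HLEN;

	return length;
}

/* Nonzero if a non-GSO frame of this size would be dropped as over-MTU. */
static int over_mtu(unsigned int frame_len, uint16_t ethertype,
		    unsigned int mtu)
{
	return payload_length(frame_len, ethertype) > mtu;
}
```

With a 1500-byte MTU, an untagged 1514-byte frame and a tagged 1518-byte frame both yield a payload length of 1500 and pass the check. An untagged 1518-byte frame yields 1504 and would be dropped.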
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
											  
											
											
										 
											2011-03-16 14:39:17 -07:00
										 |  |  | static bool dev_supports_vlan_tx(struct net_device *dev) | 
					
						
							|  |  |  | { | 
					
						
							|  |  |  | #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,37)
 | 
					
						
							|  |  |  | 	/* Software fallback means every device supports vlan_tci on TX. */ | 
					
						
							|  |  |  | 	return true; | 
					
						
							|  |  |  | #elif defined(HAVE_VLAN_BUG_WORKAROUND)
 | 
					
						
							|  |  |  | 	return dev->features & NETIF_F_HW_VLAN_TX; | 
					
						
							|  |  |  | #else
 | 
					
						
							|  |  |  | 	/* Assume that the driver is buggy. */ | 
					
						
							|  |  |  | 	return false; | 
					
						
							|  |  |  | #endif
 | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | static int netdev_send(struct vport *vport, struct sk_buff *skb) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							|  |  |  | 	struct netdev_vport *netdev_vport = netdev_vport_priv(vport); | 
					
						
							| 
									
										
										
										
											2011-08-26 23:34:40 -07:00
										 |  |  | 	int mtu = netdev_vport->dev->mtu; | 
					
						
							| 
									
										
										
										
											2010-12-29 22:13:15 -08:00
										 |  |  | 	int len; | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-08-26 23:34:40 -07:00
										 |  |  | 	if (unlikely(packet_length(skb) > mtu && !skb_is_gso(skb))) { | 
					
						
							|  |  |  | 		if (net_ratelimit()) | 
					
						
							|  |  |  | 			pr_warn("%s: dropped over-mtu packet: %d > %d\n", | 
					
						
							|  |  |  | 				dp_name(vport->dp), packet_length(skb), mtu); | 
					
						
							|  |  |  | 		goto error; | 
					
						
							|  |  |  | 	} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	if (unlikely(skb_warn_if_lro(skb))) | 
					
						
							|  |  |  | 		goto error; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	skb->dev = netdev_vport->dev; | 
					
						
							| 
									
										
										
										
											2011-05-27 15:53:49 -07:00
										 |  |  | 	forward_ip_summed(skb, true); | 
					
						
							| 
									
										
										
										
											2010-12-29 22:13:15 -08:00
										 |  |  | 
 | 
					
						
							| 
									
										
											  
											
											
										 
											2011-03-16 14:39:17 -07:00
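										 |  |  | 	/* The device cannot be relied on to insert the VLAN tag itself, | 
										 |  |  | 	 * so insert it in software before queuing the packet. */ | 
										 |  |  | 	if (vlan_tx_tag_present(skb) && !dev_supports_vlan_tx(skb->dev)) { | 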
					
						
							| 
									
										
										
										
											2011-09-09 18:13:26 -07:00
										 |  |  | 		int features; | 
					
						
							| 
									
										
										
										
											2011-02-07 15:50:04 -08:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-09-09 18:13:26 -07:00
										 |  |  | 		features = netif_skb_features(skb); | 
					
						
							| 
									
										
										
										
											2010-12-29 22:13:15 -08:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-12-30 12:28:10 -08:00
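										 |  |  | 		/* If TSO is not enabled for VLAN packets, mask the segmentation | 
										 |  |  | 		 * offloads so that such packets are segmented in software. */ | 
										 |  |  | 		if (!vlan_tso) | 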
					
						
							|  |  |  | 			features &= ~(NETIF_F_TSO | NETIF_F_TSO6 | | 
					
						
							|  |  |  | 				      NETIF_F_UFO | NETIF_F_FSO); | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-09-09 18:13:26 -07:00
										 |  |  | 		if (netif_needs_gso(skb, features)) { | 
					
						
							| 
									
										
										
										
											2010-12-29 22:13:15 -08:00
										 |  |  | 			struct sk_buff *nskb; | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-12-30 12:28:10 -08:00
										 |  |  | 			nskb = skb_gso_segment(skb, features); | 
					
						
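							|  |  |  | 			/* NULL means that no software segmentation was needed, so | 
							|  |  |  | 			 * the packet can be tagged and sent whole. */ | 
							|  |  |  | 			if (!nskb) { | 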
					
						
							|  |  |  | 				if (unlikely(skb_cloned(skb) && | 
					
						
							|  |  |  | 				    pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) { | 
					
						
							|  |  |  | 					kfree_skb(skb); | 
					
						
							|  |  |  | 					return 0; | 
					
						
							|  |  |  | 				} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 				skb_shinfo(skb)->gso_type &= ~SKB_GSO_DODGY; | 
					
						
							|  |  |  | 				goto tag; | 
					
						
							|  |  |  | 			} | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2011-06-16 15:32:26 -07:00
										 |  |  | 			if (IS_ERR(nskb)) { | 
					
						
							|  |  |  | 				kfree_skb(skb); | 
					
						
							| 
									
										
										
										
											2010-12-29 22:13:15 -08:00
										 |  |  | 				return 0; | 
					
						
							| 
									
										
										
										
											2011-06-16 15:32:26 -07:00
										 |  |  | 			} | 
					
						
							|  |  |  | 			consume_skb(skb); | 
					
						
							|  |  |  | 			skb = nskb; | 
					
						
							| 
									
										
										
										
											2010-12-29 22:13:15 -08:00
										 |  |  | 
 | 
					
						
							|  |  |  | 			len = 0; | 
					
						
							|  |  |  | 			/* Tag and transmit each segment individually. */ | 
							|  |  |  | 			do { | 
					
						
							|  |  |  | 				nskb = skb->next; | 
					
						
							|  |  |  | 				skb->next = NULL; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 				skb = __vlan_put_tag(skb, vlan_tx_tag_get(skb)); | 
					
						
							|  |  |  | 				if (likely(skb)) { | 
					
						
							|  |  |  | 					len += skb->len; | 
					
						
							|  |  |  | 					vlan_set_tci(skb, 0); | 
					
						
							|  |  |  | 					dev_queue_xmit(skb); | 
					
						
							|  |  |  | 				} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 				skb = nskb; | 
					
						
							|  |  |  | 			} while (skb); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 			return len; | 
					
						
							|  |  |  | 		} | 
					
						
							| 
									
										
										
										
											2010-12-30 12:28:10 -08:00
										 |  |  | 
 | 
					
						
							|  |  |  | tag: | 
					
						
							|  |  |  | 		skb = __vlan_put_tag(skb, vlan_tx_tag_get(skb)); | 
					
						
							|  |  |  | 		if (unlikely(!skb)) | 
					
						
							|  |  |  | 			return 0; | 
					
						
							|  |  |  | 		vlan_set_tci(skb, 0); | 
					
						
							| 
									
										
										
										
											2010-12-29 22:13:15 -08:00
										 |  |  | 	} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	len = skb->len; | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	dev_queue_xmit(skb); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 	return len; | 
					
						
							| 
									
										
										
										
											2011-08-26 23:34:40 -07:00
										 |  |  | 
 | 
					
						
							|  |  |  | error: | 
					
						
							|  |  |  | 	kfree_skb(skb); | 
					
						
							|  |  |  | 	vport_record_error(vport, VPORT_E_TX_DROPPED); | 
					
						
							|  |  |  | 	return 0; | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | /* Returns NULL if this device is not attached to a datapath. */ | 
					
						
							| 
									
										
										
										
											2010-07-14 19:27:18 -07:00
										 |  |  | struct vport *netdev_get_vport(struct net_device *dev) | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | { | 
					
						
							| 
									
										
										
										
											2011-09-21 12:41:52 -07:00
										 |  |  | #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,36)
 | 
					
						
							| 
									
										
										
										
											2010-12-08 19:28:32 -08:00
										 |  |  | #if IFF_BRIDGE_PORT != IFF_OVS_DATAPATH
 | 
					
						
							|  |  |  | 	if (likely(dev->priv_flags & IFF_OVS_DATAPATH)) | 
					
						
							|  |  |  | #else
 | 
					
						
							|  |  |  | 	if (likely(rcu_access_pointer(dev->rx_handler) == netdev_frame_hook)) | 
					
						
							|  |  |  | #endif
 | 
					
						
							| 
									
										
										
										
											2010-12-06 15:15:47 -08:00
										 |  |  | 		return (struct vport *)rcu_dereference_rtnl(dev->rx_handler_data); | 
					
						
							| 
									
										
										
										
											2010-12-08 19:28:32 -08:00
										 |  |  | 	else | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:10 +09:00
										 |  |  | 		return NULL; | 
					
						
							|  |  |  | #else
 | 
					
						
							| 
									
										
										
										
											2010-12-06 15:15:47 -08:00
										 |  |  | 	return (struct vport *)rcu_dereference_rtnl(dev->br_port); | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:10 +09:00
										 |  |  | #endif
 | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | } | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-11-23 22:08:27 -08:00
										 |  |  | const struct vport_ops netdev_vport_ops = { | 
					
						
							| 
									
										
										
										
											2011-08-18 10:35:40 -07:00
										 |  |  | 	.type		= OVS_VPORT_TYPE_NETDEV, | 
					
						
							| 
									
										
										
										
											2011-09-15 19:36:17 -07:00
										 |  |  | 	.flags		= VPORT_F_REQUIRED, | 
					
						
							| 
									
										
										
										
											2010-04-12 15:53:39 -04:00
										 |  |  | 	.init		= netdev_init, | 
					
						
							|  |  |  | 	.exit		= netdev_exit, | 
					
						
							|  |  |  | 	.create		= netdev_create, | 
					
						
							|  |  |  | 	.destroy	= netdev_destroy, | 
					
						
							|  |  |  | 	.set_addr	= netdev_set_addr, | 
					
						
							|  |  |  | 	.get_name	= netdev_get_name, | 
					
						
							|  |  |  | 	.get_addr	= netdev_get_addr, | 
					
						
							|  |  |  | 	.get_kobj	= netdev_get_kobj, | 
					
						
							|  |  |  | 	.get_dev_flags	= netdev_get_dev_flags, | 
					
						
							|  |  |  | 	.is_running	= netdev_is_running, | 
					
						
							|  |  |  | 	.get_operstate	= netdev_get_operstate, | 
					
						
							|  |  |  | 	.get_ifindex	= netdev_get_ifindex, | 
					
						
							|  |  |  | 	.get_mtu	= netdev_get_mtu, | 
					
						
							|  |  |  | 	.send		= netdev_send, | 
					
						
							|  |  |  | }; | 
					
						
							| 
									
										
										
										
											2010-06-02 15:35:15 -07:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:10 +09:00
										 |  |  | #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,36)
 | 
					
						
							| 
									
										
										
										
											2010-06-02 15:35:15 -07:00
										 |  |  | /*
 | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:10 +09:00
										 |  |  |  * In kernels earlier than 2.6.36, Open vSwitch cannot safely coexist with | 
					
						
							|  |  |  |  * the Linux bridge module, because there | 
					
						
							|  |  |  |  * is only a single bridge hook function and only a single br_port member | 
					
						
							|  |  |  |  * in struct net_device. | 
					
						
							| 
									
										
										
										
											2010-06-02 15:35:15 -07:00
										 |  |  |  * | 
					
						
							|  |  |  |  * Declaring and exporting this symbol enforces mutual exclusion.  The bridge | 
					
						
							|  |  |  |  * module also exports the same symbol, so the module loader will refuse to | 
					
						
							|  |  |  |  * load both modules at the same time (e.g. "bridge: exports duplicate symbol | 
					
						
							|  |  |  |  * br_should_route_hook (owned by openvswitch_mod)"). | 
					
						
							|  |  |  |  * | 
					
						
							|  |  |  |  * The use of "typeof" here avoids the need to track changes in the type of | 
					
						
							|  |  |  |  * br_should_route_hook over various kernel versions. | 
					
						
							|  |  |  |  */ | 
					
						
							|  |  |  | typeof(br_should_route_hook) br_should_route_hook; | 
					
						
							|  |  |  | EXPORT_SYMBOL(br_should_route_hook); | 
					
						
							| 
									
										
										
										
											2010-08-23 15:30:10 +09:00
										 |  |  | #endif
 |