2
0
mirror of https://github.com/openvswitch/ovs synced 2025-10-07 13:40:45 +00:00
Commit Graph

76 Commits

Author SHA1 Message Date
Ben Pfaff
092a872d6e datapath: Always null-terminate network device name in create_dp().
strncpy() does not null-terminate its output buffer if the source string's
length is at least as large as its 'count' argument.  We know that the
source and destination buffers are the same size and that the source buffer
is null-terminated, so just use strcpy().

This fixes a kernel BUG message that often occurred when strlen(devname)
was exactly IFNAMSIZ-1.  In such a case, if
internal_dev_port.devname[IFNAMSIZ-1] happened to be nonzero, it would
eventually fail the following check in alloc_netdev_mq():
	BUG_ON(strlen(name) >= sizeof(dev->name));

Bug #2722.
2010-04-27 10:43:24 -07:00
Ben Pfaff
1224e8fc1d datapath: Fix argument to strncpy_from_user().
The strncpy_from_user() function's 'count' argument is documented to
include the trailing null byte, but create_dp() did not include it.  This
commit adds it in.
2010-04-27 10:21:12 -07:00
Jesse Gross
2d2a0e309f datapath: Add skb_csum_help compatibility function.
Later kernel versions remove the direction argument from
skb_checksum_help.  This provides a compatibility function so we
can have consistent syntax across versions.

Since CHECKSUM_PARTIAL is the same as CHECKSUM_HW on older kernels
this allows a unified code path for computing checksums.
2010-04-19 09:11:57 -04:00
Jesse Gross
8d5ebd839b datapath: Genericize hash table.
Currently the flow hash table assumes that it is storing flows.
However, we will need additional types of hash tables in the
future so remove assumptions about flows and convert the datapath
to use the new table.
2010-04-19 09:11:57 -04:00
Jesse Gross
f2459fe7d9 datapath: Add generic virtual port layer.
Currently the datapath directly accesses devices through their
Linux functions.  Obviously this doesn't work for virtual devices
that are not backed by an actual Linux device.  This creates a
new virtual port layer which handles all interaction with devices.

The existing support for Linux devices was then implemented on top
of this layer as two device types.  It splits out and renames dp_dev
to internal_dev.  There were several places where datapath devices
had to handled in a special manner and this cleans that up by putting
all the special casing in a single location.
2010-04-19 09:11:57 -04:00
Jesse Gross
659586efcf tunneling: Add support for tunnel ID.
Add a tun_id field which contains the ID of the encapsulating tunnel
on which a packet was received (0 if not received on a tunnel).  Also
add an action which allows the tunnel ID to be set for outgoing
packets.  At this point there aren't any tunnel implementations so
these fields don't have any effect.

The matching is exposed to OpenFlow by overloading the high 32 bits
of the cookie as the tunnel ID.  ovs-ofctl is capable of turning
on this special behavior using a new "tun-cookie" command but this
command is intentially undocumented to avoid it being used without
a full understanding of the consequences.
2010-04-19 09:11:51 -04:00
Jesse Gross
3c5f6de385 datapath: Validate ToS when flow is added.
Check that the ToS is valid when the flow is added, not every time
it is used.
2010-03-15 15:44:41 -04:00
Jesse Gross
2de320799d datapath: Disable large receive offload.
LRO can play fast and loose with the packets that it merges, which
isn't very polite when you are bridging packets for other operating
systems.  This disables LRO on any underlying devices that are added
to the datapath, which is the same as what the bridge does.

Note that this does not disable GRO, which has a more strict set of
rules about what is merged and is therefore safe for bridging.  Both
are typically done in software anyways.
2010-03-05 16:31:26 -05:00
Jesse Gross
635c9298b9 datapath: Update hardware computed checksum on VLAN change.
The checksum computed by hardware on receive stored in skb->csum
when skb->ip_summed == CHECKSUM_COMPLETE is supposed to reflect
the contents of the packet starting at skb->data, which includes
the VLAN tag if there is one.  However, when we manipulate the
VLAN tag we don't update the checksum.  This leads to all kinds
of nasty warnings about broken hardware, not to mention we can't
take advantage of the checksum that was already computed.

This also fixes some issues with our private checksum type value
on some different kernels and after GSO.
2010-03-05 15:22:17 -05:00
Jesse Gross
a063b0dff0 datapath: Consistently maintain checksum offloading state.
When adding a VLAN tag it is necessary for us to setup checksum
pointers for offloaded packets manually.  However, this process
clobbers some of the fields that other components need to determine
the current status.  Here we mark the packet with its status upon
ingress in our own format that does not get clobbered and is
consistent across kernel versions.

Bug #2436
2010-02-28 15:45:00 -05:00
Justin Pettit
834377ea55 ofproto: Match on IP ToS/DSCP bits (OpenFlow 1.0)
OpenFlow 1.0 adds support for matching on IP ToS/DSCP bits.

NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
2010-02-20 02:22:28 -08:00
Justin Pettit
959a2ecdc8 ofproto: Match VLAN PCP and rewrite ToS bits (OpenFlow 0.9)
Starting in OpenFlow 0.9, it is possible to match on the VLAN PCP
(priority) field and rewrite the IP ToS/DSCP bits.  This check-in
provides that support and bumps the wire protocol number to 0x98.

NOTE: The wire changes come together over the set of OpenFlow 0.9 commits,
so OVS will not be OpenFlow-compatible with any official release between
this commit and the one that completes the set.
2010-02-20 02:22:26 -08:00
Ben Pfaff
c69ee87c10 Merge "master" into "next".
The main change here is the need to update all of the uses of UNUSED in
the next branch to OVS_UNUSED as it is now spelled on "master".
2010-02-11 11:11:23 -08:00
Ben Pfaff
35f7605b3f datapath: Mark functions "static".
Found by sparse (http://sparse.wiki.kernel.org/).
2010-02-10 16:54:48 -08:00
Ben Pfaff
2175540274 datapath: When adding a port, return the new port number to userspace.
'port' is a kernel-space copy of the odp_port and modifying it is useless.
'portp' is the userspace copy; modifying it is useful.

None of our current userspace users care about the port number and so we
never noticed.

Found by sparse (http://sparse.wiki.kernel.org/).
2010-02-10 16:54:48 -08:00
Jesse Gross
7dab847a19 Fix some regressions from the merge from master. 2010-02-08 13:31:33 -05:00
Justin Pettit
a4af00400a Merge branch 'master' into next
Conflicts:
	COPYING
	datapath/datapath.h
	lib/automake.mk
	lib/dpif-provider.h
	lib/dpif.c
	lib/hmap.h
	lib/netdev-provider.h
	lib/netdev.c
	lib/stream-ssl.h
	ofproto/executer.c
	ofproto/ofproto.c
	ofproto/ofproto.h
	tests/automake.mk
	utilities/ovs-ofctl.c
	utilities/ovs-vsctl.in
	vswitchd/ovs-vswitchd.conf.5.in
	xenserver/etc_init.d_vswitch
	xenserver/etc_xensource_scripts_vif
	xenserver/opt_xensource_libexec_interface-reconfigure
2010-02-05 17:14:55 -08:00
Jesse Gross
a778696338 datapath: Set datapath device MTU to minimum of MTU of ports.
The MTU of the local port should be no larger than the minimum of
the MTUs of the ports attached to the bridge, overwise packets may be
dropped.  We already prevent changes to the MTU that would violate
this constraint but don't actuallly proactively set the MTU.  This
changes makes everything consistent and matches the behavior of
the bridge.
2010-02-02 14:56:40 -05:00
Jesse Gross
8cdaca99b5 datapath: Fix compilation on newer old-style Xen kernels.
Some ports of Xen (such as Debian Lenny's) use the old style
Xen checksumming fields on newer kernels.  Normally the code that
deals with those fields isn't used at all on newer kernels.  This
updates the checksumming pointer code with some changes from Lenny
Xen since it is cleaner and works well with our existing compatibility
layer.

CC:pspreadborough@comcast.net
2010-01-26 18:09:39 -05:00
Jesse Gross
a605732386 datapath: Handle packets with precomputed checksums.
On older kernels (< 2.6.19) CHECKSUM_HW can mean either that the
checksum has already been computed by hardware or that the checksum
needs to be computed by hardware, depending on whether we are on
the transmit or receive path.  Unfortunately since we are in the
middle of these two paths it is impossible to tell which is the
case.  Code after us assumes that CHECKSUM_HW means that the
checksum needs to be computed and will panic if there already is
a checksum.  On these kernels we mark these packets as CHECKSUM_NONE
before handing them off.

Without this change using certain NICs will cause panics.
2010-01-26 17:17:21 -05:00
Ben Pfaff
49c36903d6 Merge "sflow" into "master".
No conflicts, but lib/dpif.c needed a few changes since struct dpif's
member "class" was renamed to "dpif_class" in master since sflow was
branched off.
2010-01-25 10:52:28 -08:00
Ben Pfaff
53d3bbbc09 datapath: Clean up vswitch_skb_checksum_setup().
vswitch_skb_checksum_setup() can be defined in datapath.h as a no-op
when defined(CONFIG_XEN) && defined(HAVE_PROTO_DATA_VALID) is false.

Also, skb_checksum_setup(), which was defined similarly, can be dropped
now, since it was unused.
2010-01-25 10:24:38 -08:00
Ben Pfaff
56fd8edf80 sflow: Fix sFlow sampling structure.
According to Neil McKee, in an email archived at
http://openvswitch.org/pipermail/dev_openvswitch.org/2010-January/000934.html:

    The containment rule is that a given sflow-datasource (sampler or
    poller) should be scoped within only one sflow-agent (or
    sub-agent).  So the issue arrises when you have two
    switches/datapaths defined on the same host being managed with
    the same IP address: each switch is a separate sub-agent, so they
    can run independently (e.g. with their own sequence numbers) but
    they can't both claim to speak for the same sflow-datasource.
    Specifically, they can't both represent the <ifindex>:0
    data-source.  This containment rule is necessary so that the
    sFlow collector can scale and combine the results accurately.

    One option would be to stick with the <ifindex>:0 data-source but
    elevate it to be global across all bridges, with a global
    sample_pool and a global sflow_agent.  Not tempting.  Better to
    go the other way and allow each interface to have it's own
    sampler, just as it already has it's own poller.  The ifIndex
    numbers are globally unique across all switches/datapaths on the
    host, so the containment is now clean.  Datasource <ifindex>:5
    might be on one switch, whille <ifindex>:7 can be on another.
    Other benefits are that 1) you can support the option of
    overriding the default sampling-rate on an interface-by-interface
    basis, and 2) this is how most sFlow implementations are coded,
    so there will be no surprises or interoperability issues with any
    sFlow collectors out there.

This commit implements the approach suggested by Neil.

This commit uses an atomic_t to represent the sampling pool.  This is
because we do want access to it to be atomic, but we expect that it will
"mostly" be accessed from a single CPU at a time.  Perhaps this is a bad
assumption; we can always switch to another form of synchronization later.

CC: Neil McKee <neil.mckee@inmon.com>
2010-01-20 14:33:28 -08:00
Ben Pfaff
72b0630028 Initial implementation of sFlow.
Tested very slightly with "ping" and "sflowtool -t | tcpdump -r -".
2010-01-04 13:08:37 -08:00
Ian Campbell
f7fed00035 datapath: Use HAVE_PROTO_DATA_VALID when defining vswitch_skb_checksum_setup
The purpose of the non-empty variant of vswitch_skb_checksum_setup is to
synchronise the proto_data_valid and proto_csum_blank fields into the
standard skb csum/ip_summed fields, therefore it is more correct to key
off of HAVE_PROTO_DATA_VALID.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
2009-11-19 10:25:57 -08:00
Justin Pettit
de3f65ea52 datapath: Cleanup tab/space issues in datapath 2009-11-16 18:48:29 -08:00
Jesse Gross
d65349ea28 Merge citrix branch into master. 2009-11-10 15:12:01 -08:00
Jesse Gross
18fdbe16de datapath: Allow TCP flags to be cleared.
When querying flow stats allow the TCP flags to be reset.  Since
the datapath ORs together all flags that have previously been
seen it is otherwise impossible to determine the set of flags from
after a particular time.
2009-11-06 14:05:14 -08:00
Jesse Gross
6d91c2fb6c datapath: Allow TCP flags to be cleared.
When querying flow stats allow the TCP flags to be reset.  Since
the datapath ORs together all flags that have previously been
seen it is otherwise impossible to determine the set of flags from
after a particular time.
2009-11-02 16:12:21 -08:00
Ben Pfaff
d6fbec6de0 Spell verb form of "set up" correctly throughout the tree. 2009-10-26 14:41:32 -07:00
Ben Pfaff
3f355f47f8 Merge "citrix" into "master".
This merge took a little bit of care due to two issues:

    - Crossport of "interface-reconfigure" fixes from master back to
      citrix that had happened and needed to be canceled out of the merge.

    - New script "refresh-xs-network-uuids" added on citrix branch that
      needed to be moved from /root/vswitch/scripts to
      /usr/share/vswitch/scripts.
2009-10-22 17:43:28 -07:00
Ben Pfaff
5562d3ccd2 datapath: Ignore return value from rtnl_notify().
In Linux 2.6.30, the rtnl_notify() return type was changed from int to
void along with the following commit message:

    This patch also modifies the rtnetlink code to ignore the return
    value of rtnl_notify() in all callers. The function rtnl_notify()
    (before this patch) returned the error of the unicast notification
    which makes rtnl_set_sk_err() reports errors to all listeners. This
    is not of any help since the origin of the change (the socket that
    requested the echoing) notices the ENOBUFS error if the notification
    fails and should resync itself.

Thus there's no point in checking the return value, even in older versions
of the kernel, and so this commit changes our code to ignore it, even
on older kernel versions.  We also update the rtnl_notify() wrapper macros
to make the return type void on older kernel versions.

This has not been tested, just built.

Thanks to Mikio for spurring me to try building with Linux 2.6.29 and
2.6.30.
2009-10-15 10:24:36 -07:00
Ben Pfaff
1b378b99f6 datapath: Fix warning on 64-bit builds. 2009-10-15 10:24:36 -07:00
Ben Pfaff
7c40efc9d3 datapath: Factor out code for getting and setting listen mask.
This fixes GCC warnings on 64-bit architectures caused by storing an "int"
in the "void *" f->private_data field.
2009-10-15 10:24:36 -07:00
Jean Tourrilhes
a4fbb689b0 datapath: Fix validation of ODPAT_SET_VLAN_PCP actions.
The VLAN PCP mask is in the rightmost bits of the vlan_pcp member but we
were checking for it in its position in the VLAN tag field instead.

Slightly modified from Jean's original patch by adding and using the
VLAN_PCP_SHIFT macro.
2009-10-08 10:41:48 -07:00
Ben Pfaff
22d24ebf66 datapath: Fix mutual exclusion with bridge on Linux 2.6.27+.
Linux 2.6.27 introduces a new mechanism for sharing STP packets among
kernel modules, which means that the code in datapath.c to avoid loading
when the Linux bridging module is also loaded has false positives.  So
fall back on these newer kernels to a less reliable way of avoiding the
bridge module, but one that does not have false positives.

CC: Jean Tourrihles <jt@hpl.hp.com>
2009-09-15 15:51:01 -07:00
Ben Pfaff
cb5087cadd datapath: Fix WARN_ON sending GSO packets to userspace in Linux 2.6.22+.
Until now, when dp_output_control() queued a GSO packet to userspace, it
would first compute the checksum for the whole GSO packet, then break the
packet into segments.  However this had two drawbacks:

    1. The checksum had to be recomputed for each segment, wasting time.
    2. Linux 2.6.22 and later would emit a warning in skb_gso_segment()
       because the checksum was precomputed.

This commit changes dp_output_control() to instead break the packet into
segments, then compute the checksum across each of the segments
individually.  This fixes both drawbacks.

This commit has seen light testing on Xen's 2.6.27.  It has been build
tested on a few different kernel versions.
2009-09-14 12:20:00 -07:00
Ben Pfaff
0d3b8a34d6 datapath: Fix comments. 2009-09-14 12:20:00 -07:00
Justin Pettit
8398cf7efc Merge commit 'origin/citrix' 2009-09-04 22:23:29 -07:00
Justin Pettit
a393b897f2 datapath: Don't drop MTU-sized VLAN packets from userspace
Before transimitting a packet, the datapath checks that the packet
length is not greater than the MTU.  It determines the length based on
the 'protocol' field in the skb.  If 'protocol' is ETH_P_8021Q, it reduces
the packet length as stored in the 'len' field by four bytes, which
is the size of a VLAN tag header.  Unfortunately, packets that arrived
from userspace were not having the 'protocol' field set, which would
cause MTU-sized packets to be dropped.  This commit sets the 'protocol'
field appropriately.

Thanks to Ben Pfaff for the help diagnosing this issue.

NIC-17 and NIC-26
2009-09-04 22:19:15 -07:00
Ben Pfaff
f1acd62b54 Merge citrix branch into master. 2009-09-02 10:14:53 -07:00
Ben Pfaff
6fa58f7a15 datapath: Use hash table more tolerant of collisions for flow table.
The hash table used until now in the kernel datapath for storing the flow
table provides only two slots that a given flow can occupy.  If both of
those slots are already full, for a given flow, then that flow cannot be
added at all and its packets must be handled entirely in userspace, taking
a performance hit.  The code does attempt to compensate for this by making
the flow table rather large: 8 slots per flow actually in the flow table.
In practice, this is usually good enough, but some of the tests that we
have run show bad enough performance degradation or even timeouts of
various kinds that we want to implement something better.

This commit replaces the existing hash table by one with a completely
different design in which buckets are flexibly sized and can accept any
number of collisions.  By use of suitable levels of indirection, this
design is both simple and RCU-compatible.  I did consider other schemes,
but none of the ones that I came up with shared both of those two
properties.

This commit also adds kerneldoc comments for all of the flow table
non-static functions and data structures.

This has been lightly tested for correctness.  It has not been tested for
performance.

Bug #1656.  Bug #1851.
2009-09-01 10:36:42 -07:00
Ben Pfaff
2c7807ac4f datapath: Remove WARN_ON_ONCE(1) now that this code has been exercised.
The code on one side of this #if fork was difficult to test until Xen
upgraded to a new enough kernel that it would exercise it.  Later Xen
kernels are now available and this code path has been tested, at least to
some extent, so remove the warning.

Thanks to Ian Campbell <Ian.Campbell@citrix.com> for pointing out the
warning.
2009-09-01 10:12:12 -07:00
Justin Pettit
3c71830aef dpif: Address portability issues in dpif-netdev
There were a number of Linux assumptions in dpif-netdev that were not
necessary.  This commit cleans those up to aid portability.
2009-08-25 14:12:01 -07:00
Justin Pettit
00908dc27a Merge commit 'origin/citrix' 2009-08-25 13:23:11 -07:00
Justin Pettit
4f0b85d66b datapath: Return EFBIG instead of EXFULL when no room in flow table
The EXFULL errno is only defined in Linux.  While this datapath is
Linux-specific, the userspace that interacts with it is not.
2009-08-25 13:17:26 -07:00
Ben Pfaff
1d87357a13 Merge citrix into master. 2009-08-19 16:08:18 -07:00
Ben Pfaff
8fef8c7121 Merge citrix into master.
This was a somewhat difficult merge since there was a fair amount of
superficially divergent development on the two branches, especially in the
datapath.

This has been build-tested against XenServer 5.5.0 and XenServer 5.7.0
build 15122.  It has been booted and connected to XenCenter on 5.5.0.

The merge revealed a couple of outstanding bugs, which will be fixed on
citrix and then merged back into master.
2009-08-19 13:03:46 -07:00
Ian Campbell
b2f460c72d datapath: Only call skb_checksum_setup on 2.6.18 && Xen.
For newer kernels the checksum setup is done at the point the skb is
received in netback or netfront so there is no more need to sprinkle
skb_checksum_setup calls throughout the kernel.
2009-08-18 16:09:32 -07:00
Ben Pfaff
b0c32774c9 datapath: Improve comments. 2009-08-18 12:36:46 -07:00