mirror of https://github.com/openvswitch/ovs synced 2025-10-17 14:28:02 +00:00
45 Commits

Author SHA1 Message Date
Jesse Gross
e7d737d175 datapath: Try to avoid custom checksum update function.
Our update_csum() function was exactly the same as
inet_proto_csum_replace4() with the one exception that it uses our
checksum status fields on older kernels that need it.  Unfortunately,
we can't completely move the code to the compat directory because it
relies on fields in OVS CB but we can at least exile it to checksum.h.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2010-12-09 17:43:36 -08:00
Jesse Gross
8b69563f97 datapath: Correctly update IP checksum with actions.
The update_csum() function that we currently use to update
checksums on actions is really intended for L4 checksums.  In
particular, if the packet has a partial checksum and the field
is not in the pseudo header, it doesn't do anything at all.
This doesn't make sense for the IP header because Linux doesn't
use hardware offload for it, so we always need to recompute the
checksum.  Instead, we can use the kernel function csum_replace4(),
which will always do the right thing.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2010-12-09 17:43:36 -08:00
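The incremental update that commit 8b69563f97 relies on can be illustrated in userspace.  The sketch below is not the kernel's csum_replace4(); the names fold16(), full_csum() and replace4() are hypothetical, and the arithmetic follows the RFC 1624 formula HC' = ~(~HC + ~m + m').  It assumes fields are passed in host byte order and that the header has an even length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full Internet checksum over 'len' bytes, with the checksum field
 * itself expected to be zero in 'data'.  Sketch only: even lengths. */
uint16_t full_csum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += (uint32_t)(data[i] << 8 | data[i + 1]);
    return (uint16_t)~fold16(sum);
}

/* Update 'check' after one 32-bit field changes from 'from' to 'to',
 * without touching the rest of the header (RFC 1624, equation 3). */
uint16_t replace4(uint16_t check, uint32_t from, uint32_t to)
{
    uint32_t sum = (uint16_t)~check;

    sum += (uint16_t)~(from >> 16);
    sum += (uint16_t)~(from & 0xffff);
    sum += to >> 16;
    sum += to & 0xffff;
    return (uint16_t)~fold16(sum);
}
```

Because the IP header checksum is never left to hardware, an update of this kind has to run unconditionally, which is the behavior the commit message asks for.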
Jesse Gross
dd8d6b8cd4 datapath: Consolidate checksum compatibility code.
Checksum offloading has changed quite a bit across different kernel
and Xen versions.  Since it is part of the skb data structure it is
unfortunately difficult to separate out into compatibility code.
This consolidates all of the checksum code in one place which makes
it easier to read and remove as we prepare for upstreaming.  On newer
kernels it also puts everything in inline functions, eliminating the
need to run through the compat code or make extra function calls.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2010-12-03 17:40:26 -08:00
Ben Pfaff
e779d8d90d datapath: Merge "struct dp_port" into "struct vport".
After the previous commit, which changed the datapath to always create and
attach a vport at the same time, and to always detach and delete a vport
at the same time, there is no longer any real distinction between a dp_port
and a vport.  This commit, therefore, merges the two together to simplify
code.  It might even improve performance, although I have not checked.

I wasn't sure at first whether the merged structure should be "struct
dp_port" or "struct vport".  I went with the latter since the "v" prefix
sounds cool.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 14:43:38 -08:00
Ben Pfaff
27bcf966b4 datapath: Simplify ODPAT_SET_DL_TCI action.
There's no need to have a mask in this action, because both parts of the
TCI are part of the flow structure.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-10-18 11:18:23 -07:00
Ben Pfaff
7956695a6c datapath: Always use GFP_ATOMIC to execute actions.
These functions run 99% of the time in atomic context and the benefit of
passing along the 'gfp' argument for the other 1% doesn't seem to outweigh
the cost.

Suggested-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2010-10-11 13:31:43 -07:00
Ben Pfaff
26233bb461 datapath: Combine dl_vlan and dl_vlan_pcp.
This allows eliminating padding from odp_flow_key, although actually doing
that is postponed until the next commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-10-11 13:31:43 -07:00
Ben Pfaff
f1588b1fa1 datapath: Remove implementation of port groups.
The "port group" concept seems like a good one, but it has not been
used very much in userspace so far, so before we commit ourselves to
a frozen API that we must maintain forever, remove it.  We can always
add it back in later as a new kind of vport.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2010-10-11 12:40:11 -07:00
Joe Perches
d295e8e97a treewide: Remove trailing whitespace
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2010-08-30 13:23:08 -07:00
Ben Pfaff
ca78c6b69c datapath: Avoid accesses past the end of skbuff data in actions.
Some of the flow actions that modify skbuff data did not check that the
skbuff was long enough before doing so.  This commit fixes that problem.

Previously, the strategy for avoiding this was to only indicate the layer-3
nw_proto field in the flow if the corresponding layer-4 header was fully
present, so that if, for example, nw_proto was IPPROTO_TCP, this meant
that a TCP header was present.  The original motivation for this patch was
to add corresponding code to only indicate a layer-2 dl_type if the
corresponding layer-3 header was fully present.  But I'm now convinced that
this approach is conceptually wrong, because the meaning of a layer-N
header should not be affected by the meaning of a layer-(N+1) header.

This commit switches to a new approach.  Now, when a header is missing, its
fields in the flow are simply zeroed and have no effect on the "type" field
for the outer header.  Responsibility for ensuring that a header is fully
present is now shifted to the actions that wish to modify that header.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2010-08-27 12:42:39 -07:00
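The approach in commit ca78c6b69c, where each action verifies that the header it modifies is fully present, might look roughly like the following.  This is a userspace sketch under stated assumptions: struct pkt is a hypothetical stand-in for an skbuff, set_tcp_src() is an invented name, and the checksum update that a real action would also need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_HDR_MIN 20  /* minimum TCP header length in bytes */

/* Hypothetical stand-in for an skbuff: a byte buffer plus the offset
 * at which the layer-4 header starts. */
struct pkt {
    uint8_t *data;
    size_t len;
    size_t l4_ofs;
};

/* Rewrite the TCP source port.  Returns false and leaves the packet
 * untouched if a full TCP header is not present in the buffer. */
bool set_tcp_src(struct pkt *p, uint16_t port)
{
    assert(p && p->data);
    if (p->l4_ofs > p->len || p->len - p->l4_ofs < TCP_HDR_MIN)
        return false;

    p->data[p->l4_ofs] = (uint8_t)(port >> 8);
    p->data[p->l4_ofs + 1] = (uint8_t)(port & 0xff);
    return true;
}
```

The length test lives in the action rather than in flow extraction, so a truncated packet still matches on its outer headers and only the modification is skipped.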
Ben Pfaff
401eeb92d3 Add Nicira extension to OpenFlow for dropping spoofed ARP packets.
"ARP spoofing" is when a host claims an incorrect association between an
IP address and a MAC address for deceptive purposes.  OpenFlow by itself
can prevent a host from sending out ARP replies from an incorrect MAC
address in the Ethernet L2 header, but it cannot control the MAC addresses
inside the ARP L3 packet.  This commit adds a new action that can be used
to drop these spoofed packets.

CC: Paul Ingram <paul@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2010-08-26 10:56:20 -07:00
Jesse Gross
67c74f7515 datapath: Remove redundant checks on SKBs.
On vport ingress we already check for shared SKBs but then later
warn in several other places.  In a similar vein, we check every
packet to see if it is LRO but only certain vports can produce
these packets.  Remove and consolidate checks to the places where
they are needed.
2010-07-30 13:35:20 -07:00
Jesse Gross
aebdcb93e0 datapath: Don't update flow key when applying actions.
Currently the flow key is updated to match an action that is applied
to a packet but these fields are never looked at again.  Not only is
this a waste of time, it also makes optimizations involving caching
the flow key more difficult.
2010-07-15 15:09:08 -07:00
Jesse Gross
1cb8b59f8c datapath: Don't set tunnel_id in a function.
We don't need a function to set a variable.  In practice it will
almost certainly get inlined but this makes it easier to read.
2010-07-15 15:09:08 -07:00
Jesse Gross
fceb2a5bb2 datapath: Put return type on same line as arguments for functions.
In some places we would put the return type on the same line as
the rest of the function definition and other places we wouldn't.
Reformat everything to match kernel style.
2010-07-15 15:09:08 -07:00
Jesse Gross
38aeef15d8 datapath: Move over-MTU checking into vport_send().
We currently check for packets that are over the MTU in do_output().
There is a one-to-one correlation between this function and
vport_send() so move the MTU check there for consistency with
other error checking.
2010-07-13 14:10:18 -07:00
Jesse Gross
b28c72ba7c datapath: Check vswitch_skb_checksum_setup() return code.
If vswitch_skb_checksum_setup() returns an error, the checksum
pointers probably haven't been set correctly which could cause
a crash later.  We should give up immediately on error.
2010-06-18 11:38:10 -07:00
Ben Pfaff
c1c9c9c4b6 Implement QoS framework.
ovs-vswitchd doesn't declare its QoS capabilities in the database yet,
so the controller has to know what they are.  We can add that later.

The linux-htb QoS class has been tested to the extent that I can see that
it sets up the queues I expect when I run "tc qdisc show" and "tc class
show".  I haven't tested that the effects on flows are what we expect them
to be.  I am sure that there will be problems in that area that we will
have to fix.
2010-06-17 15:04:12 -07:00
Jesse Gross
ff6402a9f0 datapath: Add function to copy skb checksum bits.
Some kernels don't copy the checksum offload state in the skb
header when doing different types of copies.  Xen adds even more
fields, which are also not consistently copied.  The result is
uninitialized memory and random outcomes.  This adds a function to
consistently copy these bits across all kernel versions.
2010-04-19 09:11:57 -04:00
Jesse Gross
f2459fe7d9 datapath: Add generic virtual port layer.
Currently the datapath directly accesses devices through their
Linux functions.  Obviously this doesn't work for virtual devices
that are not backed by an actual Linux device.  This creates a
new virtual port layer which handles all interaction with devices.

The existing support for Linux devices was then implemented on top
of this layer as two device types.  It splits out and renames dp_dev
to internal_dev.  There were several places where datapath devices
had to be handled in a special manner and this cleans that up by putting
all the special casing in a single location.
2010-04-19 09:11:57 -04:00
Jesse Gross
659586efcf tunneling: Add support for tunnel ID.
Add a tun_id field which contains the ID of the encapsulating tunnel
on which a packet was received (0 if not received on a tunnel).  Also
add an action which allows the tunnel ID to be set for outgoing
packets.  At this point there aren't any tunnel implementations so
these fields don't have any effect.

The matching is exposed to OpenFlow by overloading the high 32 bits
of the cookie as the tunnel ID.  ovs-ofctl is capable of turning
on this special behavior using a new "tun-cookie" command but this
command is intentionally undocumented to avoid it being used without
a full understanding of the consequences.
2010-04-19 09:11:51 -04:00
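The cookie overloading described in commit 659586efcf amounts to splitting a 64-bit value into two 32-bit halves.  A minimal sketch follows, with hypothetical helper names (cookie_pack(), cookie_tun_id()) and host byte order assumed; the actual OpenFlow wire handling is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 64-bit cookie whose high 32 bits carry the tunnel ID.  Only
 * 32 bits remain for the ordinary cookie value in this mode. */
uint64_t cookie_pack(uint32_t tun_id, uint64_t cookie_low)
{
    assert(cookie_low <= UINT32_MAX);
    return ((uint64_t)tun_id << 32) | cookie_low;
}

/* Extract the tunnel ID; 0 means "not received on a tunnel". */
uint32_t cookie_tun_id(uint64_t cookie)
{
    return (uint32_t)(cookie >> 32);
}
```

One consequence, and a plausible reason the command was left undocumented, is that a controller that already uses all 64 cookie bits would have its upper half reinterpreted as a tunnel ID.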
Jesse Gross
11cdf5e612 datapath: Consistently maintain flow key.
After executing an action that changes a packet sometimes we update
the flow key and sometimes we don't.  This is potentially problematic
because we sometimes use the key for checks later on.  This consistently
maintains the key.
2010-03-15 15:44:41 -04:00
Jesse Gross
3c5f6de385 datapath: Validate ToS when flow is added.
Check that the ToS is valid when the flow is added, not every time
it is used.
2010-03-15 15:44:41 -04:00
Jesse Gross
635c9298b9 datapath: Update hardware computed checksum on VLAN change.
The checksum computed by hardware on receive stored in skb->csum
when skb->ip_summed == CHECKSUM_COMPLETE is supposed to reflect
the contents of the packet starting at skb->data, which includes
the VLAN tag if there is one.  However, when we manipulate the
VLAN tag we don't update the checksum.  This leads to all kinds
of nasty warnings about broken hardware, not to mention we can't
take advantage of the checksum that was already computed.

This also fixes some issues with our private checksum type value
on some different kernels and after GSO.
2010-03-05 15:22:17 -05:00
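The requirement in commit 635c9298b9, that a CHECKSUM_COMPLETE sum must track the bytes starting at skb->data, can be demonstrated with a plain ones' complement sum.  The sketch below is userspace code with a hypothetical csum16() helper; it assumes the 4-byte tag is inserted at an even offset, so its sum can simply be added to the existing one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement 16-bit sum of 'len' bytes, continuing from 'init'.
 * Sketch only: even lengths, big-endian 16-bit words. */
uint16_t csum16(const uint8_t *data, size_t len, uint16_t init)
{
    uint32_t sum = init;

    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += (uint32_t)(data[i] << 8 | data[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

With this helper, the sum for a tagged frame equals csum16(tag, 4, old_sum), where old_sum covers the untagged frame.  Removing a tag would need the inverse operation, which this sketch does not cover.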
Jesse Gross
a063b0dff0 datapath: Consistently maintain checksum offloading state.
When adding a VLAN tag it is necessary for us to set up checksum
pointers for offloaded packets manually.  However, this process
clobbers some of the fields that other components need to determine
the current status.  Here we mark the packet with its status upon
ingress in our own format that does not get clobbered and is
consistent across kernel versions.

Bug #2436
2010-02-28 15:45:00 -05:00
Ben Pfaff
02dd3123a0 Merge "master" into "next". 2010-02-24 13:47:09 -08:00
Ben Pfaff
f119330116 datapath: Set the correct bits for OFPAT_SET_NW_TOS action.
The DSCP bits are the high bits, not the low bits.

Reported-by: Jean Tourrilhes <jt@hpl.hp.com>
2010-02-20 02:22:30 -08:00
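The fix in commit f119330116 comes down to which mask is applied to the IPv4 ToS byte.  A minimal sketch, assuming hypothetical names (DSCP_MASK, ECN_MASK, set_dscp()) and a caller that has already validated the new value:

```c
#include <assert.h>
#include <stdint.h>

#define DSCP_MASK 0xfc  /* DSCP: upper six bits of the ToS byte */
#define ECN_MASK  0x03  /* ECN: lower two bits, left untouched */

/* Replace the DSCP bits of 'old_tos' with those of 'new_tos' while
 * preserving the ECN bits already in the packet. */
uint8_t set_dscp(uint8_t old_tos, uint8_t new_tos)
{
    assert((new_tos & ECN_MASK) == 0);
    return (uint8_t)((old_tos & ECN_MASK) | (new_tos & DSCP_MASK));
}
```

For example, applying DSCP value 0xb8 to a packet whose ToS byte is 0x03 yields 0xbb: the two ECN bits survive and only the high six bits change.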
Justin Pettit
959a2ecdc8 ofproto: Match VLAN PCP and rewrite ToS bits (OpenFlow 0.9)
Starting in OpenFlow 0.9, it is possible to match on the VLAN PCP
(priority) field and rewrite the IP ToS/DSCP bits.  This check-in
provides that support and bumps the wire protocol number to 0x98.

NOTE: The wire changes come together over the set of OpenFlow 0.9 commits,
so OVS will not be OpenFlow-compatible with any official release between
this commit and the one that completes the set.
2010-02-20 02:22:26 -08:00
Ben Pfaff
d42c4f8dc1 Use VLAN_PCP_SHIFT consistently, instead of open-coding "13".
Reported-by: Jesse Gross <jesse@nicira.com>
2010-02-12 13:56:15 -08:00
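The shift that commit d42c4f8dc1 names is the position of the priority field within the 16-bit VLAN TCI.  The sketch below shows the packing in host byte order; VLAN_PCP_SHIFT comes from the commit title, while the two masks and the helper names are assumptions made for illustration, and the CFI bit is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define VLAN_VID_MASK  0x0fff  /* low 12 bits: VLAN ID */
#define VLAN_PCP_SHIFT 13      /* priority sits in the top 3 bits */
#define VLAN_PCP_MASK  0xe000

/* Combine a VLAN ID and a priority into one TCI value. */
uint16_t tci_pack(uint16_t vid, uint8_t pcp)
{
    assert(vid <= VLAN_VID_MASK && pcp <= 7);
    return (uint16_t)((pcp << VLAN_PCP_SHIFT) | vid);
}

uint16_t tci_vid(uint16_t tci)
{
    return tci & VLAN_VID_MASK;
}

uint8_t tci_pcp(uint16_t tci)
{
    return (uint8_t)((tci & VLAN_PCP_MASK) >> VLAN_PCP_SHIFT);
}
```

Packing VLAN 100 with priority 5 gives 0xa064, and both fields can be recovered from that single value.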
Justin Pettit
a4af00400a Merge branch 'master' into next
Conflicts:
	COPYING
	datapath/datapath.h
	lib/automake.mk
	lib/dpif-provider.h
	lib/dpif.c
	lib/hmap.h
	lib/netdev-provider.h
	lib/netdev.c
	lib/stream-ssl.h
	ofproto/executer.c
	ofproto/ofproto.c
	ofproto/ofproto.h
	tests/automake.mk
	utilities/ovs-ofctl.c
	utilities/ovs-vsctl.in
	vswitchd/ovs-vswitchd.conf.5.in
	xenserver/etc_init.d_vswitch
	xenserver/etc_xensource_scripts_vif
	xenserver/opt_xensource_libexec_interface-reconfigure
2010-02-05 17:14:55 -08:00
Jesse Gross
1cdb82b9f5 datapath: Support CHECKSUM_PARTIAL on older kernels.
On older kernels we would not correctly update partial checksums
because it was difficult to determine the type of checksum.  This
uses some hints to infer the correct type of checksum so that it
can be updated.  It also allows us to correctly define
CHECKSUM_PARTIAL, which is important for other components.
2010-01-26 18:09:34 -05:00
Jesse Gross
00ba5f3b15 datapath: Transport port is not part of pseudoheader.
While updating the checksum after changing the transport port,
we indicated that this was a change to the pseudoheader.  Since
this is not the case, it could produce an incorrect checksum.
2010-01-26 17:18:24 -05:00
Jesse Gross
a605732386 datapath: Handle packets with precomputed checksums.
On older kernels (< 2.6.19) CHECKSUM_HW can mean either that the
checksum has already been computed by hardware or that the checksum
needs to be computed by hardware, depending on whether we are on
the transmit or receive path.  Unfortunately since we are in the
middle of these two paths it is impossible to tell which is the
case.  Code after us assumes that CHECKSUM_HW means that the
checksum needs to be computed and will panic if there already is
a checksum.  On these kernels we mark these packets as CHECKSUM_NONE
before handing them off.

Without this change using certain NICs will cause panics.
2010-01-26 17:17:21 -05:00
Ben Pfaff
56fd8edf80 sflow: Fix sFlow sampling structure.
According to Neil McKee, in an email archived at
http://openvswitch.org/pipermail/dev_openvswitch.org/2010-January/000934.html:

    The containment rule is that a given sflow-datasource (sampler or
    poller) should be scoped within only one sflow-agent (or
    sub-agent).  So the issue arises when you have two
    switches/datapaths defined on the same host being managed with
    the same IP address: each switch is a separate sub-agent, so they
    can run independently (e.g. with their own sequence numbers) but
    they can't both claim to speak for the same sflow-datasource.
    Specifically, they can't both represent the <ifindex>:0
    data-source.  This containment rule is necessary so that the
    sFlow collector can scale and combine the results accurately.

    One option would be to stick with the <ifindex>:0 data-source but
    elevate it to be global across all bridges, with a global
    sample_pool and a global sflow_agent.  Not tempting.  Better to
    go the other way and allow each interface to have its own
    sampler, just as it already has its own poller.  The ifIndex
    numbers are globally unique across all switches/datapaths on the
    host, so the containment is now clean.  Datasource <ifindex>:5
    might be on one switch, while <ifindex>:7 can be on another.
    Other benefits are that 1) you can support the option of
    overriding the default sampling-rate on an interface-by-interface
    basis, and 2) this is how most sFlow implementations are coded,
    so there will be no surprises or interoperability issues with any
    sFlow collectors out there.

This commit implements the approach suggested by Neil.

This commit uses an atomic_t to represent the sampling pool.  This is
because we do want access to it to be atomic, but we expect that it will
"mostly" be accessed from a single CPU at a time.  Perhaps this is a bad
assumption; we can always switch to another form of synchronization later.

CC: Neil McKee <neil.mckee@inmon.com>
2010-01-20 14:33:28 -08:00
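The per-interface sampling pool that commit 56fd8edf80 describes is essentially an atomically incremented counter.  The following userspace sketch uses C11 atomics as a stand-in for the kernel's atomic_t; struct port_sampler and sampler_count() are hypothetical names, and the decision of which packets to sample is not shown.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-interface sampler state. */
struct port_sampler {
    atomic_uint sample_pool;  /* packets that could have been sampled */
};

/* Count one packet against the pool and return the updated total.
 * Relaxed ordering suffices because only the count itself is shared. */
unsigned int sampler_count(struct port_sampler *s)
{
    assert(s);
    return atomic_fetch_add_explicit(&s->sample_pool, 1,
                                     memory_order_relaxed) + 1;
}
```

If the counter does turn out to be contended across CPUs, per-CPU counters summed on read would be one alternative, in line with the commit's remark about switching synchronization later.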
Ben Pfaff
72b0630028 Initial implementation of sFlow.
Tested very slightly with "ping" and "sflowtool -t | tcpdump -r -".
2010-01-04 13:08:37 -08:00
Ben Pfaff
6a33828dbc datapath: Check for proto_data_valid member instead of kernel version.
Commit 5ef800a69 "datapath: Copy Xen's checksumming fields when doing
skb_copy" should copy proto_data_valid between sk_buffs when that field
is present.  However the check for CONFIG_XEN plus kernel version 2.6.18
isn't sufficient, because SLES 11 kernels are version 2.6.27 but do have
this field.

This commit adds a configure-time check for the presence of the member
instead of attempting to guess based on the kernel version.

Thanks to Ian Campbell for reporting this problem.

CC: <Ian.Campbell@citrix.com>
2009-11-18 15:19:50 -08:00
Ben Pfaff
d17ee8689b Merge citrix branch into master. 2009-11-18 14:14:29 -08:00
Jesse Gross
0cd8a05ec0 datapath: Allow minimum headroom to be set when copying buffers.
If we need to copy an sk_buff in order to make it writable, allow
the minimum headroom to be specified.  This ensures that if we
need to add additional data, such as a VLAN tag, we will not have
to make a second copy.

Solves bug #2197 in certain situations.
2009-11-18 13:51:55 -08:00
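The idea behind commit 0cd8a05ec0, reserving room at copy time so that a later prepend needs no second copy, can be sketched with a plain buffer type.  Everything here is hypothetical userspace code (struct buf, buf_copy_headroom(), buf_push()) and not the sk_buff API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer: 'headroom' unused bytes, then 'len'
 * bytes of packet data. */
struct buf {
    uint8_t *base;
    size_t headroom;
    size_t len;
};

/* Copy 'src' into a fresh allocation with at least 'min_headroom'
 * free bytes in front.  Returns 0 on success, -1 if malloc fails. */
int buf_copy_headroom(const struct buf *src, size_t min_headroom,
                      struct buf *dst)
{
    size_t headroom;

    assert(src && dst);
    headroom = src->headroom > min_headroom ? src->headroom : min_headroom;
    dst->base = malloc(headroom + src->len);
    if (!dst->base)
        return -1;
    memcpy(dst->base + headroom, src->base + src->headroom, src->len);
    dst->headroom = headroom;
    dst->len = src->len;
    return 0;
}

/* Claim 'n' bytes of headroom for a new header, such as a 4-byte VLAN
 * tag, and return a pointer to them.  No reallocation takes place. */
uint8_t *buf_push(struct buf *b, size_t n)
{
    assert(b && b->headroom >= n);
    b->headroom -= n;
    b->len += n;
    return b->base + b->headroom;
}
```

A caller that knows it may add a VLAN tag would request 4 bytes of headroom in the copy, after which buf_push(&copy, 4) succeeds in place.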
Jesse Gross
5ef800a690 datapath: Copy Xen's checksumming fields when doing skb_copy.
Two fields that control checksumming were added to sk_buff in
Xen: proto_data_valid and proto_csum_blank.  These fields are copied
when doing a skb_clone but not in other functions such as skb_copy,
which can lead to checksum errors in TCP and UDP when offloading is
enabled in the guest.  To fix this we manually copy these fields,
though ideally this should be fixed upstream in Xen.

Bug #2299
2009-11-18 13:40:36 -08:00
Justin Pettit
de3f65ea52 datapath: Cleanup tab/space issues in datapath 2009-11-16 18:48:29 -08:00
Justin Pettit
985224ac0c datapath: Calculate proper checksum for set_tp_src/dst action
When the set_tp_src or set_tp_dst action is used, the calculation for
where the checksum is located was wrong.  This caused the checksum to
not be updated and corrupted the packet at the bad offset.
2009-11-16 18:11:40 -08:00
Ian Campbell
b2f460c72d datapath: Only call skb_checksum_setup on 2.6.18 && Xen.
For newer kernels the checksum setup is done at the point the skb is
received in netback or netfront so there is no more need to sprinkle
skb_checksum_setup calls throughout the kernel.
2009-08-18 16:09:32 -07:00
Ben Pfaff
a5225dd627 datapath: Drop unneeded local variable initialization. 2009-07-06 09:07:24 -07:00
Ben Pfaff
a14bc59fb8 Update primary code license to Apache 2.0. 2009-06-15 15:11:30 -07:00
Ben Pfaff
064af42167 Import from old repository commit 61ef2b42a9c4ba8e1600f15bb0236765edc2ad45. 2009-07-08 13:19:16 -07:00