There is no reason to believe that the source and destination ports
will be symmetric in a bidirectional UDP stream. This patch no
longer uses them for symmetric hashing.
I know already that this breaks the statsfixes that were implemented by the
following commits:
827ab71c97f "ofproto: Datapath statistics accounted twice."
6f1435fc8f7 "ofproto: Resubmit statistics improperly account during..."
These were already broken in a previous merge. I will work on a fix.
In addition to the changes to ofproto, this commit changes all of the
instances of "struct flow" in the tree so that the "in_port" member is an
OpenFlow port number. Previously, this member was an OpenFlow port number
in some cases and an ODP port number in other cases.
The flow extraction code for IPv6 has some deviations from both the
kernel version and other protocols in userspace. These differences
make it difficult to compare the two for correctness. This updates
the code to be more similar to the others in design and style. There
is no functional change.
Currently we stop parsing packets that are IPv6 jumbograms. While
it isn't possible to send such large packets to userspace, it's better
to drop them at that point rather than prematurely in the IPv6 code.
IPv6 does make some use of the payload length field but we can just as
easily use skb->len, which is what all other parsing uses.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
We compute the length of the IPv6 header by parsing all of the
extension headers that we know about. However, the final result
is checked using ofpbuf_pull(), which checks the size with an
assertion. Since the length of the final header is not checked
in any other way an invalid packet can trigger this assertion.
IPv6 uses Neighbor Discovery messages in a similar manner to how IPv4
uses ARP. This commit adds support for matching deeper into the
payloads of Neighbor Solicitation (NS) and Neighbor Advertisement (NA)
messages. Currently, the matching fields include:
- NS and NA Target (nd_target)
- NS Source Link Layer Address (nd_sll)
- NA Target Link Layer Address (nd_tll)
When defining IPv6 Neighbor Discovery rules, the Nicira Extensible Match
(NXM) extension to OVS must be used.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Provides ability to match over IPv6 traffic in the same manner as IPv4.
Currently, the matching fields include:
- IPv6 source and destination addresses (ipv6_src and ipv6_dst)
- Traffic Class (nw_tos)
- Next Header (nw_proto)
- ICMPv6 Type and Code (icmp_type and icmp_code)
- TCP and UDP Ports over IPv6 (tp_src and tp_dst)
When defining IPv6 rules, the Nicira Extensible Match (NXM) extension to
OVS must be used.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
OpenFlow 1.0 doesn't allow matching on the ARP source and target
hardware address. This has caused us to introduce hacks such as the
Drop Spoofed ARP action. Now that we have extensible match, we can
match on more fields within ARP:
- Source Hardware Address (arp_sha)
- Target Hardware Address (arp_tha)
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
A few common IP protocol types were defined in "lib/packets.h". However,
we already assume the existence of <netinet/in.h> which contains a more
exhaustive list and should be available on POSIX systems.
Following this commit, "struct odp_flow_stats" is only used in
Linux-specific parts of OVS userspace code. This allows the actual Linux
datapath interface to evolve more freely.
Reviewed by Justin Pettit.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length. This
commit makes that change using Netlink attribute sequences.
This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, because userspace doesn't yet
have enough information to do that intelligently. Upcoming commits will
fix that.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
We have a need to identify tunnels with keys longer than 32 bits. This
commit adds basic datapath and OpenFlow support for such keys. It doesn't
actually add any tunnel protocols that support 64-bit keys, so this is not
very useful yet.
The 'arg' member of struct odp_msg had to be expanded to 64-bits also,
because it sometimes contains a tunnel ID. This member also contains the
argument passed to ODPAT_CONTROLLER, so I expanded that action's argument
to 64 bits also so that it can use the full width of the expanded 'arg'.
Userspace doesn't take advantage of the new space though (it was only
using 16 bits anyhow).
This commit has been tested only to the extent that it doesn't disrupt
basic Open vSwitch operation. I have not tested it with tunnel traffic.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Feature #3976.
Before this commit, the compiler would add two bytes of padding to
the 'flow_wildcards' structure to achieve 32bit alignment. These
two bytes had inconsistent values which caused 'flow_wildcards_hash'
to behave inconsistently. This commit explicitly 32bit aligns
'flow_wildcards' with zero padding.
This commit also fixes an issue where in-band rules were not
getting deleted when in-band control was disabled.
Some code failed to convert tunnel IDs to host byte order for printing,
so this fixes that. Some code printed tunnel IDs with a 0x prefix and
other code didn't, so this code uses the '#' flag consistently (which
prints 0x for nonzero values and omits it for zero).
This commit also stops always printing all 8 digits. When tunnel IDs
are expanded to 64 bits, as they will be soon, printing 16 digits all the
time wastes too much space.
Acked-by: Jesse Gross <jesse@nicira.com>
Until now, the collection of coverage counters supported by a given OVS
program was not specific to that program. That means that, for example,
even though ovs-dpctl does not have anything to do with mac_learning, it
still has a coverage counter for it. This is confusing, at best.
This commit fixes the problem on some systems, in particular on ones that
use GCC and the GNU linker. It uses the feature of the GNU linker
described in its manual as:
If an orphaned section's name is representable as a C identifier then
the linker will automatically see PROVIDE two symbols: __start_SECNAME
and __end_SECNAME, where SECNAME is the name of the section. These
indicate the start address and end address of the orphaned section
respectively.
Systems that don't support these features retain the earlier behavior.
This commit also fixes the annoyance that files that include coverage
counters must be listed on COVERAGE_FILES in lib/automake.mk.
This commit also fixes the annoyance that modifying any source file that
includes a coverage counter caused all programs that link against
libopenvswitch.a to relink, even programs that the source file was not
linked into. For example, modifying ofproto/ofproto.c (which includes
coverage counters) caused tests/test-aes128 to relink, even though
test-aes128 does not link again ofproto.o.
These accessors are semantically identical to the ones for uint<N>_t data,
but the names are more informative to readers, and the types provide
annotations for sparse.
Since the Nicira Extended Match was specified nicira-ext.h has claimed that
arbitrary masks are allowed, but in fact only certain masks were actually
implemented. This commit implements general masking for the 802.1Q VLAN
TCI field.
Originally, wildcards were just the OpenFlow OFPFW_* bits. Then, when
OpenFlow added CIDR masks for IP addresses, struct flow_wildcards was born
with additional members for those masks, derived from the wildcard bits.
Then, when OVS added support for tunnels, we added another bit
NXFW_TUN_ID that coexisted with the OFPFW_*. Later we added even more bits
that do not appear in the OpenFlow 1.0 match structure at all. This had
become really confusing, and the difficulties were especially visible in
the long list of invariants in comments on struct flow_wildcards.
This commit cleanly separates the OpenFlow 1.0 wildcard bits from the
bits used inside Open vSwitch, by defining a new set of bits that are
used only internally to Open vSwitch and converting to and from those
wildcard bits at the point where data comes off or goes onto the wire.
It also moves those functions into ofp-util.[ch] since they are only for
dealing with OpenFlow wire protocol now.
The flow_from_match() and flow_to_match() functions have to deal with most
of the state in a cls_rule anyhow, and this will increase in upcoming
commits, to the point that we might as well just use a cls_rule anyhow.
This commit therefore deletes flow_from_match() and flow_to_match(),
integrating their code into cls_rule_from_match() and the new function
cls_rule_to_match(), respectively. It also changes each of the functions'
callers to use the new cls_rule_*() function.
The old classifier was not adaptive: it required knowing the structure of
the flows that were likely to be in use to get good performance. It is
likely that it degenerated to linear search in any real-world case.
This new classifier is adaptive and should perform better in the real
world.
There are many more places in OVS where using these types would be an
improvement, but the flow code is particularly confusing because it uses
a mix of byte orders.
When userspace and the kernel were using the same structure for flows,
flow_t was a useful way to indicate that a structure was really a userspace
flow instead of a kernel one, but now it's better to just write "struct
flow" for consistency, since OVS doesn't use typedefs for structs
elsewhere.
Acked-by: Jesse Gross <jesse@nicira.com>
The "struct odp_flow_key" used in the kernel datapath is conceptually
separate from the "flow_t" used in userspace, but until now we have
used the latter as a typedef for the former for convenience. This commit
separates them. This makes it possible in upcoming commits to change
them independently.
This is cross-ported from the "wdp" branch, which has had it for months.
Some of the flow actions that modify skbuff data did not check that the
skbuff was long enough before doing so. This commit fixes that problem.
Previously, the strategy for avoiding this was to only indicate the layer-3
nw_proto field in the flow if the corresponding layer-4 header was fully
present, so that if, for example, nw_proto was IPPROTO_TCP, this meant
that a TCP header was present. The original motivation for this patch was
to add corresponding code to only indicate a layer-2 dl_type if the
corresponding layer-3 header was fully present. But I'm now convinced that
this approach is conceptually wrong, because the meaning of a layer-N
header should not be affected by the meaning of a layer-(N+1) header.
This commit switches to a new approach. Now, when a header is missing, its
fields in the flow are simply zeroed and have no effect on the "type" field
for the outer header. Responsibility for ensuring that a header is fully
present is now shifted to the actions that wish to modify that header.
Signed-off-by: Ben Pfaff <blp@nicira.com>
The kernel and user datapaths have code that assumes that 802.1Q headers
are used only inside Ethernet II frames, not inside SNAP-encapsulated
frames. But the kernel and user flow_extract() implementations would
interpret 802.1Q headers inside SNAP headers as being valid VLANs. This
would cause packet corruption if any VLAN-related actions were to be taken,
so change the two flow_extract() implementations only to accept 802.1Q as
an Ethernet II frame type, not as a SNAP-encoded frame type.
802.1Q-2005 says that this is correct anyhow:
Where the ISS instance used to transmit and receive tagged frames is
provided by a media access control method that can support Ethernet
Type encoding directly (e.g., is an IEEE 802.3 or IEEE 802.11 MAC) or
is media access method independent (e.g., 6.6), the TPID is Ethernet
Type encoded, i.e., is two octets in length and comprises solely the
assigned Ethernet Type value.
Where the ISS instance is provided by a media access method that
cannot directly support Ethernet Type encoding (e.g., is an IEEE
802.5 or FDDI MAC), the TPID is encoded according to the rule for
a Subnetwork Access Protocol (Clause 10 of IEEE Std 802) that
encapsulates Ethernet frames over LLC, and comprises the SNAP
header (AA-AA-03) followed by the SNAP PID (00-00-00) followed by
the two octets of the assigned Ethernet Type value.
All of the media that OVS handles supports Ethernet Type fields, so to me
that means that we don't have to handle 802.1Q-inside-SNAP.
On the other hand, we *do* have to handle SNAP-inside-802.1Q, because this
is actually allowed by the standards. So this commit also adds that
support.
I verified that, with this change, both SNAP and Ethernet packets are
properly recognized both with and without 802.1Q encapsulation.
I was a bit surprised to find out that Linux does not accept
SNAP-encapsulated IP frames on Ethernet.
Here's a summary of how frames are handled before and after this commit:
Common cases
------------
Ethernet
+------------+
1. |dst|src|TYPE|
+------------+
Ethernet LLC SNAP
+------------+ +--------+ +-----------+
2. |dst|src| len| |aa|aa|03| |000000|TYPE|
+------------+ +--------+ +-----------+
Ethernet 802.1Q
+------------+ +---------+
3. |dst|src|8100| |VLAN|TYPE|
+------------+ +---------+
Ethernet 802.1Q LLC SNAP
+------------+ +---------+ +--------+ +-----------+
4. |dst|src|8100| |VLAN| LEN| |aa|aa|03| |000000|TYPE|
+------------+ +---------+ +--------+ +-----------+
Unusual cases
-------------
Ethernet LLC SNAP 802.1Q
+------------+ +--------+ +-----------+ +---------+
5. |dst|src| len| |aa|aa|03| |000000|8100| |VLAN|TYPE|
+------------+ +--------+ +-----------+ +---------+
Ethernet LLC
+------------+ +--------+
6. |dst|src| len| |xx|xx|xx|
+------------+ +--------+
Ethernet LLC SNAP
+------------+ +--------+ +-----------+
7. |dst|src| len| |aa|aa|03| |xxxxxx|xxxx|
+------------+ +--------+ +-----------+
Ethernet 802.1Q LLC
+------------+ +---------+ +--------+
8. |dst|src|8100| |VLAN| LEN| |xx|xx|xx|
+------------+ +---------+ +--------+
Ethernet 802.1Q LLC SNAP
+------------+ +---------+ +--------+ +-----------+
9. |dst|src|8100| |VLAN| LEN| |aa|aa|03| |xxxxxx|xxxx|
+------------+ +---------+ +--------+ +-----------+
Behavior
--------
--------------- --------------- -------------------------------------
Before After
this commit this commit
dl_type dl_vlan dl_type dl_vlan Notes
------- ------- ------- ------- -------------------------------------
1. TYPE ffff TYPE ffff no change
2. TYPE ffff TYPE ffff no change
3. TYPE VLAN TYPE VLAN no change
4. LEN VLAN TYPE VLAN proposal fixes behavior
5. TYPE VLAN 8100 ffff 802.1Q says this is invalid framing
6. 05ff ffff 05ff ffff no change
7. 05ff ffff 05ff ffff no change
8. LEN VLAN 05ff VLAN proposal fixes behavior
9. LEN VLAN 05ff VLAN proposal fixes behavior
Signed-off-by: Ben Pfaff <blp@nicira.com>
Originally, the datapath didn't care about IP TOS at all. Then, to support
NetFlow, we made it keep track of the last-seen IP TOS value on a per-flow
basis. Then, to support OpenFlow 1.0, we added a nw_tos field to
odp_flow_key. We don't need both methods, so this commit drops the
NetFlow-specific tracking.
This introduces a small kernel ABI break: upgrading the kernel module
without upgrading the OVS userspace will mean that NetFlow records will
all show an IP TOS value of 0. I don't consider that to be a serious
problem.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
Normally match fields are zeroed if they are wildcarded in
normalize_match(). However, tun_id isn't part of struct ofp_match
so do it when we convert to a flow instead.
OpenFlow provides the ability to delete flows that match a "strict"
description. This means that wildcards are not active, and thus will
only match a single flow that exactly matches the description. The code
that checks for a match is pretty dumb and still compares the values of
fields that are wildcarded. A recent change added a "tun_id" matching
field, but did not zero out the field when it was supposed to be
ignored, which broke the matching used by strict deletions. This sets
the field regardless of whether the field is wildcarded or not.
Reported-by: Natasha Gude <natasha@nicira.com>
Bug #2775
Add a tun_id field which contains the ID of the encapsulating tunnel
on which a packet was received (0 if not received on a tunnel). Also
add an action which allows the tunnel ID to be set for outgoing
packets. At this point there aren't any tunnel implementations so
these fields don't have any effect.
The matching is exposed to OpenFlow by overloading the high 32 bits
of the cookie as the tunnel ID. ovs-ofctl is capable of turning
on this special behavior using a new "tun-cookie" command but this
command is intentially undocumented to avoid it being used without
a full understanding of the consequences.