Changes to allow the tpid to be specified and all vlan tpid checking to be
generalized.
Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
In 'struct ofpbuf' the 'frame' pointer was used to parse different kinds of
data (Ethernet, OpenFlow, Netlink attributes). For Ethernet packets the
'frame' pointer was supposed to have the same value as the 'data'
pointer.
Since 'struct dp_packet' is only used for Ethernet packets, there's no
need for a separate 'frame' pointer: we can use the 'data' pointer
instead.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
As OVS adds userspace support for being the endpoint in protocols
like tunnels, it will need to be able to calculate pseudoheaders
as part of the checksum calculation.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Currently dp-packet make use of ofpbuf for managing packet
buffers. That complicates ofpbuf, by making dp-packet
independent of ofpbuf both libraries can be optimized for
their own use case.
This avoids mapping operation between ofpbuf and dp_packet
in datapath upcalls.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch adds set-field operations for nd_target, nd_sll, and nd_tll
fields, with and without masks, using Nicira extensions and OpenFlow 1.2
protocol.
Signed-off-by: Randall A Sharo <randall.sharo at navy.mil>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.
Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The system defined ICMPv6 header doesn't have sparse annotation,
so this adds a definition so that endianness can be checked.
Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
The checksum of ICMPv6 packets uses the IP pseudoheader as part of
the calculation, unlike ICMP in IPv4. This was not implemented,
which means that modifying the IP addresses of an ICMPv6 packet
would cause the checksum to no longer be correct as the psuedoheader
did not match.
Reported-by: Neal Shrader <icosahedral@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Rename 'l2' to 'frame' and add new ofpbuf_set_frame() and ofpbuf_l2().
ofpbuf_set_frame() alse resets all the layer offsets. ofpbuf_l2()
returns NULL if the packet has no Ethernet header, as indicated either
by unset l3 offset or NULL frame pointer. Callers of ofpbuf_l2() are
supposed to check the return value, unless they can otherwise be sure
that the packet has a valid Ethernet header.
The recent commit 437d0d22 made some assumptions that were not valid
regarding the use of the 'l2' pointer in rconn module and by
compose_rarp(). This is now fixed as follows: rconn now relies on the
fact that once OpenFlow messages are given to rconn for transport, the
frame pointer is no longer needed to refer to the OpenFlow header; and
compose_rarp() now sets the frame pointer and offsets as expected.
In addition to storing network frames, ofpbufs are also used for
handling OpenFlow messages and action lists. lib/ofpbuf.h now has a
comment documenting the current usage conventions and invariants.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Code reads better without the "get", for example "ofpbuf_l3()"
v.s. "ofpbuf_get_l3()". L4 payoad access functions still use the
"get" (e.g., "ofpbuf_get_tcp_payload()").
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
This patch shrinks the struct ofpbuf from 104 to 48 bytes on 64-bit
systems, or from 52 to 36 bytes on 32-bit systems (counting in the
'l7' removal from an earlier patch). This may help contribute to
cache efficiency, and will speed up initializing, copying and
manipulating ofpbufs. This is potentially important for the DPDK
datapath, but the rest of the code base may also see a little benefit.
Changes are:
- Remove 'l7' pointer (previous patch).
- Use offsets instead of layer pointers for l2_5, l3, and l4 using
'l2' as basis. Usually 'data' is the same as 'l2', but this is not
always the case (e.g., when parsing or constructing a packet), so it
can not be easily used as the offset basis. Also, packet parsing is
faster if we do not need to maintain the offsets each time we pull
data from the ofpbuf.
- Use uint32_t for 'allocated' and 'size', as 2^32 is enough even for
largest possible messages/packets.
- Use packed enum for 'source'.
- Rearrange to avoid unnecessary padding.
- Remove 'private_p', which was used only in two cases, both of which
had the invariant ('l2' == 'data'), so we can temporarily use 'l2'
as a private pointer.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Now that we don't need to parse TCP flags from the packet after
extraction, we usually do not need the 'l7' pointer any more. When
needed, ofpbuf_get_tcp|udp|sctp|icmp_payload() or ofpbuf_get_l4_size()
can be used instead.
Removal of 'l7' was requested by Pravin for the DPDK datapath work, as
it simplifies packet parsing a bit.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
We used to map ODPP_NONE to port number 0, which is wrong, as
ODPP_NONE is a valid value of the flow's in_port.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Change the flow_extract() API to accept struct pkt_metadata,
instead of individual metadata fields. It will make the API more
logical and easier to maintain when we need to expand metadata
down the road.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>¬
There are two different MPLS ethertypes, 0x8847 and 0x8848 and a push MPLS
action applied to an MPLS packet may cause the ethertype to change from one
to the other. To ensure that this happens update the ethertype in
push_mpls() regardless of if the packet is already MPLS or not.
Test based on a similar test by Joe Stringer.
Cc: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Make set_ethertype() static as it is not used outside of packet.c
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This is in preparation for pushing vlan tags
using the TPID provided by the kernel via auxdata.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
ctz() returns 32 for zero input, and we already have ctz64(),
so it makes sense to rename ctz() as ctz32().
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Allow TCP flags match specification with symbolic flag names. TCP
flags are optionally specified as a string of flag names, each
preceded by '+' when the flag must be one, or '-' when the flag must
be zero. Any flags not explicitly included are wildcarded. The
existing hex syntax is still allowed, and is used in flow dumps when
all the flags are matched.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Widen TCP flags handling from 7 bits (uint8_t) to 12 bits (uint16_t).
The kernel interface remains at 8 bits, which makes no functional
difference now, as none of the higher bits is currently of interest
to the userspace.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
eth_mpls_depth() has been unused as of 1ac7c9bdb2b6fdcb ("ofproto-dpif: Use
execute_actions to execute controller actions").
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethernet headers are 14 bytes long, so when the beginning of such a header
is 32-bit aligned, the following data is misaligned. The usual trick to
fix that is to start the Ethernet header on an odd-numbered 16-bit
boundary. That trick works OK for Open vSwitch, but there are two
problems:
- OVS doesn't use that trick everywhere. Maybe it should, but it's
difficult to make sure that it does consistently because the CPUs
most commonly used with OVS don't care about misalignment, so we
only find problems when porting.
- Some protocols (GRE, VXLAN) don't use that trick, so in such a case
one can properly align the inner or outer L3/L4/L7 but not both. (OVS
userspace doesn't directly deal with such protocols yet, so this is
just future-proofing.)
- OpenFlow uses the alignment trick in a few places but not all of them.
This commit starts the adoption of what I hope will be a more robust way
to avoid misalignment problems and the resulting bus errors on RISC
architectures. Instead of trying to ensure that 32-bit quantities are
always aligned, we always read them as if they were misaligned. To ensure
that they are read this way, we change their types from 32-bit types to
pairs of 16-bit types. (I don't know of any protocols that offset the
next header by an odd number of bytes, so a 16-bit alignment assumption
seems OK.)
The same would be necessary for 64-bit types in protocol headers, but we
don't yet have any protocol definitions with 64-bit types.
IPv6 protocol headers need the same treatment, but for those we rely on
structs provided by system headers, so I'll leave them for an upcoming
patch.
Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit fixes the warning issued by 'clang' when pointer is casted
to one with greater alignment.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The ethertype should always be updated on mpls_pop
as there may be a transition between MPLS unicast (0x8847) and
MPLS multicast (0x8848).
Ben Pfaff tells me that this is consistent with the
behaviour described in EXT-194 of the JIRA bug tracker.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The reserved multicast Ethernet addresses begin with 01:80:c2, not
01:08:c2.
Reported-by: Padmanabhan Krishnan <kprad1@yahoo.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
There were plans to use this in conjunction with inner/outer flows,
however that plan has been changed in favour of using recirculation.
This leaves us with the current usage.
encal_dl_type is currently only used to allow decoding of packets used in
the test suite. However, this is a bit of a fudge and the packets may be
provided as hexadecimal instead.
Also remove comments from parse_l2_5_onward() relating to MPLS which are
not in keeping with the commenting throughout the rest of the function.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
This adds support for the OpenFlow 1.1+ dec_mpls_ttl action.
And also adds an NX dec_mpls_ttl action.
The handling of the TTL modification is entirely handled in userspace.
Reviewed-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Use the innermost dl_type when decoding L3 and L4 data from a packet.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The ethertype should be set before resetting l2_5 in order
for the packet to be updated correctly.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This patch implements use-space datapath and non-datapath code
to match and use the datapath API set out in Leo Alterman's patch
"user-space datapath: Add basic MPLS support to kernel".
The resulting MPLS implementation supports:
* Pushing a single MPLS label
* Poping a single MPLS label
* Modifying an MPLS lable using set-field or load actions
that act on the label value, tc and bos bit.
* There is no support for manipulating the TTL
this is considered future work.
The single-level push pop limitation is implemented by processing
push, pop and set-field/load actions in order and discarding information
that would require multiple levels of push/pop to be supported.
e.g.
push,push -> the first push is discarded
pop,pop -> the first pop is discarded
This patch is based heavily on work by Ravi K.
Cc: Ravi K <rkerur@gmail.com>
Reviewed-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
An ovs_be32 is a more obvious way to represent an IP address than a
pointer to one. It is also more type-safe, especially since "sparse" is
able to check that the argument is in network byte order.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Rarp packets had their own header definition in the packets
library. This doesn't make sense because they have the same packet
format as arps.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Commit c93f9a78c349 (packets: Update the reserved protocols list.) added
a number of first-hop router redundancy protocol MAC addresses to the
list of BPDU MAC addresses. This means that packets destined to those MAC
addresses are dropped when other-config:forward-bpdu is set to false on a
bridge (the default setting).
However, this behavior is incorrect, because these MAC addresses are not
special in the way that, say, STP frames are special. STP is a
switch-to-switch protocol that end hosts have no use for, but end hosts do
speak directly to routers on the MAC addresses assigned by VRRP and the
other protocols in this category. Therefore, dropping packets in this
category means that end hosts can no longer talk to their first-hop router,
if that router is running one of these protocols.
This commit also refines the match used for EDP and EAPS, and adds Cisco
CFM to the protocols that are dropped.
After this commit, the following destination MACs are dropped:
- 01:08:c2:00:00:00
- 01:08:c2:00:00:01
- 01:08:c2:00:00:02
- 01:08:c2:00:00:03
- 01:08:c2:00:00:04
- 01:08:c2:00:00:05
- 01:08:c2:00:00:06
- 01:08:c2:00:00:07
- 01:08:c2:00:00:08
- 01:08:c2:00:00:09
- 01:08:c2:00:00:0a
- 01:08:c2:00:00:0b
- 01:08:c2:00:00:0c
- 01:08:c2:00:00:0d
- 01:08:c2:00:00:0e
- 01:08:c2:00:00:0f
- 00:e0:2b:00:00:00
- 00:e0:2b:00:00:04
- 00:e0:2b:00:00:06
- 01:00:0c:00:00:00
- 01:00:0c:cc:cc:cc
- 01:00:0c:cc:cc:cd
- 01:00:0c💿cd:cd
- 01:00:0c:cc:cc:c0
- 01:00:0c:cc:cc:c1
- 01:00:0c:cc:cc:c2
- 01:00:0c:cc:cc:c3
- 01:00:0c:cc:cc:c4
- 01:00:0c:cc:cc:c5
- 01:00:0c:cc:cc:c6
- 01:00:0c:cc:cc:c7
Bug #12618.
CC: Ben Basler <bbasler@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
OF1.2 and later make these fields fully maskable so we might as well also.
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
OF1.1 and later make these fields fully maskable so we might as well also.
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The name compose_rarp() more clearly describes what it's doing now.
Requested-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>