ovs_queue doesn't seem very useful; it's just a singly-linked list. It's
more generally useful to use a general-purpose "struct list" for lists of
packets, so this commit adds such a member to "struct ofpbuf" and shifts
the existing users to use it.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath. Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath. Similarly, a vport
could be detached and deleted separately.
After some study, I think I understand why this distinction exists. It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them. However, changing it to create them before it tries to
open them is not difficult, so this commit does this.
The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This makes it easier for dpif_provider implementations to share code but
distinguish the class actually in use, because comparing a pointer is
easier than comparing a string.
There's no need to have a mask in this action, because both parts of the
TCI are part of the flow structure.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This allows eliminating padding from odp_flow_key, although actually doing
that is postponed until the next commit.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
When userspace and the kernel were using the same structure for flows,
flow_t was a useful way to indicate that a structure was really a userspace
flow instead of a kernel one, but now it's better to just write "struct
flow" for consistency, since OVS doesn't use typedefs for structs
elsewhere.
Acked-by: Jesse Gross <jesse@nicira.com>
The "struct odp_flow_key" used in the kernel datapath is conceptually
separate from the "flow_t" used in userspace, but until now we have
used the latter as a typedef for the former for convenience. This commit
separates them. This makes it possible in upcoming commits to change
them independently.
This is cross-ported from the "wdp" branch, which has had it for months.
The "port group" concept seems like a good one, but it has not been
used very much in userspace so far, so before we commit ourselves to
a frozen API that we must maintain forever, remove it. We can always
add it back in later as a new kind of vport.
Signed-off-by: Ben Pfaff <blp@nicira.com>
All of these changes avoid using the same name for two local variables
within a same function. None of them are actual bugs as far as I can tell,
but any of them could be confusing to the casual reader.
The one in lib/ovsdb-idl.c is particularly brilliant: inner and outer
loops both using (different) variables named 'i'.
Found with GCC -Wshadow.
Some of the flow actions that modify skbuff data did not check that the
skbuff was long enough before doing so. This commit fixes that problem.
Previously, the strategy for avoiding this was to only indicate the layer-3
nw_proto field in the flow if the corresponding layer-4 header was fully
present, so that if, for example, nw_proto was IPPROTO_TCP, this meant
that a TCP header was present. The original motivation for this patch was
to add corresponding code to only indicate a layer-2 dl_type if the
corresponding layer-3 header was fully present. But I'm now convinced that
this approach is conceptually wrong, because the meaning of a layer-N
header should not be affected by the meaning of a layer-(N+1) header.
This commit switches to a new approach. Now, when a header is missing, its
fields in the flow are simply zeroed and have no effect on the "type" field
for the outer header. Responsibility for ensuring that a header is fully
present is now shifted to the actions that wish to modify that header.
Signed-off-by: Ben Pfaff <blp@nicira.com>
"ARP spoofing" is when a host claims an incorrect association between an
IP address and a MAC address for deceptive purposes. OpenFlow by itself
can prevent a host from sending out ARP replies from an incorrect MAC
address in the Ethernet L2 header, but it cannot control the MAC addresses
inside the ARP L3 packet. This commit adds a new action that can be used
to drop these spoofed packets.
CC: Paul Ingram <paul@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
It looks to me like the current dpif-netdev implementation doesn't handle
the case where a packet comes in without a VLAN and then is subjected to
multiple ODPAT_SET_VLAN_* operations. dp_netdev_modify_vlan_tci() just
checks the flow key each time to see whether there's a VLAN, but it doesn't
update the flow key to note that there is now a VLAN.
One fix would be to update the flow key, but it's "const" these days.
Instead, add a check for whether the Ethernet type is ETH_TYPE_VLAN,
which should be equivalent.
Actions that modify packets need to tolerate packets that are too small.
Most of the actions already implicitly do this check, since they check for
appropriate values in the flow key that would only be there if the
corresponding data was present. But actions to modify the Ethernet header
didn't have a guarantee that the packet was at least 14 bytes long, and
actions to modify the VLAN didn't have such a guarantee either, so this
adds appropriate checks.
Problem found by code inspection.
Originally, the datapath didn't care about IP TOS at all. Then, to support
NetFlow, we made it keep track of the last-seen IP TOS value on a per-flow
basis. Then, to support OpenFlow 1.0, we added a nw_tos field to
odp_flow_key. We don't need both methods, so this commit drops the
NetFlow-specific tracking.
This introduces a small kernel ABI break: upgrading the kernel module
without upgrading the OVS userspace will mean that NetFlow records will
all show an IP TOS value of 0. I don't consider that to be a serious
problem.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
When the QoS code was integrated, I didn't yet know how to abstract the
translation from a queue ID in an OpenFlow OFPAT_ENQUEUE action into a
priority value for an ODP ODPAT_SET_PRIORITY action. This commit is a
first attempt that works OK for Linux, so far. It's possible that in fact
this translation needs the 'netdev' as an argument too, but it's not needed
yet.
Currently the flow key is updated to match an action that is applied
to a packet but these field are never looked at again. Not only is
this a waste of time it also makes optimizations involving caching
the flow key more difficult.
Most of the timekeeping needs of OVS are simply to measure intervals,
which means that it is sensitive to changes in the clock. This commit
replaces the existing clocks with monotonic timers. An additional set
of wall clock timers are added and used in locations that need absolute
time.
Bug #1858
The most recent revision of the netdev library added may_create
and may_open flags to explicitly state the intent of the caller as
to whether the device should already be in use. This was simply
a sanity check for users of the netdev library and the configuration.
At this point the netdev library and its users are well behaved and
should no longer need to be checked. Additional checks have also
been added for incorrect configuration that mean the netdev library
is no longer the primary line of defense.
These flags themselves create problems because it is not always
easy for a library to know what the state of devices should be.
This is particularly a problem for ovs-openflowd, which expects
ports to be added by ovs-dpctl. Fixing this either requires that
the checks are so permissive to be useless or ugly hacks to get
around them. Since they are no longer needed, just remove the
checks.
This commit restores the previous behavior of ovs-openflowd to
not require that ports be specified on the command line or
cleaned up after use.
Bug #2652
CC: Natasha Gude <natasha@nicira.com>
CC: Jean Tourrilhes <jt@hpl.hp.com>
CC: 蒲彦 <yan.p.bjtu@gmail.com>
When a dpif passes an odp_msg down to ofproto, and ofproto transforms it
into an ofp_packet_in to send to the controller, until now this always
involved a full copy of the packet inside ofproto. This commit eliminates
this copy by ensuring that there is always enough headroom in the ofpbuf
that holds the odp_msg to replace it by an ofp_packet_in in-place.
From Jean Tourrilhes <jt@hpl.hp.com>, with some revisions.
Add a tun_id field which contains the ID of the encapsulating tunnel
on which a packet was received (0 if not received on a tunnel). Also
add an action which allows the tunnel ID to be set for outgoing
packets. At this point there aren't any tunnel implementations so
these fields don't have any effect.
The matching is exposed to OpenFlow by overloading the high 32 bits
of the cookie as the tunnel ID. ovs-ofctl is capable of turning
on this special behavior using a new "tun-cookie" command but this
command is intentially undocumented to avoid it being used without
a full understanding of the consequences.
After executing an action that changes a packet sometimes we update
the flow key and sometimes we don't. This is potentially problematic
because we sometimes use the key for checks later on. This consistently
maintains the key.
OpenFlow 1.0 adds support for matching on IP ToS/DSCP bits.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
Starting in OpenFlow 0.9, it is possible to match on the VLAN PCP
(priority) field and rewrite the IP ToS/DSCP bits. This check-in
provides that support and bumps the wire protocol number to 0x98.
NOTE: The wire changes come together over the set of OpenFlow 0.9 commits,
so OVS will not be OpenFlow-compatible with any official release between
this commit and the one that completes the set.
On some system, at least, one must include <sys/types.h> before
<netinet/in.h>, and <netinet/in.h> before <arpa/inet.h> or <net/if.h>.
From Jean Tourrilhes <jt@hpl.hp.com>.
This brings over some features that were added to the netdev interface,
most notably the separation between the name and the type. In addition
to being cleaner, this also avoids problems where it is expected that
the local port has the same name as the datapath.
This builds on earlier work that implemented netdev object refcounting.
However, rather than requiring explicit create and destroy calls,
these operations are now performed automatically based on the referenece
count. This is important because in certain situations it is not
possible to know whether a netdev has already been created. A
workaround existed (which looked fairly similar to this paradigm) but
introduced it's own issues. This simplifies and unifies the API.
This change adds netdev_create() and netdev_destroy() functions to allow
the creation of network devices through the netdev library. Previously,
network devices had to already exist or be created on demand through
netdev_open(). This caused problems such as not being able to specify
TAP devices as ports in ovs-vswitchd, which this patch fixes.
This also lays the groundwork for adding GRE and VDE support.
An option to zero the TCP flags when querying flow stats was added
to the kernel datapath to support NetFlow active timeouts. This
adds that same support to the user datapath.