2
0
mirror of https://github.com/openvswitch/ovs synced 2025-10-17 14:28:02 +00:00
Commit Graph

79 Commits

Author SHA1 Message Date
Justin Pettit
9197df76b4 Set MTU in userspace rather than kernel.
Currently the kernel automatically sets the MTU of any internal
interfaces to the minimum of all attached interfaces because the Linux
bridge does this.  Userspace can do this with more knowledge and
flexibility.

Feature #7323

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-15 16:27:15 -07:00
Pravin Shelar
3544358aa5 datapath: Improve kernel hash table
Currently OVS uses its own hashing implmentation for hash tables
which has some problems, e.g. error case on deletion code.
Following patch replaces that with hlist based hash table which is
consistent with other kernel hash tables. As Jesse suggested, flex-array
is used for allocating hash buckets, So that we can have large
hash-table without large contiguous kernel memory.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-09 19:09:47 -07:00
Pravin Shelar
ff8d7a5e81 Strip down vport interface : iflink
Remove iflink from vport interface. iflink is not used anywhere in
OVS. So there is not need to have iflink as vport attribute.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-08 15:18:42 -07:00
Justin Pettit
df2c07f433 datapath: Use "OVS_*" as opposed to "ODP_*" for user<->kernel interactions.
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree.  This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions.  The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.

Feature #6904

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-08-19 22:48:23 -07:00
Ben Pfaff
f915f1a8ca datapath: Consider tunnels to have no MTU, fixing jumbo frame support.
Until now, tunnel vports have had a specific MTU, in the same way that
ordinary network devices have an MTU, but treating them this way does not
always make sense.  For example, consider a datapath that has three ports:
the local port, a GRE tunnel to another host, and a physical port.  If
the physical port is configured with a jumbo MTU, it should be possible to
send jumbo packets across the tunnel: the tunnel can do fragmentation or
the physical port traversed by the tunnel might have a jumbo MTU.

However, until now, tunnels always had a 1500-byte MTU by default.  It
could be adjusted using ODP_VPORT_MTU_SET, but nothing actually did this.
One alternative would be to make ovs-vswitchd able to set the vport's MTU.
This commit, however, takes a different approach, of dropping the concept
of MTU entirely for tunnel vports.  This also solves the problem described
above, without making any additional work for anyone.

I tested that, without this change, I could not send 1600-byte "pings"
between two machines whose NICs had 2000-byte MTUs that were connected to
vswitches that were in turn connected over GRE tunnels with the default
1500-byte MTU.  With this change, it worked OK, regardless of the MTU of
the network traversed by the GRE tunnel.

This patch also makes "patch" ports MTU-less.

It might make sense to remove vport_set_mtu() and the associated callback
now, since ordinary network devices are the only vports that support it
now.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Suggested-by: Jesse Gross <jesse@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #3728.
2011-02-04 09:46:26 -08:00
Ben Pfaff
057dd6d279 datapath: Eliminate vport_mutex by protecting vport table with RCU.
The vport_mutex really only protects the vport dev_table, which isn't very
much.  By getting rid of it we take one step toward simplifying the vswitch
locking, which will necessarily have to be based mainly around the Generic
Netlink genl_mutex once we switch to Generic Netlink.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:41 -08:00
Ben Pfaff
f7cd0081f5 datapath: Convert ODP_EXECUTE to use Netlink framing.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:40 -08:00
Ben Pfaff
c19e653509 datapath: Change userspace vport interface to use Netlink attributes.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software.
The customary way to do this in the Linux networking stack is to use
Netlink and in particular Netlink attributes.  This commit adopts that
model for the vport layer.  It does not yet actually start using the
Netlink socket layer, which will come later.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:37 -08:00
Ben Pfaff
c283069c71 datapath: Change vport type from string to integer enumeration.
I plan to make the vport type part of the standard header stuck on each
Netlink message related to a vport.  As such, it is more convenient to use
an integer than a string.  In addition, by being fundamentally different
from strings, using an integer may reduce the confusion we've had in the
past over the differences in userspace and kernel names for network device
and vport types.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:37 -08:00
Ben Pfaff
3517d8bf80 datapath: Remove unused ->set_stats() function from vport_ops.
No vport implements this function and, as far as I can tell, no vport has
ever implemented it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-22 20:07:20 -08:00
Justin Pettit
dd851cbbcc datapath: Return vport configuration when queried.
Additional configuration is passed down to the kernel in the "config"
array of an odp_port when a vport is created.  This information is not
returned when a vport is queried, though.  This information is useful
for debugging, since it may be used to distinguish ports based on
additional data, such as the peer in tunnels.  In a forthcoming patch, it
will be essential to distinguish between plain GRE and GRE over IPsec.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-28 14:30:35 -08:00
Jesse Gross
d3097efff7 datapath: Add usage of __percpu annotation.
Sparse can warn if percpu pointers are incorrectly directly
dereference.  This adds the annotation where we declare percpu
pointers.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2010-12-13 13:40:00 -08:00
Ben Pfaff
7237e4f4b6 datapath: Merge vport "attach" into "create" and "detach" into "destroy".
These steps are sequentially in lockstep, so we might as well combine them.

This expands the region over which the vport_lock is held.  I didn't
carefully verify that this was OK.

This also eliminates the synchronize_rcu() call from destruction of tunnel
vports, since they didn't appear to me to need it.

It should be possible to eliminate the synchronize_rcu() from the netdev,
patch, and internal_dev vports, but this commit does not do that.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 15:44:51 -08:00
Ben Pfaff
e779d8d90d datapath: Merge "struct dp_port" into "struct vport".
After the previous commit, which changed the datapath to always create and
attach a vport at the same time, and to always detach and delete a vport
at the same time, there is no longer any real distinction between a dp_port
and a vport.  This commit, therefore, merges the two together to simplify
code.  It might even improve performance, although I have not checked.

I wasn't sure at first whether the merged structure should be "struct
dp_port" or "struct vport".  I went with the latter since the "v" prefix
sounds cool.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 14:43:38 -08:00
Ben Pfaff
c3827f619a datapath: Make adding and attaching a vport a single step.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath.  Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath.  Similarly, a vport
could be detached and deleted separately.

After some study, I think I understand why this distinction exists.  It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them.  However, changing it to create them before it tries to
open them is not difficult, so this commit does this.

The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 14:41:38 -08:00
Ben Pfaff
94903c9898 datapath: Encapsulate parameters for new vports in new struct vport_parms.
Upcoming commits will keep needing to pass more information to the vport
'create' member function.  It's annoying to have to modify a dozen pieces
of code every time just to do this, so this commit encapsulates all of the
parameters in a new struct and passes that instead.

Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 11:33:41 -08:00
Jesse Gross
b279fccf5b datapath: Constify ops structures.
vport_ops, tunnel_ops, and ethtool_ops should not change at runtime.
Therefore, mark them as const to keep them out of the hotpath and to
prevent them from getting trampled.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2010-12-02 17:10:16 -08:00
Jesse Gross
6d1d631e10 vport: Remove unused error types.
We currently track rx_over_errors, rx_crc_errors, rx_frame_errors,
and collisions but never increment these counters.  It seems likely
that we will never use them since they are primarily hardware errors
and we pull hardware stats directly from the NIC.  This removes those
counters, saving 32 bytes per port.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2010-12-02 17:10:15 -08:00
Ben Pfaff
ec61a01cd8 datapath: Use "struct rtnl_link_stats64" instead of "struct odp_vport_stats".
Linux 2.6.35 added struct rtnl_link_stats64, which as a set of 64-bit
network device counters is what the OVS datapath needs.  We might as well
use it instead of our own.

This commit moves the if_link.h compat header from datapath/ into the
top-level include/ directory so that it is visible both to kernel and
userspace code.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-11-09 13:48:57 -08:00
Jesse Gross
3976f6d57b datapath: Enable usage of cached flows.
An upcoming commit will add support for supplying cached flows for
packets entering the datapath.  This adds the code in the datapath
itself to recognize these cached flows and use them instead of
extracting the flow fields and doing a lookup.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Reviewed-by: Ben Pfaff <blp@nicira.com>
2010-09-22 13:43:01 -07:00
Jesse Gross
e90b1cf9ce datapath: Add support for CAPWAP UDP transport.
Add support for the transport portion of the CAPWAP protocol as
an alternative to GRE for L2 over L3 tunneling.  This is not
full support for the CAPWAP protocol.  CAPWAP covers management
of wireless access points and describes a control protocol for
setting those devices up.  It also describes a data plane protocol
that allows packets to be tunneled to a controller for inspection.
This data plane protocol is the only component covered by this
commit.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2010-08-24 16:58:00 -04:00
Jesse Gross
38c6ecbc8d datpath: Avoid reporting half updated statistics.
We enforce mutual exclusion when updating statistics by disabling
bottom halves and only writing to per-CPU state.  However, reading
requires looking at the statistics for foreign CPUs, which could be
in the process of updating them since there isn't a lock.  This means
we could get garbage values for 64-bit values on 32-bit machines or
byte counts that don't correspond to packet counts, etc.

This commit introduces a sequence lock for statistics values to avoid
this problem.  Getting a write lock is very cheap - it only requires
incrementing a counter plus a memory barrier (which is compiled away
on x86) to acquire or release the lock and will never block.  On
read we spin until the sequence number hasn't changed in the middle
of the operation, indicating that the we have a consistent set of
values.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2010-08-20 19:44:46 -07:00
Jesse Gross
fceb2a5bb2 datapath: Put return type on same line as arguments for functions.
In some places we would put the return type on the same line as
the rest of the function definition and other places we wouldn't.
Reformat everything to match kernel style.
2010-07-15 15:09:08 -07:00
Jesse Gross
780e620781 vport: Allow offsets to be set for stats.
Adds a method to set a group of stats to be added to the values
gathered normally.  This is needed for the fake bond device to
show the stats of its underlying slaves.  Also enables devices
that use the generic stats layer to define a get_stats() function
to provide additional error counts.
2010-06-10 14:29:32 -07:00
Jesse Gross
5953c70e61 vport: Move 'extern' declarations of vports to header.
Since vport implementations have no header files they needed to be
declared as extern before being used.  They are currently declared
in vport.c but this isn't safe because the compiler will silently
accept it if the type is incorrect.  This moves those declarations
into vport.h, which is included by all implementations and will
cause errors about conflicting types if there is a mismatch.
2010-06-10 14:29:30 -07:00
Jesse Gross
61e89cd6d6 vport: Rename userspace functions.
The vport library can be accessed from both userspace and the
kernel using different sets of functions.  These functions were
named similarly, so add _user to the userspace variants to
distinguish them.
2010-06-09 12:30:41 -07:00
Jesse Gross
b19e8815ad vport: Extract common functions for virtual devices.
Pull some generic implementations of vport functions out of the
GRE vport so they can be used by others.

Also move the code to set the MTUs of internal devices to the minimum
of attached devices to the generic vport_set_mtu layer.
2010-05-18 12:55:42 -07:00
Ben Pfaff
3fbd517acf datapath: Add 32-bit compatibility ioctls.
When a 32-bit userspace program runs on a 64-bit kernel, data structures
that contain members whose sizes or alignments change from 32- to 64-bit
must be translated when they are passed to ioctls.  This commit adds such
support for openvswitch_mod.

We should really reconsider some parts of the Open vSwitch ioctl interface
to avoid needing as much translation as we do.

Lightly tested with 32-bit userspace on sparc64.
2010-05-13 15:29:51 -07:00
Jesse Gross
f2459fe7d9 datapath: Add generic virtual port layer.
Currently the datapath directly accesses devices through their
Linux functions.  Obviously this doesn't work for virtual devices
that are not backed by an actual Linux device.  This creates a
new virtual port layer which handles all interaction with devices.

The existing support for Linux devices was then implemented on top
of this layer as two device types.  It splits out and renames dp_dev
to internal_dev.  There were several places where datapath devices
had to handled in a special manner and this cleans that up by putting
all the special casing in a single location.
2010-04-19 09:11:57 -04:00