2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-30 13:58:14 +00:00
Commit Graph

64 Commits

Author SHA1 Message Date
Ben Pfaff
52e2fbfbab netdev: Remove netdev_get_vlan_vid().
It has no remaining users.
2011-11-23 13:19:53 -08:00
Ben Pfaff
52fa1bcf5f netdev-vport: Again allow "tap" devices to be added to bridges.
I did not check that tap devices otherwise work.  This at least allows
them to be part of a bridge again.

Reported-by: Janis Hamme <janis.hamme@student.kit.edu>
2011-10-31 10:54:30 -07:00
Ben Pfaff
b37e6334fd datapath: Add multicast tunnel support.
Something like this, on two separate vswitches, works to try it out:
    route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0
    ovs-vsctl \
        -- add-port br0 gre0 \
        -- set interface gre0 type=gre options:remote_ip=224.0.0.1

Runtime tested on Linux 3.0, build tested on Linux 2.6.18, both i386.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-10-24 12:27:36 -07:00
Ethan Jackson
65c3058c22 vswitchd: New column "link_resets".
An interface's 'link_resets' column represents the number of times
Open vSwitch has observed its link_state change.
2011-10-17 15:03:03 -07:00
Ben Pfaff
077257b83c datapath-protocol: Rename to <linux/openvswitch.h>.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #7559.
2011-10-12 16:27:09 -07:00
Ben Pfaff
69ebca1e35 dpif-linux: Avoid unaligned accesses to vport stats sent by the datapath.
Reported-by: Jesse Gross <jesse@nicira.com>
2011-10-12 16:23:16 -07:00
Simon Horman
ee9bed06cd Remove netdev_find_dev_by_in4
netdev_find_dev_by_in4() appears to no longer be used and thus
can be removed. This also allows netdev_enumerate(), the
enumerate member of struct netdev_class and netdev_linux_enumerate()
to be removed.

I noticed this as netdev_linux_enumerate() makes use of if_nameindex()
and if_freenameindex() which are not available when compiling using
the Android NDK r6b (Android API level 13).
2011-09-22 09:03:03 -07:00
Pravin Shelar
f613a0d72c datapath: Always use generic stats for devices (vports)
Currently ovs is using device stats for Linux devices and count them
itself in other situations. This leads to overlap with hardware stats,
inconsistencies, etc. It's much better to just always count the packets
flowing through the switch and let userspace do any merging that it wants.

Following patch removes vport->get_stats() interface. vport-stat is changed
to use new `struct ovs_vport_stat` rather than rtnl_link_stats64.
Definitions of rtnl_link_stats64 is removed from OVS.  dipf_port->stat is also
removed as aggregate stats are only available at netdev layer.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-15 19:36:17 -07:00
Ben Pfaff
14622f22ab netdev: Allow get_mtu and set_mtu provider functions to be null.
Most netdev provider functions are allowed to be null if the implementation
does not support this feature.  This commit adds this feature for get_mtu
and set_mtu, and changes netdev-vport to take advantage of it.

Also, changes netdev_get_mtu() to report an MTU of 0 on error, instead of
leaving the MTU indeterminate.
2011-09-15 10:41:15 -07:00
Pravin Shelar
9b02078077 datapath: Strip down vport interface : OVS_VPORT_ATTR_MTU
There is no need to have vport attribute MTU (OVS_VPORT_ATTR_MTU) as
linux net-dev-ioctl can be used to get/set MTU for linux device.
Following patch removes OVS_VPORT_ATTR_MTU from datapath protocol.

This patch also adds netdev_set_mtu interface. So that MTU adjustments
can be done from OVS userspace. get_mtu() interface is also changed, now
get_mtu() returns EOPNOTSUPP rather than returning 0 and setting *pmtu
to INT_MAX in case there is no MTU attribute for given device.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-09-12 17:12:52 -07:00
Valient Gough
40a751774f datapath: add key support to CAPWAP tunnel
Add tunnel key support to CAPWAP vport.  Uses the optional WSI field in a
CAPWAP header to store a 64bit key.  It can also be used without keys, in which
case it is backward compatible with the old code.  Documentation about the
WSI field format is in CAPWAP.txt.

Signed-off-by: Valient Gough <vgough@pobox.com>
[horms@verge.net.au: Various minor fixes (v4.1)]
Signed-off-by: Simon Horman <horms@verge.net.au>
[jesse: Additional parsing fixes]
Signed-off-by: Jesse Gross <jesse@nicira.com>
2011-09-09 19:29:46 -07:00
Ethan Jackson
45c8d3a189 lib: Rename rtnetlink.[ch] files.
The only rtnetlink specific functionality contained in the
rtnetlink module is the use of the NETLINK_ROUTE protocol.  This
can easily be passed in by callers.

In preparation for generalization, this patch renames
rtnetlink.[ch] to netlink-notifier.[ch].  Future patches will
complete the transition.
2011-09-01 17:18:51 -07:00
Justin Pettit
df2c07f433 datapath: Use "OVS_*" as opposed to "ODP_*" for user<->kernel interactions.
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree.  This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions.  The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.

Feature #6904

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-08-19 22:48:23 -07:00
Ben Pfaff
18812dff32 netdev: Get rid of struct netdev_options and netdev_open_default().
Now that netdev_options only has two members, we might as well pass them
directly as parameters.
2011-08-08 12:55:43 -07:00
Ben Pfaff
de5cdb90f7 netdev: Decouple creating and configuring network devices.
Until now, each call to netdev_open() for a particular network device
had to either specify a set of network device arguments that was either
empty or (for devices that already existed) equal to the existing device's
configuration.  Unfortunately, the definition of "equality" in the latter
case was mostly done in terms of strict equality of string-to-string maps,
which caused problems in cases where, for example, one set of arguments
specified the default value of an optional argument explicitly and the
other omitted it.

The netdev interface does have provisions for defining equality other ways,
but this had only been done in one case that was especially problematic in
practice.  One way to solve this particular problem would be to carefully
define equality in all the problematic cases.

This commit takes another approach based on the realization that there is
really no need to do any comparisons.  Instead, it removes configuration
at netdev_open() time entirely, because almost all of netdev_open()'s
callers are not interested in creating and configuring a netdev.  Most of
them just want to open a configured device and use it.  Therefore, this
commit stops providing any configuration arguments to netdev_open() and the
provider functions that it calls.  Instead, a caller that does want to
configure a device does so after it opens it, by calling
netdev_set_config().

This change allows us to simplify the netdev interface a bit.  There is no
longer any need to implement argument comparisons.  As a result, there is
also no need for "struct netdev_dev" to keep track of configuration at all.
Instead, the network devices that have configuration keep track of it in
their own internal form.

This new interface does mean that it becomes possible to accidentally
create and try to use an unconfigured netdev that requires configuration.

Bug #6677.
Reported-by: Paul Ingram <paul@nicira.com>
2011-08-08 12:49:17 -07:00
Ben Pfaff
7b6b0ef47e netdev: Clean up and refactor packet receive interface.
The Open vSwitch tree only has one user of the ability for a netdev to
receive packets from a network device.  Thus, this commit simplifies the
common-case use of the netdev interface by replacing the "ethertype" option
from "struct netdev_options" by a new netdev_listen() call.

The only user of netdev_listen() wants to receive all packets from a
network device, so this commit also removes the ability to restrict the
received packets to a particular protocol.  (This ability was once used by
the Open vSwitch integrated DHCP client, but that code has been removed.)

This commit also simplifies and improves the implementation of the code
in netdev-linux that started listening to a network device.  Before, I had
not figured out how to avoid receiving all packets on all devices before
binding to a particular device, but I took a closer look at the kernel code
and figured it out.

I've tested that the userspace datapath (dpif-netdev), the only user of
netdev_recv(), still works after this change.
2011-08-08 10:24:24 -07:00
Justin Pettit
7211b387b7 netdev-vport: Don't use ipsec options for either arg in config_equal_ipsec().
Commit aebf423 (netdev: Add methods to do netdev-specific argument
comparisons.) added a new config_equal_ipsec() function to ignore
IPsec key options when comparing an existing netdev's options with a new
netdev.  We only ignored the options for the new netdev configuration,
which works when pulling the existing configuration from the kernel.
Unfortunately, if this is just a re-init of a netdev for which we just
created, this ignoring of the IPsec key options on the new netdev will
cause the check to fail, since the full options actually available in
both netdevs.  This commit just ignore all IPsec key options from both
netdevs.
2011-06-16 18:04:41 -07:00
Justin Pettit
aebf4235f3 netdev: Add methods to do netdev-specific argument comparisons.
When doing a netdev_open(), a check is first done to make sure the
arguments are equivalent for any open devices with the same name.  In
most cases, a simple shash comparison is sufficient.  However, IPsec
key configuration is handled by an external program, so it is not pushed
down into the kernel module.  Thus, when the "unparse_config" method is
called on an existing IPsec-based vport, a simple comparison with the
returned data will not match the original configuration.  This commit
adds code to allow netdev-specific argument comparisons and has
"ipsec_gre" make use of them.

Bug #5575
2011-06-14 10:28:40 -07:00
Ethan Jackson
943e5afe0b netdev: Remove monitors and notifiers.
Neither of these constructs are used anymore.
2011-05-31 14:34:39 -07:00
Ethan Jackson
ac4d3bcb46 netdev: New Function netdev_change_seq().
This new function will provide a much simpler replacement for
netdev_monitor in the future.
2011-05-31 14:34:38 -07:00
Ben Pfaff
d84d4b88d2 Fix incorrect byte order annotations.
These are not actual bugs, just deceptive use of the wrong function or
type.

Found by sparse.
2011-05-16 13:40:47 -07:00
Ben Pfaff
d39808227b netdev-linux: New functions for converting netdev stats formats.
An upcoming commit will introduce another function that needs to convert
between rtnl_link_stats64 and netdev_stats, so it seemed best to just add
functions to do the conversion.
2011-05-02 09:33:12 -07:00
Andrew Evans
66409d1bcc tunneling: Add df_default and df_inherit tunnel options.
Split existing pmtud tunnel option's functionality into three. Existing pmtud
option still exists, but now governs only whether datapath sends ICMP frag
needed messages. New df_inherit option controls whether DF bit is copied from
packet inner header to outer tunnel header. New df_default option controls
whether DF bit is set if inner packet isn't IP or if df_inherit is disabled.

Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Andrew Evans <aevans@nicira.com>
Feature #5456.
2011-04-29 17:05:58 -07:00
Ben Pfaff
7feba1acdd netdev-vport: Implement 'send' function.
The new implementation of the bonding code expects to be able to send
packets on netdevs using netdev_send().  This implements it.
2011-04-01 15:52:19 -07:00
Justin Pettit
8283e5143a netdev-vport: Log at ERR level when port won't be created.
Suggested-by: Ben Pfaff <blp@nicira.com>
2011-03-14 14:22:19 -07:00
Justin Pettit
e7009c3640 netdev-vport: Don't create port when ovs-monitor-ipsec not running.
It was suggested by Jesse that it would be better to just not create
IPsec tunnel devices if the ovs-monitor-ipsec daemon is not running.  He
had legitimate concerns about users missing the warning message printed
and traffic possibly going out unencrypted.

Suggested-by: Jesse Gross <jesse@nicira.com>
2011-03-14 14:22:19 -07:00
Justin Pettit
5059eff3bd netdev-vport: Warn on IPsec tunnels when ovs-monitor-ipsec not running.
IPsec tunnels are only supported on Debian systems running
ovs-monitor-ipsec.  Since that daemon configures IPsec, ovs-vswitchd
doesn't know whether IPsec will actually work.  With this commit, a
warning is printed that it is unlikely to work unless that daemon is
started.

There is a more serious issue that IPsec traffic can pass unencrypted if
that daemon is not running.  To fix that problem, changes to the kernel
module will need to occur.  A future commit will address that issue, but
this earlier warning will be useful regardless.

Bug #4854
2011-03-13 11:06:50 -07:00
Justin Pettit
8a86254e98 netdev-vport: Don't warn when a tunnel key is set.
Reported-by: Reid Price <reid@nicira.com>
2011-03-10 14:02:23 -08:00
Ben Pfaff
0574f71b4b netdev-port: Fix invalid memory access in netdev_vport_poll_add().
shash_find_data() returns an shash_node's 'data' member, but this code here
wants the shash_node itself, so it needs to use shash_find() instead.

This bug meant that any attempt to add a single netdev_vport to more than
one netdev_monitor would cause a segmentation fault.  Here's an example
command that reproduces it reliably for me under valgrind (because ofproto
always monitors its ports and the bridge monitors bond interfaces):

ovs-vsctl -- add-bond br0 bond0 p0 p1 \
          -- set interface p0 type=patch options:peer=p1 \
          -- set interface p1 type=patch options:peer=p0

Bug #4527.
Reported-by: Krishna Miriyala <krishna@nicira.com>
2011-03-04 13:58:03 -08:00
Ethan Jackson
b46ccdf582 lib: route-table improvements.
This commit makes several changes to the route_table code used to
populate tunnel_egress_iface.

- It removes name_table code from netdev-vport and puts it into
  route-table.

- It no longer attempts to build the name_table dynamically by
  listening to rtnetlink-link notifications.  Instead it dumps the
  entire table, and uses rtnetlink-link notifications to indicate a
  re-dump is required.

- It forces rtnetlink-link notifications to re-dump the routing
  table.  This fixes an issue where bringing an interface down or
  removing it altogether would not have the expected effect on
  related tunnel_egress_ifaces.
2011-01-31 18:27:34 -08:00
Ethan Jackson
ba615c2b5b lib: netdev-vport improperly initialized route-table.
netdev-vport unregistered the routing table in its destroy
function, but registered it in its init function.  This could
cause the routing table to be unregistered when it shouldn't have
been causing segmentation faults.

Bug #4526.
2011-01-31 15:46:18 -08:00
Ben Pfaff
254f2dc8e3 datapath: Change dp_idx to dp_ifindex, the ifindex of the local port.
I can't see any real value in maintaining a dp_idx separate from the
ifindex of the local port.  With the current implementation it also
artificially limits the number of datapaths.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-28 15:34:40 -08:00
Ben Pfaff
f0fef76062 datapath: Convert ODP_VPORT_* to use AF_NETLINK socket layer.
This commit calls genl_lock() and thus doesn't support Linux before
2.6.35, which wasn't exported before that version.  That problem will
be fixed once the whole userspace interface transitions to Generic
Netlink a few commits from now.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-28 15:34:27 -08:00
Ben Pfaff
c19e653509 datapath: Change userspace vport interface to use Netlink attributes.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other.  To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software.
The customary way to do this in the Linux networking stack is to use
Netlink and in particular Netlink attributes.  This commit adopts that
model for the vport layer.  It does not yet actually start using the
Netlink socket layer, which will come later.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:37 -08:00
Ben Pfaff
c283069c71 datapath: Change vport type from string to integer enumeration.
I plan to make the vport type part of the standard header stuck on each
Netlink message related to a vport.  As such, it is more convenient to use
an integer than a string.  In addition, by being fundamentally different
from strings, using an integer may reduce the confusion we've had in the
past over the differences in userspace and kernel names for network device
and vport types.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-01-27 21:08:37 -08:00
Ben Pfaff
6d9e6eb44f netdev: Make netdev arguments fetchable, and implement for netdev-vport.
This gives network device implementations the opportunity to fetch an
existing device's configuration and store it as their arguments, so that
netdev clients can find out how an existing device is configured.

So far netdev-vport is the only implementation that needs to use this.

The next commit will add use by clients.

Reviewed by Justin Pettit.
2011-01-27 21:08:36 -08:00
Ben Pfaff
c6eab56db4 netlink-socket: Make dumping and doing transactions on same nl_sock safe.
It's not safe to use a single Netlink fd to do multiple operations in an
synchronous way.  Some of the limitations are fundamental; for example, the
kernel only supports a single "dump" operation at a time.  Others are
limitations imposed by the OVS coding style; for example, our Netlink
library is not callback based, so nothing can be done about incoming
messages that can't be handled immediately.  Regardless, in OVS multicast
groups, transactions, and dumps cannot coexist on a single nl_sock.

This is only mildly irritating at the moment, but it will become much worse
later on, when dpif-linux shifts to using Netlink dumps for listing various
kinds of datapath entities.  When that happens, a dump will be in progress
in situations where the dpif-linux client might want to do other
operations.  For example, it is reasonable for the client to list flows
and, in the middle, look up information on vports mentioned in those flows.
It might be possible to simply ban and avoid such nested operations--I have
not even audited the source tree to find out whether we do anything like
that already--but that seems like an unnecessary cramp on our coding style.
Furthermore, it's difficult to explain and justify without understanding
the implementation.

This patch takes another approach, by improving the Netlink socket library
to avoid artificial constraints.  When an operation, or a dump, or joining
a multicast group would cause a problem, this patch makes the library
transparently create a separate Netlink socket.  This solves the problem
without putting any onerous restrictions on use.

This commit also slightly simplifies netdev_vport_reset_names().  It had
been written to destroy the dump object before the Netlink socket that it
used, but this is no longer necessary and doing it in the opposite order
saved a few lines of code.

Reviewed by Ethan Jackson <ethan@nicira.com>.
2011-01-27 09:26:06 -08:00
Ben Pfaff
cceb11f5b1 netlink-socket: Add functions for joining and leaving multicast groups.
When this library was originally implemented, support for Linux 2.4 was
important.  The Netlink implementation in Linux only added support for
joining and leaving multicast groups after a socket is bound as of Linux
2.6.14, so the library did not support it either.  But the current version
of Open vSwitch targets Linux 2.6.18 and over, so it's fine to add this
support now, and this commit does so.

This will be used more extensively in upcoming commits.

Reviewed by Justin Pettit.
2011-01-27 09:26:05 -08:00
Andrew Evans
a404826e90 netdev-vport: Report carrier state of tunnel egress interfaces.
Record carrier state of tunnel egress interface in
"tunnel_egress_iface_carrier" key in "status" map of Interface table.
2011-01-19 18:25:12 -08:00
Ethan Jackson
a132aa969e lib: Simplify rtnetlink routing functionality.
This commit removes the rtnetlink-route module and replaces it with
a much simpler to use route-table module.  The route-table uses
rtnetlink to maintain a routing table which may be used to query
the egress interface of particular addresses.

This commit also converts netdev-vport to use the new route-table
module.
2011-01-14 11:19:32 -08:00
Ethan Jackson
6333182946 vswitchd: Add miimon support.
This commit allows users to check link status in bonded ports using
MII instead of carrier.
2011-01-12 15:50:20 -08:00
Ethan Jackson
ea763e0e28 bridge: Move tunnel_egress_iface to status column.
This commit removes the tunnel_egress_iface column from the
interface table and moves it's data to the status column.  In the
process it reverts the database to version 1.0.0.
2011-01-11 12:33:44 -08:00
Ethan Jackson
ea83a2fcd0 lib: Show tunnel egress interface in ovsdb
This commit parses rtnetlink address notifications from the
kernel in order to display the egress interface of tunnels in the
database.

Bug #4103.
2011-01-04 12:35:59 -08:00
Justin Pettit
ef7ee76a41 vswitch: Provide option to pull cert from SSL table
Introduce "use_ssl_cert" option to "ipsec_gre" interface types, which
will pull certificate and private key options from the SSL table.  In
the future, multiple SSL entries will be supported through the
configuration database, so use of this option is strongly discouraged as
this "feature" will be retired.
2010-12-28 16:26:48 -08:00
Justin Pettit
3c52fa7b69 vswitch: Add support for IPsec certificate authentication.
Previously, it was possible to fake configuring the use of certificate
authentication for IPsec, but it really just used a static pre-shared key
behind the scenes.  This commit publicly mentions certificate
authentication and finally does the real work behind the scenes.
2010-12-28 16:26:47 -08:00
Justin Pettit
e16a28b585 vswitch: Use "ipsec_gre" vport instead of "gre" with "other_config"
Previously, a GRE-over-IPsec tunnel was created as an interface with a
"type" of "gre" and the "other_config" column with "ipsec_cert" or
"ipsec_psk" set.  This could lead to a potential security problem if a user
intended to create a GRE-over-IPsec tunnel, but misconfigured the
"ipsec_*" config and created an unencrypted GRE tunnel.

This commit defines an "ipsec_gre" tunnel type, which should prevent
users from inadvertently establishing insecure tunnels.
2010-12-28 14:30:36 -08:00
Justin Pettit
4c2fa71d66 debian: Don't require ipsec_local_ip to configure IPsec
Commit e97a103 (Open vSwitch: ovs-monitor-ipsec: Add ability to traverse
NATs) removed the requirement that the "ipsec_local_ip" key must be set
to use IPsec, but other code and documentation was not updated to
reflect this.  This commit does that.
2010-12-28 14:30:35 -08:00
Ben Pfaff
b9298d3f82 Expand tunnel IDs from 32 to 64 bits.
We have a need to identify tunnels with keys longer than 32 bits.  This
commit adds basic datapath and OpenFlow support for such keys.  It doesn't
actually add any tunnel protocols that support 64-bit keys, so this is not
very useful yet.

The 'arg' member of struct odp_msg had to be expanded to 64-bits also,
because it sometimes contains a tunnel ID.  This member also contains the
argument passed to ODPAT_CONTROLLER, so I expanded that action's argument
to 64 bits also so that it can use the full width of the expanded 'arg'.
Userspace doesn't take advantage of the new space though (it was only
using 16 bits anyhow).

This commit has been tested only to the extent that it doesn't disrupt
basic Open vSwitch operation.  I have not tested it with tunnel traffic.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Feature #3976.
2010-12-10 11:14:13 -08:00
Ben Pfaff
73d67a4355 netdev-vport: Properly initialize tnl_port_config structure to 0.
Otherwise the configuration starts out indeterminate and tends to be
rejected by the kernel.

Bug #4195.
2010-12-07 10:41:05 -08:00
Ben Pfaff
c3827f619a datapath: Make adding and attaching a vport a single step.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath.  Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath.  Similarly, a vport
could be detached and deleted separately.

After some study, I think I understand why this distinction exists.  It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them.  However, changing it to create them before it tries to
open them is not difficult, so this commit does this.

The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-03 14:41:38 -08:00