This gives network device implementations the opportunity to fetch an
existing device's configuration and store it as their arguments, so that
netdev clients can find out how an existing device is configured.
So far netdev-vport is the only implementation that needs to use this.
The next commit will add use by clients.
Reviewed by Justin Pettit.
It's not safe to use a single Netlink fd to do multiple operations in a
synchronous way. Some of the limitations are fundamental; for example, the
kernel only supports a single "dump" operation at a time. Others are
limitations imposed by the OVS coding style; for example, our Netlink
library is not callback based, so nothing can be done about incoming
messages that can't be handled immediately. Regardless, in OVS multicast
groups, transactions, and dumps cannot coexist on a single nl_sock.
This is only mildly irritating at the moment, but it will become much worse
later on, when dpif-linux shifts to using Netlink dumps for listing various
kinds of datapath entities. When that happens, a dump will be in progress
in situations where the dpif-linux client might want to do other
operations. For example, it is reasonable for the client to list flows
and, in the middle, look up information on vports mentioned in those flows.
It might be possible to simply ban and avoid such nested operations--I have
not even audited the source tree to find out whether we do anything like
that already--but that seems like an unnecessary cramp on our coding style.
Furthermore, it's difficult to explain and justify without understanding
the implementation.
This patch takes another approach, by improving the Netlink socket library
to avoid artificial constraints. When an operation, or a dump, or joining
a multicast group would cause a problem, this patch makes the library
transparently create a separate Netlink socket. This solves the problem
without putting any onerous restrictions on use.
This commit also slightly simplifies netdev_vport_reset_names(). It had
been written to destroy the dump object before the Netlink socket that it
used, but this is no longer necessary and doing it in the opposite order
saves a few lines of code.
Reviewed by Ethan Jackson <ethan@nicira.com>.
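The idea of transparently falling back to a separate socket can be sketched
as follows. This is only an illustrative model, not the library's code: it
opens no real Netlink fd, and struct example_sock, example_sock_create(), and
example_sock_for_transaction() are hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a Netlink socket; no real fd is opened here. */
struct example_sock {
    int id;                 /* Stands in for a file descriptor. */
    bool dump_in_progress;  /* True while a dump owns this socket. */
};

static int example_next_id = 1;

/* Allocates a new model socket with a unique id, or returns NULL if
 * allocation fails. */
static struct example_sock *
example_sock_create(void)
{
    struct example_sock *sock = calloc(1, sizeof *sock);
    if (sock) {
        sock->id = example_next_id++;
    }
    return sock;
}

/* Returns the socket to use for a new transaction.
 *
 * If 'sock' is idle, returns 'sock' itself and sets '*temporary' to false.
 * If a dump is in progress on 'sock', returns a freshly created socket (or
 * NULL on allocation failure) and sets '*temporary' to true, in which case
 * the caller must free() the result when the transaction is done. */
static struct example_sock *
example_sock_for_transaction(struct example_sock *sock, bool *temporary)
{
    assert(sock != NULL);
    *temporary = sock->dump_in_progress;
    return *temporary ? example_sock_create() : sock;
}
```

In this model the caller never has to know whether a dump is in progress; it
only has to release the socket when '*temporary' is true.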
When this library was originally implemented, support for Linux 2.4 was
important. The Netlink implementation in Linux only added support for
joining and leaving multicast groups after a socket is bound as of Linux
2.6.14, so the library did not support it either. But the current version
of Open vSwitch targets Linux 2.6.18 and later, so it's fine to add this
support now, and this commit does so.
This will be used more extensively in upcoming commits.
Reviewed by Justin Pettit.
This commit removes the rtnetlink-route module and replaces it with a
route-table module that is much simpler to use. The route-table uses
rtnetlink to maintain a routing table which may be used to query
the egress interface of particular addresses.
This commit also converts netdev-vport to use the new route-table
module.
This commit removes the tunnel_egress_iface column from the
interface table and moves its data to the status column. In the
process it reverts the database to version 1.0.0.
Introduce "use_ssl_cert" option to "ipsec_gre" interface types, which
will pull certificate and private key options from the SSL table. In
the future, multiple SSL entries will be supported through the
configuration database, so use of this option is strongly discouraged as
this "feature" will be retired.
Previously, it was possible to fake configuring the use of certificate
authentication for IPsec, but it really just used a static pre-shared key
behind the scenes. This commit publicly mentions certificate
authentication and finally does the real work behind the scenes.
Previously, a GRE-over-IPsec tunnel was created as an interface with a
"type" of "gre" and the "other_config" column with "ipsec_cert" or
"ipsec_psk" set. This could lead to a potential security problem if a user
intended to create a GRE-over-IPsec tunnel, but misconfigured the
"ipsec_*" config and created an unencrypted GRE tunnel.
This commit defines an "ipsec_gre" tunnel type, which should prevent
users from inadvertently establishing insecure tunnels.
Commit e97a103 (Open vSwitch: ovs-monitor-ipsec: Add ability to traverse
NATs) removed the requirement that the "ipsec_local_ip" key must be set
to use IPsec, but other code and documentation was not updated to
reflect this. This commit does that.
We have a need to identify tunnels with keys longer than 32 bits. This
commit adds basic datapath and OpenFlow support for such keys. It doesn't
actually add any tunnel protocols that support 64-bit keys, so this is not
very useful yet.
The 'arg' member of struct odp_msg had to be expanded to 64 bits as well,
because it sometimes contains a tunnel ID. This member also contains the
argument passed to ODPAT_CONTROLLER, so I expanded that action's argument
to 64 bits too, so that it can use the full width of the expanded 'arg'.
Userspace doesn't take advantage of the new space though (it was only
using 16 bits anyhow).
This commit has been tested only to the extent that it doesn't disrupt
basic Open vSwitch operation. I have not tested it with tunnel traffic.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Feature #3976.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath. Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath. Similarly, a vport
could be detached and deleted separately.
After some study, I think I understand why this distinction exists. It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them. However, changing it to create them before it tries to
open them is not difficult, so this commit does this.
The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
When a vport parse error occurs, the vport_class's parse_config function
doesn't necessarily store a valid pointer into the vport_info's 'config'
member, so netdev_vport_create() needs to supply a null pointer here to
avoid passing a wild pointer to free().
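A minimal sketch of the pattern, assuming a parse callback that leaves its
output argument untouched on error. The names example_parse_config() and
example_create() are hypothetical stand-ins, not the actual netdev-vport
functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a vport class's parse_config callback.
 *
 * On success, stores a malloc()'d copy of 'args' in '*configp' and returns 0.
 * On failure, returns nonzero and does NOT write to '*configp'. */
static int
example_parse_config(const char *args, char **configp)
{
    size_t n;
    char *copy;

    if (args == NULL || args[0] == '\0') {
        return 1;               /* Parse error: '*configp' is untouched. */
    }

    n = strlen(args) + 1;
    copy = malloc(n);
    if (copy == NULL) {
        return 1;
    }
    memcpy(copy, args, n);
    *configp = copy;
    return 0;
}

/* Hypothetical caller.  Returns 0 on success, nonzero on parse error. */
static int
example_create(const char *args)
{
    char *config = NULL;        /* Initialized so that free() below is safe. */
    int error = example_parse_config(args, &config);

    /* ...on success, the device would be created from 'config' here... */

    free(config);               /* free(NULL) is a no-op. */
    return error;
}
```

Without the NULL initializer, the error path would pass an indeterminate
pointer to free().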
The existing implementation never worked because it used different strings
for notifier shash addition and lookup: for adding to the shash, it used
the vport name; for lookup, it used "<type>:<name>". This fixes the
problem, by using "<type>:<name>" in both cases.
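One way to keep the two sides from drifting apart again is to build the key
in a single helper, roughly as sketched below. This sketch uses a small
fixed-size array rather than the real shash, and every identifier in it is
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EXAMPLE_MAX_NOTIFIERS 8
#define EXAMPLE_KEY_LEN 64

/* Hypothetical, simplified replacement for the notifier shash. */
struct example_table {
    char keys[EXAMPLE_MAX_NOTIFIERS][EXAMPLE_KEY_LEN];
    int n;
};

/* Builds the "<type>:<name>" key used for BOTH insertion and lookup.
 * Keys longer than EXAMPLE_KEY_LEN - 1 bytes are truncated. */
static void
example_make_key(const char *type, const char *name,
                 char key[EXAMPLE_KEY_LEN])
{
    snprintf(key, EXAMPLE_KEY_LEN, "%s:%s", type, name);
}

/* Adds a notifier entry.  Returns 1 on success, 0 if the table is full. */
static int
example_add(struct example_table *table, const char *type, const char *name)
{
    if (table->n >= EXAMPLE_MAX_NOTIFIERS) {
        return 0;
    }
    example_make_key(type, name, table->keys[table->n]);
    table->n++;
    return 1;
}

/* Returns 1 if an entry for 'type' and 'name' exists, otherwise 0. */
static int
example_lookup(const struct example_table *table,
               const char *type, const char *name)
{
    char key[EXAMPLE_KEY_LEN];
    int i;

    example_make_key(type, name, key);
    for (i = 0; i < table->n; i++) {
        if (!strcmp(table->keys[i], key)) {
            return 1;
        }
    }
    return 0;
}
```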
Linux 2.6.35 added struct rtnl_link_stats64, which as a set of 64-bit
network device counters is what the OVS datapath needs. We might as well
use it instead of our own.
This commit moves the if_link.h compat header from datapath/ into the
top-level include/ directory so that it is visible both to kernel and
userspace code.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Currently netdev_get_carrier() returns both a carrier status and
an error code. However, usage of the error code was inconsistent:
most callers either ignored it or didn't perform their task if an
error occurred, which prevented bond rebalancing. This makes the
handling consistent by translating an error into a down status in
the netdev library.
Bug #3959
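The translation can be sketched as below, assuming a lower-level query that
returns 0 on success or a positive errno value on failure. The function
names and the callback type are hypothetical; the real netdev interface may
differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical low-level carrier query.  Returns 0 and sets '*carrier' on
 * success, or a positive errno value on failure. */
typedef int example_carrier_fn(const char *name, bool *carrier);

/* Returns the carrier status of 'name', treating any error from 'query' as
 * "carrier down" so that callers need not handle an error code. */
static bool
example_get_carrier(example_carrier_fn *query, const char *name)
{
    bool carrier = false;
    int error = query(name, &carrier);

    return error ? false : carrier;
}

/* Sample query that always reports carrier up. */
static int
example_query_up(const char *name, bool *carrier)
{
    (void) name;
    *carrier = true;
    return 0;
}

/* Sample query that always fails, as if the device had disappeared. */
static int
example_query_fail(const char *name, bool *carrier)
{
    (void) name;
    (void) carrier;
    return ENODEV;
}
```

With this shape, a caller such as bond rebalancing sees a failed query as an
ordinary "down" link and proceeds rather than bailing out.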
Commit 2b9d65 (netdev-vport: Merge in netdev-patch and netdev-tunnel.)
refactored the common parts of the netdev-patch and netdev-tunnel
sources into netdev-vport. During the refactoring, the "destroy" method
didn't inherit the netdev_vport_do_ioctl(ODP_VPORT_DEL, ...) call, which
is needed to actually destroy the device in the kernel. This commit
fixes that.
Bug #3267
The only real difference between netdev-patch and netdev-tunnel is in their
parse_config() implementation. That's a lot of extra code to maintain, for
questionable benefit. This commit merges them into the netdev-vport code,
which was heretofore merely a collection of helper functions.
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
In certain cases we require the ability to provide stats that are
added to the values collected by the kernel (currently only used
by bond fake devices). Internal devices previously implemented
this directly, but now that their stats are handled by the vport
layer, the functionality has been moved there. This removes the
userspace code to set the stats and replaces it with a mechanism
to access the equivalent functionality in the vport layer.
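As a rough illustration of what "added to the values collected by the
kernel" means, the sketch below sums a set of supplied offsets into
kernel-collected counters. The struct and function names are hypothetical
and the field list is deliberately shortened.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, shortened set of 64-bit device counters. */
struct example_stats {
    uint64_t rx_packets;
    uint64_t tx_packets;
    uint64_t rx_bytes;
    uint64_t tx_bytes;
};

/* Returns the stats to report: the kernel-collected counters in 'kernel'
 * plus the externally supplied values in 'offset'. */
static struct example_stats
example_add_offset(const struct example_stats *kernel,
                   const struct example_stats *offset)
{
    struct example_stats total;

    total.rx_packets = kernel->rx_packets + offset->rx_packets;
    total.tx_packets = kernel->tx_packets + offset->tx_packets;
    total.rx_bytes = kernel->rx_bytes + offset->rx_bytes;
    total.tx_bytes = kernel->tx_bytes + offset->tx_bytes;
    return total;
}
```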