netdev linux devices uses mtu ioctl to get and set MTU for a device.
By caching error code from ioctl we can reduce number of ioctl calls
for device which is unregistered from system.
netdev notification is used to update mtu which saves get-mtu-ioctl.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
netdev_get_features() and other functions have always used OpenFlow 1.0
"enum ofp_port_features" bits as part of their interface. This commit
switches over to using an internally defined interface that is not tied
directly to any OpenFlow version, making evolution of each side of the
interface easier in the future.
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
netdev_find_dev_by_in4() appears to no longer be used and thus
can be removed. This also allows netdev_enumerate(), the
enumerate member of struct netdev_class and netdev_linux_enumerate()
to be removed.
I noticed this as netdev_linux_enumerate() makes use of if_nameindex()
and if_freenameindex() which are not available when compiling using
the Android NDK r6b (Android API level 13).
Most netdev provider functions are allowed to be null if the implementation
does not support this feature. This commit adds this feature for get_mtu
and set_mtu, and changes netdev-vport to take advantage of it.
Also, changes netdev_get_mtu() to report an MTU of 0 on error, instead of
leaving the MTU indeterminate.
This allows a command like "test-openflowd --enable-dummy dummy@br0
--ports=dummy@eth0,dummy@eth1,dummy@eth2" to create a dummy datapath with
a number of dummy ports. This is more useful for testing than a dummy
datapath with just an internal port, since output to "flood" and "normal"
has less pathological results.
There is no need to have vport attribute MTU (OVS_VPORT_ATTR_MTU) as
linux net-dev-ioctl can be used to get/set MTU for linux device.
Following patch removes OVS_VPORT_ATTR_MTU from datapath protocol.
This patch also adds netdev_set_mtu interface. So that MTU adjustments
can be done from OVS userspace. get_mtu() interface is also changed, now
get_mtu() returns EOPNOTSUPP rather than returning 0 and setting *pmtu
to INT_MAX in case there is no MTU attribute for given device.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Until now, each call to netdev_open() for a particular network device
had to either specify a set of network device arguments that was either
empty or (for devices that already existed) equal to the existing device's
configuration. Unfortunately, the definition of "equality" in the latter
case was mostly done in terms of strict equality of string-to-string maps,
which caused problems in cases where, for example, one set of arguments
specified the default value of an optional argument explicitly and the
other omitted it.
The netdev interface does have provisions for defining equality other ways,
but this had only been done in one case that was especially problematic in
practice. One way to solve this particular problem would be to carefully
define equality in all the problematic cases.
This commit takes another approach based on the realization that there is
really no need to do any comparisons. Instead, it removes configuration
at netdev_open() time entirely, because almost all of netdev_open()'s
callers are not interested in creating and configuring a netdev. Most of
them just want to open a configured device and use it. Therefore, this
commit stops providing any configuration arguments to netdev_open() and the
provider functions that it calls. Instead, a caller that does want to
configure a device does so after it opens it, by calling
netdev_set_config().
This change allows us to simplify the netdev interface a bit. There is no
longer any need to implement argument comparisons. As a result, there is
also no need for "struct netdev_dev" to keep track of configuration at all.
Instead, the network devices that have configuration keep track of it in
their own internal form.
This new interface does mean that it becomes possible to accidentally
create and try to use an unconfigured netdev that requires configuration.
Bug #6677.
Reported-by: Paul Ingram <paul@nicira.com>
The Open vSwitch tree only has one user of the ability for a netdev to
receive packets from a network device. Thus, this commit simplifies the
common-case use of the netdev interface by replacing the "ethertype" option
from "struct netdev_options" by a new netdev_listen() call.
The only user of netdev_listen() wants to receive all packets from a
network device, so this commit also removes the ability to restrict the
received packets to a particular protocol. (This ability was once used by
the Open vSwitch integrated DHCP client, but that code has been removed.)
This commit also simplifies and improves the implementation of the code
in netdev-linux that started listening to a network device. Before, I had
not figured out how to avoid receiving all packets on all devices before
binding to a particular device, but I took a closer look at the kernel code
and figured it out.
I've tested that the userspace datapath (dpif-netdev), the only user of
netdev_recv(), still works after this change.
When doing a netdev_open(), a check is first done to make sure the
arguments are equivalent for any open devices with the same name. In
most cases, a simple shash comparison is sufficient. However, IPsec
key configuration is handled by an external program, so it is not pushed
down into the kernel module. Thus, when the "unparse_config" method is
called on an existing IPsec-based vport, a simple comparison with the
returned data will not match the original configuration. This commit
adds code to allow netdev-specific argument comparisons and has
"ipsec_gre" make use of them.
Bug #5575
This patch moves miimon logic from the bond module to netdev-linux.
This greatly simplifies the bonding code while adding minimal
complexity to netdev-linux. The bonding code is so high level, it
really has no business worrying about how precisely slave status is
determined.
The bonding code neglected to call netdev_monitor_poll() on its
monitor during bond_run(). Thus carrier changes would be
permanently queued in the monitor, preventing it from ever allowing
poll_loop to sleep.
I looked at almost every uint<N>_t in the tree to determine whether it was
really in network byte order, and converted the ones that were.
The only remaining ones, modulo my mistakes, are in openflow.h. I'm not
sure whether we should convert those, because there might be some value
in remaining close to upstream for this header.
In each of the cases converted here, an shash was used simply to maintain
a set of strings, with the shash_nodes' 'data' values set to NULL. This
commit converts them to use sset instead.
Until now, tunnel vports have had a specific MTU, in the same way that
ordinary network devices have an MTU, but treating them this way does not
always make sense. For example, consider a datapath that has three ports:
the local port, a GRE tunnel to another host, and a physical port. If
the physical port is configured with a jumbo MTU, it should be possible to
send jumbo packets across the tunnel: the tunnel can do fragmentation or
the physical port traversed by the tunnel might have a jumbo MTU.
However, until now, tunnels always had a 1500-byte MTU by default. It
could be adjusted using ODP_VPORT_MTU_SET, but nothing actually did this.
One alternative would be to make ovs-vswitchd able to set the vport's MTU.
This commit, however, takes a different approach, of dropping the concept
of MTU entirely for tunnel vports. This also solves the problem described
above, without making any additional work for anyone.
I tested that, without this change, I could not send 1600-byte "pings"
between two machines whose NICs had 2000-byte MTUs that were connected to
vswitches that were in turn connected over GRE tunnels with the default
1500-byte MTU. With this change, it worked OK, regardless of the MTU of
the network traversed by the GRE tunnel.
This patch also makes "patch" ports MTU-less.
It might make sense to remove vport_set_mtu() and the associated callback
now, since ordinary network devices are the only vports that support it
now.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Suggested-by: Jesse Gross <jesse@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #3728.
This gives network device implementations the opportunity to fetch an
existing device's configuration and store it as their arguments, so that
netdev clients can find out how an existing device is configured.
So far netdev-vport is the only implementation that needs to use this.
The next commit will add use by clients.
Reviewed by Justin Pettit.
This commit removes the tunnel_egress_iface column from the
interface table and moves it's data to the status column. In the
process it reverts the database to version 1.0.0.
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath. Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath. Similarly, a vport
could be detached and deleted separately.
After some study, I think I understand why this distinction exists. It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them. However, changing it to create them before it tries to
open them is not difficult, so this commit does this.
The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Until now, the collection of coverage counters supported by a given OVS
program was not specific to that program. That means that, for example,
even though ovs-dpctl does not have anything to do with mac_learning, it
still has a coverage counter for it. This is confusing, at best.
This commit fixes the problem on some systems, in particular on ones that
use GCC and the GNU linker. It uses the feature of the GNU linker
described in its manual as:
If an orphaned section's name is representable as a C identifier then
the linker will automatically see PROVIDE two symbols: __start_SECNAME
and __end_SECNAME, where SECNAME is the name of the section. These
indicate the start address and end address of the orphaned section
respectively.
Systems that don't support these features retain the earlier behavior.
This commit also fixes the annoyance that files that include coverage
counters must be listed on COVERAGE_FILES in lib/automake.mk.
This commit also fixes the annoyance that modifying any source file that
includes a coverage counter caused all programs that link against
libopenvswitch.a to relink, even programs that the source file was not
linked into. For example, modifying ofproto/ofproto.c (which includes
coverage counters) caused tests/test-aes128 to relink, even though
test-aes128 does not link again ofproto.o.
Currently netdev_get_carrier() returns both a carrier status and
an error code. However, usage of the error code was inconsistent:
most callers either ignored it or didn't perform their task if an
error occured, which prevented bond rebalancing. This makes the
handling consistent by translating an error into a down status in
the netdev library.
Bug #3959
The only real difference between netdev-patch and netdev-tunnel is in their
parse_config() implementation. That's a lot of extra code to maintain, for
questionable benefit. This commit merges them into the netdev-vport code,
which was heretofore merely a collection of helper functions.
Currently we print a warning if a user tries to configure a
netdev that is not in the list that userspace knows about.
However, it is possible that a given netdev maybe be enabled but
when it tries to create a device it finds out that it can't
(not supported by kernel module, hardware not present, etc.).
This makes the behavior the same in both cases.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
ovs-vswitchd doesn't declare its QoS capabilities in the database yet,
so the controller has to know what they are. We can add that later.
The linux-htb QoS class has been tested to the extent that I can see that
it sets up the queues I expect when I run "tc qdisc show" and "tc class
show". I haven't tested that the effects on flows are what we expect them
to be. I am sure that there will be problems in that area that we will
have to fix.
The most recent revision of the netdev library added may_create
and may_open flags to explicitly state the intent of the caller as
to whether the device should already be in use. This was simply
a sanity check for users of the netdev library and the configuration.
At this point the netdev library and its users are well behaved and
should no longer need to be checked. Additional checks have also
been added for incorrect configuration that mean the netdev library
is no longer the primary line of defense.
These flags themselves create problems because it is not always
easy for a library to know what the state of devices should be.
This is particularly a problem for ovs-openflowd, which expects
ports to be added by ovs-dpctl. Fixing this either requires that
the checks are so permissive to be useless or ugly hacks to get
around them. Since they are no longer needed, just remove the
checks.
This commit restores the previous behavior of ovs-openflowd to
not require that ports be specified on the command line or
cleaned up after use.
Bug #2652
CC: Natasha Gude <natasha@nicira.com>
CC: Jean Tourrilhes <jt@hpl.hp.com>
CC: 蒲彦 <yan.p.bjtu@gmail.com>
Now that we have a new patch implementation, remove the veth driver
and its userspace components. Then rename 'patchnew' to 'patch'.
The new implementation is a drop-in replacement for the old one.
Add a netdev to talk to the 'patch' vport in the kenerl. Since
there is currently a 'patch' implementation using the veth driver,
this one is temporarily called 'patchnew'.