2
0
mirror of https://github.com/openvswitch/ovs synced 2025-10-25 15:07:05 +00:00
Commit Graph

85 Commits

Author SHA1 Message Date
Ben Pfaff
1fa39e1d50 process: Factor code out of process_start() into helper functions.
An upcoming commit will add a new function that can also use these helper
functions.
2009-07-16 09:17:06 -07:00
Ben Pfaff
52dc2ef41b process: Fix races on fatal signal handling in process_start().
To prevent fatal signals in a child process from causing the parent
process's pidfile, etc. to be deleted, we need to block fatal signals
around fork and call fatal_signal_fork() in the child process.

This problem was noticed through code inspection; it has not been observed
in practice.
2009-07-16 09:17:06 -07:00
Ben Pfaff
8c4c1387fd vswitchd: New unixctl command "fdb/show" to print the MAC learing table.
To implement "brctl showmacs" for bridge compatibility, brcompatd needs to
be able to extract the MAC learning table from ovs-vswitchd.  This provides
a way, and it may be directly useful to switch administrators also.
2009-07-16 09:17:06 -07:00
Ben Pfaff
321943f790 mac-learning: New function mac_entry_age().
This function will be used as part of printing the MAC learning table at
user request.
2009-07-16 09:17:06 -07:00
Ben Pfaff
eaa7133434 Add macros for parsing MAC addresses from strings. 2009-07-16 09:17:06 -07:00
Ben Pfaff
4871855420 New function ds_steal_cstr(). 2009-07-16 09:17:06 -07:00
Ben Pfaff
a8b5f8b423 Add function get_null_fd(), to reduce code redundancy. 2009-07-16 09:17:03 -07:00
Ben Pfaff
aec7d2fba6 brcompat: Improve comments in header file. 2009-07-15 12:45:35 -07:00
Ben Pfaff
25ce84b23b vswitchd: Skip updelay on slave when only a single bond slave is up.
If a network device takes a few seconds to detect carrier, as some do, then
when bringing up a network device and then immediately adding that device
to a bridge, the bond code would start out with that slave considered down
and apply the full updelay to it before bringing it up.  With the 31-second
updelay set by XenServer, this is excessive: we end up having no
connectivity at all for 31 seconds even though there is no reason for it.

This commit makes the bond code disregard the updelay when an interface
comes up on a bond that has no enabled interfaces, and updates the
documentation to match.

Part of bug #1566.
2009-07-14 17:04:40 -07:00
Ben Pfaff
17ea75b2a2 vswitchd: Fix log messages when bond slaves are enabled or disabled.
We were printg "enabled" when a slave was disabled and vice versa.

Part of bug #1566.
2009-07-14 17:04:40 -07:00
Justin Pettit
e775da1437 Fix unitialized variable in coverage_log()
When providing the ability to force coverage printouts to occur, some
code was moved around that allowed the "hash" variable to be used
unitialized.  This fixes that.

Thanks to Ben for pointing out the problem.

Bug #1577
2009-07-14 00:25:44 -07:00
Ben Pfaff
923229363a datapath: Don't orphan packets in dp_dev transmit path.
Before commit 72ca14c1 "datapath: Fix race against workqueue in dp_dev and
simplify code," the dp_dev network device had a device queue, and we would
orphan packets before sticking them on the queue.  This screwed up socket
accounting a bit, but the effect was limited to the device queue length.

Now, after that commit, the dp_dev device has no device queue, but it still
orphans packets.  This screws up socket accounting a *lot*, because the
effect is now unlimited, since there is no queue to limit it.

The solution is to not orphan packets at all.  There is little need for it
now since packet transmission now happens immediately, not in a workqueue
whose execution may be delayed.

This should fix bug #1519, which tests "netperf -t UDP_STREAM" performance,
finding that an unrealistically high number of UDP packets could be sent
but that none at all were received.  The send rate is due to the orphaning,
the receive rate presumably because at least one out of approx. 65535/1500
= 44 fragments per full packet were dropped in each case.
2009-07-13 15:50:32 -07:00
Ben Pfaff
d7cca86710 vconn: Fix detection of vconn local IP address, to fix in-band control.
The in-band control code needs to know the IP and port of both ends of the
control connection.  However, the vconn code was only reporting the local
address after the connection had already succeeded, which created a
chicken-and-egg problem.  In practice we would fail to connect until the
switch went into fail-open, at which point the connection would go through.

Fortunately, we can get the local IP address right after we try to connect,
not just after the connection completes, so this commit changes the code
to do that.

This commit also breaks setting the remote IP and port into functions
separate from vconn_init(), which makes the code more readable.
2009-07-13 15:06:59 -07:00
Ben Pfaff
00c030a597 Revert "datapath: Don't orphan packets in dp_dev transmit path."
This reverts commit 0858ad2b9d.
Although that commit partially fixed a performance regression
relative to the Linux bridge, it introduced other problems that
were not apparent in the performance testing (e.g.
WARN_ON_ONCE(skb->destructor) now triggers in dp_process_received_packet()).
The fix is not completely obvious, so reverting now to ensure stability
while investigating the problem.
2009-07-13 11:56:26 -07:00
Ben Pfaff
0858ad2b9d datapath: Don't orphan packets in dp_dev transmit path.
Before commit 72ca14c1 "datapath: Fix race against workqueue in dp_dev and
simplify code," the dp_dev network device had a device queue, and we would
orphan packets before sticking them on the queue.  This screwed up socket
accounting a bit, but the effect was limited to the device queue length.

Now, after that commit, the dp_dev device has no device queue, but it still
orphans packets.  This screws up socket accounting a *lot*, because the
effect is now unlimited, since there is no queue to limit it.

The solution is to not orphan packets at all.  There is little need for it
now since packet transmission now happens immediately, not in a workqueue
whose execution may be delayed.

This should fix bug #1519, which tests "netperf -t UDP_STREAM" performance,
finding that an unrealistically high number of UDP packets could be sent
but that none at all were received.  The send rate is due to the orphaning,
the receive rate presumably because at least one out of approx. 65535/1500
= 44 fragments per full packet were dropped in each case.
2009-07-13 11:21:50 -07:00
Justin Pettit
67ca9e6d90 Remove "coverage/clear" command due its limited use
The poll loop calls coverage_clear on every pass anyway, so provide a
function to call it separately is of limited value.
2009-07-10 17:36:09 -07:00
Justin Pettit
cae40bbd0c Fix small typo in ovs-ofctl man page. 2009-07-10 17:33:41 -07:00
Justin Pettit
f5c6854a73 Provide ability to retrieve coverage information
Previously, there was no way to induce coverage information to be
displayed; it would only print when the system noticed unusual delays
between polling intervals.  Now, production of coverage logs can be
forced with "coverage/log" command in ovs-appctl.  Coverage counters may
be reset with "coverage/clear".
2009-07-10 15:09:41 -07:00
Ben Pfaff
10bf9f637b datapath: Fix races in updating dp_dev port statistics. 2009-07-08 14:13:15 -07:00
Ben Pfaff
828bc1f072 datapath: Fix race in datapath creation.
Before we create the local port, we should allocate and assign the table.
Otherwise packets sent on the local port before we do so will cause an
OOPS.

This is a theoretical race that has not been observed in practice.
2009-07-08 14:13:15 -07:00
Ben Pfaff
72ca14c154 datapath: Fix race against workqueue in dp_dev and simplify code.
The dp_dev_destroy() function failed to cancel the xmit_queue work, which
allowed it to run after the device had been destroyed, accessing freed
memory.  However, simply canceling the work with cancel_work_sync() would
be insufficient, since other packets could get queued while the work
function was running.  Stopping the queue with netif_tx_disable() doesn't
help, because the final action in dp_dev_do_xmit() is to re-enable the TX
queue.

This issue led me to re-examine why the dp_dev needs to use a work_struct
at all.  This was implemented in commit 71f13ed0b "Send of0 packets from
workqueue, to avoid recursive locking of ofN device" due to a complaint
from lockdep about recursive locking.

However, there's no actual reason that we need any locking around
dp_dev_xmit().  Until now, it has accepted the standard locking provided
by the network stack.  But looking at the other software devices (veth,
loopback), those use NETIF_F_LLTX, which disables this locking, and
presumably do so for this very reason.  In fact, the lwn article at
http://lwn.net/Articles/121566/ hints that NETIF_F_LLTX, which is otherwise
discouraged in the kernel, is acceptable for "certain types of software
device."

So this commit switches to using NETIF_F_LLTX for dp_dev and gets rid
of the work_struct.

In the process, I noticed that veth and loopback also take advantage of
a network device destruction "hook" using the net_device "destructor"
member.  Using this we can automatically get called on network device
destruction at the point where rtnl_unlock() is called.  This allows us
to stop stringing the dp_devs that are being destroyed onto a list so
that we can free them, and thus simplifies the code along all the paths
that call dp_dev_destroy().

This commit gets rid of a call to synchronize_rcu() (disguised as a call
to synchronize_net(), which is a macro that expands to synchronize_rcu()),
so it probably speeds up deleting ports, too.
2009-07-08 14:13:15 -07:00
Ben Pfaff
6d867ef335 datapath: Remove declarations of functions that are never defined or used. 2009-07-08 14:13:15 -07:00
Ben Pfaff
6fba0d0b82 datapath: Fix use-after-free error in datapath destruction.
When we create a datapath we do this:

	1. Create local port.
	2. Call add_dp hook.
	3. Allow userspace to add more ports.

When we deleted a datapath we were doing this:

	1. Call del_dp hook
	2. Delete all the ports.

Unfortunately step 1 destroys dp->ifobj, then dp_del_port on any port other
than the local port in step 2 tries to reference dp->ifobj through a call
to sysfs_remove_link().

This commit fixes the problem by changing datapath deletion to mirror
creation:

	1. Delete all the ports but the local port.
	2. Call dp_del hook.
	3. Delete local port.

Commit 010082639 "datapath: Add sysfs support for all (otherwise supported)
Linux versions" makes this problem obvious on a 2.6.25+ kernel configured
with slab debugging, because on such kernels the ifobj is a pointer to a
slab object that is freed by the del_dp hook function (when brcompat_mod
is loaded).  This bug may be just as present on older kernels, but there
the ifobj is part of struct datapath, not a pointer, and thus it is much
harder to trigger.

Bug #1465.
2009-07-08 14:13:15 -07:00
Ben Pfaff
334b374988 datapath: Remove redundant synchronize_rcu() call.
There is no benefit to synchronizing twice, and it might cost us a lot of
time.
2009-07-08 14:13:15 -07:00
Justin Pettit
26d9fe3b2d secchan: Various fixes to run in-band on non-"local" port
The in-band control code assumed that we would connect to the controller
over the datapath's "local" port.  If the switch had multiple datapaths,
and the controller was only available through one of them, the other
datapaths would fail to connect.  This is addressed by the following
changes:

    - The controller's MAC address is looked up through the connection's
      interface instead of the datapath's local one.

    - We allow ARP traffic to be sent by the connection's interface.

    - We allow ARP traffic to be sent to the controller's MAC address.
      This is necessary if the controller is running locally (e.g. in a VM).

Bug #1466
2009-07-08 11:19:52 -07:00
Justin Pettit
79c720a830 Provide method to locate device by IP and add more nodev functions to netdev
In some cases we need to be able to locate a device by its IPv4 address.
There doesn't seem to be an easy way to do this on Linux, so we iterate
through all devices until it's found.  The caller can provide a hint as
to the device, so subsequent checks can be quicker.

This checkin also adds nodev versions of functions to lookup ARP entries
and the IPv4 address of a device.
2009-07-08 11:19:52 -07:00
Justin Pettit
ae8871ec5b Show more information about connection when querying status.
It is often useful in debugging to know the ports and IP addresses being
used in an OpenFlow connection.  This commit adds the remote and local
IP addresses and ports used for a connection to the "status" information
returned by ovs-ofctl.

Feature #1484.
2009-07-08 11:19:52 -07:00
Justin Pettit
193456d581 Have rconn and vconn export information about IPs and ports
Previously, rconn and vconn only allowed users to find out about the
remote IP address.  This set of changes allows users to retrieve the
remote port, local IP, and local port used for the connection.
2009-07-08 11:19:51 -07:00
Justin Pettit
ab9c78ff50 Include "inttypes.h" to pickup definition of PRIu8. 2009-07-08 11:19:51 -07:00
Justin Pettit
3b250e2911 secchan: Display mgmt and datapath id in status messages
Since it's not always easy to determine the management id, this adds the
ability to see it (and the datapath id) with the "ovs-ofctl status"
command.

Feature #1533
2009-07-07 09:34:37 -07:00
Justin Pettit
fef94d151c vswitchd: Add missing argument to check for update-resolv.conf config 2009-07-06 22:12:53 -07:00
Justin Pettit
a1525581ae vswitchd: Enable updating resolv.conf by default when using discovery
When vSwitch does discovery, it is supposed to update resolv.conf by
default.  The way configuration parameters were being read, it would
disable this update by default.
2009-07-01 13:58:19 -07:00
Justin Pettit
952efc486d vswitch: Set minimum probe interval to 5 seconds
In vSwitch, the minimum probe interval is supposed to be 5 seconds, but
that was not enforced.  If no interval was specified in the config file,
a value of 0 was being used, which would cause probes to never be sent
and the rconn not to move out of its ACTIVE state.

Possible fix to Bug #1466.
2009-06-30 15:24:54 -07:00
Ben Pfaff
7df824b7b2 Log more rconn status. 2009-06-26 16:00:07 -07:00
Justin Pettit
7e40e21d8f xenserver: Remove cacert when user reconfigures the controller
If a user moves from one controller to another, we did not remove the
cacert.  This prevents the switch from connecting to the new controller.
To ease confusion, we now delete the cacert when the user changes or
removes the controller in xsconsole.

Note: This commit has a minor security issue, since we do not remove
trust for the old certificate until the switch is restarted.  In
general, users should only be connected to trusted servers, so the
impact should be low.  Fixes this would require larger changes to the
vconn-ssl code, which we don't want to do so late in the release cycle.

Bug #1457
2009-06-26 12:39:50 -07:00
Justin Pettit
3dc6fca88b xenserver: Validate controller IP address in xsconsole
When a switch is using in-band control, the controller must be specified
in dotted quad format, since DNS names cannot be resolved until a
connection to the controller has been established.  This commit
validates the user input in the xsconsole plugin.
2009-06-24 21:52:34 -07:00
Justin Pettit
7c77c24dbd vswitchd: Adding and removing mgmt interface breaks connection
When a managment connection is configured and then removed, putting it
back causes the management connection to never be reestablished.  The
management code checks whether the configuration file has changed before
it attempts to reconfigure itself.  If the only thing that changed was
the lack of a management connection, then it tore down the connection
but didn't update its view of the configuration.  When the same
manager IP is configured, the cached version matches the new version, so
no changes are made.  This commit clears the cached version, so that a
removing and then adding the manager will be detected as a change.

Bug #1448
2009-06-24 18:03:44 -07:00
Ben Pfaff
083652d401 xenserver: Also log ovs-brcompatd messages at INFO level to syslog.
INFO level messages are meant to be logged in the ordinary case, and they
are useful for debugging problems, so turn them on by default.

It would be a good idea to do so for ovs-vswitchd also, but we have not
tested how much this would increase the log volume.
2009-06-23 11:03:52 -07:00
Ben Pfaff
f06b2aa9dd ovs-brcompatd: Turn up log level of port removal messages.
These messages proved useful during debugging, and they should not be too
common, so log them at a higher level.
2009-06-23 11:02:30 -07:00
Ben Pfaff
385533816c ovs-brcompatd: Handle XS Tools 5.0.0 destroying and recreating devices
XenServer Tools version 5.0.0 destroys and recreates network devices with
the same name on boot of (at least) Windows VMs.  We had a race such that
ovs-brcompatd would delete the new device from the vswitchd configuration
file (not the old one).  This commit fixes that problem.

Bug #1429.
2009-06-23 11:00:43 -07:00
Justin Pettit
23834de273 vswitchd: Reduce number of calls to reconfigure() during mgmt updates
When we receive an OpenFlow management protocol Config Update, we
immediately force the switch to reconfigure itself.  This is
functionally correct, but it can cause long delays before return control
back to the switch.  We now keep track of whether there were any changes
and then only force a reconfigure once per management run.
2009-06-19 17:41:42 -07:00
Ben Pfaff
0c7af78c37 cfg: Log accurate waiting times in cfg_lock().
When cfg_lock() has to block for some time to obtain the configuration file
lock, it logs the amount of time that it waited.  However, it did not
refresh the current time before it began waiting, so the time that it
logged could be off by a significant amount, which make interpreting the
log file more challenging than it should have been.

This change should mainly affect log output.  It should have little or no
effect on Open vSwitch operation because the factor by which the timeouts
were off is an order of magnitude smaller than the actual timeouts that we
pass into the function.

This is related to bug #1426, but it is not a fix for this bug, which will
be committed separately.
2009-06-19 17:14:53 -07:00
Justin Pettit
6eab8112cf xenserver: xsconsole plugin doesn't need execute permissions
When the vSwitch xsconsole plugin is installed, it doesn't need execute
permissions.  This commit changes the permissions from 755 to 644 to
match the other plugins.
2009-06-18 14:17:32 -07:00
Justin Pettit
73945b1fdd xenserver: Handle slave disconnection more gracefully in xsconsole plugin
When a slave cannot connect to the master, the vSwitch xsconsole plugin
complained with some Python style errors on the main display.  This
commit cleans up that behavior.

Bug #1341
2009-06-18 14:13:00 -07:00
Justin Pettit
f304568eaa xenserver: Force reload of config file after VIF deleted
When a VIF is deleted, the "vif" script modifies "/etc/ovs-vswitchd.conf".
After changes are made to the config file, ovs-vswitchd should be told
to reload it, but this wasn't happening.  Now it does.

Thanks to Natasha for catching this.
2009-06-16 12:57:25 -07:00
Ben Pfaff
5eccf35939 Replace SHA-1 library with one that is clearly licensed.
The SHA-1 library that we used until now was taken from RFC 3174.  That
library has no clearly free license statement, only a license on the text
of the RFC.  This commit replaces this library with a modified version of
the code from the Apache Portable Runtime library from apr.apache.org,
which is licensed under the Apache 2.0 license, the same as the rest of
Open vSwitch.
2009-06-15 16:03:28 -07:00
Ben Pfaff
a14bc59fb8 Update primary code license to Apache 2.0. 2009-06-15 15:11:30 -07:00
Ben Pfaff
e2ead27a72 vswitch: Avoid segfault when revalidating ARP flows.
The 'packet' argument to process_flow() is allowed to be null, but some of
the code was assuming that it was always non-null, which caused a segfault
while revalidating ARP flows.

Bug #1394.
2009-06-15 13:07:25 -07:00
Justin Pettit
91b47481fc xenserver: Fix key used to identify network UUID
The "dump-vif-details" script adds the network UUID to the
ovs-vswitchd.conf file.  Unfortunately, it wrote the key as
"network-uuid", but the code that retrieves it for the management
protocol checked our "net-uuid".  The script now uses the key
"net-uuid".

Thanks to Natasha for catching the problem.
2009-06-11 18:30:41 -07:00
Ben Pfaff
986b77c099 xenserver: Pass network UUID to controller for internal networks too.
vNetManager needs to know the xapi UUIDs for the networks that correspond
to OpenFlow connections.  For some time now we have passed these to it
over the management connection using bridge.<bridgename>.xs-network-uuids
configuration keys, but only now did we notice that this didn't get set for
internal networks.

The reason that it didn't get set is that interface-reconfigure is the
script that sets up these configuration keys, but interface-reconfigure
is never called for internal networks.  Instead, xapi creates them itself
using directly calls to bridge ioctls.  So no amount of tweaks to
interface-reconfigure will help.

This commit fixes the problem by modifying the vif script instead.  This
works acceptably only because xapi is lazy about creating bridges for
internal networks: it creates them only just before it is about to add the
first vif to them.  Thus, by setting up the configuration key in the vif
script, it gets added just after the bridge itself is created.  There is
a race, of course, meaning that there may be a delay between the initial
OpenFlow connection and the time when the configuration key is set up,
but vNetManager can tolerate that.
2009-06-11 11:33:39 -07:00