2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-24 10:58:04 +00:00

221 Commits

Author SHA1 Message Date
Daniele Di Proietto
910885540a dpif-netdev: use dpif_packet structure for packets
This commit introduces a new data structure used for receiving packets from
netdevs and passing them to dpifs.
The purpose of this change is to allow storing some private data for each
packet. The subsequent commits make use of it.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:12 -07:00
Ben Pfaff
e5e4b47cc2 Fix log message weird suffixes.
I think these were leftovers from the removal of %z for MSVC that happened
some time ago.

VMware-BZ: 1265762
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
2014-06-11 09:14:54 -07:00
Andy Zhou
04c881eb64 netdev-linux: favor netlink stats for physical ports
Currently physical ports stats are collected from kernel datapath.
However, those counter do not reflect actual wire packet counters
when GSO, TSO or GRO are enabled by the NIC. In the meantime, the
stats collected form routing stack does. While both stats are valid,
Reporting kernel netdev stats for packet counts and byte counts make
it easier to correlate those numbers with external measurements.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-02 18:10:46 -07:00
Alex Wang
3e912ffcbb netdev: Add 'change_seq' back to netdev.
This commit can be seen as a partial revert of commit
da4a619179d (netdev: Globally track port status changes)
by adding the 'change_seq' to 'struct netdev'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-10 12:55:28 -07:00
Pravin Shelar
1f317cb5c2 ofpbuf: Introduce access api for base, data and size.
These functions will be used by later patches.  Following patch
does not change functionality.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-30 06:18:43 -07:00
Pravin
f77917408a netdev: Rename netdev_rx to netdev_rxq
Preparation for multi queue netdev IO.  There are no functional changes
in this patch.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
40d26f04b2 netdev: Send ofpbuf directly to netdev.
DPDK netdev need to access ofpbuf while sending buffer. Following
patch changes netdev_send accordingly.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
df1e5a3bc7 netdev: Extend rx_recv to pass multiple packets.
DPDK can receive multiple packets but current netdev API does
not allow that.  Following patch allows dpif-netdev receive batch
of packet in a rx_recv() call for any netdev port.  This will be
used by dpdk-netdev.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-21 11:48:28 -07:00
Pravin Shelar
f35c2d60cd netdev-linux: Remove unnecessary header.
netdev-linux does not depend on linux kernel version. So
remove header file include.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-03-07 04:03:53 -08:00
Joe Stringer
d57695d77e netlink: Remove buffer from 'struct nl_dump'.
This patch makes all of the users of 'struct nl_dump' allocate their own
buffers to pass down to nl_dump_next(). This paves the way for allowing
multithreaded flow dumping.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:14:56 -08:00
Ben Pfaff
55bc98d6cb netdev-linux: Fix build break on RHEL 6.1.
Commit 73c85181d (netdev-linux: Read packet auxdata to obtain vlan_tid)
added #include <linux/if_packet.h> to this file, to get the definition
of PACKET_AUXDATA and some other definitions, but on RHEL 6.1 this
provoked compiler errors:

In file included from /usr/include/linux/rtnetlink.h:5,
    from lib/netdev-linux.c:34:
/usr/include/linux/netlink.h:34: error: expected specifier-qualifier-list
    before 'sa_family_t'

Since the old #includes worked everywhere, and this file already defined
its own versions of most of the new macros that it needed, this commit just
reverts the old #includes and adds the one macro definition it didn't
already have.

(RHEL 6.1 isn't necessarily the only platform where this is a problem, but
it's the first one for which we noticed the problem.)

This switches the definition of sockaddr_ll used from the Linux one, which
uses __be16 for sll_protocol, to the glibc one, which uses plain "unsigned
short int".  This makes sparse complain (rightly), so this commit also
adds a sparse-specific header that uses ovs_be16 to prevent the warning.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-16 23:40:43 -08:00
Simon Horman
b73c85181d netdev-linux: Read packet auxdata to obtain vlan_tid
If VLAN acceleration is used when the kernel receives a packet
then the outer-most VLAN tag will not be present in the packet
when it is received by netdev-linux. Rather, it will be present
in auxdata.

This patch uses recvmsg() instead of recv() to read auxdata for
each packet and if the vlan_tid is set then it is added to the packet.

Adding the vlan_tid makes use of headroom available
in the buffer parameter of rx_recv.

Signed-off-by: Simon Horman <horms@verge.net.au>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-16 15:09:14 -08:00
Simon Horman
bfd3367b9e netdev_class: Pass a struct ofpbuf * to rx_recv()
Update the netdev_class so that struct ofpbuf * is passed to rx_recv()
to provide both the data and size of the data to read a packet into.

On success, update struct ofpbuf size inside netdev_class rx_recv
implementation and return 0. This moves logic from the caller.
On error a positive error code is returned, whereas previously
a negative error code was returned. This is a more common convention.

This patch should not have any behavioural changes.

This patch is in preparation for the netdev-linux variant of rx_recv()
making use of headroom in the struct ofpbuf * parameter to push a VLAN tag
obtained from auxdata.

Signed-off-by: Simon Horman <horms@verge.net.au>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-16 14:37:43 -08:00
Ben Pfaff
13a24df8b9 netdev-linux: Simplify get_stats_via_netlink().
There's no need to obtain the ifindex, because RTM_GETLINK is happy to take
the interface name.  There's no need to do a full nl_policy_parse(),
because we only need a single attribute.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-12-30 14:04:21 -08:00
Ben Pfaff
35eef899b3 netdev-linux: Drop support for pre-2.6.19 kernels.
The OVS kernel module requires 2.6.32 or later, so there's no reason for
userspace to support older kernels.  This commit removes the special
fallback code for retrieving Linux netdev stats in pre-2.6.19 kernels,
which should no longer be useful.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-12-30 14:04:21 -08:00
Joe Stringer
da4a619179 netdev: Globally track port status changes
Previously, we tracked status changes for ofports on a per-device basis.
Each time in the main thread's loop, we would inspect every ofport
to determine whether the status had changed for corresponding devices.

This patch replaces the per-netdev change_seq with a global 'struct seq'
which tracks status change for all ports. In the average case where
ports are not constantly going up or down, this allows us to check the
sequence once per main loop and not poll any ports. In the worst case,
execution is expected to be similar to how it is currently.

In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces average CPU usage of the main thread from about 40% to
about 35%.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-12-12 15:03:19 -08:00
Alin Serdean
34582733d9 Avoid printf type modifiers not supported by MSVC C runtime library.
The MSVC C library printf() implementation does not support the 'z', 't',
'j', or 'hh' format specifiers.  This commit changes the Open vSwitch code
to avoid those format specifiers, switching to standard macros from
<inttypes.h> where available and inventing new macros resembling them
where necessary.  It also updates CodingStyle to specify the macros' use
and adds a Makefile rule to report violations.

Signed-off-by: Alin Serdean <aserdean@cloudbasesolutions.com>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-11-25 23:38:59 -08:00
Ben Pfaff
c2c28dfd68 Switch from sscanf() to ovs_scan() throughout the tree.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-11-15 08:55:02 -08:00
Joe Stringer
19c8e9c11b netdev-linux: Skip miimon execution when disabled
When dealing with a large number of ports, one of the performance
bottlenecks is that we loop through all netdevs in the main loop. Miimon
is a contributor to this, checking all devices even if it has never been
enabled.

This patch introduces a counter for the number of netdevs with miimon
configured. If this is 0, then we skip miimon_run() and miimon_wait().
In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces CPU usage from about 50% to about 45%.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-11-02 20:53:10 -07:00
Alexandru Copot
7ba19d412a netdev: update IFF_LOOPBACK flag for linux and bsd devices
Signed-off-by: Alexandru Copot <alex.mihai.c@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-09-07 09:33:51 -07:00
Ben Pfaff
89454bf477 netdev: Fix deadlock when netdev_dump_queues() callback calls into netdev.
We have a call chain like this:

    iface_configure_qos() calls
        netdev_dump_queues(), which calls
            netdev_linux_dump_queues(), which calls back through 'cb' to
                qos_unixctl_show_cb(), which calls
                    netdev_delete_queue(), which calls
                        netdev_linux_delete_queue().

Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.

This commit fixes the problem by getting rid of the callback.

netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.

Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-27 21:50:40 -07:00
Ben Pfaff
834d6cafe4 Use "error-checking" mutexes in place of other kinds wherever possible.
We've seen a number of deadlocks in the tree since thread safety was
introduced.  So far, all of these are self-deadlocks, that is, a single
thread acquiring a lock and then attempting to re-acquire the same lock
recursively.  When this has happened, the process simply hung, and it was
somewhat difficult to find the cause.

POSIX "error-checking" mutexes check for this specific problem (and
others).  This commit switches from other types of mutexes to
error-checking mutexes everywhere that we can, that is, everywhere that
we're not using recursive mutexes.  This ought to help find problems more
quickly in the future.

There might be performance advantages to other kinds of mutexes in some
cases.  However, the existing mutex type choices were just guesses, so I'd
rather go for easy detection of errors until we know that other mutex
types actually perform better in specific cases.  Also, I did a quick
microbenchmark of glibc mutex types on my host and found that the
error checking mutexes weren't any slower than the other types, at least
when the mutex is uncontended.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-08-20 13:40:02 -07:00
Ben Pfaff
73371c09f9 netdev-linux: Fix self-deadlocks in traffic control code.
htb_parse_qdisc_details__(), which is called with the netdev mutex, called
netdev_get_mtu(), which tried to reacquire the mutex and thus deadlocked.
This commit fixes the problem and similar problems in
htb_parse_class_details__() and hfsc_parse_qdisc_details__().

Bug #19180.
Reported-by: Dhaval Badiani <dbadiani@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-16 14:25:16 -07:00
Ben Pfaff
4f9f3f2135 netdev-linux: Avoid deadlock in netdev_linux_update_flags() for taps.
netdev_linux_set_etheraddr() would attempt to recursively acquire
netdev->mutex via netdev_linux_update_flags() for tap devices.

Reported-by: ZhengLingyun <konghuarukhr@163.com>
Tested-by: ZhengLingyun <konghuarukhr@163.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-10 20:42:50 -07:00
Ben Pfaff
38e0065b1f netdev-linux: Fix netdev leak in corner case.
Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-10 09:02:56 -07:00
Ben Pfaff
863838160e netdev: Make netdev access thread-safe.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-09 21:34:02 -07:00
Ben Pfaff
cee8733832 netdev-linux: Use dedicated netlink notification socket.
The rtnetlink_link asynchronous netlink notifications seem somewhat
troublesome in a threaded environment.  It seems more straightforward
to have netdev-linux fend for itself.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-09 21:29:03 -07:00
Ben Pfaff
9dc63482bb netdev: Adopt four-step alloc/construct/destruct/dealloc lifecycle.
This is the same lifecycle used in the ofproto provider interface.
Compared to the previous netdev provider interface, it has the
advantage that the netdev top layer can control when any given
netdev becomes visible to the outside world.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-09 21:21:38 -07:00
Ben Pfaff
259e0b1ad1 netdev-linux, netdev-bsd: Make access to AF_INET socket thread-safe.
The only uses of 'af_inet_sock', in both drivers, were ioctls, so it seemed
like a good abstraction to write a function that just does such an ioctl,
and to factor out shared code into socket-util.

Signed-off-by: Ben Pfaff <blp@nicira.com>
CC: Ed Maste <emaste@freebsd.org>
2013-08-09 21:14:23 -07:00
Ben Pfaff
991e5fae57 netdev: Make netdev_from_name() take a reference to its returned netdev.
This API change is necessary for thread safety, to be added in an upcoming
commit.  Otherwise, the client would not be able to safely use the returned
netdev because it could already have been destroyed.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:52:43 -07:00
Ben Pfaff
2f980d7417 netdev: Make netdev_get_devices() take a reference to each netdev.
This API change is necessary for thread safety, to be added in an upcoming
commit.  Otherwise, the client would not be able to actually use any of
the returned netdevs because they could already have been destroyed.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:52:42 -07:00
Ben Pfaff
c22762302c netdev-linux: Move variable declaration inward in netdev_linux_cache_cb().
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:40:32 -07:00
Ben Pfaff
887ed8b26f netdev-linux: Remove useless member 'peer', which was always zero.
Always, correct a comment on netdev_linux_get_features().

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:40:32 -07:00
Ben Pfaff
1755ec4004 netdev-linux: Remove unused struct netdev_linux member.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:40:32 -07:00
Ben Pfaff
d0d08f8aa5 netdev-linux: Remove pointless layers of indirection for tap devices.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:40:31 -07:00
Ben Pfaff
15aee11695 netdev: Minor formatting improvements.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-02 12:24:19 -07:00
Ben Pfaff
96172faa34 netdev-linux: Don't assume 'struct netdev' has offset 0.
The data items returned by netdev_get_devices() are "struct netdev *"s.
The code fixed up by this commit used them as "struct netdev_linux *",
which happens to work because struct netdev happens to be at offset 0 in
each struct but it's better to do a proper cast in case someday
struct netdev gets moved to a nonzero offset.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-02 12:24:18 -07:00
Ben Pfaff
2e5ae318d5 netdev-linux: Initialize change_seq for tap devices too.
change_seq is supposed to always be nonzero but tap devices got this wrong.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-02 12:24:18 -07:00
Ben Pfaff
f61d8d2931 netdev-linux: Fix fd leak on error path.
Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-02 12:24:18 -07:00
Ben Pfaff
6dc34a0dab Implement OpenFlow 1.3 queue stats duration feature.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-26 10:20:04 -07:00
Alex Wang
db5a101931 clang: Fix the alignment warning.
This commit fixes the warning issued by 'clang' when pointer is casted
to one with greater alignment.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-23 12:34:41 -07:00
Ben Pfaff
2388211565 netdev-linux: Work on thread safety.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-07-23 11:37:57 -07:00
Ben Pfaff
a88b4e0412 netlink-socket: Simplify use of transactions and dumps.
This disentangles "struct nl_dump" from "struct nl_sock", clearing the way
to make the use of either one thread-safe in an obviously correct manner.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-18 13:18:46 -07:00
ZhengLingyun
bb5c146881 netdev-linux: avoid negative value cast to non-negative in comparison
I am using a userspace OVS, there is a constant ERR message in log:
"dpif_netdev|ERR|error receiving data from ovs-netdev: Message too long"
Check the code and find there is a comparison between ssize_t and size_t type in
netdev_rx_linux_recv(). It makes the negative "retval" greater than the "size".

Change the sequence of comparison to avoid negative return value cast to
a positive and become greater then a non-negative value.

Signed-off-by: ZhengLingyun <konghuarukhr@163.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-18 11:13:43 -07:00
Ben Pfaff
10a89ef04d Replace all uses of strerror() by ovs_strerror(), for thread safety.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-28 16:09:38 -07:00
Murphy McCauley
32383c3bd0 lib/netdev-linux.c: Prevent receiving of sent packets
Commit 796223f5 (netdev: Add new "struct netdev_rx" for capturing packets
from a netdev) refactored send and receive into separate netdevs.  As a
result, send and receive now use different socket descriptors (except for tap
interfaces which are treated specially).  An unintended side effect was that
all sent packets are looped back and received, which had previously been
avoided as the kernel specifically prevents this from happening on a single
socket descriptor.

To resolve the situation, a socket filter is added to the receive socket
so that it only accepts inbound packets.

Simon Horman co-discovered and initially reported this issue.

Signed-off-by: Murphy McCauley <murphy.mccauley@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Tested-by: Simon Horman <horms@verge.net.au>
Reviewed-by: Simon Horman <horms@verge.net.au>
2013-06-13 14:43:29 -07:00
Ben Pfaff
bbd5b6f44b netdev-linux: Skip NETDEV_UP test in netdev_linux_set_etheraddr() for taps.
netdev_turn_flags_off() does nothing if the flags that one turns off are
already off.

Reported-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-05-13 16:43:17 -07:00
Ben Pfaff
b5d57fc879 netdev: Get rid of netdev_dev.
The distinction between struct netdev_dev and struct netdev has always
been confusing.  Now that previous commits have eliminated all interesting
state from struct netdev, this commit deletes it and renames struct
netdev_dev to take its place.  Now the situation makes much more sense and
I won't have to continue making embarrassed explanations in the future.

Good riddance.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-10 14:46:15 -07:00
Ben Pfaff
180c6d0b44 Rename superclass members to 'up'.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-05-10 14:42:30 -07:00
Ben Pfaff
796223f5bc netdev: Add new "struct netdev_rx" for capturing packets from a netdev.
Separating packet capture from "struct netdev" means that there is no
remaining per-"struct netdev" state, which will allow us to get rid of
"struct netdev_dev" (by renaming it "struct netdev").

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-05-10 14:39:36 -07:00