mirror of https://github.com/openvswitch/ovs synced 2025-08-30 05:47:55 +00:00

149 Commits

Author SHA1 Message Date
Jesse Gross
d625fbd13e tunneling: Convert tunnel push/pop functions to act on single packets.
The userspace tunneling API for pushing and popping tunnel headers
is currently based on processing batches of packets. However, there
is no obvious way to take advantage of batching for these operations
and so each tunnel operation has a pair of loops to process the
batch. This changes the API to operate on single packets to enable
better code reuse.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-09 14:29:08 -07:00
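The refactoring described in d625fbd13e can be sketched roughly as follows. This is a minimal illustration, not the OVS code: `struct pkt`, `push_header()` and `apply_to_batch()` are hypothetical names, and the assumption is simply that a per-packet operation plus one shared loop replaces a loop inside every tunnel operation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a datapath packet (not the OVS dp_packet). */
struct pkt {
    unsigned char buf[128];
    size_t headroom;            /* Unused bytes in front of the data. */
    size_t len;                 /* Bytes in use, starting at buf + headroom. */
};

/* A per-packet operation: returns 0 on success, a positive errno on error. */
typedef int pkt_op(struct pkt *, const void *hdr, size_t hdr_len);

/* Prepends 'hdr' to a single packet, using its headroom. */
static int
push_header(struct pkt *p, const void *hdr, size_t hdr_len)
{
    assert(p != NULL);
    if (hdr_len > p->headroom) {
        return ENOSPC;
    }
    p->headroom -= hdr_len;
    p->len += hdr_len;
    memcpy(p->buf + p->headroom, hdr, hdr_len);
    return 0;
}

/* The only loop: applies any single-packet operation to a batch and
 * returns the number of packets for which it succeeded. */
static size_t
apply_to_batch(struct pkt **pkts, size_t n, pkt_op *op,
               const void *hdr, size_t hdr_len)
{
    size_t n_ok = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!op(pkts[i], hdr, hdr_len)) {
            n_ok++;
        }
    }
    return n_ok;
}
```

With this shape, a new tunnel type only has to supply a single-packet function; the batch handling is written once.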
Ricky Li
c876a4bb9b netdev: Fix user space tunneling for set_tunnel action.
e.g. Set tunnel id for encapsulated VxLAN packet (out_key=flow):

ovs-vsctl add-port int-br vxlan0 -- set interface vxlan0 \
    type=vxlan options:remote_ip=172.168.1.2 options:out_key=flow

ovs-ofctl add-flow int-br in_port=LOCAL, icmp,\
    actions=set_tunnel:3, output:1 (1 is the port# of vxlan0)

Output tunnel ID should be modified to 3 with this patch.

Signed-off-by: Ricky Li <ricky.li@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-03-26 18:56:12 -07:00
Pravin B Shelar
e14deea0bd dpif_packet: Rename to dp_packet
dp_packet is a shorter and better name for the datapath packet
structure.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-03-03 13:37:34 -08:00
Mark D. Gray
87df2bfc73 netdev-dpdk: Fix typo
Signed-off-by: Mark D. Gray <mark.d.gray@intel.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-02-13 12:32:07 -08:00
Thomas Graf
ca6ba70092 list: Rename struct list to struct ovs_list
struct list is a common name and can't be used in public headers.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-12-15 14:15:12 +01:00
Pravin B Shelar
a36de779d7 openvswitch: Userspace tunneling.
The following patch adds support for userspace tunneling. Tunneling
needs three more components. The first is a routing table, which is
configured by caching kernel routes. The second is an ARP cache, which
is built automatically by snooping ARP. The third is a tunnel protocol
table, which lists all listening protocols and is populated by vswitchd
as tunnel ports are added. GRE and VXLAN protocol support is added in
this patch.

Tunneling works as follows:
On packet receive, vswitchd checks whether the packet is targeted to a
tunnel port. If it is, vswitchd inserts a tunnel pop action, which pops
the header and sends the packet to the tunnel port.
On packet xmit, rather than generating a Set tunnel action, vswitchd
generates a tunnel push action, which carries the tunnel header data.
The datapath can use the tunnel-push action data to generate the header
for each packet and forward the packet to the output port. Since the
tunnel-push action contains most of the packet header, vswitchd needs
to look up the routing table and the ARP table to build this action.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-11-12 15:08:33 -08:00
Wang Sheng-Hui
3bd0fd39eb Use magic ETH_ADDR_LEN instead of 6 for Ethernet address length.
ETH_ADDR_LEN is defined in lib/packets.h, valued 6.
Use this macro instead of magic number 6 to represent the length
of eth mac address.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-22 08:46:52 -07:00
Nithin Raju
078eedf4f7 netdev-windows: New module.
In this patch, we add a lib/netdev-windows.c which mostly contains stub
code and in subsequent patches, would use the netlink interface to query
netdev information for a vport.

The code implements netdev functionality for "internal" and "system"
types of vports.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-06 15:38:22 -07:00
Alex Wang
5496878cbf netdev: Add function for configuring tx and rx queues.
This commit adds a new API to the 'struct netdev_class' which
allows user to configure the number of tx queues and rx queues
of 'netdev'.  Upcoming patches will use this function to set
multiple tx/rx queues when adding the netdev to dpif-netdev.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15 11:43:48 -07:00
Pravin B Shelar
2f9dd77fcd ofproto: Do not update stats on fake bond interface.
There are a couple of reasons to remove this support:
*   This is used in a very old OVS use-case. It is much better
    to read stats directly from OVS.
*   A forthcoming commit will remove support for setting stats
    for vport. The stats update depends on stats-set.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-09-15 10:08:56 -07:00
Alex Wang
f00fa8cbad netdev: Add n_txq to 'struct netdev'.
This commit adds new variable n_txq to 'struct netdev' for recording
the number of tx queues.  Correspondingly, the send_*() functions are
extended to accept queue id as input argument.

All 'netdev-*' implementations will ignore the queue id since having
multiple tx queues is not supported.  Upcoming patches will start
using it and create multiple tx queues for dpdk netdev.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-12 11:30:58 -07:00
Alex Wang
7dec44fe1c netdev: Add function for getting the numa node id of netdev.
This commit adds a new API to the 'struct netdev_class' which
allows user to query the numa node id the 'netdev' is on.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-12 11:30:58 -07:00
Ben Pfaff
c27fd2f912 netdev-provider: Clarify comment where 'get_next_hop' function looks.
Some readers thought it was looking in an ofproto- or netdev-specific
table.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
2014-07-24 12:51:28 -07:00
Daniele Di Proietto
f4fd623c4c netdev: netdev_send accepts multiple packets
The netdev_send function has been modified to accept multiple packets, to
allow netdev providers to amortize locking and queuing costs.
This is especially true for netdev-dpdk.

Later commits exploit the new API.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:13 -07:00
Daniele Di Proietto
910885540a dpif-netdev: use dpif_packet structure for packets
This commit introduces a new data structure used for receiving packets from
netdevs and passing them to dpifs.
The purpose of this change is to allow storing some private data for each
packet. The subsequent commits make use of it.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:12 -07:00
Ethan Jackson
053acb81bc netdev: Allow netdev_change_seq_changed() to accept const pointers.
This fixes the following warning in the DPDK code.

../lib/netdev-dpdk.c:790:31: error: passing 'const struct netdev *' to
parameter of type 'struct netdev *' discards qualifiers

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-07 14:16:50 -07:00
Alex Wang
41ca1e0afb netdev-vport: Checks tunnel status change when route-table is reset.
Commit 3e912ffcbb (netdev: Add 'change_seq' back to netdev.) added per-
netdev change number for indicating status change.  Future commits used
this change number to optimize the netdev status update to database.
However, the work also introduced a bug in the following scenario:

- assume interface eth0 has address 1.2.3.4, eth1 has address 10.0.0.1.
- assume tunnel port p1 is set with remote_ip=10.0.0.5.
- after setup, 'ovs-vsctl list interface p1 status' should show the
  'tunnel_egress_iface="eth1"'.
- now the address of eth1 is changed to 0 via 'ifconfig eth1 0'.
- as expected, after the change, 'ovs-vsctl list interface p1 status'
  should show the 'tunnel_egress_iface="eth0"'

However, 'tunnel_egress_iface' will not be updated on current master.
This is because the 'netdev-vport' module corresponding to p1 does
not react to routing related changes.

To fix the bug, this commit adds a change sequence number in the route-
table module and makes netdev-vport check the sequence number for
tunnel status update.

Bug #1240626

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-02 14:29:18 -07:00
Alex Wang
3e912ffcbb netdev: Add 'change_seq' back to netdev.
This commit can be seen as a partial revert of commit
da4a619179d (netdev: Globally track port status changes)
by adding the 'change_seq' to 'struct netdev'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-10 12:55:28 -07:00
Pravin
55c955bd8a netdev: Add support multiqueue recv.
New netdev types like DPDK can support multi-queue IO. The following
patch adds support for the same.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
f77917408a netdev: Rename netdev_rx to netdev_rxq
Preparation for multi queue netdev IO.  There are no functional changes
in this patch.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
40d26f04b2 netdev: Send ofpbuf directly to netdev.
The DPDK netdev needs to access the ofpbuf while sending a buffer. The
following patch changes netdev_send accordingly.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
df1e5a3bc7 netdev: Extend rx_recv to pass multiple packets.
DPDK can receive multiple packets but the current netdev API does
not allow that.  The following patch allows dpif-netdev to receive a
batch of packets in a rx_recv() call for any netdev port.  This will be
used by dpdk-netdev.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-21 11:48:28 -07:00
Simon Horman
b73c85181d netdev-linux: Read packet auxdata to obtain vlan_tid
If VLAN acceleration is used when the kernel receives a packet
then the outer-most VLAN tag will not be present in the packet
when it is received by netdev-linux. Rather, it will be present
in auxdata.

This patch uses recvmsg() instead of recv() to read auxdata for
each packet and if the vlan_tid is set then it is added to the packet.

Adding the vlan_tid makes use of headroom available
in the buffer parameter of rx_recv.

Signed-off-by: Simon Horman <horms@verge.net.au>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-16 15:09:14 -08:00
Simon Horman
bfd3367b9e netdev_class: Pass a struct ofpbuf * to rx_recv()
Update the netdev_class so that struct ofpbuf * is passed to rx_recv()
to provide both the data and size of the data to read a packet into.

On success, update struct ofpbuf size inside netdev_class rx_recv
implementation and return 0. This moves logic from the caller.
On error a positive error code is returned, whereas previously
a negative error code was returned. This is a more common convention.

This patch should not have any behavioural changes.

This patch is in preparation for the netdev-linux variant of rx_recv()
making use of headroom in the struct ofpbuf * parameter to push a VLAN tag
obtained from auxdata.

Signed-off-by: Simon Horman <horms@verge.net.au>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-16 14:37:43 -08:00
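The calling convention described in bfd3367b9e (the callee fills in the buffer size and returns 0, or returns a positive errno value) might look like the sketch below. `struct buf` and `rx_recv_sketch()` are hypothetical stand-ins, not the OVS `struct ofpbuf` or any real provider function, and the "wire" argument simply simulates a packet waiting to be read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer: the caller supplies storage and capacity, the
 * receive function fills in 'size'. */
struct buf {
    unsigned char *data;
    size_t size;                /* Bytes of valid data. */
    size_t allocated;           /* Capacity of 'data'. */
};

/* Copies one pending packet into 'b'.
 *
 * Returns 0 and sets b->size on success.  Returns a positive errno
 * value on failure and leaves 'b' untouched: EAGAIN if nothing is
 * pending, EMSGSIZE if the packet does not fit. */
static int
rx_recv_sketch(const unsigned char *wire, size_t wire_len, struct buf *b)
{
    assert(b != NULL);
    if (wire_len == 0) {
        return EAGAIN;
    }
    if (wire_len > b->allocated) {
        return EMSGSIZE;
    }
    memcpy(b->data, wire, wire_len);
    b->size = wire_len;
    return 0;
}
```

Because the size update happens inside the callee, a caller only needs `if (!error) { use b->size; }` and no longer has to interpret a signed return value as either a length or an error.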
Simon Horman
fe68c00587 netdev: Update rx_recv documentation for NULL case
Replace truncated description of when rx_recv() may be NULL
with text used for other fields of netdev_class.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-07 14:53:25 -08:00
Joe Stringer
da4a619179 netdev: Globally track port status changes
Previously, we tracked status changes for ofports on a per-device basis.
Each time in the main thread's loop, we would inspect every ofport
to determine whether the status had changed for corresponding devices.

This patch replaces the per-netdev change_seq with a global 'struct seq'
which tracks status change for all ports. In the average case where
ports are not constantly going up or down, this allows us to check the
sequence once per main loop and not poll any ports. In the worst case,
execution is expected to be similar to how it is currently.

In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces average CPU usage of the main thread from about 40% to
about 35%.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-12-12 15:03:19 -08:00
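The idea in da4a619179 of replacing per-device polling with one global sequence check can be sketched as below. This is an assumption-laden illustration: a plain counter stands in for OVS's 'struct seq', and `struct port`, `port_set_up()` and `poller_run()` are hypothetical names. It is also single-threaded, unlike the real main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One counter for all ports; bumped on any status change. */
static uint64_t global_change_seq = 1;

struct port {
    bool up;
};

/* Changes a port's status and bumps the global counter only if the
 * status really changed. */
static void
port_set_up(struct port *p, bool up)
{
    if (p->up != up) {
        p->up = up;
        global_change_seq++;
    }
}

struct poller {
    uint64_t last_seen;         /* Counter value at the last full scan. */
    unsigned int scans;         /* Number of full scans performed. */
};

/* One main-loop iteration.  If nothing changed since the last scan,
 * returns false after a single comparison.  Otherwise rescans every
 * port, stores the number of ports that are up in '*n_up', and returns
 * true. */
static bool
poller_run(struct poller *pl, const struct port *ports, size_t n,
           size_t *n_up)
{
    size_t i;

    assert(n_up != NULL);
    if (pl->last_seen == global_change_seq) {
        return false;
    }
    pl->last_seen = global_change_seq;
    pl->scans++;
    *n_up = 0;
    for (i = 0; i < n; i++) {
        *n_up += ports[i].up;
    }
    return true;
}
```

In the quiet case the cost per iteration is one comparison regardless of the number of ports; when something does change, the cost is the same full scan as before.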
Ben Pfaff
89454bf477 netdev: Fix deadlock when netdev_dump_queues() callback calls into netdev.
We have a call chain like this:

    iface_configure_qos() calls
        netdev_dump_queues(), which calls
            netdev_linux_dump_queues(), which calls back through 'cb' to
                qos_unixctl_show_cb(), which calls
                    netdev_delete_queue(), which calls
                        netdev_linux_delete_queue().

Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.

This commit fixes the problem by getting rid of the callback.

netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.

Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-27 21:50:40 -07:00
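One way to remove a callback that runs under a lock, as 89454bf477 describes, is to copy out what the caller needs while holding the lock and let the caller act afterwards. The sketch below illustrates that general pattern only; it is not the interface OVS adopted. All names are hypothetical, and a boolean flag with an assertion stands in for a non-recursive mutex, so a re-entrant lock attempt fails the assertion where a real mutex would deadlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_QUEUES 8

struct dev {
    bool locked;                /* Stand-in for a non-recursive mutex. */
    unsigned int queues[MAX_QUEUES];
    size_t n_queues;
};

static void
dev_lock(struct dev *d)
{
    assert(!d->locked);         /* A real mutex would deadlock here. */
    d->locked = true;
}

static void
dev_unlock(struct dev *d)
{
    assert(d->locked);
    d->locked = false;
}

/* Copies up to 'max' queue IDs into 'ids' under the lock and returns
 * the number copied.  No caller code runs while the lock is held. */
static size_t
dev_list_queues(struct dev *d, unsigned int *ids, size_t max)
{
    size_t n, i;

    dev_lock(d);
    n = d->n_queues < max ? d->n_queues : max;
    for (i = 0; i < n; i++) {
        ids[i] = d->queues[i];
    }
    dev_unlock(d);
    return n;
}

/* Deletes queue 'id', taking the same lock.  Returns true if found. */
static bool
dev_delete_queue(struct dev *d, unsigned int id)
{
    bool found = false;
    size_t i;

    dev_lock(d);
    for (i = 0; i < d->n_queues; i++) {
        if (d->queues[i] == id) {
            d->queues[i] = d->queues[--d->n_queues];
            found = true;
            break;
        }
    }
    dev_unlock(d);
    return found;
}

/* Caller side: list first, then delete with the lock released. */
static size_t
delete_odd_queues(struct dev *d)
{
    unsigned int ids[MAX_QUEUES];
    size_t n = dev_list_queues(d, ids, MAX_QUEUES);
    size_t n_deleted = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (ids[i] % 2 && dev_delete_queue(d, ids[i])) {
            n_deleted++;
        }
    }
    return n_deleted;
}
```

The trade-off is that the snapshot can be stale by the time the caller acts on it, which is why `dev_delete_queue()` reports whether the queue still existed.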
Ben Pfaff
863838160e netdev: Make netdev access thread-safe.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-09 21:34:02 -07:00
Ben Pfaff
9dc63482bb netdev: Adopt four-step alloc/construct/destruct/dealloc lifecycle.
This is the same lifecycle used in the ofproto provider interface.
Compared to the previous netdev provider interface, it has the
advantage that the netdev top layer can control when any given
netdev becomes visible to the outside world.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-09 21:21:38 -07:00
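The four-step lifecycle named in 9dc63482bb can be illustrated with the sketch below. It is a simplified model under stated assumptions, not the real netdev provider interface: `struct dev_class`, `struct dummy_dev`, the `visible` flag and the open/close helpers are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct dev;

/* Provider interface: four steps, each with one job. */
struct dev_class {
    struct dev *(*alloc)(void);         /* Get memory, nothing else. */
    int (*construct)(struct dev *);     /* Initialize; 0 or errno. */
    void (*destruct)(struct dev *);     /* Undo construct. */
    void (*dealloc)(struct dev *);      /* Release memory. */
};

/* Part owned by the top layer. */
struct dev {
    const struct dev_class *class;
    int visible;                /* Nonzero once published. */
};

/* A provider embeds the common part as its first member. */
struct dummy_dev {
    struct dev up;
    int fd;
};

static struct dev *
dummy_alloc(void)
{
    struct dummy_dev *d = calloc(1, sizeof *d);
    return d ? &d->up : NULL;
}

static int
dummy_construct(struct dev *dev)
{
    ((struct dummy_dev *) dev)->fd = 3;     /* Pretend to open something. */
    return 0;
}

static void
dummy_destruct(struct dev *dev)
{
    ((struct dummy_dev *) dev)->fd = -1;
}

static void
dummy_dealloc(struct dev *dev)
{
    free(dev);                  /* 'up' is the first member. */
}

static const struct dev_class dummy_class = {
    dummy_alloc, dummy_construct, dummy_destruct, dummy_dealloc
};

/* Top layer: decides when the device becomes visible. */
static struct dev *
dev_open(const struct dev_class *class)
{
    struct dev *dev = class->alloc();

    if (!dev) {
        return NULL;
    }
    dev->class = class;         /* Common fields are set before construct. */
    if (class->construct(dev)) {
        class->dealloc(dev);    /* Never published, so no destruct. */
        return NULL;
    }
    dev->visible = 1;           /* Published only after full construction. */
    return dev;
}

static void
dev_close(struct dev *dev)
{
    assert(dev != NULL);
    dev->visible = 0;           /* Unpublished before teardown starts. */
    dev->class->destruct(dev);
    dev->class->dealloc(dev);
}
```

Splitting allocation from construction is what lets the top layer fill in shared fields first and publish the object last, which a single provider-side "create" call cannot guarantee.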
Ben Pfaff
89608a13e3 netdev-provider: Remove unused function netdev_assert_class().
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:52:41 -07:00
Andy Hill
ec9f40dce1 Fix misspellings in comments and docs.
Flagged with: https://github.com/lyda/misspell-check
Run with: git ls-files | misspellings -f -

Signed-off-by: Andy Hill <hillad@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-04 21:53:33 -07:00
YAMAMOTO Takashi
666afb55e8 add minimal NetBSD support
mostly ride on the existing FreeBSD support.

Signed-off-by: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-22 20:39:14 -07:00
Ben Pfaff
b5d57fc879 netdev: Get rid of netdev_dev.
The distinction between struct netdev_dev and struct netdev has always
been confusing.  Now that previous commits have eliminated all interesting
state from struct netdev, this commit deletes it and renames struct
netdev_dev to take its place.  Now the situation makes much more sense and
I won't have to continue making embarrassed explanations in the future.

Good riddance.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-10 14:46:15 -07:00
Ben Pfaff
796223f5bc netdev: Add new "struct netdev_rx" for capturing packets from a netdev.
Separating packet capture from "struct netdev" means that there is no
remaining per-"struct netdev" state, which will allow us to get rid of
"struct netdev_dev" (by renaming it "struct netdev").

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-05-10 14:39:36 -07:00
Ben Pfaff
4b60911067 netdev: Factor restoring flags into new "struct netdev_saved_flags".
This gets rid of the only per-instance data in "struct netdev", which
will make it possible to merge "struct netdev_dev" into "struct netdev" in
a later commit.

Ed Maste wrote the netdev-bsd changes in this commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Co-authored-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ed Maste <emaste@freebsd.org>
Tested-by: Ed Maste <emaste@freebsd.org>
2013-05-10 11:24:07 -07:00
Ethan Jackson
c060c4cf83 netdev-vport: Build on all platforms.
This patch removes the final bit of linux specific code which
prevents building netdev-vport everywhere.  With this, other
platforms automatically get access to patch ports, and (if their
datapath supports it), flow based tunneling.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-28 19:09:58 -08:00
Ethan Jackson
0a740f4829 ofproto-dpif: Implement patch ports in userspace.
This commit moves responsibility for implementing patch ports from
the datapath to ofproto-dpif.  There are two main reasons to do
this.

The first is a matter of design:  ofproto-dpif both has more
information than the datapath, and is better suited to handle the
complexity required to implement patch ports.

The second is performance.  My setup is a virtual machine with two
basic learning bridges connected by patch ports.  I used
ovs-benchmark to ping the virtual router IP residing outside the
VM.  Over a 60 second run, "ovs-benchmark rate" improves from
14618.1 to 19311.9 transactions per second, or a 32% improvement.
Similarly, "ovs-benchmark latency" improves from 6ms to 4ms.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-24 12:34:07 -08:00
Ben Pfaff
cb22974d77 Replace most uses of assert by ovs_assert.
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-01-16 16:03:37 -08:00
Ethan Jackson
f431bf7d78 netdev: Parse and make available tunnel configuration.
Future patches will need to know the details of a netdev's tunnel
configuration from outside the netdev library.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-15 16:21:09 -08:00
Ethan Jackson
275707c33f netdev: Rename get_drv_info() to get_status().
get_status() is a much more intuitive name since "status" is what
the database column is called.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-01-03 16:55:43 -08:00
Giuseppe Lettieri
f6eb6b2025 netdev implementation for FreeBSD
This patch adds new netdev classes that implement
"system" and "tap" devices on FreeBSD using the
libpcap library. This enables the use of the
"netdev" datapath_type of Open vSwitch on FreeBSD.

Signed-off-by: Gaetano Catalli <gaetano.catalli@gmail.com>
Signed-off-by: Ed Maste <emaste@adaranet.com>
Signed-off-by: Giuseppe Lettieri <g.lettieri@iet.unipi.it>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-07-26 16:21:48 -07:00
Ethan Jackson
79f1cbe9f8 lib: New data structure - smap.
A smap is a string to string hash map.  It has a cleaner interface
than shash's which were traditionally used for the same purpose.
This patch implements the data structure, and changes netdev and
its providers to use it.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-06-14 15:09:31 -07:00
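For readers unfamiliar with the kind of structure 79f1cbe9f8 introduces, a string-to-string hash map can be sketched as below. This is a generic illustration with hypothetical names (`struct str_map`, `str_map_set()`, and so on); it does not reproduce the smap API or its implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define N_BUCKETS 16

struct str_node {
    struct str_node *next;
    char *key;
    char *value;
};

/* Zero-initialize before use, e.g. 'struct str_map m = { { NULL } };'. */
struct str_map {
    struct str_node *buckets[N_BUCKETS];
};

/* Returns a heap copy of 's'. */
static char *
copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    assert(p != NULL);
    memcpy(p, s, n);
    return p;
}

/* Simple multiplicative string hash, reduced to a bucket index. */
static size_t
hash_string(const char *s)
{
    size_t h = 5381;

    while (*s) {
        h = h * 33 + (unsigned char) *s++;
    }
    return h % N_BUCKETS;
}

/* Returns the value for 'key', or NULL if 'key' is absent. */
static const char *
str_map_get(const struct str_map *m, const char *key)
{
    const struct str_node *n;

    for (n = m->buckets[hash_string(key)]; n; n = n->next) {
        if (!strcmp(n->key, key)) {
            return n->value;
        }
    }
    return NULL;
}

/* Sets 'key' to a copy of 'value', replacing any existing value. */
static void
str_map_set(struct str_map *m, const char *key, const char *value)
{
    size_t b = hash_string(key);
    struct str_node *n;

    for (n = m->buckets[b]; n; n = n->next) {
        if (!strcmp(n->key, key)) {
            char *copy = copy_string(value);
            free(n->value);
            n->value = copy;
            return;
        }
    }
    n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = copy_string(key);
    n->value = copy_string(value);
    n->next = m->buckets[b];
    m->buckets[b] = n;
}

/* Frees every node and leaves the map empty. */
static void
str_map_destroy(struct str_map *m)
{
    size_t b;

    for (b = 0; b < N_BUCKETS; b++) {
        while (m->buckets[b]) {
            struct str_node *n = m->buckets[b];
            m->buckets[b] = n->next;
            free(n->key);
            free(n->value);
            free(n);
        }
    }
}
```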
Raju Subramanian
e0edde6fee Global replace of Nicira Networks.
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.

Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-02 17:08:02 -07:00
Ben Pfaff
33f1ff8464 netdev: Document use for get_etheraddr member of struct netdev_class.
This has confused developers adding hardware support, e.g.:
http://openvswitch.org/pipermail/dev/2012-April/016350.html

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-01 14:29:35 -07:00
Pravin B Shelar
2c2ea5a88a netdev: Rename netdev->get_status() to netdev->get_drv_info().
get_status actually returns driver information, so get_drv_info()
is a better name.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2012-03-22 10:58:32 -07:00
Ben Pfaff
f486e8405a netdev-linux: Fix use-after-free when netdev_dump_queues() deletes queues.
iface_configure_qos() passes a callback to netdev_dump_queues() that can
delete queues.  The netdev-linux implementation of this function was
unprepared for the callback to delete queues, so this could cause a
use-after-free.  This fixes the problem in netdev_linux_dump_queues() and
documents that netdev_dump_queues() implementations must support deletions
in the callback.

Found by valgrind:

==1593== Invalid read of size 8
==1593==    at 0x4A8C43: netdev_linux_dump_queues (hmap.h:326)
==1593==    by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593==    by 0x431384: bridge_run (bridge.c:1892)
==1593==    by 0x432749: main (ovs-vswitchd.c:96)
==1593==  Address 0x632e078 is 8 bytes inside a block of size 32 free'd
==1593==    at 0x4C240FD: free (vg_replace_malloc.c:366)
==1593==    by 0x4A4D74: hfsc_class_delete (netdev-linux.c:3250)
==1593==    by 0x42AA59: iface_delete_queues (bridge.c:3055)
==1593==    by 0x4A8C8C: netdev_linux_dump_queues (netdev-linux.c:1881)
==1593==    by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593==    by 0x431384: bridge_run (bridge.c:1892)

Bug #10164.
Reported-by: Ram Jothikumar <ram@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-03-19 13:48:24 -07:00
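The class of bug fixed in f486e8405a, where a callback frees the element the iterator is standing on, is commonly avoided by saving the successor before invoking the callback. The sketch below shows that pattern on a singly linked list; the names are hypothetical and the real code iterates an hmap rather than a list. It is only safe when the callback deletes at most the element it was handed.

```c
#include <assert.h>
#include <stdlib.h>

struct queue {
    struct queue *next;
    unsigned int id;
};

typedef void queue_cb(struct queue **head, struct queue *q, void *aux);

/* Prepends a new queue with 'id' and returns the new list head. */
static struct queue *
queue_push(struct queue *head, unsigned int id)
{
    struct queue *q = malloc(sizeof *q);

    assert(q != NULL);
    q->id = id;
    q->next = head;
    return q;
}

/* Unlinks and frees 'victim' if it is on the list. */
static void
queue_delete(struct queue **head, struct queue *victim)
{
    struct queue **pp;

    for (pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            free(victim);
            return;
        }
    }
}

/* Calls 'cb' for every queue.  'next' is read before the callback
 * runs, so the callback may delete the queue it is given. */
static void
queue_dump(struct queue **head, queue_cb *cb, void *aux)
{
    struct queue *q, *next;

    for (q = *head; q; q = next) {
        next = q->next;
        cb(head, q, aux);
    }
}

/* Example callback: deletes queues whose ID exceeds '*aux'. */
static void
delete_if_above(struct queue **head, struct queue *q, void *aux)
{
    unsigned int limit = *(unsigned int *) aux;

    if (q->id > limit) {
        queue_delete(head, q);
    }
}
```

Writing the loop as `for (q = *head; q; q = q->next)` instead would read `q->next` from freed memory after a deletion, which is the kind of invalid read valgrind reports.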
Ben Pfaff
6c0386119d netdev: Abstract "features" interface away from OpenFlow 1.0.
netdev_get_features() and other functions have always used OpenFlow 1.0
"enum ofp_port_features" bits as part of their interface.  This commit
switches over to using an internally defined interface that is not tied
directly to any OpenFlow version, making evolution of each side of the
interface easier in the future.

Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-03-07 14:05:09 -08:00
Ben Pfaff
02e83e83b4 netdev-linux: Report error for truncated packets on receive.
Found by inspection.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2011-12-19 14:53:28 -08:00
Ben Pfaff
52e2fbfbab netdev: Remove netdev_get_vlan_vid().
It has no remaining users.
2011-11-23 13:19:53 -08:00
Ethan Jackson
65c3058c22 vswitchd: New column "link_resets".
An interface's 'link_resets' column represents the number of times
Open vSwitch has observed its link_state change.
2011-10-17 15:03:03 -07:00