2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-30 13:58:14 +00:00
Commit Graph

118 Commits

Author SHA1 Message Date
Paul Blakey
53611f7b05 other-config: Add hw-offload switch to control netdev flow offloading
Add a new configuration option - hw-offload that enables netdev
flow api. Enabling this option will allow offloading flows
using netdev implementation instead of the kernel datapath.
This configuration option defaults to false - disabled.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-14 10:13:25 +02:00
Paul Blakey
18ebd48cfb netdev: Adding a new netdev API to be used for offloading flows
Add a new API interface for offloading dpif flows to netdev.
The API consist on the following:
  flow_put - offload a new flow
  flow_get - query an offloaded flow
  flow_del - delete an offloaded flow
  flow_flush - flush all offloaded flows
  flow_dump_* - dump all offloaded flows

In upcoming commits we will introduce an implementation of this
API for netdev-linux.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-14 10:12:30 +02:00
Jan Scheurich
beb75a40fd userspace: Switching of L3 packets in L2 pipeline
Ports have a new layer3 attribute if they send/receive L3 packets.

The packet_type included in structs dp_packet and flow is considered in
ofproto-dpif. The classical L2 match fields (dl_src, dl_dst, dl_type, and
vlan_tci, vlan_vid, vlan_pcp) now have Ethernet as pre-requisite.

A dummy ethernet header is pushed to L3 packets received from L3 ports
before the the pipeline processing starts. The ethernet header is popped
before sending a packet to a L3 port.

For datapath ports that can receive L2 or L3 packets, the packet_type
becomes part of the flow key for datapath flows and is handled
appropriately in dpif-netdev.

In the 'else' branch in flow_put_on_pmd() function, the additional check
flow_equal(&match.flow, &netdev_flow->flow) was removed, as a) the dpcls
lookup is sufficient to uniquely identify a flow and b) it caused false
negatives because the flow in netdev->flow may not properly masked.

In dpif_netdev_flow_put() we now use the same method for constructing the
netdev_flow_key as the one used when adding the flow to the dplcs to make sure
these always match. The function netdev_flow_key_from_flow() used so far was
not only inefficient but sometimes caused mismatches and subsequent flow
update failures.

The kernel datapath does not support the packet_type match field.
Instead it encodes the packet type implictly by the presence or absence of
the Ethernet attribute in the flow key and mask.
This patch filters the PACKET_TYPE attribute out of netlink flow key and
mask to be sent to the kernel datapath.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-06-02 10:15:20 -07:00
Pravin B Shelar
bf4bbd0d12 tunnel: Add support to configure ptk_mark
Today packet mark action is broken for Tunnel ports with
tunnel monitoring. User can write a flow to set pkt-mark for
any tunnel traffic, but there is no way to set the packet
mark for corresponding BFD traffic.

Following patch introduces new option in OVSDB tunnel
configuration so that user can set skb-mark for given
tunnel endpoint. OVS would set the mark according to the
skb-mark option for all tunnel traffic including packets
generated by vSwitchd like tunnel monitoring BFD packet.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
2017-01-28 12:16:34 -08:00
Daniele Di Proietto
57eebbb4c3 dpif-netdev: Don't try to output on a device without txqs.
Tunnel devices have 0 txqs and don't support netdev_send().  While
netdev_send() simply returns EOPNOTSUPP, the XPS logic is still executed
on output, and that might be confused by devices with no txqs.

It seems better to have different structures in the fast path for ports
that support netdev_{push,pop}_header (tunnel devices), and ports that
support netdev_send.  With this we can also remove a branch in
netdev_send().

This is also necessary for a future commit, which starts DPDK devices
without txqs.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2017-01-15 19:25:11 -08:00
Stephen Finucane
7c9afefd0a doc: Populate 'topics' section
There are many docs that don't need to kept at the top level, along
with many more hidden in random folders. Move them all.

This also allows us to add the '-W' flag to Sphinx, ensuring unindexed
docs result in build failures.

Signed-off-by: Stephen Finucane <stephen@that.guru>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-12-12 08:57:06 -08:00
Pravin B Shelar
2b02d770c4 openvswitch: Allow external IPsec tunnel management.
OVS GRE IPsec tunnel support has multiple issues, Therefore
it was deprecated in OVS 2.6.

Following patch removes support for GRE IPsec and allows external
IPsec tunnel management for any type of tunnel not just GRE.
e.g. user can encrypt Geneve or VxLan traffic.

It can be done by using openflow pipeline to set skb-mark
and using IPsec keying daemons to implement IPsec tunnels.
This packet can be matched for the skb-mark to encrypt
selective tunnel traffic.

VMware-BZ: 1710701
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ansis Atteka <aatteka@ovn.org>
2016-09-27 11:06:09 -07:00
Daniele Di Proietto
3a414a0a4f ofproto: Honor mtu_request even for internal ports.
By default Open vSwitch tries to configure internal interfaces MTU to
match the bridge minimum, overriding any attempt by the user to
configure it through standard system tools, or the database.

While this works in many simple cases (there are probably many users
that rely on this) it may create problems for more advanced use cases
(like any overlay networks).

This commit allows the user to override the default behavior by
providing an explict MTU in the mtu_request column in the Interface
table.

This means that Open vSwitch will now treat differently database MTU
requests from standard system tools MTU requests (coming from `ip link`
or `ifconfig`), but this seems the best way to remain compatible with
old users while providing a more powerful interface.

Suggested-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Tested-by: Joe Stringer <joe@ovn.org>
2016-09-02 16:01:12 -07:00
Daniele Di Proietto
4124cb1254 netdev: Make netdev_set_mtu() netdev parameter non-const.
Every provider silently drops the const attribute when converting the
parameter to the appropriate subclass.  Might as well drop the const
attribute from the parameter, since this is a "set" function.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2016-08-12 19:32:12 -07:00
Ilya Maximets
324c837485 dpif-netdev: XPS (Transmit Packet Steering) implementation.
If CPU number in pmd-cpu-mask is not divisible by the number of queues and
in a few more complex situations there may be unfair distribution of TX
queue-ids between PMD threads.

For example, if we have 2 ports with 4 queues and 6 CPUs in pmd-cpu-mask
such distribution is possible:
<------------------------------------------------------------------------>
pmd thread numa_id 0 core_id 13:
        port: vhost-user1       queue-id: 1
        port: dpdk0     queue-id: 3
pmd thread numa_id 0 core_id 14:
        port: vhost-user1       queue-id: 2
pmd thread numa_id 0 core_id 16:
        port: dpdk0     queue-id: 0
pmd thread numa_id 0 core_id 17:
        port: dpdk0     queue-id: 1
pmd thread numa_id 0 core_id 12:
        port: vhost-user1       queue-id: 0
        port: dpdk0     queue-id: 2
pmd thread numa_id 0 core_id 15:
        port: vhost-user1       queue-id: 3
<------------------------------------------------------------------------>

As we can see above dpdk0 port polled by threads on cores:
	12, 13, 16 and 17.

By design of dpif-netdev, there is only one TX queue-id assigned to each
pmd thread. This queue-id's are sequential similar to core-id's. And
thread will send packets to queue with exact this queue-id regardless
of port.

In previous example:

	pmd thread on core 12 will send packets to tx queue 0
	pmd thread on core 13 will send packets to tx queue 1
	...
	pmd thread on core 17 will send packets to tx queue 5

So, for dpdk0 port after truncating in netdev-dpdk:

	core 12 --> TX queue-id 0 % 4 == 0
	core 13 --> TX queue-id 1 % 4 == 1
	core 16 --> TX queue-id 4 % 4 == 0
	core 17 --> TX queue-id 5 % 4 == 1

As a result only 2 of 4 queues used.

To fix this issue some kind of XPS implemented in following way:

	* TX queue-ids are allocated dynamically.
	* When PMD thread first time tries to send packets to new port
	  it allocates less used TX queue for this port.
	* PMD threads periodically performes revalidation of
	  allocated TX queue-ids. If queue wasn't used in last
	  XPS_TIMEOUT_MS milliseconds it will be freed while revalidation.
        * XPS is not working if we have enough TX queues.

Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-07-27 12:56:04 -07:00
Ben Pfaff
258e42fa2c netdev: Fix typo in comment.
The name of the macro was wrong.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
2016-06-03 13:08:31 -07:00
Pravin B Shelar
4975aa3ee6 netdev-native-tnl: Introduce ip_build_header()
The native tunneling build tunnel header code is spread across
two different modules, it makes pretty hard to follow the code.
Following patch refactors the code to move all code to
netdev-ative-tnl module.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-23 20:27:14 -07:00
Daniele Di Proietto
050c60bfb5 netdev-dpdk: Use ->reconfigure() call to change rx/tx queues.
This introduces in dpif-netdev and netdev-dpdk the first use for the
newly introduce reconfigure netdev call.

When a request to change the number of queues comes, netdev-dpdk will
remember this and notify the upper layer via
netdev_request_reconfigure().

The datapath, instead of periodically calling netdev_set_multiq(), can
detect this and call reconfigure().

This mechanism can also be used to:
* Automatically match the number of rxq with the one provided by qemu
  via the new_device callback.
* Provide a way to change the MTU of dpdk devices at runtime.
* Move a DPDK vhost device to the proper NUMA socket.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2016-05-23 10:27:42 -07:00
Daniele Di Proietto
790fb3b745 netdev: Add reconfigure request mechanism.
A netdev provider, especially a PMD provider (like netdev DPDK) might
not be able to change some of its parameters (such as MTU, or number of
queues) without stopping everything and restarting.

This commit introduces a mechanism that allows a netdev provider to
request a restart (netdev_request_reconfigure()).  The upper layer can
be notified via netdev_wait_reconf_required() and
netdev_is_reconf_required().  After closing all the rxqs the upper layer
can finally call netdev_reconfigure(), to make sure that the new
configuration is in place.

This will be used by next commit to reconfigure rx and tx queues in
netdev-dpdk.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
2016-05-23 10:27:42 -07:00
Pravin B Shelar
9235b4793e dpif-netdev: Fix memory leak in tunnel header pop action.
The tunnel header pop action can leak batch of packet
in case of error. Following patch fixex the error code path.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-18 19:39:18 -07:00
Pravin B Shelar
1895cc8dbb dpif-netdev: create batch object
DPDK datapath operate on batch of packets. To pass the batch of
packets around we use packets array and count.  Next patch needs
to associate meta-data with each batch of packets. So Introducing
a batch structure to make handling the metadata easier.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-18 19:39:18 -07:00
Pravin B Shelar
1c8f98d96a netdev: Return number of packet from netdev_pop_header()
Current tunnel-pop API does not allow the netdev implementation
retain a packet but STT can keep a packet from batch of packets
during TCP reassembly processing. To return exact count of
valid packet STT need to pass this number of packet parameter
as a reference.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
2016-05-18 19:39:18 -07:00
Ben Warren
b129cc9834 Break netdev.h into private and public parts
Public (struct definitions and some prototypes) go in
include/openvswitch

Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-04-14 13:43:22 -07:00
Pravin B Shelar
6b6e13293e netdev: remove netdev_get_in4()
Since netdev can have multiple IP address use
generic api netdev_get_addr_list().  This also make it
easier to handle IPv4 and IPv6 address across vswitchd
layers.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-03-24 09:30:57 -07:00
Pravin B Shelar
a8704b5027 tunneling: Handle multiple ip address for given device.
Device can have multiple IP address but netdev_get_in4/6()
returns only one configured IPv6 address. Following
patch fixes it.
OVS router is also updated to return source ip address for
given destination, This is required when interface has multiple
IP address configured.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-03-24 09:30:57 -07:00
Ilya Maximets
ce179f1163 dpif-netdev: Add dpif-netdev/pmd-rxq-show appctl command.
This command can be used to check the port/rxq assignment to
pmd threads. For each pmd thread of the datapath shows list
of queue-ids with port names.

Additionally log message from pmd_thread_main() extended with
queue-id, and type of this message changed from INFO to DBG.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-02-22 16:37:17 -08:00
Ilya Maximets
a14b8947fd dpif-netdev: Allow different numbers of rx queues for different ports.
Currently, all of the PMD netdevs can only have the same number of
rx queues, which is specified in other_config:n-dpdk-rxqs.

Fix that by introducing of new option for PMD interfaces: 'n_rxq', which
specifies the maximum number of rx queues to be created for this
interface.

Example:
	ovs-vsctl set Interface dpdk0 options:n_rxq=8

Old 'other_config:n-dpdk-rxqs' deleted.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-02-04 17:10:45 -08:00
Jiri Benc
3ae91c0190 tunneling: add IPv6 support to netdev_tunnel_config
Allow configuration of IPv6 tunnel endpoints.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Co-authored-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2015-11-30 10:13:13 -08:00
Jarno Rajahalme
74ff3298c8 userspace: Define and use struct eth_addr.
Define struct eth_addr and use it instead of a uint8_t array for all
ethernet addresses in OVS userspace.  The struct is always the right
size, and it can be assigned without an explicit memcpy, which makes
code more readable.

"struct eth_addr" is a good type name for this as many utility
functions are already named accordingly.

struct eth_addr can be accessed as bytes as well as ovs_be16's, which
makes the struct 16-bit aligned.  All use seems to be 16-bit aligned,
so some algorithms on the ethernet addresses can be made a bit more
efficient making use of this fact.

As the struct fits into a register (in 64-bit systems) we pass it by
value when possible.

This patch also changes the few uses of Linux specific ETH_ALEN to
OVS's own ETH_ADDR_LEN, and removes the OFP_ETH_ALEN, as it is no
longer needed.

This work stemmed from a desire to make all struct flow members
assignable for unrelated exploration purposes.  However, I think this
might be a nice code readability improvement by itself.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-08-28 14:55:11 -07:00
Ethan Jackson
cd159f1a82 dpdk: Ditch MAX_PKT_BURST macro.
The MAX_PKT_BURST and NETDEV_MAX_RX_BATCH macros had a confusing
relationship.  They basically purport to do the same thing, making it
unclear which is the source of truth.

Furthermore, while NETDEV_MAX_RX_BATCH was 256, MAX_PKT_BURST was 32,
meaning we never process a batch larger than 32 packets further adding
to the confusion.

This patch resolves the issue by removing MAX_PKT_BURST completely,
and shrinking the new NETDEV_MAX_BURST macro to only 32.  This should
have no change in the execution path except shrinking a couple of
structs and memory allocations (can't hurt).

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2015-05-19 14:47:00 -07:00
Ricky Li
c876a4bb9b netdev: Fix user space tunneling for set_tunnel action.
e.g. Set tunnel id for encapsulated VxLAN packet (out_key=flow):

ovs-vsctl add-port int-br vxlan0 -- set interface vxlan0 \
    type=vxlan options:remote_ip=172.168.1.2 options:out_key=flow

ovs-ofctl add-flow int-br in_port=LOCAL, icmp,\
    actions=set_tunnel:3, output:1 (1 is the port# of vxlan0)

Output tunnel ID should be modified to 3 with this patch.

Signed-off-by: Ricky Li <ricky.li@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-03-26 18:56:12 -07:00
Pravin B Shelar
e14deea0bd dpif_packet: Rename to dp_packet
dp_packet is short and better name for datapath packet
structure.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-03-03 13:37:34 -08:00
Thomas Graf
526df7d854 tunnel: Provide framework for tunnel extensions for VXLAN-GBP and others
Supports a new "exts" field in the tunnel configuration which takes a
comma separated list of enabled extensions.

The only extension supported so far is GBP but this can be used to
enable RCO and possibly others as soon as the OVS datapath supports
them.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-02-06 21:10:45 +01:00
Pravin B Shelar
a36de779d7 openvswitch: Userspace tunneling.
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.

Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-11-12 15:08:33 -08:00
Wang Sheng-Hui
3bd0fd39eb Use magic ETH_ADDR_LEN instead of 6 for Ethernet address length.
ETH_ADDR_LEN is defined in lib/packets.h, valued 6.
Use this macro instead of magic number 6 to represent the length
of eth mac address.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-22 08:46:52 -07:00
Alex Wang
5496878cbf netdev: Add function for configuring tx and rx queues.
This commit adds a new API to the 'struct netdev_class' which
allows user to configure the number of tx queues and rx queues
of 'netdev'.  Upcoming patches will use this function to set
multiple tx/rx queues when adding the netdev to dpif-netdev.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15 11:43:48 -07:00
Pravin B Shelar
2f9dd77fcd ofproto: Do not update stats on fake bond interface.
There are couple of reasons to remove this support:
*   This is used in very old OVS use-case. It is much better
    to read stats directly from OVS.
*   Forthcoming commit will remove support for setting stats
    for vport. The stats update depends on stats-set.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-09-15 10:08:56 -07:00
Alex Wang
f00fa8cbad netdev: Add n_txq to 'struct netdev'.
This commit adds new variable n_txq to 'struct netdev' for recording
the number of tx queues.  Correspondingly, the send_*() functions are
extended to accept queue id as input argument.

All 'netdev-*' implementation will ignore the queue id since having
multiple tx queues is not supported.  Upcomping patches will start
using it and create multiple tx queues for dpdk netdev.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-12 11:30:58 -07:00
Alex Wang
7dec44fe1c netdev: Add function for getting the numa node id of netdev.
This commit adds a new API to the 'struct netdev_class' which
allows user to query the numa node id the 'netdev' is on.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-12 11:30:58 -07:00
Daniele Di Proietto
f4fd623c4c netdev: netdev_send accepts multiple packets
The netdev_send function has been modified to accept multiple packets, to
allow netdev providers to amortize locking and queuing costs.
This is especially true for netdev-dpdk.

Later commits exploit the new API.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:13 -07:00
Daniele Di Proietto
910885540a dpif-netdev: use dpif_packet structure for packets
This commit introduces a new data structure used for receiving packets from
netdevs and passing them to dpifs.
The purpose of this change is to allow storing some private data for each
packet. The subsequent commits make use of it.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-23 14:41:12 -07:00
Thomas Graf
bbe6109de7 vswitchd: Add error column to Interface table to store error condition
Store the error condition of a failed port configuration in a new
column 'error' in the Interface table.

Example:
$ ovs-vsctl add-port br0 test -- \
     set Interface test type=vxlan options:unknown=1
ovs-vsctl: Error detected while setting up 'test'.  [...]

$ ovs-vsctl list Interface test | grep error
error         : "test: could not set configuration (Invalid argument)"

Fixing the error will clear the error column:
$ ovs-vsctl set Interface test options:remote_ip=1.1.1.1
$ ovs-vsctl list Interface test | grep error
error         : []
$

For now, the high level error messages when opening and configuring
the netdev are used. Further patches can extend passing the error
pointer into the individual netdev implementations to allow for more
fine grained error messages to be stored.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-05-19 15:42:30 -07:00
Ryan Wilson
fe83f81df9 netdev: Remove netdev from global shash when the user is changing interface configuration.
When the user changes port type (i.e. changing p0 from type 'internal' to
'gre'), the netdev must first be deleted, then re-created with the new type.
Deleting the netdev requires there exist no more references to the netdev.
However, the xlate cache holds references to netdevs and the cache is only
invalidated by revalidator threads. Thus, if cache is not invalidated prior to
the netdev being re-created, the netdev will not be able to be re-created and
the configuration change will fail.

This patch always removes the netdev from the global netdev shash when the
user changes port type. This ensures that the new netdev can always be created
while handler and revalidator threads can retain references to the old netdev
until they are finished.

Signed-off-by: Ryan Wilson <wryan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-05-16 11:35:38 -07:00
Alex Wang
3e912ffcbb netdev: Add 'change_seq' back to netdev.
This commit can be seen as a partial revert of commit
da4a619179 (netdev: Globally track port status changes)
by adding the 'change_seq' to 'struct netdev'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-10 12:55:28 -07:00
Pravin
55c955bd8a netdev: Add support multiqueue recv.
new netdev type like DPDK can support multi-queue IO. Following
patch Adds support for same.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
f77917408a netdev: Rename netdev_rx to netdev_rxq
Preparation for multi queue netdev IO.  There are no functional changes
in this patch.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
e4cfed38b1 dpif-netdev: Add poll-mode-device thread.
This patch adds PMD type netdev for netdevice with poll-mode
drivers.  Since there is no way to get signal on a packet recv
from these devices we need to poll them in busy loop.  So minimize
system call overhead this patch uses dpif-thread exclusively
for PMD devices and rest of devices which needs system calls to
do IO are moved to dpif-netdev-run().
PMD device like DPDK work in userspace so there is no system call
overhead for them.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
40d26f04b2 netdev: Send ofpbuf directly to netdev.
DPDK netdev need to access ofpbuf while sending buffer. Following
patch changes netdev_send accordingly.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 11:48:28 -07:00
Pravin
df1e5a3bc7 netdev: Extend rx_recv to pass multiple packets.
DPDK can receive multiple packets but current netdev API does
not allow that.  Following patch allows dpif-netdev receive batch
of packet in a rx_recv() call for any netdev port.  This will be
used by dpdk-netdev.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-21 11:48:28 -07:00
Joe Stringer
da4a619179 netdev: Globally track port status changes
Previously, we tracked status changes for ofports on a per-device basis.
Each time in the main thread's loop, we would inspect every ofport
to determine whether the status had changed for corresponding devices.

This patch replaces the per-netdev change_seq with a global 'struct seq'
which tracks status change for all ports. In the average case where
ports are not constantly going up or down, this allows us to check the
sequence once per main loop and not poll any ports. In the worst case,
execution is expected to be similar to how it is currently.

In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces average CPU usage of the main thread from about 40% to
about 35%.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-12-12 15:03:19 -08:00
Ben Pfaff
89454bf477 netdev: Fix deadlock when netdev_dump_queues() callback calls into netdev.
We have a call chain like this:

    iface_configure_qos() calls
        netdev_dump_queues(), which calls
            netdev_linux_dump_queues(), which calls back through 'cb' to
                qos_unixctl_show_cb(), which calls
                    netdev_delete_queue(), which calls
                        netdev_linux_delete_queue().

Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.

This commit fixes the problem by getting rid of the callback.

netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.

Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-27 21:50:40 -07:00
Ben Pfaff
863838160e netdev: Make netdev access thread-safe.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-09 21:34:02 -07:00
Ben Pfaff
6dc34a0dab Implement OpenFlow 1.3 queue stats duration feature.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-26 10:20:04 -07:00
Ben Pfaff
0bb0393a0b netdev: New function netdev_ref().
I suspect that this makes it easier to make sure that a netdev stays open
as long as needed in some cases where a module needs access to a netdev
opened by some higher-level module.

CC: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-22 11:10:30 -07:00
Alex Wang
94a538422d netdev: Prevent using reserved names
This commit adds a function to lib/netdev.c to check that the interface name
is not the same as any of the registered vport providers' dpif_port name
(e.g. gre_system) or the datapath's internal port name (e.g. ovs-system).

Bug #15077.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-05-16 14:03:17 -07:00