2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-30 05:47:55 +00:00

169 Commits

Author SHA1 Message Date
Jesse Gross
f41256d709 tunnel: Validate IP header for userspace tunneling.
Currently, when doing userspace tunneling we don't perform much in
the way of integrity checks on the incoming IP header. The case of
tunneling is different from the usual case of switching since we are
acting as the endpoint here and should not allow invalid packets to
pass.

This adds checks for IP checksum, version, total length, and options and
drops packets that don't pass.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-09-13 08:13:11 -07:00
Jarno Rajahalme
74ff3298c8 userspace: Define and use struct eth_addr.
Define struct eth_addr and use it instead of a uint8_t array for all
ethernet addresses in OVS userspace.  The struct is always the right
size, and it can be assigned without an explicit memcpy, which makes
code more readable.

"struct eth_addr" is a good type name for this as many utility
functions are already named accordingly.

struct eth_addr can be accessed as bytes as well as ovs_be16's, which
makes the struct 16-bit aligned.  All use seems to be 16-bit aligned,
so some algorithms on the ethernet addresses can be made a bit more
efficient making use of this fact.

As the struct fits into a register (in 64-bit systems) we pass it by
value when possible.

This patch also changes the few uses of Linux specific ETH_ALEN to
OVS's own ETH_ADDR_LEN, and removes the OFP_ETH_ALEN, as it is no
longer needed.

This work stemmed from a desire to make all struct flow members
assignable for unrelated exploration purposes.  However, I think this
might be a nice code readability improvement by itself.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-08-28 14:55:11 -07:00
Pravin B Shelar
99e7b07740 tunneling: Remove gre64 tunnel support.
GRE64 was introduced to extend gre key from 32-bit to 64-bit using
gre-key and sequence number field. But GRE64 is not standard
protocol. There are not many users of this protocol. Therefore we
have decided to remove it.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-08-20 13:01:58 -07:00
Jesse Gross
6728d578f6 dpif-netdev: Translate Geneve options per-flow, not per-packet.
The kernel implementation of Geneve options stores the TLV option
data in the flow exactly as received, without any further parsing.
This is then translated to known options for the purposes of matching
on flow setup (which will then install a datapath flow in the form
the kernel is expecting).

The userspace implementation behaves a little bit differently - it
looks up known options as each packet is received. The reason for this
is there is a much tighter coupling between datapath and flow translation
and the representation is generally expected to be the same. This works
but it incurs work on a per-packet basis that could be done per-flow
instead.

This introduces a small translation step for Geneve packets between
datapath and flow lookup for the userspace datapath in order to
allow the same kind of processing that the kernel does. A side effect
of this is that unknown options are now shown when flows dumped via
ovs-appctl dpif/dump-flows, similar to the kernel.

There is a second benefit to this as well: for some operations it is
preferable to keep the options exactly as they were received on the wire,
which this enables. One example is that for packets that are executed from
ofproto-dpif-upcall to the datapath, this avoids the translation of
Geneve metadata. Since this conversion is potentially lossy (for unknown
options), keeping everything in the same format removes the possibility
of dropping options if the packet comes back up to userspace and the
Geneve option translation table has changed. To help with these types of
operations, most functions can understand both formats of data and seamlessly
do the right thing.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-08-05 20:26:48 -07:00
Jesse Gross
35303d715b tunnels: Don't initialize unnecessary packet metadata.
The addition of Geneve options to packet metadata significantly
expanded its size. It was reported that this can decrease performance
for DPDK ports by up to 25% since we need to initialize the whole
structure on each packet receive.

It is not really necessary to zero out the entire structure because
miniflow_extract() only copies the tunnel metadata when particular
fields indicate that it is valid. Therefore, as long as we zero out
these fields when the metadata is initialized and ensure that the
rest of the structure is correctly set in the presence of a tunnel,
we can avoid touching the tunnel fields on packet reception.

Reported-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-07-01 15:24:04 -07:00
Jesse Gross
5bb08b0ef6 tunneling: Userspace datapath support for Geneve options.
Currently the userspace datapath only supports Geneve in a
basic mode - without options - since the rest of userspace
previously didn't support options either. This enables the
userspace datapath to send and receive options as well.

The receive path for extracting the tunnel options isn't entirely
optimal because it does a lookup on the options on a per-packet
basis, rather than per-flow like the kernel does. This is not
as straightforward to do in the userspace datapath since there
is no translation step between packet formats used in packet vs.
flow lookup. This can be optimized in the future and in the
meantime option support is still useful for testing and simulation.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-06-26 14:18:07 -07:00
Pravin B Shelar
0890056e59 dpctl: cleaner dpctl output for tunnel ports.
Currently dont-fragment and TTL are initialized to zero, but
those are not default config for tunnel ports.  dpctl
does not show default config of a port.  So by setting these
values to default we can get cleaner `dpctl show` output.

% ovs-dpctl show
system@ovs-system:
	port 0: ovs-system (internal)
	port 1: br0 (internal)
	port 4: gre_sys (gre: df_default=false, ttl=0)

% ovs-dpctl show # After initializing default values.
system@ovs-system:
	port 0: ovs-system (internal)
	port 1: br0 (internal)
	port 4: gre_sys (gre)

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-05-01 14:26:14 -07:00
Pravin B Shelar
4237026e52 datapath: Add Stateless TCP Tunneling protocol.
The Stateless TCP Tunnel (STT) protocol encapsulates traffic in
IPv4/TCP packets.
STT uses TCP segmentation offload available in most of NIC. On
packet xmit STT driver appends STT header along with TCP header
to the packet. For GSO packet GSO parameters are set according
to tunnel configuration and packet is handed over to networking
stack. This allows use of segmentation offload available in NICs

The protocol is documented at
http://www.ietf.org/archive/id/draft-davie-stt-06.txt

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2015-04-29 10:33:18 -07:00
Daniele Di Proietto
2bc1bbd27d dp-packet: Rename 'dp_hash' in 'rss_hash'.
We already have the 'dp_hash' embedded in the metadata.  This caused
confusion in the code.  With this commit it should be clear that
'rss_hash' is the packet hash used for internal purposes, while
'md.dp_hash' is part of the flow, computed during the execution of
certain actions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2015-04-20 12:49:41 -07:00
Jesse Gross
d625fbd13e tunneling: Convert tunnel push/pop functions to act on single packets.
The userspace tunneling API for pushing and popping tunnel headers
is currently based on processing batches of packets. However, there
is no obvious way to take advantage of batching for these operations
and so each tunnel operation has a pair of loops to process the
batch. This changes the API to operate on single packets to enable
better code reuse.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-09 14:29:08 -07:00
Jesse Gross
8e45fe7c9e tunneling: Add UDP checksum support for userspace tunnels.
Kernel based OVS recently added the ability to support checksums
for UDP based tunnels (Geneve and VXLAN). This adds similar support
for the userspace datapath to bring feature parity.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-07 16:51:57 -07:00
Jesse Gross
e5a1caeed4 tunneling: Add userspace tunnel support for Geneve.
This adds basic userspace dataplane support for the Geneve
tunneling protocol. The rest of userspace only has the ability
to handle Geneve without options and this follows that pattern
for the time being. However, when the rest of userspace is updated
it should be easy to extend the dataplane as well.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-07 16:51:43 -07:00
Jesse Gross
e066f78fea tunneling: Factor out common UDP tunnel code.
Currently, the userspace VXLAN implementation contains the code
for generating and parsing both the UDP and VXLAN headers. This
pulls out the UDP portion for better layering and to make it
easier to support additional UDP based tunnels and features.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-07 16:47:35 -07:00
Jesse Gross
83fbb69b50 vxlan: Set FLOW_TNL_F_KEY for received packets.
The VNI is always present in the VXLAN header, so we should
set the FLOW_TNL_F_KEY flag to indicate this. However, the
userspace implementation of VXLAN currently does not.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2015-04-07 16:45:03 -07:00
Jesse Gross
61cf6b7024 tunneling: Use flow flag for GRE checksum calculation.
The indication to calculate the GRE checksum is currently the port
config rather than the tunnel flow. Currently there is a one to one
mapping between the two so there is no difference. However, the
kernel datapath must use the flow and it is also potentially more
flexible, so this switches how we decide whether to calculate the
checksum.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-07 16:26:56 -07:00
Jesse Gross
d804d31e24 tunneling: Fix location of GRE checksums.
The GRE checksum is a 16 bit field stored in a 32 bit option (the
rest is reserved). The current code treats the checksum as a 32-bit
field and places it in the right place for little endian systems but
not big endian. This fixes the problem by storing the 16 bit field
directly.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-07 16:26:44 -07:00
Jesse Gross
6432e527ce tunneling: Add check for GRE protocol is Ethernet.
On receive, the userspace GRE code doesn't check the protocol
field. Since OVS only understands Ethernet packets, this adds a
check that the inner protocol is Ethernet and discards other types
of packets.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-07 16:26:34 -07:00
Jesse Gross
6625245743 tunneling: Include IP TTL in flow metadata.
The IP TTL is currently omitted in the extracted tunnel information
that is stored in the flow for userspace tunneling. This includes it
so that the same logic used by the kernel also applies.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-04-07 16:26:29 -07:00
Alex Wang
a19083996c netdev-vport: Do not update netdev when there is no config change.
When there is any update from ovsdb, ovs will call netdev_set_config()
for every vport.  Even though the change is not related to vport, the
current implementation will always increment the per-netdev sequence
number.  Subsequently this could cause even more unwanted effects,
e.g. the recreation of 'struct tnl_port' in ofproto level.

This commit fixes the issue by only updating the netdev when there
is actual configuration change.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-03-27 09:21:22 -07:00
Ricky Li
c876a4bb9b netdev: Fix user space tunneling for set_tunnel action.
e.g. Set tunnel id for encapsulated VxLAN packet (out_key=flow):

ovs-vsctl add-port int-br vxlan0 -- set interface vxlan0 \
    type=vxlan options:remote_ip=172.168.1.2 options:out_key=flow

ovs-ofctl add-flow int-br in_port=LOCAL, icmp,\
    actions=set_tunnel:3, output:1 (1 is the port# of vxlan0)

Output tunnel ID should be modified to 3 with this patch.

Signed-off-by: Ricky Li <ricky.li@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2015-03-26 18:56:12 -07:00
Jesse Gross
4752cc0c26 tunnels: Enable UDP checksum computation for Geneve and VXLAN.
The kernel module can already support outer UDP checksums for
Geneve and VXLAN using the standard checksum flag in tunnel
metadata. This makes userspace aware of the capability so that
users can enable it on tunnel ports.

There is a complication in that there is no way for userspace to
probe or detect if the kernel does not support this capability
in order to warn the user. In this case, connectivity will appear
to function normally but packets will not be checksum protected.
This is mainly an issue for VXLAN which has existed in the kernel
for a some time without checksum support - while there are also
a few kernel versions that support Geneve only without checksums,
they are much less common.

There isn't a particularly good solution to the compatibility
issue without introducing a larger capabilities structure. However,
UDP checksums are likely to be used only rarely at this point in
time and the VXLAN spec (where the main problem lies) recommends
against them. Therefore, this is considered to be an advanced user
feature and we settle for just documenting the issue.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
2015-03-24 12:59:02 -07:00
Pravin B Shelar
cf62fa4c70 dp-packet: Remove ofpbuf dependency.
Currently dp-packet make use of ofpbuf for managing packet
buffers. That complicates ofpbuf, by making dp-packet
independent of ofpbuf both libraries can be optimized for
their own use case.
This avoids mapping operation between ofpbuf and dp_packet
in datapath upcalls.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-03-03 13:37:37 -08:00
Pravin B Shelar
e14deea0bd dpif_packet: Rename to dp_packet
dp_packet is short and better name for datapath packet
structure.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-03-03 13:37:34 -08:00
Thomas Graf
526df7d854 tunnel: Provide framework for tunnel extensions for VXLAN-GBP and others
Supports a new "exts" field in the tunnel configuration which takes a
comma separated list of enabled extensions.

The only extension supported so far is GBP but this can be used to
enable RCO and possibly others as soon as the OVS datapath supports
them.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-02-06 21:10:45 +01:00
Thomas Graf
e6211adce4 lib: Move vlog.h to <openvswitch/vlog.h>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-12-15 14:15:19 +01:00
Pravin B Shelar
b772066ffd route-table: Remove Unregister.
Since dpif registering for routing table at initialization
there is no need to unregister it. Following patch removes
support for turning routing table notifications on and off.
Due to this change OVS always listens for these
notifications.

Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
2014-12-01 14:43:32 -08:00
Pravin B Shelar
1bc50ef389 dpctl: Fix crash.
ovs-dpctl crashed due to uninitialized router classifier. To
fix this issue move ovs router initialization to route table
module.

Reported-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-11-21 15:51:40 -08:00
Pravin B Shelar
a36de779d7 openvswitch: Userspace tunneling.
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.

Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-11-12 15:08:33 -08:00
Pravin B Shelar
d9b4ebc5d1 route-table: Use classifier to store routing table.
Rather than using hmap for storing routing entries we can directly use
classifier which has support for priority and wildcard entries.
This makes route lookup lockless. This help when we use route lookup
for native tunneling.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-11-03 13:27:31 -08:00
Alex Wang
5496878cbf netdev: Add function for configuring tx and rx queues.
This commit adds a new API to the 'struct netdev_class' which
allows user to configure the number of tx queues and rx queues
of 'netdev'.  Upcoming patches will use this function to set
multiple tx/rx queues when adding the netdev to dpif-netdev.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15 11:43:48 -07:00
Pravin B Shelar
2f9dd77fcd ofproto: Do not update stats on fake bond interface.
There are couple of reasons to remove this support:
*   This is used in very old OVS use-case. It is much better
    to read stats directly from OVS.
*   Forthcoming commit will remove support for setting stats
    for vport. The stats update depends on stats-set.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-09-15 10:08:56 -07:00
Alex Wang
7dec44fe1c netdev: Add function for getting the numa node id of netdev.
This commit adds a new API to the 'struct netdev_class' which
allows user to query the numa node id the 'netdev' is on.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-12 11:30:58 -07:00
Ben Pfaff
b2f771efca netdev-vport: Fix use-after-free error in netdev_vport_route_changed().
We can't unlock the netdev's mutex after close the netdev, because closing
the netdev might destroy the mutex.

VMware-BZ: #1275187
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-06-24 14:37:57 -07:00
Jesse Gross
c1fc1411d2 datapath: Add support for Geneve tunneling.
This adds support for Geneve - Generic Network Virtualization
Encapsulation. The protocol is documented at
http://tools.ietf.org/html/draft-gross-geneve-00

The kernel implementation is completely agnostic to the options
that are in use and can handle newly defined options without
further work. It does this by simply matching on a byte array
of options and allowing userspace to setup flows on this array.

Userspace currently implements only support for basic version of
Geneve. It can work with the base header (including the VNI) and
is capable of parsing options but does not currently support any
particular option definitions. Over time, the intention is to
allow options to be matched through OpenFlow without requiring
explicit support in OVS userspace.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-20 15:19:35 -07:00
Jesse Gross
a5d4fadd00 netdev-vport: Use dpif_port as base for tunnel backing port.
In most cases, tunnel ports specify a dpif name to act as the backing
port in the datapath. However, in the case of UDP tunnels the type is
used with the port number appended. This is potentially a problem for
IPsec tunnels because they have different types but should have the
same backing port. The hasn't been a problem in practice though because
no UDP tunnels are currently used with IPsec.

This switches to use the dpif_port in all cases plus a port number if
necessary. It does this by making the names short enough to accomodate
ports, which also makes the naming more consistent.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-06-13 16:28:22 -07:00
Alex Wang
41ca1e0afb netdev-vport: Checks tunnel status change when route-table is reset.
Commit 3e912ffcbb (netdev: Add 'change_seq' back to netdev.) added per-
netdev change number for indicating status change.  Future commits used
this change number to optimize the netdev status update to database.
However, the work also introduced the bug in the following scenario:

- assume interface eth0 has address 1.2.3.4, eth1 has adddress 10.0.0.1.
- assume tunnel port p1 is set with remote_ip=10.0.0.5.
- after setup, 'ovs-vsctl list interface p1 status' should show the
  'tunnel_egress_iface="eth1"'.
- now if the address of eth1 is change to 0 via 'ifconfig eth1 0'.
- expectedly, after change, 'ovs-vsctl list interface p1 status' should
  show the 'tunnel_egress_iface="eth0"'

However, 'tunnel_egress_iface' will not be updated on current master.
This is in that, the 'netdev-vport' module corresponding to p1 does
not react to routing related changes.

To fix the bug, this commit adds a change sequence number in the route-
table module and makes netdev-vport check the sequence number for
tunnel status update.

Bug #1240626

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-02 14:29:18 -07:00
Alex Wang
3e912ffcbb netdev: Add 'change_seq' back to netdev.
This commit can be seen as a partial revert of commit
da4a619179d (netdev: Globally track port status changes)
by adding the 'change_seq' to 'struct netdev'.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-10 12:55:28 -07:00
Gurucharan Shetty
c89809c9fb netdev-vport: Don't look for ovs-monitor-ipsec's pid file.
We do not have pidfiles in Windows. And we do not yet have support
for ipsec tunnels. This lets us move forward with compilation.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-03-13 09:19:51 -07:00
Joe Stringer
da4a619179 netdev: Globally track port status changes
Previously, we tracked status changes for ofports on a per-device basis.
Each time in the main thread's loop, we would inspect every ofport
to determine whether the status had changed for corresponding devices.

This patch replaces the per-netdev change_seq with a global 'struct seq'
which tracks status change for all ports. In the average case where
ports are not constantly going up or down, this allows us to check the
sequence once per main loop and not poll any ports. In the worst case,
execution is expected to be similar to how it is currently.

In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces average CPU usage of the main thread from about 40% to
about 35%.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-12-12 15:03:19 -08:00
Lorand Jakab
a6363cfddb ofproto-dpif: add support for layer 3 ports
Add member is_layer3 to struct ofport_dpif to mark layer 3 ports.  Set
it to "true" for the only layer 3 port we support for now: lisp.

Additionally, prevent flooding to layer 3 ports.  A later patch will
also prevent MAC learning.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-11-19 11:06:04 -08:00
Joe Stringer
89eb6a4313 netdev: Make naming more consistent
netdev-dummy and netdev-vport use the function name netdev_poll_notify()
for the same purpose as netdev-linux/bsd's netdev_*_changed(). This patch
changes the former two to be more consistent with the linux/bsd naming
scheme.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-11-11 08:00:12 -08:00
Ethan Jackson
9e04d6f676 netdev-vport: Fix indentation.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
2013-09-17 14:16:10 -07:00
Ben Pfaff
89454bf477 netdev: Fix deadlock when netdev_dump_queues() callback calls into netdev.
We have a call chain like this:

    iface_configure_qos() calls
        netdev_dump_queues(), which calls
            netdev_linux_dump_queues(), which calls back through 'cb' to
                qos_unixctl_show_cb(), which calls
                    netdev_delete_queue(), which calls
                        netdev_linux_delete_queue().

Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.

This commit fixes the problem by getting rid of the callback.

netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.

Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-27 21:50:40 -07:00
Ben Pfaff
834d6cafe4 Use "error-checking" mutexes in place of other kinds wherever possible.
We've seen a number of deadlocks in the tree since thread safety was
introduced.  So far, all of these are self-deadlocks, that is, a single
thread acquiring a lock and then attempting to re-acquire the same lock
recursively.  When this has happened, the process simply hung, and it was
somewhat difficult to find the cause.

POSIX "error-checking" mutexes check for this specific problem (and
others).  This commit switches from other types of mutexes to
error-checking mutexes everywhere that we can, that is, everywhere that
we're not using recursive mutexes.  This ought to help find problems more
quickly in the future.

There might be performance advantages to other kinds of mutexes in some
cases.  However, the existing mutex type choices were just guesses, so I'd
rather go for easy detection of errors until we know that other mutex
types actually perform better in specific cases.  Also, I did a quick
microbenchmark of glibc mutex types on my host and found that the
error checking mutexes weren't any slower than the other types, at least
when the mutex is uncontended.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-08-20 13:40:02 -07:00
Ben Pfaff
863838160e netdev: Make netdev access thread-safe.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-09 21:34:02 -07:00
Ben Pfaff
161b6042d8 netdev-vport: Make netdev_vport_patch_peer() return a malloc()'d string.
When threading comes into the picture there arises the possibility of a
race between netdev_vport_patch_peer()'s caller using the returned string
and another caller changing the peer.  It is safer to return a copy.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-09 21:23:19 -07:00
Ben Pfaff
9dc63482bb netdev: Adopt four-step alloc/construct/destruct/dealloc lifecycle.
This is the same lifecycle used in the ofproto provider interface.
Compared to the previous netdev provider interface, it has the
advantage that the netdev top layer can control when any given
netdev becomes visible to the outside world.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-09 21:21:38 -07:00
Ben Pfaff
9366380ae8 netdev-vport: Use ovs_mutex rather than a raw pthread_mutex_t.
I'd forgotten even to use the xpthread variants here.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-07 23:40:31 -07:00
Ben Pfaff
6027d03dc3 netdev-vport: Make pid checking in set_tunnel_config() thread-safe
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-07-23 11:38:17 -07:00
Ethan Jackson
6cbbf4fa07 ofproto-dpif: Store patch port peer in struct ofport_dpif.
This removes ofproto-dpif-xlate's dependency on ofport_get_peer()
which, while cleaner in-and-of itself, will become more important
as ofproto-dpif_xlate modularizes.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-06-18 13:14:18 -07:00