rcu_dereference_raw() api is cleaner way of accessing RCU pointer
when no locking is required.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
All functions used struct vport *vport except
ovs_vport_find_upcall_portid.
This fixes 1 kerneldoc warning
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
OVS keeps pointer to packet key in skb->cb, but the packet key is
store on stack. This could make code bit tricky. So it is better to
get rid of the pointer.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Change the date type of error status from u64 to atomic_long_t, and use atomic
operation, then remove the lock which is used to protect the error status.
The operation of atomic maybe faster than spin lock.
Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This was required for old compatibility code which update stats
on fake bond interface. Now vswitchd has dropped it. This
support was always deprecated, so finally removing it.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
There are many drivers calling alloc_percpu() to allocate pcpu stats
and then initializing ->syncp. So just introduce a helper function for them.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Future patches will change the recirc action implementation to not
using recursion. The stack depth detection is no longer necessary.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Extend IPFIX exporter to export tunnel headers when both input and output
of the port.
Add three other_config options in IPFIX table: enable-input-sampling,
enable-output-sampling and enable-tunnel-sampling, to control whether
sampling tunnel info, on which direction (input or output).
Insert sampling action before output action and the output tunnel port
is sent to datapath in the sampling action.
Make datapath collect output tunnel info and send it back to userpace
in upcall message with a new additional optional attribute.
Add a tunnel ports map to make the tunnel port lookup faster in sampling
upcalls in IPFIX exporter. Make the IPFIX exporter generate IPFIX template
sets with enterprise elements for the tunnel info, save the tunnel info
in IPFIX cache entries, and send IPFIX DATA with tunnel info.
Add flowDirection element in IPFIX templates.
Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
Acked-by: Romain Lenglet <rlenglet@vmware.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
ovs_vport_alloc() bails out without freeing the memory 'vport' points
to.
Picked up by Coverity - CID 1230503.
Fixes: beb1c69a3 ("datapath: Allow each vport to have an array of
'port_id's.")
Signed-off-by: Christoph Jaeger <cj@linux.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Currently tun_info is used for passing tunnel information
on ingress and egress path, this cause confusion. Following
patch removes its use on ingress path make it egress only parameter.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
Following patch adds NULL check for memory allocated
by kmalloc.
Reported-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
The upstream u64_stats API has been changed to remove the _bh()
versions and switch all consumers to use IRQ safe variants instead.
This was done to be safe for netpoll generated packets, which can
occur in hard IRQ context. From a safety perspective, this doesn't
directly affect OVS since it doesn't support netpoll. However, this
change has been backported to older kernels so OVS needs to use the
new API to compile.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pritesh Kothari <pritesh.kothari@cisco.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
On packet recv OVS CB is initialized in multiple function.
Following patch moves all these initialization to
ovs_vport_receive().
This patch also save a check in execute actions.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This adds support for Geneve - Generic Network Virtualization
Encapsulation. The protocol is documented at
http://tools.ietf.org/html/draft-gross-geneve-00
The kernel implementation is completely agnostic to the options
that are in use and can handle newly defined options without
further work. It does this by simply matching on a byte array
of options and allowing userspace to setup flows on this array.
Userspace currently implements only support for basic version of
Geneve. It can work with the base header (including the VNI) and
is capable of parsing options but does not currently support any
particular option definitions. Over time, the intention is to
allow options to be matched through OpenFlow without requiring
explicit support in OVS userspace.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Currently, the flow information that is matched for tunnels and
the tunnel data passed around with packets is the same. However,
as additional information is added this is not necessarily desirable,
as in the case of pointers.
This adds a new structure for tunnel metadata which currently contains
only the existing struct. This change is purely internal to the kernel
since the current OVS_KEY_ATTR_IPV4_TUNNEL is simply a compressed version
of OVS_KEY_ATTR_TUNNEL that is translated at flow setup.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
In order to allow handlers directly read upcalls from datapath,
we need to support per-handler netlink socket for each vport in
datapath. This commit makes this happen. Also, it is guaranteed
to be backward compatible with previous branch.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Add support for building the in-tree kernel datapath for
Linux kernels up to 3.13. There were some changes in the
netlink area which required adding new compatibility code
for this layer. Also, some new per-cpu stats initialization
code was added.
Based on patch from Kyle Mestery.
Signed-off-by: Kyle Mestery <mestery@noironetworks.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Kyle Mestery <mestery@noironetworks.com>
Several functions and datastructures could be local
Found with 'make namespacecheck'
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Now that GRE support has been upstreamed into Linux, OVS is
using the components in the native kernel when available. However,
this means that it is now dependent on the appropriate kernel
config, which is CONFIG_NET_IPGRE_DEMUX on 2.6.37 and later.
Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
When sending a packet, a positive length indicates success and a
negative length indicates failure. However, the check for success
looked for non-zero values which catches both of these cases. This
can result in incorrect stats and leak memory on failure.
Introduced by commit be7cd27e44 (datapath:
Unify vport error stats handling.).
CC: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
VPORT_F_TUN_ID is last remaining flag, once we remove it, flags
field from vport-ops can be removed. Since it does not complicate
much code, we decided to remove this flag.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
vport->init and exit() functions are defined by gre and netdev vport
only and both can be moved to first port create.
Following patch does same, it moves vport init to respective vport
create and gets rid of vport->init() and vport->exit() functions.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Following patch changes vport->send return type so that vport
layer can do error accounting.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
After flow based tunneling, kernel tunneling is greatly simplified.
There is no need to have extra tunneling layer between vport and
particular protocol.
Following patch removes tunneling struct which make code easy to read.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Rather than defining ovs specific stats struct (vport_percpu_stats),
we can use existing pcpu_tstats to achieve exactly same functionality.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Currently OVS uses combination of genl and rtnl lock to protect
datapath state. This was done due to networking stack locking.
But this has complicated locking and there are few lock ordering
issues with new tunneling protocols.
Following patch simplifies locking by introducing new ovs mutex
and now this lock is used to protect entire ovs state.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
The port specific options are currently unused resulting in an
empty OVS_VPORT_ATTR_OPTIONS nested attribute being inserted
into every OVS_VPORT_CMD_GET message.
Don't insert OVS_VPORT_ATTR_OPTIONS if no options are present.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
[jesse: Options are used by tunnels but the concept still applies.]
Signed-off-by: Jesse Gross <jesse@nicira.com>
Header caching used to store a precomputed flow along with the skb
but no longer exists. There were a few remaining checks for those
flows, which this removes. It simplifies the code slightly and brings
us closer to upstream.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Gross <jesse@nicira.com>
LISP is an experimental layer 3 tunneling protocol, described in RFC
6830. This patch adds support for LISP tunneling. Since LISP
encapsulated packets do not carry an Ethernet header, it is removed
before encapsulation, and added with hardcoded source and destination
MAC addresses after decapsulation. The harcoded MAC chosen for this
purpose is the locally administered address 02:00:00:00:00:00. Flow
actions can be used to rewrite this MAC for correct reception. As such,
this patch is intended to be used for static network configurations, or
with a LISP capable controller.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
The CAPWAP implementation is just the encapsulation format and
therefore really not the full protocol. While there were some
uses of it (primarily hardware support and UDP transport). But
these are most likely better provided by VXLAN.
Following patch removes CAPWAP tunneling support.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Now that userspace implements patch ports completely internally,
it's possible to remove the kernel implementation of them.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
We want to move the GRE vport ID into the upstream range but in
order to ease the transition kept the old ID around for one release.
This removes the old value.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
The ability to retrieve and set MAC addresses on vports is only
necessary for tunnel ports (the addresses for actual devices can be
retrieved through direct Linux mechanisms). Tunnel ports only used
the information for the purpose of generating path MTU discovery
packets, which has now been removed. Current userspace code already
reflects these changes, so this drops the functionality from the
kernel.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
Currently brcompat does not work on master due to recent
datapath changes. We have decided to remove it as it is
not used very widely.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Add support for VXLAN tunnels to Open vSwitch. Add support
for setting the destination UDP port on a per-port basis.
This is done by adding a "dst_port" parameter to the port
configuration. This is only applicable currently to VXLAN
tunnels.
Please note this currently does not implement any sort of multicast
learning. With this patch, VXLAN tunnels must be configured similar
to GRE tunnels (e.g. point to point). A subsequent patch will implement
a VXLAN control plane in userspace to handle multicast learning.
This patch set is based on one posted by Ben Pfaff on Oct. 12, 2011
to the ovs-dev mailing list:
http://openvswitch.org/pipermail/dev/2011-October/012051.html
The patch has been maintained, updated, and freshened by me and a
version of it is available at the following github repository:
https://github.com/mestery/ovs-vxlan/tree/vxlan
I've tested this patch with multiple VXLAN tunnels between hosts
using different UDP port numbers. Performance is on par (though
slightly faster) than comparable GRE tunnels.
See the following IETF draft for additional information about VXLAN:
http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
[jesse: simplify error path in vxlan_tunnel_setup, don't print default VXLAN port,
and remove dead code]
Signed-off-by: Jesse Gross <jesse@nicira.com>
just use more faster this_cpu_ptr instead of per_cpu_ptr(p, smp_processor_id());
Signed-off-by: Shan Wei <davidshan@tencent.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Once GRE is upstream it will have new type to have continuous sequence
of ids for vport type. Following patch adds this ID to have
compatibility with it.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This is a first pass at providing a tun_key which can be
used as the basis for flow-based tunnelling. The
tun_key includes and replaces the tun_id in both struct
ovs_skb_cb and struct sw_tun_key.
This patch allows all existing tun_id behaviour to still work. Existing
users of tun_id are redirected to tun_key->tun_id to retain compatibility.
However, when the userspace code is updated to make use of the new
tun_key, the old behaviour will be deprecated and removed.
NOTE: With these changes, the tunneling code no longer assumes input and
output keys are symmetric. If they are not, PMTUD needs to be disabled
for tunneling to work.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Extend GRE to have a 64-bit key. Use GRE sequence number to
store upper 32-bits of the key, but this is not standard way of
using GRE sequence number.
Bug #13186
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>