2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-29 05:18:13 +00:00

422 Commits

Author SHA1 Message Date
Jarno Rajahalme
e0eecb1ca1 lib: Use tcp_flags from flow.
TCP flags are already extracted from the flow, no need to parse them
again.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-03-19 16:13:32 -07:00
Jarno Rajahalme
855dd13c9a dpif-netdev: Use packet key to parse TCP flags.
The flow that created the netdev_flow might have wildcarded TCP flags,
or it may not be a TCP flow at all.  Fix this by using the freshly
extracted flow key to parse TCP flags.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-03-19 16:13:32 -07:00
Ben Pfaff
61e7deb143 dpif-netdev: Use RCU to protect data.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2014-03-19 07:48:43 -07:00
Ben Pfaff
679ba04cab dpif-netdev: Use ovsthread_stats for flow stats.
This should scale better than a single mutex, though still not
ideally.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2014-03-19 07:48:42 -07:00
Ben Pfaff
51852a57a0 ovs-thread: Replace ovsthread_counter by more general ovsthread_stats.
This allows clients to do more than just increment a counter.  The
following commit will make the first use of that feature.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2014-03-19 07:47:12 -07:00
Andy Zhou
1a65ba8544 dpif-netdev: init atomic flag dp->destroyed
It is better to explicitly initialize the dp->destroy than to rely
on xzalloc().

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-03-18 00:40:10 -07:00
Ben Pfaff
8917f72cbb ovs-atomic: Delete atomic, atomic_flag, ovs_refcount destroy functions.
None of the atomic implementations need a destroy function anymore, so it's
"more standard" and more convenient for users to get rid of them.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2014-03-13 12:45:47 -07:00
Andy Zhou
b5e7e61a99 lib: simplify flow_extract() API
Change the flow_extract() API to accept struct pkt_metadata,
instead of individual metadata fields. It will make the API more
logical and easier to maintain when we need to expand metadata
down the road.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>¬
2014-02-28 16:29:37 -08:00
Joe Stringer
bdeadfdd95 dpif: New function flow_dump_next_may_destroy_keys().
This new function allows callers to determine whether previously
returned keys will be modified or reallocated on the next call to
dpif_flow_dump_next(). This will be used in a future commit to allow
batched flow deletion by revalidator threads.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:39:21 -08:00
Joe Stringer
d2ad7ef178 dpif: Make dpif_flow_dump_next() thread-safe.
This patch makes it the caller's responsibility to initialize a
per-thread 'state' object and pass it down to the dpif_flow_dump_next()
implementation. The implementation can expect to be called from multiple
threads with the same 'iter' and different 'state' objects.

When flow_dump_next() returns non-zero, the implementation must ensure
that subsequent calls with the same arguments also return non-zero.
Subsequent calls with the same 'iter' and different 'state' may return
zero, but should make progress towards returning non-zero.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:30:25 -08:00
Joe Stringer
e723fd32d5 dpif: Separate local and shared flow dump state.
This patch separates the structures for thread-local flow dump state
("state") from the shared flow dump state ("iter") in dpif-linux and
dpif-netdev. Future patches will make use of this to allow multiple
threads to dump flows from the same flow dump operation.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-02-27 14:27:32 -08:00
Alex Wang
71c24bb0f8 dpif-netdev: Fix memory leak.
In dpif_netdev_flow_del() and dp_netdev_port_input(), the
referenced 'netdev_flow' is not un-referenced.  This causes
the leak of the struct's memory.

This commit fixes the above issue by calling dp_netdev_flow_unref()
after using the reference.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-02-21 14:07:46 -08:00
Alex Wang
3754832be4 dpif-netdev: Call ovs_refcount_destroy() before free().
This commit makes dp_netdev_flow_unref() and dp_netdev_actions_unref()
invoke the ovs_refcount_destroy() before freeing the corresponding
pointer.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-02-21 14:05:07 -08:00
Ben Pfaff
8bfd0fdace Enhance userspace support for MPLS, for up to 3 labels.
This commit makes the userspace support for MPLS more complete.  Now
up to 3 labels are supported.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Co-authored-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Simon Horman <horms@verge.net.au>
2014-02-04 10:41:30 -08:00
Ben Pfaff
80e448834d dpif-netdev: Make a log message more detailed.
This would have helped me track down a bug I was hunting just now.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2014-02-04 08:11:45 -08:00
Ben Pfaff
06f8162043 classifier: Use fat_rwlock instead of ovs_rwlock.
Jarno Rajahalme reported up to 40% performance gain on netperf TCP_CRR with
an earlier version of this patch in combination with a kernel NUMA patch,
together with a reduction in variance:
    http://openvswitch.org/pipermail/dev/2014-January/035867.html

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-01-14 14:45:10 -08:00
Ben Pfaff
6c3eee823e dpif-netdev: Use separate threads for forwarding.
For now, we use exactly two threads.  Presumably at some point we will want
to make this configurable.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-08 17:13:32 -08:00
Ben Pfaff
8a4e3a858a dpif-netdev: Make thread-safety much more granular.
This will allow for parallelism in multithreaded forwarding in an upcoming
commit.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-08 17:13:32 -08:00
Ben Pfaff
f5126b5727 dpif-netdev: Introduce new mutex to protect queues.
This is a first step in making thread safety more granular in dpif-netdev,
to allow for multithreaded forwarding.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-08 17:13:31 -08:00
Ben Pfaff
a84cb64a9e dpif-netdev: Break actions out into new struct dp_netdev_actions.
This is analogous to the split between rule and rule_actions in
ofproto.  As there, it will allow retaining a reference to a rule's
actions, while processing them, without having to retain a reference
to the rule itself.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-08 17:13:31 -08:00
Ben Pfaff
6a8267c5b7 dpif-netdev: Take advantage of ovs_refcount for dp_netdev.
By making "destroyed" own a reference, we can treat dp_netdev's ref_cnt
like any other in Open vSwitch.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-08 17:13:31 -08:00
Ben Pfaff
5c8d2fcad0 dpif-netdev: Remove max_mtu tracking.
Normally all the ports have the same mtu anyhow, so there is little
advantage in keeping track of the maximum mtu on a per-bridge basis.  In
upcoming commits, tracking mtu will require more locking and present
even less advantage (because the packet buffer will become per-thread, so
that reallocating once per thread becomes essentially a null cost).

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-01-08 17:11:14 -08:00
Ben Pfaff
ff073a71f9 dpif-netdev: Use hmap instead of list+array for tracking ports.
The goal is to make it easy to divide the ports into groups for handling
by threads.  It seems easy enough to do that by hash value, and a little
harder otherwise.

This commit has the side effect of raising the maximum number of ports from
256 to UINT32_MAX-1.  That is why some tests need to be updated:
previously, internally generated port names like "ovs_vxlan_4341" were
ignored because 4341 is bigger than the previous limit of 256.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-01-08 17:11:09 -08:00
Ben Pfaff
ed27e010b9 dpif-netdev: Use new "ovsthread_counter" to track dp statistics.
ovsthread_counter is an abstract interface that could be implemented
different ways.  The initial implementation is simple but less than
optimally efficient.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-01-08 17:10:32 -08:00
Ben Pfaff
9e5026938c dpif: Remove unused 'get_max_ports' from provider interface.
Nothing ever called this function.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-01-08 17:10:31 -08:00
Jarno Rajahalme
758c456df5 dpif: Use explicit packet metadata.
This helps reduce confusion about when a flow is a flow and when it is
just metadata.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-12-30 16:52:43 -08:00
Jarno Rajahalme
09f9da0bca odp-execute: Consolidate callbacks.
Use one callback instead of many, helps in adding new functionality
later on.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-12-30 15:58:58 -08:00
Simon Horman
77790ca7b1 dpif-netdev: Remove unnecessary parameters from dp_netdev_port_input()
The skb_priority, pkt_mark and tunl parameters of dp_netdev_port_input()
are always passed as 0, 0 and NULL respectively. So rather than
passing these values to dp_netdev_port_input() just use them directly.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-12-17 16:31:34 -08:00
Francesco Fusco
1ce3fa06fc dpif-linux: fix the size of n_masks
The command ovs-dpctl can wrongly output the masks even if the
datapath does not implement mega flows. In this case the output
will be similar to the following:

system@ovs-system:
	lookups: hit:14 missed:41 lost:0
	flows: 0
	masks: hit:18446744073709551615 total:4294967295
		hit/pkt:335395346794719104.00
	port 0: ovs-system (internal)
	port 1: gre_system (gre: df_default=false, ttl=0)
	port 2: ots-br0 (internal)
	port 3: int0 (internal)
	port 4: vnet0
	port 5: vnet1

The problem depends on the fact that n_masks stats is stored as a
uint32 in the struct ovs_dp_megaflow_stats and as a uint64 in the
struct dpif_dp_stats. UINT32_MAX instead of UINT64_MAX should be
used to detect if the datapath supports megaflows or not.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-12-17 13:20:28 -08:00
Jarno Rajahalme
da546e0764 dpif: Allow execute to modify the packet.
Allowing the packet to be modified by execution allows less data
copying for userspace action execution.  Some users of the
dpif_execute already expect that the packet may be modified.  This
patch makes this behavior uniform and makes the userspace datapath and
the execution helpers modify the packet as it is being executed.
Userspace action now steals the packet if given permission, as the
packet is normally not needed after it.  The only exception is the
sample action, and this is accounted for my keeping track of any
actions that could be following the userspace action.

The packet in dpif_upcall is changed from a pointer to a struct,
allowing the packet to be honest about it's headroom.  After this
change the packet can safely be pushed on over the precarious 4 byte
limit earlier allowed by the netlink data preceding the packet.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-12-16 08:14:52 -08:00
Jarno Rajahalme
8c301900fc dpif-netdev: Properly create exact match masks.
Normally OVS userspace supplies a mask along with a flow key for each
new data path flow that should be created.  OVS also provides an
option to disable the kernel wildcarding, in which case the flows are
created without a mask.  When kernel wildcarding is disabled, the
datapath should use exact match, i.e. not wildcard any bits in the
flow key.  Currently, what happens with the userspace datapath instead
is that a datapath flow with mostly empty mask is created (i.e., most
fields are wildcarded), as the current code does not examine the given
mask key length to find out that the mask key is actually empty.  This
results in the same datapath flow matching on packets of multiple
different flows, wrong actions being processed, and stats being
incorrect.

This patch refactors userspace datapath code to explicitly initialize
a suitable exact match mask when a flow put without a mask is
executed.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-12-11 11:07:01 -08:00
Jarno Rajahalme
476f36e83b Classifier: Staged subtable matching.
Subtable lookup is performed in ranges defined for struct flow,
starting from metadata (registers, in_port, etc.), then L2 header, L3,
and finally L4 ports.  Whenever it is found that there are no matches
in the current subtable, the rest of the subtable can be skipped.  The
rationale of this logic is that as many fields as possible can remain
wildcarded.


Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
2013-11-19 17:31:29 -08:00
Jarno Rajahalme
9080a11199 dpif-netdev: Maintain the original key during execution.
Userspace action needs the original flow key.  This also
matches the kernel datapath behavior.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-11-14 14:35:58 -08:00
Jarno Rajahalme
24b7e19469 dpif_netdev_execute: Extract flow key from the packet.
Extract the flow key from the packet instead of the execute->key.
This reflects how the kernel datapath behaves.

Also use ofpbuf_clone_with_headroom() instead of open coding the same.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-11-14 14:35:58 -08:00
Gurucharan Shetty
2c0ea78f0a dpif-netdev: Introduce a classifier in userspace datapath.
Instead of an exact match flow table, we introduce a classifier.
This enables mega-flows in userspace datapath.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
[blp@nicira.com tweaked flow lookup]
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-11-13 10:13:49 -08:00
Gurucharan Shetty
1763b4b8d8 dpif-netdev: Change a variable name.
'struct dp_netdev_flow' is currently being instantiated as 'flow'.
An upcoming commit introduces a classifier to dpif-netdev
which uses 'struct flow' at a few places and that can cause
confusion while reading code.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-11-04 07:26:55 -08:00
Jarno Rajahalme
a66733a8bc Widen TCP flags handling.
Widen TCP flags handling from 7 bits (uint8_t) to 12 bits (uint16_t).
The kernel interface remains at 8 bits, which makes no functional
difference now, as none of the higher bits is currently of interest
to the userspace.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2013-10-29 09:40:19 -07:00
Andy Zhou
847108dc29 dpif-linux: collect and display mega flow mask stats
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-10-22 10:23:22 -07:00
Ben Pfaff
4fc6592603 odp-execute: Refine signatures for odp_execute_actions() callbacks.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-10-09 17:14:40 -07:00
Alexandru Copot
2499a8ce82 dpif-netdev: Do not allow adding loopback devices
Signed-off-by: Alexandru Copot <alex.mihai.c@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-09-07 09:34:16 -07:00
Alex Wang
1dd16b9a27 dpif: Change get_max_ports() to return uint32_t.
The declaration of 'get_max_ports()' to return odp_port_t adds
unwanted complexity to coding.  This commit changes it back to
return uint32_t type.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-23 11:33:17 -07:00
Jesse Gross
1362e248d6 flow: Rename skb_mark to pkt_mark.
The skb_mark field is currently only available with the Linux datapath
and is only used internally. However, it is desirable to expose this
through OpenFlow and when it is exposed ideally it would not be system-
specific. In preparation for this, skb_mark is rename to pkt_mark in
internal data structures for consistency.

This does not rename the Linux interfaces because doing so would break
the API. It would not necessarily be desirable to do anyways since in
Linux-specific code it is clearer to use the actual name rather than a
generic one. This can lead to confusion in some places, however, because
we do not always strictly separate generic and platform dependent code
(one example is actions). This seems inevitable though at this point if
the lower and upper layers have different names (as they must given the
above requirements).

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
2013-08-13 14:39:39 -07:00
Ben Pfaff
d33ed21806 dpif-netdev: Avoid races on queue and port changes using seq objects.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-08-10 20:49:03 -07:00
Ethan Jackson
97be153858 clang: Add annotations for thread safety check.
This commit adds annotations for thread safety check. And the
check can be conducted by using -Wthread-safety flag in clang.

Co-authored-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-30 21:30:45 -07:00
Ben Pfaff
74cc3969af ofproto-dpif: Tolerate spontaneous changes in datapath port numbers.
This can happen on ESX.

Also adds a test to make sure this works.

Bug #17634.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Tested-by: Guolin Yang <gyang@vmware.com>
2013-07-29 15:11:49 -07:00
Ben Pfaff
5279f8fdf0 dpif-netdev: Make internally thread-safe by introducing a global mutex.
This can be improved later but it is the simple thing to do for now.

I marked a couple of races with XXX.  I don't have a really good solution
for these, but I hope to find one.  They may be harmless in practice.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-07-25 09:56:10 -07:00
Ben Pfaff
3b0aab9343 dpif-netdev: Make 'max_mtu' a per-dp feature, for thread safety.
This ensures that an external lock around a dpif_netdev will allow
thread-safe access to it.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-22 10:53:07 -07:00
Ben Pfaff
586ddea56b dpif-netdev: Make "packet-out" with in_port=OFPP_CONTROLLER work again.
Commit 4e022ec09e14 (Create specific types for ofp and odp port) broke
OpenFlow OFPP_PACKET_OUT requests that use in_port=OFPP_CONTROLLER.  This
commit fixes the problem and adds a regression test.

CC: Alex Wang <alexw@nicira.com>
Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-07-11 10:33:35 -07:00
Ben Pfaff
10a89ef04d Replace all uses of strerror() by ovs_strerror(), for thread safety.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-28 16:09:38 -07:00
Alex Wang
4e022ec09e Create specific types for ofp and odp port
Until now, datapath ports and openflow ports were both represented by
unsigned integers of various sizes. With implicit conversions, etc., it is
easy to mix them up and use one where the other is expected.  This commit
creates two typedefs, ofp_port_t and odp_port_t.  Both of these two types
are marked by "__attribute__((bitwise))" so that sparse can be used to
detect any misuse.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-20 10:42:37 -07:00