2
0
mirror of https://github.com/openvswitch/ovs synced 2025-10-25 15:07:05 +00:00
Commit Graph

85 Commits

Author SHA1 Message Date
Jarno Rajahalme
e65413ab8d lib/classifier: Use internal mutex.
Add an internal mutex to struct cls_classifier, and reorganize
classifier internal structures according to the user of each field,
marking the fields that need to be protected by the mutex.  This makes
locking requirements easier to track, and may make lookup more memory
efficient.

After this patch there is some double locking, as callers are taking
the fat-rwlock, and we take the mutex internally.  A following patch
will remove the classifier fat-rwlock, removing the (double) locking
overhead.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
2014-07-11 04:19:30 -07:00
Jarno Rajahalme
5f0476ce4e lib/classifier: Simplify iteration with C99 declaration.
Hide the cursor from the classifier iteration users and move locking to
the iterators.  This will make following RCU changes simpler, as the call
sites of the iterators need not be changed at that point.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
2014-07-11 04:19:29 -07:00
Jarno Rajahalme
f2c214029e lib/classifier: Use cmap.
Use cmap instead of hmap & hindex in classifier.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
2014-07-11 02:29:07 -07:00
Jarno Rajahalme
fe7cfa5c3f lib/pvector: Non-intrusive RCU priority vector.
Factor out the priority vector code from the classifier.

Making the classifier use RCU instead of locking requires parallel
access to the priority vector, pointing to subtables in descending
priority order.  When a new subtable is added, a new copy of the
priority vector is allocated, while the current readers can keep on
using the old copy they started with.  Adding and removing subtables
is usually less frequent than adding and removing rules, so this
should not have a visible performance implication.  As an optimization
for the userspace datapath use, where all the subtables have the same
priority, new subtables can be added to the end of the vector without
reallocation and without disturbing readers.

cls_subtables_reset() is now removed, as it served its purpose in bug
hunting.  Checks on the new pvector are now incorporated into
tests/test-classifier.c.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-06-26 07:41:25 -07:00
Daniele Di Proietto
c424c2f734 test-classifier: add ovs_assert to prevent warning
GCC 4.9.0 triggers a warning (array-bounds) while compiling test-classifier.c
This commit introduces an assertion that suppresses the warning.

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-06-24 13:21:06 -07:00
Jarno Rajahalme
5a87054c2d lib/classifier: Rename 'cls_subtable_cache' as 'cls_subtables'.
'cache' gives an inexact connotation, as the list is always expected
to be in order and contain pointers to all the subtables.

The struct cls_subtables fields are are also renamed to be more readable.

struct cls_classifier fields 'subtables' is remamed to 'subtables_map' and
'subtables_priority' is renamed to 'subtables',

There are no functional changes in this patch.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-05-19 10:41:03 -07:00
Jarno Rajahalme
627fb667b2 lib/classifier: Separate cls_rule internals from the API.
Keep an internal representation of a rule separate from the one
embedded into user's structs.  This allows for further memory
optimization in the classifier.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-04-29 15:50:38 -07:00
Jarno Rajahalme
cabd4c4385 lib/classifier: Hide more of the internal data structures.
It is better not to expose definitions not needed by users.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-04-29 15:50:38 -07:00
Jarno Rajahalme
3d91d9094d lib: Inline functions used in classifier_lookup.
This helps about 1% in TCP_CRR performance test.  However, this also
helps by clearly showing the classifier_lookup() cost in perf reports
as one item.

This also cleans up the flow/match APIs from functionality only used
by the classifier, making is more straightforward to evolve them
later.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-04-29 15:50:38 -07:00
Jarno Rajahalme
28a560d97a lib/flow: Simplify miniflow accessors, add ipv6 support.
Add new macro MINIFLOW_MAP(FIELD) that returns the map covering the
given struct flow field.

Change the miniflow accessors to macros so that they can take the
field name directly.

Use these to add ipv6 support to miniflow_hash_5tuple().

Add ipv6 support to flow_hash_5tuple() as well so that these two
functions continue to return the same hash value for the corresponding
flows.

Also, simplify miniflow_get_metadata().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-04-29 15:50:38 -07:00
Jarno Rajahalme
b7807e4f64 lib/flow: Add miniflow accessors and miniflow_get_tcp_flags().
Add inlined generic accessors for miniflow integer type fields, and a
new miniflow_get_tcp_flags() usinge these.  These will be used in a
later patch.

Some definitions also used in lib/packets.h had to be moved there to
resolve circular include dependencies.  Similarly, some inline
functions using struct flow are now in lib/flow.h.  IMO this is
cleaner, since now the lib/flow.h need not be included from
lib/packets.h.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Reviewed-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
2014-04-18 08:34:08 -07:00
Andy Zhou
eadd16449c unit-test: Link 29 test programs into ovstest
Improve link speed by linking 29 test programs into ovstest.

On my machine, running the following command against a fully
built tree:

  $ touch lib/random.c; time make

Improve the overall build time from 7 seconds to 3.5 seconds.

Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-04-03 11:17:17 -07:00
Ben Pfaff
06f8162043 classifier: Use fat_rwlock instead of ovs_rwlock.
Jarno Rajahalme reported up to 40% performance gain on netperf TCP_CRR with
an earlier version of this patch in combination with a kernel NUMA patch,
together with a reduction in variance:
    http://openvswitch.org/pipermail/dev/2014-January/035867.html

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2014-01-14 14:45:10 -08:00
Harold Lim
428b2eddc9 Rename NOT_REACHED to OVS_NOT_REACHED
This allows other libraries to use util.h that has already
defined NOT_REACHED.

Signed-off-by: Harold Lim <haroldl@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-12-17 13:16:39 -08:00
Jarno Rajahalme
13751fd88c Classifier: Track address prefixes.
Add a prefix tree (trie) structure for tracking the used address
space, enabling skipping classifier tables containing longer masks
than necessary for an address field value in a packet header being
classified.  This enables less unwildcarding for datapath flows in
parts of the address space without host routes.

Trie lookup is interwoven to the staged lookup, so that a trie is
searched only when the configured trie field becomes relevant
for the lookup.  The trie lookup results are retained so that each
trie is checked at most once for each classifier lookup.

This implementation tracks the number of rules at each address prefix
for the whole classifier.  More aggressive table skipping would be
possible by maintaining lists of tables that have prefixes at the
lengths encountered on tree traversal, or by maintaining separate
tries for subsets of rules separated by metadata fields.

Prefix tracking is configured via OVSDB.  A new column "prefixes" is
added to the database table "Flow_Table".  "prefixes" is a set of
string values listing the field names for which prefix lookup should
be used.

As of now, the fields for which prefix lookup can be enabled are:
- tun_id, tun_src, tun_dst
- nw_src, nw_dst (or aliases ip_src and ip_dst)
- ipv6_src, ipv6_dst

There is a maximum number of fields that can be enabled for any one
flow table.  Currently this limit is 3.

Examples:

ovs-vsctl set Bridge br0 flow_tables:0=@N1 -- \
 --id=@N1 create Flow_Table name=table0
ovs-vsctl set Bridge br0 flow_tables:1=@N1 -- \
 --id=@N1 create Flow_Table name=table1

ovs-vsctl set Flow_Table table0 prefixes=ip_dst,ip_src
ovs-vsctl set Flow_Table table1 prefixes=[]

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-12-11 11:07:01 -08:00
Jarno Rajahalme
476f36e83b Classifier: Staged subtable matching.
Subtable lookup is performed in ranges defined for struct flow,
starting from metadata (registers, in_port, etc.), then L2 header, L3,
and finally L4 ports.  Whenever it is found that there are no matches
in the current subtable, the rest of the subtable can be skipped.  The
rationale of this logic is that as many fields as possible can remain
wildcarded.


Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
2013-11-19 17:31:29 -08:00
Jarno Rajahalme
0386824614 classifier: Rename struct cls_table as cls_subtable.
The naming of the classifier table has been a source of confusion,
since each OpenFlow table is implemented as a classifier, which
consists of multiple (sub)tables.  This name change hopefully makes
classifier related discussion a bit less confusing.

For consistency, relevant field names as well as the function and
variable names have been renamed in similar fashion.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-10-29 18:41:51 -07:00
Ben Pfaff
b826639572 openvswitch/types.h: New macros OVS_BE16_MAX, OVS_BE32_MAX, OVS_BE64_MAX.
These seem slightly nicer than e.g. htons(UINT16_MAX).

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-09-17 16:17:26 -07:00
Ethan Jackson
0b4f207828 classifier: Make use of the classifier thread safe.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2013-08-09 13:26:14 -07:00
Ben Pfaff
b028db44ca Use random_*() instead of rand(), for thread safety.
None of these test programs are threaded, but has little cost and means
that "grep" doesn't turn up any instances of these thread-unsafe functions
in our tree.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-28 16:09:39 -07:00
Alex Wang
e2711da9eb test-classifier.c: Use UINT16_MAX instead of OFPP_NONE in mask assignment
It is more comprehensible to use UINT16_MAX in mask assignment than OFPP_NONE.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-20 15:20:03 -07:00
Alex Wang
4e022ec09e Create specific types for ofp and odp port
Until now, datapath ports and openflow ports were both represented by
unsigned integers of various sizes. With implicit conversions, etc., it is
easy to mix them up and use one where the other is expected.  This commit
creates two typedefs, ofp_port_t and odp_port_t.  Both of these two types
are marked by "__attribute__((bitwise))" so that sparse can be used to
detect any misuse.

Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-06-20 10:42:37 -07:00
Ethan Jackson
368eefac37 flow: Add new wildcard functions.
Rename the function flow_wildcards_combine() to flow_wildcards_and().
Add new flow_wildcards_or() and flow_hash_in_wildcards() functions.
These will be useful in a future patch.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
2013-06-11 13:03:50 -07:00
Ethan Jackson
74f74083e6 classifier: Add 'wc' argument to classifier_lookup().
A future commit will want to know what bits were significant during the
classifier lookup.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
Co-authored-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
2013-06-11 13:03:50 -07:00
Jarno Rajahalme
4d935a6bcf Optimize classifier by maintaining the priority of the highest priority rule in each table.
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2013-02-08 12:35:51 -08:00
Ben Pfaff
f2f3f5cbde tests: Fix memory leaks in test-classifier program.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2013-01-10 08:32:11 -08:00
Jesse Gross
296e07ace0 flow: Extend struct flow to contain tunnel outer header.
Soon the kernel will begin supplying the information about the outer
IP header for tunneled packets and userspace will need to be able to
track it as part of the flow.  For the time being this is only used
internally by OVS and not exposed outwards to OpenFlow.  As a result,
this threads the information throughout userspace but simply stores
the existing tun_id in it.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-10-03 10:04:10 -07:00
Ben Pfaff
5cb7a79840 Introduce sparse flows and masks, to reduce memory usage and improve speed.
A cls_rule is 324 bytes on i386 now.  The cost of a flow table lookup is
currently proportional to this size, which is going to continue to grow.
However, the required cost of a flow table lookup, with the classifier that
we currently use, is only proportional to the number of bits that a rule
actually matches.  This commit implements that optimization by replacing
the match inside "struct cls_rule" by a sparse representation.

This reduces struct cls_rule to 100 bytes on i386.

There is still some headroom for further optimization following this
commit:

    - I suspect that adding an 'n' member to struct miniflow would make
      miniflow operations faster, since popcount() has some cost.

    - It's probably possible to replace the "struct minimatch" in cls_rule
      by just a "struct miniflow", since the cls_rule's cls_table has a
      copy of the minimask.

    - Some of the miniflow operations aren't well-optimized.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 12:43:53 -07:00
Ben Pfaff
48d28ac161 classifier: Prepare for "struct cls_rule" needing to be destroyed.
Until now, "struct cls_rule" didn't own any data outside its own memory
block.  An upcoming commit will make "struct cls_rule" sometimes own blocks
of memory, so it needs "destroy" and to a lesser extent "clone" functions.
This commit adds these in advance, even though they are mostly no-ops, to
make it possible to separately review the memory management.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 12:24:27 -07:00
Ben Pfaff
81a76618be classifier: Break cls_rule 'flow' and 'wc' members into new "struct match".
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 12:24:27 -07:00
Ben Pfaff
8472a3cecc util: New function zero_rightmost_1bit().
It's probably easier to understand
	x = zero_rightmost_1bit(x);
than
	x &= x - 1;

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:17 -07:00
Ben Pfaff
e7b4ef5eac flow: Remove flow_wildcards_is_exact().
It's only used in a not-very-useful assertion in some test code.  In
general, exact-match flows make very little sense anymore, and they're
basically on their way out.

Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:16 -07:00
Ben Pfaff
26720e2449 flow: Replace flow_wildcards members by a single "struct flow".
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:16 -07:00
Ben Pfaff
51c14ddd8d flow: Ensure that padding is always zeroed.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:15 -07:00
Ben Pfaff
0bdc4bec4f flow: Use bit-mask for in_port match, instead of FWW_* flag.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:15 -07:00
Ben Pfaff
e2170cffc1 flow: Use bit-mask for Ethernet type match, instead of FWW_* flag.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:15 -07:00
Ben Pfaff
851d3105c7 flow: Use bit-mask for IP protocol match, instead of FWW_* flag.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:15 -07:00
Ben Pfaff
5d9499c4dc flow: Use bit-mask for DSCP and ECN bits, instead of FWW_* flags.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-09-04 11:19:14 -07:00
Joe Stringer
7525e578b6 tests: Improve test coverage of OXM metadata field
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-07-11 14:26:40 -07:00
Ethan Jackson
3b842fc2f0 packets: Fix eth_addr_equal_except().
It turns out that eth_addr_equal_except() computed the exact
opposite of what it purported to.  It returned true if the two
arguments where *not* equal.  This is extremely confusing, so this
patch changes it.

Signed-off-by: Ethan Jackson <ethan@nicira.com>
2012-06-06 17:37:46 -07:00
Joe Stringer
73c0ce349b flow: Adds support for arbitrary ethernet masking
Arbitrary ethernet mask support is one step on the way to support for OpenFlow
1.1+. This patch set seeks to add this capability without breaking current
protocol support.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
[blp@nicira.com made some updates, see
 http://openvswitch.org/pipermail/dev/2012-May/017585.html]
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-29 12:24:07 -07:00
Raju Subramanian
e0edde6fee Global replace of Nicira Networks.
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.

Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-02 17:08:02 -07:00
Ben Pfaff
73f3356323 Add support for bitwise matching on TCP and UDP ports.
Bug #8827.
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-02-02 16:46:22 -08:00
Justin Pettit
2486e66ab5 flow: Use FWW_ flags to wildcard IP DSCP and ECN.
It's no longer necessary to maintain a "nw_tos_mask" wildcard member,
since we only care about completely wildcarding the DSCP and ECN
portions of the IP TOS field.  This commit makes that change.  It also
goes a bit further in internally using "tos" to refer to the entire TOS
field (ie, DSCP and ECN).  We must still refer to the DSCP portions as
"nw_tos" externally through OpenFlow 1.0, since that's the convention it
uses.
2011-11-10 18:03:05 -08:00
Justin Pettit
eadef31329 Prepend "nw_" to "frag" and "tos" elements.
Most of the members in structures referring to network elements indicate
the layer (e.g., "tl_", "nw_", "tp_").  The "frag" and "tos" members
didn't, so this commit add them.
2011-11-10 18:03:04 -08:00
Justin Pettit
9e44d71563 Don't overload IP TOS with the frag matching bits.
This will be useful later when we add support for matching the ECN bits
within the TOS field.

Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
2011-11-09 10:37:57 -08:00
Ben Pfaff
7257b535ab Implement new fragment handling policy.
Until now, OVS has handled IP fragments more awkwardly than necessary.  It
has not been possible to match on L4 headers, even in fragments with offset
0 where they are actually present.  This means that there was no way to
implement ACLs that treat, say, different TCP ports differently, on
fragmented traffic; instead, all decisions for fragment forwarding had to
be made on the basis of L2 and L3 headers alone.

This commit improves the situation significantly.  It is still not possible
to match on L4 headers in fragments with nonzero offset, because that
information is simply not present in such fragments, but this commit adds
the ability to match on L4 headers for fragments with zero offset.  This
means that it becomes possible to implement ACLs that drop such "first
fragments" on the basis of L4 headers.  In practice, that effectively
blocks even fragmented traffic on an L4 basis, because the receiving IP
stack cannot reassemble a full packet when the first fragment is missing.

This commit works by adding a new "fragment type" to the kernel flow match
and making it available through OpenFlow as a new NXM field named
NXM_NX_IP_FRAG.  Because OpenFlow 1.0 explicitly says that the L4 fields
are always 0 for IP fragments, it adds a new OpenFlow fragment handling
mode that fills in the L4 fields for "first fragments".  It also enhances
ovs-ofctl to allow users to configure this new fragment handling mode and
to parse the new field.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Bug #7557.
2011-10-21 15:07:36 -07:00
Ben Pfaff
e3d98cb0ab test-classifier: Remove write-only variable. 2011-10-03 09:25:11 -07:00
Ben Pfaff
08944c1db1 ofproto: Make rule construction and destruction more symmetric.
Before, ->rule_construct() both created the rule and inserted into the
flow table, but ->rule_destruct() only destroyed the rule.  This makes
->rule_destruct() also remove the rule from the flow table.
2011-05-11 14:06:48 -07:00
Ben Pfaff
abe529af47 ofproto: Break apart into generic and hardware-specific parts.
In addition to the changes to ofproto, this commit changes all of the
instances of "struct flow" in the tree so that the "in_port" member is an
OpenFlow port number.  Previously, this member was an OpenFlow port number
in some cases and an ODP port number in other cases.
2011-05-11 12:35:09 -07:00