Change a test so that the result will be the same in both
little-endian and big-endian systems by editing the test case so that
only one bit differs.
Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Use MAP_FOR_EACH_INDEX to improve readability when there is no
apparent cost for it.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch adds a new functions classifier_defer() and
classifier_publish(), which control when the classifier modifications
are made available to lookups. By default, all modifications are made
available to lookups immediately. Modifications made after a
classifier_defer() call MAY be 'deferred' for later 'publication'. A
call to classifier_publish() will both publish any deferred
modifications, and cause subsequent changes to to be published
immediately.
Currently any deferring is limited to the visibility of the subtable
vector changes. pvector now processes modifications mostly in a
working copy, which needs to be explicitly published with
pvector_publish(). pvector_publish() sorts the working copy and
removes gaps before publishing it.
This change helps avoiding O(n**2) memory behavior in corner cases,
where large number of rules with different masks are inserted or
deleted.
VMware-BZ: #1322017
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
classifier_remove() was recently changed to take a const struct
cls_rule *. Make the corresponding change to classifier_replace() and
classifier_insert(). This simplifies existing calling sites in
ofproto.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Almost all classifier users already exclude concurrent modifications,
or are single-threaded, hence the classifier internal mutex can be
removed. Due to this change, ovs-router.c and tnl-ports.c need new
mutexes, which are added.
As noted by Ben in review, ovs_router_flush() should also free the
entries it removes from the classifier. It now calls
ovsrcu_postpone() to that effect.
Suggested-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Previously, accurate iteration required writers to be excluded during
iteration. This patch adds an rculist to struct cls_subtable, and a
corresponding list node to struct cls_rule, which makes iteration more
straightforward, and allows the iterators to remain ignorant of the
internals of the cls_match. This new list allows iteration of rules
in the classifier by traversing the RCU-friendly subtables vector, and
the rculist of rules in each subtable.
Classifier modifications may be performed concurrently, but whether or
not the concurrent iterator sees those changes depends on the timing
of change. More specifically, an concurrent iterator:
- May or may not see a rule that is being inserted or removed.
- Will see either the new or the old version of a rule that is replaced.
- Will see all the other rules (that are not being modified).
Finally, The subtable's rculist also allows to make
classifier_rule_overlaps() lockless, which this patch also does.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
There is no point in adding duplicate information into prefix tries.
Also, since the lower-priority duplicate rules are not visible to
lookups, they do not need to be in staged lookup indices directly
either (the head rule is).
Finally, now that cmap operations return the number of elements in the
cmap, subtable's 'n_rules' member is not needed any more.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
insert_rule() only had one caller and this makes the code easier to
understand.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Shifting a 32-bit entity by 32 bits is undefined behavior. As we have 3
cases where we may hit this, it is a time to introduce a helper for
this.
VMware-BZ: #1355026
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Add asserts to make sure the containers within are already empty.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Some struct cls_match and cls_subtable fields were already documented
of being const. Make them const and use CONST_CAST where appropriate
to initialize them.
This will help catch future errors modifying those fields after
initialization.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
struct cls_match 'list' member was recently changed to an rculist.
This allows classifier_find_rule_exactly() to be made lockless.
Since subtable's 'max_priority' member would still require a lock, we
no longer check it before calling find_equal(). This adds a hash
table lookup in cases where the subtable may already be known to not
contain any rule of the target priority.
classifier_find_rule_exactly() is never called on the fastpath, so
this should not be significant.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Returning const struct cls_rule pointers from the classifier API helps
callers to remember that they should not modify the rules returned.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The list of identical, but lower priority rules is not currently used
in classifier lookup. A later patch introducing conjunctive matches
needs to access the list during lookups, so we must make the list RCU.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
OpenFlow has priorities in the 16-bit unsigned range, from 0 to 65535.
In the classifier, it is sometimes useful to be able to have values below
and above this range. With the 'unsigned int' type used for priorities
until now, there were no values below the range, so some code worked
around it by converting priorities to 64-bit signed integers. This didn't
seem so great to me given that a plain 'int' also had the needed range.
This commit therefore changes the type used for priorities to int.
The interesting parts of this change are in pvector.h and classifier.c,
where one can see the elimination of the use of int64_t.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
tests/test-classifier.c used to include lib/classifier.c to gain
access to the internal data structures and some utility functions.
This was confusing, so this patch splits the relevant groups of
classifier internal definations to a new file
(lib/classifier-private.h), which is included by both lib/classifier.c
and tests/test-classifier.c. Other use of the new file is
discouraged.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Megaflow inserts and removals are simplified:
- No need for classifier internal mutex, as dpif-netdev already has a
'flow_mutex'.
- Number of memory allocations/frees can be halved.
- Lookup code path can rely on netdev_flow_key always having inline data.
This will also be easier to simplify further when moving to per-thread
megaflow classifiers in the future.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
This reverts commit d2064437e2bf91859a0a50fba30dcabba668a811, which
fails clang thread satefy analysis.
A more complete patch will be introduced later.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
classifier_find_rule_exactly() only needs the mutex to protect the
list traversal. Subtables are already RCU protected.
Locking here can be eliminated completely when RCU list is available.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
classifier already provides lockless lookups, and protected
modifications. When user wants to remove a flow, we currently require
the flow to exist in the classifier. To be thread safe, this requires
the caller to introduce their own mutex, lock it before a lookup, and
then issue classifier_remove() while the lock is still held.
This patch relaxes the "existence requirement" of the rule in
classifier_remove(), allowing it to be called on a rule that may have
already been removed from the classifier. This allows users to do a
classifier_lookup() and classifier_remove() without additional
syncronization.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Batching the cmap find improves the memory behavior with large cmaps
and can make searches twice as fast:
$ tests/ovstest test-cmap benchmark 2000000 8 0.1 16
Benchmarking with n=2000000, 8 threads, 0.10% mutations, batch size 16:
cmap insert: 533 ms
cmap iterate: 57 ms
batch search: 146 ms
cmap destroy: 233 ms
cmap insert: 552 ms
cmap iterate: 56 ms
cmap search: 299 ms
cmap destroy: 229 ms
hmap insert: 222 ms
hmap iterate: 198 ms
hmap search: 2061 ms
hmap destroy: 209 ms
Batch size 1 has small performance penalty, but all other batch sizes
are faster than non-batched cmap_find(). The batch size 16 was
experimentally found better than 8 or 32, so now
classifier_lookup_miniflow_batch() performs the cmap find operations
in batches of 16.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This makes the following patch easier and cleans up the code.
Explicit "inline" keywords seem necessary to prevent performance
regression on cmap_find() with GCC 4.7 -O2.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Implementation of OBJECT_OFFSETOF() for non-GNUC compilers like MSVC
causes "uninitialized variable" warnings. Since OBJECT_OFFSETOF() is
indirectly used through all the *_FOR_EACH() (through ASSIGN_CONTAINER()
and OBJECT_CONTAINING()) macros, the OVS build
on Windows gets littered with "uninitialized variable" warnings.
This patch attempts to workaround the problem.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This patch causes classifier_lookup_miniflow_batch() to return a
boolean indicating whether any rules could not be successfully looked
up. Used in future patches.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
There isn't any significant downside to making cmap iteration "safe" all
the time, so this drops the _SAFE variant.
Similar changes to CMAP_CURSOR_FOR_EACH and CMAP_CURSOR_FOR_EACH_CONTINUE.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
An upcoming commit will increase the number of fields beyond 64.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
CMAP_FOR_EACH and CLS_FOR_EACH and their variants tried to use void ** as
a "pointer to any kind of pointer". That is a violation of the aliasing
rules in ISO C which technically yields undefined behavior. With GCC 4.1,
it causes both warnings and actual misbehavior. One option would to add
-fno-strict-aliasing to the compiler flags, but that would only help with
GCC; who knows whether this can be worked around with other compilers.
Instead, this commit rewrites the iterators to avoid disallowed pointer
aliasing.
VMware-BZ: #1287651
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Previously we only returned the last matching prefix length
encountered during a trie lookup, and skipped subtables that had
prefixes longer than that. This patch changes the trie lookup
functions to return all matching prefix lengths seen, so that all
non-matching prefix lengths can be skipped.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
These stylistic changes makes the following patch a bit simpler.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Now that it is clear that struct cls_classifier itself does not
need RCU indirection and pvector is defined in its own header, it
is possible get rid of the indirection from struct classifier to
struct cls_classifier.
Suggested-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Now that all the relevant classifier structures use RCU and internal
mutual exclusion for modifications, we can remove the fat-rwlock and
thus make the classifier lookups lockless.
As the readers are operating concurrently with the writers, a
concurrent reader may or may not see a new rule being added by a
writer, depending on how the concurrent events overlap with each
other. Overall, this is no different from the former locked behavior,
but there the visibility of the new rule only depended on the timing
of the locking functions.
A new rule is first added to the segment indices, so the readers may
find the rule in the indices before the rule is visible in the
subtables 'rules' map. This may result in us losing the opportunity
to quit lookups earlier, resulting in sub-optimal wildcarding. This
will be fixed by forthcoming revalidation always scheduled after flow
table changes.
Similar behavior may happen due to us removing the overlapping rule
(if any) from the indices only after the corresponding new rule has
been added.
The subtable's max priority is updated only after a rule is inserted
to the maps, so the concurrent readers may not see the rule, as the
updated priority ordered subtable list will only be visible after the
subtable's max priority is updated.
Similarly, the classifier's partitions are updated by the caller after
the rule is inserted to the maps, so the readers may keep skipping the
subtable until they see the updated partitions.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
cls_set_prefix_fields() now synchronizes explicitly with the readers,
waiting them to finish using the old configuration before changing to
the new configuration.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Add an internal mutex to struct cls_classifier, and reorganize
classifier internal structures according to the user of each field,
marking the fields that need to be protected by the mutex. This makes
locking requirements easier to track, and may make lookup more memory
efficient.
After this patch there is some double locking, as callers are taking
the fat-rwlock, and we take the mutex internally. A following patch
will remove the classifier fat-rwlock, removing the (double) locking
overhead.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Rename 'nbits' as 'n_bits' to be more consistent with other count-like
fields.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Hide the cursor from the classifier iteration users and move locking to
the iterators. This will make following RCU changes simpler, as the call
sites of the iterators need not be changed at that point.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Use generic names hash_add() and hash_finish() instead of mhash_*
equivalents. This makes future changes to hash implentations more
localized.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Reduce the number of goto statements by returning via a new helper
fill_range_wc() when no match is found.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
miniflow_and_mask_matches_flow_wc() fills in the masks in flow
wildcards, so a separate step to that effect is no longer needed.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
When, during a classifier lookup, we narrow down to a single potential
rule, it is enough to match on ("unwildcard") one bit that differs
between the packet and the rule.
This is a special case of the more general algorithm, where it is
sufficient to match on enough bits that separates the packet from all
higher priority rules than the matched rule. For a miss that would be
all the rules. Implementing this is expensive for a more than a few
rules. This patch starts by doing this for a single rule when we
already have it, also reducing the lookup cost by finishing the lookup
earlier than before.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Factor out the priority vector code from the classifier.
Making the classifier use RCU instead of locking requires parallel
access to the priority vector, pointing to subtables in descending
priority order. When a new subtable is added, a new copy of the
priority vector is allocated, while the current readers can keep on
using the old copy they started with. Adding and removing subtables
is usually less frequent than adding and removing rules, so this
should not have a visible performance implication. As an optimization
for the userspace datapath use, where all the subtables have the same
priority, new subtables can be added to the end of the vector without
reallocation and without disturbing readers.
cls_subtables_reset() is now removed, as it served its purpose in bug
hunting. Checks on the new pvector are now incorporated into
tests/test-classifier.c.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
When reaching the end of a prefix trie, we checked one bit off the end
to the intended data. However, since the trie node in that case has
NULLs for both edge links, this did not result in incorrect
functionality.
Found via check-valgrind.
Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
trie_lookup_value() is easier to read with the local variable 'plen'
renamed as 'ofs'.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This bug did not manifest due to 'hmap_node' being in the same offset
in both struct cls_partition and struct cls_subtable.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
'cache' gives an inexact connotation, as the list is always expected
to be in order and contain pointers to all the subtables.
The struct cls_subtables fields are are also renamed to be more readable.
struct cls_classifier fields 'subtables' is remamed to 'subtables_map' and
'subtables_priority' is renamed to 'subtables',
There are no functional changes in this patch.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>