mirror of https://github.com/openvswitch/ovs synced 2025-08-29 13:27:59 +00:00

11 Commits

Author SHA1 Message Date
Ben Pfaff
b70e697679 cmap: New macro CMAP_INITIALIZER, for initializing an empty cmap.
Sometimes code is much simpler if we can statically initialize data
structures.  Until now, this has not been possible for cmap-based data
structures, so this commit introduces a CMAP_INITIALIZER macro.

This works by adding a singleton empty cmap_impl that simply forces the
first insertion into any cmap that points to it to allocate a real
cmap_impl.  There could be some risk that rogue code modifies the
singleton, so for safety it is also marked 'const' to allow the linker to
put it into a read-only page.
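
The pattern described above can be sketched with a toy map. This is only an illustration, not the actual cmap code: `toy_map`, `toy_impl`, `empty_toy_impl`, `TOY_MAP_INITIALIZER`, and `toy_map_insert()` are all hypothetical names, and the growth policy is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cmap_impl. */
struct toy_impl {
    size_t capacity;            /* 0 only for the shared empty singleton. */
    size_t n;                   /* Number of values in use. */
    int *values;
};

struct toy_map {
    const struct toy_impl *impl;
};

/* Shared empty instance.  Being 'const', it may be placed in read-only
 * memory, so a stray write is likely to fault rather than corrupt it. */
static const struct toy_impl empty_toy_impl = { 0, 0, NULL };

/* Static initializer: no init function has to run first. */
#define TOY_MAP_INITIALIZER { &empty_toy_impl }

/* Appends 'value' to 'map'.  Returns 0 on success, -1 if out of memory. */
static int
toy_map_insert(struct toy_map *map, int value)
{
    const struct toy_impl *old = map->impl;
    struct toy_impl *impl;

    assert(old != NULL);
    if (old->n >= old->capacity) {
        /* The singleton is always "full" (capacity 0), so the first
         * insertion always takes this path and allocates a real impl. */
        size_t capacity = old->capacity ? old->capacity * 2 : 4;
        size_t i;

        impl = malloc(sizeof *impl);
        if (!impl) {
            return -1;
        }
        impl->values = malloc(capacity * sizeof *impl->values);
        if (!impl->values) {
            free(impl);
            return -1;
        }
        impl->capacity = capacity;
        impl->n = old->n;
        for (i = 0; i < old->n; i++) {
            impl->values[i] = old->values[i];
        }
        if (old != &empty_toy_impl) {
            /* Only heap-allocated impls are ever freed. */
            free(old->values);
            free((void *) old);
        }
        map->impl = impl;
    } else {
        /* A non-full impl is never the singleton, so it is heap memory
         * and casting away 'const' is safe here. */
        impl = (struct toy_impl *) map->impl;
    }
    impl->values[impl->n++] = value;
    return 0;
}
```

With this in place, a file-scope `static struct toy_map m = TOY_MAP_INITIALIZER;` is usable immediately, and the singleton itself is never written.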

This adds a new OVS_ALIGNED_VAR macro with GCC and MSVC implementations.
The latter is based on Microsoft webpages, so developers who know Windows
might want to scrutinize it.

As examples of the kind of simplification this can make possible, this
commit removes an initialization function from ofproto-dpif-rid.c and a
call to cmap_init() from tnl-neigh-cache.c.  An upcoming commit will add
another user.

CC: Jarno Rajahalme <jarno@ovn.org>
CC: Gurucharan Shetty <guru@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
2016-05-09 16:42:56 -07:00
Daniele Di Proietto
25c7da846d cmap: Explain corner case in CMAP_FOR_EACH comment.
Commit d916785ce98c ("dpif-netdev: Fix improper use of CMAP_FOR_EACH.")
fixes a problem that's worth documenting.

Requested-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-02-03 19:43:24 -08:00
Jarno Rajahalme
52a524eb20 lib/cmap: cmap_find_batch().
Batching the cmap find improves the memory behavior with large cmaps
and can make searches twice as fast:

$ tests/ovstest test-cmap benchmark 2000000 8 0.1 16
Benchmarking with n=2000000, 8 threads, 0.10% mutations, batch size 16:
cmap insert:    533 ms
cmap iterate:    57 ms
batch search:   146 ms
cmap destroy:   233 ms

cmap insert:    552 ms
cmap iterate:    56 ms
cmap search:    299 ms
cmap destroy:   229 ms

hmap insert:    222 ms
hmap iterate:   198 ms
hmap search:   2061 ms
hmap destroy:   209 ms

A batch size of 1 has a small performance penalty, but all other batch
sizes are faster than non-batched cmap_find().  A batch size of 16 was
experimentally found to be better than 8 or 32, so
classifier_lookup_miniflow_batch() now performs the cmap find operations
in batches of 16.
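
The general shape of a batched lookup can be sketched as follows. This is a toy chained hash table, not the cmap implementation, and `toy_find_batch()`, `toy_table`, and `toy_entry` are hypothetical names. The assumption being illustrated is that locating all buckets for the batch first lets the independent memory accesses overlap instead of being serialized one lookup at a time.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BATCH_MAX 16
#define TOY_N_BUCKETS 8          /* Must be a power of 2. */

struct toy_entry {
    struct toy_entry *next;
    uint32_t hash;
};

struct toy_table {
    struct toy_entry *buckets[TOY_N_BUCKETS];
};

/* Looks up hashes[i] for every bit 'i' set in 'map' (at most TOY_BATCH_MAX
 * bits are considered).  Stores each match in found[i] and returns a bitmap
 * of the indexes that were found.  found[i] is left untouched on a miss. */
static uint32_t
toy_find_batch(const struct toy_table *table, uint32_t map,
               const uint32_t hashes[], const struct toy_entry *found[])
{
    const struct toy_entry *heads[TOY_BATCH_MAX];
    uint32_t result = 0;
    int i;

    map &= (UINT32_C(1) << TOY_BATCH_MAX) - 1;

    /* Pass 1: fetch every bucket head before comparing anything. */
    for (i = 0; i < TOY_BATCH_MAX; i++) {
        if (map & (UINT32_C(1) << i)) {
            heads[i] = table->buckets[hashes[i] & (TOY_N_BUCKETS - 1)];
        }
    }

    /* Pass 2: walk each chain looking for an exact hash match. */
    for (i = 0; i < TOY_BATCH_MAX; i++) {
        if (map & (UINT32_C(1) << i)) {
            const struct toy_entry *e;

            for (e = heads[i]; e; e = e->next) {
                if (e->hash == hashes[i]) {
                    found[i] = e;
                    result |= UINT32_C(1) << i;
                    break;
                }
            }
        }
    }
    return result;
}
```

The bitmap interface lets a caller skip entries that were already resolved and see at a glance which lookups missed.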

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-10-06 15:33:47 -07:00
Jarno Rajahalme
55847abee8 lib/cmap: split up cmap_find().
This makes the following patch easier and cleans up the code.

Explicit "inline" keywords seem necessary to prevent performance
regression on cmap_find() with GCC 4.7 -O2.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-10-06 15:33:46 -07:00
Jarno Rajahalme
ee58b46960 lib/cmap: Return number of nodes from functions modifying the cmap.
We already update the count field as the last step of these functions,
so returning the current count is very cheap.  Callers that care about
the count become a bit more efficient, as they avoid an extra
non-inlineable function call.
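
A minimal sketch of the idea, using a hypothetical fixed-size `toy_set` rather than the real cmap functions: the modifying functions hand back the count they have just updated, so a caller that needs to know whether the container became empty does not need a second call.

```c
#include <stddef.h>

#define TOY_SET_MAX 8

struct toy_set {
    size_t n;
    int values[TOY_SET_MAX];
};

/* Adds 'value' and returns the number of elements now in 'set'.  The count
 * was just updated, so returning it costs nothing extra.  The caller must
 * ensure that 'set' holds fewer than TOY_SET_MAX elements. */
static size_t
toy_set_insert(struct toy_set *set, int value)
{
    set->values[set->n] = value;
    return ++set->n;
}

/* Removes the most recently added element and returns the number of
 * elements that remain.  The caller must ensure that 'set' is not empty. */
static size_t
toy_set_pop(struct toy_set *set)
{
    return --set->n;
}

/* Hypothetical caller: returns 1 if this removal emptied 'set', so that the
 * owner can tear it down, without a separate "count" call. */
static int
toy_release(struct toy_set *set)
{
    return toy_set_pop(set) == 0;
}
```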

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-10-06 15:33:41 -07:00
Gurucharan Shetty
f17e8ad6c6 Avoid uninitialized variable warnings with OBJECT_OFFSETOF() in MSVC.
Implementation of OBJECT_OFFSETOF() for non-GNUC compilers like MSVC
causes "uninitialized variable" warnings. Since OBJECT_OFFSETOF() is
indirectly used by all the *_FOR_EACH() macros (through
ASSIGN_CONTAINER() and OBJECT_CONTAINING()), the OVS build
on Windows gets littered with "uninitialized variable" warnings.
This patch attempts to work around the problem.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-09-12 09:03:10 -07:00
Ben Pfaff
6bc3bb829c cmap: Merge CMAP_FOR_EACH_SAFE into CMAP_FOR_EACH.
There isn't any significant downside to making cmap iteration "safe" all
the time, so this drops the _SAFE variant.
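
One way an iterator can be "safe" all the time is to fetch the successor before the loop body runs, so the body may free the current node. The sketch below shows that on a plain singly linked list. It is single-threaded and does not model RCU or the real cmap cursor, and `safe_node`, `SAFE_TOY_FOR_EACH`, `safe_push()`, and `safe_destroy()` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

struct safe_node {
    struct safe_node *next;
    int value;
};

/* Iterates over the list starting at HEAD.  The successor is saved in
 * 'next__' before the body executes, so the body may free NODE. */
#define SAFE_TOY_FOR_EACH(NODE, HEAD)                   \
    for (struct safe_node *next__ = (HEAD);             \
         ((NODE) = next__) != NULL                      \
         ? (next__ = (NODE)->next, 1)                   \
         : 0;                                           \
         )

/* Pushes a new node holding 'value' in front of 'head' and returns the new
 * head.  Aborts if memory is exhausted. */
static struct safe_node *
safe_push(struct safe_node *head, int value)
{
    struct safe_node *node = malloc(sizeof *node);

    if (!node) {
        abort();
    }
    node->next = head;
    node->value = value;
    return node;
}

/* Frees every node in the list and returns how many were freed. */
static size_t
safe_destroy(struct safe_node *head)
{
    struct safe_node *node;
    size_t n = 0;

    SAFE_TOY_FOR_EACH (node, head) {
        free(node);             /* 'next__' was read before this point. */
        n++;
    }
    return n;
}
```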

Similar changes to CMAP_CURSOR_FOR_EACH and CMAP_CURSOR_FOR_EACH_CONTINUE.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-07-29 09:02:23 -07:00
Ben Pfaff
78c8df129c cmap, classifier: Avoid unsafe aliasing in iterators.
CMAP_FOR_EACH and CLS_FOR_EACH and their variants tried to use void ** as
a "pointer to any kind of pointer".  That is a violation of the aliasing
rules in ISO C which technically yields undefined behavior.  With GCC 4.1,
it causes both warnings and actual misbehavior.  One option would be to add
-fno-strict-aliasing to the compiler flags, but that would only help with
GCC; who knows whether this can be worked around with other compilers.

Instead, this commit rewrites the iterators to avoid disallowed pointer
aliasing.
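
As an illustration of the difference (not the actual rewritten macros), the sketch below iterates over containing structures without ever writing a typed pointer through a `void **`. A helper returns a plain `void *` value, and the macro assigns it to the typed variable, which is an ordinary conversion. `link`, `item`, `link_to_container()`, and `ITEM_FOR_EACH` are hypothetical names.

```c
#include <stddef.h>

struct link {
    struct link *next;
};

struct item {
    int value;
    struct link link;           /* Embedded list linkage. */
};

/* The problematic pattern looked roughly like this:
 *
 *     struct item *item;
 *     void **pp = (void **) &item;
 *     *pp = ...;
 *
 * The store through '*pp' modifies an object of type 'struct item *' via an
 * lvalue of type 'void *', which the ISO C aliasing rules do not allow. */

/* Returns the address of the structure that embeds 'link' at byte 'offset',
 * or NULL if 'link' is NULL.  Returning 'void *' by value is fine: the
 * caller converts it on assignment, and no pointer object is aliased. */
static void *
link_to_container(struct link *link, size_t offset)
{
    return link ? (void *) ((char *) link - offset) : NULL;
}

#define ITEM_FOR_EACH(ITEM, HEAD)                                       \
    for ((ITEM) = link_to_container((HEAD),                             \
                                    offsetof(struct item, link));       \
         (ITEM) != NULL;                                                \
         (ITEM) = link_to_container((ITEM)->link.next,                  \
                                    offsetof(struct item, link)))

/* Sums the 'value' members of every item reachable from 'head'. */
static int
item_sum(struct link *head)
{
    struct item *item;
    int sum = 0;

    ITEM_FOR_EACH (item, head) {
        sum += item->value;
    }
    return sum;
}
```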

VMware-BZ: #1287651
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-07-21 21:00:04 -07:00
Jarno Rajahalme
a532e683cf lib/cmap: Simplify iteration with C99 loop declaration.
This further eases porting existing hmap code to use cmap instead.

The iterator variants taking an explicit cursor are retained (renamed)
as they are needed when iteration is to be continued from the last
iterated node.
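
A rough sketch of how a C99 loop declaration can hide the cursor, with an explicit-cursor variant kept alongside it. The table layout and all names here (`cur_table`, `cur_node`, `toy_cursor`, `CUR_FOR_EACH`, `CUR_CURSOR_FOR_EACH`) are hypothetical and much simpler than the real cmap.

```c
#include <stddef.h>

#define CUR_N_BUCKETS 4

struct cur_node {
    struct cur_node *next;
    int value;
};

struct cur_table {
    struct cur_node *buckets[CUR_N_BUCKETS];
};

struct toy_cursor {
    const struct cur_table *table;
    size_t bucket;
    struct cur_node *node;      /* Current node, NULL when iteration ends. */
};

/* If the cursor has run off the end of a chain, moves it to the first node
 * of the next nonempty bucket, if any. */
static void
toy_cursor_settle(struct toy_cursor *c)
{
    while (!c->node && ++c->bucket < CUR_N_BUCKETS) {
        c->node = c->table->buckets[c->bucket];
    }
}

static struct toy_cursor
toy_cursor_start(const struct cur_table *table)
{
    struct toy_cursor c;

    c.table = table;
    c.bucket = 0;
    c.node = table->buckets[0];
    toy_cursor_settle(&c);
    return c;
}

static void
toy_cursor_advance(struct toy_cursor *c)
{
    c->node = c->node->next;
    toy_cursor_settle(c);
}

/* The cursor is declared in the 'for' statement, so callers only supply the
 * node variable, much as they would with an hmap-style iterator. */
#define CUR_FOR_EACH(NODE, TABLE)                                   \
    for (struct toy_cursor cursor__ = toy_cursor_start(TABLE);      \
         ((NODE) = cursor__.node) != NULL;                          \
         toy_cursor_advance(&cursor__))

/* Explicit-cursor variant: the caller owns CURSOR and can keep using it
 * after leaving the loop. */
#define CUR_CURSOR_FOR_EACH(NODE, CURSOR, TABLE)                    \
    for (*(CURSOR) = toy_cursor_start(TABLE);                       \
         ((NODE) = (CURSOR)->node) != NULL;                         \
         toy_cursor_advance(CURSOR))

static int
cur_sum(const struct cur_table *table)
{
    struct cur_node *node;
    int sum = 0;

    CUR_FOR_EACH (node, table) {
        sum += node->value;
    }
    return sum;
}

static size_t
cur_count(const struct cur_table *table)
{
    struct toy_cursor cursor;
    struct cur_node *node;
    size_t n = 0;

    CUR_CURSOR_FOR_EACH (node, &cursor, table) {
        n++;
    }
    return n;
}
```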

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-06-11 11:09:51 -07:00
Jarno Rajahalme
9d933fc776 lib/cmap: Add more hmap-like functionality.
Add cmap_replace() and cmap_first(), as well as CMAP_FOR_EACH_SAFE and
CMAP_FOR_EACH_CONTINUE to make porting existing hmap-using code a bit
easier.

CMAP_FOR_EACH_SAFE is useful in RCU postponed destructors, when it is
known that additional postponing is not needed.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-05-28 16:56:29 -07:00
Ben Pfaff
0e66616040 cmap: New module for cuckoo hash table.
This implements an "optimistic concurrent cuckoo hash", a single-writer,
multiple-reader hash table data structure.  The point of this data
structure is performance, so this commit message focuses on performance.
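
For readers unfamiliar with cuckoo hashing, the core placement rule can be sketched in a few lines: every key has two candidate buckets, lookup checks only those two, and insertion may displace an existing key to its alternate bucket. The sketch below is single-threaded and deliberately omits the concurrency that is the point of cmap. The hash functions, table size, and names (`toy_cuckoo`, `cuckoo_find()`, `cuckoo_insert()`) are assumptions chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CUCKOO_BUCKETS 16       /* Must be a power of 2. */
#define CUCKOO_MAX_KICKS 32

struct toy_cuckoo {
    uint32_t keys[CUCKOO_BUCKETS];
    bool used[CUCKOO_BUCKETS];
};

/* Two arbitrary hash functions, each yielding a bucket index. */
static size_t
cuckoo_hash1(uint32_t key)
{
    return (uint32_t) (key * UINT32_C(2654435761)) >> 28;
}

static size_t
cuckoo_hash2(uint32_t key)
{
    uint32_t h = (uint32_t) ((key ^ UINT32_C(0x9e3779b9))
                             * UINT32_C(0x85ebca6b));
    return (h >> 13) & (CUCKOO_BUCKETS - 1);
}

/* A key can only ever live in one of its two candidate buckets. */
static bool
cuckoo_find(const struct toy_cuckoo *t, uint32_t key)
{
    size_t b1 = cuckoo_hash1(key);
    size_t b2 = cuckoo_hash2(key);

    return (t->used[b1] && t->keys[b1] == key)
           || (t->used[b2] && t->keys[b2] == key);
}

/* Inserts 'key'.  Returns true on success (or if 'key' was already
 * present), false if no room could be made; the table is then unchanged. */
static bool
cuckoo_insert(struct toy_cuckoo *t, uint32_t key)
{
    struct toy_cuckoo backup;
    size_t b;
    int i;

    if (cuckoo_find(t, key)) {
        return true;
    }
    b = cuckoo_hash1(key);
    if (t->used[b]) {
        b = cuckoo_hash2(key);
    }
    if (!t->used[b]) {
        t->keys[b] = key;
        t->used[b] = true;
        return true;
    }

    /* Both candidates are occupied: evict an occupant to its alternate
     * bucket, repeating for a bounded number of steps. */
    backup = *t;
    b = cuckoo_hash1(key);
    for (i = 0; i < CUCKOO_MAX_KICKS; i++) {
        uint32_t victim = t->keys[b];
        size_t alt;

        t->keys[b] = key;
        key = victim;
        alt = cuckoo_hash1(key) == b ? cuckoo_hash2(key) : cuckoo_hash1(key);
        if (!t->used[alt]) {
            t->keys[alt] = key;
            t->used[alt] = true;
            return true;
        }
        b = alt;
    }
    *t = backup;                /* Give up; a real table would grow. */
    return false;
}
```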

I tested the performance of cmap with the test-cmap utility included in
this commit.  It takes three parameters for benchmarking:

    - n, the number of elements to insert.

    - n_threads, the number of threads to use for searching and
      mutating the hash table.

    - mutations, the percentage of operations that should modify the
      hash table, from 0% to 100%.

e.g. "test-cmap 1000000 16 1" inserts one million elements, uses 16
threads, and 1% of the operations modify the hash table.

Any given run does the following for both hmap and cmap
implementations:

    - Inserts n elements into a hash table.

    - Iterates over all of the elements.

    - Spawns n_threads threads, each of which searches for each of the
      elements in the hash table, once, and removes the specified
      percentage of them.

    - Removes each of the (remaining) elements and destroys the hash
      table.

and reports the time taken by each step.

The tables below report results for various parameters with a draft version
of this library.  The tests were not formally rerun for the final version,
but the intermediate changes should only have improved performance, and
this seemed to be the case in some informal testing.

n_threads=16 was used each time, on a 16-core x86-64 machine.  The compiler
used was Clang 3.5.  (GCC yields different numbers but similar relative
results.)

The results show:

    - Insertion is generally 3x to 5x faster in an hmap.

    - Iteration is generally about 3x faster in a cmap.

    - Search and mutation are 4x faster in a cmap with .1% mutations,
      and the advantage grows as the fraction of mutations grows.  This
      is because a cmap does not require locking for read operations,
      even in the presence of a writer.

      With no mutations, however, no locking is required in the hmap
      case, and the hmap is somewhat faster.  This is because raw hmap
      search is somewhat simpler and faster than raw cmap search.

    - Destruction is faster, usually by less than 2x, in an hmap.

n=10,000,000:

        .1% mutations    1% mutations    10% mutations     no mutations
          cmap  hmap     cmap   hmap      cmap   hmap       cmap  hmap
insert:   6132  2182     6136   2178      6111   2174       6124  2180
iterate:   370  1203      378   1201       370   1200        371  1202
search:   1375  8692     2393  28197     18402  80379       1281  1097
destroy:  1382  1187     1197   1034       324    241       1405  1205

n=1,000,000:

        .1% mutations    1% mutations    10% mutations     no mutations
          cmap  hmap     cmap   hmap      cmap   hmap       cmap  hmap
insert:    311    25      310     60       311     59        310    60
iterate:    25    62       25     64        25     57         25    60
search:    115   637      197   2266      1803   7284        101    67
destroy:   103    64       90     59        25     13        104    66

n=100,000:

        .1% mutations    1% mutations    10% mutations     no mutations
          cmap  hmap     cmap   hmap      cmap   hmap       cmap  hmap
insert:     25     6       26      5        25      5         25     5
iterate:     1     3        1      3         1      3          2     3
search:     12    57       27    219       164    735         10     5
destroy:     5     3        6      3         2      1          6     4

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2014-05-20 16:55:20 -07:00