
lib/cmap: cmap_find_batch().

Batching the cmap find improves the memory behavior with large cmaps
and can make searches twice as fast:

$ tests/ovstest test-cmap benchmark 2000000 8 0.1 16
Benchmarking with n=2000000, 8 threads, 0.10% mutations, batch size 16:
cmap insert:    533 ms
cmap iterate:    57 ms
batch search:   146 ms
cmap destroy:   233 ms

cmap insert:    552 ms
cmap iterate:    56 ms
cmap search:    299 ms
cmap destroy:   229 ms

hmap insert:    222 ms
hmap iterate:   198 ms
hmap search:   2061 ms
hmap destroy:   209 ms

Batch size 1 has a small performance penalty, but all other batch sizes
are faster than the non-batched cmap_find().  A batch size of 16 was
experimentally found to be better than 8 or 32, so
classifier_lookup_miniflow_batch() now performs the cmap find operations
in batches of 16.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme
2014-09-24 10:39:20 -07:00
parent 55847abee8
commit 52a524eb20
7 changed files with 366 additions and 56 deletions


@@ -114,4 +114,13 @@ bool bitmap_is_all_zeros(const unsigned long *, size_t n);
    for ((IDX) = bitmap_scan(BITMAP, 1, 0, SIZE); (IDX) < (SIZE);     \
         (IDX) = bitmap_scan(BITMAP, 1, (IDX) + 1, SIZE))

/* More efficient access to a map of single ulong. */
#define ULONG_FOR_EACH_1(IDX, MAP)                        \
    for (unsigned long map__ = (MAP);                     \
         map__ && (((IDX) = raw_ctz(map__)), true);       \
         map__ = zero_rightmost_1bit(map__))

#define ULONG_SET0(MAP, OFFSET) ((MAP) &= ~(1UL << (OFFSET)))
#define ULONG_SET1(MAP, OFFSET) ((MAP) |= 1UL << (OFFSET))
#endif /* bitmap.h */