mirror of https://github.com/openvswitch/ovs synced 2025-10-11 13:57:52 +00:00

lib/cmap: cmap_find_batch().

Batching the cmap find improves the memory behavior with large cmaps
and can make searches twice as fast:

$ tests/ovstest test-cmap benchmark 2000000 8 0.1 16
Benchmarking with n=2000000, 8 threads, 0.10% mutations, batch size 16:
cmap insert:    533 ms
cmap iterate:    57 ms
batch search:   146 ms
cmap destroy:   233 ms

cmap insert:    552 ms
cmap iterate:    56 ms
cmap search:    299 ms
cmap destroy:   229 ms

hmap insert:    222 ms
hmap iterate:   198 ms
hmap search:   2061 ms
hmap destroy:   209 ms

Batch size 1 has a small performance penalty, but all other batch sizes
are faster than the non-batched cmap_find().  A batch size of 16 was
experimentally found to perform better than 8 or 32, so
classifier_lookup_miniflow_batch() now performs the cmap find operations
in batches of 16.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This commit is contained in:
Jarno Rajahalme
2014-09-24 10:39:20 -07:00
parent 55847abee8
commit 52a524eb20
7 changed files with 366 additions and 56 deletions

@@ -2644,7 +2644,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
     enum { PKT_ARRAY_SIZE = NETDEV_MAX_RX_BATCH };
 #endif
     struct packet_batch batches[PKT_ARRAY_SIZE];
-    const struct miniflow *mfs[PKT_ARRAY_SIZE]; /* NULL at bad packets. */
+    const struct miniflow *mfs[PKT_ARRAY_SIZE]; /* May NOT be NULL. */
     struct cls_rule *rules[PKT_ARRAY_SIZE];
     struct dp_netdev *dp = pmd->dp;
     struct emc_cache *flow_cache = &pmd->flow_cache;
@@ -2652,7 +2652,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
     bool any_miss;
     for (i = 0; i < cnt; i++) {
-        mfs[i] = &keys[i].flow;
+        mfs[i] = &keys[i].flow; /* No bad packets! */
     }
     any_miss = !classifier_lookup_miniflow_batch(&dp->cls, mfs, rules, cnt);
     if (OVS_UNLIKELY(any_miss) && !fat_rwlock_tryrdlock(&dp->upcall_rwlock)) {