mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

232 Commits

Author SHA1 Message Date
Mike Pattrick
3d6b048d84 classifier: Store n_indices between usage.
Currently the Clang analyzer will complain about usage of an
uninitialized variable in the classifier. This is a false positive, but
not for a reason that could easily be detectable by clang.

The classifier is not safe for multiple writer threads to use
simultaneously so all callers protect these functions from simultaneous
writes. However, this is not so clear from the code's static analysis
alone. To help Clang out here, the n_indices count is saved onto the
stack instead of being accessed from the subtable struct repeatedly.
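
The false positive can be pictured with a minimal sketch. The struct and
function names below are hypothetical, not the classifier's own; the only
point is that reading the count once into a local makes it evident that the
loop filling the array and the loop consuming it share the same bound.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_INDICES 3            /* Hypothetical limit for this sketch. */

struct toy_subtable {
    uint8_t n_indices;               /* May look mutable to an analyzer. */
    uint32_t seeds[TOY_MAX_INDICES];
};

/* Reads 'n_indices' once, so both loops are visibly bounded by the same
 * value and every element of 'hashes' that is read has been written. */
static uint32_t
toy_hash_all(const struct toy_subtable *subtable, uint32_t basis)
{
    const size_t n = subtable->n_indices < TOY_MAX_INDICES
                     ? subtable->n_indices : TOY_MAX_INDICES;
    uint32_t hashes[TOY_MAX_INDICES];
    uint32_t result = 0;

    for (size_t i = 0; i < n; i++) {
        hashes[i] = basis ^ subtable->seeds[i];
    }
    for (size_t i = 0; i < n; i++) {
        result += hashes[i];
    }
    return result;
}
```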

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2024-09-11 15:38:59 +02:00
Nobuhiro MIKI
7b514aba0e ofproto-dpif-trace: Improve conjunctive match tracing.
A conjunctive flow consists of two or more flows with
conjunction actions. When input to the ofproto/trace command
matches a conjunctive flow, it outputs flows of all dimensions.

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-11-16 18:41:07 +01:00
Ilya Maximets
489553b1c2 classifier: Fix missing masks on a final stage with ports trie.
Flow lookup doesn't include masks of the final stage in the resulting
flow wildcards in case that stage had an L4 ports match.  Only the result
of ports trie lookup is added to the mask.  It might be sufficient in
many cases, but it's not correct, because ports trie is not how we
decided that the packet didn't match in this subtable.  In fact, we
used a full subtable mask in order to determine that, so all the
subtable mask bits have to be added.

Ports trie can still be used to adjust ports' mask, but it is not
sufficient to determine that the packet didn't match.

Assume we have the following 2 OpenFlow rules on the bridge:

 table=0, priority=10,tcp,tp_dst=80,tcp_flags=+psh actions=drop
 table=0, priority=0 actions=output(1)

The first high priority rule is supposed to drop all the TCP data traffic
sent on port 80.  The handshake, however, is allowed for forwarding.

Both 'tcp_flags' and 'tp_dst' are on the final stage in the flow.
Since the stage mask from that stage is not incorporated into the flow
wildcards and only ports mask is getting updated, we have the following
megaflow for the SYN packet that has no match on 'tcp_flags':

 $ ovs-appctl ofproto/trace br0 "in_port=br0,tcp,tp_dst=80,tcp_flags=syn"

 Megaflow: recirc_id=0,eth,tcp,in_port=LOCAL,nw_frag=no,tp_dst=80
 Datapath actions: 1

If this flow is getting installed into the datapath flow table, all the
packets for port 80, regardless of TCP flags, will be forwarded.

Incorporate all the looked-at bits from the final stage into the
stages map in order to get all the necessary wildcards.  Ports mask
has to be updated as a last step, because it doesn't cover the full
64-bit slot in the flowmap.

With this change, in the example above, OVS is producing correct
flow wildcards including match on TCP flags:

 Megaflow: recirc_id=0,eth,tcp,in_port=LOCAL,nw_frag=no,tp_dst=80,tcp_flags=-psh
 Datapath actions: 1

This way only -psh packets will be forwarded, as expected.

This issue affects all other fields on stage 4, not only TCP flags.
Tests included to cover tcp_flags, nd_target and ct_tp_src/dst.
The first two are frequently used; the ct ones share the same flowmap
slot with L4 ports, so they are important to test.

Before the pre-computation of stage masks, flow wildcards were updated
during lookup, so there was no issue.  The bits of the final stage were
lost with the introduction of 'stages_map'.

Recent adjustment of segment boundaries exposed 'tcp_flags' to the issue.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/272
Fixes: ca44218515f0 ("classifier: Adjust segment boundary to execute prerequisite processing.")
Fixes: fa2fdbf8d0c1 ("classifier: Pre-compute stage masks.")
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-02-28 18:57:20 +01:00
Adrian Moreno
9e56549c2b hmap: use short version of safe loops if possible.
Using SHORT version of the *_SAFE loops makes the code cleaner and less
error prone. So, use the SHORT version and remove the extra variable
when possible for hmap and all its derived types.

In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.
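
A minimal sketch of selecting a macro variant by argument count follows.
The macro and type names are hypothetical and operate on a toy singly linked
list rather than on hmap; they only illustrate the overloading trick, not
the macros this commit actually changes.

```c
#include <stddef.h>

/* Hypothetical list node, standing in for a hash map node. */
struct toy_node {
    struct toy_node *next;
    int value;
};

/* Long form: the caller supplies the extra NEXT variable. */
#define TOY_FOR_EACH_SAFE_LONG(NODE, NEXT, HEAD)                       \
    for ((NODE) = (HEAD);                                              \
         (NODE) != NULL && ((NEXT) = (NODE)->next, 1);                 \
         (NODE) = (NEXT))

/* Short form: the macro declares its own hidden successor variable. */
#define TOY_FOR_EACH_SAFE_SHORT(NODE, HEAD)                            \
    for (struct toy_node *next__ = ((NODE) = (HEAD),                   \
                                    (struct toy_node *) NULL);         \
         (NODE) != NULL && (next__ = (NODE)->next, 1);                 \
         (NODE) = next__)

/* Picks the fourth argument: LONG for 3 arguments, SHORT for 2. */
#define TOY_SELECT_4TH(A, B, C, NAME, ...) NAME
#define TOY_FOR_EACH_SAFE(...)                                         \
    TOY_SELECT_4TH(__VA_ARGS__, TOY_FOR_EACH_SAFE_LONG,                \
                   TOY_FOR_EACH_SAFE_SHORT, unused)(__VA_ARGS__)

/* Detaches every node while iterating; safe because the successor is
 * saved before the loop body runs. */
static int
toy_detach_all_short(struct toy_node *head)
{
    struct toy_node *node;
    int count = 0;

    TOY_FOR_EACH_SAFE (node, head) {
        node->next = NULL;
        count++;
    }
    return count;
}

/* Same loop written with the long form and an explicit 'next'. */
static int
toy_detach_all_long(struct toy_node *head)
{
    struct toy_node *node, *next;
    int count = 0;

    TOY_FOR_EACH_SAFE (node, next, head) {
        node->next = NULL;
        count++;
    }
    return count;
}
```

Both callers use the same macro name; only the number of arguments decides
which expansion is chosen.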

Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-03-30 16:59:02 +02:00
Ben Pfaff
721488d4a5 classifier: Make find_match_wc() prototype and definition match.
The prototype said *, the definition said [CLS_MAX_TRIES].  GCC 11
complains about this (though it is perfectly valid from a C standards
perspective).  It would probably be better to make them both use
[CLS_MAX_TRIES] but that's only allowed if the struct's definition is
visible at the point of the prototype, which it's not.  Instead of
moving the definition, this commit just changes both usages to *.
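
The constraint described above can be sketched as follows. The names are
hypothetical stand-ins (TOY_MAX_TRIES for CLS_MAX_TRIES, toy_count_active()
for the real function), and the warning behaviour is described as understood
from GCC 11's -Warray-parameter, not verified against the classifier itself.

```c
#include <stddef.h>

#define TOY_MAX_TRIES 3      /* Hypothetical stand-in for CLS_MAX_TRIES. */

struct toy_trie_ctx;         /* Incomplete type at the prototype. */

/* 'struct toy_trie_ctx ctx[TOY_MAX_TRIES]' is not allowed here because
 * the element type is still incomplete, so the prototype uses '*'. */
int toy_count_active(const struct toy_trie_ctx *ctx, size_t n);

struct toy_trie_ctx {
    int active;
};

/* The definition spells the parameter the same way as the prototype.
 * Writing 'ctx[TOY_MAX_TRIES]' here would be valid C, but the mismatch
 * with the prototype is what GCC 11 is reported to complain about. */
int
toy_count_active(const struct toy_trie_ctx *ctx, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        count += ctx[i].active != 0;
    }
    return count;
}
```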

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
2021-05-07 12:14:31 -07:00
William Tu
0287f840e8 classifier: Fix use of uninitialized value.
Coverity reports use of an uninitialized value in cursor.
This happens in cls_cursor_start(): when rule is false,
cursor.subtable is uninitialized. CID 279324.

Signed-off-by: William Tu <u9012063@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-09-15 22:31:32 +02:00
Eiichi Tsukata
a611705990 classifier: Prevent tries vs n_tries race leading to NULL dereference.
Currently classifier tries and n_tries are not updated atomically, so
there is a race condition which can lead to a NULL dereference.
The race can happen when the main thread updates a classifier's tries and
n_tries in classifier_set_prefix_fields() and at the same time a revalidator
or handler thread tries to look them up in classifier_lookup__(). Such a race
can be triggered when a user changes the prefixes of a flow_table.

Race(user changes flow_table prefixes: ip_dst,ip_src => none):

  [main thread]             [revalidator/handler thread]
  ===========================================================
                            /* cls->n_tries == 2 */
                            for (int i = 0; i < cls->n_tries; i++) {
  trie_init(cls, i, NULL);
  /* n_tries == 0 */
  cls->n_tries = n_tries;
                            /* cls->tries[i]->field is NULL */
                            trie_ctx_init(&trie_ctx[i],&cls->tries[i]);
                            /* trie->field is NULL */
                            ctx->be32ofs = trie->field->flow_be32ofs;

To prevent the race, instead of re-introducing the internal mutex
removed by commit fccd7c092e09 ("classifier: Remove internal
mutex."), this patch makes the trie field RCU protected and checks it after
read.
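
The "check after read" pattern can be sketched with C11 atomics standing in
for the OVS RCU primitives. All names below are hypothetical, and the sketch
deliberately omits deferred freeing, which real RCU protection also has to
provide.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_field {
    int flow_be32ofs;
};

struct toy_trie {
    _Atomic(const struct toy_field *) field;   /* NULL when unused. */
};

/* Writer side: publish a field, or clear it with NULL. */
static void
toy_trie_set_field(struct toy_trie *trie, const struct toy_field *field)
{
    atomic_store_explicit(&trie->field, field, memory_order_release);
}

/* Reader side: load the pointer exactly once, then check it before any
 * dereference, so a concurrent clear cannot cause a NULL dereference. */
static bool
toy_trie_ctx_init(struct toy_trie *trie, int *be32ofs)
{
    const struct toy_field *field
        = atomic_load_explicit(&trie->field, memory_order_acquire);

    if (!field) {
        return false;               /* Trie was disabled; skip it. */
    }
    *be32ofs = field->flow_be32ofs;
    return true;
}
```

The reader never re-reads trie->field after the check, which is what closes
the window shown in the diagram above.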

Fixes: fccd7c092e09 ("classifier: Remove internal mutex.")
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-05-28 11:46:39 +02:00
Zhantao Fu
0fcf0776c7 Double postponing to free subtables.
Subtable destruction should be double postponed because readers could
still obtain old values while iterating over the pvector implementation
before its new version is published.

Signed-off-by: Zhantao Fu <fuzhantao@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-23 09:08:51 -07:00
Ben Pfaff
1dc1ec247e flow, match, classifier: Add new functions for miniflow and minimatch.
The miniflow and minimatch APIs lack several of the features of the flow
and match APIs.  This commit adds a few of the missing functions.

These functions will be used for the first time in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Armando Migliaccio <armamig@gmail.com>
2018-03-31 11:32:10 -07:00
Ben Pfaff
f825fdd4ff flow: Improve type-safety of MINIFLOW_GET_TYPE.
Until now, this macro has blindly read the passed-in type's size, but
that's unnecessarily risky.  This commit changes it to verify that the
passed-in type is the same size as the field and, on GCC and Clang, that
the types are compatible.  It also adds a version that does not check,
for the one case where (currently) we deliberately read the wrong size,
and updates a few uses to use more precise field names.
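
A minimal sketch of the size check, using a plain struct instead of a
miniflow. TOY_GET_TYPE, TOY_BUILD_ASSERT and struct toy_flow are hypothetical
names; the sketch covers only the "same size" check, not the GCC/Clang type
compatibility check mentioned above.

```c
#include <stdint.h>

struct toy_flow {
    uint32_t nw_src;
    uint16_t tp_src;
    uint16_t tp_dst;
};

/* Fails to compile (negative array size) unless COND is true. */
#define TOY_BUILD_ASSERT(COND) ((void) sizeof(char[(COND) ? 1 : -1]))

/* Reads FLOW->FIELD as TYPE, but only compiles if TYPE has exactly the
 * same size as the field. */
#define TOY_GET_TYPE(FLOW, TYPE, FIELD)                                 \
    (TOY_BUILD_ASSERT(sizeof(TYPE) == sizeof((FLOW)->FIELD)),          \
     (TYPE) (FLOW)->FIELD)

/* Example accessor built on the checked macro. */
static uint16_t
toy_flow_get_tp_dst(const struct toy_flow *flow)
{
    return TOY_GET_TYPE(flow, uint16_t, tp_dst);
}
```

With this sketch, TOY_GET_TYPE(flow, uint32_t, tp_dst) would be rejected at
compile time instead of silently reading the wrong width.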

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Armando Migliaccio <armamig@gmail.com>
2018-03-31 11:31:51 -07:00
Ben Pfaff
0d71302e36 ofp-util, ofp-parse: Break up into many separate modules.
ofp-util had been far too large and monolithic for a long time.  This
commit breaks it up into units that make some logical sense.  It also
moves the pieces of ofp-parse that were specific to each unit into the
relevant unit.

Most of this commit is just moving code around.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
2018-02-13 10:43:13 -08:00
Ben Pfaff
46ab60bfe5 classifier: Refactor interface for classifier_remove().
Until now, classifier_remove() returned either null or the classifier rule
passed to it, which is an unusual interface.  This commit changes it to
return true if it succeeds or false on failure.

In addition, most of classifier_remove()'s callers know ahead of time that
it must succeed, even though most of them didn't bother with an assertion,
so this commit adds a classifier_remove_assert() function as a helper.
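
The interface change can be pictured with a toy table. toy_remove() and
toy_remove_assert() are hypothetical names that mirror the shape described
above (a bool result plus an asserting helper), not the classifier's code.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_rule {
    bool in_table;
};

/* Returns true if 'rule' was in the table and has been removed,
 * false if it was not there in the first place. */
static bool
toy_remove(struct toy_rule *rule)
{
    if (!rule->in_table) {
        return false;
    }
    rule->in_table = false;
    return true;
}

/* For callers that know the removal must succeed. */
static void
toy_remove_assert(struct toy_rule *rule)
{
    bool removed = toy_remove(rule);

    assert(removed);
    (void) removed;                 /* Avoids a warning under NDEBUG. */
}
```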

Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Yifeng Sun <pkusunyifeng@gmail.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
2018-01-31 11:19:21 -08:00
Ben Pfaff
b2befd5bb2 sparse: Add guards to prevent FreeBSD-incompatible #include order.
FreeBSD insists that <sys/types.h> be included before <netinet/in.h> and
that <netinet/in.h> be included before <arpa/inet.h>.  This adds guards to
the "sparse" headers to yield a warning if this order is violated.  This
commit also adjusts the order of many #includes to suit this requirement.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-22 12:58:02 -08:00
Ben Pfaff
71f21279f6 Eliminate most shadowing for local variable names.
Shadowing is when a variable with a given name in an inner scope hides a
different variable with the same name in a surrounding scope.  This is
generally undesirable because it can confuse programmers.  This commit
eliminates most of it.
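
A small illustration of why shadowing confuses readers follows. Both
functions are hypothetical and written only for this note.

```c
#include <stddef.h>

/* Shadowed version: the inner 'total' hides the outer one, so the outer
 * 'total' is never updated and the function always returns 0. */
static int
toy_sum_shadowed(const int *values, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        int total = values[i];      /* Shadows the outer 'total'. */
        (void) total;
    }
    return total;
}

/* Fixed version: only one 'total' exists, so the sum accumulates. */
static int
toy_sum(const int *values, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}
```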

Found with -Wshadow=local in GCC 7.  The repo is not really ready to enable
this option by default because of a few cases that are harder to fix, and
harmless, such as nested use of CMAP_FOR_EACH.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
2017-08-02 15:03:35 -07:00
Ben Pfaff
50f96b10e1 Support accepting and displaying port names in OVS tools.
Until now, most ovs-ofctl commands have not accepted names for ports, only
numbers, and have not been able to display port names either.  It's a lot
easier for users if they can use and see meaningful names instead of
arbitrary numbers.  This commit adds that support.

For backward compatibility, only interactive ovs-ofctl commands by default
display port names; to display them in scripts, use the new --names
option.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Aaron Conole <aconole@redhat.com>
2017-05-31 16:06:12 -07:00
Jesse Gross
8d8ab6c2d5 tun-metadata: Manage tunnel TLV mapping table on a per-bridge basis.
When using tunnel TLVs (at the moment, this means Geneve options), a
controller must first map the class and type onto an appropriate OXM
field so that it can be used in OVS flow operations. This table is
managed using OpenFlow extensions.

The original code that added support for TLVs made the mapping table
global as a simplification. However, this is not really logically
correct as the OpenFlow management commands are operating on a per-bridge
basis. This removes the original limitation to make the table per-bridge.

One nice result of this change is that it is generally clearer whether
the tunnel metadata is in datapath or OpenFlow format. Rather than
allowing ad-hoc format changes and trying to handle both formats in the
tunnel metadata functions, the format is more clearly separated by function.
Datapaths (both kernel and userspace) use datapath format and it is not
changed during the upcall process. At the beginning of action translation,
tunnel metadata is converted to OpenFlow format and flows and wildcards
are translated back at the end of the process.

As an additional benefit, this change improves performance in some flow
setup situations by keeping the tunnel metadata in the original packet
format in more cases. This helps when copies need to be made as the amount
of data touched is only what is present in the packet rather than the
maximum amount of metadata supported.

Co-authored-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Jesse Gross <jesse@kernel.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-09-19 09:52:22 -07:00
Jarno Rajahalme
da9cfca6e2 Revert "pvector: Expose non-concurrent priority vector."
This reverts commit 8bdfe1313894047d44349fa4cf4402970865950f.

I failed to see that lib/dpif-netdev.c actually needs the concurrency
provided by pvector prior to this change.  More specifically, when a
subtable is removed, concurrent lookups may skip over another subtable
swapped in to the place of the removed subtable in the vector.

Since this was the only use of the non-concurrent pvector, it is
cleaner to revert the whole patch.

Reported-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-08-10 14:58:51 -07:00
Jarno Rajahalme
aff49b8c66 meta-flow: Clean up masking with prerequisites checking.
Change mf_are_prereqs_ok() to take a flow_wildcards pointer, so that the
wildcards can be set at the same time as the prerequisites are
checked.  This makes it easier to write more obviously correct code.

Remove the functions mf_mask_field_and_prereqs() and
mf_mask_field_and_prereqs__(), and make the callers first check the
prerequisites, while supplying 'wc' to mf_are_prereqs_ok(), and if
successful, mask the bits of the field that were read or set using
mf_mask_field_masked().

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-07-29 16:52:03 -07:00
Jarno Rajahalme
44e0c35d98 lib: Separate versioning to its own module.
Separate rule versioning to lib/versions.h to make it easier to use
versioning for other data types.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-07-29 16:52:01 -07:00
Jarno Rajahalme
8bdfe13138 pvector: Expose non-concurrent priority vector.
PMD threads use pvectors but do not need the overhead of the
concurrent version.  Expose the non-concurrent version for
that use.

Note that struct pvector is renamed as struct cpvector (for concurrent
priority vector), and the former struct pvector_impl is now struct
pvector.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-07-29 11:12:08 -07:00
Jarno Rajahalme
93f25605e4 pvector: Get rid of special purpose of INT_MIN.
Allow clients to use the whole priority range.  Note that this changes
the semantics of PVECTOR_FOR_EACH_PRIORITY so that the iteration still
continues for entries at the given priority.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-07-29 11:11:14 -07:00
Jarno Rajahalme
59936df6f4 classifier: Use ccmaps for staged lookup indices.
Use the new ccmap type instead of cmap for staged lookup indices to
fix the problem with slow removal of rules with large number of
duplicates.  This was problematic especially when many rules shared
the same match in packet metadata (e.g., a port number, but nothing
else), causing a large number of duplicates to be inserted into the
staged lookup index.  ccmap only keeps the count of inserted (hash)
values, so duplicates do not add any performance penalty.
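
The idea of keeping only a count per hash value can be sketched as below.
This is a single-threaded toy with hypothetical names, a fixed size and
linear probing; it is not the ccmap implementation and has none of its
concurrency properties.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_N_SLOTS 64       /* Power of two; this sketch never resizes. */

struct toy_slot {
    uint32_t hash;
    uint32_t count;
    int used;                /* Stays set once claimed (never reclaimed). */
};

struct toy_count_map {
    struct toy_slot slots[TOY_N_SLOTS];
};

/* Returns the slot holding 'hash', else the first unused slot in its
 * probe sequence, else NULL if the table is full. */
static struct toy_slot *
toy_probe(struct toy_count_map *map, uint32_t hash)
{
    for (size_t i = 0; i < TOY_N_SLOTS; i++) {
        struct toy_slot *slot = &map->slots[(hash + i) & (TOY_N_SLOTS - 1)];

        if (!slot->used || slot->hash == hash) {
            return slot;
        }
    }
    return NULL;
}

/* Adds one occurrence of 'hash'; returns the new count (0 if full). */
static uint32_t
toy_count_map_inc(struct toy_count_map *map, uint32_t hash)
{
    struct toy_slot *slot = toy_probe(map, hash);

    if (!slot) {
        return 0;
    }
    slot->used = 1;
    slot->hash = hash;
    return ++slot->count;
}

/* Removes one occurrence of 'hash'; returns the remaining count. */
static uint32_t
toy_count_map_dec(struct toy_count_map *map, uint32_t hash)
{
    struct toy_slot *slot = toy_probe(map, hash);

    if (!slot || !slot->used || !slot->count) {
        return 0;
    }
    return --slot->count;
}

/* Returns how many occurrences of 'hash' are currently recorded. */
static uint32_t
toy_count_map_find(struct toy_count_map *map, uint32_t hash)
{
    struct toy_slot *slot = toy_probe(map, hash);

    return slot && slot->used ? slot->count : 0;
}
```

Inserting or removing the same hash a thousand times touches one slot each
time, which is why duplicates stop being expensive.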

Reported-by: Alok Kumar Maurya <alok-kumar.maurya@hpe.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-05-16 17:47:54 -07:00
Jarno Rajahalme
31ecf25fad classifier: Remove rare optimization case.
This optimization applied when a staged lookup index would narrow down
to a single rule, which happens sometimes in simple test cases, but
presumably less often in more populated flow tables.  The result of
this optimization allowed a bit more general megaflows, but the bit
patterns produced were sometimes cryptic.  Finally, a later fix to a
more important performance problem does not allow for this
optimization any more, so remove it now.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-05-04 13:00:06 -07:00
Jarno Rajahalme
fd7a68dd76 classifier: Remove logging.
The only vlog line was a left over from debugging.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-05-04 13:00:05 -07:00
Jarno Rajahalme
37b4ea1250 classifier: Remove redundant index.
The test for figuring out if the last index had the same fields as the
actual rules map was broken, resulting in keeping an unnecessary
index around.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-05-04 13:00:05 -07:00
Jarno Rajahalme
5e27fe9781 classifier: Fix race condition leading to NULL dereference.
Addition of table versioning exposed struct cls_rule member
'cls_match' to RCU readers and made it possible for 'cls_match' to become
NULL while being accessed by an RCU reader, but we failed to check for
this condition.  This may have resulted in NULL pointer dereference
and ovs-vswitchd crash.

Fix this by making the 'cls_match' member an RCU pointer and checking
the value whenever it is potentially read by an RCU reader.  In these
instances we use ovsrcu_get(), whereas functions accessible only by
the exclusive writers use ovsrcu_get_protected() and do not need to
check the result.

VMware-BZ: 1643642
Fixes: 2b7b1427 ("classifier: Support table versioning")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
2016-04-17 08:51:21 -07:00
Ben Warren
f424833659 Move lib/ofp-util.h to include/openvswitch directory
This commit also adds several #include directives in source files in
order to make the 'ofp-util.h' move possible.

Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-04-14 13:48:25 -07:00
Ben Warren
3e8a2ad145 Move lib/dynamic-string.h to include/openvswitch directory
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-03-19 10:02:12 -07:00
Jarno Rajahalme
a14502a7c5 classifier: Retire partitions.
Classifier partitions allowed skipping subtables when it was known
from the flow's metadata field that the subtable cannot possibly
match.  This functionality was later implemented in a more general
fashion by staged lookup, where the first stage also covers the
metadata field, among the rest of the non-packet fields in the struct
flow.  While in theory skipping a subtable on the basis of the
metadata field alone could produce more effective wildcards, on the
basis of our testsuite coverage it does not seem to be the case, as
removing the partitioning feature did not result in any test failures.

Removing the partitioning feature makes classifier lookups roughly 20%
faster when a wildcard mask is not needed, and roughly 10% faster when
a wildcard mask is needed, as tested with the test-classifier
benchmark with one lookup thread.

Found by profiling with 'perf'.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-08-26 15:37:42 -07:00
Jarno Rajahalme
5fcff47b0b flow: Add struct flowmap.
Struct miniflow is now sometimes used just as a map.  Define a new
struct flowmap for that purpose.  The flowmap is defined as an array of
maps, and it is automatically sized according to the size of struct
flow, so it will be easier to maintain in the future.

It would have been tempting to use the existing struct bitmap for this
purpose. The main reason this is not feasible at the moment is that
some flowmap algorithms are simpler when it can be assumed that no
struct flow member requires more bits than can fit into a single map
unit. The tunnel member already requires more than 32 bits, so the map
unit needs to be 64 bits wide.
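
A minimal sketch of a map built from an array of 64-bit units, sized from
the flow size, follows. The names and the flow size of 84 words are
hypothetical and chosen only so that a second unit is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FLOW_U64S 84     /* Hypothetical flow size in 64-bit words. */
#define TOY_MAP_BITS 64
#define TOY_FLOWMAP_UNITS \
    ((TOY_FLOW_U64S + TOY_MAP_BITS - 1) / TOY_MAP_BITS)

struct toy_flowmap {
    uint64_t bits[TOY_FLOWMAP_UNITS];
};

/* Marks flow word 'idx' as present. */
static void
toy_flowmap_set(struct toy_flowmap *fm, size_t idx)
{
    fm->bits[idx / TOY_MAP_BITS] |= UINT64_C(1) << (idx % TOY_MAP_BITS);
}

/* Tests whether flow word 'idx' is present. */
static bool
toy_flowmap_is_set(const struct toy_flowmap *fm, size_t idx)
{
    return (fm->bits[idx / TOY_MAP_BITS] >> (idx % TOY_MAP_BITS)) & 1;
}

/* Counts present words by visiting each unit explicitly. */
static unsigned int
toy_flowmap_n_1bits(const struct toy_flowmap *fm)
{
    unsigned int n = 0;

    for (size_t unit = 0; unit < TOY_FLOWMAP_UNITS; unit++) {
        for (uint64_t w = fm->bits[unit]; w; w &= w - 1) {
            n++;
        }
    }
    return n;
}
```

Because the unit count is derived from the flow size, growing the flow past
64 words adds a unit without touching the callers.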

Performance critical algorithms enumerate the flowmap array units
explicitly, as it is easier for the compiler to optimize, compared to
the normal iterator.  Without this optimization a classifier lookup
without wildcard masks would be about 25% slower.

With this more general (and maintainable) algorithm the classifier
lookups are about 5% slower, when the struct flow actually becomes big
enough to require a second map.  This negates the performance gained
in the "Pre-compute stage masks" patch earlier in the series.

Requested-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-08-26 15:37:22 -07:00
Jarno Rajahalme
fa2fdbf8d0 classifier: Pre-compute stage masks.
This makes stage mask computation happen only when a subtable is
inserted and allows simplification of the main lookup function.

Classifier benchmark shows that this speeds up the classification
(with wildcards) about 5%.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-08-26 14:41:02 -07:00
Jarno Rajahalme
1d85dfa559 classifier: Do not use mf_value.
mf_value has grown bigger than needed for storing the biggest
supported prefix (IPv6 address length).  Define a new type to be used
instead of mf_value.

This makes classifier lookups a bit faster.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
2015-08-12 17:03:07 -07:00
Jarno Rajahalme
faa9110735 classifier: Remove unused hash functions.
Remove unused cls_rule_hash() and minimatch_hash() functions.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
2015-08-12 16:12:16 -07:00
Jarno Rajahalme
96152d1daf classifier: Fix comment.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
2015-08-12 16:00:48 -07:00
Jarno Rajahalme
361d808dd9 flow: Split miniflow's map.
Use two maps in miniflow to allow for expansion of struct flow past
512 bytes.  We now have one map for tunnel related fields, and another
for the rest of the packet metadata and actual packet header fields.
This split has the benefit that for non-tunneled packets the overhead
should be minimal.

Some miniflow utilities now exist in two variants, new ones operating
over all the data, and the old ones operating only on a single 64-bit
map at a time.  The old ones require doubling of code but should
execute faster, so those are used in the datapath and classifier's
lookup path.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-07-17 15:18:43 -07:00
Jarno Rajahalme
09b0fa9c55 flow: Make compile with MSVC.
MSVC does not like zero sized arrays in structs.  Hence, remove the
'values' member from struct miniflow and add back the getters
miniflow_values() and miniflow_get_values().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-07-16 17:42:24 -07:00
Jarno Rajahalme
a851eb9492 flow: Eliminate miniflow_clone() and minimask_clone().
miniflow_clone() and minimask_clone() are no longer used, remove them
from the API.

Now that miniflow data is always inlined, it makes sense to rename
miniflow_clone_inline() miniflow_clone().

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-07-15 13:17:10 -07:00
Jarno Rajahalme
8fd4792403 flow: Always inline miniflows.
Now that performance critical code already inlines miniflows and
minimasks, we can simplify struct miniflow by always dynamically
allocating miniflows and minimasks to the correct size.  This changes
the struct minimatch to always contain pointers to its miniflow and
minimask.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-07-15 13:17:10 -07:00
Jarno Rajahalme
bd53aa1723 classifier: Make versioning more explicit.
Now that struct cls_match has 'add_version' the 'version' in cls_match
was largely redundant.  Remove 'version' from struct cls_rule, and add
it to function prototypes that need it.  This makes versioning more
explicit (or less indirect) in the API.

Suggested-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-07-06 11:46:34 -07:00
Jarno Rajahalme
18721c4a4d classifier: Simplify versioning.
After all, there are some cases in which both the insertion version
and removal version of a rule need to be considered.  This makes the
cls_match a bit bigger, but makes classifier versioning much simpler
to understand.

Also, avoid using type larger than int in an enum, as it is not
portable C.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-06-12 16:12:56 -07:00
Jarno Rajahalme
2541d75983 rculist: Remove postponed poisoning.
Postponed 'next' member poisoning was based on the faulty assumption
that postponed functions would be called in the order they were
postponed.  This assumption holds only for the functions postponed by
any single thread.  When functions are postponed by different
threads, there are no guarantees of the order in which the functions
may be called, or timing between those calls after the next grace
period has passed.

Given this, the postponed poisoning could have executed after
postponed destruction of the object containing the rculist element.

This bug was revealed after the memory leaks on rule deletion were
recently fixed.

This patch removes the postponed 'next' member poisoning and adds
documentation describing the ordering limitations in OVS RCU.

Alex Wang dug out the root cause of the resulting crashes, thanks!

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
2015-06-11 17:28:37 -07:00
Jarno Rajahalme
39c9459355 Use classifier versioning.
Each rule is now added or deleted in a specific tables version.  Flow
tables are versioned with a monotonically increasing 64-bit integer,
where positive values are valid version numbers.

Rule modifications are implemented as an insertion of a new rule and a
deletion of the old rule, both taking place in the same tables
version.  Since concurrent lookups may use different versions, both
the old and new rule must be available for lookups at the same time.
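
How a modification can stay consistent for concurrent lookups may be
sketched as below. The half-open visibility interval, the field names and
TOY_NOT_REMOVED are assumptions made for illustration; the actual classifier
data structures may differ.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_version_t;

#define TOY_NOT_REMOVED UINT64_MAX   /* Hypothetical "never removed". */

struct toy_rule {
    toy_version_t add_version;       /* First version that sees the rule. */
    toy_version_t remove_version;    /* First version that no longer does. */
    int actions;
};

/* A rule is visible in versions [add_version, remove_version). */
static bool
toy_rule_visible(const struct toy_rule *rule, toy_version_t version)
{
    return rule->add_version <= version && version < rule->remove_version;
}

/* Models a modification: the old rule is removed and the new rule is
 * added in the same version, so every lookup sees exactly one of them. */
static void
toy_rule_replace(struct toy_rule *old_rule, struct toy_rule *new_rule,
                 toy_version_t version)
{
    old_rule->remove_version = version;
    new_rule->add_version = version;
    new_rule->remove_version = TOY_NOT_REMOVED;
}
```

A lookup still running at the previous version keeps matching the old rule,
while lookups at the new version match only the replacement.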

The ofproto provider interface is changed to accommodate the above.  As
rule's actions need not be modified any more, we no longer need
'rule_premodify_actions', nor 'rule_modify_actions'.  'rule_insert'
now takes a pointer to the old rule and adds a flag that tells whether
the old stats should be forwarded to the new rule or not (this
replaces the 'reset_counters' flag of the now removed
'rule_modify_actions').

Versioning all flow table changes has the side effect of making
learned flows visible for future lookups only.  I.e., the upcall that
executes the learn action will not see the newly learned flow in
its classifier lookups.  Only upcalls that start executing after the
new flow was added will match on it.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-06-11 15:53:43 -07:00
Jarno Rajahalme
8f8023b3ee classifier: Make traversing identical rules robust.
The traversal of the list of identical rules from the lookup threads
is fragile if the list head is removed during the list traversal.

This patch simplifies the implementation of that list by making the
list a NULL-terminated, singly linked, RCU-protected list.  By having the
NULL at the end there is no longer a possibility of missing the point
when the list wraps around.  This is significant when there can be
multiple elements with the same priority in the list.

This change also decreases the size of the struct cls_match back
to its pre-'visibility' attribute size.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-06-11 15:53:42 -07:00
Jarno Rajahalme
2b7b1427c1 classifier: Support table versioning
This patch allows classifier rules to become visible and invisible in
specific versions.  A 'version' is defined as a positive monotonically
increasing integer, which never wraps around.

The new 'visibility' attribute replaces the prior 'to_be_removed' and
'visible' attributes.

When versioning is not used, the 'version' parameter should be passed
as 'CLS_MIN_VERSION' when creating rules, and 'CLS_MAX_VERSION' when
looking up flows.

This feature enables the support for atomic OpenFlow bundles without
significant performance penalty on 64-bit systems. There is a
performance decrease in 32-bit systems due to 64-bit atomics used.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-06-10 16:17:47 -07:00
Jarno Rajahalme
186120da7f classifier: Support duplicate rules.
OpenFlow 1.4 bundles are easier to implement when it is possible to
mark a rule as 'to_be_removed' and then insert a new, identical rule
with the same priority.

All but one out of the identical rules must be marked as
'to_be_removed', and the one rule that is not 'to_be_removed' must
have been inserted last.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-06-01 13:25:11 -07:00
Jarno Rajahalme
fc02ecc717 classifier: Add support for invisible flows.
This makes it possible to tentatively add flows to the classifier
without the datapath seeing them.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-06-01 13:24:09 -07:00
Ben Pfaff
18080541d2 classifier: Add support for conjunctive matches.
A "conjunctive match" allows higher-level matches in the flow table, such
as set membership matches, without causing a cross-product explosion for
multidimensional matches.  Please refer to the documentation that this
commit adds to ovs-ofctl(8) for a better explanation, including an example.
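
The size of the cross-product explosion can be worked out with a short
sketch. The helper names are hypothetical, and the conjunctive count assumes
one flow per set member in each dimension plus a single flow that matches
the conjunction as a whole.

```c
#include <stddef.h>

/* Flows needed when every combination of set members gets its own flow:
 * the product of the set sizes. */
static unsigned long
toy_cross_product_flows(const unsigned int *set_sizes, size_t n_dims)
{
    unsigned long flows = 1;

    for (size_t i = 0; i < n_dims; i++) {
        flows *= set_sizes[i];
    }
    return flows;
}

/* Flows needed with a conjunctive match: the sum of the set sizes, plus
 * one flow for the conjunction itself (an assumption of this sketch). */
static unsigned long
toy_conjunctive_flows(const unsigned int *set_sizes, size_t n_dims)
{
    unsigned long flows = 1;

    for (size_t i = 0; i < n_dims; i++) {
        flows += set_sizes[i];
    }
    return flows;
}
```

For three dimensions of ten members each, the sketch gives 1000 flows for
the cross product against 31 for the conjunctive form.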

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-01-11 13:25:24 -08:00
Ben Pfaff
2e0bded4b4 classifier: Make classifier_lookup() 'flow' parameter non-const.
An upcoming commit will make classifier_lookup() sometimes modify its
'flow' argument temporarily during the lookup.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
---
v2: New patch.
v2.1: Rebase.
v3: Rebase.
2015-01-11 13:07:06 -08:00
Jarno Rajahalme
d70e8c28f9 miniflow: Use 64-bit data.
So far the compressed flow data in struct miniflow has been in 32-bit
words with a 63-bit map, allowing for a maximum size of struct flow of
252 bytes.  With the forthcoming Geneve options this is not sufficient
any more.

This patch solves the problem by changing the miniflow data to 64-bit
words, doubling the flow max size to 504 bytes.  Since the word size
is doubled, there is some loss in compression efficiency.  To counter
this some of the flow fields have been reordered to keep related
fields together (e.g., the source and destination IP addresses share
the same 64-bit word).
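
The size limits quoted above follow from the map width times the word size.
A tiny sketch, with hypothetical names, makes the arithmetic explicit.

```c
#include <stddef.h>

enum { TOY_MAP_BITS = 63 };  /* Usable map bits, as stated above. */

/* Each map bit covers one data word, so the largest flow that can be
 * described is the number of map bits times the word size in bytes. */
static size_t
toy_max_flow_bytes(size_t word_bytes)
{
    return TOY_MAP_BITS * word_bytes;
}
```

With 4-byte words this gives 63 * 4 = 252 bytes, and with 8-byte words
63 * 8 = 504 bytes, matching the figures in the message.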

This change should speed up flow data processing on 64-bit CPUs, which
may help counterbalance the impact of making the struct flow bigger in
the future.

Classifier lookup stage boundaries are also changed to 64-bit
alignment, as the current algorithm depends on each miniflow word to
not be split between ranges.  This has resulted in new padding (part
of the 'mpls_lse' field).

The 'dp_hash' field is also moved to packet metadata to eliminate
otherwise needed padding there.  This allows the L4 to fit into one
64-bit word, and also makes matches on 'dp_hash' more efficient as
misses can be found already on stage 1.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-01-06 14:47:30 -08:00
Thomas Graf
e6211adce4 lib: Move vlog.h to <openvswitch/vlog.h>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-12-15 14:15:19 +01:00