The cited commit reserved lower tc priorities for IP ethertypes in order
to give IP traffic higher priority than other management traffic.
In case of of vlan encap traffic, IP traffic will still get lower
priority.
Fix it by also reserving low priority tc prio for vlan.
Fixes: c230c7579c14 ("netdev-offload-tc: Reserve lower tc prios for ip ethertypes")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <horms@ovn.org>
Currently ovs-vswitchd crashes, with hw offloading enabled, if a geneve
packet with corrupted metadata is received, because the metadata header
is not verified correctly.
This commit adds a check for geneve metadata length and, if the header
is wrong, the packet is not sent to flower.
It also includes a system-traffic test for geneve packets with corrupted
metadata.
Fixes: a468645c6d33 ("lib/tc: add geneve with option match offload")
Reported-by: Haresh Khandelwal <hakhande@redhat.com>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
There is no TCA_TUNNEL_KEY_ENC_SRC_PORT in the kernel, so the offload
should not be attempted if OVS_TUNNEL_KEY_ATTR_TP_SRC is requested
by OVS. Current code just ignores the attribute in the tunnel(set())
action leading to a flow mismatch and potential incorrect datapath
behavior:
|tc(handler21)|DBG|tc flower compare failed action compare
...
Action 0 mismatch:
- Expected Action:
00000010 01 00 00 00 00 00 00 00-00 00 00 00 00 ff 00 11
00000020 c0 5b 17 c1 00 40 00 00-0a 01 00 6d 0a 01 01 12
00000050 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000060 01 02 80 01 00 18 00 0b-00 00 00 00 00 00 00 00
...
- Received Action:
00000010 01 00 00 00 00 00 00 00-00 00 00 00 00 ff 00 11
00000020 00 00 17 c1 00 40 00 00-0a 01 00 6d 0a 01 01 12
00000050 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000060 01 02 80 01 00 18 00 0b-00 00 00 00 00 00 00 00
...
In the tc_action dump above we can see the difference on the second
line. The action dumped from a kernel is missing 'c0 5b' - source
port for a tunnel(set()) action on the second line.
Removing the field from the tc_action_encap structure entirely to
avoid any potential confusion.
Note: In general, the source port number in the tunnel(set()) action
is not particularly useful for most tunnels, because they may just
ignore the value. Specs for Geneve and VXLAN suggest using a value
based on the headers of the inner packet as a source port.
In vast majority of scenarios the source port doesn't actually end
up in the action itself.
Having a mismatch between the userspace and TC leads to constant
revalidation of the flow and warnings in the log.
Adding a test case that demonstrates a scenario where the issue
occurs - bridging of two tunnels.
Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc interface")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-October/052744.html
Reported-by: Vladislav Odintsov <odivlad@gmail.com>
Tested-by: Vladislav Odintsov <odivlad@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Kernels that do not support vxlan gbp would treat the rule that has vxlan
gbp encap action or vxlan gbp id match differently, either reject it or
just skip the action/match and continue processing the knowing ones.
To solve the issue, probe and disallow inserting rules with vxlan gbp
action/match if kernel does not support it.
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Add TC offload support for vxlan encap with gbp option.
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Sometimes there is a need to clean empty chains as done in
delete_chains_from_netdev(). The cited commit doesn't remove
the chain completely which cause adding ingress_block later to fail.
This can be reproduced with adding bond as ovs port which makes ovs
use ingress_block for it.
While at it add the netdev name that fails to the log.
Fixes: e1e5eac5b016 ("tc: Add TCA_KIND flower to delete and get operation to avoid rtnl_lock().")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The device may be deleted and added with ifindex changed.
The tc rules on the device will be deleted if the device is deleted.
The func tc_del_filter will fail when flow del. The mapping of
ufid to tc will not be deleted.
The traffic will trigger the same flow(with same ufid) to put to tc
on the new device. Duplicated ufid mapping will be added.
If the hashmap is expanded, the old mapping entry will be the first entry,
and now the dp flow can't be deleted.
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
parse_tc_flower_to_actions() was not reporting errors, which would
cause parse_tc_flower_to_match() to ignore them.
Fixes: dd03672f7bbb ("netdev-offload-tc: Move flower_to_match action handling to isolated function.")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
tc does not support conntrack ALGs. Even worse, with tc enabled, they
should not be used/configured at all. This is because even though TC
will ignore the rules with ALG configured, i.e., they will flow through
the kernel module, return traffic might flow through a tc conntrack
rule, and it will not invoke the ALG helper.
Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
tc was not setting the OVS_CT_ATTR_FORCE_COMMIT flag when a forced
commit was requested. This patch will fix this.
Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Include and run the system-traffic.at tests as part of the system offload
testsuite. Exclude all the tests that will not run without any special
modifications.
Lowered log level for "recirc_id sharing not supported" message, so tests
will not fail with older kernels. This is not an error level message, but
should be debug, like all other, EOPNOTSUPP, related log messages.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When a flow gets modified, i.e. the actions are changes, the tc layer will
remove, and re-add the flow. This is causing all the counters to be reset.
This patch will remember the previous tc counters and adjust any requests
for statistics. This is done in a similar way as the rte_flow implementation.
It also updates the check_pkt_len tc test to purge the flows, so we do
not use existing updated tc flow counters, but start with fresh installed
set of datapath flows.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
A long long time ago, an effort was made to make tc flower
rtnl_lock() free. However, on the OVS part we forgot to add
the TCA_KIND "flower" attribute, which tell the kernel to skip
the lock. This patch corrects this by adding the attribute for
the delete and get operations.
The kernel code calls tcf_proto_is_unlocked() to determine the
rtnl_lock() is needed for the specific tc protocol. It does this
in the tc_new_tfilter(), tc_del_tfilter(), and in tc_get_tfilter().
If the name is not set, tcf_proto_is_unlocked() will always return
false. If set, the specific protocol is queried for unlocked support.
Fixes: f98e418fbdb6 ("tc: Add tc flower functions")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Currently ethertype to prio hmap is static and the first ethertype
being used gets a lower priority. Usually there is an arp request
before the ip traffic and the arp ethertype gets a lower tc priority
while the ip traffic proto gets a higher priority.
In this case ip traffic will go through more hops in tc and HW.
Instead, reserve lower priorities for ip ethertypes.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Cited commit correctly fixed the issue of incorrect reporting
of zero-length geneve option matches as wildcarded. But as a
side effect, exact match on the metadata length was added to
every tunnel flow match, even if the tunnel is not geneve.
That doesn't generate any functional issues, but it maybe
confusing to see 'tunnel(...,geneve(),...)' while looking at
datapath flow dumps for, e.g., vxlan tunnel flows.
Fix that by checking the port type before parsing geneve options.
tunnel() attribute itself doesn't have enough information to
figure out the tunnel type.
Fixes: 7a6c8074c5d2 ("netdev-offload-tc: Fix the mask for tunnel metadata length.")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
netdev_tc_flow_put() "consumes" the tunnel.tp_src value, but
it's never passed down to TC, and not parsed back. Fix that.
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Current offloading code supports only limited number of tunnel keys
and silently ignores everything it doesn't understand. This is
causing, for example, offloaded ERSPAN tunnels to not work, because
flow is offloaded, but ERSPAN options are not provided to TC.
There is a number of tunnel keys, which are supported by the userspace,
but silently ignored during offloading:
OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT
OVS_TUNNEL_KEY_ATTR_OAM
OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS
OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS
OVS_TUNNEL_KEY_ATTR_CSUM is kind of supported, but only for actions
and for some reason is set from the tunnel port instead of the
provided action, and not currently supported for the tunnel key in
the match.
Addig a default case to fail offloading of unknown attributes. For
now explicitly allowing incorrect behavior for the DONT_FRAGMENT flag,
otherwise we'll break all tunnel offloading by default. VXLAN and
ERSPAN options has to fail offloading, because the tunnel will not
work otherwise. OAM is not a default configurations, so failing it
as well. The missing DONT_FRAGMENT flag though should, probably,
cause frequent flow revalidation, but that is not new with this patch.
Same for the 'match' key, only clearing masks that was actually
consumed, except for the DONT_FRAGMENT and CSUM flags, which are
explicitly allowed and highlighted as broken.
Also, destination port as well as CSUM configuration for unknown
reason was not taken from the actions list and were passed via HW
offload info instead of being consumed from the set() action.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/395522.html
Reported-by: Eelco Chaudron <echaudro@redhat.com>
Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc interface")
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
If the key is zero, it doesn't mean we don't need to match on it.
Masks should be checked instead.
Fixes: 49a7961fca65 ("lib/tc: Avoid matching on tunnel ttl or tos if not needed")
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
netdev_tc_flow_put() ignores the tunnel.tp_dst mask. That results
in the exact match on the value. TC supports the masked match on
this field and it does return the mask back during the flow dump
even if it wasn't provided initially. OVS should correctly handle
that. There is a problem though. Some drivers (mlx5) doesn't
support offloading if the destination port is not an exact match [1].
Keeping the logic as-is for now, but making it explicit and somewhat
documented in the comment, so it is clear what is happening and we can
revisit this in the future.
[1] https://patchwork.ozlabs.org/project/openvswitch/patch/20220704224505.1117988-3-i.maximets@ovn.org/#2927396
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
'wc.masks.tunnel.metadata.present.len' must be a mask for the same
field in the flow key, but in the tc_flower structure it's the real
length of metadata masks.
That is correctly handled for the individual opt->length, setting all
the masks to 0x1f like it's done in the tun_metadata_to_geneve_mask__(),
but not handled for the main 'len' field.
Fix that by setting the mask to 0xff like it's done during the flow
translation in xlate_actions() and during the flow dump in the
tun_metadata_from_geneve_nlattr().
Also, flower always has an exact match on the present.len field
regardless of its value and regardless of this field being masked
by OVS flow translation layer while installing the flow. Hence,
all tunnel flows dumped from TC should have an exact match on
present.len and also UDPIF flag, because present.len doesn't make
sense without that flag. Without the change, zero-length options
match is incorrectly reported as a wildcard match. The side effect
though is that zero-length match on geneve options is reported even
for other tunnel types, e.g. vxlan. But that should be fairly
harmless. To avoid reporting a match on empty geneve options for
vxlan/etc. tunnels we'll need to check the tunnel port type, there
is no enough information in the TUNNEL attribute itself.
Extra checks and comments added around the code to better explain
what is going on.
Fixes: a468645c6d33 ("lib/tc: add geneve with option match offload")
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
OVS kernel datapath and TC are parsing IPv6 fragments differently.
For IPv6 later fragments, according to the original design [1], OVS
always sets nw_proto=44 (IPPROTO_FRAGMENT), regardless of the type
of the L4 protocol.
This leads to situation where flow for nw_proto=44 gets installed
to TC, but packets can not match on it, causing all IPv6 later
fragments to go to userspace significantly degrading performance.
Disabling offload for such packets, so the flow can be installed
to the OVS kernel datapath instead. Disabling for all IPv6 fragments
including the first one, because it doesn't make a lot of sense to
handle them separately. It may also cause potential problems with
conntrack trying to re-assemble a packet from fragments handled by
different datapaths (first in HW, later in OVS kernel).
Checking both 'nw_proto' and 'nw_frag' as classifier might decide
to match only on one of them and also nw_proto will not be 44 for
the first fragment.
The issue was hidden for some time due to incorrect behavior of the
OVS kernel datapath that was recently fixed in kernel commit:
12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments")
To allow offloading in the future either flow dissector in TC
should be changed to parse packets in the same way as OVS does,
or parsing in OVS kernel and userspace should be made configurable,
so users can opt-in to the behavior change. Silent change of the
behavior (change by default) is not an option, because existing
OpenFlow pipelines may depend on a certain behavior.
[1] https://docs.openvswitch.org/en/latest/topics/design/#fragments
Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation")
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This change extends the debug logging with the sparse
dump of the flow mask structure to make debug process
easier.
Sample output:
|netdev_offload_tc|DBG|offloading isn't supported, unknown attribute
Unused mask bits:
00000270 00 00 00 00 00 00 00 00-00 00 00 ff 00 00 00 00
In this example, 0x270 + 11 = 635, which is an offset of
the nsh.mdtype in the struct flow.
Suggested-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This change implements support for the check_pkt_len
action using the TC police action, which supports MTU
checking.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Move handling of the individual actions in the parse_tc_flower_to_match()
function to a separate function that will make recursive action handling
easier.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Move handling of the individual actions in the netdev_tc_flow_put()
function to a separate function that will make recursive action handling
easier.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When offloading rule, tc should be filled with police index, instead
of meter id. As meter is mapped to police action, and the mapping is
inserted into meter_id_to_police_idx hmap, this hmap is used to find
the police index. Besides, the reverse mapping between meter id and
police index is also added, so meter id can be retrieved from this
hashmap and pass to dpif while dumping rules.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
As the police actions with indexes of 0x10000000-0x1fffffff are
reserved for meter offload, to provide a clean environment for OVS,
these reserved police actions should be deleted on startup. So dump
all the police actions, delete those actions with indexes in this
range.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
For dpif-netlink, meters are mapped by tc to police actions with
one-to-one relationship. Implement meter offload API to set/get/del
the police action, and a hmap is used to save the mappings.
An id-pool is used to manage all the available police indexes, which
are 0x10000000-0x1fffffff, reserved only for OVS.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Add function to parse police action from netlink message.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
The parse_key_and_mask_to_match() is a function to translate
a netlink formatted key/mask to match structure. And should
not consider any configuration setting when translating.
In addition we also enforce the encap_eth_type[0] mask as
it's required for the VLAN match.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This patch checks for none offloadable ct_state match flag combinations.
If they exist force the +trk flag down to TC Flower
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Using SHORT version of the *_SAFE loops makes the code cleaner and less
error prone. So, use the SHORT version and remove the extra variable
when possible for hmap and all its derived types.
In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Verify that the returned ifindex by netdev_get_ifindex() is valid.
This might not be the case in the ERSPAN port scenario, which can
not be offloaded.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Currently, tc merges all header rewrite actions into one tc pedit
action. So the header rewrite actions order is lost. Save each header
rewrite action into one tc pedit action to keep the order. And only
append one tc csum action to the last pedit action of a series.
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Fragmented packets with offset=0 are defragmented by tc act_ct, and
only when assembled pass to next action, in ovs offload case,
a goto action. Since stats are overwritten on each action dump,
only the stats for last action in the tc filter action priority
list is taken, the stats on the goto action, which count
only the assembled packets. See below for example.
Hardware updates just part of the actions (gact, ct, mirred) - those
that support stats_update() operation. Since datapath rules end
with either an output (mirred) or recirc/drop (both gact), tc rule
will at least have one action that supports it. For software packets,
the first action will have the max software packets count.
Tc dumps total packets (hw + sw) and hardware packets, then
software packets needs to be calculated from this (total - hw).
To fix the above, get hardware packets and calculate software packets
for each action, take the max of each set, then combine back
to get the total packets that went through software and hardware.
Example by running ping above MTU (ping <IP> -s 2000):
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=first),
packets:14, bytes:19544,..., actions:ct(zone=1),recirc(0x1)
ct_state(-trk),recirc_id(0),...,ipv4(proto=1,frag=later),
packets:14, bytes:28392,..., actions:ct(zone=1),recirc(0x1)
Second rule should have had bytes=14*<size of 'later' frag>, but instead
it's bytes=14*<size of assembled packets - size of 'first' + 'later'
frags>.
Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
When a tunnel port gets added to the bridge setting the checksum option
to true:
ovs-vsctl add-port br0 geneve0 \
-- set interface geneve0 type=geneve \
options:remote_ip=<remote_ip> options:key=<key> options:csum=true
the flow dump for the outgoing traffic will include a
"bad key length 1 ..." message:
ovs-appctl dpctl/dump-flows --names -m
ufid:<>, ..., dp:tc,
actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081,
key6(bad key length 1, expected 0)(01)flags(key)))
,genev_sys_6081
This is due to a mismatch present between the expected length (zero
for OVS_TUNNEL_KEY_ATTR_CSUM in ovs_tun_key_attr_lens) and the
current one.
With this patch the same flow dump becomes:
ovs-appctl dpctl/dump-flows --names -m
ufid:<>, ..., dp:tc,
actions:set(tunnel(tun_id=<>,dst=<>,ttl=64,tp_dst=6081,
flags(csum|key))),genev_sys_6081
Fixes: d9677a1f0eaf ("netdev-tc-offloads: TC csum option is not matched with tunnel configuration")
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
'linux_tc' flow API suitable only for tunneling vports with backing
linux interfaces. DPDK flow API is not suitable for such ports.
With this change we could drop vport restriction from dpif-netdev.
This is a prerequisite for enabling vport offloading in DPDK.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Tested-by: Marko Kovacevic <marko.kovacevic@intel.com>
Upstream kernel now rejects unsupported ct_state flags.
Earlier kernels, ignored it but still echoed back the requested ct_state,
if ct_state was supported. ct_state initial support had trk, new, est,
and rel flags.
If kernel echos back ct_state, assume support for trk, new, est, and
rel. If kernel rejects a specific unsupported flag, continue and
use reject mechanisim to probe for flags rep and inv.
Disallow inserting rules with unnsupported ct_state flags.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Previously, only chain 0 is deleted before attach the ingress block.
If there are rules on the chain other than 0, these rules are not flushed.
In this case, the recreation of qdisc also fails. To fix this issue, flush
rules from all chains.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
TC flower doesn't support some ct state flags such as
INVALID/SNAT/DNAT/REPLY. So it is better to reject this rule.
Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
The n_offloaded_flows counter is saved in dpif, and this is the first
one when ofproto is created. When flow operation is done by ovs-appctl
commands, such as, dpctl/add-flow, a new dpif is opened, and the
n_offloaded_flows in it can't be used. So, instead of using counter,
the number of offloaded flows is queried from each netdev, then sum
them up. To achieve this, a new API is added in netdev_flow_api to get
how many flows assigned to a netdev.
In order to get better performance, this number is calculated directly
from tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc
takes much time if there are many flows offloaded.
Fixes: af0618470507 ("dpif-netlink: Count the number of offloaded rules")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
There is no need for a 'once' variable per probe.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
tc_replace_flower may fail, so the return value must be checked.
If not zero, ufid can't be deleted. Otherwise the operations on this
filter may fail because its ufid is not found.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
There is no need to pass tc rules to hw when just probing
for tc features. this will avoid redundant errors from hw drivers
that may happen.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Acked-By: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>