mirror of https://github.com/openvswitch/ovs synced 2025-08-30 05:47:55 +00:00

829 Commits

Author SHA1 Message Date
Ilya Maximets
f598f46212 dpif-netdev: Force port reconfiguration to change dynamic_txqs.
If the number of polling threads grows from exactly the number of Tx
queues in a port to a higher value while set_tx_multiq() is not
implemented or does not request reconfiguration, the port will not be
reconfigured and the datapath will continue using static Tx queue ids,
leading to a crash.

Ex.:
 Assuming that port p0 supports up to 4 Tx queues and doesn't support
 set_tx_multiq() method.  For example, netdev-afxdp could be the case,
 because it could have multiple Tx queues, but doesn't have
 set_tx_multiq() implementation because number of Tx queues always
 equals to number of Rx queues.

 1. Configuring pmd-cpu-mask to have 3 pmd threads.

 2. Adding port p0 to OVS.
    At this point wanted_txqs = 4 (3 for pmd threads + 1 for non-pmd).
    Port reconfigured to have 4 Tx queues successfully.
    dynamic_txqs = (4 < 4) = false;

 3. Configuring pmd-cpu-mask to have 10 pmd threads.
    At this point wanted_txqs = 11 (10 for pmd threads + 1 for non-pmd).
    Since set_tx_multiq() is not implemented, netdev doesn't request
    reconfiguration and 'dynamic_txqs' remains in 'false' state.

 4. Since 'dynamic_txqs == false', dpif-netdev uses static Tx queue
    ids that are in range [0, 10] while device only supports 4 leading
    to unwanted behavior and crashes.

Fix that by marking for reconfiguration all the ports that will likely
change their 'dynamic_txqs' value.

It looks like the issue can be reproduced only with afxdp ports,
because all other non-dpdk ports ignore Tx queue ids and dpdk ports
request reconfiguration on set_tx_multiq().
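
The check described above can be sketched in a few lines of C.  This is
only an illustration under stated assumptions: 'struct port_sketch',
wanted_txqs() and mark_if_dynamic_txqs_changes() are hypothetical names,
not the identifiers used in dpif-netdev.c.  The arithmetic follows the
example: one Tx queue per pmd thread plus one for non-pmd, and
'dynamic_txqs' is true when the device has fewer queues than that.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a datapath port. */
struct port_sketch {
    int n_txq;              /* Tx queues the device actually provides. */
    bool dynamic_txqs;      /* Value computed at the last reconfiguration. */
    bool need_reconfigure;  /* Set when the port must be reconfigured. */
};

/* One Tx queue per pmd thread plus one for the non-pmd thread. */
static int
wanted_txqs(int n_pmd_threads)
{
    return n_pmd_threads + 1;
}

/* Marks 'port' for reconfiguration if its 'dynamic_txqs' value would
 * change with 'n_pmd_threads' polling threads.  Returns true if the
 * port is marked. */
static bool
mark_if_dynamic_txqs_changes(struct port_sketch *port, int n_pmd_threads)
{
    bool new_dynamic = port->n_txq < wanted_txqs(n_pmd_threads);

    if (new_dynamic != port->dynamic_txqs) {
        port->need_reconfigure = true;
    }
    return port->need_reconfigure;
}
```

With a 4-queue port, 3 pmd threads leave the port untouched (4 < 4 is
false), while 10 pmd threads flip 'dynamic_txqs' to true (4 < 11) and
mark the port.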

Reported-by: William Tu <u9012063@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-March/368364.html
Fixes: e32971b8ddb4 ("dpif-netdev: Centralized threads and queues handling code.")
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-03-25 13:35:00 -07:00
Ilya Maximets
ef32a1a334 dpif-netdev: Enter quiescent state after each offloading operation.
If the offloading queue is big and filled continuously, offloading
thread may have no chance to quiesce blocking rcu callbacks and
other threads waiting for synchronization.

Fix that by entering momentary quiescent state after each operation
since we're not holding any rcu-protected memory here.

Fixes: 02bb2824e51d ("dpif-netdev: do hw flow offload in a thread")
Reported-by: Eli Britstein <elibr@mellanox.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-February/049768.html
Acked-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-03-15 13:42:57 +01:00
Ilya Maximets
342b8904ab dpif: Fix dp_extra_info leak by reworking the allocation scheme.
dpctl module leaks the 'dp_extra_info' in case the dumped flow doesn't
fit the dump filter while executing dpctl/dump-flows and also while
executing dpctl/get-flow.

This is already a 3rd attempt to fix all the leaks and incorrect usage
of this string that definitely indicates poor initial design of the
feature.

Flow dump/get documentation clearly states that the caller does not own
the data provided in dpif_flow.  Datapath still owns all the data and
promises to not free/modify it until the next quiescent period, however
we're requesting the caller to free 'dp_extra_info' and this obviously
breaks the rules.

This patch fixes the issue by storing 'dp_extra_info' within
'struct dp_netdev_flow' making datapath to own it.  'dp_netdev_flow'
is RCU-protected, so it will be valid until the next quiescent period.

Fixes: 0e8f5c6a38d0 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Tested-by: Emma Finn <emma.finn@intel.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-27 21:20:01 +01:00
Emma Finn
0e8f5c6a38 dpif-netdev: Modified ovs-appctl dpctl/dump-flows command
Modified ovs-appctl dpctl/dump-flows command to output
the miniflow bits for a given flow when -m option is passed.

$ ovs-appctl dpctl/dump-flows -m

Signed-off-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2020-01-17 15:26:59 +00:00
Eli Britstein
319a9bb338 dpif-netdev: Populate dpif class field in offload struct.
Populate dpif class field in offload struct to be used in offloading
flow put.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-16 13:34:10 +01:00
Ophir Munk
a309e4f526 dpif-netdev: Update offloaded flows statistics.
In case a flow is HW offloaded, packets do not reach the SW, thus not
counted for statistics. Use netdev flow get API in order to update the
statistics of flows by the HW statistics.

Co-authored-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-16 13:34:10 +01:00
Ilya Maximets
81e89d5c26 dpif-netdev: Make datapath port mutex recursive.
Upcoming HW offloading will request flow statistics from the dpdk
offloading module.  This operation requires holding datapath port
mutex.  However, there is a possible scenario in which flow deletion
happens during datapath reconfiguration process and the mutex already
acquired:

  0  raise () from /lib64/libc.so.6
  1  abort () from /lib64/libc.so.6
  2  ovs_abort_valist ()
  3  ovs_abort ()
  4  ovs_mutex_lock_at ()
  5  dpif_netdev_get_flow_offload_status ()
  6  get_dpif_flow_status ()
  7  flow_del_on_pmd ()
  8  dpif_netdev_flow_del ()
  9  dpif_netdev_operate ()
  10 dpif_operate ()
  11 push_dp_ops ()
  12 push_ukey_ops ()
  13 dp_purge_cb ()
  14 dp_netdev_del_pmd ()
  15 reconfigure_pmd_threads ()
  16 reconfigure_datapath ()
  17 do_del_port ()
  18 dpif_netdev_port_del ()
  19 dpif_port_del ()
  20 port_del ()
  21 ofproto_port_del ()
  22 bridge_delete_or_reconfigure_ports ()
  23 bridge_reconfigure ()
  24 bridge_run ()
  25 main ()

This happens while removing the last port of a particular PMD thread.
Reconfiguration process decides that we need to remove current PMD
thread and calls datapath purge callback in order to clean up resources
assigned to it.  This turns into flow removal and flow_del() tries to
request statistics.

Turning the dp->port_mutex into recursive version as a quick fix for
this issue.  Better solutions might be to avoid statistics request
somehow, or fully disassociate offloaded flows from the datapath flows.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ian Stokes <ian.stokes@intel.com>
2020-01-16 13:34:09 +01:00
Ilya Maximets
633486a98e dpif-netdev: Get rid of broken dpif pointer in dp_netdev structure.
This pointer was introduced in July 2014 by commit
6b31e07347ad ("dpif-netdev: Polling threads directly call ofproto upcall functions.")
and it was broken right from this point because dpif_netdev_open()
updates it on each call with the pointer to a newly allocated
'dpif' structure that becomes invalid on the next dpif_netdev_close().
Since dpif_open/close() always happens asynchronously from different
threads and pointer is not protected by rcu or mutex (it's not even
atomic) it's not possible to safely use it.  Thankfully the actual
usage was in repository for less than 3 weeks and was removed by
commit 623540e4617e ("dpif-netdev: Streamline miss handling.").  Until
recently this pointer was used in order to pass it to dpif_flow_hash().
Another luck is that dpif_flow_hash() didn't use the 'dpif' argument.

However, we tried to use it while netdev offloading by commit
30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while offloading.")
and that unveiled the issue.

Now that all the code that used this pointer was cleaned up we can
just remove it from the structure to avoid possible misuse in the
future.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2020-01-08 16:02:37 +01:00
Ilya Maximets
7a5e0ee7cc dpif: Turn dpif_flow_hash function into generic odp_flow_key_hash.
Current implementation of dpif_flow_hash() doesn't depend on datapath
interface and only complicates the callers by forcing them to figure
out what is their current 'dpif'.  If we'll need different hashing
for different 'dpif's we'll implement an API for dpif-providers
and each dpif implementation will be able to use their local function
directly without calling it via dpif API.

This change will allow us to not store 'dpif' pointer in the userspace
datapath implementation which is broken and will be removed in next
commits.

This patch moves dpif_flow_hash() to odp-util module and replaces
unused odp_flow_key_hash() by it, along with removing of unused 'dpif'
argument.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2020-01-08 16:02:37 +01:00
Anju Thomas
a13a020975 userspace: Improved packet drop statistics.
Currently OVS maintains explicit packet drop/error counters only on port
level.  Packets that are dropped as part of normal OpenFlow processing
are counted in flow stats of “drop” flows or as table misses in table
stats. These can only be interpreted by controllers that know the
semantics of the configured OpenFlow pipeline.  Without that knowledge,
it is impossible for an OVS user to obtain e.g. the total number of
packets dropped due to OpenFlow rules.

Furthermore, there are numerous other reasons for which packets can be
dropped by OVS slow path that are not related to the OpenFlow pipeline.
The generated datapath flow entries include a drop action to avoid
further expensive upcalls to the slow path, but subsequent packets
dropped by the datapath are not accounted anywhere.

Finally, the datapath itself drops packets in certain error situations.
These drops are not accounted for today either.  This makes it difficult
for OVS users to monitor packet drops in an OVS instance and to alert a
management system in case of an unexpected increase of such drops.
OVS troubleshooters also face difficulties in analysing packet drops.

With this patch we implement following changes to address the issues
mentioned above.

1. Identify and account all the silent packet drop scenarios
2. Display these drops in ovs-appctl coverage/show

Co-authored-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Co-authored-by: Keshav Gupta <keshugupta1@gmail.com>
Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
Signed-off-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Signed-off-by: Keshav Gupta <keshugupta1@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-01-07 17:01:42 +01:00
Paul Blakey
dcdcad68c6 dpif: Add support to set user features
This enables user features on the kernel datapath via the DP_CMD_SET
command, and also retrieves them to check for actual support and
not just an older kernel ignoring the requested features.

This will be used in next patch to enable recirc_id sharing with tc.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2019-12-22 11:54:40 +01:00
Ilya Maximets
96e744046a dpif-netdev: Avoid infinite re-addition of misconfigured ports.
Infinite re-addition of failed ports happens if the device in userspace
datapath has a linux network interface and it's not able to be
configured.  For example, if the first reconfiguration fails because of
misconfiguration or bad initial device state.
In current code victims are afxdp ports and the Mellanox NIC ports
opened by the DPDK due to their bifurcated drivers (It's unlikely for
usual netdev-linux ports to fail).

The root cause: Every change in the state of the network interface
of a linux kernel device generates if-notifier event and if-notifier
event triggers the OVS code to re-apply the configuration of ports,
i.e. add broken ports back. The most obvious part is that dpif-netdev
changes the device flags before trying to configure it:

   1. add_port()
   2. set_flags() --> if-notifier event
   3. reconfigure() --> port removal from the datapath due to
                        misconfiguration or any other issue in
                        the underlying device.
   4. setting flags back --> another if-notifier event.
   5. There was new if-notifier event?
      yes --> re-apply all settings. --> goto step 1.

Easy way to reproduce is to add afxdp port with n_rxq=N, where N is
bigger than device supports.

This patch fixes the most obvious case for this issue by moving
enabling of a promisc mode later to the place where we already know
that device could be added to datapath without errors, i.e. after
its first successful reconfiguration.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363038.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-12-18 01:39:56 +01:00
Ilya Maximets
e7cb123ffc dpif-netdev: Hold global port mutex while calling offload API.
We changed datapath port lookup to netdev-offload API usage, but
forgot that port mutex was there not only to protect datapath
port hash map.  It was there also as a workaround solution for
complete unsafety of netdev-offload-dpdk functions.

Turning it back to fix the behaviour and adding a comment to prevent
removing it in the future unless netdev-offload-dpdk is fixed.

For the thread safety notice see the top of netdev-offload-dpdk.c.

Fixes: 30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while offloading")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eli Britstein <elibr@mellanox.com>
2019-12-13 13:43:35 +01:00
Ophir Munk
1061dc7c85 dpif-netdev: Retrieve dpif_class from struct dp_netdev.
In case a pmd pointer (struct dp_netdev_pmd_thread *) needs to retrieve
the dpif_class it points at - it can access it as:  pmd->dp->class.  A
second option is to access it as: pmd->dp->dpif->dpif_class. The first
option is safe since there is one dp netdev with a constant pointer to
the dpif class. The second option is not safe since the pointer
pmd->dp->dpif may be changed under the hood, for example, in case there
is a call to dpif_open(). One such scenario is when a netdev bridge is
running while dumping flows statistics with dpctl in parallel:
ovs-appctl dpctl/dump-flows. This commit makes usage of the first
safe option instead of the second option.

Fixes: 30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while offloading")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-12-08 15:48:23 +01:00
Darrell Ball
a7f33fdbfb conntrack: Support zone limits.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 10:11:13 -08:00
Ilya Maximets
30115809da dpif-netdev: Use netdev-offload API for port lookup while offloading.
Currently, while offloading, userspace datapath tries to lookup netdev
in a local port list of the datapath interface instance.  However,
there is no guarantee that these netdevs are the same netdevs that
netdev-offload module operates with and, as a result, there is no
guarantee that these netdev instances have an initialized flow API.

dpif-netdev should request ports from the netdev-offload module as
intended by flow offloading API in a same way as dpif-netlink does.
This will also give us performance benefits because we don't need to
hold global port mutex anymore.

We're not noticing any significant issues with current code, but
it will become a serious issue in the future, e.g. with offloading
for virtual tunneling ports.

Reported-by: Ophir Munk <ophirmu@mellanox.com>
Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Eli Britstein <elibr@mellanox.com>
2019-12-02 16:06:57 +01:00
Gowrishankar Muthukrishnan
433a3fa518 dpif-netdev: Log rxq assignment for isolated pmd.
There is no log about isolated rxq assignment in a pmd today, which
sometimes could be useful to trace rxq/pmd pinning, when debugging
with log. Ovs-appctl dpif-netdev/pmd-rxq-show reports about it
already, but logging is helpful to trace pinning in time.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2019-11-19 13:06:09 +01:00
Ilya Maximets
acc5df0e3c dpif-netdev: Fix time delta overflow in case of race for meter lock.
There is a race window between getting the time and getting the meter
lock.  This could lead to situation where the thread with larger
current time (this thread called time_{um}sec() later than others)
will acquire meter lock first and update meter->used to the large
value.  Next threads will try to calculate time delta by subtracting
the large meter->used from their lower time getting the negative value
which will be converted to a big unsigned delta.

Fix that by assuming that all these threads received packets in the
same time in this case, i.e. dropping negative delta to 0.
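
A minimal sketch of the clamping, assuming signed 'long long' timestamps
and hypothetical helper names (the real code works on the meter
structure directly):

```c
#include <stdint.h>

/* Returns the time elapsed since 'used'.  If another thread already
 * advanced 'used' past 'now', the difference is negative and is clamped
 * to 0, i.e. the packets are treated as received at the same time. */
static uint64_t
meter_time_delta(long long now, long long used)
{
    long long delta = now - used;

    return delta > 0 ? (uint64_t) delta : 0;
}

/* The problematic form, for comparison: a negative difference converted
 * to an unsigned type wraps around to a huge delta. */
static uint64_t
meter_time_delta_unclamped(long long now, long long used)
{
    return (uint64_t) (now - used);
}
```

For now = 100 and used = 150, the clamped helper returns 0, while the
unclamped one returns a value close to UINT64_MAX.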

CC: Jarno Rajahalme <jarno@ovn.org>
Fixes: 4b27db644a8c ("dpif-netdev: Simple DROP meter implementation.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363126.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2019-10-28 13:38:37 +01:00
Ilya Maximets
18ae34ae1f dpif-netdev: Do not mix recirculation depth into RSS hash itself.
Mixing of RSS hash with recirculation depth is useful for flow lookup
because same packet after recirculation should match with different
datapath rule.  Setting of the mixed value back to the packet is
completely unnecessary because recirculation depth is different on
each recirculation, i.e. we will have different packet hash for
flow lookup anyway.

This should fix the issue that packets from the same flow could be
directed to different buckets based on a dp_hash or different ports of
a balanced bonding in case they were recirculated different number of
times (e.g. due to conntrack rules).
With this change, the original RSS hash will remain the same making
it possible to calculate equal dp_hash values for such packets.
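
The idea can be sketched as follows.  'struct packet_sketch', mix_hash()
and lookup_hash() are hypothetical names, and the mixing function is an
arbitrary stand-in, not the hash used by OVS.  The point is only that
the mixed value is returned for the flow lookup and never written back
to the packet.

```c
#include <stdint.h>

/* Hypothetical, simplified packet: only the stored RSS hash matters. */
struct packet_sketch {
    uint32_t rss_hash;
};

/* Stand-in mixer (an assumption, not the OVS hash function). */
static uint32_t
mix_hash(uint32_t hash, uint32_t depth)
{
    uint32_t x = hash ^ (depth * 0x9e3779b9u);

    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    return x;
}

/* Returns the hash to use for the flow lookup at 'recirc_depth'.
 * The packet itself is const: its RSS hash is left untouched, so a
 * later dp_hash computation sees the same value at any depth. */
static uint32_t
lookup_hash(const struct packet_sketch *pkt, uint32_t recirc_depth)
{
    if (recirc_depth == 0) {
        return pkt->rss_hash;
    }
    return mix_hash(pkt->rss_hash, recirc_depth);
}
```

Two packets of the same flow that were recirculated a different number
of times get different lookup hashes, but both still carry the original
RSS hash.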

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363127.html
Fixes: 048963aa8507 ("dpif-netdev: Reset RSS hash when recirculating.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
2019-10-28 13:38:37 +01:00
Yi-Hung Wei
187bb41fbf ofproto-dpif-xlate: Translate timeout policy in ct action
This patch derives the timeout policy based on ct zone from the
internal data structure that we maintain on dpif layer.

It also adds a system traffic test to verify the zone-based conntrack
timeout feature.  The test uses ovs-vsctl commands to configure
the customized ICMP and UDP timeout on zone 5 to a shorter period.
It then injects ICMP and UDP traffic to conntrack, and checks if the
corresponding conntrack entry expires after the predefined timeout.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>

ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT

This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With this
check, ofproto-dpif-xlate can use this information to decide whether to
translate the ct timeout policy.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:51:04 -07:00
Yi-Hung Wei
ebe62ec1b9 datapath: Add support for conntrack timeout policy
This patch adds support for specifying a timeout policy for a
connection in connection tracking system in kernel datapath.
The timeout policy will be attached to a connection when the
connection is committed to conntrack.

This patch introduces a new odp field OVS_CT_ATTR_TIMEOUT in the
ct action that specifies the timeout policy in the datapath.
In the following patch, during the upcall process, the vswitchd will use
the ct_zone to look up the corresponding timeout policy and fill
OVS_CT_ATTR_TIMEOUT if it is available.

The datapath code is from the following two net-next upstream commits.

Upstream commit:
commit 06bd2bdf19d2f3d22731625e1a47fa1dff5ac407
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date:   Tue Mar 26 11:31:14 2019 -0700

    openvswitch: Add timeout support to ct action

    Add support for fine-grain timeout support to conntrack action.
    The new OVS_CT_ATTR_TIMEOUT attribute of the conntrack action
    specifies a timeout to be associated with this connection.
    If no timeout is specified, it acts as is, that is the default
    timeout for the connection will be automatically applied.

    Example usage:
    $ nfct timeout add timeout_1 inet tcp syn_sent 100 established 200
    $ ovs-ofctl add-flow br0 in_port=1,ip,tcp,action=ct(commit,timeout=timeout_1)

    CC: Pravin Shelar <pshelar@ovn.org>
    CC: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit 6d670497e01803b486aa72cc1a718401ab986896
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Tue Apr 2 09:53:14 2019 +0300

    openvswitch: use after free in __ovs_ct_free_action()

    We free "ct_info->ct" and then use it on the next line when we pass it
    to nf_ct_destroy_timeout().  This patch swaps the order to avoid the use
    after free.

    Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:50:17 -07:00
Yi-Hung Wei
1f16131837 ct-dpif, dpif-netlink: Add conntrack timeout policy support
This patch first defines the dpif interface for a datapath to support
adding, deleting, getting and dumping conntrack timeout policy.
The timeout policy is identified by a 4 bytes unsigned integer in
datapath, and it currently supports timeouts for TCP, UDP, and ICMP
protocols.

Moreover, this patch provides the implementation for Linux kernel
datapath in dpif-netlink.

In Linux kernel, the timeout policy is maintained per L3/L4 protocol,
and it is identified by 32 bytes null terminated string.  On the other
hand, in vswitchd, the timeout policy is a generic one that consists of
all the supported L4 protocols.  Therefore, one of the main tasks in
dpif-netlink is to break down the generic timeout policy into 6
sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp),
and push down the configuration using the netlink API in
netlink-conntrack.c.
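
The breakdown into 6 sub policies can be illustrated with a short
sketch.  The function name and the "tp<id>_<l3>_<l4>" naming scheme are
assumptions made for this example only, not the format used by
dpif-netlink.  The only constraint taken from the text is that each
kernel policy name must fit in a 32-byte null-terminated string.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SUB_POLICY_NAME_LEN 32   /* 32 bytes including the terminator. */
#define N_SUB_POLICIES 6         /* {ipv4, ipv6} x {tcp, udp, icmp}. */

/* Fills 'names' with one name per (L3, L4) pair for the generic policy
 * 'tp_id' and returns the number of names written.  The naming scheme
 * is hypothetical. */
static size_t
sub_policy_names(uint32_t tp_id,
                 char names[N_SUB_POLICIES][SUB_POLICY_NAME_LEN])
{
    static const char *const l3[] = { "ip4", "ip6" };
    static const char *const l4[] = { "tcp", "udp", "icmp" };
    size_t n = 0;

    for (size_t i = 0; i < sizeof l3 / sizeof l3[0]; i++) {
        for (size_t j = 0; j < sizeof l4 / sizeof l4[0]; j++) {
            snprintf(names[n++], SUB_POLICY_NAME_LEN, "tp%u_%s_%s",
                     (unsigned int) tp_id, l3[i], l4[j]);
        }
    }
    return n;
}
```

Each generated name would then be pushed to the kernel as a separate
per-protocol timeout policy.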

This patch also adds missing symbols in the windows datapath so
that the build on windows can pass.

Appveyor CI:
* https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:50:17 -07:00
Paul Chaignon
940ac2ce88 treewide: Use packet batch APIs
This patch replaces direct accesses to dp_packet_batch and dp_packet
internal components by the appropriate API calls.  It extends commit
1270b6e52 (treewide: Wider use of packet batch APIs).

This patch was generated using the following semantic patch (cf.
http://coccinelle.lip6.fr).

// <smpl>
@ dp_packet @
struct dp_packet_batch *b1;
struct dp_packet_batch b2;
struct dp_packet *p;
expression e;
@@

(
- b1->packets[b1->count++] = p;
+ dp_packet_batch_add(b1, p);
|
- b2.packets[b2.count++] = p;
+ dp_packet_batch_add(&b2, p);
|
- p->packet_type == htonl(PT_ETH)
+ dp_packet_is_eth(p)
|
- p->packet_type != htonl(PT_ETH)
+ !dp_packet_is_eth(p)
|
- b1->count == 0
+ dp_packet_batch_is_empty(b1)
|
- !b1->count
+ dp_packet_batch_is_empty(b1)
|
  b1->count = e;
|
  b1->count++
|
  b2.count = e;
|
  b2.count++
|
- b1->count
+ dp_packet_batch_size(b1)
|
- b2.count
+ dp_packet_batch_size(&b2)
)
// </smpl>

Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 14:42:00 -07:00
Darrell Ball
64207120c8 conntrack: Add option to disable TCP sequence checking.
This may be needed in some special cases, such as to support some hardware
offload implementations.  Note that disabling TCP sequence number
verification is not an optimization in itself, but supporting some
hardware offload implementations may offer better performance.  TCP
sequence number verification is enabled by default.  This option is only
available for the userspace datapath.  Access to this option is presently
provided via 'dpctl' commands as the need for this option is quite node
specific, by virtue of which nics are in use on a given node.  A test is
added to verify this option.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359188.html
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 12:11:32 -07:00
Yifeng Sun
c98eedf9ef dpif-netdev: Handle uninitialized value error for 'match.wc'
Valgrind reported that match.wc was not initialized, as below:

1176: ofproto-dpif - fragment handling - actions

==21214== Conditional jump or move depends on uninitialised value(s)
==21214==    at 0x4B77C1: odp_flow_key_from_flow__ (odp-util.c:6143)
==21214==    by 0x46DB58: dp_netdev_upcall (dpif-netdev.c:6239)
==21214==    by 0x4774A7: handle_packet_upcall (dpif-netdev.c:6608)
==21214==    by 0x4774A7: fast_path_processing (dpif-netdev.c:6726)
==21214==    by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
==21214==    by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
==21214==    by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==21214==    by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
==21214==    by 0x4324E7: type_run (ofproto-dpif.c:342)
==21214==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==21214==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
==21214==    by 0x410CF3: bridge_run (bridge.c:3029)
==21214==    by 0x407614: main (ovs-vswitchd.c:127)
==21214==  Uninitialised value was created by a stack allocation
==21214==    at 0x4769C3: fast_path_processing (dpif-netdev.c:6672)

'match' is allocated on stack but its 'wc' is accessed in
odp_flow_key_from_flow__ without proper initialization.
This patch fixes it.

Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-19 09:23:41 -07:00
Ilya Maximets
8afbf2facc dpif-netdev: Add core id in the PMD thread name.
This is highly useful to see on which core PMD is running by
only looking at the thread name. Thread Id still allows to
distinguish different threads running on the same core over the time:

   |dpif_netdev(pmd-c10/id:53)|DBG|Creating 2. subtable <...>
   |dpif_netdev(pmd-c10/id:53)|DBG|flow_add: <...>, actions:2
   |dpif_netdev(pmd-c09/id:70)|DBG|Core 9 processing port <..>

In gdb, top or any other utility it's useful to quickly catch up
needed thread without parsing logs, memory or matching threads by port
names they're handling.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
2019-09-06 11:45:39 +03:00
Ilya Maximets
1276e3db89 dpif-netdev-perf: Fix TSC frequency for non-DPDK case.
Unlike 'rte_get_tsc_cycles()' which doesn't need any specific
initialization, 'rte_get_tsc_hz()' could be used only after successful
call to 'rte_eal_init()'. 'rte_eal_init()' estimates the TSC frequency
for later use by 'rte_get_tsc_hz()'.  Fairly said, we're not allowed
to use 'rte_get_tsc_cycles()' before initializing DPDK too, but it
works this way for now and provides correct results.

This patch provides TSC frequency estimation code that will be used
in two cases:
  * DPDK is not compiled in, i.e. DPDK_NETDEV not defined.
  * DPDK compiled in but not initialized,
    i.e. other_config:dpdk-init=false

This change is mostly useful for AF_XDP netdev support, i.e. allows
to use dpif-netdev/pmd-perf-show command and various PMD perf metrics.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
2019-09-06 11:45:39 +03:00
Ilya Maximets
3f51ea180b dpif-netdev: Fail port addition if reconfiguration failed.
If the port was destroyed during the initial reconfiguration, we should
report an error to the upper layers. Otherwise, successful addition of
the port will be logged and upper layers will continue to configure
this port. For example, the 'dpif' layer will try to initialize flow
API for this device.

Fix that by checking for port existence after reconfiguration. We can't
get the real error code here, so let's assume EINVAL. 'ovs-vsctl' will
tell the user to check the logs for a real reason anyway.

Fixes: e32971b8ddb4 ("dpif-netdev: Centralized threads and queues handling code.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
2019-08-29 18:25:50 +03:00
Harry van Haaren
f54d8f004f dpif-netdev: Add specialized generic scalar functions
This commit adds a number of specialized functions, that handle
common miniflow fingerprints. This enables compiler optimization,
resulting in higher performance. Below a quick description of
how this optimization actually works:

"Specialized functions" are "instances" of the generic implementation,
but the compiler is given extra context when compiling. In the case of
iterating miniflow datastructures, the most interesting value to enable
compile time optimizations is the loop trip count per unit.

In order to create a specialized function, there is a generic implementation,
which uses a for() loop without the compiler knowing the loop trip count at
compile time. The loop trip count is passed in as an argument to the function:

uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count)
{
    for(uint32_t i = 0; i < loop_count; i++)
        // do work
}

In order to "specialize" the function, we call the generic implementation
with hard-coded numbers - these are compile time constants!

uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count)
{
    // use hard coded constant for compile-time constant-propagation
    return miniflow_impl_generic(mf, 5);
}

Given the compiler is aware of the loop trip count at compile time,
it can perform an optimization known as "constant propagation". Combined
with inlining of the miniflow_impl_generic() function, the compiler is
now enabled to *compile time* unroll the loop 5x, and produce "flat" code.

The last step to using the specialized functions is to utilize a
function-pointer to choose the specialized (or generic) implementation.
The selection of the function pointer is performed at subtable creation
time, when miniflow fingerprint of the subtable is known. This technique
is known as "multiple dispatch" in some literature, as it uses multiple
items of information (miniflow bit counts) to select the dispatch function.

By pointing the function pointer at the optimized implementation, OvS
benefits from the compile time optimizations at runtime.
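
A self-contained sketch of the whole pattern, using a trivial summation
in place of the real miniflow work.  All names here (sum_generic,
sum_loop5, select_sum_func) are hypothetical and exist only for
illustration.  Whether the loop is actually unrolled depends on the
compiler and optimization level.

```c
#include <stdint.h>

/* Generic implementation: the trip count is a runtime argument. */
static inline uint32_t
sum_generic(const uint32_t *values, uint32_t loop_count)
{
    uint32_t sum = 0;

    for (uint32_t i = 0; i < loop_count; i++) {
        sum += values[i];
    }
    return sum;
}

/* Specialized instance: same signature, but the generic code is called
 * with a compile-time constant so the compiler may inline and unroll. */
static uint32_t
sum_loop5(const uint32_t *values, uint32_t loop_count)
{
    (void) loop_count;   /* Ignored; the trip count is fixed at 5. */
    return sum_generic(values, 5);
}

typedef uint32_t (*sum_func)(const uint32_t *values, uint32_t loop_count);

/* Chosen once, when the trip count becomes known (subtable creation
 * time in the text above), and then reused for every call. */
static sum_func
select_sum_func(uint32_t loop_count)
{
    return loop_count == 5 ? sum_loop5 : sum_generic;
}
```

The caller stores the returned pointer and invokes it with the same
arguments regardless of which implementation was selected.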

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Malvika Gupta <malvika.gupta@arm.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-19 12:24:13 +01:00
Harry van Haaren
a0b36b3924 dpif-netdev: Refactor generic implementation
This commit refactors the generic implementation. The
goal of this refactor is to simplify the code to enable
"specialization" of the functions at compile time.

Given compile-time optimizations, the compiler is able
to unroll loops, and create optimized code sequences due
to compile time knowledge of loop-trip counts.

In order to enable these compiler optimizations, we must
refactor the code to pass the loop-trip counts to functions
as compile time constants.

This patch allows the number of miniflow-bits set per "unit"
in the miniflow to be passed around as a function argument.

Note that this patch does NOT yet take advantage of doing so,
this is only a refactor to enable it in the next patches.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Malvika Gupta <malvika.gupta@arm.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-19 12:22:23 +01:00
Harry van Haaren
92c7c870d6 dpif-netdev: Split out generic lookup function
This commit splits the generic hash-lookup-verify function to
its own file, for cleaner separation between optimized versions.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Malvika Gupta <malvika.gupta@arm.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-19 12:22:01 +01:00
Harry van Haaren
f5ace7cd8a dpif-netdev: Move dpcls lookup structures to .h
This commit moves some data-structures to be available
in the dpif-netdev-private.h header. This allows specific
implementations of the subtable lookup function to include
just that header file, and not require that the code exists
in dpif-netdev.c

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Malvika Gupta <malvika.gupta@arm.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-19 12:21:37 +01:00
Harry van Haaren
aadede3dda dpif-netdev: Implement function pointers/subtable
This allows plugging-in of different subtable hash-lookup-verify
routines, and allows special casing of those functions based on
known context (eg: # of bits set) of the specific subtable.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Malvika Gupta <malvika.gupta@arm.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-19 12:21:16 +01:00
Ilya Maximets
ec61d4707b dpif-netdev: Clarify PMD reloading scheme.
It became more complicated, hence needs to be documented.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-10 13:15:19 +01:00
David Marchand
68a0625b78 dpif-netdev: Catch reloads faster.
Looking at the reload flag only every 1024 loops can be a long time
under load, since we might be handling 32 packets per rxq, per iteration,
which means up to poll_cnt * 32 * 1024 packets.
Look at the flag every loop, no major performance impact seen.
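
The worst case quoted above can be worked through with a small helper.
This is only arithmetic on the numbers from this message; the function
and macro names are hypothetical, and 'poll_cnt' is taken to be the
number of rx queues polled by one pmd thread.

```c
#include <stdint.h>

#define NETDEV_MAX_BURST_SKETCH 32   /* Packets per rxq, per iteration. */

/* Upper bound on packets a pmd may process between two looks at the
 * reload flag, when the flag is checked every 'check_interval' loops. */
static uint64_t
max_packets_before_reload(uint32_t poll_cnt, uint32_t check_interval)
{
    return (uint64_t) poll_cnt * NETDEV_MAX_BURST_SKETCH * check_interval;
}
```

For a pmd polling 4 queues, checking every 1024 loops gives up to
131072 packets before the flag is seen, against 128 when it is checked
every loop.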

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-10 11:51:09 +01:00
David Marchand
e2cafa8692 dpif-netdev: Only reload static tx qid when needed.
pmd->static_tx_qid is allocated under a mutex by the different pmd
threads.
Unconditionally reallocating it will make those pmd threads sleep
when contention occurs.
During "normal" reloads like for rebalancing queues between pmd threads,
this can make pmd threads waste time on this.
Reallocating the tx qid is only needed when removing other pmd threads
as it is the only situation when the qid pool can become non-contiguous.

Add a flag to instruct the pmd to reload tx qid for this case which is
Step 1 in current code.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-10 11:50:21 +01:00
David Marchand
6d9fead107 dpif-netdev: Do not sleep when swapping queues.
When swapping queues from a pmd thread to another (q0 polled by pmd0/q1
polled by pmd1 -> q1 polled by pmd0/q0 polled by pmd1), the current
"Step 5" puts both pmds to sleep waiting for the control thread to wake
them up later.

Prefer to make them spin in such a case to avoid sleeping for a
nondeterministic amount of time.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-10 11:47:46 +01:00
David Marchand
8f077b31e9 dpif-netdev: Trigger parallel pmd reloads.
pmd reloads are currently serialised in each step that calls
reload_affected_pmds.
Any pmd processing packets, waiting on a mutex etc... will make other
pmd threads wait for a delay that can be nondeterministic when syscalls
add up.

Switch to a little busy loop on the control thread using the existing
per-pmd reload boolean.

The memory order on this atomic is rel-acq to have an explicit
synchronisation between the pmd threads and the control thread.
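
A minimal sketch of this handshake with C11 atomics, assuming one
boolean per pmd. It is not the dpif-netdev code: 'struct pmd_sketch',
request_reload(), pmd_iteration() and wait_for_reloads() are
hypothetical names, and the 'config' field stands in for whatever state
the control thread publishes before asking for a reload.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct pmd_sketch {
    atomic_bool reload;   /* Set by the control thread, cleared by the pmd. */
    int config;           /* Plain data published before 'reload' is set. */
    int seen_config;      /* Copy taken by the pmd while reloading. */
};

/* Control thread: publish the new state, then raise the flag.  The
 * release store orders the write to 'config' before the flag. */
static void
request_reload(struct pmd_sketch *pmd, int new_config)
{
    pmd->config = new_config;
    atomic_store_explicit(&pmd->reload, true, memory_order_release);
}

/* Pmd thread: one main loop iteration.  Returns true if a reload was
 * handled.  The acquire load pairs with the release store above. */
static bool
pmd_iteration(struct pmd_sketch *pmd)
{
    if (!atomic_load_explicit(&pmd->reload, memory_order_acquire)) {
        return false;
    }
    pmd->seen_config = pmd->config;
    /* Release so the control thread observes a completed reload. */
    atomic_store_explicit(&pmd->reload, false, memory_order_release);
    return true;
}

/* Control thread: busy loop until every pmd has cleared its flag. */
static void
wait_for_reloads(struct pmd_sketch *pmds, int n)
{
    for (int i = 0; i < n; i++) {
        while (atomic_load_explicit(&pmds[i].reload,
                                    memory_order_acquire)) {
            /* Spin; the pmd clears the flag when it is done. */
        }
    }
}
```

The control thread calls request_reload() on every affected pmd first
and only then wait_for_reloads(), so all pmds reload in parallel rather
than one after another.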

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-10 11:46:31 +01:00
David Marchand
299c8d611e dpif-netdev: Convert exit latch to flag.
No need for a latch here since we don't have to wait.
A simple boolean flag is enough.

The memory order on the reload flag is changed to rel-acq ordering to
serve as a synchronisation point between the pmd threads and the control
thread that asks for termination.

Fixes: e4cfed38b159 ("dpif-netdev: Add poll-mode-device thread.")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ian Stokes <ian.stokes@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-07-10 11:45:55 +01:00
Ilya Maximets
f87c135706 vswitchd: Always cleanup userspace datapath.
'netdev' datapath is implemented within ovs-vswitchd process and can
not exist without it, so it should be gracefully terminated with a
full cleanup of resources upon ovs-vswitchd exit.

This change forces dpif cleanup for 'netdev' datapath regardless of
passing '--cleanup' to 'ovs-appctl exit'. This solution makes it
unnecessary to pass this additional option every time for userspace
datapath installations and also avoids terminating the system datapath
in setups where both datapaths run at the same time.

The main part is that dpif_port_del() will lead to netdev_close()
and subsequent netdev_class->destroy(dev) which will stop HW NICs
and free their resources. For vhost-user interfaces it will invoke
vhost driver unregistering with a properly closed vhost-user
connection. For upcoming AF_XDP netdev this will allow to gracefully
destroy xdp sockets and unload xdp programs from linux interfaces.
Another important thing is that port deletion will also trigger
flushing of flows offloaded to HW NICs.

An exception is made for 'internal' ports that could have user ip/route
configuration. These ports will not be removed without '--cleanup'.

This change fixes OVS disappearing from the DPDK point of view
(keeping HW NICs improperly configured, sudden closing of vhost-user
connections) and will help with linux devices clearing with upcoming
AF_XDP netdev support.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-07-02 12:24:47 +03:00
David Marchand
35c91567c8 dpif-netdev: Only poll enabled vhost queues.
We currently poll all available queues based on the max queue count
exchanged with the vhost peer and rely on the vhost library in DPDK to
check the vring status beneath.
This can lead to some overhead when we have a lot of unused queues.

To enhance the situation, we can skip the disabled queues.
On rxq notifications, we make use of the netdev's change_seq number so
that the pmd thread main loop can cache the queue state periodically.
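
The caching idea can be sketched as below. This is a simplified model
with hypothetical names ('struct netdev_sketch',
'struct polled_queue_sketch', count_queues_to_poll()), not the patch
itself, and it compares the sequence number on every call rather than
periodically.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_RXQ_SKETCH 4

/* Device side: 'change_seq' is bumped whenever a queue state changes. */
struct netdev_sketch {
    uint64_t change_seq;
    bool rxq_enabled[MAX_RXQ_SKETCH];
};

/* Pmd side: per-queue cache of the last observed state. */
struct polled_queue_sketch {
    const struct netdev_sketch *dev;
    int qid;
    uint64_t cached_seq;
    bool cached_enabled;
};

/* Refreshes stale cache entries, then counts the queues that would be
 * polled.  Disabled queues are skipped without touching the device. */
static int
count_queues_to_poll(struct polled_queue_sketch *pqs, int n)
{
    int n_polled = 0;

    for (int i = 0; i < n; i++) {
        struct polled_queue_sketch *pq = &pqs[i];

        if (pq->cached_seq != pq->dev->change_seq) {
            pq->cached_seq = pq->dev->change_seq;
            pq->cached_enabled = pq->dev->rxq_enabled[pq->qid];
        }
        if (pq->cached_enabled) {
            n_polled++;
        }
    }
    return n_polled;
}
```

A state change only becomes visible to the pmd once the sequence number
moves, which is why the notification path has to bump it.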

$ ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 1:
  isolated : true
  port: dpdk0             queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 2:
  isolated : true
  port: vhost1            queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost3            queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 15:
  isolated : true
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 16:
  isolated : true
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost2            queue-id:  0 (enabled)   pmd usage:  0 %

$ while true; do
  ovs-appctl dpif-netdev/pmd-rxq-show |awk '
  /port: / {
    tot++;
    if ($5 == "(enabled)") {
      en++;
    }
  }
  END {
    print "total: " tot ", enabled: " en
  }'
  sleep 1
done

total: 6, enabled: 2
total: 6, enabled: 2
...

 # Started vm, virtio devices are bound to kernel driver which enables
 # F_MQ + all queue pairs
total: 6, enabled: 2
total: 66, enabled: 66
...

 # Unbound vhost0 and vhost1 from the kernel driver
total: 66, enabled: 66
total: 66, enabled: 34
...

 # Configured kernel bound devices to use only 1 queue pair
total: 66, enabled: 34
total: 66, enabled: 19
total: 66, enabled: 4
...

 # While rebooting the vm
total: 66, enabled: 4
total: 66, enabled: 2
...
total: 66, enabled: 66
...

 # After shutting down the vm
total: 66, enabled: 66
total: 66, enabled: 2

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-06-26 18:43:39 +01:00
Ilya Maximets
b6cabb8f8f netdev: Split up netdev offloading to separate module.
New module 'netdev-offload' created to manage different flow API
implementations. All the generic and provider independent code moved
there from the 'netdev' module.

Flow API providers further encapsulated.

The only function that was changed is 'netdev_any_oor'.
Now it uses offloading related hmap instead of common 'netdev_shash'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
2019-06-11 09:39:36 +03:00
Ilya Maximets
0da667e345 dpif-netdev: Forbid vport offloading attempts.
'netdev_flow_put()' for vports could eventually succeed for
userspace datapath in case there is a kernel datapath with
similar vport at the same time. The root cause is that vports
like 'vxlan' use the same 'vxlan_sys_<port>' system interfaces
for flow offloading and there is no way to distinguish system
and userspace vports using only the 'netdev' structure.

Let's forbid vport offloading from userspace datapath to avoid
installing userspace flows to unrelated system devices.

Future dynamic flow API management will allow vport offloading to be
re-enabled using more flexible checks.

Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Reported-by: Ophir Munk <ophirmu@mellanox.com>
Acked-By: Roni Bar Yanai <roniba@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
2019-06-06 17:23:58 +03:00
Ilya Maximets
0a5cba6591 dpif-netdev: Fix flow mark leak on port lookup failure.
Flow mark should be properly freed in all error cases.

Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Acked-By: Roni Bar Yanai <roniba@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
2019-06-06 17:23:58 +03:00
Ilya Maximets
eef8538081 dpif-netdev: Fix unsafe access to pmd polling lists.
All accesses to 'pmd->poll_list' should be guarded by
'pmd->port_mutex'. Additionally fixed inappropriate usage
of 'HMAP_FOR_EACH_SAFE' (the hmap doesn't change in the loop)
and dropped the unneeded local variable 'proc_cycles'.

CC: Nitin Katiyar <nitin.katiyar@ericsson.com>
Fixes: 5bf84282482a ("Adding support for PMD auto load balancing")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
2019-05-29 14:31:54 +03:00
Darrell Ball
57593fd243 conntrack: Stop exporting internal datastructures.
Stop the exporting of the main internal conntrack datastructure.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-03 09:46:22 -07:00
Zhantao Fu
0fcf0776c7 Double postponing to free subtables.
Subtable destruction should be double postponed because readers could
still obtain old values while iterating over the pvector implementation
before its new version is published.
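
The effect of double postponing can be modelled with a toy deferred
callback queue. This is not the OVS RCU implementation; postpone(),
grace_period_end() and both callbacks are hypothetical names, and one
call to grace_period_end() stands for one grace period elapsing.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*deferred_cb)(void *arg);

struct deferred {
    deferred_cb cb;
    void *arg;
};

#define MAX_DEFERRED 8
static struct deferred pending[MAX_DEFERRED];
static size_t n_pending;

/* Queues 'cb' to run after the current grace period. */
static bool
postpone(deferred_cb cb, void *arg)
{
    if (n_pending == MAX_DEFERRED) {
        return false;
    }
    pending[n_pending].cb = cb;
    pending[n_pending].arg = arg;
    n_pending++;
    return true;
}

/* Runs only the callbacks queued before this call.  Anything queued by
 * those callbacks waits for the next grace period. */
static void
grace_period_end(void)
{
    struct deferred batch[MAX_DEFERRED];
    size_t n = n_pending;

    for (size_t i = 0; i < n; i++) {
        batch[i] = pending[i];
    }
    n_pending = 0;
    for (size_t i = 0; i < n; i++) {
        batch[i].cb(batch[i].arg);
    }
}

/* Second stage: stands in for the real destruction of the subtable. */
static void
subtable_free_cb(void *freed_flag)
{
    *(bool *) freed_flag = true;
}

/* First stage: does nothing except postpone the real destruction. */
static void
subtable_postpone_again_cb(void *freed_flag)
{
    postpone(subtable_free_cb, freed_flag);
}
```

With postpone(subtable_postpone_again_cb, &freed), the flag is still
false after the first grace_period_end() and only becomes true after
the second one.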

Signed-off-by: Zhantao Fu <fuzhantao@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-23 09:08:51 -07:00
Numan Siddique
5b34f8fc3b Add a new OVS action check_pkt_larger
This patch adds a new action 'check_pkt_larger' which checks if the
packet is larger than the given size and stores the result in the
destination register.

Usage: check_pkt_larger(len)->REGISTER
Eg. match=...,actions=check_pkt_larger(1442)->NXM_NX_REG0[0],next;

This patch makes use of the new datapath action - 'check_pkt_len'
which was recently added in the commit [1].
At the start of ovs-vswitchd, datapath is probed for this action.
If the datapath action is present, then 'check_pkt_larger'
makes use of this datapath action.

Datapath action 'check_pkt_len' takes these nlattrs
      * OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
      * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER (optional) - Nested actions
        to apply if the packet length is greater than the specified 'pkt_len'
      * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL (optional) - Nested
        actions to apply if the packet length is less than or equal to the
        specified 'pkt_len'.

Let's say we have these flows added to an OVS bridge br-int

table=0, priority=100 in_port=1,ip,actions=check_pkt_larger(100)->NXM_NX_REG0[0],resubmit(,1)
table=1, priority=200,in_port=1,ip,reg0=0x1/0x1 actions=output:3
table=1, priority=100,in_port=1,ip,actions=output:4

Then the action 'check_pkt_larger' will be translated as
  - check_pkt_len(size=100,gt(3),le(4))

The datapath checks the packet length: if it is greater than 100, the
packet is output to port 3; otherwise it is output to port 4.
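
The decision made by check_pkt_len(size=100,gt(3),le(4)) can be written
out as a tiny function. This only models the branch described above;
'struct check_pkt_len_sketch' and check_pkt_len_output() are
hypothetical names and the nested action lists are reduced to a single
output port each.

```c
#include <stdint.h>

struct check_pkt_len_sketch {
    uint16_t pkt_len;              /* Length to compare against. */
    uint32_t port_if_greater;      /* Stands in for the gt() actions. */
    uint32_t port_if_less_equal;   /* Stands in for the le() actions. */
};

/* Returns the output port chosen for a packet of 'packet_size' bytes. */
static uint32_t
check_pkt_len_output(const struct check_pkt_len_sketch *a,
                     uint32_t packet_size)
{
    return packet_size > a->pkt_len
           ? a->port_if_greater
           : a->port_if_less_equal;
}
```

Note that a packet of exactly 100 bytes takes the le() branch.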

In case the datapath doesn't support the 'check_pkt_len' action, the OVS
action 'check_pkt_larger' sets SLOW_ACTION so that no datapath flow is added.

This OVS action is intended to be used by OVN to check the packet length
and generate an ICMP packet with type 3, code 4 and next hop mtu
in the logical router pipeline if the MTU of the physical interface
is less than the packet length. More information can be found here [2].

[1] - 4d5ec89fc8
[2] - https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html

Reported-at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Ben Pfaff <blp@ovn.org>
CC: Gregory Rose <gvrose8192@gmail.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-22 12:56:50 -07:00
William Tu
42697ca775 dpif-netdev: fix meter at high packet rate.
When testing a packet rate around 1Mpps with a meter enabled, the meter
action is hit much more frequently, around every 30us.
As a result, the meter's calculation of 'uint32_t delta_t' is always 0
and the meter action has no effect.  This is because commit
05f9e707e194 divides the delta by 1000 in order to convert it to msec
granularity.  The patch fixes this by updating the time only when
crossing a millisecond boundary.
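
The truncation can be reproduced with a simplified model of the time
bookkeeping. The functions below are hypothetical and only follow the
description in this message, not the exact meter code: the first keeps
the last-used time in microseconds and divides each delta by 1000, the
second converts to milliseconds before taking the delta.

```c
#include <stdint.h>

/* Before the fix (simplified): every hit rewrites the microsecond
 * timestamp, so sub-millisecond deltas are truncated to 0 and lost. */
static uint32_t
total_delta_before_fix(uint32_t n_packets, uint32_t gap_us)
{
    long long used_us = 0;
    uint32_t total_ms = 0;

    for (uint32_t i = 1; i <= n_packets; i++) {
        long long now_us = (long long) i * gap_us;
        uint32_t delta_t = (uint32_t) ((now_us - used_us) / 1000);

        used_us = now_us;
        total_ms += delta_t;
    }
    return total_ms;
}

/* After the fix (simplified): time is tracked in milliseconds, so the
 * stored value only moves when a millisecond boundary is crossed. */
static uint32_t
total_delta_after_fix(uint32_t n_packets, uint32_t gap_us)
{
    long long used_ms = 0;
    uint32_t total_ms = 0;

    for (uint32_t i = 1; i <= n_packets; i++) {
        long long now_ms = (long long) i * gap_us / 1000;
        uint32_t delta_t = now_ms > used_ms
                           ? (uint32_t) (now_ms - used_ms) : 0;

        used_ms = now_ms;
        total_ms += delta_t;
    }
    return total_ms;
}
```

For 100 hits spaced 30us apart (3ms in total), the first model
accumulates 0ms while the second accumulates 3ms. With hits spaced 2ms
apart both give the same result.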

Fixes: 05f9e707e194 ("dpif-netdev: Use microsecond granularity.")
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-22 09:51:28 -07:00
Ilya Maximets
af741ca346 dpif-netdev: Update comment about flow installation race.
The userspace datapath has used per-PMD flow tables/classifiers for a
long time. However, it was decided to keep this race window so as not to
block revalidators. The comment should be updated to reflect the current
state.

Fixes: 1c1e46ed8457 ("dpif-netdev: Add per-pmd flow-table/classifier.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-04-18 08:51:46 +01:00