The patch avoids the extra allocation for nat_conn.
Currently, when doing NAT, the userspace conntrack will use an extra
conn for the two directions in a flow. However, each conn has actually
the two keys for both orig and rev directions. This patch introduces a
key_node[CT_DIRS] member as per Aaron's suggestion in the conn which
consists of a key, direction, and a cmap_node for hash lookup so
addressing the feedback received by the original patch [0].
With this adjustment, we also remove the assertion that connections in
the table are DEFAULT while updating connection state and/or removing
connections.
[0] https://patchwork.ozlabs.org/project/openvswitch/patch/20201129033255.64647-2-hepeng.0320@bytedance.com/
Reported-by: Michael Plato <michael.plato@tu-berlin.de>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-September/052065.html
Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Co-authored-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Tested-by: Frode Nordahl <frode.nordahl@canonical.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
A lock is taken during conn_lookup() to check whether a connection is
expired before returning it. This lock can have some contention.
Even though this lock ensures a consistent sequence of writes, it does
not imply a specific order. A ct_clean thread taking the lock first
could read a value that would be updated immediately after by a PMD
waiting on the same lock, just as well as the inverse order.
As such, the expiration time can be stale anytime it is read. In this
context, using an atomic will ensure the same guarantees for either
writes or reads, i.e. writes are consistent and reads are not undefined
behaviour. Reading an atomic is however less costly than taking and
releasing a lock.
Signed-off-by: Gaetan Rivet <grive@u256.net>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This patch aims to replace the expiration lists as, due to the way
they are used, besides being a source of contention, they have a known
issue when used with non-default policies for different zones that
could lead to retaining expired connections potentially for a long
time.
This patch replaces them with an array of rculist used to distribute
all the newly created connections in order to, during the sweeping
phase, scan them without locking, and evict the expired connections
only locking during the actual removal. This allows to reduce the
contention introduced by the pushback performed at every packet
update, also solving the issue related to zones and timeout policies.
Signed-off-by: Gaetan Rivet <grive@u256.net>
Co-authored-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Multiple lookups are done to stored timeout policies, each time blocking
the global 'ct_lock'. This is usually not necessary and it should be
acceptable to get policy updates slightly delayed (by one RCU sync
at most). Using a CMAP reduces multiple lock taking and releasing in
the connection insertion path.
Signed-off-by: Gaetan Rivet <grive@u256.net>
Reviewed-by: Eli Britstein <elibr@nvidia.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*conn_update_expiration* violates the lock order of conn->lock and
ct->lock. In the comments of conntrack, the conn->lock should be
held after ct->lock when ct->lock needs to be taken.
Fixes: 2078901a4c142 ("userspace: Add conntrack timeout policy support.")
Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Commit 1f1613183733 ("ct-dpif, dpif-netlink: Add conntrack timeout
policy support") adds conntrack timeout policy for kernel datapath.
This patch enables support for the userspace datapath. I tested
using the 'make check-system-userspace' which checks the timeout
policies for ICMP and UDP cases.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>