A new debug function is added and used in a
subsequent patch.
Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Create a separate function from existing code, so the
code can be reused in a subsequent patch; no change
in functionality.
Acked-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Conn should be removed from the connection expiry list when
the connection tracker experiences NAT resource exhaustion
and the connection needing NAT mapping cannot get it.
If this is not done, the connection tracker can crash during
cleanup of expired connections by the clean thread.
This crash will be triggered when a established flow do ct(nat)
again, like
"ip,actions=ct(table=1)
table=1,in_port=1,ip,actions=ct(commit,nat(dst=5.5.5.5)),2
table=1,in_port=2,ip,ct_state=+est,actions=1
table=1,in_port=1,ip,ct_state=+est,actions=2"
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Lili Huang <huanglili.huang@huawei.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Current time is passed to conntrack_execute so it doesn't have
to recompute it again.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Acked by: Sugesh Chandran <sugesh.chandran@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
ALG infra and FTP (both V4 and V6) support is added to the userspace
datapath. Also, NAT support is included.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A new function conn_key_cmp() is introduced and used to replace
memcmp of conn_keys. Given that OVS runs on with many compilers and
on many architectures, it seems prudent to avoid memcmp in case
existing and future holes in conn_key are not handled by a given
compiler for a given architecture.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
With the command:
ovs-appctl dpctl/ct-bkts
shows the number of connections per bucket.
By using a threshold:
ovs-appctl dpctl/ct-bkts gt=N
for each bucket shows the number of connections when they
are greater than N.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Avoiding checksum validation in conntrack module if it is already verified
in DPDK physical NIC ports.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
Co-authored-by: Darrell Ball <dball@vmware.com>
Signed-off-by: Darrell Ball <dball@vmware.com>
Acked-by: Antonio Fishetti <antonio.fischetti@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The 'nat' portion of 'nat_resources_lock' is dropped as
this lock will be used by ALGs in a subsequent patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The conntrack context flag 'related' is changed to 'icmp_related'
to disambiguate usage w.r.t. ALGs which are added in a subsequent
patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Fixes some lines exceeding 80 chars and a couple of typos.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Un-nat conns have no nat_info as do default conns.
However, un-nat conns are originally templated from the
corresponding default conns and therefore need to
have their nat_info explicitly nulled. This
otherwise exposes a double free if conntrack_destroy()
were to be used to destroy the connection tracker. This
would apply to cleaning the datapath after testing.
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Greg Rose <gvrose8192@gmail.com>
The function conn_key_hash() is updated to include
a call to hash_finish() and also to make use of a
new hash abstraction - ct_endpoint_hash_add().
Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Part of the hash input for nat_range_hash() was accidentally
omitted, so this fixes the problem. Also, add a missing call to
hash_finish().
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch adds orig tuple checking and context
recovery; NAT interactions are factored in.
Orig tuple support exists to better handle policy
changes.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch includes more complete support
for icmp4 and icmp6 related NAT handling.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch introduces NAT support for the userspace datapath.
Most conntrack module changes are in this patch, with the
exception of icmp related handling and recent orig tuple
support.
The per packet scope of lookups for NAT and un_NAT is at
the bucket level rather than global. One hash table is
introduced to support create/delete handling. The create/delete
events may be further optimized, if the need becomes clear.
Some NAT options with limited utility (persistent, random) are
not supported yet, but will be supported in a later patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Packet batch sorting is removed for three reasons:
1) The following patches for NAT change the locking
marshalling so batching loses benefit.
2) For real mixtures of flows either in hypervisors
or gateways, the batch sorting won't provide benefit
and will just be a tax.
3) Code clarity.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as prepration for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.
The packet_type is a big-endian 32 bit integer with the encoding as
specified in OpenFlow verion 1.5.
The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:
enum ofp_header_type_namespaces {
OFPHTN_ONF = 0, /* ONF namespace. */
OFPHTN_ETHERTYPE = 1, /* ns_type is an Ethertype. */
OFPHTN_IP_PROTO = 2, /* ns_type is a IP protocol number. */
OFPHTN_UDP_TCP_PORT = 3, /* ns_type is a TCP or UDP port. */
OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */
};
The lower 16 bits specify the actual type in the context of the name space.
Only name spaces 0 and 1 will be supported for now.
For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.
In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet
whith the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.
The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the the dl_type field of struct flow.
This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.
New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Otherwise a malformed packet could cause a read up to about 40 bytes past
the end of the packet. The packet would still likely be dropped because
of checksum verification.
Reported-by: Bhargava Shastry <bshastry@sec.t-labs.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
ICMP error packets (e.g. destination unreachable messages) are
considered 'related' to another connection and are treated as part of
that.
However:
* We shouldn't create new entries in the connection table if the
original connection is not found. This is consistent with what the
kernel does.
* We certainly shouldn't call valid_new() on the packet, because
valid_new() assumes the packet l4 type (might be TCP, UDP or ICMP)
to be consistent with the conn_key nw_proto type.
Found by inspection.
Fixes: a489b16854b5("conntrack: New userspace connection tracker.")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Darrell Ball <dlu998@gmail.com>
Now that dpif_execute has a 'flow' member, it's pretty easy to access a
the flow (or the matching megaflow) in dp_execute_cb().
This means that's not necessary anymore for the connection tracker to
reextract 'dl_type' from the packet, it can be passed as a parameter.
This change means that we have to complicate sightly test-conntrack to
group the packets by dl_type before passing them to the connection
tracker.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
From the connection tracker perspective, an ICMP connection is a tuple
identified by source ip address, destination ip address and ICMP id.
While this allows basic ICMP traffic (pings) to work, it doesn't take
into account the icmp type: the connection tracker will allow
requests/replies in any directions.
This is improved by making the ICMP type and code part of the connection
tuple. An ICMP echo request packet from A to B, will create a
connection that matches ICMP echo request from A to B and ICMP echo
replies from B to A. The same is done for timestamp and info
request/replies, and for ICMPv6.
A new modules conntrack-icmp is implemented, to allow only "request"
types to create new connections.
Also, since they're tracked in both userspace and kernel
implementations, ICMP type and code are always printed in ct-dpif (a few
testcase are updated as a consequence).
Reported-by: Subramani Paramasivam <subramani.paramasivam@wipro.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
New functions are implemented in the conntrack module to support this.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
New functions are implemented in the conntrack module to support this.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
This commit adds a thread that periodically removes expired connections.
The expiration time of a connection can be expressed by:
expiration = now + timeout
For each possible 'timeout' value (there aren't many) we keep a list.
When the expiration is updated, we move the connection to the back of the
corresponding 'timeout' list. This ways, the list is always ordered by
'expiration'.
When the cleanup thread iterates through the lists for expired
connections, it can stop at the first non expired connection.
Suggested-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
This commit adds the conntrack module.
It is a connection tracker that resides entirely in userspace. Its
primary user will be the dpif-netdev datapath.
The module main goal is to provide conntrack_execute(), which offers a
convenient interface to implement the datapath ct() action.
The conntrack module uses two submodules to deal with the l4 protocol
details (conntrack-other for UDP and ICMP, conntrack-tcp for TCP).
The conntrack-tcp submodule implementation is adapted from FreeBSD's pf
subsystem, therefore it's BSD licensed. It has been slightly altered to
match the OVS coding style and to allow the pickup of already
established connections.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
Acked-by: Joe Stringer <joe@ovn.org>