Add check to validate that 'conn_clean()' is only called for
conntrack entries of default 'conn_type'.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The user should only reference a conntrack entry by the forward
direction context, as per 'conntrack_flush()', enforce this by
checking for 'default' conn_type. The likelihood of a user
not using the original tuple is low, but it should be guarded
against, logged and documented.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When fallback to ephemeral ports triggers to find a NAT translation,
it may happen that the full address range is not explored; i.e. if
all ephemeral ports are being used for the address range >= the
first address checked and there are other addresses in the
available range, then they would not be explored for availability.
The likelihood of hitting this condition is rare. The fix is to
reset the first address to the minimum address when starting to
search ephemeral ports. Found by inspection.
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ephemeral port fallback is being done for DNAT and the code could be hit in
some special cases and testing configurations. Also good packets are
expected to be persistently dropped in this case, which is not a common
user goal. Regardless, this is incorrect, so filter this out. Also, rename
the variable used for checking whether ephemeral ports need to be checked.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351629.html
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When conn_update_state() returns true, conn has been freed, so skip calling
handle_ftp_ctl() with this conn and instead follow code path for new
connections.
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
'alg_exp_entry' is allocated on stack memory, but could be used via
'alg_exp' pointer inside 'write_ct_md' function, i.e. outside its scope.
CC: Darrell Ball <dlu998@gmail.com>
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The ipv4 fragmentation check is broken and allows fragments through.
There were fragile and poorly maintainable checks in extract_l3_ipv*
designed to save a few cycles. The checks make assumptions about what
sanity checks may have been done and could be skipped based on inferring
from the value of another paramater that should be unrelated (l4
pointer needing assignment). Since the benefit is minimal, remove
the special checks and always do sanity checks.
Four tests are added to better maintain fragmentation support.
This needs backporting to 2.9.
Fixes: c8b1ad49da68("conntrack: Reorder sanity checks in extract_l3_ipvx().")
Fixes: a489b16854b5("conntrack: New userspace connection tracker.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
This patch adds support of flushing a conntrack entry specified by the
conntrack 5-tuple in dpif-netdev.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Darrell Ball <dlu998@gmail.com>
The ovs_assert() macro always evaluates its argument, even when NDEBUG is
defined so that failure is ignored. This behavior wasn't documented, and
thus a lot of code didn't rely on it. This commit documents the behavior
and simplifies bits of code that heretofore didn't rely on it.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
This supports using the ct_clear action in the kernel datapath. To
preserve compatibility with current ct_clear behavior on old kernels, we
only pass this action down to the datapath if a probe reveals the
datapath actually supports it.
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
The functions extract_l3_ipv4 and extract_l3_ipv6 check for
unsupported ip fragments and return early. The checks were after
an assignment that would not be needed when early return happens.
This is slightly inefficient, but mostly reads poorly.
Hence, reorder the ip fragment checks before the assignments.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Fix up some instances where variable declarations were not close
enough to their use, as these were missed before. This is the
preferred art in OVS code and flagged heavily in code reviews.
This is highly desirable due to code clarity reasons.
There are also some cases where newlines were not needed by prior art
and some cases where they were needed but missed.
There was one case where there was a missing space after "}".
There were a few cases where for loop index declarations could be
folded into the loop.
One function was missing some const qualifiers.
There were a few instances where a local variable for conn_key_hash
could be eliminated.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
In order to support more algs with different requirements,
expectation handling is allowed to handle more cases, such as
a wildcard source ip as in the case of SIP. NAT can also be
skipped in some alg cases.
Expectation_create() was otherwise simplified in the process.
Some renaming was done to support the above changes.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Presently, alg expectations are removed by being time expired.
This was intended to happen before the control connections and
was intended to minimize the extra work involved for tracking and
removing the expectations. This is not the best option since it
should be possible to remove expectations when a control connection
is removed and a new api is in the works to do this. Also, conceptually
an expectation should not exist without a control connection context
and it can be argued that this should be a strict requirement.
The approach is changed to remove the expectations when the control
connections are removed. The previous code to expire the expectations
is removed at the same time.
Fixes: bd5e81a0e ("Userspace Datapath: Add ALG infra and FTP.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341683.html
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A get command is added for number of conntrack connections.
This command is only supported in the userspace datapath
at this time.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Co-authored-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Get and set dpctl commands are added for conntrack maxconns.
These commands are only supported in the userspace
datapath at this time.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Co-authored-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
An address sanity check is done on icmp error packets to
check that the icmp error payload makes sense w.r.t. the
packet itself.
The sanity check was partially incorrect since it tried
to verify the source address of the error packet against the
original destination, which does not makes since the error
can be generated by any intermediate node.
Reported-by: wangzhike <wangzhike@jd.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341609.html
Fixes: a489b1685 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: wangzhike <wangzhike@jd.com>
Co-authored-by: wangzhike <wangzhike@jd.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Presently, alg processing is enabled by default to better exercise code.
This is similar to kernels before 4.7 as well. The recommended default
behavior in the newer kernels is to only process algs if a helper is
supplied in a conntrack rule. The behavior is changed to match the
later kernels.
A test is extended to check that the control connection is still
created in such a case.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Algs can use variable control port numbers for servers.
The main use case is a kind of feeble security measure; the
thinking being by some is that it obscures the alg traffic.
It is really not very effective, but the kernel has this
capability. This patch mimics the capability.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Upcoming requirements for new algs make it desirable to split out
alg helpers more cleanly.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Poll-loop is the core to implement main loop. It should be available in
libopenvswitch.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add an OVS_UNLIKELY and reorder a few variable condition
checks.
Acked-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
These dead assignment warnings do not affect functionality.
In one case, a local variable could be removed and in another
case, the working pointer should be used rather than the start
pointer.
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Reported-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338515.html
Acked-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Close a theoretical race delete/create corner case for alg
reverse conns and add debugging around this that may point to
an intentional exploit, unintentional problem or just a rare
condition. The solution is to keep track of reverse conn via
nat_conn_keys and avoid deleting the reverse conn when it has been
recreated.
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A new debug function is added and used in a
subsequent patch.
Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Create a separate function from existing code, so the
code can be reused in a subsequent patch; no change
in functionality.
Acked-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Conn should be removed from the connection expiry list when
the connection tracker experiences NAT resource exhaustion
and the connection needing NAT mapping cannot get it.
If this is not done, the connection tracker can crash during
cleanup of expired connections by the clean thread.
This crash will be triggered when a established flow do ct(nat)
again, like
"ip,actions=ct(table=1)
table=1,in_port=1,ip,actions=ct(commit,nat(dst=5.5.5.5)),2
table=1,in_port=2,ip,ct_state=+est,actions=1
table=1,in_port=1,ip,ct_state=+est,actions=2"
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Lili Huang <huanglili.huang@huawei.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Current time is passed to conntrack_execute so it doesn't have
to recompute it again.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Acked by: Sugesh Chandran <sugesh.chandran@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
ALG infra and FTP (both V4 and V6) support is added to the userspace
datapath. Also, NAT support is included.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A new function conn_key_cmp() is introduced and used to replace
memcmp of conn_keys. Given that OVS runs on with many compilers and
on many architectures, it seems prudent to avoid memcmp in case
existing and future holes in conn_key are not handled by a given
compiler for a given architecture.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
With the command:
ovs-appctl dpctl/ct-bkts
shows the number of connections per bucket.
By using a threshold:
ovs-appctl dpctl/ct-bkts gt=N
for each bucket shows the number of connections when they
are greater than N.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Avoiding checksum validation in conntrack module if it is already verified
in DPDK physical NIC ports.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
Co-authored-by: Darrell Ball <dball@vmware.com>
Signed-off-by: Darrell Ball <dball@vmware.com>
Acked-by: Antonio Fishetti <antonio.fischetti@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The 'nat' portion of 'nat_resources_lock' is dropped as
this lock will be used by ALGs in a subsequent patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The conntrack context flag 'related' is changed to 'icmp_related'
to disambiguate usage w.r.t. ALGs which are added in a subsequent
patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Fixes some lines exceeding 80 chars and a couple of typos.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Un-nat conns have no nat_info as do default conns.
However, un-nat conns are originally templated from the
corresponding default conns and therefore need to
have their nat_info explicitly nulled. This
otherwise exposes a double free if conntrack_destroy()
were to be used to destroy the connection tracker. This
would apply to cleaning the datapath after testing.
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Greg Rose <gvrose8192@gmail.com>
The function conn_key_hash() is updated to include
a call to hash_finish() and also to make use of a
new hash abstraction - ct_endpoint_hash_add().
Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Part of the hash input for nat_range_hash() was accidentally
omitted, so this fixes the problem. Also, add a missing call to
hash_finish().
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch adds orig tuple checking and context
recovery; NAT interactions are factored in.
Orig tuple support exists to better handle policy
changes.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch includes more complete support
for icmp4 and icmp6 related NAT handling.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch introduces NAT support for the userspace datapath.
Most conntrack module changes are in this patch, with the
exception of icmp related handling and recent orig tuple
support.
The per packet scope of lookups for NAT and un_NAT is at
the bucket level rather than global. One hash table is
introduced to support create/delete handling. The create/delete
events may be further optimized, if the need becomes clear.
Some NAT options with limited utility (persistent, random) are
not supported yet, but will be supported in a later patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Packet batch sorting is removed for three reasons:
1) The following patches for NAT change the locking
marshalling so batching loses benefit.
2) For real mixtures of flows either in hypervisors
or gateways, the batch sorting won't provide benefit
and will just be a tax.
3) Code clarity.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as prepration for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.
The packet_type is a big-endian 32 bit integer with the encoding as
specified in OpenFlow verion 1.5.
The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:
enum ofp_header_type_namespaces {
OFPHTN_ONF = 0, /* ONF namespace. */
OFPHTN_ETHERTYPE = 1, /* ns_type is an Ethertype. */
OFPHTN_IP_PROTO = 2, /* ns_type is a IP protocol number. */
OFPHTN_UDP_TCP_PORT = 3, /* ns_type is a TCP or UDP port. */
OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */
};
The lower 16 bits specify the actual type in the context of the name space.
Only name spaces 0 and 1 will be supported for now.
For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.
In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet
whith the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.
The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the the dl_type field of struct flow.
This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.
New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>