2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-30 05:47:55 +00:00

113 Commits

Author SHA1 Message Date
Paolo Valerio
2c597c8900 conntrack: add coverage counters for L3 bad checksum.
similarly to what already exists for L4, add conntrack_l3csum_err
and ipf_l3csum_err for L3.

Received packets with L3 bad checksum will increase respectively
ipf_l3csum_err if they are fragments and conntrack_l3csum_err
otherwise.

Although the patch basically covers IPv4, the names are kept generic.

Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-06-30 23:56:03 +02:00
Paolo Valerio
76700f374a conntrack: Increment coverage counter for all bad checksum cases.
conntrack_l4csum_err gets incremented only when corrupted icmp pass
through conntrack.  Increase it for the remaining bad checksum cases
including when checksum is offloaded.

Fixes: 38c69ccf8e29 ("conntrack: Add coverage count for l4csum error.")
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-06-30 23:56:03 +02:00
Flavio Leitner
79349cbab0 flow: Support extra padding length.
Although not required, padding can be optionally added until
the packet length is MTU bytes. A packet with extra padding
currently fails sanity checks.

Vulnerability: CVE-2020-35498
Fixes: fa8d9001a624 ("miniflow_extract: Properly handle small IP packets.")
Reported-by: Joakim Hindersson <joakim.hindersson@elastx.se>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-02-10 14:59:55 +01:00
Ben Pfaff
3eec7fb075 pcap-file: Fix calculation of TCP payload length in tcp_reader_run().
The calculation in tcp_reader_run() failed to account for L2 padding.
This fixes the problem, by moving the existing function
tcp_payload_length() from a conntrack private header file into
dp-packet.h and renaming it to suit the dp_packet style.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
2021-02-02 09:59:31 -08:00
Eelco Chaudron
a27d70a898 conntrack: add generic IP protocol support
Currently, userspace conntrack only tracks TCP, UDP, and ICMP, and all
other IP protocols are discarded, and the +inv state is returned. This
is not in line with the kernel conntrack. Where if no L4 information can
be extracted it's treated as generic L3. The change below mimics the
behavior of the kernel.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-12-21 18:50:52 +01:00
Ben Pfaff
f51cf36d86 conntrack: Rename "master" connection to "parent" connection.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
2020-10-21 11:24:15 -07:00
William Tu
2078901a4c userspace: Add conntrack timeout policy support.
Commit 1f1613183733 ("ct-dpif, dpif-netlink: Add conntrack timeout
policy support") adds conntrack timeout policy for kernel datapath.
This patch enables support for the userspace datapath.  I tested
using the 'make check-system-userspace' which checks the timeout
policies for ICMP and UDP cases.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
2020-05-01 08:22:45 -07:00
William Tu
38c69ccf8e conntrack: Add coverage count for l4csum error.
Add a coverage counter when userspace conntrack receives a packet
with invalid l4 checksum.  When using veth for testing, users
often forget to turn off the tx offload on the other side of the
namespace, causing l4 checksum not calculated in packet header,
and at conntrack, return invalid conntrack state.

Suggested-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
2020-04-17 11:58:00 -07:00
Dumitru Ceara
af8169b42d conntrack: Reset ct_state when entering a new zone.
When a new conntrack zone is entered, the ct_state field is zeroed in
order to avoid using state information from different zones.

One such scenario is when a packet is double NATed. Assuming two zones
and 3 flows performing the following actions in order on the packet:
1. ct(zone=5,nat), recirc
2. ct(zone=1), recirc
3. ct(zone=1,nat)

If at step #1 the packet matches an existing NAT entry, it will get
translated and pkt->md.ct_state is set to CS_DST_NAT or CS_SRC_NAT.
At step #2 the new tuple might match an existing connection and
pkt->md.ct_zone is set to 1.
If at step #3 the packet matches an existing NAT entry in zone 1,
handle_nat() will be called to perform the translation but it will
return early because the packet's zone matches the conntrack zone and
the ct_state field still contains CS_DST_NAT or CS_SRC_NAT from the
translations in zone 5.

In order to reliably detect when a packet enters a new conntrack zone
we also need to make sure that the pkt->md.ct_zone is properly
initialized if pkt->md.ct_state is non-zero. This already happens for
most cases. The only exception is when matched conntrack connection is
of type CT_CONN_TYPE_UN_NAT and the master connection is missing. To
cover this path we now call write_ct_md() in that case too. Remove
setting the CS_TRACKED flag as in this case as it will be done by the
new call to write_ct_md().

CC: Darrell Ball <dlu998@gmail.com>
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-03-24 15:49:12 +01:00
William Tu
9a8a18f9fa conntrack: Fix NULL pointer dereference.
Coverity CID 279957 reports NULL pointer derefence when
'conn' is NULL and calling ct_print_conn_info.

Cc: Usman Ansari <uansari@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
2020-03-19 16:59:23 -07:00
Yi-Hung Wei
a867c010ee conntrack: Fix conntrack new state
In connection tracking system, a connection is established if we
see packets from both directions.  However, in userspace datapath's
conntrack, if we send a connection setup packet in one direction
twice, it will make the connection to be in established state.

This patch fixes the aforementioned issue, and adds a system traffic
test for UDP and TCP traffic to avoid regression.

Fixes: a489b16854b59 ("conntrack: New userspace connection tracker.")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
2020-01-29 10:41:43 -08:00
Flavio Leitner
29cf9c1b3b userspace: Add TCP Segmentation Offload support
Abbreviated as TSO, TCP Segmentation Offload is a feature which enables
the network stack to delegate the TCP segmentation to the NIC reducing
the per packet CPU overhead.

A guest using vhostuser interface with TSO enabled can send TCP packets
much bigger than the MTU, which saves CPU cycles normally used to break
the packets down to MTU size and to calculate checksums.

It also saves CPU cycles used to parse multiple packets/headers during
the packet processing inside virtual switch.

If the destination of the packet is another guest in the same host, then
the same big packet can be sent through a vhostuser interface skipping
the segmentation completely. However, if the destination is not local,
the NIC hardware is instructed to do the TCP segmentation and checksum
calculation.

It is recommended to check if NIC hardware supports TSO before enabling
the feature, which is off by default. For additional information please
check the tso.rst document.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Tested-by: Ciara Loftus <ciara.loftus.intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2020-01-17 22:27:25 +00:00
Darrell Ball
a7f33fdbfb conntrack: Support zone limits.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-03 10:11:13 -08:00
Ilya Maximets
361a47d669 flow: Fix using pointer to member of packed struct icmp6_hdr.
OVS has no structure definition for ICMPv6 header with additional
data. More precisely, it has, but this structure named as
'icmp6_error_header' and only suitable to store error related
extended information.  'flow_compose_l4' stores additional
information in reserved bits by using system defined structure
'icmp6_hdr', which is marked as 'packed' and this leads to
build failure with gcc >= 9:

  lib/flow.c:3041:34: error:
    taking address of packed member of 'struct icmp6_hdr' may result
    in an unaligned pointer value [-Werror=address-of-packed-member]

        uint32_t *reserved = &icmp->icmp6_dataun.icmp6_un_data32[0];
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix that by renaming 'icmp6_error_header' to 'icmp6_data_header'
and allowing it to store not only errors, but any type of additional
information by analogue with 'struct icmp6_hdr'.
All the usages of 'struct icmp6_hdr' replaced with this new structure.
Removed redundant conversions between network and host representations.
Now fields are always in be.

This also, probably, makes flow_compose_l4 more robust by avoiding
possible unaligned accesses to 32 bit value.

Fixes: 9b2b84973db7 ("Support for match & set ICMPv6 reserved and options type fields")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2019-10-10 11:11:00 +02:00
Darrell Ball
64207120c8 conntrack: Add option to disable TCP sequence checking.
This may be needed in some special cases, such as to support some hardware
offload implementations.  Note that disabling TCP sequence number
verification is not an optimization in itself, but supporting some
hardware offload implementations may offer better performance.  TCP
sequence number verification is enabled by default.  This option is only
available for the userspace datapath.  Access to this option is presently
provided via 'dpctl' commands as the need for this option is quite node
specific, by virtue of which nics are in use on a given node.  A test is
added to verify this option.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359188.html
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 12:11:32 -07:00
Darrell Ball
594570ea1c conntrack: Optimize recirculations.
Cache the 'conn' context and use it when it is valid.  The cached 'conn'
context will get reset if it is not expected to be valid; the cost to do
this is negligible.  Besides being most optimal, this also handles corner
cases, such as decapsulation leading to the same tuple, as in tunnel VPN
cases.  A negative test is added to check the resetting of the cached
'conn'.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 08:58:11 -07:00
Darrell Ball
ba5ca28409 conntrack: Fix 'reverse_nat_packet()' variable datatype.
The datatype 'pad' in the function 'reverse_nat_packet()' was incorrectly
declared as 'char' instead of 'uint8_t'. This can affect reverse natting
of icmpX packets with padding > 127 bytes.  At the same time, add some
comments regarding 'extract_l3_ipvX' usage in this function.  Found by
inspection.

Fixes: edd1bef468c0 ("dpdk: Add more ICMP Related NAT support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-24 13:33:41 -07:00
Darrell Ball
a0b89c5139 conntrack: Fix 'check_orig_tuple()' Valgrind false positive.
Valgrind reported that 'pkt->md.ct_orig_tuple.ipv4.ipv4_proto' is
uninitialized in 'check_orig_tuple()', if 'ct_state' is zero.  Although
this is true, the check is superceded, as even if it succeeds the check
for natted packets based on 'ct_state' is an ORed condition and is intended
to catch this case.
The check is '!(pkt->md.ct_state & (CS_SRC_NAT | CS_DST_NAT))' which
filters out all packets excepted natted ones.  Move this check up to
prevent the Valgrind complaint, which also helps performance and also remove
recenlty added redundant check adding extra cycles.

Fixes: f44733c527da ("conntrack: Validate accessing of conntrack data in pkt_metadata.")
CC: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-24 12:51:30 -07:00
Yifeng Sun
f44733c527 conntrack: Validate accessing of conntrack data in pkt_metadata
Valgrind reported:

1305: ofproto-dpif - conntrack - ipv6

==26942== Conditional jump or move depends on uninitialised value(s)
==26942==    at 0x587C00: check_orig_tuple (conntrack.c:1006)
==26942==    by 0x587C00: process_one (conntrack.c:1141)
==26942==    by 0x587C00: conntrack_execute (conntrack.c:1220)
==26942==    by 0x47B00F: dp_execute_cb (dpif-netdev.c:7305)
==26942==    by 0x4AF756: odp_execute_actions (odp-execute.c:794)
==26942==    by 0x477532: dp_netdev_execute_actions (dpif-netdev.c:7349)
==26942==    by 0x477532: handle_packet_upcall (dpif-netdev.c:6630)
==26942==    by 0x477532: fast_path_processing (dpif-netdev.c:6726)
==26942==    by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
==26942==    by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
==26942==    by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==26942==    by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
==26942==    by 0x4324E7: type_run (ofproto-dpif.c:342)
==26942==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==26942==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
==26942==    by 0x410CF3: bridge_run (bridge.c:3029)
==26942==    by 0x407614: main (ovs-vswitchd.c:127)
==26942==  Uninitialised value was created by a heap allocation
==26942==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26942==    by 0x532574: xmalloc (util.c:138)
==26942==    by 0x46CD62: dp_packet_new (dp-packet.c:153)
==26942==    by 0x4A0431: eth_from_flow_str (netdev-dummy.c:1644)
==26942==    by 0x4A0431: netdev_dummy_receive (netdev-dummy.c:1783)
==26942==    by 0x531990: process_command (unixctl.c:308)
==26942==    by 0x531990: run_connection (unixctl.c:342)
==26942==    by 0x531990: unixctl_server_run (unixctl.c:393)
==26942==    by 0x40761E: main (ovs-vswitchd.c:128)

1316: ofproto-dpif - conntrack - tcp port reuse

==24039== Conditional jump or move depends on uninitialised value(s)
==24039==    at 0x587BF5: check_orig_tuple (conntrack.c:1004)
==24039==    by 0x587BF5: process_one (conntrack.c:1141)
==24039==    by 0x587BF5: conntrack_execute (conntrack.c:1220)
==24039==    by 0x47B02F: dp_execute_cb (dpif-netdev.c:7306)
==24039==    by 0x4AF7A6: odp_execute_actions (odp-execute.c:794)
==24039==    by 0x47755B: dp_netdev_execute_actions (dpif-netdev.c:7350)
==24039==    by 0x47755B: handle_packet_upcall (dpif-netdev.c:6631)
==24039==    by 0x47755B: fast_path_processing (dpif-netdev.c:6727)
==24039==    by 0x47935C: dp_netdev_input__ (dpif-netdev.c:6815)
==24039==    by 0x479AD8: dp_netdev_input (dpif-netdev.c:6853)
==24039==    by 0x479AD8: dp_netdev_process_rxq_port
(dpif-netdev.c:4287)
==24039==    by 0x47A6C9: dpif_netdev_run (dpif-netdev.c:5264)
==24039==    by 0x4324F7: type_run (ofproto-dpif.c:342)
==24039==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==24039==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
==24039==    by 0x410CF3: bridge_run (bridge.c:3029)
==24039==    by 0x407614: main (ovs-vswitchd.c:127)
==24039==  Uninitialised value was created by a heap allocation
==24039==    at 0x4C2DB8F: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24039==    by 0x5325C4: xmalloc (util.c:138)
==24039==    by 0x46D144: dp_packet_new (dp-packet.c:153)
==24039==    by 0x46D144: dp_packet_new_with_headroom (dp-packet.c:163)
==24039==    by 0x51191E: eth_from_hex (packets.c:498)
==24039==    by 0x4A03B9: eth_from_packet (netdev-dummy.c:1609)
==24039==    by 0x4A03B9: netdev_dummy_receive (netdev-dummy.c:1765)
==24039==    by 0x5319E0: process_command (unixctl.c:308)
==24039==    by 0x5319E0: run_connection (unixctl.c:342)
==24039==    by 0x5319E0: unixctl_server_run (unixctl.c:393)
==24039==    by 0x40761E: main (ovs-vswitchd.c:128)

According to comments in pkt_metadata_init(), conntrack data is valid
only if pkt_metadata.ct_state != 0. This patch prevents
check_orig_tuple() get called when conntrack data is uninitialized.

Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-19 09:24:12 -07:00
Darrell Ball
6c2a93064a conntrack: Fix ICMPv4 error data L4 length check.
The ICMPv4 error data L4 length check was found to be too strict for TCP,
expecting a minimum of 20 rather than 8 bytes.  This worked by
hapenstance for other inner protocols.  The approach is to explicitly
handle the ICMPv4 error data L4 length check and to do this for all
supported inner protocols in the same way.  Making the code common
between protocols also allows the existing ICMPv4 related UDP tests to
cover TCP and ICMP inner protocol cases.
Note that ICMPv6 does not have an 8 byte limit for error L4 data.

Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto <diproiettod@ovn.org>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361949.html
Reported-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-08-29 07:25:11 -07:00
solomon
e32cd4c629 conntrack: ignore port for ICMP/ICMPv6 NAT.
ICMP/ICMPv6 fails, if the src/dst port is set in a common NAT rule.
For example:
actions=ct(nat(dst=172.16.1.100:5000),commit,table=40)

Fixes: 4cd0481c9e8b ("conntrack: Fix wasted work for ICMP NAT.")
CC: Darrell Ball <dlu998@gmail.com>
Signed-off-by: solomon <liwei.solomon@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-07 11:29:13 -07:00
Darrell Ball
4048c5088f conntrack: Add 'conn_lookup()' api.
The new api incorporates the hash calculation which can be a
distraction.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-05 14:58:46 -07:00
Darrell Ball
28274f774a conntrack: Fix missed 'conn' lookup checks.
Whenever a 'conn' entry is removed or added, we need to reverify it's
existence status under lock protection.  There were some cases that
were missed, so fix them.

Fixes: 967bb5c5cd90 ("conntrack: Add rcu support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-05 14:58:44 -07:00
Darrell Ball
5f918a8a4d conntrack: Don't re-add cleaned 'conn' to expiry list.
When a 'conn' entry is cleaned up from an expiry list, we don't
want to put it back during an update.  Hence, we detect and block this.

Fixes: 967bb5c5cd90 ("conntrack: Add rcu support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-06-05 14:58:43 -07:00
Darrell Ball
f1a0469ee9 conntrack: Expand 'conn_to_ct_dpif_entry()' locking.
When displaying a connection entry, several TCP fields are read
from a connection entry. Hence, expand the 'conn' locking so the display
does not potentially include fields values from different aggregate
states.

Fixes: 967bb5c5cd90 ("conntrack: Add rcu support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-24 12:50:04 -07:00
Darrell Ball
967bb5c5cd conntrack: Add rcu support.
For performance and code simplification reasons, add rcu support for
conntrack. The array of hmaps is replaced by a cmap as part of this
conversion.  Using a single map also simplifies the handling of NAT
and allows the removal of the nat_conn map and friends.  Per connection
entry locks are introduced, which are needed in a few code paths.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-09 15:18:25 -07:00
Darrell Ball
21ffe40970 conntrack: Free conntrack context in 'conntrack_destroy()'.
Fixes: 57593fd24378 ( conntrack: Stop exporting internal datastructures.)
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-06 08:26:40 -07:00
Darrell Ball
57593fd243 conntrack: Stop exporting internal datastructures.
Stop the exporting of the main internal conntrack datastructure.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-05-03 09:46:22 -07:00
Darrell Ball
c61a85f6c3 conntrack: Fix minimum connections to clean.
If there is low maximum connection count configuration and less than 10
connections in a bucket, the calculation of the maximum number of
connections to clean for the bucket could be zero, leading to these
connections not being cleaned until and if the connection count in the
bucket increases.

Fix this by checking for low maximum connection count configuration
and do this outside of the buckets loop, thereby simplifying the loop.

Fixes: e6ef6cc6349b ("conntrack: Periodically delete expired connections.")
CC: Daniele Di Proietto <diproiettod@ovn.org>
Reported-by: Liujiaxin <liujiaxin.2019@bytedance.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357703.html
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-04-16 13:38:48 -07:00
Darrell Ball
82b9ac94bb conntrack: Replace structure copy by memcpy().
There are a few cases where structure copy can be replaced by
memcpy(), for possible portability benefit.  This is because
the structures involved have padding and elements of the
structure are used to generate hashes.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-03-15 15:38:52 -07:00
Darrell Ball
901a0dad38 conntrack: Lookup only 'UNNAT conns' in 'nat_clean()'.
When freeing 'UNNAT conns', lookup only 'UNNAT conns' to
protect against possible address overlap with 'default
conns' during a DOS attempt.  This is very unlikely, but
protection is simple.

Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-03-15 15:38:51 -07:00
Darrell Ball
a720a7fa80 conntrack: Fix race for NAT cleanup.
Reference lists are not fully protected during cleanup of
NAT connections where the bucket lock is transiently not held during
list traversal.  This can lead to referencing freed memory during
cleaning from multiple contexts.  Fix this by protecting with
the existing 'cleanup' mutex in the missed cases where 'conn_clean()'
is called.  'conntrack_flush()' is converted to expiry list traversal
to support the proper bucket level protection with the 'cleanup' mutex.

The NAT exhaustion case cleanup in 'conn_not_found()' is also modified
to avoid the same issue.

Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Reported-by: solomon <liwei.solomon@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357056.html
Tested-by: solomon <liwei.solomon@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-03-15 15:38:49 -07:00
Darrell Ball
1c8689d78b conntrack: Consolidate 2 selection statements.
No functional change.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-25 17:03:57 -08:00
Darrell Ball
32b2c81f26 conntrack: Skip ephemeral ports with specified port range.
This patch removes the fallback to ephemeral ports when a SNAT port
range is specified;  DNAT already does not fallback to ephemeral ports,
in general.  This is not restrictive to the user and makes it easier to
limit NAT L4 port selection.

The documentation is updated and a new test is added to enforce the
behavior.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-February/356607.html
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-25 16:14:04 -08:00
Darrell Ball
4cd0481c9e conntrack: Fix wasted work for ICMP NAT.
ICMPv4 and ICMPv6 are not subject to port address translation (PAT),
however, a loop increments a local variable unnecessarily for
ephemeral ports, resulting in wasted work for ICMPv4 and ICMPv6 packets
subject to NAT.  Fix this by checking for PAT being enabled before
incrementing the local port variable and bail out otherwise.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-25 16:14:01 -08:00
Darrell Ball
76d85771f0 conntrack: Fix L4 csum for V6 extension hdr pkts.
It is a day one issue that got copied to subsequent code.

Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.")
Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
CC: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-22 17:55:45 -08:00
Darrell Ball
cda1b109c2 conntrack: Simplify 'ct_addr'.
Remove the struct wrapper and remove the unneeded union members.
There may even be a portability benefit here because of the
type punning.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-14 14:21:45 -08:00
Darrell Ball
763b40b0f7 conntrack: Remove redundant call to 'hash_finish()'.
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-14 14:21:19 -08:00
Darrell Ball
4ea96698f6 Userspace datapath: Add fragmentation handling.
Fragmentation handling is added for supporting conntrack.
Both v4 and v6 are supported.

After discussion with several people, I decided to not store
configuration state in the database to be more consistent with
the kernel in future, similarity with other conntrack configuration
which will not be in the database as well and overall simplicity.
Accordingly, fragmentation handling is enabled by default.

This patch enables fragmentation tests for the userspace datapath.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-14 14:18:56 -08:00
Darrell Ball
51b9a533e1 conntrack: Reword conntrack_execute() description.
Use 'must' instead of 'should'.

Suggested-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-14 11:39:20 -08:00
Darrell Ball
523464abb2 flow: Enhance parse_ipv6_ext_hdrs.
Acked-by: Justin Pettit <jpettit@ovn.org>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-14 11:39:18 -08:00
Darrell Ball
9171c63532 conntrack: Exclude l2 padding in 'conn_key_extract()'.
'conn_key_extract()' in userspace conntrack is including L2
(Ethernet) pad bytes for both L3 and L4 sizes. One problem is
any packet with non-zero L2 padding can incorrectly fail L4
checksum validation.

This patch fixes conn_key_extract() by ignoring L2 pad bytes.

Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto <diproiettod@ovn.org>
Co-authored-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com>
Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 19:30:10 -08:00
Li RongQing
78a0b2721f conntrack: Remove unnecessary check in process_ftp_ctl_v4
It has been assured that both first and second int from ftp
command are not bigger than 255, so their combination(first
int << 8 +second int) must not bigger than 65535

Co-authored-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Cc: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-11 17:55:32 -08:00
Darrell Ball
c3f6bae258 conntrack: Fix possibly uninitialized memory.
There are a few cases where struct 'conn_key' padding may be unspecified
according to the C standard.  Practically, it seems implementations don't
have issue, but it is better to be safe. The code paths modified are not
hot ones.  Fix this by doing a memcpy in these cases in lieu of a
structure copy.

Found by inspection.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 16:19:17 -08:00
Darrell Ball
298530b878 conntrack: Fix max size for inet_ntop() call.
The call to inet_ntop() in repl_ftp_v6_addr() is 1 short to handle
the maximum possible V6 address size for v4 mapping case.

Found by inspection.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 09:42:50 -08:00
Darrell Ball
cd7c99a6aa conntrack: fix ftp ipv4 address substitution.
When replacing the ipv4 address in repl_ftp_v4_addr(), the remaining size
was incorrectly calculated which could lead to the wrong replacement
adjustment.

This goes unnoticed most of the time, unless you choose carefully your
initial and replacement addresses.

Example fail address combination with 10.1.1.200 DNAT'd to 10.1.100.1.

Fix this by doing something similar to V6 and also splicing out common
code for better coverage and maintainability.

A test is updated to exercise different initial and replacement addresses
and another test is added.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Reported-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-02-04 09:42:49 -08:00
Darrell Ball
d13d711503 conntrack: Fix FTP seq_skew boundary adjustments.
At the same time, splice out a function and also rely on the compiler
for overflow/underflow handling.

Found by inspection.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-01-18 16:16:17 -08:00
David Marchand
efa29a8910 conntrack: fix expectations for ftp+DNAT.
When configuring the nat part of an expectation, care must be taken to
look at the master nat action and direction to properly reproduce it.

DNAT tests have been added to both active and passive modes, all
ftp/tftp tests titles have been updated to reflect they are dealing with
SNAT.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-01-18 16:16:16 -08:00
David Marchand
253e4dc068 conntrack: fix tcp seq adjustments when mangling commands.
The ftp alg deals with packets in two ways for the command connection:
either they are inspected and can be mangled when nat is enabled
(CT_FTP_CTL_INTEREST) or they just go through without being modified
(CT_FTP_CTL_OTHER).

For CT_FTP_CTL_INTEREST packets, we must both adjust the packet tcp seq
number by the connection current offset, then prepare for the next
packets by setting an accumulated offset in the ct object.  However,
this was not done for multiple CT_FTP_CTL_INTEREST packets for the same
connection.
This is relevant for handling multiple child data connections that also
need natting.

The tests are updated so that some ftp+NAT tests send multiple port
commands or other similar commands for a single control connection.
Wget is not able to do this, so switch to lftp.

Fixes: bd5e81a0e596 ("Userspace Datapath: Add ALG infra and FTP.")
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-01-18 16:16:10 -08:00
Darrell Ball
faa0826dd9 conntrack: Keep Address Sanitizer happy.
An Address Sanitizer false positive.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-12-18 14:22:49 -08:00