mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

Author	SHA1	Message	Date
Ihar Hrachyshka	c62b4ac8f8	ovs-ofctl: Implement compose-packet --bare [--bad-csum]. With --bare, it will produce a bare hexified payload with no spaces or offset indicators inserted, which is useful in tests to produce frames to pass to e.g. `ovs-ofctl receive`. With --bad-csum, it will produce a frame that has an invalid IP checksum (applicable to IPv4 only because IPv6 doesn't have checksums.) The command is now more useful in tests, where we may need to produce hex frame payloads to compare observed frames against. As an example of the tool use, a single test case is converted to it. The test uses both normal --bare and --bad-csum behaviors of the command, confirming they work as advertised. Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-11-16 18:16:59 +01:00
Mike Pattrick	3337e6d91c	userspace: Enable L4 checksum offloading by default. The netdev receiving packets is supposed to provide the flags indicating if the L4 checksum was verified and it is OK or BAD, otherwise the stack will check when appropriate by software. If the packet comes with a good checksum, then postpone the checksum calculation to the egress device if needed. When encapsulating a packet with that flag, set the checksum of the inner L4 header since that is not yet supported. Calculate the L4 checksum when the packet is going to be sent over a device that doesn't support the feature. Linux tap devices allow enabling L3 and L4 offload, so this patch enables the feature. However, the Linux socket interface remains disabled because the API doesn't allow enabling those two features without enabling TSO too. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-15 23:50:30 +02:00
Mike Pattrick	5d11c47d3e	userspace: Enable IP checksum offloading by default. The netdev receiving packets is supposed to provide the flags indicating if the IP checksum was verified and it is GOOD or BAD, otherwise the stack will check when appropriate by software. If the packet comes with a good checksum, then postpone the checksum calculation to the egress device if needed. When encapsulating a packet with that flag, set the checksum of the inner IP header since that is not yet supported. Calculate the IP checksum when the packet is going to be sent over a device that doesn't support the feature. Linux devices don't support IP checksum offload alone, so the support is not enabled. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Co-authored-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-15 23:49:51 +02:00
Nobuhiro MIKI	349112f975	flow: Support rt_hdr in parse_ipv6_ext_hdrs(). Checks whether IPPROTO_ROUTING exists in the IPv6 extension headers. If it exists, the first address is retrieved. If NULL is specified for "frag_hdr" and/or "rt_hdr", those addresses in the header are not reported to the caller. Of course, "frag_hdr" and "rt_hdr" are properly parsed inside this function. Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-03-29 21:41:28 +02:00
Rosemarie O'Riorden	fcdf8ae4a3	lib: Print nw_frag in flow key. nw_frag was not being printed in the flow key because it was improperly masked and printed. Since this field is only two bits, it needs to use a different macro to be masked. During printing, the switch statement switched on the whole 8 bits rather than just the two that are relevant. This caused nw_frag to often not be printed at all. Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-07-19 19:55:19 +02:00
Kumar Amber	b80f58cde2	dpif-netdev/mfex: Add ipv6 profile based hashing. For packets which don't already have a hash calculated, miniflow_hash_5tuple() calculates the hash of a packet using the previously built miniflow. This commit adds IPv6 profile specific hashing which uses fixed offsets into the packet to improve hashing performance. Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-07-05 15:42:07 +01:00
Kumar Amber	af864cedb0	dpif-netdev/mfex: Add ipv4 profile based hashing. For packets which don't already have a hash calculated, miniflow_hash_5tuple() calculates the hash of a packet using the previously built miniflow. This commit adds IPv4 profile specific hashing which uses fixed offsets into the packet to improve hashing performance. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Co-authored-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Kumar Amber <kumar.amber@intel.com> Acked-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2022-05-04 14:24:04 +01:00
Ilya Maximets	e7e9973b80	dpif-netdev: Forwarding optimization for flows with a simple match.

There are cases where users might want simple forwarding or drop rules for all packets received from a specific port, e.g.:

  "in_port=1,actions=2"
  "in_port=2,actions=IN_PORT"
  "in_port=3,vlan_tci=0x1234/0x1fff,actions=drop"
  "in_port=4,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:3"

There are also cases where complex OpenFlow rules can be simplified down to datapath flows with very simple match criteria.

In theory, for very simple forwarding, OVS doesn't need to parse packets at all in order to follow these rules. "Simple match" lookup optimization is intended to speed up packet forwarding in these cases.

Design:

Due to various implementation constraints, the userspace datapath has the following flow fields always in exact match (i.e. it's required to match at least these fields of a packet even if the OF rule doesn't need that):

  - recirc_id
  - in_port
  - packet_type
  - dl_type
  - vlan_tci (CFI + VID) - in most cases
  - nw_frag - for ip packets

Not all of these fields are related to the packet itself. We already know the current 'recirc_id' and the 'in_port' before starting the packet processing. It also seems safe to assume that we're working with Ethernet packets. So, for the simple OF rule we need to match only on 'dl_type', 'vlan_tci' and 'nw_frag'.

'in_port', 'dl_type', 'nw_frag' and 13 bits of 'vlan_tci' can be combined in a single 64bit integer (mark) that can be used as a hash in a hash map. We are using only VID and CFI from the 'vlan_tci', so flows that need to match on PCP will not qualify for the optimization. The workaround for matching on non-existence of vlan is updated to match on CFI and VID only in order to qualify for the optimization. CFI is always set by OVS if vlan is present in a packet, so there is no need to match on PCP in this case. 'nw_frag' takes 2 bits of PCP inside the simple match mark.

A new per-PMD flow table 'simple_match_table' is introduced to store simple match flows only. 'dp_netdev_flow_add' adds a flow to the usual 'flow_table' and to the 'simple_match_table' if the flow meets the following constraints:

  - 'recirc_id' in flow match is 0.
  - 'packet_type' in flow match is Ethernet.
  - Flow wildcards contain only the minimal set of non-wildcarded fields (listed above).

If the number of flows for the current 'in_port' in the regular 'flow_table' equals the number of flows for the current 'in_port' in the 'simple_match_table', we may use the simple match optimization, because all the flows we have are simple match flows. This means that we only need to parse 'dl_type', 'vlan_tci' and 'nw_frag' to perform packet matching. Now we make the unique flow mark from the 'in_port', 'dl_type', 'nw_frag' and 'vlan_tci' and look for it in the 'simple_match_table'. On successful lookup we don't need to run full 'miniflow_extract()'.

Unsuccessful lookup technically means that we have no suitable flow in the datapath and an upcall will be required. So, in this case EMC and SMC lookups are disabled. We may optimize this path in the future by bypassing the dpcls lookup too.

Performance improvement of this solution on 'simple match' flows should be comparable with partial HW offloading, because it parses the same packet fields and uses a similar flow lookup scheme. However, unlike partial HW offloading, it works for all port types including virtual ones.

Performance results when compared to EMC:

Test setup:

             virtio-user   OVS    virtio-user
  Testpmd1  ------------>  pmd1  ------------>  Testpmd2
  (txonly)       x<------  pmd2  <------------ (mac swap)

Single stream of 64byte packets. Actions:

  in_port=vhost0,actions=vhost1
  in_port=vhost1,actions=vhost0

Stats collected from pmd1 and pmd2, so there are 2 scenarios:

  Virt-to-Virt   : Testpmd1 ------> pmd1 ------> Testpmd2.
  Virt-to-NoCopy : Testpmd2 ------> pmd2 --->x   Testpmd1.

Here the packet sent from pmd2 to Testpmd1 is always dropped, because the virtqueue is full since Testpmd1 is in txonly mode and doesn't receive any packets. This should be closer to the performance of a VM-to-Phy scenario.

Test performed on a machine with Intel Xeon CPU E5-2690 v4 @ 2.60GHz. The table below represents improvement in throughput when compared to EMC.

  +----------------+------------------------+------------------------+
  |                |    Default (-g -O2)    | "-Ofast -march=native" |
  |   Scenario     +------------+-----------+------------+-----------+
  |                |     GCC    |   Clang   |     GCC    |   Clang   |
  +----------------+------------+-----------+------------+-----------+
  | Virt-to-Virt   |   +18.9%   |  +25.5%   |   +10.8%   |  +16.7%   |
  | Virt-to-NoCopy |   +24.3%   |  +33.7%   |   +14.9%   |  +22.0%   |
  +----------------+------------+-----------+------------+-----------+

For the Phy-to-Phy case the performance improvement should be even higher, but it's not the main use-case for this functionality. Performance difference for the non-simple flows is within a margin of error.

Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-07 20:32:20 +01:00
lic121	1f5749c790	flow: Consider dataofs when parsing TCP packets. The 'dataofs' field of the TCP header indicates the TCP header length. The length should be >= 20 bytes/4 and <= TCP data length. This patch tests 'dataofs' and does not parse layer 4 fields when it is invalid. This behavior is consistent with the openvswitch kernel module. Fixes: 5a51b2cd3483 ("lib/ofpbuf: Remove 'l7' pointer.") Signed-off-by: lic121 <lic121@chinatelecom.cn> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-12-03 23:45:26 +01:00
David Marchand	15329b728b	flow: Count and dump invalid IP packets. Skipping further processing of invalid IP packets helps avoid crashes but it does not help to figure out if the malformed packets are still present on the network. Add coverage counters for IPv4 and IPv6 sanity checks so that we know there are some invalid packets. Dump such whole packets in debug mode. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-07-16 14:23:00 +02:00
Flavio Leitner	79349cbab0	flow: Support extra padding length. Although not required, padding can be optionally added until the packet length is MTU bytes. A packet with extra padding currently fails sanity checks. Vulnerability: CVE-2020-35498 Fixes: fa8d9001a624 ("miniflow_extract: Properly handle small IP packets.") Reported-by: Joakim Hindersson <joakim.hindersson@elastx.se> Acked-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-02-10 14:59:55 +01:00
William Tu	3c6d05a02e	userspace: Add GTP-U support. GTP, GPRS Tunneling Protocol, is a group of IP-based communications protocols used to carry general packet radio service (GPRS) within GSM, UMTS and LTE networks. GTP protocol has two parts: Signalling (GTP-Control, GTP-C) and User data (GTP-User, GTP-U). GTP-C is used for setting up GTP-U protocol, which is an IP-in-UDP tunneling protocol. Usually GTP is used in connecting between base station for radio, Serving Gateway (S-GW), and PDN Gateway (P-GW). This patch implements GTP-U protocol for userspace datapath, supporting only required header fields and G-PDU message type. See spec in: https://tools.ietf.org/html/draft-hmm-dmm-5g-uplane-analysis-00 Tested-at: https://travis-ci.org/github/williamtu/ovs-travis/builds/666518784 Signed-off-by: Feng Yang <yangfengee04@gmail.com> Co-authored-by: Feng Yang <yangfengee04@gmail.com> Signed-off-by: Yi Yang <yangyi01@inspur.com> Co-authored-by: Yi Yang <yangyi01@inspur.com> Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Ben Pfaff <blp@ovn.org>	2020-03-25 20:26:51 -07:00
Eli Britstein	eb540c0f5f	flow: Fix parsing l3_ofs with partial offloading. l3_ofs should be set for all Ethernet packets, not just IPv4/IPv6 ones. For example, for ARP over VLAN tagged packets, a missing l3_ofs may cause wrong processing, such as in the action that changes the VLAN ID. Fix it. Fixes: aab96ec4d81e ("dpif-netdev: retrieve flow directly from the flow mark") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2020-01-31 09:01:38 -08:00
Zhike Wang	d0dd4908d7	flow: Fix IPv6 header parser with partial offloading. Set nw_proto before it is used in parse_ipv6_ext_hdrs__(). Signed-off-by: Zhike Wang <wangzk320@163.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-11-21 17:37:05 -08:00
Ilya Maximets	54e2baec09	flow: Fix crash on vlan packets with partial offloading. parse_tcp_flags() does not care about vlan tags in a packet and thus is not able to parse them. As a result, if partial offloading is enabled in the userspace datapath, vlan packets are not parsed, i.e. they have no initialized offsets. This causes an OVS crash on any attempt to access/modify packet header fields. For example, having the flow with the following actions:

  in_port=1,ip,actions=mod_nw_src:192.168.0.7,output:IN_PORT

will lead to an OVS crash on vlan packet handling:

  Process terminating with default action of signal 11 (SIGSEGV)
  Invalid read of size 4
     at 0x785657: get_16aligned_be32 (unaligned.h:249)
     by 0x785657: odp_set_ipv4 (odp-execute.c:82)
     by 0x785657: odp_execute_masked_set_action (odp-execute.c:527)
     by 0x785657: odp_execute_actions (odp-execute.c:894)
     by 0x74CDA9: dp_netdev_execute_actions (dpif-netdev.c:7355)
     by 0x74CDA9: packet_batch_per_flow_execute (dpif-netdev.c:6339)
     by 0x74CDA9: dp_netdev_input__ (dpif-netdev.c:6845)
     by 0x74DB6E: dp_netdev_input (dpif-netdev.c:6854)
     by 0x74DB6E: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
     by 0x74E863: dpif_netdev_run (dpif-netdev.c:5264)
     by 0x703F57: type_run (ofproto-dpif.c:370)
     by 0x6EC8B8: ofproto_type_run (ofproto.c:1760)
     by 0x6DA52B: bridge_run__ (bridge.c:3188)
     by 0x6E083F: bridge_run (bridge.c:3252)
     by 0x1642E4: main (ovs-vswitchd.c:127)
   Address 0xc is not stack'd, malloc'd or (recently) free'd

Fix that by properly parsing vlan tags first. Function 'parse_dl_type' transformed for that purpose as it had no users anyway. Added unit test for packet modification with partial offloading that triggers the above crash. Fixes: aab96ec4d81e ("dpif-netdev: retrieve flow directly from the flow mark") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>	2019-10-25 18:06:50 +02:00
Ilya Maximets	361a47d669	flow: Fix using pointer to member of packed struct icmp6_hdr. OVS has no structure definition for ICMPv6 header with additional data. More precisely, it has, but this structure named as 'icmp6_error_header' and only suitable to store error related extended information. 'flow_compose_l4' stores additional information in reserved bits by using system defined structure 'icmp6_hdr', which is marked as 'packed' and this leads to build failure with gcc >= 9: lib/flow.c:3041:34: error: taking address of packed member of 'struct icmp6_hdr' may result in an unaligned pointer value [-Werror=address-of-packed-member] uint32_t *reserved = &icmp->icmp6_dataun.icmp6_un_data32[0]; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix that by renaming 'icmp6_error_header' to 'icmp6_data_header' and allowing it to store not only errors, but any type of additional information by analogue with 'struct icmp6_hdr'. All the usages of 'struct icmp6_hdr' replaced with this new structure. Removed redundant conversions between network and host representations. Now fields are always in be. This also, probably, makes flow_compose_l4 more robust by avoiding possible unaligned accesses to 32 bit value. Fixes: 9b2b84973db7 ("Support for match & set ICMPv6 reserved and options type fields") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: William Tu <u9012063@gmail.com> Acked-by: Ben Pfaff <blp@ovn.org>	2019-10-10 11:11:00 +02:00
Yanqin Wei	892545b953	flow: fix incorrect padding length checking in ipv6_sanity_check. The padding length is (packet size - ipv6 header length - ipv6 plen). This patch fixes incorrect padding size checking in ipv6_sanity_check. Acked-by: William Tu <u9012063@gmail.com> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com> Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-09-25 15:16:02 -07:00
Paul Chaignon	940ac2ce88	treewide: Use packet batch APIs. This patch replaces direct accesses to dp_packet_batch and dp_packet internal components by the appropriate API calls. It extends commit 1270b6e52 (treewide: Wider use of packet batch APIs). This patch was generated using the following semantic patch (cf. http://coccinelle.lip6.fr).

  // <smpl>
  @ dp_packet @
  struct dp_packet_batch *b1;
  struct dp_packet_batch b2;
  struct dp_packet *p;
  expression e;
  @@

  (
  - b1->packets[b1->count++] = p;
  + dp_packet_batch_add(b1, p);
  |
  - b2.packets[b2.count++] = p;
  + dp_packet_batch_add(&b2, p);
  |
  - p->packet_type == htonl(PT_ETH)
  + dp_packet_is_eth(p)
  |
  - p->packet_type != htonl(PT_ETH)
  + !dp_packet_is_eth(p)
  |
  - b1->count == 0
  + dp_packet_batch_is_empty(b1)
  |
  - !b1->count
  + dp_packet_batch_is_empty(b1)
  |
    b1->count = e;
  |
    b1->count++
  |
    b2.count = e;
  |
    b2.count++
  |
  - b1->count
  + dp_packet_batch_size(b1)
  |
  - b2.count
  + dp_packet_batch_size(&b2)
  )
  // </smpl>

Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-09-25 14:42:00 -07:00
Yanqin Wei	f1dbe3796d	flow: save "vlan_hdrs" memset for untagged traffic. For untagged traffic, it is unnecessary to clear vlan_hdrs, which costs a 32B memset. So the patch improves it by postponing the clearing of vlan_hdrs until the ethtype check. It can benefit both untagged and single-tagged traffic. From testing, it does not impact performance of dual-tagged traffic. Reviewed-by: Gavin Hu <Gavin.Hu@arm.com> Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-08-28 14:45:54 -07:00
Malvika Gupta	c2c19ddd7c	flow: Reduce metadata connection state branches in miniflow_extract. This patch merges two separate if-else branches for metadata connection state into one if-else branch to improve performance. It gives an average performance improvement of ~3% on arm platforms and ~4.5% on x86 platforms. Signed-off-by: Malvika Gupta <malvika.gupta@arm.com> Reviewed-by: Yanqin Wei <yanqin.wei@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-08-28 12:56:03 -07:00
Ben Pfaff	2ed6505555	flow: Avoid unsafe comparison of minimasks. The following, run inside the OVS sandbox, caused OVS to abort when Address Sanitizer was used:

  ovs-vsctl add-br br-int
  ovs-ofctl add-flow br-int "table=0,cookie=0x1234,priority=10000,icmp,actions=drop"
  ovs-ofctl --strict del-flows br-int "table=0,cookie=0x1234/-1,priority=10000"

Sample report from Address Sanitizer:

  ==3029==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000043260 at pc 0x7f6b09c2459b bp 0x7ffcb67e7540 sp 0x7ffcb67e6cf0
  READ of size 40 at 0x603000043260 thread T0
      #0 0x7f6b09c2459a (/lib/x86_64-linux-gnu/libasan.so.5+0xb859a)
      #1 0x565110a748a5 in minimask_equal ../lib/flow.c:3510
      #2 0x565110a9ea41 in minimatch_equal ../lib/match.c:1821
      #3 0x56511091e864 in collect_rules_strict ../ofproto/ofproto.c:4516
      #4 0x56511093d526 in delete_flow_start_strict ../ofproto/ofproto.c:5959
      #5 0x56511093d526 in ofproto_flow_mod_start ../ofproto/ofproto.c:7949
      #6 0x56511093d77b in handle_flow_mod__ ../ofproto/ofproto.c:6122
      #7 0x56511093db71 in handle_flow_mod ../ofproto/ofproto.c:6099
      #8 0x5651109407f6 in handle_single_part_openflow ../ofproto/ofproto.c:8406
      #9 0x5651109407f6 in handle_openflow ../ofproto/ofproto.c:8587
      #10 0x5651109e40da in ofconn_run ../ofproto/connmgr.c:1318
      #11 0x5651109e40da in connmgr_run ../ofproto/connmgr.c:355
      #12 0x56511092b129 in ofproto_run ../ofproto/ofproto.c:1826
      #13 0x5651108f23cd in bridge_run__ ../vswitchd/bridge.c:2965
      #14 0x565110904887 in bridge_run ../vswitchd/bridge.c:3023
      #15 0x5651108e659c in main ../vswitchd/ovs-vswitchd.c:127
      #16 0x7f6b093b709a in __libc_start_main ../csu/libc-start.c:308
      #17 0x5651108e9009 in _start (/home/blp/nicira/ovs/_build/vswitchd/ovs-vswitchd+0x11d009)

This fixes the problem, which although largely theoretical could crop up with odd implementations of memcmp(), perhaps ones optimized in various "clever" ways. All in all, it seems best to avoid the theoretical problem.

Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-07-17 16:51:29 -07:00
Vishal Deep Ajmera	cbbab70127	flow: Wildcard UDP ports when using SYMMETRIC_L4 hash for select groups. UDP source and destination ports are not used to derive the hash index used for selecting the bucket in case of SYMMETRIC_L4 hash based select groups. However, they are un-wildcarded in the megaflow entry match criteria. This results in distinct megaflow entry being created for each pair of UDP source and destination ports unnecessarily and causes significant performance deterioration when the megaflow cache limit is reached. This patch wildcards UDP ports when using select group with SYMMETRIC_L4 hash function. Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> CC: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-07-16 15:21:45 -07:00
Van Bemmel, Jeroen (Nokia - US)	fb23ed4789	flow: Don't include ports of first fragments in hash. For a series of IP fragments, only the first packet includes the transport header (TCP/UDP/SCTP) and the src/dst ports. By including these port numbers in the hash, it may happen that a first fragment hashes to a different value than subsequent packets, causing different packets from the same flow to follow different paths. This in turn may result in out-of-order delivery or failed reassembly. This patch excludes port numbers from the hash calculation in case of IP fragmentation. Signed-off-by: Jeroen van Bemmel <jeroen.van_bemmel@nokia.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-06-07 11:21:26 -07:00
Ben Pfaff	005bb87206	flow: Add FLOW_WC_SEQ assertions and improve comments. The assertions make it easier to find all the places that need to be updated when adding protocol support. Acked-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-04-12 15:08:06 -07:00
Darrell Ball	523464abb2	flow: Enhance parse_ipv6_ext_hdrs. Acked-by: Justin Pettit <jpettit@ovn.org> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-02-14 11:39:18 -08:00
Li RongQing	11e4765329	flow: fix a possible memory leak in parse_ct_state. state_s should always be freed before exiting parse_ct_state. Fixes: b4293a336d8d ("conntrack: Move ct_state parsing to lib/flow.c") Acked-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-02-04 16:13:08 -08:00
Vishal Deep Ajmera	9b2b84973d	Support for match & set ICMPv6 reserved and options type fields. Currently OVS supports all ARP protocol fields as OXM match fields to implement the relevant ARP procedures for IPv4. This includes support for matching, copying and setting ARP fields. In IPv6, ARP has been replaced by ICMPv6 neighbor discovery (ND) procedures, neighbor advertisement and neighbor solicitation. The support for ICMPv6 fields in OVS is not complete for the use cases equivalent to ARP in IPv4. OVS lacks support for matching, copying and setting the “ND option type” and “ND reserved” fields. Without these, users cannot implement all ICMPv6 ND procedures for IPv6 support. This commit adds additional OXM fields to OVS for ICMPv6 “ND option type” and ICMPv6 “ND reserved” using the OXM extension mechanism. This allows support for parsing these fields from an ICMPv6 packet header and extending the OpenFlow protocol with specifications for these new OXM fields for matching, copying and setting. Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Co-authored-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com> Signed-off-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-02-04 13:34:41 -08:00
Li RongQing	7a17a07d54	flow: fix udp checksum. As per RFC 768, if the calculated UDP checksum is 0, it should be instead set as 0xFFFF in the frame. A value of 0 in the checksum field indicates to the receiver that no checksum was calculated and hence it should not verify the checksum. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-01-25 12:59:20 -08:00
Sriharsha Basavapatna via dev	738c785ff1	dpif-netlink: Detect Out-Of-Resource condition on a netdev. This is the first patch in the patch-set to support dynamic rebalancing of offloaded flows. The patch detects OOR condition on a netdev port when ENOSPC error is returned by TC-Flower while adding a flow rule. A new structure is added to the netdev called "netdev_hw_info", to store OOR related information required to perform dynamic offload-rebalancing. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Reviewed-by: Sathya Perla <sathya.perla@broadcom.com> Reviewed-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2018-10-19 11:27:45 +02:00
Yifeng Sun	41179399ac	flow: Clear ovs_nsh_key's context data when nsh's type can't be handled. In the default case when nsh's md_type is not recognized by nsh parser, uninitialized data in key->context can sneak into miniflow. This patch fixes it. Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10519 Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-10-08 11:02:41 -07:00
Martin Xu	84ddf96ce0	bundle: add symmetric_l3 hash method for multipath. Add a symmetric_l3 hash method that uses both network destination address and network source address. VMware-BZ: #2112940 Signed-off-by: Martin Xu <martinxu9.ovs@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-10-02 15:17:43 -07:00
Ben Pfaff	97bc5b2326	flow: Fix uninitialized flow fields in IPv6 error case. When parse_ipv6_ext_hdrs__() returned false, half a 64-bit word had been pushed into the miniflow and the second half was left uninitialized. This commit fixes the problem. Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10518 Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>	2018-09-21 20:10:23 -07:00
Ben Pfaff	34c2c34334	flow: Document parse_tcp_flags() assumptions and semantics. Reported-by: Bhargava Shastry <bshastry@sect.tu-berlin.de> Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>	2018-09-12 14:39:06 -07:00
Jianbo Liu	2f9366beb4	flow: Refactor some of VLAN helper functions. By default, these functions change the first vlan vid and pcp in the flow. Add a parameter as index for vlans if we want to handle the second ones. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2018-07-25 18:15:34 +02:00
Ben Pfaff	4fe0801606	flow: Fix buffer overread for crafted IPv6 packets. The ipv6_sanity_check() function implemented a check for IPv6 payload length wrong: ip6_plen is the payload length but this function checked whether it was longer than the total length of IPv6 header plus payload. This meant that a packet with a crafted ip6_plen could result in a buffer overread of up to the length of an IPv6 header (40 bytes). The kernel datapath flow extraction code does not obviously have a similar problem. Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9287 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Darrell Ball <dlu998@gmail.com>	2018-07-09 20:54:22 -07:00
Yuanhan Liu	aab96ec4d8	dpif-netdev: retrieve flow directly from the flow mark. This lets us skip some very costly CPU operations, including but not limited to miniflow_extract, emc lookup, dpcls lookup, etc. Thus, performance could be greatly improved. A PHY-PHY forwarding test with 1000 mega flows (udp,tp_src=1000-1999) and 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) shows more than a 260% performance boost. Note that though the heavy miniflow_extract is skipped, we still have to do per-packet checking, because we have to check the tcp_flags. Co-authored-by: Finn Christensen <fc@napatech.com> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org> Signed-off-by: Finn Christensen <fc@napatech.com> Co-authored-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-07-06 10:32:52 +01:00
Yuanhan Liu	62b0859dd8	flow: Introduce IP packet sanity checks. Those checks were part of the miniflow extractor. Move them out to act as general helpers for packet validation. Co-authored-by: Finn Christensen <fc@napatech.com> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org> Signed-off-by: Finn Christensen <fc@napatech.com> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Co-authored-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-07-06 10:32:52 +01:00
Jan Scheurich	6a0b0d3be8	userspace datapath: Add OVS_HASH_L4_SYMMETRIC dp_hash algorithm. This commit implements a new dp_hash algorithm OVS_HASH_L4_SYMMETRIC in the netdev datapath. It will be used as default hash algorithm for the dp_hash-based select groups in a subsequent commit to maintain compatibility with the symmetry property of the current default hash selection method. A new dpif_backer_support field 'max_hash_alg' is introduced to reflect the highest hash algorithm a datapath supports in the dp_hash action. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> Co-authored-by: Nitin Katiyar <nitin.katiyar@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-05-25 14:58:40 -07:00
William Tu	7dc18ae96d	userspace: add erspan tunnel support. ERSPAN is a tunneling protocol based on GRE tunnel. The patch adds erspan tunnel support for ovs-vswitchd with userspace datapath. Configuring an erspan tunnel is similar to a gre tunnel, but with additional erspan parameters. Matching a flow on erspan's metadata is also supported, see ovs-fields for more details. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-05-21 20:33:30 -07:00
Ben Pfaff	f825fdd4ff	flow: Improve type-safety of MINIFLOW_GET_TYPE. Until now, this macro has blindly read the passed-in type's size, but that's unnecessarily risky. This commit changes it to verify that the passed-in type is the same size as the field and, on GCC and Clang, that the types are compatible. It also adds a version that does not check, for the one case where (currently) we deliberately read the wrong size, and updates a few uses to use more precise field names. Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Reviewed-by: Armando Migliaccio <armamig@gmail.com>	2018-03-31 11:31:51 -07:00
Ben Pfaff	6f06837989	flow: Add some L7 payload data to most L4 protocols that accept it. This makes traffic generated by flow_compose() look slightly more realistic. It requires lots of updates to tests, but at least the tests themselves should be slightly more realistic too. At the same time, add --l7 and --l7-len options to ofproto/trace to allow users to specify the amount or contents of payloads that they want. Suggested-by: Brad Cowie <brad@cowie.nz> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Yifeng Sun <pkusunyifeng@gmail.com> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>	2018-01-27 08:58:31 -08:00
Ben Pfaff	89225d6515	flow: Simplify flow_compose_l4(). Each of the cases in flow_compose_l4() separately tracked the number of bytes of L4 data added to the packet. This commit makes the function do that in a single place without per-protocol bookkeeping. Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>	2018-01-26 14:30:04 -08:00
Yi Yang	17553f27ba	nsh: add new flow key 'ttl' The IETF NSH draft added a new field, ttl, to the NSH header; this patch adds a new nsh key 'ttl' for it. Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-01-11 11:46:11 -08:00
Yi Yang	f59cb331c4	nsh: rework NSH netlink keys and actions This patch changes OVS_KEY_ATTR_NSH to a nested attribute and adds three new NSH sub-attribute keys: OVS_NSH_KEY_ATTR_BASE for the fixed-length NSH base header; OVS_NSH_KEY_ATTR_MD1 for the fixed-length MD type 1 context; OVS_NSH_KEY_ATTR_MD2 for the variable-length MD type 2 metadata. The intention is to align with the NSH kernel implementation. NSH match fields, set and PUSH_NSH action all use the nested attribute format below: [OVS_KEY_ATTR_NSH begin, OVS_NSH_KEY_ATTR_BASE, OVS_NSH_KEY_ATTR_MD1, OVS_KEY_ATTR_NSH end] or [OVS_KEY_ATTR_NSH begin, OVS_NSH_KEY_ATTR_BASE, OVS_NSH_KEY_ATTR_MD2, OVS_KEY_ATTR_NSH end]. In addition, the NSH encap and decap actions are renamed to push_nsh and pop_nsh to meet the action naming convention. Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-01-08 13:19:14 -08:00
Ben Pfaff	14fae3e093	flow: Avoid buffer overread in parse_nsh() for malformed packet. Found by libfuzzer. CC: Jan Scheurich <jan.scheurich@ericsson.com> Fixes: 7edef47b4896 ("NSH: Minor bugfixes") Reported-by: Bhargava Shastry <bshastry@sec.t-labs.tu-berlin.de> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>	2017-11-29 13:29:11 -08:00
Jan Scheurich	9a180f2c00	NSH: Adjust NSH wire format to the latest IETF draft This commit adjusts the NSH user space implementation in OVS to the latest wire format defined in draft-ietf-sfc-nsh-28 (November 3 2017). The NSH_MDTYPE field was reduced from 8 to 4 bits. The FLAGS field is reduced from 8 to 2 bits. A new 6 bit TTL header field is added. The TTL field is set to 63 at encap(nsh). Match and set_field support for the newly introduced TTL header field and a corresponding dec_nsh_ttl action is not yet included and will be implemented in a future patch. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-11-08 12:33:45 -08:00
Jan Scheurich	7edef47b48	NSH: Minor bugfixes - Fix 2 incorrect length checks - Remove unnecessary limit of MD length to 16 bytes - Remove incorrect comments stating MD2 was not supported - Pad metadata in encap_nsh with zeroes if not multiple of 4 bytes Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-11-08 12:31:57 -08:00
Daniel Alvarez	7827edcaeb	Add dl_type to flow metadata for correct interpretation of conntrack metadata When a packet is sent to the controller, dl_type is not stored in the 'ofputil_packet_in_private'. When the packet is resumed, the flow's dl_type is 0 and this leads to invalid value in ct_orig_tuple in the pkt_metadata. This patch adds the dl_type to the metadata so that conntrack information can be interpreted correctly when packets are resumed. This is a change from the ordinary practice, since flow_get_metadata() is only supposed to deal with metadata and dl_type is not metadata. It is necessary when ct_state is involved, though, because ct_state only applies in the case of particular Ethertypes (IPv4 and IPv6 currently), so we need to add it as a kind of prerequisite. (This isn't ideal; maybe we didn't think through the ct_state mechanism carefully enough.) Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339868.html Signed-off-by: Daniel Alvarez <dalvarez@redhat.com> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-10-26 09:39:25 -07:00
Jan Scheurich	3d2fbd70bd	userspace: Add support for NSH MD1 match fields This patch adds support for NSH packet header fields to the OVS control plane and the userspace datapath. Initially we support the fields of the NSH base header as defined in https://www.ietf.org/id/draft-ietf-sfc-nsh-13.txt and the fixed context headers specified for metadata format MD1. The variable length MD2 format is parsed but the TLV context headers are not yet available for matching. The NSH fields are modelled as experimenter fields with the dedicated experimenter class 0x005ad650 proposed for NSH in ONF. The following fields are defined: NXOXM code ofctl name Size Comment ===================================================================== NXOXM_NSH_FLAGS nsh_flags 8 Bits 2-9 of 1st NSH word (0x005ad650,1) NXOXM_NSH_MDTYPE nsh_mdtype 8 Bits 16-23 (0x005ad650,2) NXOXM_NSH_NEXTPROTO nsh_np 8 Bits 24-31 (0x005ad650,3) NXOXM_NSH_SPI nsh_spi 24 Bits 0-23 of 2nd NSH word (0x005ad650,4) NXOXM_NSH_SI nsh_si 8 Bits 24-31 (0x005ad650,5) NXOXM_NSH_C1 nsh_c1 32 Maskable, nsh_mdtype==1 (0x005ad650,6) NXOXM_NSH_C2 nsh_c2 32 Maskable, nsh_mdtype==1 (0x005ad650,7) NXOXM_NSH_C3 nsh_c3 32 Maskable, nsh_mdtype==1 (0x005ad650,8) NXOXM_NSH_C4 nsh_c4 32 Maskable, nsh_mdtype==1 (0x005ad650,9) Co-authored-by: Johnson Li <johnson.li@intel.com> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-08-07 11:26:09 -07:00
Andy Zhou	bc0f51765d	flow: Refactor flow_compose() API. Currently, flow_compose_size() is only supposed to be called after flow_compose(). I find this API to be unintuitive. Change the flow_compose() API to take the 'size' argument and return 'true' if the packet can be created, 'false' otherwise. This change also improves error detection and reporting when 'size' is unreasonably small. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ilya Maximets <i.maximets@samsung.com>	2017-07-27 15:22:39 -07:00

346 Commits