mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

Author	SHA1	Message	Date
Paolo Valerio	9b4d2ad8e8	conntrack: Allow to dump userspace conntrack expectations. The patch introduces a new command, ovs-appctl dpctl/dump-conntrack-exp, that dumps the existing expectations for the userspace ct. Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-29 22:20:43 +02:00
Yunjian Wang	c3559dffcb	dpif-netlink: Fix memory leak dpif_netlink_open(). In the specific call to dpif_netlink_dp_transact() (line 398) in dpif_netlink_open(), the 'dp' content is not being used in the branch when no error is returned (starting line 430). Furthermore, the 'dp' and 'buf' variables are overwritten later in this same branch when a new netlink request is sent (line 437), which results in a memory leak. Reported by Address Sanitizer. Indirect leak of 1024 byte(s) in 1 object(s) allocated from: 0 0x7fe09d3bfe70 in __interceptor_malloc (/usr/lib64/libasan.so.4+0xe0e70) 1 0x8759be in xmalloc__ lib/util.c:140 2 0x875a9a in xmalloc lib/util.c:175 3 0x7ba0d2 in ofpbuf_init lib/ofpbuf.c:141 4 0x7ba1d6 in ofpbuf_new lib/ofpbuf.c:169 5 0x9057f9 in nl_sock_transact lib/netlink-socket.c:1113 6 0x907a7e in nl_transact lib/netlink-socket.c:1817 7 0x8b5abe in dpif_netlink_dp_transact lib/dpif-netlink.c:5007 8 0x89a6b5 in dpif_netlink_open lib/dpif-netlink.c:398 9 0x5de16f in do_open lib/dpif.c:348 10 0x5de69a in dpif_open lib/dpif.c:393 11 0x5de71f in dpif_create_and_open lib/dpif.c:419 12 0x47b918 in open_dpif_backer ofproto/ofproto-dpif.c:764 13 0x483e4a in construct ofproto/ofproto-dpif.c:1658 14 0x441644 in ofproto_create ofproto/ofproto.c:556 15 0x40ba5a in bridge_reconfigure vswitchd/bridge.c:885 16 0x41f1a9 in bridge_run vswitchd/bridge.c:3313 17 0x42d4fb in main vswitchd/ovs-vswitchd.c:132 18 0x7fe09cc03c86 in __libc_start_main (/usr/lib64/libc.so.6+0x25c86) Fixes: b841e3cd4a28 ("dpif-netlink: Fix feature negotiation for older kernels.") Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-04-25 21:58:14 +02:00
Paolo Valerio	9fa612959c	ovs-dpctl: Add new command dpctl/ct-[sg]et-sweep-interval. Since 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists with rculists.") the sweep interval changed as well as the constraints related to the sweeper. Being able to change the default reschedule time may be convenient in some conditions, like debugging. This patch introduces new commands allowing to get and set the sweep interval in ms. Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-04-06 22:59:25 +02:00
Nobuhiro MIKI	03fc1ad785	userspace: Add SRv6 tunnel support. The SRv6 (Segment Routing IPv6) tunnel vport is responsible for encapsulating and decapsulating the inner packets with an IPv6 header and an extension header called SRH (Segment Routing Header). See the spec at: https://datatracker.ietf.org/doc/html/rfc8754 This patch implements SRv6 tunneling in the userspace datapath. It uses the `remote_ip` and `local_ip` options as with existing tunnel protocols. It also adds a dedicated `srv6_segs` option to define a sequence of routers called the segment list. Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-03-29 22:16:04 +02:00
Adrian Moreno	79f9367449	dpif-netlink: Always create at least 1 handler. Ensure at least 1 handler is created even if something goes wrong during CPU detection or prime number calculation. Fixes: a5cacea5f988 ("handlers: Create additional handler threads when using CPU isolation.") Suggested-by: Aaron Conole <aconole@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Michael Santana <msantana@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-03-27 22:04:40 +02:00
Eelco Chaudron	4d69c19000	ofproto-dpif-upcall: Reset ukey's last stats value if the datapath changed. When the ukey's action set changes, it could cause the flow to use a different datapath, for example, when it moves from tc to kernel. This will cause the cached statistics of the previous datapath to be used. This change resets the cached statistics when a change in datapath is discovered. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-03-03 22:27:37 +01:00
wangchuanlei	e22e1f6725	dpctl: Add support to count upcall packets. Add support to count upcall packets per port, both successful and failed, which is a better way to see how many packets were upcalled on each interface. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: wangchuanlei <wangchuanlei@inspur.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-01-31 17:40:50 +01:00
Roi Dayan	48a0adefae	dpif-netlink: Remove redundant null assignment. The assignment of the features pointer is not doing anything and can be removed. CC: Justin Pettit <jpettit@ovn.org> Signed-off-by: Roi Dayan <roid@nvidia.com> Acked-by: Justin Pettit <jpettit@ovn.org> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-11-07 05:44:10 -05:00
Ilya Maximets	262eded5fb	netdev-offload-tc: Fix ignoring unknown tunnel keys. Current offloading code supports only a limited number of tunnel keys and silently ignores everything it doesn't understand. This is causing, for example, offloaded ERSPAN tunnels to not work, because the flow is offloaded, but ERSPAN options are not provided to TC. There are a number of tunnel keys that are supported by the userspace, but silently ignored during offloading: OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT, OVS_TUNNEL_KEY_ATTR_OAM, OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS, and OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS. OVS_TUNNEL_KEY_ATTR_CSUM is kind of supported, but only for actions and for some reason is set from the tunnel port instead of the provided action, and not currently supported for the tunnel key in the match. Adding a default case to fail offloading of unknown attributes. For now explicitly allowing incorrect behavior for the DONT_FRAGMENT flag, otherwise we'll break all tunnel offloading by default. VXLAN and ERSPAN options have to fail offloading, because the tunnel will not work otherwise. OAM is not a default configuration, so failing it as well. The missing DONT_FRAGMENT flag though should, probably, cause frequent flow revalidation, but that is not new with this patch. Same for the 'match' key, only clearing masks that were actually consumed, except for the DONT_FRAGMENT and CSUM flags, which are explicitly allowed and highlighted as broken. Also, the destination port as well as the CSUM configuration for an unknown reason were not taken from the actions list and were passed via HW offload info instead of being consumed from the set() action. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/395522.html Reported-by: Eelco Chaudron <echaudro@redhat.com> Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc interface") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-19 19:39:51 +02:00
Michael Santana	2803b3fb53	handlers: Fix handlers mapping. The handler and CPU mapping in upcalls is incorrect, and this is especially noticeable on systems with CPU isolation enabled. Say we have a 12-core system where only every even-numbered CPU is enabled: C0, C2, C4, C6, C8, C10. This means we will create an array of size 6 that will be sent to the kernel, populated with sockets [S0, S1, S2, S3, S4, S5]. The problem is that when the kernel does an upcall it checks the socket array via the index of the CPU, effectively adding additional load on some CPUs while leaving no work on other CPUs. E.g.: C0 indexes to S0; C2 indexes to S2 (should be S1); C4 indexes to S4 (should be S2). Modulo of 6 (size of socket array) is applied, so we wrap back to S0: C6 indexes to S0 (should be S3); C8 indexes to S2 (should be S4); C10 indexes to S4 (should be S5). Effectively, sockets S0, S2, S4 get overloaded while sockets S1, S3, S5 get no work assigned to them. This leads the kernel to emit the following message: "openvswitch: cpu_id mismatch with handler threads" Instead we will send the kernel a corrected array of sockets the size of all CPUs in the system, or the largest core_id on the system, whichever is greater. This is to take care of systems with non-contiguous core IDs. In the above example we would create a corrected array in a round-robin (assuming prime bias) fashion as follows: [S0, S1, S2, S3, S4, S5, S6, S0, S1, S2, S3, S4] Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.") Co-authored-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Michael Santana <msantana@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-15 19:43:57 +02:00
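The mapping problem described in 2803b3fb53 can be reproduced with a few lines of arithmetic. The following is only a sketch in Python, not the OVS or kernel code: the function names are hypothetical, sockets are represented by their integer index, and the handler count of 7 is taken from the commit's own example array.

```python
def per_cpu_socket_array(n_handlers, n_cpu_slots):
    """Return the handler-socket index for every CPU id 0..n_cpu_slots-1.

    Hypothetical helper: round-robin over the handlers, one slot per CPU id.
    """
    assert n_handlers > 0
    return [cpu % n_handlers for cpu in range(n_cpu_slots)]


def old_lookup(active_cpus, n_sockets):
    """Pre-fix behavior: a short array is indexed by CPU id modulo its size."""
    return {cpu: cpu % n_sockets for cpu in active_cpus}


# The commit's example: 12 cores, only the even-numbered ones enabled.
active = [0, 2, 4, 6, 8, 10]

before = old_lookup(active, n_sockets=6)
# Only sockets 0, 2 and 4 are ever selected; 1, 3 and 5 stay idle.
print(sorted(set(before.values())))

after = per_cpu_socket_array(n_handlers=7, n_cpu_slots=12)
# Each active CPU now reads its own slot, so six distinct sockets are used.
print([after[cpu] for cpu in active])
```

With these inputs, `after` is the index form of the corrected array quoted in the commit message, S0 through S6 followed by S0 through S4.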
Michael Santana	a5cacea5f9	handlers: Create additional handler threads when using CPU isolation. Additional threads are required to service upcalls when we have CPU isolation (in per-cpu dispatch mode). Additional threads are required because they create a fairer distribution. With more threads we decrease the load of each thread, as more threads decrease the number of cores each thread is assigned. Adding additional threads also increases the chance that OVS utilizes all cores available to it. Some RPS schemas might make some handler threads get all the workload while others get no workload. This tends to happen when the handler thread count is low. An example would be an RPS that sends traffic on all even cores on a system with only the lower half of the cores available for OVS to use. In this example we have as many handler threads as there are available cores. In this case 50% of the handler threads get all the workload while the other 50% get no workload. Not only that, but OVS is only utilizing half of the cores that it can use. This is the worst case scenario. The ideal scenario is to have as many threads as there are cores - in this case we guarantee that all cores OVS can use are utilized. But adding as many threads as there are cores could have a performance hit when the number of active cores (which all threads have to share) is very low. For this reason we avoid creating as many threads as there are cores and instead meet somewhere in the middle. The formula used to calculate the number of handler threads to create is as follows: handlers_n = min(next_prime(active_cores+1), total_cores) Assume default behavior when total_cores <= 2, that is, do not create additional threads when we have 2 or fewer total cores on the system. Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.") Signed-off-by: Michael Santana <msantana@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-15 18:51:09 +02:00
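The handler-count formula from a5cacea5f9 is easy to check by hand. The sketch below is in Python rather than the C of the actual patch, and every function name in it is hypothetical. It assumes that next_prime(n) returns the smallest prime greater than or equal to n, that the "default behavior" for total_cores <= 2 means one handler per active core, and that the result is clamped to at least 1 as the later commit 79f9367449 describes.

```python
def is_prime(n):
    """Trial-division primality test; adequate for core counts."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(start):
    """Smallest prime >= start (assumed inclusive; the commit does not say)."""
    n = max(start, 2)
    while not is_prime(n):
        n += 1
    return n


def handler_count(active_cores, total_cores):
    """handlers_n = min(next_prime(active_cores + 1), total_cores)."""
    if total_cores <= 2:
        # Assumed default: no additional threads on very small systems.
        n = active_cores
    else:
        n = min(next_prime(active_cores + 1), total_cores)
    # Never report fewer than one handler (see 79f9367449).
    return max(n, 1)


# 6 active cores out of 12: next_prime(7) is 7, and min(7, 12) is 7.
print(handler_count(6, 12))
# All 12 cores active: next_prime(13) is 13, capped at total_cores.
print(handler_count(12, 12))
```

Under the inclusive reading, 6 active cores out of 12 yield 7 handlers, which agrees with the seven sockets S0 to S6 in the example array of commit 2803b3fb53.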
Ilya Maximets	01edbc3add	dpif-netlink: Fix incorrect bit shift in compat mode. SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior in lib/dpif-netlink.c:1077:40: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' #0 0x73fc31 in dpif_netlink_port_add_compat lib/dpif-netlink.c:1077:40 #1 0x73fc31 in dpif_netlink_port_add lib/dpif-netlink.c:1132:17 #2 0x2c1745 in dpif_port_add lib/dpif.c:597:13 #3 0x07b279 in port_add ofproto/ofproto-dpif.c:3957:17 #4 0x01b209 in ofproto_port_add ofproto/ofproto.c:2124:13 #5 0xfdbfce in iface_do_create vswitchd/bridge.c:2066:13 #6 0xfdbfce in iface_create vswitchd/bridge.c:2109:13 #7 0xfdbfce in bridge_add_ports__ vswitchd/bridge.c:1173:21 #8 0xfb5319 in bridge_add_ports vswitchd/bridge.c:1189:5 #9 0xfb5319 in bridge_reconfigure vswitchd/bridge.c:901:9 #10 0xfae0f9 in bridge_run vswitchd/bridge.c:3334:9 #11 0xfe67dd in main vswitchd/ovs-vswitchd.c:129:9 #12 0x4b6d8f (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) #13 0x4b6e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) #14 0x562594eed024 in _start (vswitchd/ovs-vswitchd+0x787024) Fixes: 526df7d8543f ("tunnel: Provide framework for tunnel extensions for VXLAN-GBP and others") Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-08-04 14:07:37 +02:00
Jianbo Liu	5660b89a30	dpif-netlink: Offloading meter to tc police action. OVS meters are created in advance and OpenFlow rules refer to them by their unique ID. The new tc_police API is used to offload them. By calling the API, police actions are created and meters are mapped to them. These actions can then be used in tc filter rules by their index. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>	2022-07-11 11:22:26 +02:00
Dumitru Ceara	3764f5188a	treewide: Fix invalid bit shift operations. UB Sanitizer reports: tests/test-hash.c:59:40: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' 0 0x44c3c9 in get_range128 tests/test-hash.c:59 1 0x44cb2e in check_hash_bytes128 tests/test-hash.c:178 2 0x44d14d in test_hash_main tests/test-hash.c:282 [...] ofproto/ofproto-dpif-xlate.c:5607:45: runtime error: left shift of 65535 by 16 places cannot be represented in type 'int' 0 0x53fe9f in xlate_sample_action ofproto/ofproto-dpif-xlate.c:5607 1 0x54d625 in do_xlate_actions ofproto/ofproto-dpif-xlate.c:7160 2 0x553b76 in xlate_actions ofproto/ofproto-dpif-xlate.c:7806 3 0x4fcb49 in upcall_xlate ofproto/ofproto-dpif-upcall.c:1237 4 0x4fe02f in process_upcall ofproto/ofproto-dpif-upcall.c:1456 5 0x4fda99 in upcall_cb ofproto/ofproto-dpif-upcall.c:1358 [...] tests/test-util.c:89:23: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' 0 0x476415 in test_ctz tests/test-util.c:89 [...] lib/dpif-netlink.c:396:33: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' 0 0x571b9f in dpif_netlink_open lib/dpif-netlink.c:396 Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-05-17 23:06:46 +02:00
Gaetan Rivet	9ab104718b	dpctl: Add function to read hardware offload statistics. Expose a function to query datapath offload statistics. This function is separate from the current one in netdev-offload as it exposes more detailed statistics from the datapath, instead of only from the netdev-offload provider. Each datapath is meant to use the custom counters as it sees fit for its handling of hardware offloads. Call the new API from dpctl. Signed-off-by: Gaetan Rivet <grive@u256.net> Reviewed-by: Eli Britstein <elibr@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-18 15:12:01 +01:00
Eelco Chaudron	85d3785e6a	utilities: Add netlink flow operation USDT probes and upcall_cost script. This patch adds a series of Netlink flow operation USDT probes. These probes are in turn used in the upcall_cost Python script, which, in addition to some kernel tracepoints, gives insight into the time spent on processing upcalls. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2022-01-18 00:46:30 +01:00
Chris Mi	e409824684	dpif-netlink: Improve feature negotiation for older kernels. OVS_DP_F_UNALIGNED is already set, no need to set again. If restarting ovs, dp is already created. So dpif_netlink_dp_transact() will return EEXIST. No need to probe again. Signed-off-by: Chris Mi <cmi@nvidia.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-12-03 18:53:56 +01:00
Eelco Chaudron	9b20df73a6	dpctl: dpif: Allow viewing and configuring dp cache sizes. This patch adds a general way of viewing/configuring datapath cache sizes. With an implementation for the netlink interface. The ovs-dpctl/ovs-appctl show commands will display the current cache sizes configured: $ ovs-dpctl show system@ovs-system: lookups: hit:25 missed:63 lost:0 flows: 0 masks: hit:282 total:0 hit/pkt:3.20 cache: hit:4 hit-rate:4.54% caches: masks-cache: size:256 port 0: ovs-system (internal) port 1: br-int (internal) port 2: genev_sys_6081 (geneve: packet_type=ptap) port 3: br-ex (internal) port 4: eth2 port 5: sw0p1 (internal) port 6: sw0p3 (internal) A specific cache can be configured as follows: $ ovs-appctl dpctl/cache-set-size DP CACHE SIZE $ ovs-dpctl cache-set-size DP CACHE SIZE For example to disable the cache do: $ ovs-dpctl cache-set-size system@ovs-system masks-cache 0 Setting cache size successful, new size 0. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-11-08 21:48:05 +01:00
Eelco Chaudron	efd55eb34c	dpctl: dpif: Add kernel datapath cache hit output. This patch adds cache usage statistics to the output: $ ovs-dpctl show system@ovs-system: lookups: hit:24 missed:71 lost:0 flows: 0 masks: hit:334 total:0 hit/pkt:3.52 cache: hit:4 hit-rate:4.21% port 0: ovs-system (internal) port 1: genev_sys_6081 (geneve: packet_type=ptap) port 2: br-int (internal) port 3: br-ex (internal) port 4: eth2 port 5: sw1p1 (internal) port 6: sw0p4 (internal) Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-11-08 21:48:05 +01:00
Mark Gray	b841e3cd4a	dpif-netlink: Fix feature negotiation for older kernels. Older kernels do not reject unsupported features. This can lead to a situation in which 'ovs-vswitchd' believes that a feature is supported when, in fact, it is not. This patch probes for this by attempting to set a known unsupported feature. Reported-by: Dumitru Ceara <dceara@redhat.com> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2004083 Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Tested-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-10-12 17:49:11 +02:00
Mark Gray	b1e517bd2f	dpif-netlink: Introduce per-cpu upcall dispatch. The Open vSwitch kernel module uses the upcall mechanism to send packets from kernel space to user space when it misses in the kernel space flow table. The upcall sends packets via a Netlink socket. Currently, a Netlink socket is created for every vport. In this way, there is a 1:1 mapping between a vport and a Netlink socket. When a packet is received by a vport, if it needs to be sent to user space, it is sent via the corresponding Netlink socket. This mechanism, with various iterations of the corresponding user space code, has seen some limitations and issues: * On systems with a large number of vports, there is correspondingly a large number of Netlink sockets which can limit scaling. (https://bugzilla.redhat.com/show_bug.cgi?id=1526306) * Packet reordering on upcalls. (https://bugzilla.redhat.com/show_bug.cgi?id=1844576) * A thundering herd issue. (https://bugzilla.redhat.com/show_bug.cgi?id=1834444) This patch introduces an alternative, feature-negotiated, upcall mode using a per-cpu dispatch rather than a per-vport dispatch. In this mode, the Netlink socket to be used for the upcall is selected based on the CPU of the thread that is executing the upcall. In this way, it resolves the issues above as: a) The number of Netlink sockets scales with the number of CPUs rather than the number of vports. b) Ordering per-flow is maintained as packets are distributed to CPUs based on mechanisms such as RSS and flows are distributed to a single user space thread. c) Packets from a flow can only wake up one user space thread. Reported-at: https://bugzilla.redhat.com/1844576 Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-07-16 20:05:03 +02:00
Mark Gray	485e3a13a6	dpif-netlink: Fix report_loss() message. Fixes: 1579cf677fcb ("dpif-linux: Implement the API functions to allow multiple handler threads read upcall.") Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-07-16 20:05:03 +02:00
Eelco Chaudron	e6ad4d8d9c	conntrack: Document all-zero IP SNAT behavior and add a test case. Currently, conntrack in the kernel has an undocumented feature referred to as all-zero IP address SNAT. Basically, when a source port collision is detected during the commit, the source port will be translated to an ephemeral port. If there is no collision, no SNAT is performed. This patchset documents this behavior and adds a self-test to verify it's not changing. In addition, a datapath feature flag is added for the all-zero IP SNAT case. This will help applications on top of OVS, like OVN, to determine this feature can be used. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-07-08 21:19:14 +02:00
Ilya Maximets	13c0eaa7b4	dpif-netlink: Fix send of uninitialized memory in ct limit requests. ct limit requests never initialize the whole 'struct ovs_zone_limit', sending uninitialized stack memory to the kernel: Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s) at 0x5E23867: sendmsg (in /usr/lib64/libpthread-2.28.so) by 0x54F761: nl_sock_transact_multiple__ (netlink-socket.c:858) by 0x54FB6E: nl_sock_transact_multiple.part.9 (netlink-socket.c:1079) by 0x54FCC0: nl_sock_transact_multiple (netlink-socket.c:1044) by 0x54FCC0: nl_sock_transact (netlink-socket.c:1108) by 0x550B6F: nl_transact (netlink-socket.c:1804) by 0x53BEA2: dpif_netlink_ct_get_limits (dpif-netlink.c:3052) by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178) by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870) by 0x52C241: process_command (unixctl.c:310) by 0x52C241: run_connection (unixctl.c:344) by 0x52C241: unixctl_server_run (unixctl.c:395) by 0x407526: main (ovs-vswitchd.c:128) Address 0x10b87480 is 32 bytes inside a block of size 4,096 alloc'd at 0x4C30F0B: malloc (vg_replace_malloc.c:307) by 0x52CDE4: xmalloc (util.c:138) by 0x4F7E07: ofpbuf_init (ofpbuf.c:123) by 0x4F7E07: ofpbuf_new (ofpbuf.c:151) by 0x53BDE3: dpif_netlink_ct_get_limits (dpif-netlink.c:3025) by 0x588B57: dpctl_ct_get_limits (dpctl.c:2178) by 0x586FF2: dpctl_unixctl_handler (dpctl.c:2870) by 0x52C241: process_command (unixctl.c:310) by 0x52C241: run_connection (unixctl.c:344) by 0x52C241: unixctl_server_run (unixctl.c:395) by 0x407526: main (ovs-vswitchd.c:128) Uninitialised value was created by a stack allocation at 0x46AAA0: ct_dpif_get_limits (ct-dpif.c:197) Fix that by using designated initializers that will clear all the non-specified fields. Fixes: 906ff9d229ee ("dpif-netlink: Implement conntrack zone limit") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Mark D. Gray <mark.d.gray@redhat.com>	2021-05-24 20:21:56 +02:00
Yunjian Wang	0c147fb4e5	dpif-netlink: Fix using uninitialized info.tc_modify_flow_deleted in out label. Before info.tc_modify_flow_deleted is assigned a value, error handling in other statements can jump to the out label. At the out label, the uninitialized variable is used in a condition, which may cause unpredictable behavior. Fixes: 65b84d4a32bd ("dpif-netlink: avoid netlink modify flow put op failed after tc modify flow put op failed.") Signed-off-by: Mengfan Lv <lvmengfan@huawei.com> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2021-04-19 23:17:46 +02:00
Martin Varghese	ebe0e518b0	tunnel: Bareudp Tunnel Support. There are various L3 encapsulation standards using UDP being discussed to leverage the UDP-based load balancing capability of different networks. MPLSoUDP (https://tools.ietf.org/html/rfc7510) is one among them. The Bareudp tunnel provides generic L3 encapsulation support for tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP tunnel. An example to create a bareudp device to tunnel MPLS traffic is given below: $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ options:payload_type=0x8847 options:dst_port=6635 The bareudp device supports special handling for MPLS & IP as they can have multiple ethertypes. The MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). The IP protocol can have ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6). A bareudp device to tunnel L3 traffic with multiple ethertypes (MPLS & IP) can be created by passing the L3 protocol name as a string in the field payload_type. An example to create a bareudp device to tunnel MPLS unicast & multicast traffic is given below: $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ options:payload_type=mpls options:dst_port=6635 Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-By: Greg Rose <gvrose8192@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-22 12:51:22 +01:00
Jianbo Liu	c5b4b0ce95	dpif-netlink: Fix issues of the offloaded flows counter. The n_offloaded_flows counter is saved in dpif, namely the first one, opened when ofproto is created. When a flow operation is done by ovs-appctl commands, such as dpctl/add-flow, a new dpif is opened, and the n_offloaded_flows in it can't be used. So, instead of using the counter, the number of offloaded flows is queried from each netdev and then summed up. To achieve this, a new API is added in netdev_flow_api to get how many flows are assigned to a netdev. In order to get better performance, this number is calculated directly from the tc_to_ufid hmap for netdev-offload-tc, because flow dumping by tc takes much time if there are many flows offloaded. Fixes: af0618470507 ("dpif-netlink: Count the number of offloaded rules") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-12-21 20:25:59 +01:00
Jianbo Liu	af06184705	dpif-netlink: Count the number of offloaded rules. Add a counter for the offloaded rules, and display it in the output of "ovs-appctl upcall/show". Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-12-07 07:30:15 +01:00
Ilya Maximets	bbe2e39287	dpctl: Fix broken flow deletion via ovs-dpctl due to missing ufid. Current code generates a UFID for flows installed by ovs-dpctl. This leads to an inability to remove such flows by the same command. Ex: ovs-dpctl add-dp test ovs-dpctl add-if test vport0 ovs-dpctl add-flow test "in_port(0),eth(),eth_type(0x800),ipv4(src=100.1.0.1)" 0 ovs-dpctl del-flow test "in_port(0),eth(),eth_type(0x800),ipv4(src=100.1.0.1)" dpif|WARN|system@test: failed to flow_del (No such file or directory) ufid:e4457189-3990-4a01-bdcf-1e5f8b208711 in_port(0), eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800), ipv4(src=100.1.0.1,dst=0.0.0.0,proto=0,tos=0,ttl=0,frag=no) ovs-dpctl: deleting flow (No such file or directory) Perhaps you need to specify a UFID? During the del-flow operation a UFID is generated too, however the resulting value is different from the one generated during add-flow. This happens because the odp_flow_key_hash() function uses a random base value for flow hashes, which is different on every invocation. That is not an issue while running 'ovs-appctl dpctl/{add,del}-flow' because execution of these requests happens in the context of the OVS main process, i.e. there will be the same random seed. Commit e61984e781e6 was intended to allow offloading for flows added by the dpctl/add-flow unixctl command, so it's better to generate UFIDs conditionally inside the dpctl command handler only for appctl invocations. Offloading is not possible from the ovs-dpctl utility anyway. There are still a couple of corner cases: It will not be possible to remove a flow by 'ovs-appctl dpctl/del-flow' without specifying the UFID if the main OVS process was restarted since flow addition, and it will not be possible to remove a flow by ovs-dpctl without specifying the UFID if it was added by 'ovs-appctl dpctl/add-flow'. But these scenarios seem minor since these commands are intended for testing only. 
Reported-by: Eelco Chaudron <echaudro@redhat.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-September/374863.html Fixes: e61984e781e6 ("dpif-netlink: Generate ufids for installing TC flowers") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Eelco Chaudron <echaudro@redhat.com> Tested-by: Eelco Chaudron <echaudro@redhat.com>	2020-10-09 13:00:49 +02:00
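The UFID mismatch described in bbe2e39287 comes down to a keyed hash whose key changes per process. The following is a minimal Python illustration, not the OVS implementation: `hashlib.blake2b` stands in for odp_flow_key_hash(), the fixed byte strings stand in for the per-process random basis, and `make_ufid_generator` is a hypothetical name.

```python
import hashlib


def make_ufid_generator(hash_basis):
    """Return a function deriving a 128-bit id from a flow key string.

    hash_basis plays the role of the random base value chosen once per
    process; blake2b is only a stand-in for the real OVS hash.
    """
    def ufid(flow_key):
        digest = hashlib.blake2b(flow_key.encode(), key=hash_basis,
                                 digest_size=16)
        return digest.hexdigest()
    return ufid


FLOW = "in_port(0),eth(),eth_type(0x800),ipv4(src=100.1.0.1)"

# Two separate ovs-dpctl invocations: each process has its own basis,
# so add-flow and del-flow derive different ids for the same flow key.
add_flow = make_ufid_generator(b"basis-of-process-1")
del_flow = make_ufid_generator(b"basis-of-process-2")
print(add_flow(FLOW) == del_flow(FLOW))

# Two ovs-appctl requests served by the same daemon share one basis,
# so the ids match and the flow can be found again.
daemon = make_ufid_generator(b"basis-of-daemon")
print(daemon(FLOW) == daemon(FLOW))
```

The same reasoning covers the corner case in the commit message: restarting the main OVS process amounts to picking a new basis, so a flow added before the restart can no longer be located by its key alone.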
Ilya Maximets	8842fdf1b3	netdev-offload: Use dpif type instead of class. There is no real difference between the 'class' and 'type' in the context of common lookup operations inside the netdev-offload module because it only checks the value of pointers without using the value itself. However, 'type' has some meaning and can be used by offload providers in the initialization phase to check if this type of Flow API, paired with the netdev type, could be used in a particular datapath type. For example, this is needed to check if the Linux flow API could be used for the current tunneling vport because it could be used only if the tunneling vport belongs to the system datapath, i.e. has a backing Linux interface. This is needed to unblock tunneling offloads in the userspace datapath with the DPDK flow API. Acked-by: Eli Britstein <elibr@mellanox.com> Acked-by: Roni Bar Yanai <roniba@mellanox.com> Acked-by: Ophir Munk <ophirmu@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-07-08 19:07:21 +02:00
Vishal Deep Ajmera	9df65060cf	userspace: Avoid dp_hash recirculation for balance-tcp bond mode. Problem: In OVS, flows with output over a bond interface of type “balance-tcp” get translated by the ofproto layer into "HASH" and "RECIRC" datapath actions. After recirculation, the packet is forwarded to the bond member port based on 8 bits of the datapath hash value computed through dp_hash. This causes performance degradation in the following ways: 1. The recirculation of the packet implies another lookup of the packet’s flow key in the exact match cache (EMC) and potentially the Megaflow classifier (DPCLS). This is the biggest cost factor. 2. The recirculated packets have a new “RSS” hash and compete with the original packets for the scarce number of EMC slots. This implies more EMC misses and potentially EMC thrashing causing costly DPCLS lookups. 3. The 256 extra megaflow entries per bond for dp_hash bond selection put additional load on the revalidation threads. Owing to this performance degradation, deployments stick to “balance-slb” bond mode even though it does not do active-active load balancing for VXLAN- and GRE-tunnelled traffic because all tunnel packets have the same source MAC address. Proposed optimization: This proposal introduces a new load-balancing output action instead of recirculation. Maintain one table per bond (could just be an array of uint16's) and program it the same way internal flows are created today for each possible hash value (256 entries) from the ofproto layer. Use this table to load-balance flows as part of output action processing. Currently xlate_normal() -> output_normal() -> bond_update_post_recirc_rules() -> bond_may_recirc() and compose_output_action__() generate 'dp_hash(hash_l4(0))' and 'recirc(<RecircID>)' actions. In this case the RecircID identifies the bond. For the recirculated packets the ofproto layer installs megaflow entries that match on RecircID and masked dp_hash and send them to the corresponding output port. 
Instead, we will now generate action as 'lb_output(<bond id>)' This combines hash computation (only if needed, else re-use RSS hash) and inline load-balancing over the bond. This action is used only for balance-tcp bonds in userspace datapath (the OVS kernel datapath remains unchanged). Example: Current scheme: With 8 UDP flows (with random UDP src port): flow-dump from pmd on cpu core: 2 recirc_id(0),in_port(7),<...> actions:hash(hash_l4(0)),recirc(0x1) recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),<...> actions:2 recirc_id(0x1),dp_hash(0xb236c260/0xff),<...> actions:1 recirc_id(0x1),dp_hash(0x7d89eb18/0xff),<...> actions:1 recirc_id(0x1),dp_hash(0xa78d75df/0xff),<...> actions:2 recirc_id(0x1),dp_hash(0xb58d846f/0xff),<...> actions:2 recirc_id(0x1),dp_hash(0x24534406/0xff),<...> actions:1 recirc_id(0x1),dp_hash(0x3cf32550/0xff),<...> actions:1 New scheme: We can do with a single flow entry (for any number of new flows): in_port(7),<...> actions:lb_output(1) A new CLI has been added to dump datapath bond cache as given below. # ovs-appctl dpif-netdev/bond-show [dp] Bond cache: bond-id 1 : bucket 0 - slave 2 bucket 1 - slave 1 bucket 2 - slave 2 bucket 3 - slave 1 Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com> Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Tested-by: Matteo Croce <mcroce@redhat.com> Tested-by: Adrian Moreno <amorenoz@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2020-06-22 13:11:51 +02:00
Rui Cao	25a2af4fe9	dpif-netlink: Fix Windows incompatibility when setting new feature. The OVS_DP_ATTR_NAME field is required when sending OVS_DP_CMD_SET to the Windows kernel driver. The function "dpif_netlink_set_features" does not set the OVS_DP_ATTR_NAME field, which causes setting the feature to fail and ovs-vswitchd to exit. This patch fixes the issue by setting "request.name" in the request. Reported-at: https://github.com/openvswitch/ovs-issues/issues/187 Submitted-at: https://github.com/openvswitch/ovs/pull/319 Signed-off-by: Rui Cao <rcao@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2020-06-16 13:24:42 -07:00
Vlad Buslov	191536574e	netdev-offload: Implement terse dump support In order to improve revalidator performance by minimizing unnecessary copying of data, extend netdev-offloads to support terse dump mode. Extend netdev_flow_api->flow_dump_create() with 'terse' bool argument. Implement support for terse dump in functions that convert netlink to flower and flower to match. Set flow stats "used" value based on difference in number of flow packets because lastuse timestamp is not included in TC terse dump. Kernel API support is implemented in following patch. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-05 10:14:27 +02:00
Tonghao Zhang	e61984e781	dpif-netlink: Generate ufids for installing TC flowers. To support installing TC flowers to HW via the "ovs-appctl dpctl/add-flow" command, there must be a ufid. This patch checks whether a ufid exists and, if not, generates one. Note that when processing upcall packets, the ufid is generated in parse_odp_packet for the kernel datapath. Configuring max-idle/max-revalidator may help when testing this patch. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-06-03 09:53:21 +02:00
William Tu	3c6d05a02e	userspace: Add GTP-U support. GTP, GPRS Tunneling Protocol, is a group of IP-based communications protocols used to carry general packet radio service (GPRS) within GSM, UMTS and LTE networks. GTP protocol has two parts: Signalling (GTP-Control, GTP-C) and User data (GTP-User, GTP-U). GTP-C is used for setting up GTP-U protocol, which is an IP-in-UDP tunneling protocol. Usually GTP is used in connecting between base station for radio, Serving Gateway (S-GW), and PDN Gateway (P-GW). This patch implements GTP-U protocol for userspace datapath, supporting only required header fields and G-PDU message type. See spec in: https://tools.ietf.org/html/draft-hmm-dmm-5g-uplane-analysis-00 Tested-at: https://travis-ci.org/github/williamtu/ovs-travis/builds/666518784 Signed-off-by: Feng Yang <yangfengee04@gmail.com> Co-authored-by: Feng Yang <yangfengee04@gmail.com> Signed-off-by: Yi Yang <yangyi01@inspur.com> Co-authored-by: Yi Yang <yangyi01@inspur.com> Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Ben Pfaff <blp@ovn.org>	2020-03-25 20:26:51 -07:00
wenxu	65b84d4a32	dpif-netlink: avoid netlink modify flow put op failing after tc modify flow put op failed. The tc modify flow put always deletes the original flow first and then adds the new flow. If the modify flow put operation fails, the flow put operation will change from modify to create if it succeeds in deleting the original flow in tc (which will always fail with ENOENT, since the flow has already been deleted before adding the new flow in tc). Finally, the modify flow put will fail to add in the kernel datapath. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-03-19 11:58:45 +01:00
Ilya Maximets	d7b55c5c94	dpif: Fix leak and usage of uninitialized dp_extra_info. 'dpif_probe_feature'/'revalidate' doesn't free the 'dp_extra_info' string. Also, all the implementations of dpif_flow_get() should initialize the value to avoid printing/freeing of random memory. 30 bytes in 1 blocks are definitely lost in loss record 323 of 889 at 0x483AD19: realloc (vg_replace_malloc.c:836) by 0xDDAD89: xrealloc (util.c:149) by 0xCE1609: ds_reserve (dynamic-string.c:63) by 0xCE1A90: ds_put_format_valist (dynamic-string.c:161) by 0xCE19B9: ds_put_format (dynamic-string.c:142) by 0xCCCEA9: dp_netdev_flow_to_dpif_flow (dpif-netdev.c:3170) by 0xCCD2DD: dpif_netdev_flow_get (dpif-netdev.c:3278) by 0xCCEA0A: dpif_netdev_operate (dpif-netdev.c:3868) by 0xCDF81B: dpif_operate (dpif.c:1361) by 0xCDEE93: dpif_flow_get (dpif.c:1002) by 0xCDECF9: dpif_probe_feature (dpif.c:962) by 0xC635D2: check_recirc (ofproto-dpif.c:896) by 0xC65C02: check_support (ofproto-dpif.c:1567) by 0xC63274: open_dpif_backer (ofproto-dpif.c:818) by 0xC65E3E: construct (ofproto-dpif.c:1605) by 0xC4D436: ofproto_create (ofproto.c:549) by 0xC3931A: bridge_reconfigure (bridge.c:877) by 0xC3FEAC: bridge_run (bridge.c:3324) by 0xC4551D: main (ovs-vswitchd.c:127) CC: Emma Finn <emma.finn@intel.com> Fixes: 0e8f5c6a38d0 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Roi Dayan <roid@mellanox.com>	2020-01-20 17:51:16 +01:00
Ilya Maximets	7a5e0ee7cc	dpif: Turn dpif_flow_hash function into generic odp_flow_key_hash. Current implementation of dpif_flow_hash() doesn't depend on datapath interface and only complicates the callers by forcing them to figure out what is their current 'dpif'. If we'll need different hashing for different 'dpif's we'll implement an API for dpif-providers and each dpif implementation will be able to use their local function directly without calling it via dpif API. This change will allow us to not store 'dpif' pointer in the userspace datapath implementation which is broken and will be removed in next commits. This patch moves dpif_flow_hash() to odp-util module and replaces unused odp_flow_key_hash() by it, along with removing of unused 'dpif' argument. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>	2020-01-08 16:02:37 +01:00
Ilya Maximets	f7392b44a2	dpif-netlink: Fix dumping uninitialized netdev flow stats. dpif logging functions expects to be called after the operation. log_flow_del_message() dumps flow stats on success which are not initialized before the actual call to netdev_flow_del(): Conditional jump or move depends on uninitialised value(s) at 0x6090875: _itoa_word (_itoa.c:179) by 0x6093F0D: vfprintf (vfprintf.c:1642) by 0x60C090F: vsnprintf (vsnprintf.c:114) by 0xE5E7EC: ds_put_format_valist (dynamic-string.c:155) by 0xE5E755: ds_put_format (dynamic-string.c:142) by 0xE5A5E6: dpif_flow_stats_format (dpif.c:903) by 0xE5B708: log_flow_message (dpif.c:1763) by 0xE5BCA4: log_flow_del_message (dpif.c:1809) by 0xFA6076: try_send_to_netdev (dpif-netlink.c:2190) by 0xFA0D3C: dpif_netlink_operate (dpif-netlink.c:2248) by 0xE5AFAC: dpif_operate (dpif.c:1376) by 0xDF176E: push_dp_ops (ofproto-dpif-upcall.c:2367) by 0xDF04C8: push_ukey_ops (ofproto-dpif-upcall.c:2447) by 0xDF008F: revalidator_sweep__ (ofproto-dpif-upcall.c:2805) by 0xDF5DC6: revalidator_sweep (ofproto-dpif-upcall.c:2816) by 0xDF1E83: udpif_revalidator (ofproto-dpif-upcall.c:949) by 0xF3A3FE: ovsthread_wrapper (ovs-thread.c:383) by 0x565F6DA: start_thread (pthread_create.c:463) by 0x615988E: clone (clone.S:95) Uninitialised value was created by a stack allocation at 0xDEFC24: revalidator_sweep__ (ofproto-dpif-upcall.c:2733) Fixes: 3cd99886191e ("dpif-netlink: Use dpif logging functions") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2020-01-07 10:11:45 +01:00
Paul Blakey	9221c721be	netdev-offload-tc: Add conntrack label and mark support Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2019-12-22 11:54:40 +01:00
Paul Blakey	576126a931	netdev-offload-tc: Add conntrack support Zone and ct_state first. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2019-12-22 11:54:40 +01:00
Paul Blakey	b2ae40690e	netdev-offload-tc: Add recirculation support via tc chains Each recirculation id will create a tc chain, and we translate the recirculation action to a tc goto chain action. We check for kernel support for this by probing OvS Datapath for the tc recirc id sharing feature. If supported, we can offload rules that match on recirc_id, and recirculation action safely. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2019-12-22 11:54:40 +01:00
Paul Blakey	dcdcad68c6	dpif: Add support to set user features This enables user features on the kernel datapath via the DP_CMD_SET command, and also retrieves them to check for actual support and not just an older kernel ignoring the requested features. This will be used in next patch to enable recirc_id sharing with tc. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2019-12-22 11:54:40 +01:00
Tonghao Zhang	0442bfb11d	ofproto-dpif-upcall: Echo HASH attribute back to datapath. The kernel datapath may send an upcall with hash info; ovs-vswitchd should get it from the upcall and then send it back. The reason is that: \| When using the kernel datapath, the upcall don't \| include skb hash info relatived. That will introduce \| some problem, because the hash of skb is important \| in kernel stack. For example, VXLAN module uses \| it to select UDP src port. The tx queue selection \| may also use the hash in stack. \| \| Hash is computed in different ways. Hash is random \| for a TCP socket, and hash may be computed in hardware, \| or software stack. Recalculation hash is not easy. \| \| There will be one upcall, without information of skb \| hash, to ovs-vswitchd, for the first packet of a TCP \| session. The rest packets will be processed in Open vSwitch \| modules, hash kept. If this tcp session is forward to \| VXLAN module, then the UDP src port of first tcp packet \| is different from rest packets. \| \| TCP packets may come from the host or dockers, to Open vSwitch. \| To fix it, we store the hash info to upcall, and restore hash \| when packets sent back. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=bd1903b7c4596ba6f7677d0dfefd05ba5876707d Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-11-22 09:51:57 -08:00
Ben Pfaff	622ea8fd75	dpif-netlink: Fix some variable naming. Usually a plural name refers to an array, but 'socks' and 'socksp' were only single objects, so this changes their names to 'sock' and 'sockp'. Usually a 'p' suffix means that a variable is an output argument, but that was only true in one place here, so this changes the names of the other variables to plain 'sock'. Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>	2019-10-14 12:27:11 -07:00
Yifeng Sun	666602ada7	dpif-netlink: Free leaked nl_sock Valgrind reports: 20 bytes in 1 blocks are definitely lost in loss record 94 of 353 by 0x532594: xmalloc (util.c:138) by 0x553EAD: nl_sock_create (netlink-socket.c:146) by 0x54331D: create_nl_sock (dpif-netlink.c:255) by 0x54331D: dpif_netlink_port_add__ (dpif-netlink.c:756) by 0x5435F6: dpif_netlink_port_add_compat (dpif-netlink.c:876) by 0x5435F6: dpif_netlink_port_add (dpif-netlink.c:922) by 0x47EC1D: dpif_port_add (dpif.c:584) by 0x42B35F: port_add (ofproto-dpif.c:3721) by 0x41E64A: ofproto_port_add (ofproto.c:2032) by 0x40B3FE: iface_do_create (bridge.c:1817) by 0x40B3FE: iface_create (bridge.c:1855) by 0x40B3FE: bridge_add_ports__ (bridge.c:943) by 0x40D14A: bridge_add_ports (bridge.c:959) by 0x40D14A: bridge_reconfigure (bridge.c:673) by 0x410D75: bridge_run (bridge.c:3050) by 0x407614: main (ovs-vswitchd.c:127) This leak is because when vport_add_channel() returns 0, it is expected to take the ownership of 'socksp'. This patch fixes this issue. Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-10-14 11:00:34 -07:00
Yi-Hung Wei	187bb41fbf	ofproto-dpif-xlate: Translate timeout policy in ct action This patch derives the timeout policy based on ct zone from the internal data structure that we maintain on dpif layer. It also adds a system traffic test to verify the zone-based conntrack timeout feature. The test uses ovs-vsctl commands to configure the customized ICMP and UDP timeout on zone 5 to a shorter period. It then injects ICMP and UDP traffic to conntrack, and checks if the corresponding conntrack entry expires after the predefined timeout. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With this check, ofproto-dpif-xlate can use this information to decide whether to translate the ct timeout policy. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>	2019-09-26 13:51:04 -07:00
Yi-Hung Wei	1f16131837	ct-dpif, dpif-netlink: Add conntrack timeout policy support. This patch first defines the dpif interface for a datapath to support adding, deleting, getting and dumping conntrack timeout policy. The timeout policy is identified by a 4-byte unsigned integer in the datapath, and it currently supports timeouts for the TCP, UDP, and ICMP protocols. Moreover, this patch provides the implementation for the Linux kernel datapath in dpif-netlink. In the Linux kernel, the timeout policy is maintained per L3/L4 protocol, and it is identified by a 32-byte null-terminated string. On the other hand, in vswitchd, the timeout policy is a generic one that consists of all the supported L4 protocols. Therefore, one of the main tasks in dpif-netlink is to break down the generic timeout policy into 6 sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp), and push down the configuration using the netlink API in netlink-conntrack.c. This patch also adds missing symbols in the windows datapath so that the build on windows can pass. Appveyor CI: * https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754 Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Justin Pettit <jpettit@ovn.org>	2019-09-26 13:50:17 -07:00
Darrell Ball	64207120c8	conntrack: Add option to disable TCP sequence checking. This may be needed in some special cases, such as to support some hardware offload implementations. Note that disabling TCP sequence number verification is not an optimization in itself, but supporting some hardware offload implementations may offer better performance. TCP sequence number verification is enabled by default. This option is only available for the userspace datapath. Access to this option is presently provided via 'dpctl' commands as the need for this option is quite node specific, by virtue of which nics are in use on a given node. A test is added to verify this option. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359188.html Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2019-09-25 12:11:32 -07:00
Ilya Maximets	0770429e38	dpif-netlink: Allow offloading of flows with dl_type 0x1234. 'dpif_probe_feature()' always has DPIF_FP_PROBE flag set. Other probing code uses dpif_execute() with DPIF_OP_EXECUTE, hence never calls parse_flow_put(). Thus, this 'if' statement is wrong and should be removed as it only forbids offloading of the real legitimate flows with dl_type 0x1234. Dummy flows never reach this code. CC: Paul Blakey <paulb@mellanox.com> Fixes: 8b668ee3f0cc ("dpif-netlink: Use netdev flow put api to insert a flow") Reported-by: Eli Britstein <elibr@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>	2019-07-31 14:44:15 +03:00
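
Commit 9df65060cf above describes replacing dp_hash recirculation with a per-bond table of 256 entries ("could just be an array of uint16's") that is indexed by the low 8 bits of the packet hash. The following is a minimal sketch of that idea only, not the code from the commit: the struct, the function names, and the round-robin fill policy are all hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define BOND_BUCKETS 256                 /* One bucket per 8-bit hash value. */
#define BOND_MASK (BOND_BUCKETS - 1)     /* Same 0xff mask as the dp_hash match. */

/* Hypothetical per-bond table; not the actual OVS data structure. */
struct bond_table {
    uint32_t bond_id;
    uint16_t member[BOND_BUCKETS];       /* Output port chosen for each bucket. */
};

/* Programs every bucket.  Round-robin assignment is an assumption made for
 * this sketch; the commit message does not say how buckets are assigned. */
static void
bond_table_fill(struct bond_table *bt, const uint16_t *ports, size_t n_ports)
{
    for (size_t i = 0; i < BOND_BUCKETS; i++) {
        bt->member[i] = ports[i % n_ports];
    }
}

/* Selects the output port inline from the packet hash, with no
 * recirculation and no second flow lookup. */
static uint16_t
bond_table_lookup(const struct bond_table *bt, uint32_t hash)
{
    return bt->member[hash & BOND_MASK];
}
```

With such a table, a single flow entry ending in an lb_output-style action can serve any number of new flows, because the member port is chosen per packet from the hash rather than by one megaflow per dp_hash value.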

158 Commits