mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-22 09:58:01 +00:00

Author	SHA1	Message	Date
Mark Michelson	b6e840aed0	pcap-file: Add nanosecond resolution pcap support. PCAP header magic numbers are different for microsecond and nanosecond resolution timestamps. This patch adds support for understanding the difference and reporting the time correctly with ovs_pcap_read(). When writing pcap files, OVS will always use microsecond resolution, so no new calculations were added to those functions. Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-10-05 17:35:07 -07:00
Ben Pfaff	89c09c1cd1	netdev: Clean up class initialization. The macros are hard to read. This makes it a little more readable. Signed-off-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-08-27 17:48:23 +01:00
John Hurley	88dcf2aa82	netdev-provider: add class op to get block_id Add a new class op for netdevs to get the block_id if one exists. The block_id is used in offload ops to group multiple qdiscs together. Stub calls are made to the new class op (implementation to follow in further patches). The default block_id of 0 (no block) will be used in these cases. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2018-06-29 14:51:47 +02:00
Ben Pfaff	fa37affad3	Embrace anonymous unions. Several OVS structs contain embedded named unions, like this: struct { ... union { ... } u; }; C11 standardized a feature that many compilers already implemented anyway, where an embedded union may be unnamed, like this: struct { ... union { ... }; }; This is more convenient because it allows the programmer to omit "u." in many places. OVS already used this feature in several places. This commit embraces it in several others. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org> Tested-by: Alin Gabriel Serdean <aserdean@ovn.org> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>	2018-05-25 13:36:05 -07:00
Jan Scheurich	8492adc270	netdev: Add optional qfill output parameter to rxq_recv() If the caller provides a non-NULL qfill pointer and the netdev implementation supports reading the rx queue fill level, the rxq_recv() function returns the remaining number of packets in the rx queue after reception of the packet burst to the caller. If the implementation does not support this, it returns -ENOTSUP instead. Reading the remaining queue fill level should not substantially slow down the recv() operation. A first implementation is provided for ethernet and vhostuser DPDK ports in netdev-dpdk.c. This output parameter will be used in the upcoming commit for PMD performance metrics to supervise the rx queue fill level for DPDK vhostuser ports. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Acked-by: Billy O'Mahony <billy.o.mahony@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2018-05-11 08:08:24 +01:00
Justin Pettit	e883448e3f	dp-packet: Add index to DP_PACKET_BATCH_FOR_EACH to prevent shadowing. Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>	2018-02-28 14:53:27 -08:00
Ben Pfaff	6f06837989	flow: Add some L7 payload data to most L4 protocols that accept it. This makes traffic generated by flow_compose() look slightly more realistic. It requires lots of updates to tests, but at least the tests themselves should be slightly more realistic too. At the same time, add --l7 and --l7-len options to ofproto/trace to allow users to specify the amount or contents of payloads that they want. Suggested-by: Brad Cowie <brad@cowie.nz> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Yifeng Sun <pkusunyifeng@gmail.com> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>	2018-01-27 08:58:31 -08:00
Ben Pfaff	ae9f2ce7c5	netdev-dummy: Lock mutex when retrieving custom stats. Found by Clang. CC: Michal Weglicki <michalx.weglicki@intel.com> Fixes: 971f4b394c6e ("netdev: Custom statistics.") Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>	2018-01-10 16:07:53 -08:00
Michal Weglicki	971f4b394c	netdev: Custom statistics. - New get_custom_stats interface function is added to netdev. It allows particular netdev implementation to expose custom counters in dictionary format (counter name/counter value). - New statistics are retrieved using experimenter code and are printed as a result to ofctl dump-ports. - New counters are available for OpenFlow 1.4+. - New statistics are printed to output via ofctl only if those are present in reply message. - New statistics definition is added to include/openflow/intel-ext.h. - Custom statistics are implemented only for dpdk-physical port type. - DPDK-physical implementation uses xstats to collect statistics. Only dropped and error counters are exposed. Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2018-01-10 15:29:13 -08:00
Ilya Maximets	ad8b0b4fe7	netdev: Remove useless cutlen. Cutlen already applied while processing OVS_ACTION_ATTR_OUTPUT. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Ilya Maximets	b30896c969	netdev: Remove unused may_steal. Not needed anymore because 'may_steal' already handled on dpif-netdev layer and always true. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>	2017-12-20 21:07:46 +00:00
Bhanuprakash Bodireddy	7a385993a6	netdev-dummy: Reorder elements in dummy_packet_stream structure. By reordering elements in dummy_packet_stream structure, sum holes can be reduced, thus saving a cache line. Before: structure size: 784, sum holes: 56, cachelines:13 After : structure size: 768, sum holes: 40, cachelines:12 Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-11-03 12:52:09 -07:00
Xiao Liang	fd016ae3fb	lib: Move lib/poll-loop.h to include/openvswitch Poll-loop is the core to implement main loop. It should be available in libopenvswitch. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-11-03 10:47:55 -07:00
Yifeng Sun	dac0fb811e	netdev-dummy: Avoid double-free in netdev_dummy_ip4addr(). netdev_dummy_ip6addr() calls netdev_close() twice though it increases netdev's reference only once from netdev_from_name(). As a result, Valgrind test 788 (tunnel_push_pop - action) reports the error below: ==20465== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) Invalid read of size 8 at 0x493FE0: netdev_get_name (netdev.c:911) by 0x5125D3: tnl_port_map_delete_ipdev (tnl-ports.c:470) by 0x4E551C: __rt_entry_delete (ovs-router.c:252) by 0x4E64AA: ovs_router_flush (ovs-router.c:478) by 0x475CA8: call_hooks.part.2 (fatal-signal.c:254) by 0x5E53FF7: __run_exit_handlers (exit.c:82) by 0x5E54044: exit (exit.c:104) by 0x5E3A836: (below main) (libc-start.c:325) Address 0x65ea680 is 0 bytes inside a block of size 640 free'd at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x492BA2: netdev_unref (netdev.c:572) by 0x41646E: ofport_destroy__ (ofproto.c:2516) by 0x41FD58: ofproto_destroy (ofproto.c:1645) by 0x40B96B: bridge_destroy (bridge.c:3273) by 0x410238: bridge_exit (bridge.c:506) by 0x40700E: main (ovs-vswitchd.c:135) Block was alloc'd at at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x516A82: xcalloc (util.c:103) by 0x48D74D: netdev_dummy_alloc (netdev-dummy.c:661) by 0x4931D1: netdev_open.part.12 (netdev.c:406) by 0x40A985: iface_do_create (bridge.c:1784) by 0x40A985: iface_create (bridge.c:1837) by 0x40A985: bridge_add_ports__ (bridge.c:931) by 0x40C7EA: bridge_add_ports (bridge.c:947) by 0x40C7EA: bridge_reconfigure (bridge.c:663) by 0x410485: bridge_run (bridge.c:2998) by 0x406F64: main (ovs-vswitchd.c:119) Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-11-02 13:55:51 -07:00
Joe Stringer	df3a6d503e	netdev-dummy: Fix minor style variation. Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>	2017-08-09 16:56:30 -07:00
Ben Pfaff	360990eb1d	netdev-dummy: Close pcap files when dummy device is closed. Fixes a fd leak. Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>	2017-08-08 16:54:04 -07:00
Ben Pfaff	a61a289119	dp-packet: New function dp_packet_get_send_len(). This function is useful in a few places for representing the packet's length minus the cutlen. Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-08-02 18:58:10 -07:00
Ben Pfaff	71f21279f6	Eliminate most shadowing for local variable names. Shadowing is when a variable with a given name in an inner scope hides a different variable with the same name in a surrounding scope. This is generally undesirable because it can confuse programmers. This commit eliminates most of it. Found with -Wshadow=local in GCC 7. The repo is not really ready to enable this option by default because of a few cases that are harder to fix, and harmless, such as nested use of CMAP_FOR_EACH. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>	2017-08-02 15:03:35 -07:00
Andy Zhou	bc0f51765d	flow: Refactor flow_compose() API. Currently, flow_compose_size() is only supposed to be called after flow_compose(). I find this API to be unintuitive. Change flow_compose() API to take the 'size' argument, and returns 'true' if the packet can be created, 'false' otherwise. This change also improves error detection and reporting when 'size' is unreasonably small. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ilya Maximets <i.maximets@samsung.com>	2017-07-27 15:22:39 -07:00
Ilya Maximets	1e2eecbbf7	netdev-dummy: Fix setting length in receive command. Currently, if the '--len' option is passed to the 'netdev-dummy/receive' command, only the 'size' field of dp_packet changes. This is incorrect behaviour, because memory for that size is not allocated and the packet headers are not fixed to reflect the new size. This leads to flow_extract() failure, because it checks the 'ip->tot_len' and stops further parsing if it doesn't match the dp_packet_size(). As a result, packets created while processing the 'receive' command can't be parsed to the same flow. Additionally, this may lead to wrong memory accesses if someone tries to read or modify the packet data. Fix that by creating the right packets using the recently introduced 'flow_compose_size()'. CC: Andy Zhou <azhou@ovn.org> Fixes: d8ada2368cbe ("netdev-dummy: Add --len option for netdev-dummy/receive command") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>	2017-07-25 14:42:11 -07:00
Ben Pfaff	875ab13020	userspace: Handling of versatile tunnel ports In netdev_gre_build_header(), GRE protocol and VXLAN next_protocol is set based on packet_type of flow. If it's about an Ethernet packet, it is set to ETP_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERNET, it is set according to the name space type. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-06-27 17:28:30 -04:00
Paul Blakey	18ebd48cfb	netdev: Adding a new netdev API to be used for offloading flows Add a new API interface for offloading dpif flows to netdev. The API consists of the following: flow_put - offload a new flow flow_get - query an offloaded flow flow_del - delete an offloaded flow flow_flush - flush all offloaded flows flow_dump_* - dump all offloaded flows In upcoming commits we will introduce an implementation of this API for netdev-linux. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>	2017-06-14 10:12:30 +02:00
Jan Scheurich	2482b0b0c8	userspace: Add packet_type in dp_packet and flow This commit adds a packet_type attribute to the structs dp_packet and flow to explicitly carry the type of the packet as preparation for the introduction of the so-called packet type-aware pipeline (PTAP) in OVS. The packet_type is a big-endian 32 bit integer with the encoding as specified in OpenFlow version 1.5. The upper 16 bits contain the packet type name space. Pre-defined values are defined in openflow-common.h: enum ofp_header_type_namespaces { OFPHTN_ONF = 0, /* ONF namespace. */ OFPHTN_ETHERTYPE = 1, /* ns_type is an Ethertype. */ OFPHTN_IP_PROTO = 2, /* ns_type is a IP protocol number. */ OFPHTN_UDP_TCP_PORT = 3, /* ns_type is a TCP or UDP port. */ OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */ }; The lower 16 bits specify the actual type in the context of the name space. Only name spaces 0 and 1 will be supported for now. For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet). This is the default packet_type in OVS and the only one supported so far. Packets of type (OFPHTN_ONF, 0) are called Ethernet packets. In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet. A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet with the Ethernet header (and any VLAN tags) removed to expose the L3 (or L2.5) payload of the packet. These will simply be called L3 packets. The Ethernet address fields dl_src and dl_dst in struct flow are not applicable for an L3 packet and must be zero. However, to maintain compatibility with the large code base, we have chosen to copy the Ethertype of an L3 packet into the dl_type field of struct flow. This does not mean that it will be possible to match on dl_type for L3 packets with PTAP later on. Matching must be done on packet_type instead. New dp_packets are initialized with packet_type Ethernet. Ports that receive L3 packets will have to explicitly adjust the packet_type. Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2017-05-03 16:56:40 -07:00
Andy Zhou	72c84bc2db	dp-packet: Enhance packet batch APIs. One common use case of 'struct dp_packet_batch' is to process all packets in the batch in order. Add an iterator for this use case to simplify the logic of calling sites, Another common use case is to drop packets in the batch, by reading all packets, but writing back pointers of fewer packets. Add macros to support this use case. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>	2017-01-26 17:35:29 -08:00
Andy Zhou	d8ada2368c	netdev-dummy: Add --len option for netdev-dummy/receive command Currently, there is no way to specify the packet size when injecting a packet via "netdev-dummy/receive" with a flow specification. Thus far, packet size is not important for testing OVS features, but it becomes useful in writing unit tests for the future patches. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>	2017-01-26 15:02:50 -08:00
Daniele Di Proietto	9fff138ec3	netdev: Add 'errp' to set_config(). Since 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"), set_config() is used to identify a DPDK device, so it's better to report its detailed error message to the user. Tunnel devices and patch ports rely a lot on set_config() as well. This commit adds a param to set_config() that can be used to return an error message and makes use of that in netdev-dpdk and netdev-vport. Before this patch: $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk ovs-vsctl: Error detected while setting up 'dpdk0': dpdk0: could not set configuration (Invalid argument). See ovs-vswitchd log for details. ovs-vsctl: The default log directory is "/var/log/openvswitch/". $ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch ovs-vsctl: Error detected while setting up 'p+': p+: could not set configuration (Invalid argument). See ovs-vswitchd log for details. ovs-vsctl: The default log directory is "/var/log/openvswitch/". $ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve ovs-vsctl: Error detected while setting up 'gnv0': gnv0: could not set configuration (Invalid argument). See ovs-vswitchd log for details. ovs-vsctl: The default log directory is "/var/log/openvswitch/". After this patch: $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk ovs-vsctl: Error detected while setting up 'dpdk0': 'dpdk0' is missing 'options:dpdk-devargs'. The old 'dpdk<port_id>' names are not supported. See ovs-vswitchd log for details. ovs-vsctl: The default log directory is "/var/log/openvswitch/". $ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch ovs-vsctl: Error detected while setting up 'p+': p+: patch type requires valid 'peer' argument. See ovs-vswitchd log for details. ovs-vsctl: The default log directory is "/var/log/openvswitch/". $ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve ovs-vsctl: Error detected while setting up 'gnv0': gnv0: geneve type requires valid 'remote_ip' argument. See ovs-vswitchd log for details. ovs-vsctl: The default log directory is "/var/log/openvswitch/". CC: Ciara Loftus <ciara.loftus@intel.com> CC: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Tested-by: Ciara Loftus <ciara.loftus@intel.com>	2017-01-11 18:29:39 -08:00
nickcooper-zhangtonghao	bf9f6f80c0	netdev-dummy: Limits the number of tx/rx queues. This patch avoids the ovs_rcu to report WARN, caused by blocked for a long time, when ovs-vswitchd processes a port with many rx/tx queues. The number of tx/rx queues per port may be appropriate, because the dpdk uses it as an default max value. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>	2017-01-10 18:53:34 -08:00
nickcooper-zhangtonghao	cce57f8daa	netdev-dummy: Uses the NR_QUEUE instead of magic numbers. The NR_QUEUE is defined in "lib/dpif-netdev.h", netdev-dpdk uses it instead of magic number. netdev-dummy should be in the same case. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>	2017-01-08 18:06:39 -08:00
nickcooper-zhangtonghao	56edfb185b	datapath: Checks the MTU for netdev-dummy ports. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org>	2016-12-12 17:00:36 -08:00
Ilya Maximets	2a21e75796	netdev: Set the default number of queues at removal from the database Expected behavior for attribute removal from the database is resetting it to default value. Currently this doesn't work for n_rxq/n_txq options of pmd netdevs (last requested value used): # ovs-vsctl set interface dpdk0 options:n_rxq=4 # ovs-vsctl remove interface dpdk0 options n_rxq # ovs-appctl dpif/show | grep dpdk0 <...> dpdk0 1/1: (dpdk: configured_rx_queues=4, <...> \ requested_rx_queues=4, <...>) Fix that by using NR_QUEUE or 1 as a default value for 'smap_get_int'. Fixes: a14b8947fd13 ("dpif-netdev: Allow different numbers of rx queues for different ports.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Tested-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>	2016-12-09 18:15:51 -08:00
Daniele Di Proietto	ae59d13433	tests: Add a new MTU test. Also, netdev-dummy needs to call netdev_change_seq_changed() in set_mtu(). Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>	2016-08-15 11:07:47 -07:00
Daniele Di Proietto	e98d0cb3ac	netdev-dummy: Add dummy-internal class. "internal" netdevs are treated specially in OVS (e.g. for MTU), but the dummy datapath remaps both "system" and "internal" devices to the same "dummy" netdev class, so there's no way to discern those in tests. This commit adds a new "dummy-internal" netdev type, which will be used by the dummy datapath for internal ports, so that other parts of the code can understand which ports are internal just by looking at the netdev object. The alternative solution, using the original interface type ("internal") instead of the translated netdev type ("dummy"), is harder to implement, because in so many places only the netdev object is available. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>	2016-08-15 11:07:42 -07:00
Daniele Di Proietto	1c33f0c35e	netdev: Pass 'netdev_class' to ->run() and ->wait(). This will allow run() and wait() methods to be shared between different classes and still perform class-specific work. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>	2016-08-15 11:07:37 -07:00
Daniele Di Proietto	4124cb1254	netdev: Make netdev_set_mtu() netdev parameter non-const. Every provider silently drops the const attribute when converting the parameter to the appropriate subclass. Might as well drop the const attribute from the parameter, since this is a "set" function. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>	2016-08-12 19:32:12 -07:00
Daniele Di Proietto	efe179e041	netdev-*: Do not use dp_packet_pad() in recv() functions. All the netdevs used by dpif-netdev (except for netdev-dpdk) have a dp_packet_pad() call in the receive function, probably because the userspace datapath couldn't handle properly short packets. This doesn't appear to be the case anymore. This commit removes the call to have a more consistent behavior with the kernel datapath. All the testsuite changes in this commit adjust the expectations for packet lengths in flow dumps and other stats. There's only one fix in ovn.at: one of the test_ip() functions generated an incomplete udp packet, which was not a problem until now, because of the padding. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>	2016-07-29 14:08:10 -07:00
Ilya Maximets	324c837485	dpif-netdev: XPS (Transmit Packet Steering) implementation. If CPU number in pmd-cpu-mask is not divisible by the number of queues and in a few more complex situations there may be unfair distribution of TX queue-ids between PMD threads. For example, if we have 2 ports with 4 queues and 6 CPUs in pmd-cpu-mask such distribution is possible: <------------------------------------------------------------------------> pmd thread numa_id 0 core_id 13: port: vhost-user1 queue-id: 1 port: dpdk0 queue-id: 3 pmd thread numa_id 0 core_id 14: port: vhost-user1 queue-id: 2 pmd thread numa_id 0 core_id 16: port: dpdk0 queue-id: 0 pmd thread numa_id 0 core_id 17: port: dpdk0 queue-id: 1 pmd thread numa_id 0 core_id 12: port: vhost-user1 queue-id: 0 port: dpdk0 queue-id: 2 pmd thread numa_id 0 core_id 15: port: vhost-user1 queue-id: 3 <------------------------------------------------------------------------> As we can see above, the dpdk0 port is polled by threads on cores 12, 13, 16 and 17. By design of dpif-netdev, there is only one TX queue-id assigned to each pmd thread. These queue-ids are sequential, similar to core-ids, and a thread will send packets to the queue with exactly this queue-id regardless of port. In previous example: pmd thread on core 12 will send packets to tx queue 0 pmd thread on core 13 will send packets to tx queue 1 ... pmd thread on core 17 will send packets to tx queue 5 So, for dpdk0 port after truncating in netdev-dpdk: core 12 --> TX queue-id 0 % 4 == 0 core 13 --> TX queue-id 1 % 4 == 1 core 16 --> TX queue-id 4 % 4 == 0 core 17 --> TX queue-id 5 % 4 == 1 As a result only 2 of 4 queues are used. To fix this issue, some kind of XPS is implemented in the following way: * TX queue-ids are allocated dynamically. * When a PMD thread first tries to send packets to a new port, it allocates the least used TX queue for this port. * PMD threads periodically perform revalidation of allocated TX queue-ids. If a queue wasn't used in the last XPS_TIMEOUT_MS milliseconds it will be freed during revalidation. * XPS is not working if we have enough TX queues. Reported-by: Zhihong Wang <zhihong.wang@intel.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>	2016-07-27 12:56:04 -07:00
Terry Wilson	ee89ea7b47	json: Move from lib to include/openvswitch. To easily allow both in- and out-of-tree building of the Python wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to include/openvswitch. This also requires moving lib/{hmap,shash}.h. Both hmap.h and shash.h were #include-ing "util.h" even though the headers themselves did not use anything from there, but rather from include/openvswitch/util.h. Fixing that required including util.h in several C files mostly due to OVS_NOT_REACHED and things like xmalloc. Signed-off-by: Terry Wilson <twilson@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2016-07-22 17:09:17 -07:00
Lance Richardson	7778360b0b	netdev-dummy: fix crash with more than one passive connection Investigation found that some of the occasional failures in the "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case are caused by ovs-vswitchd crashing with SIGSEGV. It turns out that the crash occurs when the number of netdev-dummy passive connections transitions from 1 to 2. When xrealloc() copies the array of dummy_packet_stream structures from the original buffer to a newly allocated one, the struct ovs_list txq member of the structure becomes corrupt (e.g. if ovs_list_is_empty() would have returned false before the copy, it will return true after the copy, which will lead to a crash when the bogus packet buffer on the list is dereferenced). Fix by taking a hint from David Wheeler and adding a level of indirection. Signed-off-by: Lance Richardson <lrichard@redhat.com> [blp@ovn.org folded in an additional bug fix] Signed-off-by: Ben Pfaff <blp@ovn.org>	2016-07-22 15:16:35 -07:00
William Tu	64839cf432	netdev-provider: Apply batch object to netdev provider. Commit 1895cc8dbb64 ("dpif-netdev: create batch object") introduces batch process functions and 'struct dp_packet_batch' to associate with batch-level metadata. This patch applies the packet batch object to the netdev provider interface (dummy, Linux, BSD, and DPDK) so that batch APIs can be used in providers. With batch metadata visible in providers, optimizations can be introduced at per-batch level instead of per-packet. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/145694197 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>	2016-07-21 16:46:32 -07:00
Ilya Maximets	d66e4a5e7e	netdev-dummy: Add n_txq option. Will be used for testing with different numbers of TX queues. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>	2016-07-08 15:27:37 -07:00
Daniele Di Proietto	d537e73a42	netdev-dummy: Allow configuring the numa_id for testing purposes. This commit introduces an (undocumented) option for dummy Interfaces to specify a dummy numa_id, to which the device belongs. It will be used to test the pmd threads in dpif-netdev. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>	2016-06-24 14:15:04 -07:00
William Tu	aaca4fe0ce	ofp-actions: Add truncate action. The patch adds a new action to support packet truncation. The new action is formatted as 'output(port=n,max_len=m)', as output to port n, with packet size being MIN(original_size, m). One use case is to enable port mirroring to send smaller packets to the destination port so that only useful packet information is mirrored/copied, saving some performance overhead of copying entire packet payload. Example use case is below as well as shown in the testcases: - Output to port 1 with max_len 100 bytes. - The output packet size on port 1 will be MIN(original_packet_size, 100). # ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)' - The scope of max_len is limited to output action itself. The following packet size of output:1 and output:2 will be intact. # ovs-ofctl add-flow br0 \ 'actions=output(port=1,max_len=100),output:1,output:2' - The Datapath actions shows: # Datapath actions: trunc(100),1,1,2 Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134 Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org>	2016-06-24 09:17:00 -07:00
Daniele Di Proietto	f8cf65022b	netdev-dummy: Introduce sched_yield() in rxq_recv() for pmd devices. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>	2016-06-07 11:15:01 -07:00
Ilya Maximets	9a81a63728	netdev-dummy: Add multiqueue support to dummy-pmd. All previous multi-open logic preserved for rx queues. Also, added new optional parameter '--qid' for 'netdev-dummy/receive' in order to allow user to choose id of rx queue to which packet will be sent. Ex.: ovs-appctl netdev-dummy/receive p1 --qid 3 'in_port(1) ...' Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>	2016-06-06 18:13:49 -07:00
Ilya Maximets	f9176a3a7f	netdev-dummy: Add dummy-pmd class. 'dummy-pmd' class is a new dummy class. Created in purposes of testing of PMD interfaces. Ex.: ovs-vsctl set interface <iface> type=dummy-pmd Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>	2016-06-06 18:10:17 -07:00
Daniele Di Proietto	050c60bfb5	netdev-dpdk: Use ->reconfigure() call to change rx/tx queues. This introduces in dpif-netdev and netdev-dpdk the first use for the newly introduced reconfigure netdev call. When a request to change the number of queues comes, netdev-dpdk will remember this and notify the upper layer via netdev_request_reconfigure(). The datapath, instead of periodically calling netdev_set_multiq(), can detect this and call reconfigure(). This mechanism can also be used to: * Automatically match the number of rxq with the one provided by qemu via the new_device callback. * Provide a way to change the MTU of dpdk devices at runtime. * Move a DPDK vhost device to the proper NUMA socket. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Tested-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>	2016-05-23 10:27:42 -07:00
Daniele Di Proietto	790fb3b745	netdev: Add reconfigure request mechanism. A netdev provider, especially a PMD provider (like netdev DPDK) might not be able to change some of its parameters (such as MTU, or number of queues) without stopping everything and restarting. This commit introduces a mechanism that allows a netdev provider to request a restart (netdev_request_reconfigure()). The upper layer can be notified via netdev_wait_reconf_required() and netdev_is_reconf_required(). After closing all the rxqs the upper layer can finally call netdev_reconfigure(), to make sure that the new configuration is in place. This will be used by next commit to reconfigure rx and tx queues in netdev-dpdk. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Tested-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>	2016-05-23 10:27:42 -07:00
mweglicx	d6e3feb57c	Add support for extended netdev statistics based on RFC 2819. Implementation of new statistics extension for DPDK ports: - Add new counters definition to netdev struct and open flow, based on RFC2819. - Initialize netdev statistics as "filtered out" before passing it to particular netdev implementation (because of that change, statistics which are not collected are reported as filtered out, and some unit tests were modified in this respect). - New statistics are retrieved using experimenter code and are printed as a result to ofctl dump-ports. - New counters are available for OpenFlow 1.4+. - Add new vendor id: INTEL_VENDOR_ID. - New statistics are printed to output via ofctl only if those are present in reply message. - Add new file header: include/openflow/intel-ext.h which contains new statistics definition. - Extended statistics are implemented only for dpdk-physical and dpdk-vhost port types. - Dpdk-physical implementation uses xstats to collect statistics. - Dpdk-vhost implements only part of statistics (RX packet sized based counters). Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com> [blp@ovn.org made software devices more consistent] Signed-off-by: Ben Pfaff <blp@ovn.org>	2016-05-06 15:28:56 -07:00
Ben Warren	25d436fbd4	Move lib/ofp-print.h to include/openvswitch directory Signed-off-by: Ben Warren <ben@skyportsystems.com> Signed-off-by: Ben Pfaff <blp@ovn.org>	2016-04-14 16:38:32 -07:00
William Tu	91644f45c6	dp-packet: Fix use of uninitialised value at emc_lookup. Valgrind reports "Conditional jump or move depends on uninitialised value" and "Use of uninitialised value" at case 2016 ovn -- 3 HVs, 1 LS, 3 lports/HV. It is caused by 1) assigning an uninitialized value to 'key->hash' at emc_processing(). Due to uninit rss_hash_valid, dp_packet_rss_valid() might return true and undefined hash value is returned, and 2) at emc_lookup, the 'current_entry->key.hash' could be uninitialized due to dp_packet_clone(). The patch fixes the two and as a result, a couple of calls to dp_packet_rss_invalidate() become redundant and thus are removed. Call stacks: - Conditional jump or move depends on uninitialised value(s) dpif_netdev_packet_get_rss_hash (dpif-netdev.c:3334) emc_processing (dpif-netdev.c:3455) dp_netdev_input__ (dpif-netdev.c:3639) and, - Use of uninitialised value of size 8 emc_lookup (dpif-netdev.c:1785) emc_processing (dpif-netdev.c:3457) dp_netdev_input__ (dpif-netdev.c:3639) Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>	2016-04-06 19:32:34 -07:00
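
Note on commit 2482b0b0c8: the message describes packet_type as a 32-bit value whose upper 16 bits carry the name space and whose lower 16 bits carry the type within that name space. The sketch below illustrates that split. The helper names (pt_make, pt_ns, pt_ns_type, pt_is_ethernet, pt_is_l3) are hypothetical and are not taken from the OVS tree, and the values are kept in host byte order for simplicity, whereas the commit specifies a big-endian field, so a real implementation would still need a byte-order conversion.

```c
#include <stdint.h>

/* Name space values as quoted in the commit message. */
enum ofp_header_type_namespaces {
    OFPHTN_ONF = 0,          /* ONF namespace. */
    OFPHTN_ETHERTYPE = 1,    /* ns_type is an Ethertype. */
    OFPHTN_IP_PROTO = 2,     /* ns_type is a IP protocol number. */
    OFPHTN_UDP_TCP_PORT = 3, /* ns_type is a TCP or UDP port. */
    OFPHTN_IPV4_OPTION = 4,  /* ns_type is an IPv4 option number. */
};

/* Hypothetical helper: pack (name space, type) into one 32-bit value,
 * name space in the upper 16 bits.  Host byte order (assumption). */
static inline uint32_t
pt_make(uint16_t ns, uint16_t ns_type)
{
    return ((uint32_t) ns << 16) | ns_type;
}

/* Hypothetical helper: extract the name space (upper 16 bits). */
static inline uint16_t
pt_ns(uint32_t pt)
{
    return (uint16_t) (pt >> 16);
}

/* Hypothetical helper: extract the type (lower 16 bits). */
static inline uint16_t
pt_ns_type(uint32_t pt)
{
    return (uint16_t) (pt & 0xffff);
}

/* (OFPHTN_ONF, 0) is what the commit calls an Ethernet packet. */
static inline int
pt_is_ethernet(uint32_t pt)
{
    return pt == pt_make(OFPHTN_ONF, 0);
}

/* Any (OFPHTN_ETHERTYPE, <Ethertype>) is what the commit calls an L3 packet. */
static inline int
pt_is_l3(uint32_t pt)
{
    return pt_ns(pt) == OFPHTN_ETHERTYPE;
}
```

For example, pt_make(OFPHTN_ETHERTYPE, 0x0800) would describe an IPv4 packet with its Ethernet header removed, and pt_is_l3() would report it as an L3 packet.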

155 Commits
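
Note on commit 324c837485: the message explains that, before XPS, each PMD thread used a fixed TX queue-id that was truncated modulo the port's queue count, so several threads could land on the same queue. The sketch below only reproduces that arithmetic from the commit's example. It is not the dpif-netdev code, and the function names (truncated_txq, count_used_txqs) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the static per-thread queue-id truncated to the
 * number of TX queues the port actually has. */
static unsigned int
truncated_txq(unsigned int static_qid, unsigned int n_txq)
{
    assert(n_txq > 0);
    return static_qid % n_txq;
}

/* Hypothetical helper: count how many distinct TX queues end up in use
 * when the threads with the given static queue-ids send to one port.
 * Limited to ports with at most 32 queues to keep the bitmap simple. */
static unsigned int
count_used_txqs(const unsigned int *static_qids, size_t n_threads,
                unsigned int n_txq)
{
    uint32_t seen = 0;
    unsigned int used = 0;

    assert(n_txq > 0 && n_txq <= 32);
    for (size_t i = 0; i < n_threads; i++) {
        seen |= UINT32_C(1) << truncated_txq(static_qids[i], n_txq);
    }
    for (; seen; seen >>= 1) {
        used += seen & 1;
    }
    return used;
}
```

With the commit's numbers (threads holding queue-ids 0, 1, 4 and 5 sending to a port with 4 queues), the truncated ids are 0, 1, 0 and 1, so only 2 of the 4 queues are used, which is the imbalance the dynamic allocation in XPS is meant to remove.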