mir/ovs

mirror of https://github.com/openvswitch/ovs synced 2025-08-29 13:27:59 +00:00

Author	SHA1	Message	Date
Ben Pfaff	8ba0a5227f	ovs-thread: Make caller provide thread name when creating a thread. Thread names are occasionally very useful for debugging, but from time to time we've forgotten to set one. This commit adds the new thread's name as a parameter to the function to start a thread, to make that mistake impossible. This also simplifies code, since two function calls become only one. This makes a few other changes to the thread creation function: * Since it is no longer a direct wrapper around a pthread function, rename it to avoid giving that impression. * Remove 'pthread_attr_t *' param that every caller supplied as NULL. * Change 'pthread *' parameter into a return value, for convenience. The system-stats code hadn't set a thread name, so this fixes that issue. This patch is a prerequisite for making RCU report the name of a thread that is blocking RCU synchronization, because the easiest way to do that is for ovsrcu_quiesce_end() to record the current thread's name. ovsrcu_quiesce_end() is called before the thread function is called, so it won't get a name set within the thread function itself. Setting the thread name earlier, as in this patch, avoids the problem. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>	2014-04-28 15:25:49 -07:00
Andy Zhou	62ac1f20e9	openvswitch.h: rename hash action definition Rename hash_bias to hash_basis to make it consistent with similar usages. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>	2014-04-20 22:28:01 -07:00
Andy Zhou	fbfe01de0d	odp-util: Always generate key/mask pair in netlink for recirc_id Currently netlink flow (and mask) recirc_id attribute is only serialized when the recirc_id value is non-zero. For this logic to work correctly, the interpretation of the missing recirc_id depends on whether the datapath supports recirculation. This patch remove the ambiguity of the meaning of missing recirc_id attribute in netlink message. When recirc_id is non-zero, or when it is not a wildcard match, both key and mask attributes are serialized. On the other hand, when recirc_id is zero, and being wildcarded, they are not serialized. A missing recirc_id key and mask attribute thus should always be interpreted as wildcard, same as other flow fields. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>	2014-04-20 22:27:55 -07:00
Jarno Rajahalme	4f15074492	dpif-netdev: Use miniflow as a flow key. Use miniflow as a flow key in the userspace datapath classifier. The miniflow is expanded for upcalls, but for existing datapath flows, the key need not be expanded. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Reviewed-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>	2014-04-18 08:39:44 -07:00
Andy Zhou	347bf289b3	dpif-netdev: Move hash function out of the recirc action, into its own action Currently recirculation action can optionally compute hash. This patch adds a hash action that is independent of the recirc action, which no longer computes hash. For megaflow bond with recirc, the output to a bond port action will look like: hash(hash_l4(0)), recirc(<recirc_id>) Obviously, when a recirculation application that does not depend on hash value can just use the recirc action alone. Signed-off-by: Andy Zhou <azhou@nicira.com> Reviewed-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Acked-by: Pravin B Shelar <pshelar@nicira.com>	2014-04-16 15:30:42 -07:00
Andy Zhou	cd527139bb	dpif-netdev: Use existing flow for computing dp hash Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-04-09 10:59:48 -07:00
Andy Zhou	4347b9b38e	dpif-netdev: preserve packet metadata fields across recirculation If the actions executed during recirculation changed metadata fields, then any actions after the recirculation returns would see those new values. Now, all metadata are saved and restored across a recirculation. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-04-09 10:59:48 -07:00
Andy Zhou	adcf00ba35	ofproto/bond: Implement bond megaflow using recirculation Infrastructure to enable megaflow support for bond ports using recirculation. This patch adds the following features: * Generate RECIRC action when bond can benefit from recirculation. * Populate post recirculation rules in a hidden table. Currently table 254. * Uses post recirculation rules for bond rebalancing * A recirculation implementation in dpif-netdev. The goal of this patch is to be able to megaflow bond outputs and thus greatly improve performance. However, this patch does not actually improve the megaflow generation. It is left for a later commit. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-04-07 19:55:30 -07:00
Ben Pfaff	f3f750e5ae	dpif-netdev: Unwildcard entire odp_port in dpif_netdev_mask_from_nlattrs(). One case in the dpif_netdev_mask_from_nlattrs() function accidentally wildcarded only a 16-bit subset of the mask's odp_port. On little-endian machines this subset was the lower bits, which happened to work out OK, but on big-endian machines this subset was the upper bits, which doesn't work and causes a test failure. (The problem was actually visible in the test expected results on little-endian machines, but we had not noticed.) This commit unwildcards the whole field, fixing the problem, and updates the test expected results to match. This fixes the failure of test 732 seen here: https://buildd.debian.org/status/fetch.php?pkg=openvswitch&arch=sparc&ver=2.1.0%2Bgit20140325-1&stamp=1396438624 Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>	2014-04-05 10:27:05 -07:00
Pravin Shelar	1f317cb5c2	ofpbuf: Introduce access api for base, data and size. These functions will be used by later patches. Following patch does not change functionality. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-03-30 06:18:43 -07:00
Pritesh Kothari	5794e276b4	sparse: workaround for a bug in sparse. sparse emits the following warning: lib/dpif-netdev.c:1755:15: warning: Initializer entry defined twice lib/dpif-netdev.c:1755:15: also defined here due to a bug in sparse which doesn't like inlined functions which expands a #define within it. This commit removes inline to make sparse happy. Signed-off-by: Pritesh Kothari <pritesh.kothari@cisco.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-03-28 14:40:07 -07:00
YAMAMOTO Takashi	9b516652a1	recirculation: Some cosmetic fixes Wrap long lines, fix whitespaces, and fix a typo in a comment. No functional changes are intended. Cc: Andy Zhou <azhou@nicira.com> Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Andy Zhou <azhou@nicira.com>	2014-03-28 13:14:18 -07:00
Andy Zhou	572f732ab0	dpif-netdev: user space datapath recirculation Add basic recirculation infrastructure and user space data path support for it. The following bond mega flow patch will make use of this infrastructure. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-03-25 13:24:39 -07:00
Pravin	8617affff4	netdev-dpdk: Use multiple core for dpdk IO. DPDK need to set _lcore_id for using multiple core. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>	2014-03-21 11:48:28 -07:00
Pravin	55c955bd8a	netdev: Add support multiqueue recv. new netdev type like DPDK can support multi-queue IO. Following patch Adds support for same. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>	2014-03-21 11:48:28 -07:00
Pravin	f77917408a	netdev: Rename netdev_rx to netdev_rxq Preparation for multi queue netdev IO. There are no functional changes in this patch. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>	2014-03-21 11:48:28 -07:00
Pravin	e4cfed38b1	dpif-netdev: Add poll-mode-device thread. This patch adds PMD type netdev for netdevice with poll-mode drivers. Since there is no way to get signal on a packet recv from these devices we need to poll them in busy loop. So minimize system call overhead this patch uses dpif-thread exclusively for PMD devices and rest of devices which needs system calls to do IO are moved to dpif-netdev-run(). PMD device like DPDK work in userspace so there is no system call overhead for them. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>	2014-03-21 11:48:28 -07:00
Pravin	b284085e55	dpif-netdev: Add ref-counting for port. DPDK Poll mode thread need to keep ref to dpif-port. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>	2014-03-21 11:48:28 -07:00
Pravin	40d26f04b2	netdev: Send ofpbuf directly to netdev. DPDK netdev need to access ofpbuf while sending buffer. Following patch changes netdev_send accordingly. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>	2014-03-21 11:48:28 -07:00
Pravin	df1e5a3bc7	netdev: Extend rx_recv to pass multiple packets. DPDK can receive multiple packets but current netdev API does not allow that. Following patch allows dpif-netdev receive batch of packet in a rx_recv() call for any netdev port. This will be used by dpdk-netdev. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-03-21 11:48:28 -07:00
Alex Wang	63be20bee2	dpif-netdev: Implement the API functions to allow multiple handler threads read upcall. This commit implements the API functions to allow multiple handler threads read upcall. Also, this commit removes the handling priority of DPIF_UC_MISS over DPIF_UC_ACTION. So, both misses will be put to the same queue. The decision is based on the fact that a lot has changed since the age when flow setup rate is most treasured and starving all actions in the presence of any flow misses doesn't seem like a sound balancing solution. Thusly the current implementation will be put in testing and investigation for better balancing solution will continue if there is an issue. Also note, the introduction and use of flow_hash_5tuple() will put missed ICMP packets from same source but with different type/code to different handler queues. This may cause reordering of these packets. For now, we do not count this as a problem. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-03-20 10:27:20 -07:00
Alex Wang	1954e6bbcb	dpif: Change dpif API to allow multiple handler threads read upcall. This commit changes the API in 'dpif-provider.h' to allow multiple handler threads call dpif_recv() simultaneously. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-03-20 10:27:10 -07:00
Jarno Rajahalme	e0eecb1ca1	lib: Use tcp_flags from flow. TCP flags are already extracted from the flow, no need to parse them again. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-03-19 16:13:32 -07:00
Jarno Rajahalme	855dd13c9a	dpif-netdev: Use packet key to parse TCP flags. The flow that created the netdev_flow might have wildcarded TCP flags, or it may not be a TCP flow at all. Fix this by using the freshly extracted flow key to parse TCP flags. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-03-19 16:13:32 -07:00
Ben Pfaff	61e7deb143	dpif-netdev: Use RCU to protect data. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-03-19 07:48:43 -07:00
Ben Pfaff	679ba04cab	dpif-netdev: Use ovsthread_stats for flow stats. This should scale better than a single mutex, though still not ideally. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-03-19 07:48:42 -07:00
Ben Pfaff	51852a57a0	ovs-thread: Replace ovsthread_counter by more general ovsthread_stats. This allows clients to do more than just increment a counter. The following commit will make the first use of that feature. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-03-19 07:47:12 -07:00
Andy Zhou	1a65ba8544	dpif-netdev: init atomic flag dp->destroyed It is better to explicitly initialize the dp->destroy than to rely on xzalloc(). Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-03-18 00:40:10 -07:00
Ben Pfaff	8917f72cbb	ovs-atomic: Delete atomic, atomic_flag, ovs_refcount destroy functions. None of the atomic implementations need a destroy function anymore, so it's "more standard" and more convenient for users to get rid of them. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-03-13 12:45:47 -07:00
Andy Zhou	b5e7e61a99	lib: simplify flow_extract() API Change the flow_extract() API to accept struct pkt_metadata, instead of individual metadata fields. It will make the API more logical and easier to maintain when we need to expand metadata down the road. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>	2014-02-28 16:29:37 -08:00
Joe Stringer	bdeadfdd95	dpif: New function flow_dump_next_may_destroy_keys(). This new function allows callers to determine whether previously returned keys will be modified or reallocated on the next call to dpif_flow_dump_next(). This will be used in a future commit to allow batched flow deletion by revalidator threads. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-02-27 14:39:21 -08:00
Joe Stringer	d2ad7ef178	dpif: Make dpif_flow_dump_next() thread-safe. This patch makes it the caller's responsibility to initialize a per-thread 'state' object and pass it down to the dpif_flow_dump_next() implementation. The implementation can expect to be called from multiple threads with the same 'iter' and different 'state' objects. When flow_dump_next() returns non-zero, the implementation must ensure that subsequent calls with the same arguments also return non-zero. Subsequent calls with the same 'iter' and different 'state' may return zero, but should make progress towards returning non-zero. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-02-27 14:30:25 -08:00
Joe Stringer	e723fd32d5	dpif: Separate local and shared flow dump state. This patch separates the structures for thread-local flow dump state ("state") from the shared flow dump state ("iter") in dpif-linux and dpif-netdev. Future patches will make use of this to allow multiple threads to dump flows from the same flow dump operation. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-02-27 14:27:32 -08:00
Alex Wang	71c24bb0f8	dpif-netdev: Fix memory leak. In dpif_netdev_flow_del() and dp_netdev_port_input(), the referenced 'netdev_flow' is not un-referenced. This causes the leak of the struct's memory. This commit fixes the above issue by calling dp_netdev_flow_unref() after using the reference. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-02-21 14:07:46 -08:00
Alex Wang	3754832be4	dpif-netdev: Call ovs_refcount_destroy() before free(). This commit makes dp_netdev_flow_unref() and dp_netdev_actions_unref() invoke the ovs_refcount_destroy() before freeing the corresponding pointer. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2014-02-21 14:05:07 -08:00
Ben Pfaff	8bfd0fdace	Enhance userspace support for MPLS, for up to 3 labels. This commit makes the userspace support for MPLS more complete. Now up to 3 labels are supported. Signed-off-by: Ben Pfaff <blp@nicira.com> Co-authored-by: Simon Horman <horms@verge.net.au> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-02-04 10:41:30 -08:00
Ben Pfaff	80e448834d	dpif-netdev: Make a log message more detailed. This would have helped me track down a bug I was hunting just now. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-02-04 08:11:45 -08:00
Ben Pfaff	06f8162043	classifier: Use fat_rwlock instead of ovs_rwlock. Jarno Rajahalme reported up to 40% performance gain on netperf TCP_CRR with an earlier version of this patch in combination with a kernel NUMA patch, together with a reduction in variance: http://openvswitch.org/pipermail/dev/2014-January/035867.html Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2014-01-14 14:45:10 -08:00
Ben Pfaff	6c3eee823e	dpif-netdev: Use separate threads for forwarding. For now, we use exactly two threads. Presumably at some point we will want to make this configurable. Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-01-08 17:13:32 -08:00
Ben Pfaff	8a4e3a858a	dpif-netdev: Make thread-safety much more granular. This will allow for parallelism in multithreaded forwarding in an upcoming commit. Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-01-08 17:13:32 -08:00
Ben Pfaff	f5126b5727	dpif-netdev: Introduce new mutex to protect queues. This is a first step in making thread safety more granular in dpif-netdev, to allow for multithreaded forwarding. Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-01-08 17:13:31 -08:00
Ben Pfaff	a84cb64a9e	dpif-netdev: Break actions out into new struct dp_netdev_actions. This is analogous to the split between rule and rule_actions in ofproto. As there, it will allow retaining a reference to a rule's actions, while processing them, without having to retain a reference to the rule itself. Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-01-08 17:13:31 -08:00
Ben Pfaff	6a8267c5b7	dpif-netdev: Take advantage of ovs_refcount for dp_netdev. By making "destroyed" own a reference, we can treat dp_netdev's ref_cnt like any other in Open vSwitch. Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-01-08 17:13:31 -08:00
Ben Pfaff	5c8d2fcad0	dpif-netdev: Remove max_mtu tracking. Normally all the ports have the same mtu anyhow, so there is little advantage in keeping track of the maximum mtu on a per-bridge basis. In upcoming commits, tracking mtu will require more locking and present even less advantage (because the packet buffer will become per-thread, so that reallocating once per thread becomes essentially a null cost). Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2014-01-08 17:11:14 -08:00
Ben Pfaff	ff073a71f9	dpif-netdev: Use hmap instead of list+array for tracking ports. The goal is to make it easy to divide the ports into groups for handling by threads. It seems easy enough to do that by hash value, and a little harder otherwise. This commit has the side effect of raising the maximum number of ports from 256 to UINT32_MAX-1. That is why some tests need to be updated: previously, internally generated port names like "ovs_vxlan_4341" were ignored because 4341 is bigger than the previous limit of 256. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2014-01-08 17:11:09 -08:00
Ben Pfaff	ed27e010b9	dpif-netdev: Use new "ovsthread_counter" to track dp statistics. ovsthread_counter is an abstract interface that could be implemented different ways. The initial implementation is simple but less than optimally efficient. Signed-off-by: Ben Pfaff <blp@nicira.com>	2014-01-08 17:10:32 -08:00
Ben Pfaff	9e5026938c	dpif: Remove unused 'get_max_ports' from provider interface. Nothing ever called this function. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>	2014-01-08 17:10:31 -08:00
Jarno Rajahalme	758c456df5	dpif: Use explicit packet metadata. This helps reduce confusion about when a flow is a flow and when it is just metadata. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2013-12-30 16:52:43 -08:00
Jarno Rajahalme	09f9da0bca	odp-execute: Consolidate callbacks. Use one callback instead of many, helps in adding new functionality later on. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>	2013-12-30 15:58:58 -08:00
Simon Horman	77790ca7b1	dpif-netdev: Remove unnecessary parameters from dp_netdev_port_input() The skb_priority, pkt_mark and tunl parameters of dp_netdev_port_input() are always passed as 0, 0 and NULL respectively. So rather than passing these values to dp_netdev_port_input() just use them directly. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>	2013-12-17 16:31:34 -08:00

394 Commits