mir/ovs - ovs - Mike's Git repositories

mirror of https://github.com/openvswitch/ovs synced 2025-08-29 21:38:13 +00:00

Author	SHA1	Message	Date
Ilya Maximets	f5188ff214	daemon.at: Correctly terminate ovsdb process in a backtrace test. In a backtrace test with monitor the child process will be re-started after being killed. The test doesn't wait for that to happen, so it is possible that during the test cleanup the pid in a pid file is not updated yet. Hence, the on-exit hook will not kill the process. This is causing issues in Cirrus CI, because gmake on FreeBSD waits for all child processes to exit and that never happens. Fix the issue by waiting for a new process. It's also better to exit gracefully instead of relying on the on-exit kill. Fixes: 759a29dc2d97 ("backtrace: Extend the backtrace functionality.") Acked-by: Ales Musil <amusil@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-19 18:38:17 +02:00
Ilya Maximets	24520a401e	vswitchd: Wait for a bridge exit before replying to exit unixctl. Before the cleanup option, the bridge_exit() call was fairly fast, because it didn't include any particularly long operations. However, with the cleanup flag, this function destroys a lot of datapath resources freeing a lot of memory, waiting on RCU and talking to the kernel. That may take a noticeable amount of time, especially on a busy system or under profilers/sanitizers. However, the unixctl 'exit' command replies instantly without waiting for any work to actually be done. This may cause system test failures or other issues where scripts expect ovs-vswitchd to exit or destroy all the datapath resources shortly after appctl call. Fix that by waiting for the bridge_exit() before replying to the user. At least, all the datapath resources will actually be destroyed by the time ovs-appctl exits. Also moving a structure from stack to global. Seems cleaner this way. Since we're not replying right away and it's technically possible to have multiple clients requesting exit at the same time, storing connections in an array. Fixes: fe13ccdca6a2 ("vswitchd: Add --cleanup option to the 'appctl exit' command") Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-18 20:45:00 +02:00
Ilya Maximets	bffffd841f	Prepare for post-3.2.0 (3.2.90). Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:26:13 +02:00
Ilya Maximets	f20980a19e	Prepare for 3.2.0. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:26:08 +02:00
Adrian Moreno	07ce41da11	netdev-linux: Support 64-bit rates in tc policing. Use TCA_POLICE_RATE64 if the rate cannot be expressed using 32bits. This breaks the 32Gbps barrier. The new barrier is ~4Tbps caused by netdev's API expressing kbps rates using 32-bit integers. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2137643 Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:03:54 +02:00
Adrian Moreno	68ac6e9db7	netdev-linux: Refactor nl_msg_put_act_police. In preparation for supporting 64-bit rates in tc policies, move the allocation and initialization of struct tc_police object inside nl_msg_put_act_police(). That way, the function is now called with the actual rates. Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:03:54 +02:00
Adrian Moreno	13e183da31	netdev-linux: Remove tc_matchall_fill_police. It is equivalent to tc_policer_init() so remove the duplicated function. Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:03:54 +02:00
Adrian Moreno	a86fea06fe	netdev-linux: Use 64-bit rates in htb tc classes. Currently, htb rates are capped at ~34Gbps because they are internally expressed as 32-bit fields. Move min and max rates to 64-bit fields and use TCA_HTB_RATE64 and TCA_HTB_CEIL64 to configure HTB classes to break this barrier. In order to test this, create a dummy tuntap device and set its speed to a very high value so we can try adding a QoS queue with big rates. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2137619 Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:03:54 +02:00
Adrian Moreno	7edfac5745	netdev-linux: Use 64bit rtab and burst calculations. tc uses these "rtab" tables to estimate the time (ticks) that it takes to send a packet of different sizes. In preparation for the introduction of 64-bit rates, add an argument to tc_put_rtab() to allow an external 64-bit rate. Also use 64bits for other burst buffer calculation functions. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:03:54 +02:00
Adrian Moreno	b8f8fad864	netdev-linux: Use speed as max rate in tc classes. Instead of relying on feature bits, use the speed value directly as maximum rate for htb and hfsc classes. There is still a limitation with the maximum rate that we can express with a 32-bit number in bytes/s (~ 34.3Gbps), but using the actual link speed instead of the feature bits, we can at least use an accurate maximum for some link speeds (such as 25Gbps) which are not supported by netdev's feature bits. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:03:54 +02:00
Adrian Moreno	6240c0b4c8	netdev: Add netdev_get_speed() to netdev API. Currently, the netdev's speed is being calculated by taking the link's feature bits (using netdev_get_features()) and transforming them into bps. This mechanism can be both inaccurate and difficult to maintain, mainly because we currently use the feature bits supported by OpenFlow which would have to be extended to support all new feature bits of all netdev implementations while keeping the OpenFlow API intact. In order to expose the link speed accurately for all current and future hardware, add a new netdev API call that allows the implementations to provide the current and maximum link speeds in Mbps. Internally, the logic to get the maximum supported speed still relies on feature bits so it might still get out of sync in the future. However, the maximum configurable speed is not used as much as the current speed and these feature bits are not exposed through the netdev interface so it should be easier to add more. Use this new function instead of netdev_get_features() where the link speed is needed. As a consequence of this patch, link speeds of cards are properly reported (internally in OVSDB) even if not supported by OpenFlow. A test verifies this behavior using a tap device. Also, in order to avoid using the old API, this patch adds a checkpatch.py warning if the old API is used. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2137567 Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 20:03:32 +02:00
Ilya Maximets	1ef3f4f78a	AUTHORS: Add Felix Huettner. Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 19:51:39 +02:00
Felix Huettner	5392f89fed	relay: Allow setting probe interval. Previously it was not possible to set the probe interval for the connection from a relay to the backing ovsdb-server. With this change it is now possible using the `ovsdb-server/set-relay-source-probe-interval` command. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-17 19:51:39 +02:00
Kevin Traynor	ef4883a8df	dpif-netdev: Remove pmd-sleep-max experimental tag. Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-15 00:17:05 +02:00
Kevin Traynor	bc6a6f82e5	dpif-netdev: Add pmd-sleep-show command. Max requested sleep time and status for a PMD thread is logged at start up or when changed, but it can be convenient to have a command to dump this information explicitly. It is envisaged that this will be expanded for individual pmds in the future, hence adding to dpif_netdev_pmd_info(). Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-15 00:17:05 +02:00
Kevin Traynor	395668a68d	pmd.at: Add macro for checking pmd sleep max time and state. This is just cosmetic. There is no change to the tests. Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-15 00:11:24 +02:00
Kevin Traynor	023dcdc7a1	dpif-netdev: Rename pmd-maxsleep config option. other_config:pmd-maxsleep is a config option to allow PMD thread cores to sleep under low or no load conditions. Rename it to 'pmd-sleep-max' to allow a more structured name and so that additional options or command can follow the 'pmd-sleep-xyz' pattern. Use of other_config:pmd-maxsleep is deprecated to be removed in a future release and will result in a warning. Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-15 00:11:21 +02:00
Terry Wilson	4d55a364ff	python: Add async DNS support. This adds a Python version of the async DNS support added in: 771680d96 DNS: Add basic support for asynchronous DNS resolving The above version uses the unbound C library, and this implementation uses the SWIG-wrapped Python version of that. In the event that the Python unbound library is not available, a warning will be logged and the resolve() method will just return None. For the case where inet_parse_active() is passed an IP address, it will not try to resolve it, so existing behavior should be preserved in the case that the unbound library is unavailable. Intentional differences from the C version are as follows: OVS_HOSTS_FILE environment variable can be set to override the system 'hosts' file. This is primarily to allow testing to be done without requiring network connectivity. Since resolution can still be done via hosts file lookup, DNS lookups are not disabled when resolv.conf cannot be loaded. The Python socket_util module has fallen behind its C equivalent. The bare minimum change was done to inet_parse_active() to support sync/async dns, as there is no equivalent to parse_sockaddr_components(), inet_parse_passive(), etc. A TODO was added to bring socket_util.py up to equivalency to the C version. Signed-off-by: Terry Wilson <twilson@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-14 22:24:03 +02:00
Paolo Valerio	501f665a5a	conntrack: Extract l4 information for SCTP. Since a27d70a89 ("conntrack: add generic IP protocol support") all the unrecognized IP protocols get handled using ct_proto_other ops and are managed as L3 using 3 tuples. This patch stores L4 information for SCTP in the conn_key so that multiple conn instances, instead of one with ports zeroed, will be created when there are multiple SCTP connections between two hosts. It also performs crc32c check when not offloaded, and adds SCTP to pat_enabled. With this patch, given two SCTP associations between two hosts, tracking the connection will result in: sctp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=55884,dport=5201), reply=(src=10.1.1.1,dst=10.1.1.2,sport=5201,dport=12345),zone=1 sctp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=59874,dport=5202), reply=(src=10.1.1.1,dst=10.1.1.2,sport=5202,dport=12346),zone=1 instead of: sctp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=0,dport=0), reply=(src=10.1.1.1,dst=10.1.1.2,sport=0,dport=0),zone=1 Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-13 21:22:41 +02:00
James Raphael Tiovalen	62f5aa42aa	shash, simap, smap: Add assertions to `*_count` functions. This commit adds assertions in the functions `shash_count`, `simap_count`, and `smap_count` to ensure that the corresponding input struct pointer is not NULL. This ensures that if the return values of `shash_sort`, `simap_sort`, or `smap_sort` are NULL, then the following for loops would not attempt to access the pointer, which might result in segmentation faults or undefined behavior. Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-13 17:29:32 +02:00
Viacheslav Galaktionov	a5fdc45b84	netdev-dpdk: Fix build with experimental API. The set_error function is now used regardless of whether experimental APIs are allowed or not, so it must be defined unconditionally. Fixes: fc06ea9a1883 ("netdev-dpdk: Add custom rx-steering configuration.") Acked-by: Ivan Malov <ivan.malov@arknetworks.am> Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@arknetworks.am> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-13 17:28:14 +02:00
Mike Pattrick	4829506b2a	ofproto-dpif-xlate: Reduce stack usage in recursive xlate functions. Several xlate actions used in recursive translation currently store a large amount of information on the stack. This can result in handler threads quickly running out of stack space before xlate_resubmit_resource_check() is able to terminate translation. This patch reduces stack usage by over 3kb from several translation actions. This patch also moves some trace function from do_xlate_actions into its own function. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2104779 Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-13 15:46:17 +02:00
Eelco Chaudron	f3e9d30041	AUTHORS: Add Chandan Somani. Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-12 12:09:57 +02:00
Chandan Somani	799f697e51	checkpatch: Print subject field if misspelled or missing. This narrows down spelling errors that are in the commit subject. It also provides a subject if the subject line is missing. The provisional subject is the name of the patch file, which should provide some context about the patch. Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Chandan Somani <csomani@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-12 12:01:09 +02:00
Chandan Somani	9a50170a80	checkpatch: Add suggestions to the spell checker. This will be useful for correcting possible spelling mistakes with ease. Suggestions limited to 3 at first, but can be made configurable in the future. Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Chandan Somani <csomani@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-12 12:01:09 +02:00
Chandan Somani	d25c6bd8df	checkpatch: Reorganize flagged words using a list. Single out flagged words and allow for more useful details, like spelling suggestions. Signed-off-by: Chandan Somani <csomani@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-12 12:01:09 +02:00
Ilya Maximets	f770b8c133	AUTHORS: Add James Raphael Tiovalen. Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-12 00:31:40 +02:00
James Raphael Tiovalen	b2d45921a6	ovs-vsctl: Fix crash when routing is enabled. In the case where routing is enabled, the bridge member of the `vsctl_port` structs is not populated. This can cause a crash if we attempt to access it. This patch fixes the crash by checking if the bridge member is valid before attempting to access it. In the `check_conflicts` function, we print both the port name and the bridge name if routing is disabled and we only print the port name if routing is enabled. Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-12 00:30:40 +02:00
James Raphael Tiovalen	e769387b42	file, monitor: Add null pointer assertions for old and new ovsdb_rows. This commit adds non-null pointer assertions in some code that performs some decisions based on old and new input ovsdb_rows. Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-12 00:29:08 +02:00
James Raphael Tiovalen	e71f1a2da1	ovsdb: Assert and check return values of `ovsdb_table_schema_get_column`. This commit adds a few null pointer assertions and checks to some return values of `ovsdb_table_schema_get_column`. If a null pointer is encountered in these blocks, either the assertion will fail or the control flow will now be redirected to alternative paths which will output the appropriate error messages. A few ovsdb-rbac and ovsdb-server tests are also updated to verify the expected warning logs by adding said logs to the ALLOWLIST of the OVSDB_SERVER_SHUTDOWN statements. Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: James Raphael Tiovalen <jamestiotio@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-12 00:22:02 +02:00
Ilya Maximets	00782baac0	AUTHORS: Add Sayali Naval. Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-12 00:14:54 +02:00
Sayali Naval	8e073791d4	bridge: Fix unexpected values for IPFIX enable-input/output-sampling. As per the Open vSwitch Manual ovs-vsctl(8) the Bridge IPFIX parameters can be passed as follows: ovs-vsctl -- set Bridge br0 ipfix=@i \ -- --id=@i create IPFIX targets=\"192.168.0.34:4739\" \ obs_domain_id=123 obs_point_id=456 cache_active_timeout=60 \ cache_max_flows=13 \ other_config:enable-input-sampling=false \ other_config:enable-output-sampling=false where the default values are: enable_input_sampling: true enable_output_sampling: true But in the existing code these 2 parameters take up unexpected values in some scenarios: be_opts.enable_input_sampling = !smap_get_bool(&be_cfg->other_config, "enable-input-sampling", false); be_opts.enable_output_sampling = !smap_get_bool(&be_cfg->other_config, "enable-output-sampling", false); Here, the function smap_get_bool is being used with a negation. This returns expected values for the default case (since the above code will negate “false” we get from smap_get bool function and return the value “true”) but unexpected values for the case where the sampling value is passed through the CLI. For example, if we pass "true" for other_config:enable-input-sampling in the CLI, the above code will negate the “true” value we get from the smap_bool function and return the value “false”. Same would be the case for enable_output_sampling. Acked-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Sayali Naval <sanaval@cisco.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-12 00:12:49 +02:00
Robin Jarry	fc06ea9a18	netdev-dpdk: Add custom rx-steering configuration. Some control protocols are used to maintain link status between forwarding engines (e.g. LACP). When the system is not sized properly, the PMD threads may not be able to process all incoming traffic from the configured Rx queues. When a signaling packet of such protocols is dropped, it can cause link flapping, worsening the situation. Use the rte_flow API to redirect these protocols into a dedicated Rx queue. The assumption is made that the ratio between control protocol traffic and user data traffic is very low and thus this dedicated Rx queue will never get full. Re-program the RSS redirection table to only use the other Rx queues. The additional Rx queue will be assigned a PMD core like any other Rx queue. Polling that extra queue may introduce increased latency and a slight performance penalty at the benefit of preventing link flapping. This feature must be enabled per port on specific protocols via the rx-steering option. This option takes "rss" followed by a "+" separated list of protocol names. It is only supported on ethernet ports. This feature is experimental. If the user has already configured multiple Rx queues on the port, an additional one will be allocated for control packets. If the hardware cannot satisfy the number of requested Rx queues, the last Rx queue will be assigned for control plane. If only one Rx queue is available, the rx-steering feature will be disabled. If the hardware does not support the rte_flow matchers/actions, the rx-steering feature will be completely disabled on the port and regular rss will be performed instead. It cannot be enabled when other-config:hw-offload=true as it may conflict with the offloaded flows. Similarly, if hw-offload is enabled, custom rx-steering will be forcibly disabled on all ports and replaced by regular rss. 
Example use: ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \ set interface phy0 type=dpdk options:dpdk-devargs=0000:ca:00.0 -- \ set interface phy0 options:rx-steering=rss+lacp -- \ set interface phy1 type=dpdk options:dpdk-devargs=0000:ca:00.1 -- \ set interface phy1 options:rx-steering=rss+lacp As a starting point, only one protocol is supported: LACP. Other protocols can be added in the future. NIC compatibility should be checked. To validate that this works as intended, I used a traffic generator to generate random traffic slightly above the machine capacity at line rate on a two ports bond interface. OVS is configured to receive traffic on two VLANs and pop/push them in a br-int bridge based on tags set on patch ports. [Test topology diagram. DUT: bridge br-int with ports patch10 and patch11 and flows in_port=patch10,actions=mod_dl_src:$patch11,mod_dl_dst:$tgen0,output:patch10 and in_port=patch11,actions=mod_dl_src:$patch10,mod_dl_dst:$tgen0,output:patch10; these connect to br-phy ports patch00 (tag:10) and patch01 (tag:20); br-phy uses the default flow, action=NORMAL, and carries bond0 (balance-slb, lacp=passive, lacp-time=fast) over phy0 and phy1. Switch: port0 and port1 form a lag (balance L3/L4, lacp=active, lacp-time=fast, mode trunk VLANs 10, 20); port2 (vlan 10) and port3 (vlan 20) are in mode access. Traffic generator: tgen0 and tgen1 send random traffic that is properly balanced across the bond ports in both directions.] Without rx-steering, the bond0 links are randomly switching to "defaulted" when one of the LACP packets sent by the switch is dropped because the RX queues are full and the PMD threads did not process them fast enough. 
When that happens, all traffic must go through a single link which causes above line rate traffic to be dropped. ~# ovs-appctl lacp/show-stats bond0 ---- bond0 statistics ---- member: phy0: TX PDUs: 347246 RX PDUs: 14865 RX Bad PDUs: 0 RX Marker Request PDUs: 0 Link Expired: 168 Link Defaulted: 0 Carrier Status Changed: 0 member: phy1: TX PDUs: 347245 RX PDUs: 14919 RX Bad PDUs: 0 RX Marker Request PDUs: 0 Link Expired: 147 Link Defaulted: 1 Carrier Status Changed: 0 When rx-steering is enabled, no LACP packet is dropped and the bond links remain enabled at all times, maximizing the throughput. Neither the "Link Expired" nor the "Link Defaulted" counters are incremented anymore. This feature may be considered as "QoS". However, it does not work by limiting the rate of traffic explicitly. It only guarantees that some protocols have a lower chance of being dropped because the PMD cores cannot keep up with regular traffic. The choice of protocols is limited on purpose. This is not meant to be configurable by users. Some limited configurability could be considered in the future but it would expose to more potential issues if users are accidentally redirecting all traffic in the isolated queue. Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Robin Jarry <rjarry@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-10 15:49:44 +02:00
David Marchand	a5669fd51c	netdev-dpdk: Drop TSO in case of conflicting virtio features. At some point in OVS history, some virtio features were announced as supported (ECN and UFO virtio features). The userspace TSO code, which has been added later, does not support those features and tries to disable them. This breaks OVS upgrades: if an existing VM already negotiated such features, their lack on reconnection to an upgraded OVS triggers a vhost socket disconnection by Qemu. This results in an endless loop because Qemu then retries with the same set of virtio features. This patch proposes to try and detect those vhost socket disconnection and fallback restoring the old virtio features (and disabling TSO for this vhost port). Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Simon Horman <simon.horman@corigine.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-07 18:05:48 +02:00
Gavin Li	b4c7009c20	system-offloads-traffic.at: Add vxlan gbp offload test. Add a vxlan gbp offload test case: vxlan offloads with gbp extention - ping between two ports - offloads enabled ok Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	7f04588d78	netdev-tc-offloads: Probe for allowing vxlan gbp support. Kernels that do not support vxlan gbp would treat the rule that has vxlan gbp encap action or vxlan gbp id match differently, either reject it or just skip the action/match and continue processing the known ones. To solve the issue, probe and disallow inserting rules with vxlan gbp action/match if kernel does not support it. Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	a2a3f1983f	tc: Add vxlan encap action with gbp option offload. Add TC offload support for vxlan encap with gbp option. Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	256c1e5819	tc: Pass encap entirely to nl_msg_put_act_tunnel_key_set. Most of the data members of struct tc_action{ } are defined as anonymous struct in place. Instead of passing all members of an anonymous struct, which is not flexible to new members being added, expose encap as named struct and pass it entirely. Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	a4332b5e68	tc: Add vxlan gbp option flower match offload. Add TC offload support for filtering vxlan tunnels with gbp option. Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	c39d7d06f5	netlink: Add new function to add NLA_F_NESTED to nested netlink messages. Linux kernel netlink module added NLA_F_NESTED flag checking for nested netlink messages in 5.2. A nested message without the flag set will be treated as malformatted one. The check is optional and is controlled by message policy. To avoid this, add NLA_F_NESTED explicitly for all nested netlink messages with a new function nl_msg_start_nested_with_flag(). Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	31baa7781e	odp-util: Extract vxlan gbp option encoding to a function. Extract vxlan gbp option encoding to odp_encode_gbp_raw to be used in following commits. Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	8c3d5488da	odp-util: Extract vxlan gbp option decoding to a function. Extract vxlan gbp option decoding to odp_decode_gbp_raw to be used in following commits. Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Gavin Li	affb9b8183	tc: Pass tunnel entirely to tunnel option parse and put functions. Tc flower tunnel key options were encoded in nl_msg_put_flower_tunnel_opts and decoded in nl_parse_flower_tunnel_opts. Only geneve was supported. To avoid adding more arguments to the function to support more vxlan options in the future, change the function arguments to pass tunnel entirely to it instead of keep adding new arguments. Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Gavin Li <gavinl@nvidia.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2023-07-03 11:56:39 +02:00
Ilya Maximets	c2433bdfc0	dpif-netdev: Lockless meters. Current implementation of meters in the userspace datapath takes the meter lock for every packet batch. If more than one thread hits the flow with the same meter, they will lock each other. Replace the critical section with atomic operations to avoid interlocking. Meters themselves are RCU-protected, so it's safe to access them without holding a lock. Implementation does the following: 1. Tries to advance the 'used' timer of the meter with atomic compare+exchange if it's smaller than 'now'. 2. If the timer change succeeds, atomically update band buckets. 3. Atomically update packet statistics for a meter. 4. Go over buckets and try to atomically subtract the amount of packets or bytes, recording the highest exceeded band. 5. Atomically update band statistics and drop packets. Bucket manipulations are implemented with atomic compare+exchange operations with extra checks, because bucket size should never exceed the maximum and it should never go below zero. Packet statistics may be momentarily inconsistent, i.e., number of packets and the number of bytes may reflect different sets of packets. But it should be eventually consistent. And the difference at any given time should be in just few packets. For the sake of reduced code complexity PKTPS meter tries to push packets through the band one by one, even though they all have the same weight. This is also more fair if more than one thread is passing packets through the same band at the same time. Trying to predict the number of packets that can pass may also cause extra atomic operations reducing the performance. This implementation shows similar performance to the previous one, but should scale better with more threads hitting the same meter. 
Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Lin Huang <linhuang@ruijie.com.cn> Tested-by: Zhang YuHuang <zhangyuhuang@ruijie.com.cn> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-07-01 00:35:18 +02:00
Han Zhou	2ece9c9ac1	ovsdb: raft: Fix RAFT paper link. Signed-off-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-30 00:03:36 +02:00
Paolo Valerio	9b4d2ad8e8	conntrack: Allow to dump userspace conntrack expectations. The patch introduces a new command, ovs-appctl dpctl/dump-conntrack-exp, that allows dumping the existing expectations for the userspace ct. Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-29 22:20:43 +02:00
Kevin Traynor	34ace16cb8	tests: Add macro to common file. get_log_next_line_num() was defined in alb.at. As it may be useful in other test files, move to ofproto-macros.at. Suggested-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-29 22:13:55 +02:00
Dumitru Ceara	d56932aac6	checkpatch: Ignore yml files when checking line lengths. As far as I can tell they're used mostly for CI job definitions and these tend to result in long lines. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2023-June/405796.html Suggested-by: Aaron Conole <aconole@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-28 12:39:31 +02:00
Eelco Chaudron	903294cde6	dpif: Add coverage counters for dpif_operate() failures. Add additional error coverage counters for dpif operation failures. This could help to quickly identify netlink problems when communicating with the OVS kernel module. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2070630 Reviewed-by: Adrian Moreno <amorenoz@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-23 15:04:55 +02:00
Simon Horman	c918670302	MAINTAINERS: Add Eelco Chaudron. Eelco Chaudron was elected by the Open vSwitch committers yesterday. This formalises his status as an Open vSwitch committer. Welcome Eelco! Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2023-06-21 15:02:47 +02:00

19672 Commits