mir/ovs - ovs - Mike's Git repositories

mir/ovs

mirror of https://github.com/openvswitch/ovs synced 2025-08-31 14:25:26 +00:00

Author	SHA1	Message	Date
Simon Horman	95ff912ede	appveyor: Prepare for rename of primary development branch. Recently OVS adopted a policy of using the inclusive naming word list v1 [1, 2]. And in keeping with this policy it is intended to rename the primary development branch from master to main [3]. In order to help facilitate this change allow Appveyor to run on the main as well as master branch. It is intended that master branch will be removed from appveyor.yml after the primary branch has been renamed. Also, update the string included in artifacts from 'master' to 'main'. [1] `df5e5cf431` ("Documentation: Add section on inclusive language.") [2] https://inclusivenaming.org/word-lists/ [3] https://mail.openvswitch.org/pipermail/ovs-dev/2024-March/412686.html Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Ilya Maximets <i.maximets@ovn.org>	2024-04-02 12:18:08 +01:00
Ilya Maximets	5b9feacfc8	ovsdb: raft: Fix inability to join after leadership change round trip. Consider the following sequence of events: 1. Cluster with 2 nodes - A and B. A is a leader. 2. C connects to A and sends a join request. 3. A sends an append request to C. C is in CATCHUP phase for A. 4. A looses leadership to B. Sends join failure notification to C. 5. C sends append reply to A. 6. A discards append reply (not leader). 7. B looses leadership back to A. 8. C sends a new join request to A. 9. A replies with failure (already in progress). 10. GoTo step 8. At this point A is waiting for an append reply that it already discarded at step 6 and fails all the new attempts of C to join with 'already in progress' verdict. C stays forever in a joining state and in a CATCHUP phase from A's perspective. This is a similar case to a sudden disconnect from a leader fixed in commit `999ba294fb` ("ovsdb: raft: Fix inability to join the cluster after interrupted attempt."), but since we're not disconnecting, the servers are not getting destroyed. Fix that by destroying all the servers that are not yet part of the configuration after leadership is lost. This way, server C will be able to simply re-start the joining process from scratch. New failure test command is added in order to simulate leadership change before we receive the append reply, so it gets discarded. New cluster test is added to exercise this scenario. Fixes: `1b1d2e6daa` ("ovsdb: Introduce experimental support for clustered databases.") Reported-at: https://github.com/ovn-org/ovn/issues/235 Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-27 21:20:01 +01:00
Ilya Maximets	e987af5039	ovsdb: raft: Fix assertion when 1-node cluster looses leadership. Some of the failure tests can make a single-node cluster to loose leadership. In this case the next raft_run() will trigger election with a pre-vote enabled. This is causing an assertion when this server attempts to vote for itself. Fix that by not using pre-voting if there is only one server. A new failure test introduced in later commit triggers this assertion every time. Fixes: `85634fd580` ("ovsdb: raft: Support pre-vote mechanism to deal with disruptive server.") Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-27 21:19:58 +01:00
Ilya Maximets	af5a99737e	ovsdb: raft: Fix permanent joining state on a cluster member. Consider the following chain of events: 1. Have a cluster with 2 members - A and B. A is a leader. 2. C connects to A, sends a request to join the cluster. 3. A catches up C, creates an update for the 'servers' list and sends it to B and C to apply. This entry is not committed yet. 4. Before B or C can reply, A looses leadership for some reason. 5. A sends a joining failure message to C, C remains in joining state. 5. Both B and C have the new version of 'servers', so they recognize each other as valid cluster members. 6. B initiates a vote, C (or A) replies and B becomes a new leader. 7. B has a new list of servers. B commits it. C becomes a committed cluster member. 8. A and C receive heartbeats with a new commit index and C is now a committed cluster member for all A, B and C. However, at the end of this process, C is still in joining state as it never received a successful reply for a join request, and C is still in a COMMITTING phase for A. So, C skips some parts of the RAFT life cycle and A will refuse to transfer leadership to C if something happens in the future. More interestingly, B can actually transfer leadership to C and vote for it. A will vote for it just fine as well. After that, C becomes a new cluster leader while still in joining state. In this state C will not commit any changes. So, we have seemingly stable cluster that doesn't commit any changes! E.g.: s3 Address: unix:s3.raft Status: joining cluster Remotes for joining: unix:s3.raft unix:s2.raft unix:s1.raft Role: leader Term: 4 Leader: self Vote: self Last Election started 30095 ms ago, reason: leadership_transfer Last Election won: 30093 ms ago Election timer: 1000 Log: [2, 7] Entries not yet committed: 2 Entries not yet applied: 6 Connections: ->s1 ->s2 <-s1 <-s2 Disconnections: 0 Servers: s3 (60cf at unix:s3.raft) (self) next_index=7 match_index=6 s2 (46aa at unix:s2.raft) next_index=7 match_index=6 last msg 58 ms ago s1 (28f7 at unix:s1.raft) next_index=7 match_index=6 last msg 59 ms ago Fix the first scenario by examining server changes in committed log entries. This way server A can transition C to a STABLE phase and server C can find itself in the committed list of servers and move out from a joining state. This is similar to completing commands without receiving an explicit reply or after the role change from leader to follower. The second scenario with a leader in a joining state can be fixed when the joining server becomes leader. New leader's log is getting committed automatically and all servers transition into STABLE phase for it, but it should also move on from a joining state, since it leads the cluster now. It is also possible that B transfers leadership to C before the list of servers is marked as committed on other servers. In this case C will commit it's own addition to the cluster configuration. The added test usually triggers both scenarios, but it will trigger at least one of them. Fixes: `1b1d2e6daa` ("ovsdb: Introduce experimental support for clustered databases.") Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-27 21:19:56 +01:00
Ilya Maximets	bcad733e2c	ovsdb: raft: Fix time intervals for multitasking while joining. While joining, ovsdb-server may not wake up for a duration of a join timer, which is 1 second and is by default 3x larger than a heartbeat timer. This is causing unnecessary warnings from the cooperative multitasking module that thinks that we missed the heartbeat time by a lot. Use join timer (1000) instead while joining. Fixes: `d4a15647b9` ("ovsdb: raft: Enable cooperative multitasking.") Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-27 21:19:52 +01:00
Ilya Maximets	aab379ec21	ovsdb: raft: Avoid transferring leadership to unavailable servers. Current implementation of the leadership transfer just shoots the leadership in the general direction of the first stable server in the configuration. It doesn't check if the server was active recently or even that the connection is established. This may result in sending leadership to a disconnected or otherwise unavailable server. Such behavior should not cause log truncation or any other correctness issues because the destination server would have all the append requests queued up or the connection will be dropped by the leader. In a worst case we will have a leader-less cluster until the next election timer fires up. Other servers will notice the absence of the leader and will trigger a new leader election normally. However, the potential wait for the election timer is not good as real-world setups may have high values configured. Fix that by trying to transfer to servers that we know have applied the most changes, i.e., have the highest 'match_index'. Such servers replied to the most recent append requests, so they have highest chances to be healthy. Choosing the random starting point in the list of such servers so we don't transfer to the same server every single time. This slightly improves load distribution, but, most importantly, increases robustness of our test suite, making it cover more cases. Also checking that the message was actually sent without immediate failure. If we fail to transfer to any server with the highest index, try to just transfer to any other server that is not behind majority and then just any other server that is connected. We did actually send them all the updates (if the connection is open), they just didn't reply yet for one reason or another. It should be better than leaving the cluster without a leader. Note that there is always a chance that transfer will fail, since we're not waiting for it to be acknowledged (and must not wait). In this case, normal election will be triggered after the election timer fires up. Fixes: `1b1d2e6daa` ("ovsdb: Introduce experimental support for clustered databases.") Acked-by: Han Zhou <hzhou@ovn.org> Acked-by: Felix Huettner <felix.huettner@mail.schwarz> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-27 21:19:45 +01:00
Aaron Conole	bf7c0b0a90	ofproto-dpif-xlate: Fix continuations with associated metering. Open vSwitch supports the ability to invoke a controller action by way of a sample action with a specified meter. In the normal case, this sample action is transparently generated during xlate processing. However, when executing via a continuation, the logic to generate the sample action when finishing the context freeze was missing. The result is that the behavior when action is 'controller(pause,meter_id=1)' does not match the behavior when action is 'controller(meter_id=1)'. OVN and other controller solutions may rely on this metering to protect the control path, so it is critical to preserve metering, whether we are doing a plain old send to controller, or a continuation. Fixes: `77ab5fd2a9` ("Implement serializing the state of packet traversal in "continuations".") Reported-at: https://issues.redhat.com/browse/FDP-455 Tested-by: Alex Musil <amusil@redhat.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Acked-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-27 09:25:52 -04:00
Ilya Maximets	c6538b4439	dpif-netdev: Fix crash due to tunnel offloading on recirculation. Recirculation involves re-parsing the packet from scratch and that process is not aware of multiple header levels nor the inner/outer offsets. So, it overwrites offsets with new ones from the outermost headers and sets offloading flags that change their meaning when the packet is marked for tunnel offloading. For example: 1. TCP packet enters OVS. 2. TCP packet gets encapsulated into UDP tunnel. 3. Recirculation happens. 4. Packet is re-parsed after recirculation with miniflow_extract() or similar function. 5. Packet is marked for UDP checksumming because we parse the outermost set of headers. But since it is tunneled, it means inner UDP checksumming. And that makes no sense, because the inner packet is TCP. This is causing packet drops due to malformed packets or even assertions and crashes in the code that is trying to fixup checksums for packets using incorrect metadata: SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/packets.c:2061:15: runtime error: member access within null pointer of type 'struct udp_header' 0 0xbe5221 in packet_udp_complete_csum lib/packets.c:2061:15 1 0x7e5662 in dp_packet_ol_send_prepare lib/dp-packet.c:638:9 2 0x96ef89 in netdev_send lib/netdev.c:940:9 3 0x818e94 in dp_netdev_pmd_flush_output_on_port lib/dpif-netdev.c:5577:9 4 0x817606 in dp_netdev_pmd_flush_output_packets lib/dpif-netdev.c:5618:27 5 0x81cfa5 in dp_netdev_process_rxq_port lib/dpif-netdev.c:5677:9 6 0x7eefe4 in dpif_netdev_run lib/dpif-netdev.c:7001:25 7 0x610e87 in type_run ofproto/ofproto-dpif.c:367:9 8 0x5b9e80 in ofproto_type_run ofproto/ofproto.c:1879:31 9 0x55bbb4 in bridge_run__ vswitchd/bridge.c:3281:9 10 0x558b6b in bridge_run vswitchd/bridge.c:3346:5 11 0x591dc5 in main vswitchd/ovs-vswitchd.c:130:9 12 0x172b89 in __libc_start_call_main (/lib64/libc.so.6+0x27b89) 13 0x172c4a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x27c4a) 14 0x47eff4 in _start (vswitchd/ovs-vswitchd+0x47eff4) Tests added for both IPv4 and IPv6 cases. Though IPv6 test doesn't trigger the issue it's better to have a symmetric test. Fixes: `084c808729` ("userspace: Support VXLAN and GENEVE TSO.") Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2024-March/053014.html Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-22 20:45:37 +01:00
Ilya Maximets	e6a8a8e900	AUTHORS: Add Jun Wang. Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-22 20:44:43 +01:00
Jun Wang	6f93d8e62f	netdev-dpdk: Disable outer UDP checksum offload for ice/i40e driver. Fixing the issue of incorrect outer UDP checksum in packets sent by E810 or X710. We disable RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM,but also disable all the dependent offloads like RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO and RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO. Fixes: `084c808729` ("userspace: Support VXLAN and GENEVE TSO.") Reported-at: https://github.com/openvswitch/ovs-issues/issues/321 Signed-off-by: Jun Wang <junwang01@cestc.cn> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-22 20:36:50 +01:00
Mike Pattrick	3ddb31f604	ovs-monitor-ipsec: LibreSwan autodetect paths. In v4.0, LibreSwan changed a default paths that had been hardcoded in ovs-monitor-ipsec, breaking some uses of this script. This patch adds support for both old and newer versions by auto detecting the version of LibreSwan and then choosing the correct path. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1975039 Reported-by: Qijun Ding <qding@redhat.com> Fixes: `d6afbc00d5` ("ipsec: Allow custom file locations.") Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-22 20:35:45 +01:00
Ilya Maximets	840979663d	route-table: Avoid routes from non-standard routing tables. Currently, ovs-vswitchd is subscribed to all the routing changes in the kernel. On each change, it marks the internal routing table cache as invalid, then resets it and dumps all the routes from the kernel from scratch. The reason for that is kernel routing updates not being reliable in a sense that it's hard to tell which route is getting removed or modified. Userspace application has to track the order in which route entries are dumped from the kernel. Updates can get lost or even duplicated and the kernel doesn't provide a good mechanism to distinguish one route from another. To my knowledge, dumping all the routes from a kernel after each change is the only way to keep the cache consistent. Some more info can be found in the following never addressed issues: https://bugzilla.redhat.com/1337860 https://bugzilla.redhat.com/1337855 It seems to be believed that NetworkManager "mostly" does incremental updates right. But it is still not completely correct, will re-dump the whole table in certain cases, and it takes a huge amount of very complicated code to do the accounting and route comparisons. Going back to ovs-vswitchd, it currently dumps routes from all the routing tables. If it will get conflicting routes from multiple tables, the cache will not be useful. The routing cache in userspace is primarily used for checking the egress port for tunneled traffic and this way also detecting link state changes for a tunnel port. For userspace datapath it is used for actual routing of the packet after sending to a native tunnel. With kernel datapath we don't really have a mechanism to know which routing table will actually be used by the kernel after encapsulation, so our lookups on a cache may be incorrect because of this as well. So, unless all the relevant routes are in the standard tables, the lookup in userspace route cache is unreliable. Luckily, most setups are not using any complicated routing in non-standard tables that OVS has to be aware of. It is possible, but unlikely, that standard routing tables are completely empty while some other custom table is not, and all the OVS tunnel traffic is directed to that table. That would be the only scenario where dumping non-standard tables would make sense. But it seems like this kind of setup will likely need a way to tell OVS from which table the routes should be taken, or we'll need to dump routing rules and keep a separate cache for each table, so we can first match on rules and then lookup correct routes in a specific table. I'm not sure if trying to implement all that is justified. For now, stop considering routes from non-standard tables to avoid mixing different tables together and also wasting CPU resources. This fixes a high CPU usage in ovs-vswitchd in case a BGP daemon is running on a same host and in a same network namespace with OVS using its own custom routing table. Unfortunately, there seems to be no way to tell the kernel to send updates only for particular tables. So, we'll still receive and parse all of them. But they will not result in a full cache invalidation in most cases. Linux kernel v4.20 introduced filtering support for RTM_GETROUTE dumps. So, we can make use of it and dump only standard tables when we get a relevant route update. NETLINK_GET_STRICT_CHK has to be enabled on the socket for filtering to work. There is no reason to not enable it by default, if supported. It is not used outside of NETLINK_ROUTE. Fixes: `f0e167f0db` ("route-table: Handle route updates more robustly.") Fixes: `ea83a2fcd0` ("lib: Show tunnel egress interface in ovsdb") Reported-at: https://github.com/openvswitch/ovs-issues/issues/185 Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-October/052091.html Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-22 20:31:25 +01:00
Daniel Ding	3388c3451f	ovs-tcpdump: Fix cleanup mirror failed with twice fatal signals. After running ovs-tcpdump and inputs multiple CTRL+C, the program will raise the following exception. Error in atexit._run_exitfuncs: Traceback (most recent call last): File "/usr/bin/ovs-tcpdump", line 421, in cleanup_mirror ovsdb = OVSDB(db_sock) File "/usr/bin/ovs-tcpdump", line 168, in __init__ OVSDB.wait_for_db_change(self._idl_conn) # Initial Sync with DB File "/usr/bin/ovs-tcpdump", line 155, in wait_for_db_change while idl.change_seqno == seq and not idl.run(): The default handler of SIGINT is default_int_handler, so it was not registered to the signal handler. When received CTRL+C again, the program was broken, and calling hook could not be executed completely. Signed-off-by: Daniel Ding <danieldin186@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-22 20:27:07 +01:00
Tao Liu	9d0a40120f	ofproto-dpif: Fix tunnel with different name del/add failure. Reproduce: ovs-vsctl add-port br-int p0 \ -- set interface p0 type=vxlan options:remote_ip=10.10.10.1 sleep 2 ovs-vsctl --if-exists del-port p0 \ -- add-port br-int p1 \ -- set interface p1 type=vxlan options:remote_ip=10.10.10.1 ovs-vsctl: Error detected while setting up 'p1': could not add network device p1 to ofproto (File exists). vswitchd log: bridge\|INFO\|bridge br-int: added interface p0 on port 1106 bridge\|INFO\|bridge br-int: deleted interface p0 on port 1106 tunnel\|WARN\|p1: attempting to add tunnel port with same config as port 'p0' (::->10.10.10.1, key=0, legacy_l2, dp port=122) ofproto\|WARN\|br-int: could not add port p1 (File exists) bridge\|WARN\|could not add network device p1 to ofproto (File exists) CallTrace: bridge_reconfigure bridge_del_ports port_destroy iface_destroy__ netdev_remove <------ netdev p0 removed bridge_delete_or_reconfigure_ports OFPROTO_PORT_FOR_EACH ofproto_port_dump_next port_dump_next port_query_by_name <------ netdev_shash do not contain p0 ofproto_port_del <------ p0 do not del in ofproto bridge_add_ports bridge_add_ports__ iface_create iface_do_create ofproto_port_add <------ p1 add failed Fixes: `fe83f81df9` ("netdev: Remove netdev from global shash when the user is changing interface configuration.") Acked-by: Han Zhou <hzhou@ovn.org> Tested-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Tao Liu <taoliu828@163.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-19 20:25:02 +01:00
Ales Musil	5339ce386f	ofpbuf: Prevent undefined behavior in ofpbuf_clone. The new_buffer data pointer is NULL when the size of the cloned buffer is 0. This is fine as there is no need to allocate space. However, the cloned buffer header/msg might be the same pointer as data. This causes undefined behavior by adding 0 to NULL pointer. Check if the data buffer is not NULL before attempting to apply the header/msg offset. This was caught by OVN system test: lib/ofpbuf.c:203:56: runtime error: applying zero offset to null pointer 0 0xa012fc in ofpbuf_clone_with_headroom /ovs/lib/ofpbuf.c:203:56 1 0x635fd4 in put_remote_port_redirect_overlay /controller/physical.c:397:40 2 0x635fd4 in consider_port_binding /controller/physical.c:1951:9 3 0x62e046 in physical_run /controller/physical.c:2447:9 4 0x601d98 in en_pflow_output_run /controller/ovn-controller.c:4690:5 5 0x707769 in engine_recompute /lib/inc-proc-eng.c:415:5 6 0x7060eb in engine_compute /lib/inc-proc-eng.c:454:17 7 0x7060eb in engine_run_node /lib/inc-proc-eng.c:503:14 8 0x7060eb in engine_run /lib/inc-proc-eng.c:528:9 9 0x5f9f26 in main /controller/ovn-controller.c Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-19 20:19:14 +01:00
Eelco Chaudron	679b068ac7	AUTHORS: Add Kevin Sprague. Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-19 16:26:33 +01:00
Aaron Conole	b89a6b8145	rhel: Enable USDT scripts by default in Fedora builds. All supported versions of Fedora do package libbpf, so it makes sense to enable USDT support. Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-19 15:34:53 +01:00
Kevin Sprague	86b9e653ef	revalidator: Add a USDT probe during flow deletion with purge reason. During normal operations, it is useful to understand when a particular flow gets removed from the system. This can be useful when debugging performance issues tied to ofproto flow changes, trying to determine deployed traffic patterns, or while debugging dynamic systems where ports come and go. Prior to this change, there was a lack of visibility around flow expiration. The existing debugging infrastructure could tell us when a flow was added to the datapath, but not when it was removed or why. This change introduces a USDT probe at the point where the revalidator determines that the flow should be removed. Additionally, we track the reason for the flow eviction and provide that information as well. With this change, we can track the complete flow lifecycle for the netlink datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and the revalidator USDT, letting us watch as flows are added and removed from the kernel datapath. This change only enables this information via USDT probe, so it won't be possible to access this information any other way (see: Documentation/topics/usdt-probes.rst). Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py) which serves as a demonstration of how the new USDT probe might be used going forward. Co-authored-by: Aaron Conole <aconole@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Kevin Sprague <ksprague0711@gmail.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-19 15:34:53 +01:00
Ilya Maximets	0ce82ac45e	netdev-dpdk: Fix tunnel type check during Tx offload preparation. Tunnel types are not flags, but 4-bit fields, so checking them with a simple binary 'and' is incorrect and may produce false-positive matches. While the current implementation is unlikely to cause any issues today, since both RTE_MBUF_F_TX_TUNNEL_VXLAN and RTE_MBUF_F_TX_TUNNEL_GENEVE only have 1 bit set, it is risky to have this code and it may lead to problems if we add support for other tunnel types in the future. Use proper field checks instead. Also adding a warning for unexpected tunnel types in case something goes wrong. Fixes: `084c808729` ("userspace: Support VXLAN and GENEVE TSO.") Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-16 02:06:20 +01:00
Ilya Maximets	05e9f05d14	netdev-dpdk: Fix TCP check during Tx offload preparation. RTE_MBUF_F_TX_TCP_CKSUM is not a flag, but a 2-bit field, so checking it with a simple binary 'and' is incorrect. For example, this check will succeed for a packet with UDP checksum requested as well. Fix the check to avoid wrongly initializing tso_segz and potentially accessing UDP header via TCP structure pointer. The IPv4 checksum flag has to be set for any L4 checksum request, regardless of the type, so moving this check out of the TCP condition. Fixes: `8b5fe2dc60` ("userspace: Add Generic Segmentation Offloading.") Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-16 02:06:20 +01:00
Ilya Maximets	f8809760fc	netdev-dpdk: Clear inner packet marks if no inner offloads requested. In some cases only outer offloads may be requested for a tunneled packet. In this case there is no need to mark the type of an inner packet. Clean these flags up to avoid potential confusion of DPDK drivers. Fixes: `084c808729` ("userspace: Support VXLAN and GENEVE TSO.") Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-16 02:06:20 +01:00
Ilya Maximets	7df30c86ce	netdev-dpdk: Clean up all marker flags if no offloads requested. Some drivers (primarily, Intel ones) do not expect any marking flags being set if no offloads are requested. If these flags are present, driver will fail Tx preparation or behave abnormally. For example, ixgbe driver will refuse to process the packet with only RTE_MBUF_F_TX_TUNNEL_GENEVE and RTE_MBUF_F_TX_OUTER_IPV4 set. This pretty much breaks Geneve tunnels on these cards. An extra check is added to make sure we don't have any unexpected Tx offload flags set. Fixes: `084c808729` ("userspace: Support VXLAN and GENEVE TSO.") Reported-at: https://github.com/openvswitch/ovs-issues/issues/321 Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-13 16:29:25 +01:00
Ilya Maximets	fa0dfa18d5	github: Reduce ASLR entropy to be compatible with asan in llvm 14. Starting with image version 20240310.1.0, GitHub runners are using 32-bit entropy for ASLR: $ sudo sysctl -a \| grep vm.mmap.rnd vm.mmap_rnd_bits = 32 vm.mmap_rnd_compat_bits = 16 This breaks all the asan-enabled builds, because older asan gets confused by memory mappings and crashes with segmentation fault. The issue is fixed in newer releases of llvm: `fb77ca05ff` https://reviews.llvm.org/D148280 But these are not available in Ubuntu 22.04 image. This should be fixed by GitHub, but until new images are available reducing ASLR entropy manually to 28 bits to make builds work. Reported-at: https://github.com/actions/runner-images/issues/9491 Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-12 15:00:20 +01:00
Mike Pattrick	33f45ded67	ovsdb: Don't iterate over rows on empty mutation. Previously when an empty mutation was used to count the number of rows in a table, OVSDB would iterate over all rows twice. First to perform an RBAC check, and then to perform the no-operation. This change adds a short circuit to mutate operations with no conditions and an empty mutation set, returning immediately. One notable change in functionality is not performing the RBAC check in this condition, as no mutation actually takes place. Reported-by: Terry Wilson <twilson@redhat.com> Reported-at: https://issues.redhat.com/browse/FDP-359 Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-08 20:21:27 +01:00
Ilya Maximets	2c4ffd2f8a	netdev-dpdk: Dump packets that fail Tx preparation. It's hard to debug situations where driver rejects packets for some reason. Dumping out the mbuf should help with that. Sample output looks like this: \|netdev_dpdk(pmd-c03/id:8)\|DBG\|ovs-p1: First invalid packet: dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176 pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0 segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1 Dump data at [0x1180bce580], len=64 00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 \| 33.....'..M...`. 00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 \| ...$............ 00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 \| ................ 00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 \| ......:......... Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-08 20:19:54 +01:00
Timothy Redaelli	07c2ef5cd0	bfd: Improve state change log message. A log message like this one: 2024-01-09T06:45:17.201Z\|00071\|bfd(handler2)\|INFO\|ovn-0af536-0: BFD state change: down->up "Neighbor Signaled Session Down"->"Neighbor Signaled Session Down". can be hard to read since '->' usually represents a status change, but in this case the diagnostic code stays constant. Update the log message to avoid such ambiguity. The log message for the above event become: 2024-01-09T06:45:16.211Z\|00026\|bfd(handler3)\|INFO\|ovn-0af536-0: BFD state change: (bfd.SessionState: down, bfd.LocalDiag: "Neighbor Signaled Session Down") -> (bfd.SessionState: up, bfd.LocalDiag: "Neighbor Signaled Session Down") Reported-by: Alex Stupnikov <astupnik@redhat.com> Reported-at: https://bugzilla.redhat.com/2258496 Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-08 20:18:05 +01:00
Timothy Redaelli	d2a42f3963	tests: Fix "SSL db: Implementation" test with openssl > 3.2.0. In OpenSSL 3.2.0 (81b741f) all the "alert" error messages were updated to replace "sslv3" with "ssl/tls". This commit updates the "SSL db: implementation" test to support both the pre-openssl 3.2.0 error message: "sslv3 alert certificate unknown" and the post-openssl 3.2.0 error message: "ssl/tls alert certificate unknown". Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-08 20:15:44 +01:00
Xavier Simonart	6c082a8310	conntrack: Fix flush not flushing all elements. On netdev datapath, when a ct element was cleaned, the cmap could be shrinked, potentially causing some elements to be skipped in the flush iteration. Fixes: `967bb5c5cd` ("conntrack: Add rcu support.") Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Simon Horman <horms@ovn.org>	2024-03-06 17:50:27 +00:00
Simon Horman	e0aa15f897	utilities: Use localhost as sample hostname. Recently OVS adopted a policy of using the inclusive naming word list v1 [1, 2]. This patch addresses the use of the term master in the context of a hostname used in documentation of the kernel_delay utility. It does so by using localhost as the hostname instead. [1] `df5e5cf` ("Documentation: Add section on inclusive language.") [2] https://inclusivenaming.org/word-lists/ Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-06 10:12:28 +00:00
Simon Horman	b3ebc34a06	netdev-linux: Rename local variables as primary_*. Recently OVS adopted a policy of using the inclusive naming word list v1 [1, 2]. This patch partially addresses the use of the term master in the context of LAG devices by using the term primary instead: the local variables master_netdev and master_name are renamed as primary_netdev and primary_name. Related comments are also updated. No functional change intended. [1] `df5e5cf` ("Documentation: Add section on inclusive language.") [2] https://inclusivenaming.org/word-lists/ Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-06 10:12:12 +00:00
Simon Horman	f92b30a0ff	netdev-linux: Rename struct nedev_linux field as is_lag_primary. Recently OVS adopted a policy of using the inclusive naming word list v1 [1, 2]. This patch partially addresses the use of the term master in the context of LAG devices by using the term primary instead: the is_lag_master field of struct netdev_linux is renamed is_lag_primary. A related comment is also updated. No functional change intended. [1] `df5e5cf` ("Documentation: Add section on inclusive language.") [2] https://inclusivenaming.org/word-lists/ Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-06 10:10:58 +00:00
Simon Horman	0c255bf763	Documentation: Update to refer to main repository. Recently OVS adopted a policy of using the inclusive naming word list v1 [1, 2]. This patch addresses the use of the term master repository by using the term main repository instead. This is as distinct from addressing the use of a master branch, which remains as a follow-up task. [1] `df5e5cf` ("Documentation: Add section on inclusive language.") [2] https://inclusivenaming.org/word-lists/ Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-06 10:10:43 +00:00
Simon Horman	29e09c8091	vswitch.xml: Use member wording for bonds. Since the patch-set that included [1] there has been a policy of using the term member for bonds, LACP, and bundle contexts. This is consistent with the more recently adopted policy of using the inclusive naming word list v1 [2, 3]. This patch addresses two instances where the term member should be used in vswitch.xml. It does not address instances of alternative wording that require code updates, which can addressed as follow-up activity. [1] `91fc374a9c` ("Eliminate use of term "slave" in bond, LACP, and bundle contexts.") [2] `df5e5cf431` ("Documentation: Add section on inclusive language.") [3] https://inclusivenaming.org/word-lists/ Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-06 10:10:29 +00:00
Ilya Maximets	9d8208484a	appveyor: Build with OpenSSL 3.0. OpenSSL 1.0.2u is long deprecated and not available for download. So, our CI never actually downloads it and uses whatever is in the OpenSSL-Win64 folder provided by AppVeyor. Luckily, it happens to be OpenSSL 1.0.2u today. The oldest supported version of OpenSSL upstream today is 3.0. And it is an LTS version. 3.1 and 3.2 are not LTS. Use OpenSSL 3.0 for testing instead. This commit does a few things to achieve that: 1. Removes the folder provided by AppVeyor. This way we will fail the build if something goes wrong instead of silently using OpenSSL version provided by AppVeyor. 2. Obtains the JSON description of available releases and downloads the latest minor version of OpenSSL 3.0 64-bit. With this approach we should not need to update the download link that frequently. New minor releases will be picked up automatically. They should not have any breaking changes, so should be fine to use in CI. OpenSSL 3.0 is supported until at least Sep 2026. The JSON file is an official file referenced on the: https://slproweb.com/products/Win32OpenSSL.html So, it should be safe to use. 3. Executes the downloaded installer with 'Start-Process -Wait' to properly wait for installation to finish instead of just sleeping for 30 seconds. 4. Caches the downloaded installer, so we're not downloading 300 MB on each CI run as that is not nice to do. We know the hash of the latest version, so we will re-download only when the binary changes, i.e. on a new minor release. For the cache to work we need to introduce the 'install' phase, because caches are populated after 'init', but before 'install'. Alternatively, we could have just renamed 'init' to 'install', but I think it's a little nicer to have separate phases, and we can also move 'windows-prepare.sh' to the install phase. Cache is also invalidated whenever appveyor.yml changes. Acked-by: Simon Horman <horms@ovn.org> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-04 22:16:07 +01:00
Ilya Maximets	68e9312214	m4: Fix linking with OpenSSL 1.1.0+ and 3+ on Windows. OpenSSL 1.1.0 changed the library names from libeay32 and ssleay32 to standard libssl and libcrypto. All the versions of OpenSSL that used old names reached their official EoL, so it should be safe to just migrate to new names. They can still be supported via premium support option, but I don't think that is important for us. Also, OpenSSL installers for older versions had the following folder structure: C:\OPENSSL-WIN64\ +---bin +---include \| +---openssl +---lib \| libeay32.lib \| ssleay32.lib +---VC libeay32MD.lib libeay32MDd.lib libeay32MT.lib libeay32MTd.lib ssleay32MD.lib ssleay32MDd.lib ssleay32MT.lib ssleay32MTd.lib With newer OpenSSL 3+ the structure is different: C:\OPENSSL-WIN64 +---bin +---include \| +---openssl +---lib +---VC +---x64 +---MD \| libcrypto.lib \| libssl.lib +---MDd \| libcrypto.lib \| libssl.lib +---MT \| libcrypto.lib \| libssl.lib +---MTd libcrypto.lib libssl.lib Basically, instead of one generic library in the lib folder and a bunch of differently named versions of it for different type of linkage, we now have multiple instances of the library located in different folders based on the linkage type. So, we have to provide an exact path in order to find the library. 'lib/VC/x64/MT' was chosen in this patch since it is a way used for building in build-aux/ccl. MD stands for dynamic linking, MT is static, 'd' stands for debug versions of the libraries. While at it, fixing documentation examples to point to Win64 default installation folder. Acked-by: Simon Horman <horms@ovn.org> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-04 22:15:58 +01:00
Ilya Maximets	f5fa9a0a3c	ovs-pki: Fix file permissions on Windows. There is no chmod or 'mkdir -m' support on Windows, so setting file permissions for keys and certificates doesn't actually work. Implementing them using icacls utility instead. ovs-pki script currently only uses 0700 and 0750 modes, so only those (and 0600) are implemented. NTFS ACLs on Windows are fairly different and more complex in comparison with Unix file permissions, so it's hard to implement these functions in a generic way. The script will fail if it will encounter an unknown mode. 0700 is implemented as a F (full access) for 'Creator Owner' with no other permissions. 0750 has an additional RX (read+execute) for the 'Creator Group'. 0600 is implemented the same as 0700, since it doesn't matter for this use case to have or not to have an executable or traversal permissions managed separately from everything else and it would be a little overly verbose to give all the permissions except for X. Inheritance rules are set to (OI)(CI), so the folder itself, subfolders and files in a folder inherit those ACEs. 'umask' also doesn't work on Windows. Instead, moving the private key output files to a temporary folder that has restricted access already configured. The file will inherit these restricted ACEs. It should not be necessary to set explicit permissions for these files since moving them within the same volume should preserve ACEs. However, it might be safer to chmod them directly as well, just in case. Windows administrators will still have to be careful with private keys, because file copies do not preserve permissions and moves to different volumes do not preserve them as well. 'robocopy' with flags to copy security should be used in these cases. We may want to re-implement 'mv' with 'robocopy' if that becomes a problem in the future. There is one more place where umask is used in the script for creation of a self-signed certificate, but it is not actually needed there since the resulted certificate doen't need to be private, so not changing this part for now. Tested with running an empty 'make check' in AppVeyor and examining permissions for files in tests/pki: Files \| Linux \| Windows ---------------------+------------+-------------------------------------- controllerca \| drwxr-xr-x \| NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F) switchca \| \| BUILTIN\Administrators:(I)(OI)(CI)(F) ca\certs \| \| BUILTIN\Users:(I)(OI)(CI)(RX) ca\crl \| \| BUILTIN\Users:(I)(CI)(AD) ca\newcerts \| \| BUILTIN\Users:(I)(CI)(WD) \| \| APPVEYOR-VM\appveyor:(I)(F) \| \| CREATOR OWNER:(I)(OI)(CI)(IO)(F) ---------------------+------------+-------------------------------------- stamp \| -rw-r--r-- \| NT AUTHORITY\SYSTEM:(I)(F) test-cert.pem \| \| BUILTIN\Administrators:(I)(F) test-req.pem \| \| BUILTIN\Users:(I)(RX) test2-cert.pem \| \| APPVEYOR-VM\appveyor:(I)(F) test2-req.pem \| \| ca\ca.cnf \| \| ca\cacert.pem \| \| ca\careq.pem \| \| ca\crlnumber \| \| ca\index.txt* \| \| ca\serial \| \| ca\newcerts\.pem \| \| ---------------------+------------+-------------------------------------- controllerca\private \| drwx------ \| APPVEYOR-VM\appveyor:(F) switchca\private \| \| CREATOR OWNER:(OI)(CI)(IO)(F) ---------------------+------------+-------------------------------------- test-privkey.pem \| -rw------- \| APPVEYOR-VM\appveyor:(F) test2-privkey.pem \| \| *ca\private\cakey.pem\| \| We can see that private folders and keys have only a full access from their owners. Other files and folders have some extra inherited ACEs from a containing folder. Acked-by: Simon Horman <horms@ovn.org> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-04 22:15:48 +01:00
Ilya Maximets	d439c20179	appveyor: Print out config.log on configuration failure. We need to know exact linking / compilation errors in order to fix issues. We could have uploaded it as an artifact, but it seems easier to just print it out for now. Acked-by: Simon Horman <horms@ovn.org> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-04 22:15:31 +01:00
Adrian Moreno	436aba68d5	bond: Reset stats when deleting post recirc rule. In order to properly balance bond traffic, ofproto/bond periodically reads usage statistics of the post-recirculation rules (which are added to a hidden internal table). To do that, each "struct bond_entry" (which represents a hash within a bond) stores the last seen statistics for its rule. When a hash is moved to another member (due to a bond rebalance or the previous member going down), the rule is typically just modified, i.e: same match different actions. In this case, statistics are preserved and accounting continues to work. However, if the rule gets completely deleted (e.g: when all bond members go down) and then re-created, the new rule will have 0 tx_bytes but its associated entry will still store a non-zero last-seen value. This situation leads to an overflow of the delta calculation (computed as [current_stats_value - last_seen_value]), which can affect traffic as the hash will be considered to carry a lot of traffic and rebalancing will kick in. In order to fix this situation, reset the value of last seen statistics on rule deletion. Implementation notes: Modifying pr_tx_bytes requires write-locking the global rwlock but a lockless version of update_recirc_rules was being maintained to avoid locking on bon_unref(). Considering the small impact of locking during bond removal, removing the lockless version and relying on clang's thread safety analysis is preferred. Also, folding Ilya's [1], i.e: fixing thread safety annotation in update_recirc_rules() to require holding write-lock. [1] https://patchwork.ozlabs.org/project/openvswitch/patch/20240209161718.1149494-1-i.maximets@ovn.org/ Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 Co-authored-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-01 22:42:00 +01:00
Ilya Maximets	6fc215de30	ofproto-dpif-trace: Fix infinite recirculation tracing. Trace attempts to process all the recirculations. However, if there is a recirculation loop, i.e. if every recirculation generates another recirculation, this process will never stop. It will grind until the trace fills the system memory. A simple reproducer: make sandbox ovs-vsctl add-br br0 ovs-vsctl add-port br0 p1 ovs-ofctl add-flow br0 "table=0,in_port=p1,ip,actions=ct(table=0)" ovs-appctl ofproto/trace br0 in_port=p1,ip Limit the number of recirculations trace is processing with a fairly arbitrary number - 4096 (loosely based on the resubmit limit, but they are not actually related). Not adding a test for this since it's only for a trace, but also because the test may lead to OOM event in a system if the test fails, which is not nice. Fixes: `e6bc8e7493` ("ofproto/trace: Add support for tracing conntrack recirculation") Reported-by: Jaime Caamaño Ruiz <jcaamano@redhat.com> Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-01 22:35:41 +01:00
Ilya Maximets	99c86c6c46	github: Temporarily disable SNAT with exhaustion system test. With a new runner update, GitHub Actions had a kernel update. And it seems like something changed between kernels 6.2 and 6.5 so this test now fails very frequently. I can reproduce the same issue on RHEL 9, and I can't reproduce it on Ubuntu 23.04 (kernel 6.2). The test is creating a NAT with a single address+port pair in an attempt to simulate an address space exhaustion. It is expected that a first connection with wget leaves a conntrack entry in a TIME_WAIT state and the second wget should fail as long as this entry remains, because the only available address+port pair is already taken. However, very frequently (not always!) the second connection replaces the first conntrack entry with a new one and connection succeeds. There is still only one connection in the conntrack at any single moment in time, so there is seemingly no issue with the NAT, but the behavior is unexpected and the test fails. The issue is likely introduced by a new kernel feature that allows to evict connections that are in the process of closing: https://lore.kernel.org/netdev/20230626064749.75525-7-pablo@netfilter.org/ Disable the test in CI until we figure out how to fix it. Acked-by: Simon Horman <horms@ovn.org> Acked-by: Paolo Valerio <pvalerio@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-03-01 18:24:16 +01:00
Simon Horman	cc0e795181	Documentation: Update links to upstream Kernel documentation. This updates links to several upstream Kernel documents. 1. Lore is now the canonical archive for the netdev mailing list 2. net-next is now maintained by the netdev team, of which David Miller is currently a member, rather than only by David. Also, use HTTPS rather than HTTP. 3. The Netdev FAQ has evolved into the Netdev Maintainer Handbook. 4. The Kernel security document link was dead, provide the current canonical location for this document instead. 1., 2. & 3. Found by inspection 4. Flagged by check-docs Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-01 11:22:27 +00:00
Simon Horman	2d7a2bddb1	Documentation: Update Pacemaker link. Update link to OCF Resource Agents documentation as the existing link is broken. Also, use HTTPS. Broken link flagged by make check-docs Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-01 11:22:27 +00:00
Simon Horman	3c52cd15ac	Documentation: Anuket project updates. The Anuket was formed by a merger of OPNFV and CNTT [1]. Also, VswitchPerf, aka vsperf, formerly an OPNFV project, has been renamed ViNePerf [2]. Update links and documentation accordingly. The old links were broken, this was flagged by make check-docs [1] https://anuket.io/news/2021/01/27/lf-networking-launches-anuket-an-open-source-project-to-accelerate-infrastructure-compliance-interoperability-and-5g-deployments/ [2] https://docs.opnfv.org/projects/vineperf/en/latest/release/release-notes/release-notes.html Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-01 11:22:27 +00:00
Simon Horman	786a89aba7	Documentation: Correct spelling errors. Correct spelling errors in .rst files flagged by codespell. Also correct some minor grammar errors in nearby documentation. Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-01 11:22:27 +00:00
Simon Horman	5639aa7b6d	Documentation: Extend copyright to 2024. IANAL, but I think we can extend the copyright attached to documentation to cover the current year: we are still actively working on the documentation. Signed-off-by: Simon Horman <horms@ovn.org> Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>	2024-03-01 11:22:22 +00:00
Mike Pattrick	4c32b6d096	dp-packet: Don't offload inner csum if outer isn't supported. Some network cards support inner checksum offloading but not outer checksum offloading. Currently OVS will resolve that outer checksum but allows the network card to resolve the inner checksum, invalidating the outer checksum in the process. Now if we can't offload outer checksums, we don't offload inner either. Reported-at: https://issues.redhat.com/browse/FDP-363 Fixes: `084c808729` ("userspace: Support VXLAN and GENEVE TSO.") Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-29 23:11:04 +01:00
Eelco Chaudron	f0d1beca6c	dpif-netdev: Do not create handler threads. Avoid unnecessary thread creation as no upcalls are generated, resulting in idle threads waiting for process termination. This optimization significantly reduces memory usage, cutting it by half on a 128 CPU/thread system during testing, with the number of threads reduced from 95 to 0. Acked-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2024-02-29 12:24:59 +01:00
Ilya Maximets	166ee41d28	ofproto-dpif-xlate: Fix ignoring IPv6 local_ip for native tunnels. Local IP is taken into account only in case of IPv4 address, IPv6 source is not checked. That leads to source being ignored during the route lookup and ultimately packets encapsulated with a source IP found during a route lookup, which is likely the wrong one. Even worse, after encapsulation we have a difference between the tunnel metadata that contains a correct source IP and the generated actions that used a wrong source IP. This means that if there are OpenFlow rules in a bridge where packet goes after encapsulation, we may match on rules that do not correspond to the actual packet we have. Add the check for IPv6 source address before the route lookup. Tests added to check that we're actually using the configured local_ip as a source address in the packet. Also adding the same test for IPv4, since apparently we don't have any tests covering this functionality for userspace tunnels. This issue also affects the case where source address is set via OpenFlow, e.g. 'set_filed:2001:beef::88->tun_ipv6_src', but it's just a different way of populating the tunnel metadata that doesn't depend on a tunnel to be native or kernel one. So, not adding extra tests for this case for now. Fixes: `8e4e45887e` ("ofproto-dpif-xlate: makes OVS native tunneling honor tunnel-specified source addresses") Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2024-February/052938.html Reported-by: Derrick Lim <derrick.lim@rakuten.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-21 20:46:45 +01:00
Ihar Hrachyshka	7992a26ef4	netdev-dummy: Add local route entries for IP addresses. To mimic what kernel routing subsystem does [1], add a local route entry for every dummy IP address. This helps with OVN testing multiple chassis on a single host and allows to run better unit tests for userspace tunnels without adding route entries manually. This is also the only way to add 'local' route entries that are required for testing 'local_ip' functionality with native tunnels in userspace datapath because route lookup will reject non-local source IPs. There seems to be no way to explicitly remove an IP address from netdev-dummy, hence no code path to handle route entry cleanup. The port itself can be removed, but our tests do not normally do that. Removal can be implemented later if necessary. [1]: http://linux-ip.net/html/routing-tables.html#routing-table-local "If the machine has several IP addresses on one Ethernet interface, there will be a route to each locally hosted IP in the local routing table. This is a normal side effect of bringing up an IP address on an interface under linux." Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Co-authored-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-21 20:46:32 +01:00
Ilya Maximets	f9e42c6683	tests: Move the non-local port as tunnel endpoint test. It's not a system test as it runs with dummy datapath and ports and it has nothing to do with layer 3 tunnels. It should be with other userspace tunnel tests. While moving also making it a little nicer visually and less error prone by requesting port numbers for all the ports. Acked-by: Mike Pattrick <mkp@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>	2024-02-21 20:46:27 +01:00

1 2 3 4 5 ...

20088 Commits