2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-28 12:58:00 +00:00

18 Commits

Author SHA1 Message Date
Ilya Maximets
840979663d route-table: Avoid routes from non-standard routing tables.
Currently, ovs-vswitchd is subscribed to all the routing changes in the
kernel.  On each change, it marks the internal routing table cache as
invalid, then resets it and dumps all the routes from the kernel from
scratch.  The reason for that is kernel routing updates not being
reliable in a sense that it's hard to tell which route is getting
removed or modified.  Userspace application has to track the order in
which route entries are dumped from the kernel.  Updates can get lost
or even duplicated and the kernel doesn't provide a good mechanism to
distinguish one route from another.  To my knowledge, dumping all the
routes from a kernel after each change is the only way to keep the
cache consistent.  Some more info can be found in the following never
addressed issues:
  https://bugzilla.redhat.com/1337860
  https://bugzilla.redhat.com/1337855

It seems to be believed that NetworkManager "mostly" does incremental
updates right.  But it is still not completely correct, will re-dump
the whole table in certain cases, and it takes a huge amount of very
complicated code to do the accounting and route comparisons.

Going back to ovs-vswitchd, it currently dumps routes from all the
routing tables.  If it will get conflicting routes from multiple
tables, the cache will not be useful.  The routing cache in userspace
is primarily used for checking the egress port for tunneled traffic
and this way also detecting link state changes for a tunnel port.
For userspace datapath it is used for actual routing of the packet
after sending to a native tunnel.
With kernel datapath we don't really have a mechanism to know which
routing table will actually be used by the kernel after encapsulation,
so our lookups on a cache may be incorrect because of this as well.

So, unless all the relevant routes are in the standard tables, the
lookup in userspace route cache is unreliable.

Luckily, most setups are not using any complicated routing in
non-standard tables that OVS has to be aware of.

It is possible, but unlikely, that standard routing tables are
completely empty while some other custom table is not, and all the OVS
tunnel traffic is directed to that table.  That would be the only
scenario where dumping non-standard tables would make sense.  But it
seems like this kind of setup will likely need a way to tell OVS from
which table the routes should be taken, or we'll need to dump routing
rules and keep a separate cache for each table, so we can first match
on rules and then lookup correct routes in a specific table.  I'm not
sure if trying to implement all that is justified.

For now, stop considering routes from non-standard tables to avoid
mixing different tables together and also wasting CPU resources.

This fixes a high CPU usage in ovs-vswitchd in case a BGP daemon is
running on a same host and in a same network namespace with OVS using
its own custom routing table.

Unfortunately, there seems to be no way to tell the kernel to send
updates only for particular tables.  So, we'll still receive and parse
all of them.  But they will not result in a full cache invalidation in
most cases.

Linux kernel v4.20 introduced filtering support for RTM_GETROUTE dumps.
So, we can make use of it and dump only standard tables when we get a
relevant route update.  NETLINK_GET_STRICT_CHK has to be enabled on
the socket for filtering to work.  There is no reason to not enable it
by default, if supported.  It is not used outside of NETLINK_ROUTE.

Fixes: f0e167f0dbad ("route-table: Handle route updates more robustly.")
Fixes: ea83a2fcd0d3 ("lib: Show tunnel egress interface in ovsdb")
Reported-at: https://github.com/openvswitch/ovs-issues/issues/185
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-October/052091.html
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-03-22 20:31:25 +01:00
Paolo Valerio
4a6a473462 netlink-socket: Log extack error messages in netlink transactions.
During a netlink transaction, in case of replies of type NLMSG_ERROR,
the current behavior includes the translation of the error number
received into a string that describes the error code.

Netlink replies may carry a more descriptive error message, and
although it is possible to read those messages using the existing perf
tracepoint, it is more convenient to retrieve them directly from ovs.

This patch extends nl_msg_nlmsgerr() so that it retrieves the message
that later, if present, will be used by nl_sock_transact_multiple__()
in place of the generic descriptive form of the error number.  This is
particularly useful with tc that makes use of such kind of mechanism.

As an example, with this patch applied, the following generic message:

ovs|00239|netlink_socket|DBG|received NAK error=0 (Operation not supported)

becomes:

ovs|00239|netlink_socket|DBG|received NAK error=0 - Conntrack isn't enabled

The layout has been slightly modified to avoid nested parentheses.

Suggested-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-16 22:16:16 +01:00
Yi-Hung Wei
1f16131837 ct-dpif, dpif-netlink: Add conntrack timeout policy support
This patch first defines the dpif interface for a datapath to support
adding, deleting, getting and dumping conntrack timeout policy.
The timeout policy is identified by a 4 bytes unsigned integer in
datapath, and it currently support timeout for TCP, UDP, and ICMP
protocols.

Moreover, this patch provides the implementation for Linux kernel
datapath in dpif-netlink.

In Linux kernel, the timeout policy is maintained per L3/L4 protocol,
and it is identified by 32 bytes null terminated string.  On the other
hand, in vswitchd, the timeout policy is a generic one that consists of
all the supported L4 protocols.  Therefore, one of the main task in
dpif-netlink is to break down the generic timeout policy into 6
sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp),
and push down the configuration using the netlink API in
netlink-conntrack.c.

This patch also adds missing symbols in the windows datapath so
that the build on windows can pass.

Appveyor CI:
* https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
2019-09-26 13:50:17 -07:00
Flavio Leitner
cf114a7fce netlink linux: enable listening to all nsids
Internal ports may be moved to another network namespace
and when that happens, the vswitch stops receiving netlink
notifications.

This patch enables the vswitch to listen to all network
namespaces that have a nsid assigned into the network
namespace where the socket has been opened.

It requires kernel 4.2 or newer.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-03-31 12:48:36 -07:00
Sairam Venugopal
218e42da3d Windows: Use NETLINK_NETFILTER instead of NETLINK_GENERIC
Windows datapath lacked support for different Netlink Family protocols.
Now that Windows supports different Netlink protocol, revert the change to
override NETLINK_NETFILTER to use NETLINK_GENERIC.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
2016-07-13 15:11:54 -07:00
Ankur Sharma
2b829ece6f datapath-windows: remove reference to OvsNetlink.h
In this patch we remove reference to OvsNetlink.h.
Since we do not refer to lib/netlink-protocol.h anymore,
hence removed the WIN_DP based check as well.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/18
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-08-22 10:30:36 -07:00
Ankur Sharma
99f368f446 lib/netlink-protocol.h: Do not include headers for windows kernel build.
Added saurabh's fix to not to include some header files
in lib/netlink-protocol.h not needed by windows driver.

This is a temporary fix to make sure that windows driver can compile.

We are working in parallel on adding netlink support in windows driver
(https://github.com/openvswitch/ovs-issues/issues/18). With netlink support
changes we will get rid of the dependency on lib/netlink-protocol.h.

Testing:
1. Verified make distcheck on linux.
2. Verified that windows driver can compile now.
3. Verified that ping works over vxlan tunnel.

Signed-off-by: Ankur Sharma <ankursharma@vmware.com>
Co-authored-by: Saurabh Shah <ssaurabh@vmware.com>
Tested-by: Ankur Sharma <ankursharma@vmware.com>
Reported-by: Alin Serdean <aserdean@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/21
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-08-12 12:30:56 -07:00
Alin Serdean
1675f19ea7 netlink-protocol: Add more definitions.
Add MAX_LINKS define needed for nl_pool.

Add NLM_F_CREATE and NLM_F_EXCL defines also.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-07-29 09:32:20 -07:00
Ben Pfaff
aa359b5f8b netlink-protocol: Remove definition of struct sockaddr_nl.
This struct is used only in netlink-socket.c, which is only used on Linux,
which in turn gets the definition from <linux/netlink.h>.  On Windows the
definition actually causes a small amount of trouble because Windows does
not define sa_family_t (despite POSIX), so it's better to remove it.

Even if other platforms adopt Netlink, I have no reason to believe that
they will use the same sockaddr format as Linux.

CC: Saurabh Shah <ssaurabh@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Gurucharan Shetty <gshetty@nicira.com>
2014-07-10 13:21:57 -07:00
Raju Subramanian
e0edde6fee Global replace of Nicira Networks.
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.

Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-05-02 17:08:02 -07:00
Ben Pfaff
05fe17646f treewide: Convert tabs to spaces in C source files written in OVS style.
The Open vSwitch C style doesn't use hard tabs.

This commit doesn't touch files written in kernel style or that are
imported from other projects where we want to minimize changes from
upstream (the sflow files).

Reported-by: Mehak Mahajan <mmahajan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2012-03-23 12:20:07 -07:00
Ben Pfaff
0a397a2fca netlink-protocol: Move CTRL_ATTR_MCAST definitions for consistency.
One of the current goals of netlink-protocol.h, for better or for worse, is
to ensure that the same definitions are available whether a Linux kernel is
in use or not.  One of the ways it accomplishes this is by putting the
conditional definitions that test for features missing in old kernels at
the very end, after the dummy definitions used on non-Linux platforms.
However, commit b0025c8389f "netlink-protocol: Define missing symbols"
added new conditional definitions only in the Linux platform case, which
means that those definitions won't be available on non-Linux platforms.
This commit moves them to the end, instead.

The symbols that are moved are only used from netlink-socket.c, which is
only built on Linux platforms, so this does not change an actual bug.  It
only makes the location of the definitions consistent with prior practice.
2011-09-06 09:33:26 -07:00
Ethan Jackson
b0025c8389 netlink-protocol: Define missing symbols.
OVS fails to build with xenddk-56100build3926 because it has an
outdated genetlink header.
2011-09-02 08:03:57 -07:00
Ben Pfaff
cceb11f5b1 netlink-socket: Add functions for joining and leaving multicast groups.
When this library was originally implemented, support for Linux 2.4 was
important.  The Netlink implementation in Linux only added support for
joining and leaving multicast groups after a socket is bound as of Linux
2.6.14, so the library did not support it either.  But the current version
of Open vSwitch targets Linux 2.6.18 and over, so it's fine to add this
support now, and this commit does so.

This will be used more extensively in upcoming commits.

Reviewed by Justin Pettit.
2011-01-27 09:26:05 -08:00
Ben Pfaff
7c62447839 netlink: New function nl_attr_type().
Linux since v2.6.24 has a couple of couple of bits at the top of
nla_type that one is apparently supposed to ignore.  This commit
starts doing that in Open vSwitch userspace.

Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-10 11:13:31 -08:00
Ben Pfaff
365a25176b netlink: Make netlink-protocol.h compatible with <linux/netlink.h>.
Until now, netlink-protocol.h and <linux/netlink.h> could not both be
included by a single source file, because they contained conflicting
definitions.  This commit fixes the problem, by having netlink-protocol.h
delegate to <linux/netlink.h> where it is available.

Here's an example of the problem: odp-util.c includes both
datapath-protocol.h and will need netlink-protocol.h also so that it can
look through actions defined as struct nlattr.  datapath-protocol.h
includes <linux/if_link.h> for the definition of rtnl_link_stats64, and
<linux/if_link.h> includes <linux/netlink.h>.

Acked-by: Jesse Gross <jesse@nicira.com>
2010-12-10 11:12:53 -08:00
Ben Pfaff
a14bc59fb8 Update primary code license to Apache 2.0. 2009-06-15 15:11:30 -07:00
Ben Pfaff
064af42167 Import from old repository commit 61ef2b42a9c4ba8e1600f15bb0236765edc2ad45. 2009-07-08 13:19:16 -07:00