mirror of https://github.com/openvswitch/ovs synced 2025-08-22 09:58:01 +00:00

107 Commits

Author SHA1 Message Date
Mike Pattrick
cfc8321da1 netlink-socket: Initialize socket family.
The Clang analyzer will alert on the use of uninitialized variable local
despite the fact that this should be set by a syscall.

To suppress the warning, this variable is now initialized.

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
2024-09-11 15:38:38 +02:00
Ilya Maximets
840979663d route-table: Avoid routes from non-standard routing tables.
Currently, ovs-vswitchd is subscribed to all the routing changes in the
kernel.  On each change, it marks the internal routing table cache as
invalid, then resets it and dumps all the routes from the kernel from
scratch.  The reason for that is kernel routing updates not being
reliable in a sense that it's hard to tell which route is getting
removed or modified.  Userspace application has to track the order in
which route entries are dumped from the kernel.  Updates can get lost
or even duplicated and the kernel doesn't provide a good mechanism to
distinguish one route from another.  To my knowledge, dumping all the
routes from a kernel after each change is the only way to keep the
cache consistent.  Some more info can be found in the following never
addressed issues:
  https://bugzilla.redhat.com/1337860
  https://bugzilla.redhat.com/1337855

It seems to be believed that NetworkManager "mostly" does incremental
updates right.  But it is still not completely correct, will re-dump
the whole table in certain cases, and it takes a huge amount of very
complicated code to do the accounting and route comparisons.

Going back to ovs-vswitchd, it currently dumps routes from all the
routing tables.  If it gets conflicting routes from multiple
tables, the cache will not be useful.  The routing cache in userspace
is primarily used for checking the egress port for tunneled traffic
and this way also detecting link state changes for a tunnel port.
For userspace datapath it is used for actual routing of the packet
after sending to a native tunnel.
With kernel datapath we don't really have a mechanism to know which
routing table will actually be used by the kernel after encapsulation,
so our lookups on a cache may be incorrect because of this as well.

So, unless all the relevant routes are in the standard tables, the
lookup in userspace route cache is unreliable.

Luckily, most setups are not using any complicated routing in
non-standard tables that OVS has to be aware of.

It is possible, but unlikely, that standard routing tables are
completely empty while some other custom table is not, and all the OVS
tunnel traffic is directed to that table.  That would be the only
scenario where dumping non-standard tables would make sense.  But it
seems like this kind of setup will likely need a way to tell OVS from
which table the routes should be taken, or we'll need to dump routing
rules and keep a separate cache for each table, so we can first match
on rules and then lookup correct routes in a specific table.  I'm not
sure if trying to implement all that is justified.

For now, stop considering routes from non-standard tables to avoid
mixing different tables together and also wasting CPU resources.
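
A minimal sketch of the filter described above, assuming "standard" means
the kernel's default, main and local tables (IDs 253, 254 and 255 in
<linux/rtnetlink.h>).  The helper name and the SKETCH_ constants are
hypothetical and are not taken from the OVS sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Table IDs as defined by enum rt_class_t in <linux/rtnetlink.h>, repeated
 * here so that the sketch builds without kernel headers. */
enum {
    SKETCH_RT_TABLE_DEFAULT = 253,
    SKETCH_RT_TABLE_MAIN = 254,
    SKETCH_RT_TABLE_LOCAL = 255,
};

/* Hypothetical helper: returns true if a route from 'table_id' should be
 * kept in the userspace route cache. */
static bool
route_table_is_standard(uint32_t table_id)
{
    return table_id == SKETCH_RT_TABLE_DEFAULT
           || table_id == SKETCH_RT_TABLE_MAIN
           || table_id == SKETCH_RT_TABLE_LOCAL;
}
```

An update for any other table would then be parsed and ignored instead of
invalidating the whole cache.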

This fixes high CPU usage in ovs-vswitchd in case a BGP daemon is
running on the same host and in the same network namespace with OVS using
its own custom routing table.

Unfortunately, there seems to be no way to tell the kernel to send
updates only for particular tables.  So, we'll still receive and parse
all of them.  But they will not result in a full cache invalidation in
most cases.

Linux kernel v4.20 introduced filtering support for RTM_GETROUTE dumps.
So, we can make use of it and dump only standard tables when we get a
relevant route update.  NETLINK_GET_STRICT_CHK has to be enabled on
the socket for filtering to work.  There is no reason to not enable it
by default, if supported.  It is not used outside of NETLINK_ROUTE.
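
A minimal sketch of enabling strict checking on a socket, assuming a
plain setsockopt() call is all that is needed.  The helper name is
hypothetical, and the fallback defines only repeat the values from the
kernel headers for builds against older header sets.

```c
#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for older headers; the values match <linux/socket.h> and
 * <linux/netlink.h>. */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_GET_STRICT_CHK
#define NETLINK_GET_STRICT_CHK 12
#endif

/* Hypothetical helper: asks the kernel to strictly validate dump requests
 * sent on 'fd'.  Returns 0 on success or a positive errno value. */
static int
sketch_enable_strict_check(int fd)
{
    int one = 1;

    if (setsockopt(fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK,
                   &one, sizeof one) < 0) {
        return errno;
    }
    return 0;
}
```

A caller would probably treat a failure here as "filtering not supported"
and fall back to dumping every table.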

Fixes: f0e167f0dbad ("route-table: Handle route updates more robustly.")
Fixes: ea83a2fcd0d3 ("lib: Show tunnel egress interface in ovsdb")
Reported-at: https://github.com/openvswitch/ovs-issues/issues/185
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-October/052091.html
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-03-22 20:31:25 +01:00
ldejing
b26015c33f datapath-windows: support meter action initial version
This patch implements the meter action.  Currently, a meter supports only
the drop method and only one band.  The overall implementation is as
follows: when a packet comes in, look up the meter by its meter id, then
get band->rates and the time elapsed since the last access to the same
meter from the meter struct.  Add the product (band->rates * delta_time)
to the bucket, then subtract the packet size from the bucket.  If the
result is larger than zero, let the packet through; otherwise, deny
it.
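
The steps above amount to a token bucket.  A minimal sketch, assuming one
drop band; the struct, its field names and the 'burst' cap are
hypothetical and not taken from the Windows datapath.  The sketch also
lets a packet through when the bucket exactly covers it, which differs
slightly from the "larger than zero" wording above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-band meter state. */
struct sketch_meter {
    uint64_t rate;      /* Tokens added per millisecond. */
    uint64_t burst;     /* Assumed upper bound on the bucket. */
    uint64_t bucket;    /* Tokens currently available. */
    uint64_t last_ms;   /* Time of the previous lookup, in ms. */
};

/* Returns true if a packet costing 'size' tokens may pass at 'now_ms'. */
static bool
sketch_meter_allow(struct sketch_meter *m, uint64_t size, uint64_t now_ms)
{
    uint64_t delta = now_ms - m->last_ms;

    m->last_ms = now_ms;
    m->bucket += m->rate * delta;       /* Refill: rate * delta_time. */
    if (m->bucket > m->burst) {
        m->bucket = m->burst;
    }
    if (m->bucket < size) {
        return false;                   /* Drop band: not enough tokens. */
    }
    m->bucket -= size;
    return true;
}
```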

Test case:
    1. Setting the meter size to 3M limited the bandwidth to
       around 3M:
        ovs-ofctl -O OpenFlow13 add-meter br-test meter=2,kbps,\
                     band=type=drop,rate=3000
        ovs-ofctl add-flow br-test "table=0,priority=1,ip \
                     actions=meter:2,normal" -O OpenFlow13
    2. Setting the meter size to 8M limited the bandwidth to
       around 8M:
       ovs-ofctl -O OpenFlow13 add-meter br-test meter=2,\
                      kbps,band=type=drop,rate=8000
       ovs-ofctl add-flow br-test "table=0,priority=1,ip\
                      actions=meter:2,normal" -O OpenFlow13

Signed-off-by: ldejing <ldejing@vmware.com>
Signed-off-by: Alin-Gabriel Serdean <aserdean@ovn.org>
2022-09-20 02:48:44 +03:00
Paolo Valerio
4a6a473462 netlink-socket: Log extack error messages in netlink transactions.
During a netlink transaction, in case of replies of type NLMSG_ERROR,
the current behavior includes the translation of the error number
received into a string that describes the error code.

Netlink replies may carry a more descriptive error message, and
although it is possible to read those messages using the existing perf
tracepoint, it is more convenient to retrieve them directly from ovs.

This patch extends nl_msg_nlmsgerr() so that it retrieves the message
that later, if present, will be used by nl_sock_transact_multiple__()
in place of the generic descriptive form of the error number.  This is
particularly useful with tc that makes use of such kind of mechanism.

As an example, with this patch applied, the following generic message:

ovs|00239|netlink_socket|DBG|received NAK error=0 (Operation not supported)

becomes:

ovs|00239|netlink_socket|DBG|received NAK error=0 - Conntrack isn't enabled

The layout has been slightly modified to avoid nested parentheses.
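
A minimal sketch of the new log layout, assuming the extack string has
already been extracted from the reply.  The function name is
hypothetical; only the "error=N - text" layout comes from the example
above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: writes the NAK log line into 'buf'.  Prefers the
 * kernel-supplied extack string 'ext_msg' and falls back to strerror(). */
static void
sketch_format_nak(char *buf, size_t size, int error, const char *ext_msg)
{
    const char *text = ext_msg && ext_msg[0] ? ext_msg : strerror(error);

    snprintf(buf, size, "received NAK error=%d - %s", error, text);
}
```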

Suggested-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2022-01-16 22:16:16 +01:00
Paolo Valerio
d08c086e56 netlink-socket: Replace error with txn->error when logging nacked transactions.
In case nl_msg_nlmsgerr() returns true, which basically means that
nlmsg_type == NLMSG_ERROR, we need to log the error code stored by
nl_msg_nlmsgerr(), besides the descriptive representation, instead of
"error".

Fixes: 72d32ac0b3a1 ("netlink-socket: Make caller provide message receive buffers.")
Suggested-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2021-08-16 19:38:13 +02:00
Ansis Atteka
b4a9c9cd84 netlink: make Netlink socket receive buffer 4x larger
Under high load I observed that Netlink socket buffer constantly
fills up for daemons listening for Conntrack Table notifications:

netlink_notifier|WARN|netlink receive buffer overflowed

This patch mitigates the problem by increasing socket
receive buffer size.  Ideally we should try to calculate
the buffer size required, but that would be a more sophisticated
solution than simply increasing the buffer size.

Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ansis Atteka <aatteka@ovn.org>
VMware-BZ: #2724821
2021-03-31 09:32:05 -05:00
Anand Kumar
a1d4207e2c datapath-windows: Add support to configure ct zone limits
This patch implements limiting conntrack entries
per zone using dpctl commands.

Example:
ovs-appctl dpctl/ct-set-limits default=5 zone=1,limit=2 zone=1,limit=3
ovs-appctl dpctl/ct-del-limits zone=4
ovs-appctl dpctl/ct-get-limits zone=1,2,3

- Also update the netlink-socket.c to support netlink family
  'OVS_WIN_NL_CTLIMIT_FAMILY_ID' for conntrack zone limit.

Signed-off-by: Anand Kumar <kumaranand@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
2018-09-20 14:31:34 +03:00
Alin Gabriel Serdean
16b9bae89e v2 netlink-socket: Fix broken build on Windows
Skip network namespace id check on windows since we lack support
and integration for their equivalent at the moment.

Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
---
v2: Integrate comments as suggested by Ben and add him as co author
2018-04-03 14:53:08 +03:00
Flavio Leitner
cf114a7fce netlink linux: enable listening to all nsids
Internal ports may be moved to another network namespace
and when that happens, the vswitch stops receiving netlink
notifications.

This patch enables the vswitch to listen to all network
namespaces that have a nsid assigned into the network
namespace where the socket has been opened.

It requires kernel 4.2 or newer.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-03-31 12:48:36 -07:00
Flavio Leitner
a86bd14ec9 netlink: provide network namespace id from a msg.
The netlink notification's ancillary data contains the network
namespace id (netnsid) needed to identify the device correctly.
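
A minimal sketch of reading that id from a received message, assuming the
kernel attaches it as a SOL_NETLINK / NETLINK_LISTEN_ALL_NSID control
message carrying one int.  The helper name is hypothetical.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Fallbacks for older headers; the values match <linux/socket.h> and
 * <linux/netlink.h>. */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_LISTEN_ALL_NSID
#define NETLINK_LISTEN_ALL_NSID 8
#endif

/* Hypothetical helper: scans the control messages of a received 'msg' for
 * the namespace id.  Returns true and stores it in '*nsid' if found. */
static bool
sketch_get_nsid(struct msghdr *msg, int *nsid)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_NETLINK
            && cmsg->cmsg_type == NETLINK_LISTEN_ALL_NSID
            && cmsg->cmsg_len >= CMSG_LEN(sizeof *nsid)) {
            memcpy(nsid, CMSG_DATA(cmsg), sizeof *nsid);
            return true;
        }
    }
    return false;
}
```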

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-03-31 12:48:31 -07:00
Xiao Liang
fd016ae3fb lib: Move lib/poll-loop.h to include/openvswitch
Poll-loop is the core to implement main loop. It should be available in
libopenvswitch.

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-03 10:47:55 -07:00
Alin Serdean
d553dd54b3 windows: Crash when the handle communication device cannot be found
When trying to uninstall/disable the OVS extension, the driver will
fail to unload properly (requiring a reboot) or hang until ovs-vswitchd
is closed.

The root cause of this behavior is that the handles from ovs-vswitchd
to the kernel communication devices are still open although the
actual device was removed from the kernel.

Trying to close the handles will also fail because they do not exist.

The remaining option is to cause a crash and rely on the service manager
to restart ovs-vswitchd.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/27
Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-04-13 21:51:00 -07:00
Alin Serdean
7e86fe8274 windows: Fix uninitialized variable in netlink-socket
The variable `request_nlmsg` was used without being initialized.

This patch assigns a value to it before being used.

Found by inspection.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-03-07 16:32:43 -08:00
Alin Serdean
74804b2108 windows: fix calls in netlink-socket
Add nl_sock_transact forward declaration, since it is used before
being implemented. This applies only to Windows.

Move nl_sock_subscribe_packet__ function before it is used.

It makes more sense to move it rather than adding a forward declaration
since it is used by the two functions defined above it.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
2017-03-07 13:37:15 -08:00
Roi Dayan
d9c194a1c0 netlink-socket: Fix possibility of nl_transact dereferencing null pointer
Many callers of nl_transact and its wrapper tc_transact pass NULL for replyp,
which is accessed in the error flow without first being checked for NULL.
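
A minimal sketch of the guard, using a hypothetical stand-in for the
transaction function rather than the real nl_transact() code: every path,
including the error path, tests the optional out-pointer before storing
through it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reply type, for illustration only. */
struct sketch_reply {
    int value;
};

/* Hypothetical transaction: 'replyp' may be NULL when the caller does not
 * want the reply. */
static int
sketch_transact(int request, struct sketch_reply **replyp)
{
    static struct sketch_reply reply;

    if (request < 0) {
        if (replyp) {
            *replyp = NULL;     /* Error path: no reply to hand back. */
        }
        return EINVAL;
    }
    reply.value = request;
    if (replyp) {
        *replyp = &reply;
    }
    return 0;
}
```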

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-01-31 08:11:47 -08:00
Terry Wilson
ee89ea7b47 json: Move from lib to include/openvswitch.
To easily allow both in- and out-of-tree building of the Python
wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to
include/openvswitch. This also requires moving lib/{hmap,shash}.h.

Both hmap.h and shash.h were #include-ing "util.h" even though the
headers themselves did not use anything from there, but rather from
include/openvswitch/util.h. Fixing that required including util.h
in several C files mostly due to OVS_NOT_REACHED and things like
xmalloc.

Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-07-22 17:09:17 -07:00
Sairam Venugopal
e70f55edbc Windows: Add support for handling protocol (netlink family)
Windows datapath currently has no notion of netlink family.
It assumes all netlink messages to belong to NETLINK_GENERIC family.
This patch adds support for handling other protocols if the userspace sends it down to kernel.

This patch introduces a new NETLINK_CMD - OVS_CTRL_CMD_SOCK_PROP to manage
all properties associated with a socket. The properties are passed down as
netlink message attributes. This makes it easier to introduce other
properties in the future.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
2016-07-13 15:04:52 -07:00
Paul Boca
e6b298ef73 datapath-windows: Validate Netlink packets' integrity.
Solved access violation when trying to access Netlink message - obtained
with forged IOCTLs.

Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-05-16 22:04:58 -07:00
Ben Warren
64c967795b Move lib/ofpbuf.h to include/openvswitch directory
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-03-30 13:10:18 -07:00
Ben Warren
3e8a2ad145 Move lib/dynamic-string.h to include/openvswitch directory
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-03-19 10:02:12 -07:00
Alin Serdean
d71c423edf Clean code in netlink-socket
Found by inspection.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-02-26 16:52:10 -08:00
Ben Pfaff
0a2869d524 ofpbuf: New function ofpbuf_const_initializer().
A number of times I've looked at code and thought that it would be easier
to understand if I could write an initializer instead of
ofpbuf_use_const().  This commit adds a function for that purpose and
adapts a lot of code to use it, in the places where I thought it made
the code better.

In theory this could improve code generation since the new function can
be inlined whereas ofpbuf_use_const() isn't.  But I guess that's probably
insignificant; the intent of this change is code readability.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
2016-02-19 16:15:44 -08:00
Thadeu Lima de Souza Cascardo
2e46795083 netlink-socket: return correct error code when connect fails
When connect and other calls fail after get_socket_rcvbuf, the return code would
be the rcvbuf size, not errno from the last call.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2016-02-16 11:26:39 +09:00
Alin Serdean
d8d1ef2f07 netlink-socket: Fix log message for subscribe/unsubscribe on Windows.
The warning message was inverted on the performed operation.

Also use the error returned by nl_sock_subscribe_packet__.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-01-04 16:04:32 -08:00
Nithin Raju
b91d3d0339 netlink-socket.c: event polling for packets on windows
Currently, we do busy-polling for packets on Windows. In this patch
we nuke that code and schedule an event.

The code has been tested for packet reads, and CPU utilization of
ovs-vswitchd went down drastically.

I'll send out the changes to get vport events to work in a separate
patch.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-09-29 22:43:48 -07:00
Alin Serdean
9667de98d6 nl_sock_fd is not used under MSVC
Ifdef out nl_sock_fd to make users aware it is not used.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
2015-09-28 08:11:22 -07:00
Nithin Raju
a51a50862a ovs-hyperv: make kernel return values netlink socket like
In this patch, we make changes to userspace as well as
kernel datapath on hyperv to make it more netlink socket
like. Previously, the kernel datapath did not distinguish
between "transport errors" and other errors. Netlink
semantics dictate that netlink functions should
return an error only in the case of a "transport error",
which is generally something fatal, e.g. failure to
communicate with the OVS module, or an invalid command
altogether. Other errors, such as an unsupported action
or an invalid flow key, are not considered "transport
errors", and in such cases, netlink functions are to return
success with a 'struct nlmsgerr' populated in the output
buffer.
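
A minimal sketch of that split, with a hypothetical struct and function
standing in for the real datapath code.  It assumes the usual netlink
convention that the error field carries a negated errno value, or 0 for
a plain acknowledgement.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the error field of 'struct nlmsgerr'. */
struct sketch_nlmsgerr {
    int error;
};

/* Hypothetical call: a nonzero return value means a transport error.
 * A command-level failure returns 0 and is reported through 'out'. */
static int
sketch_deliver(bool transport_ok, int cmd_errno, struct sketch_nlmsgerr *out)
{
    if (!transport_ok) {
        return EIO;             /* Could not reach the datapath at all. */
    }
    out->error = -cmd_errno;    /* 0 acknowledges success. */
    return 0;
}
```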

This patch implements these semantics.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/72
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-04-29 07:35:50 -07:00
Ben Pfaff
b937e116da netlink-socket: Exit NL transaction loop when EINVAL is returned
The nl_sock_transact_multiple function enters an infinite loop
when the invalid argument error, EINVAL, is returned by nl_sock_transact_multiple__.
EINVAL is the error returned by the latter function when a driver
request fails.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/57
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-04-20 12:43:57 -07:00
Sorin Vinturis
190cf53389 datapath-windows: Make GET_PID a separate IOCTL
Added a new IOCTL in order to retrieve the PID from the kernel datapath.
The new method uses a direct and cleaner way, as opposed to the old way
of using a Netlink transaction, avoiding the unnecessary overhead.

Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/31
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-04-02 12:41:50 -07:00
Pravin B Shelar
6fd6ed71cb ofpbuf: Simplify ofpbuf API.
ofpbuf was complicated due to its wide usage across all
layers of OVS.  Now that we have introduced an independent dp_packet
that can be used for datapath packets, we can simplify ofpbuf.
The following patch removes DPDK mbuf and the access API of ofpbuf
members.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-03-03 13:37:39 -08:00
Thomas Graf
e6211adce4 lib: Move vlog.h to <openvswitch/vlog.h>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-12-15 14:15:19 +01:00
Eitan Eliahu
92a5068f61 netlink-socket: Set socket pid number in NL message on Windows.
The pid must be set in the NL header as the driver checks it against the pid in
the instance paired with the socket.

Signed-off-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-11-21 15:23:30 -08:00
Nithin Raju
8341662d1b netlink-socket: Fix a couple of compilation warnings.
Reported-by: Gurucharan Shetty <gshetty@nicira.com>
Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
[blp@nicira.com replaced conventional cast by CONST_CAST]
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-24 16:49:31 -07:00
Nithin Raju
36791e2178 netlink-socket: Add packet subscribe functionality on Windows.
In this patch, we add support in userspace for packet subscribe API
similar to the join/leave MC group API that is used for port events.
The kernel code has already been committed.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-23 13:08:28 -07:00
Nithin Raju
2c26eabfb7 netlink-socket: Use poll_immediate_wake() on Windows.
We have not yet tested the wakeup via pending IRP functionality on
Windows. Hence we use poll_immediate_wake().

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Eitan Eliahu <eliahue@vmware.com>
2014-10-22 09:01:42 -07:00
Nithin Raju
15fd90524b netlink-socket: Fix nl_sock_recv__() on Windows.
In nl_sock_recv__() on Windows, we realloc a new ofpbuf to copy received
data if the caller specified buffer is small. While we do so, we need to
reset some of the other stack variables to point to the new ofpbuf.

Other fixes are around using 'error' rather than 'errno'.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-21 12:35:11 -07:00
Nithin Raju
9189184d09 netlink-socket: Always pass the output buffer in a transaction.
We need to pass down the output buffer so that the kernel can return
transaction status - error or otherwise.

Also, we were processing the output buffer only when
'txn->reply != NULL', i.e. when the caller specified an ofpbuf for the
reply. In this patch, the code has been updated to process the reply
unconditionally, but making sure to copy the reply to the 'txn->reply'
only when it is not NULL. The reason for the unconditional processing is
we can pass up transactional errors in 'txn->error'. Otherwise, it
results in an endless loop of calling nl_transact().

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
2014-10-13 13:27:12 -07:00
Nithin Raju
83cc9d5612 netlink-socket: add support for OVS_WIN_NETDEV_FAMILY
In this patch, we add support for family ID lookup of
OVS_WIN_NETDEV_FAMILY.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-06 15:38:18 -07:00
Eitan Eliahu
64513e6859 netlink-socket: User mode event read for Windows.
User mode sends down three distinct Read ioctl commands for Events, Packet
Reads and Dumps. In case the Packet Read socket can not be distinguished a
Set function will be provided.

Signed-off-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-10-03 15:21:12 -07:00
Nithin Raju
0fd22ae2a2 lib/netlink-socket.c: add support for nl_transact() on Windows
In this patch, we add support for nl_transact() on Windows using
the OVS_IOCTL_TRANSACT ioctl that sends down the request and gets
the reply in the same call to the kernel.

This is obviously a digression from the way it is implemented in
Linux where all the sends are done at once using sendmsg() and
replies are received one at a time.

Initial implementation was in the Linux way using multiple writes
followed by reads, but decided against it since it is not efficient
and also it complicates the state machine in the kernel.

The Windows implementation has equivalent code for handling corner
cases and error conditions similar to Linux. Some of it is not
applicable yet. Eg. the Windows kernel does not embed an error
in the netlink message itself. There's userspace code nevertheless
for this.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-09-19 14:36:58 -07:00
Eitan Eliahu
b8f958eaf3 Netlink_socket.c Join/Unjoin an MC group for event subscription
Use a specific out of band device control to subscribe/unsubscribe a socket
to the driver event queue for notification.

Signed-off-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-09-12 09:54:51 -07:00
Gurucharan Shetty
52a1540a7f netlink-socket: Convert from error number to string correctly.
As mentioned in the comment above the function ovs_strerror(), it
should not be used to convert WINAPI error numbers to string.
Use ovs_lasterror_to_string() instead.

Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
2014-09-10 14:27:15 -07:00
Nithin Raju
5f7487da0a netlink-socket: remove local variable in do_lookup_genl_family.
'sock' is not initialized and hence should not be un-initialized
as well in the failure path.

Reported-by: Gurucharan Shetty <shettyg@nicira.com>
Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
2014-09-09 14:02:48 -07:00
Eitan Eliahu
7fa0961101 netlink-socket: Add support for async notification on Windows.
We keep an outstanding, out of band, I/O request in the driver at all times.
Once an event is generated, the driver queues the event message, completes the
pending I/O and unblocks the calling thread by setting the event in the
overlapped structure in the NL socket. The thread will read all event
messages synchronously through calls to nl_sock_recv().

Signed-off-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-09-09 12:37:53 -07:00
Nithin Raju
fd972eb87a netlink-socket: Use read/write ioctl instead of ReadFile/WriteFile.
The Windows datapath supports a READ/WRITE ioctl instead of ReadFile/WriteFile.
In this change, we update the following:
- WriteFile() in nl_sock_send__() to use DeviceIoControl(OVS_IOCTL_WRITE)
- ReadFile() in nl_sock_recv__() to use DeviceIoControl(OVS_IOCTL_READ)

The WriteFile() call in nl_sock_transact_multiple__() has not been touched
since it is not needed yet.

Main motive for this change is to be able to unblock the DP Dump workflow.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Eitan Eliahu <eliahue@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
2014-08-28 08:52:52 -07:00
Nithin Raju
ebac7fb759 netlink-socket: fix typo to get_sock_pid_from_kernel()
A typo crept in while respinning get_sock_pid_from_kernel() in the previous
patch. Fixing it now. Also, get_sock_pid_from_kernel() doesn't need an OUT
argument. Fixing that too.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-08-27 08:12:16 -07:00
Nithin Raju
4c484aca0d netlink-socket: add support for nl_lookup_genl_mcgroup()
While we work out whether nl_sock_join_mcgroup() will be the mechanism
to support VPORT events, it is easy to add support for
nl_lookup_genl_mcgroup() and make progress on the other commands.

In this patch, we implement support for nl_lookup_genl_mcgroup() only
for the VPORT family though, which is all that dpif-linux.c needs.

Validation:
- A ported dpif-linux.c with epoll code commented out went so far as
to call dp_enumerate! DP Dump commands can be implemented next.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-08-27 08:08:28 -07:00
Nithin Raju
b3fca8a881 netlink-socket.c: add support for do_lookup_genl_family on Windows
In this patch, we add support for querying the genl family id for any
family supported by the OVS kernel datapath. On platforms that support
netlink natively, the operating system assigns a family ID, and the
OS netlink infrastructure supports querying the family ID by name.

In case of Windows, since the OVS datapath provides the netlink support,
it is not necessary to make a call into the kernel. Returning a
family ID that is consistent between the userspace and kernel
is sufficient. Once there is code to support netlink message parsing
as well as constructing netlink messages, we can make a call into
the kernel, but that in itself may not buy anything more than this
approach.

This patch is a precursor to making progress on the other commands.
The next hurdle is to support nl_lookup_genl_mcgroup().

Signed-off-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-08-22 10:14:35 -07:00
Nithin Raju
886dd35a03 netlink-socket.c: implement get pid support on Windows
To verify if the netlink support in the kernel works, I updated
the netlink-socket.c code to get the PID for a given device
descriptor.

In the existing code, userspace sets the PID, which will not be
unique across different processes. So, it is better for the
kernel to generate the PID and give it back to userspace.

dpif-linux.c was ported to Windows (similar to Alin's change in
the cloudbase repo) and was able to exercise the code changes
in netlink-socket.c to read the PID. dpif-linux.c changes are
not being checked in.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Alin Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Saurabh Shah <ssaurabh@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/18
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-08-19 14:28:30 -07:00
Alin Serdean
22326ba6f8 netlink-socket: Adapt to Windows and MSVC.
Add two functions set_sock_pid_in_kernel and portid_next. This will allow
the channel identification for the kernel extension to send back messages.

Replace send with WriteFile equivalent and ignore nl_sock_drain for the moment
under MSVC.

Replace sendmsg and recvmsg with ReadFile and WriteFile equivalents.

On MSVC put in handle instead of fd(sock->fd becomes sock->handle).

Creation of the netlink socket will be replaced by CreateFile equivalent.

Add MAX_STACK_LENGTH for MSVC.  This will be our maximum size for on-stack
copy buffer.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2014-07-29 09:38:42 -07:00