Device can have multiple IP address but netdev_get_in4/6()
returns only one configured IPv6 address. Following
patch fixes it.
OVS router is also updated to return source ip address for
given destination, This is required when interface has multiple
IP address configured.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Made to simplify creation of derived classes.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Until now it's been pretty hard to properly test any of the queue support,
because the dummy network device doesn't have any queues. By adding one
queue to the dummy network device (queue 0), we can get slightly higher
confidence that OVS queue support works correctly. I suppose we could do
even better if we made the dummy network device support modifying the
queues, but that's a job for another day.
Signed-off-by: Ben Pfaff <blp@ovn.org>
This saves some code and improves clarity, in my opinion.
Some of these changes just change an inet_pton() call into a similar
ip_parse() or ipv6_parse() call. In those cases the benefit is better
type safety, since inet_pton()'s output parameter is type "void *".
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
Based on IPv4 tests, test tunnels over IPv6. In order to do that, add
netdev-dummy/ip6addr command for dummy bridges, and get_in6 support for
netdev-dummy as well.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
DPDK mbufs contain a valid RSS hash only if PKT_RX_RSS_HASH is
set in 'ol_flags'. Otherwise the hash is garbage and doesn't
relate to the packet.
This fixes an issue with vhost, which, being a virtual NIC, doesn't
compute the hash.
Reported-by: Dongjun <dongj@dtdream.com>
Suggested-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Define struct eth_addr and use it instead of a uint8_t array for all
ethernet addresses in OVS userspace. The struct is always the right
size, and it can be assigned without an explicit memcpy, which makes
code more readable.
"struct eth_addr" is a good type name for this as many utility
functions are already named accordingly.
struct eth_addr can be accessed as bytes as well as ovs_be16's, which
makes the struct 16-bit aligned. All use seems to be 16-bit aligned,
so some algorithms on the ethernet addresses can be made a bit more
efficient making use of this fact.
As the struct fits into a register (in 64-bit systems) we pass it by
value when possible.
This patch also changes the few uses of Linux specific ETH_ALEN to
OVS's own ETH_ADDR_LEN, and removes the OFP_ETH_ALEN, as it is no
longer needed.
This work stemmed from a desire to make all struct flow members
assignable for unrelated exploration purposes. However, I think this
might be a nice code readability improvement by itself.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Until now there have been two variants for --enable-dummy:
* --enable-dummy: This adds support for "dummy" dpif and netdev.
* --enable-dummy=override: In addition, this replaces *every* existing
dpif and netdev by the dummy type.
The latter is useful for testing but it defeats the possibility of using
the userspace native tunneling implementation (because all the tunnel
netdevs get replaced by dummy netdevs). Thus, this commit adds a third
variant:
* --enable-dummy=system: This replaces the "system" dpif and netdev
by dummies but leaves the others untouched.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
This is the only missing piece to make native tunneling work with dummy
devices for testing purposes.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
This conforms with the interface described in netdev-provider.h.
Found when experimenting with native tunneling and dummy devices.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
The 'list' member is only used (two users) in the slow path.
This commit removes it to reduce the struct size
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This would trigger if someone tried to switch a dummy device between
active and passive connections. It's not very important because dummy
devices are only enabled during testing.
Found by LLVM scan-build.
Reported-by: Kevin Lo <kevlo@FreeBSD.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
We already have the 'dp_hash' embedded in the metadata. This caused
confusion in the code. With this commit it should be clear that
'rss_hash' is the packet hash used for internal purposes, while
'md.dp_hash' is part of the flow, computed during the execution of
certain actions.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
GNU C++ isn't too happy with ovs-atomic.h. We could fix that (maybe we
should) but the report I received from a C++ user implied to me that it
would be just as useful to just drop the unnecessary #include
"ovs-atomic.h" from hmap.h.
Reported-by: Michael Hu <humichael@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
ofpbuf was complicated due to its wide usage across all
layers of OVS, Now we have introduced independent dp_packet
which can be used for datapath packet, we can simplify ofpbuf.
Following patch removes DPDK mbuf and access API of ofpbuf
members.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Currently dp-packet make use of ofpbuf for managing packet
buffers. That complicates ofpbuf, by making dp-packet
independent of ofpbuf both libraries can be optimized for
their own use case.
This avoids mapping operation between ofpbuf and dp_packet
in datapath upcalls.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
dp_packet is short and better name for datapath packet
structure.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Expose the struct ovs_list definition in <openvswitch/list.h>. Keep the
list access API private for now.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
struct list is a common name and can't be used in public headers.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.
Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This commit adds a new API to the 'struct netdev_class' which
allows user to configure the number of tx queues and rx queues
of 'netdev'. Upcoming patches will use this function to set
multiple tx/rx queues when adding the netdev to dpif-netdev.
Currently, only netdev-dpdk module implements this function.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
There are couple of reasons to remove this support:
* This is used in very old OVS use-case. It is much better
to read stats directly from OVS.
* Forthcoming commit will remove support for setting stats
for vport. The stats update depends on stats-set.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
This commit adds new variable n_txq to 'struct netdev' for recording
the number of tx queues. Correspondingly, the send_*() functions are
extended to accept queue id as input argument.
All 'netdev-*' implementation will ignore the queue id since having
multiple tx queues is not supported. Upcomping patches will start
using it and create multiple tx queues for dpdk netdev.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This commit adds a new API to the 'struct netdev_class' which
allows user to query the numa node id the 'netdev' is on.
Currently, only netdev-dpdk module implements this function.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
These function are used to stored the packet hash. 'netdev-dpdk'
automatically set this value to the RSS hash returned by the
NIC. Other 'netdev's set it to 0 (which is an invalid hash
value), so that callers can compute the hash on their own.
If DPDK support is enabled, struct dpif_packet's member
'dp_hash' is removed and 'pkt.hash.rss' from DPDK mbuf is used
This commit also configure DPDK devices to compute RSS hash
for UDP and IPv6 packets
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Even though there is no need to optimize netdev-dummy, it might be
good to do this right, in case it serves as an inspiration for
something else later.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The netdev_send function has been modified to accept multiple packets, to
allow netdev providers to amortize locking and queuing costs.
This is especially true for netdev-dpdk.
Later commits exploit the new API.
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This commit introduces a new data structure used for receiving packets from
netdevs and passing them to dpifs.
The purpose of this change is to allow storing some private data for each
packet. The subsequent commits make use of it.
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Using without any parameter, this command list the connection
state of all netdev-dummy devices that are configured to make
active connections.
Optionally, the name of the netdev-dummy device can be supplied
as a parameter.
The states will be displayed as:
"connected": The socket has been connected to the listener.
"disconnected": The socket is not connected.
"unknown": It is not an active dummy device.
CC: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
This commit can be seen as a partial revert of commit
da4a619179d (netdev: Globally track port status changes)
by adding the 'change_seq' to 'struct netdev'.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Preparation for multi queue netdev IO. There are no functional changes
in this patch.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
DPDK netdev need to access ofpbuf while sending buffer. Following
patch changes netdev_send accordingly.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
DPDK can receive multiple packets but current netdev API does
not allow that. Following patch allows dpif-netdev receive batch
of packet in a rx_recv() call for any netdev port. This will be
used by dpdk-netdev.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
netdev-dummy will mostly be used for testing and debugging over fairly
reliable connection. Reduce reconnect back off timeout in case the first
connect attempt failed.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
The netdev-dummy unit test ran into the reconnect condition on Jarno's
machine. With his test environment, we found and fixed the bugs in
handling reconnect.
Co-authored-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
netdev-dummy is mostly useed over reliable link for testing. Periodic
probing over those links seem overkill.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The dummy ports thus far only support passive connections. It can
listen for multiple incoming connection requests but not make active
connections. This patch adds support of active stream, so that a
dummy port can be configured with either passive or active connections.
The net result is that dummy ports can now connect to each other,
without being patch ports. This feature will be useful in adding test
cases of future commits.
Signed-off-by: Andy Zhou <azhou@nicira.com>
This is done to avoid collisions and confusions with libpcap symbols,
like pcap_read()
Signed-off-by: Luigi Rizzo <rizzo@iet.unipi.it>
Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Update the netdev_class so that struct ofpbuf * is passed to rx_recv()
to provide both the data and size of the data to read a packet into.
On success, update struct ofpbuf size inside netdev_class rx_recv
implementation and return 0. This moves logic from the caller.
On error a positive error code is returned, whereas previously
a negative error code was returned. This is a more common convention.
This patch should not have any behavioural changes.
This patch is in preparation for the netdev-linux variant of rx_recv()
making use of headroom in the struct ofpbuf * parameter to push a VLAN tag
obtained from auxdata.
Signed-off-by: Simon Horman <horms@verge.net.au>
Co-authored-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Until now, netdev_dummy_rx_wait() has only checked whether the receive
queue for the dummy device is currently empty. This has worked OK because
in practice packets were queued to dummy devices only from the same thread
that attempted to receive them. An upcoming commit will use different
threads for these purposes, so this commit switches to a notification
method that works cross-thread.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Previously, we tracked status changes for ofports on a per-device basis.
Each time in the main thread's loop, we would inspect every ofport
to determine whether the status had changed for corresponding devices.
This patch replaces the per-netdev change_seq with a global 'struct seq'
which tracks status change for all ports. In the average case where
ports are not constantly going up or down, this allows us to check the
sequence once per main loop and not poll any ports. In the worst case,
execution is expected to be similar to how it is currently.
In a test environment of 5000 internal ports and 50 tunnel ports with
bfd, this reduces average CPU usage of the main thread from about 40% to
about 35%.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
netdev-dummy and netdev-vport use the function name netdev_poll_notify()
for the same purpose as netdev-linux/bsd's netdev_*_changed(). This patch
changes the former two to be more consistent with the linux/bsd naming
scheme.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This alters the way rx packets are accounted for by
counting them when they are processed by netdev_dummy_rx_recv(),
which seems to be a common path used by all received packets.
Previously accounting was done earlier, in netdev_dummy_receive(),
however this does not appear to count packets that are received via
a socket.
This resolves packet counting errors reported by the following
OFtest tests:
port_stats.MultiFlowStats
port_stats.SingleFlowStats
pktact.WildcardPriorityWithDelete
pktact.WildcardPriority
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>