2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-28 21:07:47 +00:00

281 Commits

Author SHA1 Message Date
Flavio Leitner
8c5163fe81 userspace TSO: Include UDP checksum offload.
Virtio doesn't expose flags to control which protocols checksum
offload needs to be enabled or disabled. This patch checks if the
NIC supports UDP checksum offload and active it when TSO is enabled.

Reported-by: Ilya Maximets <i.maximets@ovn.org>
Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2020-02-26 15:24:15 +01:00
Flavio Leitner
29cf9c1b3b userspace: Add TCP Segmentation Offload support
Abbreviated as TSO, TCP Segmentation Offload is a feature which enables
the network stack to delegate the TCP segmentation to the NIC reducing
the per packet CPU overhead.

A guest using vhostuser interface with TSO enabled can send TCP packets
much bigger than the MTU, which saves CPU cycles normally used to break
the packets down to MTU size and to calculate checksums.

It also saves CPU cycles used to parse multiple packets/headers during
the packet processing inside virtual switch.

If the destination of the packet is another guest in the same host, then
the same big packet can be sent through a vhostuser interface skipping
the segmentation completely. However, if the destination is not local,
the NIC hardware is instructed to do the TCP segmentation and checksum
calculation.

It is recommended to check if NIC hardware supports TSO before enabling
the feature, which is off by default. For additional information please
check the tso.rst document.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Tested-by: Ciara Loftus <ciara.loftus.intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2020-01-17 22:27:25 +00:00
Yanqin Wei
3343f8d6cf netdev: use acquire-release semantics for change_seq in netdev
"rxq_enabled" of netdev is writen in the vhost thread and read by pmd
thread once it observes 'change_seq' is updated. This patch is to keep
order on aarch64 or other weak memory model CPU to ensure 'rxq_enabled' is
observed before 'change_seq'.

Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-12-02 14:48:14 -08:00
Darrell Ball
594570ea1c conntrack: Optimize recirculations.
Cache the 'conn' context and use it when it is valid.  The cached 'conn'
context will get reset if it is not expected to be valid; the cost to do
this is negligible.  Besides being most optimal, this also handles corner
cases, such as decapsulation leading to the same tuple, as in tunnel VPN
cases.  A negative test is added to check the resetting of the cached
'conn'.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-09-25 08:58:11 -07:00
William Tu
0de1b42596 netdev-afxdp: add new netdev type for AF_XDP.
The patch introduces experimental AF_XDP support for OVS netdev.
AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
type built upon the eBPF and XDP technology.  It is aims to have comparable
performance to DPDK but cooperate better with existing kernel's networking
stack.  An AF_XDP socket receives and sends packets from an eBPF/XDP program
attached to the netdev, by-passing a couple of Linux kernel's subsystems
As a result, AF_XDP socket shows much better performance than AF_PACKET
For more details about AF_XDP, please see linux kernel's
Documentation/networking/af_xdp.rst. Note that by default, this feature is
not compiled in.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
2019-07-19 17:42:06 +03:00
David Marchand
35c91567c8 dpif-netdev: Only poll enabled vhost queues.
We currently poll all available queues based on the max queue count
exchanged with the vhost peer and rely on the vhost library in DPDK to
check the vring status beneath.
This can lead to some overhead when we have a lot of unused queues.

To enhance the situation, we can skip the disabled queues.
On rxq notifications, we make use of the netdev's change_seq number so
that the pmd thread main loop can cache the queue state periodically.

$ ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 1:
  isolated : true
  port: dpdk0             queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 2:
  isolated : true
  port: vhost1            queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost3            queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 15:
  isolated : true
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  0 %
pmd thread numa_id 0 core_id 16:
  isolated : true
  port: vhost0            queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhost2            queue-id:  0 (enabled)   pmd usage:  0 %

$ while true; do
  ovs-appctl dpif-netdev/pmd-rxq-show |awk '
  /port: / {
    tot++;
    if ($5 == "(enabled)") {
      en++;
    }
  }
  END {
    print "total: " tot ", enabled: " en
  }'
  sleep 1
done

total: 6, enabled: 2
total: 6, enabled: 2
...

 # Started vm, virtio devices are bound to kernel driver which enables
 # F_MQ + all queue pairs
total: 6, enabled: 2
total: 66, enabled: 66
...

 # Unbound vhost0 and vhost1 from the kernel driver
total: 66, enabled: 66
total: 66, enabled: 34
...

 # Configured kernel bound devices to use only 1 queue pair
total: 66, enabled: 34
total: 66, enabled: 19
total: 66, enabled: 4
...

 # While rebooting the vm
total: 66, enabled: 4
total: 66, enabled: 2
...
total: 66, enabled: 66
...

 # After shutting down the vm
total: 66, enabled: 66
total: 66, enabled: 2

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-06-26 18:43:39 +01:00
Ilya Maximets
4f746d526d netdev-offload: Rename offload providers.
Flow API providers renamed to be consistent with parent module
'netdev-offload' and look more like each other.

'_rte_' replaced with more convenient '_dpdk_'.

We'll have following structure:

  Common code:
    lib/netdev-offload-provider.h
    lib/netdev-offload.c
    lib/netdev-offload.h

  Providers:
    lib/netdev-offload-tc.c
    lib/netdev-offload-dpdk.c

'netdev-offload-dummy' still resides inside netdev-dummy, but it
makes no much sence to move it out of there.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
2019-06-11 09:39:36 +03:00
Ilya Maximets
b6cabb8f8f netdev: Split up netdev offloading to separate module.
New module 'netdev-offload' created to manage different flow API
implementations. All the generic and provider independent code moved
there from the 'netdev' module.

Flow API providers further encapsulated.

The only function that was changed is 'netdev_any_oor'.
Now it uses offloading related hmap instead of common 'netdev_shash'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Roi Dayan <roid@mellanox.com>
2019-06-11 09:39:36 +03:00
Ilya Maximets
5fc5c50f3d netdev: Dynamic per-port Flow API.
Current issues with Flow API:

* OVS calls offloading functions regardless of successful
  flow API initialization. (ex. on init_flow_api failure)
* Static initilaization of Flow API for a netdev_class forbids
  having different offloading types for different instances
  of netdev with the same netdev_class. (ex. different vports in
  'system' and 'netdev' datapaths at the same time)

Solution:

* Move Flow API from the netdev_class to netdev instance.
* Make Flow API dynamic, i.e. probe the APIs and choose the
  suitable one.

Side effects:

* Flow API providers localized as possible in their modules.
* Now we have an ability to make runtime checks. For example,
  we could check if particular device supports features we
  need, like if dpdk device supports RSS+MARK action.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Roi Dayan <roid@mellanox.com>
2019-06-11 09:39:36 +03:00
Ilya Maximets
f7e4685015 treewide: Clean up inclusions of netdev-dpdk header.
'netdev-dpdk.h' provides only 'netdev_dpdk_register' and
'free_dpdk_buf' which are not used in these files and should
not be used.
Leftovers from the already removed code.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-03-14 08:45:05 +00:00
Ilya Maximets
a47e2db209 dp-packet: Refactor offloading API.
1. No reason to have mbuf related APIs in a generic code.
2. Not only RSS/checksums should be invalidated in case of tunnel
   decapsulation or sending to 'ring' ports.

In order to fix two above issues, new function
'dp_packet_reset_offload' introduced. In order to clean up/unify
the code and simplify addition of new offloading features to non-DPDK
version of dp_packet, introduced 'ol_flags' bitmask. Additionally
reduced code complexity in 'dp_packet_clone_with_headroom' by using
already existent generic APIs.

Unfortunately, we still need to have a special case for mbuf
initialization inside 'dp_packet_init__()'.
'dp_packet_init_specific()' introduced for this purpose as a generic
API for initialization of the implementation-specific fields.

Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2019-03-13 09:51:30 +00:00
Daniel Alvarez
3bdf8b620b netdev: Add comment to allow removing a workaround in the future
This patch [0] in glibc fixes an issue which is right now workarounded
in OVS by [1]. I'm adding a comment to indicate that from glibc 2.28
and beyond, the workaround is not needed so that we can eventually
remove it.

[0] https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c1f86a33ca32e26a9d6e29fc961e5ecb5e2e5eb4
[1] 3434d30686

Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-12-12 10:49:05 -08:00
Sriharsha Basavapatna via dev
57924fc91c revalidator: Rebalance offloaded flows based on the pps rate
This is the third patch in the patch-set to support dynamic rebalancing
of offloaded flows.

The dynamic rebalancing functionality is implemented in this patch. The
ukeys that are not scheduled for deletion are obtained and passed as input
to the rebalancing routine. The rebalancing is done in the context of
revalidation leader thread, after all other revalidator threads are
done with gathering rebalancing data for flows.

For each netdev that is in OOR state, a list of flows - both offloaded
and non-offloaded (pending) - is obtained using the ukeys. For each netdev
that is in OOR state, the flows are grouped and sorted into offloaded and
pending flows.  The offloaded flows are sorted in descending order of
pps-rate, while pending flows are sorted in ascending order of pps-rate.

The rebalancing is done in two phases. In the first phase, we try to
offload all pending flows and if that succeeds, the OOR state on the device
is cleared. If some (or none) of the pending flows could not be offloaded,
then we start replacing an offloaded flow that has a lower pps-rate than
a pending flow, until there are no more pending flows with a higher rate
than an offloaded flow. The flows that are replaced from the device are
added into kernel datapath.

A new OVS configuration parameter "offload-rebalance", is added to ovsdb.
The default value of this is "false". To enable this feature, set the
value of this parameter to "true", which provides packets-per-second
rate based policy to dynamically offload and un-offload flows.

Note: This option can be enabled only when 'hw-offload' policy is enabled.
It also requires 'tc-policy' to be set to 'skip_sw'; otherwise, flow
offload errors (specifically ENOSPC error this feature depends on) reported
by an offloaded device are supressed by TC-Flower kernel module.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Sathya Perla <sathya.perla@broadcom.com>
Reviewed-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-10-19 11:27:52 +02:00
Sriharsha Basavapatna via dev
738c785ff1 dpif-netlink: Detect Out-Of-Resource condition on a netdev
This is the first patch in the patch-set to support dynamic rebalancing
of offloaded flows.

The patch detects OOR condition on a netdev port when ENOSPC error is
returned by TC-Flower while adding a flow rule. A new structure is added
to the netdev called "netdev_hw_info", to store OOR related information
required to perform dynamic offload-rebalancing.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Sathya Perla <sathya.perla@broadcom.com>
Reviewed-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-10-19 11:27:45 +02:00
Ben Pfaff
63cf14cd7a netdev: Properly clear 'details' when iterating in NETDEV_QOS_FOR_EACH.
The function comment for netdev_queue_dump_next() said that it cleared its
'detail' argument, but it didn't actually do that, which meant that details
could be incorrectly carried along from one queue to the next.

Reported-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
2018-10-03 14:08:17 -07:00
Daniel Alvarez
3434d30686 netdev: Retry getting interfaces on inconsistent dumps from kernel
This patch in glibc [0] is fixing a bug where we may be getting
inconsistent dumps from the kernel when listing interfaces due to
a race condition.

This could happen if we try to retrieve them while interfaces are
being added/removed from the system at the same time.
For systems running against old glibc versions, this patch is retrying
the operation up to 3 times and then proceeding by logging a
warning.

Note that 3 times should be enough to not delay the operation much
and since it's unlikely that we hit the race condition 3 times in
a row. Still, if this happened, this patch is not changing the
current behavior.

[0] https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c1f86a33ca32e26a9d6e29fc961e5ecb5e2e5eb4

Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Co-authored-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-08-15 13:38:09 -07:00
John Hurley
88dcf2aa82 netdev-provider: add class op to get block_id
Add a new class op for netdevs to get the block_id if one exists. The
block_id is used in offload ops to group multiple qdiscs together.

Stub calls are made to the new class op (implementation to follow in
further patches). The default block_id of 0 (no block) will be used in
these cases.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-06-29 14:51:47 +02:00
Gavi Teitz
d63ca5329f dpctl: Properly reflect a rule's offloaded to HW state
Previously, any rule that is offloaded via a netdev, not necessarily
to the HW, would be reported as "offloaded". This patch fixes this
misalignment, and introduces the 'dp' state, as follows:

rule is in HW via TC offload  -> offloaded=yes dp:tc
rule is in not HW over TC DP  -> offloaded=no  dp:tc
rule is in not HW over OVS DP -> offloaded=no  dp:ovs

To achieve this, the flows's 'offloaded' flag was encapsulated in a new
attrs struct, which contains the offloaded state of the flow and the
DP layer the flow is handled in, and instead of setting the flow's
'offloaded' state based solely on the type of dump it was acquired
via, for netdev flows it now sends the new attrs struct to be
collected along with the rest of the flow via the netdev, allowing
it to be set per flow.

For TC offloads, the offloaded state is set based on the 'in_hw' and
'not_in_hw' flags received from the TC as part of the flower. If no
such flag was received, due to lack of kernel support, it defaults
to true.

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
[simon: resolved conflict in lib/dpctl.man]
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-06-18 09:57:37 +02:00
William Tu
754f8acb45 netdev-native-tnl: refactor the tunnel push header.
The patch adds additional 'struct netdev *' to the
native tunnel's push_header() interface.  This is used
for later GRE sequence number support.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-05-21 20:33:30 -07:00
Jan Scheurich
8492adc270 netdev: Add optional qfill output parameter to rxq_recv()
If the caller provides a non-NULL qfill pointer and the netdev
implemementation supports reading the rx queue fill level, the rxq_recv()
function returns the remaining number of packets in the rx queue after
reception of the packet burst to the caller. If the implementation does
not support this, it returns -ENOTSUP instead. Reading the remaining queue
fill level should not substantilly slow down the recv() operation.

A first implementation is provided for ethernet and vhostuser DPDK ports
in netdev-dpdk.c.

This output parameter will be used in the upcoming commit for PMD
performance metrics to supervise the rx queue fill level for DPDK
vhostuser ports.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2018-05-11 08:08:24 +01:00
Darrell Ball
762ceb66b2 netdev: If MTU set fails, issue warn log.
Recently, an issue was debugged that was thought to be a bond
failover triggered issue.  It turned out to an vlan interface MTU set issue
that had nothing to do with bonding or most other likely possibilities.
Besides the effect of not setting the MTU to the desired value, this can
result in increased netlink traffic and processing with associated wasted
work. Let us flag a configuration issue at warn level (rather than dbg) to
catch the problem early.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-04-18 11:09:58 -07:00
Ben Pfaff
d2a60e57a8 netdev: Fix typos in comment.
Fixes: ee4776b8bce1 ("netdev: New function netdev_get_ip_by_name().")
Suggested-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-04-17 08:33:41 -07:00
Ben Pfaff
ee4776b8bc netdev: New function netdev_get_ip_by_name().
This is like netdev_get_in4_by_name() but accepts any IP address instead
of just an IPv4 address.

It will acquire its first user in an upcoming commit.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
2018-04-16 14:53:27 -07:00
Ben Pfaff
93b7faf1c7 socket-util: Add more functions for IPv[46] sockaddr and sockaddr_storage.
The existing functions for working with sockaddr_storage that contain an
IPv4 or IPv6 address are useful.  This commit adds more functions for
working with them, as well as a parallel set of functions for struct
sockaddr.

This also adds an initial user for some of the new sockaddr functions in
netdev.c.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
2018-04-16 14:53:27 -07:00
Ben Pfaff
dfc77282c5 ofp-print: Move much of the printing code into message-specific files.
Until now, the ofp-print code has had a lot of logic specific to
individual messages.  This code is better put with the other code specific
to those messages, so this commit starts to migrate it.

There is more work of a similar type to do, but this is a reasonable start.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2018-03-14 11:41:22 -07:00
Justin Pettit
e883448e3f dp-packet: Add index to DP_PACKET_BATCH_FOR_EACH to prevent shadowing.
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2018-02-28 14:53:27 -08:00
Michal Weglicki
971f4b394c netdev: Custom statistics.
- New get_custom_stats interface function is added to netdev. It
  allows particular netdev implementation to expose custom
  counters in dictionary format (counter name/counter value).
- New statistics are retrieved using experimenter code and
  are printed as a result to ofctl dump-ports.
- New counters are available for OpenFlow 1.4+.
- New statistics are printed to output via ofctl only if those
  are present in reply message.
- New statistics definition is added to include/openflow/intel-ext.h.
- Custom statistics are implemented only for dpdk-physical
  port type.
- DPDK-physical implementation uses xstats to collect statistics.
  Only dropped and error counters are exposed.

Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-01-10 15:29:13 -08:00
Ben Pfaff
34944e81f0 Merge branch 'dpdk_merge' of https://github.com/istokes/ovs into HEAD 2018-01-02 07:45:17 -08:00
Ben Pfaff
b2befd5bb2 sparse: Add guards to prevent FreeBSD-incompatible #include order.
FreeBSD insists that <sys/types.h> be included before <netinet/in.h> and
that <netinet/in.h> be included before <arpa/inet.h>.  This adds guards to
the "sparse" headers to yield a warning if this order is violated.  This
commit also adjusts the order of many #includes to suit this requirement.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
2017-12-22 12:58:02 -08:00
Ilya Maximets
b30896c969 netdev: Remove unused may_steal.
Not needed anymore because 'may_steal' already handled on
dpif-netdev layer and always true.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com
2017-12-20 21:07:46 +00:00
Yifeng Sun
59b1e023ae netdev: netdev_get_etheraddr is not functioning as advertised.
netdev_get_etheraddr claims to clear 'mac' on error, but it fails to do so.
When looking further into both netdev_windows_get_etheraddr() and
netdev_linux_get_etheraddr(), 'mac' is also not cleared. This will lead to
usage of uninitialised ofputil_phy_port.hw_addr.

v1 -> v2: fixed a bug in v1 found by Ben, thanks Ben.

Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-30 13:36:29 -08:00
Ben Pfaff
fa54741ea5 netdev: Eliminate redundant ifindex mapping.
Until now, the code for mapping ODP port number to ifindexes and vice versa
has maintained two completely separate data structures, one for each
direction.  It was possible for the two mappings to become out of sync
with each other since either one could change independently.  This commit
merges them into a single data structure (with two indexes), which at least
means that if one is removed then the other is as well.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
2017-11-15 10:57:49 -08:00
Ben Pfaff
8639555322 netdev: Indentation and style fixes.
White space changes only.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-11-15 10:57:30 -08:00
Ben Pfaff
0d8efdc9ca netdev: Change macro to function.
There was no reason that this should have been a macro.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-11-14 10:14:18 -08:00
Ashish Varma
97459c2f01 netdev, dpif: fix the crash/assert on port delete
a crash is seen in "netdev_ports_remove" when an interface is deleted and added
back in the system and when the interface is part of a bridge configuration.
e.g. steps:
  create a tap0 interface using "ip tuntap add.."
  add the tap0 interface to br0 using "ovs-vsctl add-port.."
  delete the tap0 interface from system using "ip tuntap del.."
  add the tap0 interface back in system using "ip tuntap add.."
                       (this changes the ifindex of the interface)
  delete tap0 from br0 using "ovs-vsctl del-port.."

In the function "netdev_ports_insert", two hmap entries were created for
mapping "portnum -> netdev" and "ifindex -> portnum".
When the interface is deleted from the system, the "netdev_ports_remove"
function is not getting called and the old ifindex entry is not getting
cleaned up from the "ifindex_to_port" hmap.

As part of the fix, added function "dpif_port_remove" which will call
"netdev_ports_remove" in the path where the interface deletion from the system
is detected.
Also, in "netdev_ports_remove", added the code where the "ifindex_to_port_data"
(ifindex -> portnum map node) is getting freed when the ifindex is not
available any more. (as the interface is already deleted.)

VMware-BZ: #1975788
Signed-off-by: Ashish Varma <ashishvarma.ovs@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-13 11:05:31 -08:00
Ilya Maximets
14f137ba1f netdev: Remove EOPNOTSUPP related comment for netdev_send().
Since 57eebbb4c315, the caller must make sure that 'netdev' supports
sending. This mentioned at the start of the comment.

Fixes: 57eebbb4c315 ("dpif-netdev: Don't try to output on a device without txqs.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
2017-11-09 14:11:30 -08:00
Xiao Liang
fd016ae3fb lib: Move lib/poll-loop.h to include/openvswitch
Poll-loop is the core to implement main loop. It should be available in
libopenvswitch.

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-03 10:47:55 -07:00
William Tu
d2a5f17093 netdev: Fix memory leak on error path.
Instead of freeing in the error path, move the allocation
after it.  Found by inspection.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-10-12 14:50:03 -07:00
Joe Stringer
c8d0f32a60 netdev: Free ifidx mapping in netdev_ports_remove().
Previously, netdev_ports_insert() would allocate and insert an
ifindex->odp_port mapping, but netdev_ports_remove() would never remove
the mapping or free the mapping structure. This patch fixes these up.

Fixes: 32b77c316d9982("dpif: Save added ports in a port map.")
Reported-by: Andy Zhou <azhou@ovn.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
2017-08-11 09:46:54 -07:00
Daniel Alvarez
0ac01021a9 netdev: check for NULL fields in netdev_get_addrs
When the interfaces list is retrieved through getiffaddrs(), there
might be elements with iface_name set to NULL.

This patch checks ifa_name to be not NULL before comparing it to the
actual device name in the loop that calculates how many interfaces
exist with that same name.

Also, this patch checks that ifa_netmask is not NULL for coherence
with the existing code so that it doesn't allocate more memory
than needed if this field is NULL.

Note, that these checks are already being done later in the function
so it should be done in both places.

Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Lance Richardson <lrichard@redhat.com>
2017-08-08 16:45:18 -07:00
Eelco Chaudron
8c2c225e48 netdev: Fix netdev_open() to track and recreate classless interfaces
Due to commit 67ac844 an existing issue with OVS persisten ports
surfaced. If we revert the commit we no longer get the error, and
basic traffic will flow. However the wrong netdev class is used, hence
the wrong callbacks get called.

The main issue is with netdev_open() being called with type = NULL
before the interface is actually configured in the system. This patch
tracks these "auto" generated interfaces, and once netdev_open() gets
called with a valid type, re-configures (re-create) it.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-08-02 13:49:10 -07:00
Roi Dayan
dfaf79ddd9 dpif: Refactor obj type from void pointer to dpif_class
It's basically what is being passed today and passing a specific
type adds a compiler type check.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-07-27 10:17:46 +02:00
Haifeng Lin
f70ad93492 netdev: Fix crash when ifa_netmask is null.
glibc sometimes doesn't initialize the ifa_netmask and ifa_addr fields, if
the ioctl to fetch them fails.  Check ifa_name also just for paranoia.

Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-07-12 08:50:25 -07:00
Zoltán Balogh
429be0eeed netdev: Fix crash when interface option is changed to invalid value.
When trying to modify an interface option (e.g. remote IP of a GRE port) to
an invalid value, the vswitchd does crash. For instance:
 ovs-vsctl add-br br0
 ovs-vsctl add-port br0 gre0 -- set interface gre0 type=gre \
           options:remote_ip=10.0.0.2
 ovs-vsctl set interface gre0 options:remote_ip=9.9.9

The bug is caused by trying to dereference a NULL pointer. It was introduced
by the commit 9fff138ec3a6. Before that, the NULL pointer was handled by the
VLOG_WARN_BUF macro.

Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
CC: Daniele Di Proietto <diproiettod@vmware.com>
Fixes: 9fff138ec3a6 ("netdev: Add 'errp' to set_config().")
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-07-11 17:35:29 -07:00
Ben Pfaff
875ab13020 userspace: Handling of versatile tunnel ports
In netdev_gre_build_header(), GRE protocol and VXLAN next_potocol is set based
on packet_type of flow. If it's about an Ethernet packet, it is set to
ETP_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERNET, it is set
according to the name space type.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-06-27 17:28:30 -04:00
Ben Pfaff
81765c00a1 openvswitch.h: Use odp_port_t for port numbers in userspace-only structs.
Using the correct type reduces the need for type conversions.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
Reviewed-by: nickcooper-zhangtonghao <nic@opencloud.tech>
2017-06-20 07:35:49 +08:00
Paul Blakey
1067cf93ac netdev: Init flow api on already added ports on offload enable
Ports already added to a switch are not being initialized for offloading
so when enabling offload we need to go over those ports.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-15 11:56:01 +02:00
Paul Blakey
6c34398480 dpif-netlink: Use netdev flow get api to query a flow
Search all datapath added netdevs for a given flow
using netdev flow api and parse it back to dpif flow.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-15 11:49:26 +02:00
Paul Blakey
0335a89ced dpif-netlink: Use netdev flow del api to delete a flow
If a flow was offloaded to a netdev we delete it using netdev
flow api.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-15 11:48:22 +02:00
Paul Blakey
f2280b4198 dpif-netlink: Dump netdevs flows on flow dump
While dumping flows, dump flows that were offloaded to
netdev and parse them back to dpif flow.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-15 11:39:51 +02:00