2
0
mirror of https://github.com/openvswitch/ovs synced 2025-08-28 04:47:49 +00:00

109 Commits

Author SHA1 Message Date
Ilya Maximets
77ac0b28c8 netdev: Always clear struct ifreq before ioctl.
It's not nice to send random stack memory to a kernel over ioctl:

  Uninitialized bytes in ioctl_common_pre
    at offset 20 inside [0x7fff8899f3c0, 40)
  WARNING: MemorySanitizer: use-of-uninitialized-value
   0 0x1180b7 in af_inet_ioctl lib/socket-util-unix.c:417:15
   1 0x1180ff in af_inet_ifreq_ioctl lib/socket-util-unix.c:428:13
   2 0x11e4fd in netdev_linux_set_mtu lib/netdev-linux.c:2005:13
   3 0xba250b in netdev_set_mtu lib/netdev.c:1132:30
   4 0x59a19f in update_mtu_ofproto ofproto/ofproto.c:3042:18
   5 0x596947 in update_mtu ofproto/ofproto.c:3024:5
   6 0x5976d6 in ofport_install ofproto/ofproto.c:2617:5
   7 0x572e96 in update_port ofproto/ofproto.c:2893:21
   8 0x57636b in ofproto_port_add ofproto/ofproto.c:2208:17
   9 0x4f9e0b in iface_do_create vswitchd/bridge.c:2203:13
  10 0x4f7f43 in iface_create vswitchd/bridge.c:2246:13
  11 0x4f7a63 in bridge_add_ports__ vswitchd/bridge.c:1225:21
  12 0x4d8ea4 in bridge_add_ports vswitchd/bridge.c:1241:5
  13 0x4cce3a in bridge_reconfigure vswitchd/bridge.c:952:9
  14 0x4ca156 in bridge_run vswitchd/bridge.c:3439:9
  15 0x5278e7 in main vswitchd/ovs-vswitchd.c:137:9
  16 0x7f9def in __libc_start_call_main
  17 0x7f9def in __libc_start_main@GLIBC_2.2.5
  18 0x432b14 in _start (vswitchd/ovs-vswitchd+0x432b14)

We do already initialize this structure for a few ioctls, let's
do that for all of them.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2024-12-02 20:22:01 +01:00
Adrian Moreno
6240c0b4c8 netdev: Add netdev_get_speed() to netdev API.
Currently, the netdev's speed is being calculated by taking the link's
feature bits (using netdev_get_features()) and transforming them into
bps.

This mechanism can be both inaccurate and difficult to maintain, mainly
because we currently use the feature bits supported by OpenFlow which
would have to be extended to support all new feature bits of all netdev
implementations while keeping the OpenFlow API intact.

In order to expose the link speed accurately for all current and future
hardware, add a new netdev API call that allows the implementations to
provide the current and maximum link speeds in Mbps.

Internally, the logic to get the maximum supported speed still relies on
feature bits so it might still get out of sync in the future. However,
the maximum configurable speed is not used as much as the current speed
and these feature bits are not exposed through the netdev interface so
it should be easier to add more.

Use this new function instead of netdev_get_features() where the link
speed is needed.

As a consequence of this patch, link speeds of cards is properly
reported (internally in OVSDB) even if not supported by OpenFlow.
A test verifies this behavior using a tap device.

Also, in order to avoid using the old, this patch adds a checkpatch.py
warning if the old API is used.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2137567
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
2023-07-17 20:03:32 +02:00
Ben Pfaff
b53433079b treewide: Get rid of // comments, even inside comments.
Just a style fix.

With this patch, the following reports no hits:

git ls-files | grep '\.[ch]$' | grep -vE 'datapath|sflow' \
    | xargs grep -n // | grep -vE "http|s/|'|\""

Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2019-01-25 13:09:52 -08:00
Ilya Maximets
1270b6e52c treewide: Wider use of packet batch APIs.
This patch replaces most of direct accesses to the dp_packet_batch
internal components by appropriate APIs.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2018-12-11 20:18:26 +00:00
Ilya Maximets
72713c651d netdev-bsd: Fix build failure because of undefined NO_OFFLOAD_API.
NO_OFFLOAD_API was removed while netdev classes initialization
refactoring, but netdev-bsd still uses it. Instead of
redefining it, I just refactored the BSD classes to be same
as other netdevs.

CC: Ben Pfaff <blp@ovn.org>
Fixes: 89c09c1cd1f0 ("netdev: Clean up class initialization.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-10-17 09:45:50 -07:00
Olivier Cochard-Labbé
61b1c7acb9 netdev-bsd: Fix crash on FreeBSD.
Working on bug https://github.com/openvswitch/ovs-issues/issues/152, I've
found wrong mapping of netdev functions on FreeBSD.

Signed-off-by: Olivier Cochard <olivier@FreeBSD.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2018-07-11 11:17:16 -07:00
John Hurley
88dcf2aa82 netdev-provider: add class op to get block_id
Add a new class op for netdevs to get the block_id if one exists. The
block_id is used in offload ops to group multiple qdiscs together.

Stub calls are made to the new class op (implementation to follow in
further patches). The default block_id of 0 (no block) will be used in
these cases.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2018-06-29 14:51:47 +02:00
Jan Scheurich
8492adc270 netdev: Add optional qfill output parameter to rxq_recv()
If the caller provides a non-NULL qfill pointer and the netdev
implemementation supports reading the rx queue fill level, the rxq_recv()
function returns the remaining number of packets in the rx queue after
reception of the packet burst to the caller. If the implementation does
not support this, it returns -ENOTSUP instead. Reading the remaining queue
fill level should not substantilly slow down the recv() operation.

A first implementation is provided for ethernet and vhostuser DPDK ports
in netdev-dpdk.c.

This output parameter will be used in the upcoming commit for PMD
performance metrics to supervise the rx queue fill level for DPDK
vhostuser ports.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
2018-05-11 08:08:24 +01:00
Ilya Maximets
ad8b0b4fe7 netdev: Remove useless cutlen.
Cutlen already applied while processing OVS_ACTION_ATTR_OUTPUT.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com
2017-12-20 21:07:46 +00:00
Ilya Maximets
b30896c969 netdev: Remove unused may_steal.
Not needed anymore because 'may_steal' already handled on
dpif-netdev layer and always true.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com
2017-12-20 21:07:46 +00:00
Xiao Liang
fd016ae3fb lib: Move lib/poll-loop.h to include/openvswitch
Poll-loop is the core to implement main loop. It should be available in
libopenvswitch.

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-03 10:47:55 -07:00
Ben Pfaff
a61a289119 dp-packet: New function dp_packet_get_send_len().
This function is useful in a few places for representing the packet's
length minus the cutlen.

Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-08-02 18:58:10 -07:00
Ben Pfaff
875ab13020 userspace: Handling of versatile tunnel ports
In netdev_gre_build_header(), GRE protocol and VXLAN next_potocol is set based
on packet_type of flow. If it's about an Ethernet packet, it is set to
ETP_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERNET, it is set
according to the name space type.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-06-27 17:28:30 -04:00
Paul Blakey
18ebd48cfb netdev: Adding a new netdev API to be used for offloading flows
Add a new API interface for offloading dpif flows to netdev.
The API consist on the following:
  flow_put - offload a new flow
  flow_get - query an offloaded flow
  flow_del - delete an offloaded flow
  flow_flush - flush all offloaded flows
  flow_dump_* - dump all offloaded flows

In upcoming commits we will introduce an implementation of this
API for netdev-linux.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-14 10:12:30 +02:00
Jan Scheurich
2482b0b0c8 userspace: Add packet_type in dp_packet and flow
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as prepration for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.

The packet_type is a big-endian 32 bit integer with the encoding as
specified in OpenFlow verion 1.5.

The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:

enum ofp_header_type_namespaces {
    OFPHTN_ONF = 0,             /* ONF namespace. */
    OFPHTN_ETHERTYPE = 1,       /* ns_type is an Ethertype. */
    OFPHTN_IP_PROTO = 2,        /* ns_type is a IP protocol number. */
    OFPHTN_UDP_TCP_PORT = 3,    /* ns_type is a TCP or UDP port. */
    OFPHTN_IPV4_OPTION = 4,     /* ns_type is an IPv4 option number. */
};

The lower 16 bits specify the actual type in the context of the name space.

Only name spaces 0 and 1 will be supported for now.

For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.

In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet
whith the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.

The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the the dl_type field of struct flow.

This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.

New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.

Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-05-03 16:56:40 -07:00
Hui Kang
c8c300b88b netdev-bsd: Fix compiling error in netbsd.
In some netbsd version, RTF_LLINFO is undefined.

Signed-off-by: Hui Kang <hkang.sunysb@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-12-06 08:59:26 -08:00
Daniele Di Proietto
1c33f0c35e netdev: Pass 'netdev_class' to ->run() and ->wait().
This will allow run() and wait() methods to be shared between different
classes and still perform class-specific work.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-08-15 11:07:37 -07:00
Daniele Di Proietto
efe179e041 netdev-*: Do not use dp_packet_pad() in recv() functions.
All the netdevs used by dpif-netdev (except for netdev-dpdk) have a
dp_packet_pad() call in the receive function, probably because the
userspace datapath couldn't handle properly short packets.

This doesn't appear to be the case anymore.

This commit removes the call to have a more consistent behavior with the
kernel datapath.

All the testsuite changes in this commit adjust the expectations for
packet lengths in flow dumps and other stats.  There's only one fix in
ovn.at: one of the test_ip() functions generated an incomplete udp
packet, which was not a problem until now, because of the padding.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-07-29 14:08:10 -07:00
Ilya Maximets
324c837485 dpif-netdev: XPS (Transmit Packet Steering) implementation.
If CPU number in pmd-cpu-mask is not divisible by the number of queues and
in a few more complex situations there may be unfair distribution of TX
queue-ids between PMD threads.

For example, if we have 2 ports with 4 queues and 6 CPUs in pmd-cpu-mask
such distribution is possible:
<------------------------------------------------------------------------>
pmd thread numa_id 0 core_id 13:
        port: vhost-user1       queue-id: 1
        port: dpdk0     queue-id: 3
pmd thread numa_id 0 core_id 14:
        port: vhost-user1       queue-id: 2
pmd thread numa_id 0 core_id 16:
        port: dpdk0     queue-id: 0
pmd thread numa_id 0 core_id 17:
        port: dpdk0     queue-id: 1
pmd thread numa_id 0 core_id 12:
        port: vhost-user1       queue-id: 0
        port: dpdk0     queue-id: 2
pmd thread numa_id 0 core_id 15:
        port: vhost-user1       queue-id: 3
<------------------------------------------------------------------------>

As we can see above dpdk0 port polled by threads on cores:
	12, 13, 16 and 17.

By design of dpif-netdev, there is only one TX queue-id assigned to each
pmd thread. This queue-id's are sequential similar to core-id's. And
thread will send packets to queue with exact this queue-id regardless
of port.

In previous example:

	pmd thread on core 12 will send packets to tx queue 0
	pmd thread on core 13 will send packets to tx queue 1
	...
	pmd thread on core 17 will send packets to tx queue 5

So, for dpdk0 port after truncating in netdev-dpdk:

	core 12 --> TX queue-id 0 % 4 == 0
	core 13 --> TX queue-id 1 % 4 == 1
	core 16 --> TX queue-id 4 % 4 == 0
	core 17 --> TX queue-id 5 % 4 == 1

As a result only 2 of 4 queues used.

To fix this issue some kind of XPS implemented in following way:

	* TX queue-ids are allocated dynamically.
	* When PMD thread first time tries to send packets to new port
	  it allocates less used TX queue for this port.
	* PMD threads periodically performes revalidation of
	  allocated TX queue-ids. If queue wasn't used in last
	  XPS_TIMEOUT_MS milliseconds it will be freed while revalidation.
        * XPS is not working if we have enough TX queues.

Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-07-27 12:56:04 -07:00
Terry Wilson
ee89ea7b47 json: Move from lib to include/openvswitch.
To easily allow both in- and out-of-tree building of the Python
wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to
include/openvswitch. This also requires moving lib/{hmap,shash}.h.

Both hmap.h and shash.h were #include-ing "util.h" even though the
headers themselves did not use anything from there, but rather from
include/openvswitch/util.h. Fixing that required including util.h
in several C files mostly due to OVS_NOT_REACHED and things like
xmalloc.

Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-07-22 17:09:17 -07:00
William Tu
64839cf432 netdev-provider: Apply batch object to netdev provider.
Commit 1895cc8dbb64 ("dpif-netdev: create batch object") introduces
batch process functions and 'struct dp_packet_batch' to associate with
batch-level metadata.  This patch applies the packet batch object to
the netdev provider interface (dummy, Linux, BSD, and DPDK) so that
batch APIs can be used in providers.  With batch metadata visible in
providers, optimizations can be introduced at per-batch level instead
of per-packet.

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/145694197
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-07-21 16:46:32 -07:00
William Tu
aaca4fe0ce ofp-actions: Add truncate action.
The patch adds a new action to support packet truncation.  The new action
is formatted as 'output(port=n,max_len=m)', as output to port n, with
packet size being MIN(original_size, m).

One use case is to enable port mirroring to send smaller packets to the
destination port so that only useful packet information is mirrored/copied,
saving some performance overhead of copying entire packet payload.  Example
use case is below as well as shown in the testcases:

    - Output to port 1 with max_len 100 bytes.
    - The output packet size on port 1 will be MIN(original_packet_size, 100).
    # ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)'

    - The scope of max_len is limited to output action itself.  The following
      packet size of output:1 and output:2 will be intact.
    # ovs-ofctl add-flow br0 \
            'actions=output(port=1,max_len=100),output:1,output:2'
    - The Datapath actions shows:
    # Datapath actions: trunc(100),1,1,2

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-06-24 09:17:00 -07:00
Daniele Di Proietto
050c60bfb5 netdev-dpdk: Use ->reconfigure() call to change rx/tx queues.
This introduces in dpif-netdev and netdev-dpdk the first use for the
newly introduce reconfigure netdev call.

When a request to change the number of queues comes, netdev-dpdk will
remember this and notify the upper layer via
netdev_request_reconfigure().

The datapath, instead of periodically calling netdev_set_multiq(), can
detect this and call reconfigure().

This mechanism can also be used to:
* Automatically match the number of rxq with the one provided by qemu
  via the new_device callback.
* Provide a way to change the MTU of dpdk devices at runtime.
* Move a DPDK vhost device to the proper NUMA socket.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2016-05-23 10:27:42 -07:00
Daniele Di Proietto
790fb3b745 netdev: Add reconfigure request mechanism.
A netdev provider, especially a PMD provider (like netdev DPDK) might
not be able to change some of its parameters (such as MTU, or number of
queues) without stopping everything and restarting.

This commit introduces a mechanism that allows a netdev provider to
request a restart (netdev_request_reconfigure()).  The upper layer can
be notified via netdev_wait_reconf_required() and
netdev_is_reconf_required().  After closing all the rxqs the upper layer
can finally call netdev_reconfigure(), to make sure that the new
configuration is in place.

This will be used by next commit to reconfigure rx and tx queues in
netdev-dpdk.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
2016-05-23 10:27:42 -07:00
William Tu
91644f45c6 dp-packet: Fix use of uninitialised value at emc_lookup.
Valgrind reports "Conditional jump or move depends on uninitialised value"
and "Use of uninitialised value" at case 2016 ovn -- 3 HVs, 1 LS, 3
lports/HV.  It is caused by 1) assigning an uninitialized value to 'key->hash'
at emc_processing(). Due to uninit rss_hash_valid, dp_packet_rss_valid() might
return true and undefined hash value is returned, and 2) at emc_lookup, the
'current_entry->key.hash' could be uninitialized due to dp_packet_clone().
The patch fixes the two and as a result, a couple of calls to
dp_packet_rss_invalidate() become redundant and thus are removed.

Call stacks:
- Connditional jump or move depends on uninitialised value(s)
    dpif_netdev_packet_get_rss_hash (dpif-netdev.c:3334)
    emc_processing (dpif-netdev.c:3455)
    dp_netdev_input__ (dpif-netdev.c:3639)
and,
- Use of uninitialised value of size 8
    emc_lookup (dpif-netdev.c:1785)
    emc_processing (dpif-netdev.c:3457)
    dp_netdev_input__ (dpif-netdev.c:3639)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-04-06 19:32:34 -07:00
Pravin B Shelar
6b6e13293e netdev: remove netdev_get_in4()
Since netdev can have multiple IP address use
generic api netdev_get_addr_list().  This also make it
easier to handle IPv4 and IPv6 address across vswitchd
layers.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-03-24 09:30:57 -07:00
Pravin B Shelar
a8704b5027 tunneling: Handle multiple ip address for given device.
Device can have multiple IP address but netdev_get_in4/6()
returns only one configured IPv6 address. Following
patch fixes it.
OVS router is also updated to return source ip address for
given destination, This is required when interface has multiple
IP address configured.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-03-24 09:30:57 -07:00
Lance Richardson
97d0619c11 osx: handle differences between OS X and other BSDs
Conditional compilation to account for:
  - OS X does not implement RTM_IFANNOUNCE.
  - OS X does not implement tap netdeivces.
  - OS X does not implement RT_ROUNDUP().

Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-03-22 18:42:58 -07:00
Ben Warren
3e8a2ad145 Move lib/dynamic-string.h to include/openvswitch directory
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-03-19 10:02:12 -07:00
Ilya Maximets
118c77b1a8 netdev: New field 'is_pmd' in netdev_class.
Made to simplify creation of derived classes.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-03-16 17:03:07 -07:00
xushengping
ed7f217471 netdev-bsd: Destroy mutex on netdev_bsd_construct_system() error path.
Signed-off-by: xushengping <shengping.xu@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-01-04 12:44:55 -08:00
YAMAMOTO Takashi
1e450da53a netdev-bsd: Update after eth_addr changes
Signed-off-by: YAMAMOTO Takashi <yamamoto@midokura.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2015-11-26 18:37:00 +09:00
Daniele Di Proietto
f2f44f5da0 dpif-netdev: Check for PKT_RX_RSS_HASH flag.
DPDK mbufs contain a valid RSS hash only if PKT_RX_RSS_HASH is
set in 'ol_flags'.  Otherwise the hash is garbage and doesn't
relate to the packet.

This fixes an issue with vhost, which, being a virtual NIC, doesn't
compute the hash.

Reported-by: Dongjun <dongj@dtdream.com>
Suggested-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Kevin Traynor <kevin.traynor@intel.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2015-09-11 17:43:39 +01:00
Jarno Rajahalme
74ff3298c8 userspace: Define and use struct eth_addr.
Define struct eth_addr and use it instead of a uint8_t array for all
ethernet addresses in OVS userspace.  The struct is always the right
size, and it can be assigned without an explicit memcpy, which makes
code more readable.

"struct eth_addr" is a good type name for this as many utility
functions are already named accordingly.

struct eth_addr can be accessed as bytes as well as ovs_be16's, which
makes the struct 16-bit aligned.  All use seems to be 16-bit aligned,
so some algorithms on the ethernet addresses can be made a bit more
efficient making use of this fact.

As the struct fits into a register (in 64-bit systems) we pass it by
value when possible.

This patch also changes the few uses of Linux specific ETH_ALEN to
OVS's own ETH_ADDR_LEN, and removes the OFP_ETH_ALEN, as it is no
longer needed.

This work stemmed from a desire to make all struct flow members
assignable for unrelated exploration purposes.  However, I think this
might be a nice code readability improvement by itself.

Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-08-28 14:55:11 -07:00
Dan McGregor
ccf7b34ead netdev-bsd: Include net/bpf.h.
The documentation says it is required to use bpf ioctls on both
NetBSD and FreeBSD. It causes a compile time failure on FreeBSD 10.

Signed-off-by: Dan McGregor <dan.mcgregor@usask.ca>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-05-20 18:45:02 -07:00
Daniele Di Proietto
2bc1bbd27d dp-packet: Rename 'dp_hash' in 'rss_hash'.
We already have the 'dp_hash' embedded in the metadata.  This caused
confusion in the code.  With this commit it should be clear that
'rss_hash' is the packet hash used for internal purposes, while
'md.dp_hash' is part of the flow, computed during the execution of
certain actions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
2015-04-20 12:49:41 -07:00
Ben Pfaff
e7d6348614 netdev-bsd: Fix netdev_bsd_get_mtu() return value.
When netdev_bsd_get_mtu() failed, it didn't report the error to the caller,
so the caller couldn't work around not knowing the MTU, and ended up using
an uninitialized 'mtu' value.

Found by LLVM scan-build.

Reported-by: Kevin Lo <kevlo@FreeBSD.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Kevin Lo <kevlo@FreeBSD.org>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
2015-04-17 09:22:25 -07:00
Kevin Lo
ea4226359f netdev-bsd: Remove duplicate header inclusion of <netinet/in.h>
Signed-off-by: Kevin Lo <kevlo@FreeBSD.org>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-04-15 07:40:33 -07:00
Kevin Lo
1547aff669 netdev-bsd: Fix sign extension bug in ifr_flags on FreeBSD.
FreeBSD fills the int return value with ifr_flagshigh in the high
16 bits and ifr_flags in the low 16 bits rather than blindly promoting
ifr_flags to an int, which will preserve the sign.
This commit makes sure the flags returned isn't negative and apply mask
0xffff to flags.

Signed-off-by: Kevin Lo <kevlo@FreeBSD.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
2015-04-05 09:55:25 -07:00
YAMAMOTO Takashi
ed3c9990c6 netdev-bsd: Fix a compilation error
Fix a compilation problem introduced by
commit cf62fa4c7074121184a1f1d07980990113657612
("dp-packet: Remove ofpbuf dependency.")

Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-03-04 15:13:48 +09:00
Pravin B Shelar
cf62fa4c70 dp-packet: Remove ofpbuf dependency.
Currently dp-packet make use of ofpbuf for managing packet
buffers. That complicates ofpbuf, by making dp-packet
independent of ofpbuf both libraries can be optimized for
their own use case.
This avoids mapping operation between ofpbuf and dp_packet
in datapath upcalls.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2015-03-03 13:37:37 -08:00
Pravin B Shelar
e14deea0bd dpif_packet: Rename to dp_packet
dp_packet is short and better name for datapath packet
structure.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
2015-03-03 13:37:34 -08:00
Ben Pfaff
8742957c2b userspace: Replace all uses of strncpy() by ovs_strlcpy().
strncpy() has a lot of pitfalls.  A while back we replaced all its uses by
calls to ovs_strlcpy() or ovs_strzcpy(), but some more have crept in.  This
commit fixes them.

Reported-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
2015-02-20 12:32:17 -08:00
Thomas Graf
e6211adce4 lib: Move vlog.h to <openvswitch/vlog.h>
A new function vlog_insert_module() is introduced to avoid using
list_insert() from the vlog.h header.

Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-12-15 14:15:19 +01:00
Pravin B Shelar
a36de779d7 openvswitch: Userspace tunneling.
Following patch adds support for userspace tunneling. Tunneling
needs three more component first is routing table which is configured by
caching kernel routes and second is ARP cache which build automatically
by snooping arp. And third is tunnel protocol table which list all
listening protocols which is populated by vswitchd as tunnel ports
are added. GRE and VXLAN protocol support is added in this patch.

Tunneling works as follows:
On packet receive vswitchd check if this packet is targeted to tunnel
port. If it is then vswitchd inserts tunnel pop action which pops
header and sends packet to tunnel port.
On packet xmit rather than generating Set tunnel action it generate
tunnel push action which has tunnel header data. datapath can use
tunnel-push action data to generate header for each packet and
forward this packet to output port. Since tunnel-push action
contains most of packet header vswitchd needs to lookup routing
table and arp table to build this action.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-11-12 15:08:33 -08:00
Alex Wang
5496878cbf netdev: Add function for configuring tx and rx queues.
This commit adds a new API to the 'struct netdev_class' which
allows user to configure the number of tx queues and rx queues
of 'netdev'.  Upcoming patches will use this function to set
multiple tx/rx queues when adding the netdev to dpif-netdev.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-15 11:43:48 -07:00
Pravin B Shelar
2f9dd77fcd ofproto: Do not update stats on fake bond interface.
There are couple of reasons to remove this support:
*   This is used in very old OVS use-case. It is much better
    to read stats directly from OVS.
*   Forthcoming commit will remove support for setting stats
    for vport. The stats update depends on stats-set.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
2014-09-15 10:08:56 -07:00
Alex Wang
f00fa8cbad netdev: Add n_txq to 'struct netdev'.
This commit adds new variable n_txq to 'struct netdev' for recording
the number of tx queues.  Correspondingly, the send_*() functions are
extended to accept queue id as input argument.

All 'netdev-*' implementation will ignore the queue id since having
multiple tx queues is not supported.  Upcomping patches will start
using it and create multiple tx queues for dpdk netdev.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-12 11:30:58 -07:00
Alex Wang
7dec44fe1c netdev: Add function for getting the numa node id of netdev.
This commit adds a new API to the 'struct netdev_class' which
allows user to query the numa node id the 'netdev' is on.

Currently, only netdev-dpdk module implements this function.

Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-12 11:30:58 -07:00
Daniele Di Proietto
61a2647e15 packet-dpif: Add dpif_packet_{get, set}_hash()
These function are used to stored the packet hash. 'netdev-dpdk'
automatically set this value to the RSS hash returned by the
NIC. Other 'netdev's set it to 0 (which is an invalid hash
value), so that callers can compute the hash on their own.

If DPDK support is enabled, struct dpif_packet's member
'dp_hash' is removed and 'pkt.hash.rss' from DPDK mbuf is used

This commit also configure DPDK devices to compute RSS hash
for UDP and IPv6 packets

Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-08-29 16:32:21 -07:00