mirror of https://github.com/openvswitch/ovs synced 2025-08-23 10:28:00 +00:00

351 Commits

Author SHA1 Message Date
Bhanuprakash Bodireddy
e0a00cee33 netdev-linux: Clean up netdev_linux_sock_batch_send().
Use DP_PACKET_BATCH_FOR_EACH macro and dp_packet_batch_size() API
in netdev_linux_sock_batch_send().

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-03 13:22:00 -07:00
Bhanuprakash Bodireddy
13708b2183 netdev-linux: Use DP_PACKET_BATCH_FOR_EACH in netdev_linux_tap_batch_send.
Use DP_PACKET_BATCH_FOR_EACH macro in netdev_linux_tap_batch_send().

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-03 13:20:22 -07:00
Xiao Liang
fd016ae3fb lib: Move lib/poll-loop.h to include/openvswitch
Poll-loop is the core to implement main loop. It should be available in
libopenvswitch.

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-03 10:47:55 -07:00
Kaige Fu
214117fdbd netdev-linux: Fix wrong ceil rate when max-rate less than 8bit.
When max-rate is less than 8bit, the hc->max_rate will be set
as htb->max_rate mistakenly instead of mtu of netdev.

Fixes: 13c1637 ("smap: New function smap_get_ullong().")
Signed-off-by: Kaige Fu <fukaige@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-11-02 11:32:21 -07:00
Roi Dayan
580e1152d4 netdev-linux: Reduce log level for ENODEV errors getting ifindex
These are normal and unavoidable, because the vifs
disappear from the kernel before they are removed from the OVS
database.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-08-09 12:51:54 -07:00
Zhenyu Gao
d19cf8bb79 netdev-linux: Replace sendmsg with sendmmsg in netdev_linux_send
Sendmmsg can reduce cpu cycles in sending packets to kernel.
Replace sendmsg with sendmmsg in function netdev_linux_send to send
batch packets if sendmmsg is available.

If the kernel doesn't support sendmmsg, fall back to sendmsg.

    netserver
|------------|
|            |
|  container |
|----veth----|
          |
          |        |------------|
          |---veth-|   dpdk-ovs |      netperf
                   |            |  |--------------|
                   |----dpdk----|  | bare-metal   |
                         |         |--------------|
                         |              |
                         |              |
                        pnic-----------pnic

Netperf was used to test the performance:

1)cmd:netperf -H remote-container -t UDP_STREAM -l 60 -- -m 1400
result: netserver received 2383.21Mb(sendmsg)/2551.64Mb(sendmmsg)

2)cmd:netperf -H remote-container -t UDP_STREAM -l 60 -- -m 60
result: netserver received 109.72Mb(sendmsg)/115.18Mb(sendmmsg)

Sendmmsg shows about a 6% improvement in netperf UDP testing.
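
The batch-send-with-fallback idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: send_batch() and MAX_BATCH are hypothetical names, and treating ENOSYS as the signal that sendmmsg is unavailable is an assumption.

```c
#define _GNU_SOURCE             /* Needed for sendmmsg() and struct mmsghdr. */
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: sends 'n' buffers on datagram socket 'fd' as one
 * batch.  Returns 0 on success, otherwise a positive errno value. */
static int
send_batch(int fd, const void *const bufs[], const size_t lens[],
           unsigned int n)
{
    enum { MAX_BATCH = 32 };    /* Assumed batch limit for this sketch. */
    struct mmsghdr mmsgs[MAX_BATCH];
    struct iovec iovs[MAX_BATCH];
    unsigned int sent = 0;

    if (n > MAX_BATCH) {
        return EINVAL;
    }

    /* Build all message headers once, before any send call. */
    memset(mmsgs, 0, sizeof mmsgs);
    for (unsigned int i = 0; i < n; i++) {
        iovs[i].iov_base = (void *) bufs[i];
        iovs[i].iov_len = lens[i];
        mmsgs[i].msg_hdr.msg_iov = &iovs[i];
        mmsgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* Preferred path: one system call can send several packets. */
    while (sent < n) {
        int retval = sendmmsg(fd, mmsgs + sent, n - sent, 0);

        if (retval > 0) {
            sent += retval;
        } else if (retval < 0 && errno == EINTR) {
            continue;
        } else if (retval < 0 && errno == ENOSYS) {
            break;              /* No sendmmsg support: fall back below. */
        } else {
            return retval < 0 ? errno : EIO;
        }
    }

    /* Fallback path: one sendmsg() call per remaining packet. */
    for (; sent < n; sent++) {
        ssize_t retval;

        do {
            retval = sendmsg(fd, &mmsgs[sent].msg_hdr, 0);
        } while (retval < 0 && errno == EINTR);
        if (retval < 0) {
            return errno;
        }
    }
    return 0;
}
```

A caller would pass parallel arrays of buffer pointers and lengths; the same prepared headers are reused by both paths, so the fallback costs no extra setup.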

Signed-off-by: Zhenyu Gao <sysugaozhenyu@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-08-02 18:58:42 -07:00
Ben Pfaff
a61a289119 dp-packet: New function dp_packet_get_send_len().
This function is useful in a few places for representing the packet's
length minus the cutlen.
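
As a rough illustration of "length minus the cutlen", the sketch below uses a hypothetical toy_packet struct rather than the real struct dp_packet. Clamping at zero is an assumption added for safety, not something the commit message states.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a packet (not struct dp_packet). */
struct toy_packet {
    uint32_t size;      /* Bytes of data in the packet. */
    uint32_t cutlen;    /* Bytes to trim from the end when sending. */
};

/* Returns the number of bytes that would actually be sent. */
static uint32_t
toy_packet_get_send_len(const struct toy_packet *p)
{
    return p->cutlen < p->size ? p->size - p->cutlen : 0;
}
```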

Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-08-02 18:58:10 -07:00
Ben Pfaff
71f21279f6 Eliminate most shadowing for local variable names.
Shadowing is when a variable with a given name in an inner scope hides a
different variable with the same name in a surrounding scope.  This is
generally undesirable because it can confuse programmers.  This commit
eliminates most of it.

Found with -Wshadow=local in GCC 7.  The repo is not really ready to enable
this option by default because of a few cases that are harder to fix, and
harmless, such as nested use of CMAP_FOR_EACH.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
2017-08-02 15:03:35 -07:00
Ben Pfaff
875ab13020 userspace: Handling of versatile tunnel ports
In netdev_gre_build_header(), GRE protocol and VXLAN next_protocol are set
based on the packet_type of the flow. For an Ethernet packet, it is set to
ETP_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERNET, it is set
according to the name space type.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-06-27 17:28:30 -04:00
Joe Stringer
ef3767f5c6 tc: Tidy up includes.
Fix minor style variations and unnecessary includes.

Signed-off-by: Joe Stringer <joe@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Greg Rose <gvrose8192@gmail.com>
2017-06-19 16:07:43 -07:00
Paul Blakey
01b257860c netdev-vport: Use common offloads interface
netdev vports are backed by an actual netdev at the kernel
level, so they can use the common netdev-tc offloads interface
for flow offloading (if enabled).

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-15 11:51:43 +02:00
Paul Blakey
d5ae4a6026 netdev-linux: Disallow setting policing when configured with hw offload
Notify as not supported. Otherwise the ingress qdisc is being removed and
offload rules will be removed.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-15 11:51:04 +02:00
Paul Blakey
18ebd48cfb netdev: Adding a new netdev API to be used for offloading flows
Add a new API interface for offloading dpif flows to netdev.
The API consists of the following:
  flow_put - offload a new flow
  flow_get - query an offloaded flow
  flow_del - delete an offloaded flow
  flow_flush - flush all offloaded flows
  flow_dump_* - dump all offloaded flows

In upcoming commits we will introduce an implementation of this
API for netdev-linux.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-14 10:12:30 +02:00
Paul Blakey
c1c5c72340 tc: Introduce tc module
Add tc module to expose tc operations to be used by other modules.
Move some tc related functions from netdev-linux.c to tc.c
This patch doesn't change any functionality.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Co-authored-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-14 10:02:04 +02:00
Roi Dayan
7874bdff0d netdev-linux: Refactor two tc functions
Refactor tc_make_request and tc_add_del_ingress_qdisc to accept
ifindex instead of netdev struct.
We later want to move those outside netdev-linux module to be
used by other modules.
This patch doesn't change any functionality.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Co-authored-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
2017-06-14 09:55:58 +02:00
Flavio Leitner
61b9d078af netdev-linux: maintain original device's state
It is important to maintain the original state when
the device already exists in the system.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-06-06 17:12:49 -07:00
Flavio Leitner
0f28164be0 netdev-linux: make tap devices persistent.
When using data path type "netdev", the bridge port is a tun device,
and when OVS restarts, that device and its network configuration
are lost.

This patch enables the tap device to persist instead.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-06-06 17:11:07 -07:00
Zhenyu Gao
0a62ae2c62 netdev-linux: Refactor netdev_linux_send() in forwarding batch packets.
We don't need to initialize sock, msg and sll before each call to sendmsg.
Initializing them once before the loop reduces CPU cycles.

Signed-off-by: Zhenyu Gao <sysugaozhenyu@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-05-31 08:54:37 -07:00
Jan Scheurich
2482b0b0c8 userspace: Add packet_type in dp_packet and flow
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as preparation for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.

The packet_type is a big-endian 32 bit integer with the encoding as
specified in OpenFlow version 1.5.

The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:

enum ofp_header_type_namespaces {
    OFPHTN_ONF = 0,             /* ONF namespace. */
    OFPHTN_ETHERTYPE = 1,       /* ns_type is an Ethertype. */
    OFPHTN_IP_PROTO = 2,        /* ns_type is a IP protocol number. */
    OFPHTN_UDP_TCP_PORT = 3,    /* ns_type is a TCP or UDP port. */
    OFPHTN_IPV4_OPTION = 4,     /* ns_type is an IPv4 option number. */
};

The lower 16 bits specify the actual type in the context of the name space.

Only name spaces 0 and 1 will be supported for now.

For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.

In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet
with the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.

The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the dl_type field of struct flow.

This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.

New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.
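
The encoding described above (name space in the upper 16 bits, type in the lower 16 bits, stored big-endian) can be sketched as below. The toy_* helper names are hypothetical and are not the macros OVS defines; only the name space values 0 and 1 come from the enum quoted in this message.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Local copies of the two supported name space values quoted above. */
enum {
    TOY_OFPHTN_ONF = 0,         /* ONF name space. */
    TOY_OFPHTN_ETHERTYPE = 1    /* ns_type is an Ethertype. */
};

/* Packs a name space and a type into a big-endian 32-bit packet_type. */
static uint32_t
toy_packet_type(uint16_t ns, uint16_t ns_type)
{
    return htonl((uint32_t) ns << 16 | ns_type);
}

/* Extracts the name space (upper 16 bits) from a big-endian packet_type. */
static uint16_t
toy_pt_ns(uint32_t packet_type_be)
{
    return (uint16_t) (ntohl(packet_type_be) >> 16);
}

/* Extracts the type (lower 16 bits) from a big-endian packet_type. */
static uint16_t
toy_pt_ns_type(uint32_t packet_type_be)
{
    return (uint16_t) (ntohl(packet_type_be) & 0xffff);
}
```

With these helpers, an Ethernet packet is toy_packet_type(TOY_OFPHTN_ONF, 0), and an IPv4 L3 packet would be toy_packet_type(TOY_OFPHTN_ETHERTYPE, 0x0800).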

Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2017-05-03 16:56:40 -07:00
William Tu
48c6733cb5 bridge: Prohibit "default" and "all" bridge name.
Under Linux, when a user creates a bridge named "default" or "all",
ovs-vsctl fails, but vswitchd keeps retrying in the background,
causing systemd-udev to reach 100% CPU utilization. The patch prevents
any attempt to create or open a netdev named "default" or "all" because
these two names are reserved on Linux:
/proc/sys/net/ipv4/conf/ always contains directories by these names.

The high CPU utilization is due to frequent calls into the kernel's
register_netdevice function, which invokes the kernel elements that
have registered on the netdevice notifier chain.  Because creation fails,
OVS wakes up and tries to re-create the device, which ends up as a
high-CPU loop.

VMWare-BZ: #1842388
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Greg Rose <gvrose8192@gmail.com>
2017-05-01 10:08:26 -07:00
Andy Zhou
72c84bc2db dp-packet: Enhance packet batch APIs.
One common use case of 'struct dp_packet_batch' is to process all
packets in the batch in order. Add an iterator for this use case
to simplify the logic of call sites.

Another common use case is to drop packets in the batch, by reading
all packets, but writing back pointers of fewer packets. Add macros
to support this use case.
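
The two use cases above can be illustrated with a toy batch type. This is only a sketch: toy_batch and its helpers are hypothetical, store packet lengths instead of packet pointers, and do not reproduce the macros added by this commit.

```c
#include <stddef.h>

enum { TOY_BATCH_MAX = 32 };

/* Hypothetical batch that stores packet lengths instead of packets. */
struct toy_batch {
    size_t count;
    int lens[TOY_BATCH_MAX];
};

/* Use case 1: process every packet in the batch, in order. */
static long
toy_batch_total_len(const struct toy_batch *b)
{
    long total = 0;

    for (size_t i = 0; i < b->count; i++) {
        total += b->lens[i];
    }
    return total;
}

/* Use case 2: read every packet but write back only the ones to keep.
 * Returns the number of packets dropped. */
static size_t
toy_batch_drop_short(struct toy_batch *b, int min_len)
{
    size_t n = b->count;
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        int len = b->lens[i];

        if (len >= min_len) {
            b->lens[kept++] = len;      /* Refill in place, order kept. */
        }
    }
    b->count = kept;
    return n - kept;
}
```

Because the write index never passes the read index, the refill can reuse the same array without a temporary copy.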

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
2017-01-26 17:35:29 -08:00
Eric Garver
1ebdc7eb81 netdev-linux: double tagged packets should use 0x88a8
We need to check if a packet is double tagged. If so make sure to push
0x88a8 instead of 0x8100. Without this a simple port redirect of 802.1ad
frames means the outer tag gets translated from 0x88a8 to 0x8100 by the
userspace datapath.

This only affected kernels that don't use TP_STATUS_VLAN_TPID_VALID,
which is kernels < 3.14.
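
A minimal sketch of the TPID choice described above, assuming the outer tag has already been stripped and the caller can see the Ethertype of what remains. toy_outer_tpid() is a hypothetical helper, not the function changed by this commit.

```c
#include <stdint.h>

#define TOY_TPID_8021Q  0x8100  /* Customer VLAN tag. */
#define TOY_TPID_8021AD 0x88a8  /* Service (outer) VLAN tag. */

/* Chooses the TPID to push back as the outer tag.  'remaining_eth_type' is
 * the Ethertype of the frame after the outer tag was stripped: if it is
 * still a VLAN tag, the frame was double tagged. */
static uint16_t
toy_outer_tpid(uint16_t remaining_eth_type)
{
    return remaining_eth_type == TOY_TPID_8021Q
           ? TOY_TPID_8021AD
           : TOY_TPID_8021Q;
}
```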

Signed-off-by: Eric Garver <e@erig.me>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-10-04 08:47:18 -07:00
David Hill
9120cfc05c netdev-linux: Use ethtool when miimon fails.
Some network drivers might return true to SIOCGMIIPHY and an error on
SIOCGMIIREG when using MII to query phy state. Fall back to ethtool if this
happens to allow failover to work when using such nics.

Reported-at: http://openvswitch.org/pipermail/dev/2016-August/078800.html
Signed-off-by: David Hill <dhill@redhat.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
2016-09-27 11:38:18 -07:00
Daniele Di Proietto
1c33f0c35e netdev: Pass 'netdev_class' to ->run() and ->wait().
This will allow run() and wait() methods to be shared between different
classes and still perform class-specific work.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-08-15 11:07:37 -07:00
Daniele Di Proietto
4124cb1254 netdev: Make netdev_set_mtu() netdev parameter non-const.
Every provider silently drops the const attribute when converting the
parameter to the appropriate subclass.  Might as well drop the const
attribute from the parameter, since this is a "set" function.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2016-08-12 19:32:12 -07:00
Ben Pfaff
13c1637f5b smap: New function smap_get_ullong().
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
2016-08-08 11:00:37 -07:00
Daniele Di Proietto
efe179e041 netdev-*: Do not use dp_packet_pad() in recv() functions.
All the netdevs used by dpif-netdev (except for netdev-dpdk) have a
dp_packet_pad() call in the receive function, probably because the
userspace datapath couldn't handle properly short packets.

This doesn't appear to be the case anymore.

This commit removes the call to have a more consistent behavior with the
kernel datapath.

All the testsuite changes in this commit adjust the expectations for
packet lengths in flow dumps and other stats.  There's only one fix in
ovn.at: one of the test_ip() functions generated an incomplete udp
packet, which was not a problem until now, because of the padding.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-07-29 14:08:10 -07:00
Ilya Maximets
324c837485 dpif-netdev: XPS (Transmit Packet Steering) implementation.
If the number of CPUs in pmd-cpu-mask is not divisible by the number of
queues, and in a few more complex situations, there may be an unfair
distribution of TX queue-ids between PMD threads.

For example, if we have 2 ports with 4 queues and 6 CPUs in pmd-cpu-mask,
the following distribution is possible:
<------------------------------------------------------------------------>
pmd thread numa_id 0 core_id 13:
        port: vhost-user1       queue-id: 1
        port: dpdk0     queue-id: 3
pmd thread numa_id 0 core_id 14:
        port: vhost-user1       queue-id: 2
pmd thread numa_id 0 core_id 16:
        port: dpdk0     queue-id: 0
pmd thread numa_id 0 core_id 17:
        port: dpdk0     queue-id: 1
pmd thread numa_id 0 core_id 12:
        port: vhost-user1       queue-id: 0
        port: dpdk0     queue-id: 2
pmd thread numa_id 0 core_id 15:
        port: vhost-user1       queue-id: 3
<------------------------------------------------------------------------>

As we can see above, the dpdk0 port is polled by threads on cores
	12, 13, 16 and 17.

By design of dpif-netdev, there is only one TX queue-id assigned to each
pmd thread. These queue-ids are sequential, similar to core-ids, and a
thread will send packets to the queue with exactly this queue-id regardless
of port.

In the previous example:

	pmd thread on core 12 will send packets to tx queue 0
	pmd thread on core 13 will send packets to tx queue 1
	...
	pmd thread on core 17 will send packets to tx queue 5

So, for dpdk0 port after truncating in netdev-dpdk:

	core 12 --> TX queue-id 0 % 4 == 0
	core 13 --> TX queue-id 1 % 4 == 1
	core 16 --> TX queue-id 4 % 4 == 0
	core 17 --> TX queue-id 5 % 4 == 1

As a result, only 2 of the 4 queues are used.
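
The modulo arithmetic above can be checked with a small sketch. toy_used_txqs() is a hypothetical helper that assumes at most 32 TX queues; it returns a bitmask of the queues that end up in use.

```c
#include <stdint.h>

/* Returns a bitmask with bit q set if some thread maps to TX queue q.
 * 'static_txq_ids' holds each polling thread's fixed queue-id, and the
 * modulo models the truncation to the port's 'n_txq' queues. */
static uint32_t
toy_used_txqs(const int static_txq_ids[], int n_threads, int n_txq)
{
    uint32_t used = 0;

    for (int i = 0; i < n_threads; i++) {
        used |= UINT32_C(1) << (static_txq_ids[i] % n_txq);
    }
    return used;
}
```

For the dpdk0 example, the threads on cores 12, 13, 16 and 17 carry queue-ids 0, 1, 4 and 5, so with 4 queues only bits 0 and 1 are set.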

To fix this issue, a kind of XPS is implemented in the following way:

	* TX queue-ids are allocated dynamically.
	* When a PMD thread first tries to send packets to a new port,
	  it allocates the least used TX queue for this port.
	* PMD threads periodically perform revalidation of
	  allocated TX queue-ids. If a queue wasn't used in the last
	  XPS_TIMEOUT_MS milliseconds, it is freed during revalidation.
	* XPS is not used if we have enough TX queues.
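
The "least used TX queue" step can be sketched as follows. This is a simplified illustration under stated assumptions: toy_port and its helpers are hypothetical, there is no locking, and the XPS_TIMEOUT_MS revalidation is reduced to an explicit release call.

```c
enum { TOY_MAX_TXQ = 8 };

/* Hypothetical per-port state: how many threads currently use each queue. */
struct toy_port {
    int n_txq;                      /* Must be between 1 and TOY_MAX_TXQ. */
    int txq_users[TOY_MAX_TXQ];
};

/* Picks the TX queue with the fewest users, marks it as used by one more
 * thread, and returns its id.  Ties go to the lowest queue-id. */
static int
toy_xps_get_txq(struct toy_port *port)
{
    int best = 0;

    for (int q = 1; q < port->n_txq; q++) {
        if (port->txq_users[q] < port->txq_users[best]) {
            best = q;
        }
    }
    port->txq_users[best]++;
    return best;
}

/* Releases a queue-id, as revalidation would do for an idle queue. */
static void
toy_xps_put_txq(struct toy_port *port, int q)
{
    if (port->txq_users[q] > 0) {
        port->txq_users[q]--;
    }
}
```

With 4 queues and 4 sending threads, each thread gets its own queue, which avoids the 2-of-4 outcome of the fixed modulo mapping.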

Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-07-27 12:56:04 -07:00
Terry Wilson
ee89ea7b47 json: Move from lib to include/openvswitch.
To easily allow both in- and out-of-tree building of the Python
wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to
include/openvswitch. This also requires moving lib/{hmap,shash}.h.

Both hmap.h and shash.h were #include-ing "util.h" even though the
headers themselves did not use anything from there, but rather from
include/openvswitch/util.h. Fixing that required including util.h
in several C files mostly due to OVS_NOT_REACHED and things like
xmalloc.

Signed-off-by: Terry Wilson <twilson@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-07-22 17:09:17 -07:00
William Tu
64839cf432 netdev-provider: Apply batch object to netdev provider.
Commit 1895cc8dbb64 ("dpif-netdev: create batch object") introduces
batch process functions and 'struct dp_packet_batch' to associate with
batch-level metadata.  This patch applies the packet batch object to
the netdev provider interface (dummy, Linux, BSD, and DPDK) so that
batch APIs can be used in providers.  With batch metadata visible in
providers, optimizations can be introduced at per-batch level instead
of per-packet.

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/145694197
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-07-21 16:46:32 -07:00
Daniele Di Proietto
29736cc076 netdev-linux: Do not log a warning if the device is down.
In the userspace datapath we use tap devices as internal netdev.  The
datapath doesn't consider whether a device is up or down before sending
to it, and so far this hasn't been a problem.

Since Linux upstream commit 1bd4978a88ac("tun: honor IFF_UP in
tun_get_user()"), included in 4.4, writing to a tap device that is not
up sets errno to EIO.  This commit avoids printing a warning in this
case.

This fixes failures in the system-userspace-testsuites.

Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-07-06 18:14:25 -07:00
William Tu
aaca4fe0ce ofp-actions: Add truncate action.
The patch adds a new action to support packet truncation.  The new action
is formatted as 'output(port=n,max_len=m)', as output to port n, with
packet size being MIN(original_size, m).

One use case is to enable port mirroring to send smaller packets to the
destination port so that only useful packet information is mirrored/copied,
saving some performance overhead of copying entire packet payload.  Example
use case is below as well as shown in the testcases:

    - Output to port 1 with max_len 100 bytes.
    - The output packet size on port 1 will be MIN(original_packet_size, 100).
    # ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)'

    - The scope of max_len is limited to output action itself.  The following
      packet size of output:1 and output:2 will be intact.
    # ovs-ofctl add-flow br0 \
            'actions=output(port=1,max_len=100),output:1,output:2'
    - The Datapath actions shows:
    # Datapath actions: trunc(100),1,1,2
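
The scoping rule in the example above (trunc applies only to the output that follows it) can be modeled with a toy action interpreter. All toy_* names are hypothetical; this is not the datapath implementation.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_action_type { TOY_TRUNC, TOY_OUTPUT };

struct toy_action {
    enum toy_action_type type;
    uint32_t arg;       /* max_len for TOY_TRUNC, port for TOY_OUTPUT. */
};

/* Runs 'n' actions on a packet of 'size' bytes.  Stores the length sent by
 * each output action in 'out_lens' and returns the number of outputs. */
static size_t
toy_execute(uint32_t size, const struct toy_action actions[], size_t n,
            uint32_t out_lens[])
{
    uint32_t pending_max = UINT32_MAX;  /* No truncation pending. */
    size_t n_out = 0;

    for (size_t i = 0; i < n; i++) {
        if (actions[i].type == TOY_TRUNC) {
            pending_max = actions[i].arg;
        } else {
            /* MIN(original_size, max_len), then the limit is cleared. */
            out_lens[n_out++] = size < pending_max ? size : pending_max;
            pending_max = UINT32_MAX;
        }
    }
    return n_out;
}
```

Running trunc(100),1,1,2 on a 1500-byte packet yields lengths 100, 1500 and 1500, matching the statement that the later outputs stay intact.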

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
2016-06-24 09:17:00 -07:00
bschanmu@redhat.com
6cf888b821 netdev-linux: Add new QoS type linux-noop.
Linux ``No operation'' qos type is used to inform the vswitch that the
traffic control for the port is managed externally. Any configuration values
set for this type will have no effect.

This patch provides a solution suggested in this mail -
http://openvswitch.org/pipermail/discuss/2015-May/017687.html

Signed-off-by: Babu Shanmugam <bschanmu@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-06-23 16:51:11 -07:00
Daniele Di Proietto
050c60bfb5 netdev-dpdk: Use ->reconfigure() call to change rx/tx queues.
This introduces in dpif-netdev and netdev-dpdk the first use for the
newly introduced reconfigure netdev call.

When a request to change the number of queues comes, netdev-dpdk will
remember this and notify the upper layer via
netdev_request_reconfigure().

The datapath, instead of periodically calling netdev_set_multiq(), can
detect this and call reconfigure().

This mechanism can also be used to:
* Automatically match the number of rxq with the one provided by qemu
  via the new_device callback.
* Provide a way to change the MTU of dpdk devices at runtime.
* Move a DPDK vhost device to the proper NUMA socket.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
2016-05-23 10:27:42 -07:00
Daniele Di Proietto
790fb3b745 netdev: Add reconfigure request mechanism.
A netdev provider, especially a PMD provider (like netdev DPDK) might
not be able to change some of its parameters (such as MTU, or number of
queues) without stopping everything and restarting.

This commit introduces a mechanism that allows a netdev provider to
request a restart (netdev_request_reconfigure()).  The upper layer can
be notified via netdev_wait_reconf_required() and
netdev_is_reconf_required().  After closing all the rxqs the upper layer
can finally call netdev_reconfigure(), to make sure that the new
configuration is in place.
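
The request/notify/apply handshake described above might look roughly like this in miniature. The toy_netdev struct and toy_* functions are hypothetical; they only model the flow of control and use a plain flag where a real implementation would need a proper wakeup mechanism.

```c
#include <stdbool.h>

/* Hypothetical device state, split into active and requested settings. */
struct toy_netdev {
    int n_rxq;              /* Configuration currently in effect. */
    int requested_n_rxq;    /* Configuration the provider wants. */
    bool reconf_required;   /* Set by the provider, read by upper layer. */
};

/* Provider side: remember the new value and request a reconfiguration
 * instead of applying it immediately. */
static void
toy_netdev_set_n_rxq(struct toy_netdev *dev, int n_rxq)
{
    if (dev->requested_n_rxq != n_rxq) {
        dev->requested_n_rxq = n_rxq;
        dev->reconf_required = true;
    }
}

/* Upper layer: poll whether a reconfiguration has been requested. */
static bool
toy_netdev_is_reconf_required(const struct toy_netdev *dev)
{
    return dev->reconf_required;
}

/* Upper layer: called after all rxqs are closed to apply the request. */
static void
toy_netdev_reconfigure(struct toy_netdev *dev)
{
    dev->n_rxq = dev->requested_n_rxq;
    dev->reconf_required = false;
}
```

The point of the split is that the active value never changes while queues are in use; it changes only inside the reconfigure call.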

This will be used by next commit to reconfigure rx and tx queues in
netdev-dpdk.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
2016-05-23 10:27:42 -07:00
mweglicx
d6e3feb57c Add support for extended netdev statistics based on RFC 2819.
Implementation of new statistics extension for DPDK ports:
- Add new counters definition to netdev struct and open flow,
  based on RFC2819.
- Initialize netdev statistics as "filtered out"
  before passing it to particular netdev implementation
  (because of that change, statistics which are not
  collected are reported as filtered out, and some
  unit tests were modified in this respect).
- New statistics are retrieved using experimenter code and
  are printed as a result to ofctl dump-ports.
- New counters are available for OpenFlow 1.4+.
- Add new vendor id: INTEL_VENDOR_ID.
- New statistics are printed to output via ofctl only if those
  are present in reply message.
- Add new file header: include/openflow/intel-ext.h which
  contains new statistics definition.
- Extended statistics are implemented only for dpdk-physical
  and dpdk-vhost port types.
- Dpdk-physical implementation uses xstats to collect statistics.
- Dpdk-vhost implements only part of statistics (RX packet sized
  based counters).

Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
[blp@ovn.org made software devices more consistent]
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-05-06 15:28:56 -07:00
Daniele Di Proietto
4ec3d7c757 hmap: Add HMAP_FOR_EACH_POP.
Makes popping each member of the hmap a bit easier.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-04-26 23:28:59 -07:00
Miguel Angel Ajo
79abacc880 netdev-linux: Fix ingress policing burst rate configuration via tc
The tc_police structure was filled with a value calculated in bits
instead of bytes while bytes were expected. This led to a burst
value 8 times higher than intended.

Documentation and defaults have been corrected accordingly to minimize
nuisances on users sticking to the defaults.

The suggested burst value is now 80% of policing rate to make sure
TCP works correctly.
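
The unit mix-up and the 80% suggestion can be illustrated numerically. The sketch assumes the burst is configured in kilobits of 1000 bits each; that unit, and the toy_* helper names, are assumptions made for illustration.

```c
#include <stdint.h>

/* What the buggy code effectively computed: a size in bits. */
static uint64_t
toy_burst_kbits_to_bits(uint32_t kbits)
{
    return (uint64_t) kbits * 1000;
}

/* What a byte-based field expects: the same size in bytes. */
static uint64_t
toy_burst_kbits_to_bytes(uint32_t kbits)
{
    return toy_burst_kbits_to_bits(kbits) / 8;
}

/* Suggested burst: 80% of the policing rate (same kilobit units). */
static uint32_t
toy_suggested_burst_kbits(uint32_t rate_kbits)
{
    return (uint32_t) ((uint64_t) rate_kbits * 80 / 100);
}
```

For a 1000 kbit burst, the byte value is 125000 while the bit value is 1000000, which is the factor of 8 mentioned above.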

Signed-off-by: Miguel Angel Ajo <majopela@redhat.com>
Tested-by: Miguel Angel Ajo <majopela@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-04-21 14:48:23 -07:00
William Tu
91644f45c6 dp-packet: Fix use of uninitialised value at emc_lookup.
Valgrind reports "Conditional jump or move depends on uninitialised value"
and "Use of uninitialised value" at case 2016 ovn -- 3 HVs, 1 LS, 3
lports/HV.  It is caused by 1) assigning an uninitialized value to 'key->hash'
at emc_processing(). Due to uninit rss_hash_valid, dp_packet_rss_valid() might
return true and undefined hash value is returned, and 2) at emc_lookup, the
'current_entry->key.hash' could be uninitialized due to dp_packet_clone().
The patch fixes the two and as a result, a couple of calls to
dp_packet_rss_invalidate() become redundant and thus are removed.

Call stacks:
- Conditional jump or move depends on uninitialised value(s)
    dpif_netdev_packet_get_rss_hash (dpif-netdev.c:3334)
    emc_processing (dpif-netdev.c:3455)
    dp_netdev_input__ (dpif-netdev.c:3639)
and,
- Use of uninitialised value of size 8
    emc_lookup (dpif-netdev.c:1785)
    emc_processing (dpif-netdev.c:3457)
    dp_netdev_input__ (dpif-netdev.c:3639)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-04-06 19:32:34 -07:00
Ben Warren
64c967795b Move lib/ofpbuf.h to include/openvswitch directory
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-03-30 13:10:18 -07:00
Pravin B Shelar
6b6e13293e netdev: remove netdev_get_in4()
Since a netdev can have multiple IP addresses, use the
generic API netdev_get_addr_list().  This also makes it
easier to handle IPv4 and IPv6 addresses across vswitchd
layers.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-03-24 09:30:57 -07:00
Pravin B Shelar
a8704b5027 tunneling: Handle multiple ip address for given device.
A device can have multiple IP addresses, but netdev_get_in4/6()
returns only one configured IPv6 address. The following
patch fixes it.
The OVS router is also updated to return the source IP address for a
given destination. This is required when an interface has multiple
IP addresses configured.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-03-24 09:30:57 -07:00
Ben Warren
3e8a2ad145 Move lib/dynamic-string.h to include/openvswitch directory
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-03-19 10:02:12 -07:00
Ilya Maximets
118c77b1a8 netdev: New field 'is_pmd' in netdev_class.
Made to simplify creation of derived classes.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
2016-03-16 17:03:07 -07:00
Pravin B Shelar
989d713599 netdev-linux: Fix netdev ipv6 notification
Listen to RTNLGRP_IPV6_IFINFO to get IPv6 address change
notification.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
2016-03-10 18:53:50 -08:00
Thadeu Lima de Souza Cascardo
7c6401ca2b netdev-linux: Fix warning message.
Instead of reading

"error receiving Ethernet packet on Permission denied: ens3",

it should read

"error receiving Ethernet packet on ens3: Permission denied".

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2016-02-16 09:34:38 -08:00
Simon Horman
67bed84cb2 netdev-linux: Handle flags for 10G and 40G speeds
Handle advertised and supported flags for the following speeds:

* 1G base KX
* 10G base KX4, KR, R
* 40G base KR4, CR4, SR4, LR4

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2015-11-29 16:09:20 -08:00
Simon Horman
0c61535603 netdev-linux: correctly detect port speed bits beyond 16bit
This includes bits for:
* Backplane
* 1000 baseKX (full duplex)
* All speeds of 10Gbit and above other than 10000 baseT (full duplex).

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2015-11-29 16:05:21 -08:00
Ilya Maximets
bc85690660 netdev-linux: Remove unreachable code in netdev_linux_rx_recv_tap().
While splitting netdev_linux_rx_recv() into netdev_linux_rx_recv_sock()
and netdev_linux_rx_recv_tap() in commit
b73c85181df9 ("netdev-linux: Read packet auxdata to obtain vlan_tid")
error handling part was copied 'as is' to both functions.
But in case of netdev_linux_rx_recv_tap(), according to POSIX, the
number of bytes read shall never be greater than 'size'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2015-11-25 21:58:28 -08:00
Thadeu Lima de Souza Cascardo
c9697f354e Prevent test failures when there are non Ethernet devices on the system.
When there are PtP TUN devices on the system or SIT devices, tests will fail
because of a warning that it was not possible to get their Ethernet addresses.
That call comes from the route code adding tunnel ports.

Make that warning an informational message and filter that out during tests.

Also, return EINVAL when trying to get those interface Ethernet addresses, which
will prevent them from being added to the tunnel ports pool and will properly
fail in other places as well.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
2015-11-23 10:18:28 -08:00