Reducing non-const static data makes code more obviously thread-safe.
Although option parsing does not normally need to be thread-safe, I
don't know of a drawback to making its data const.
Signed-off-by: Ben Pfaff <blp@nicira.com>
This patch implements use-space datapath and non-datapath code
to match and use the datapath API set out in Leo Alterman's patch
"user-space datapath: Add basic MPLS support to kernel".
The resulting MPLS implementation supports:
* Pushing a single MPLS label
* Poping a single MPLS label
* Modifying an MPLS lable using set-field or load actions
that act on the label value, tc and bos bit.
* There is no support for manipulating the TTL
this is considered future work.
The single-level push pop limitation is implemented by processing
push, pop and set-field/load actions in order and discarding information
that would require multiple levels of push/pop to be supported.
e.g.
push,push -> the first push is discarded
pop,pop -> the first pop is discarded
This patch is based heavily on work by Ravi K.
Cc: Ravi K <rkerur@gmail.com>
Reviewed-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
A future commit will make all bridges use a single backing datapath.
This commit makes the "dp" argument for "dump-flows" and "del-flows"
optional, since there will typically only be one actual datapath.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Most of the code referred to datapath ports as 32-bit values, but a few
places still used 16-bit references.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
The datapath port number influences the OpenFlow port number in
ovs-vswitchd. The new "port_no" option for the "add-if" command allows
the user to request a specific datapath port number.
Feature #12642
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Rename do_* in ovs-dpctl and ovs-ofctl command with "dpctl_" or "ofctl_"
prefix.
Rename add_flow with dp_netdev_flow_add in lib/dpif-netdev.c.
Signed-off-by: Arun Sharma <arun.sharma@calsoftinc.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
A smap is a string to string hash map. It has a cleaner interface
than shash's which were traditionally used for the same purpose.
This patch implements the data structure, and changes netdev and
its providers to use it.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
This commit adapts a couple of existing pieces of code to use the
new data structure. The following commit will add another user
(which is also the first use of the simap_increas() function).
Signed-off-by: Ben Pfaff <blp@nicira.com>
Replaced all instances of Nicira Networks(, Inc) to Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
The compare-odp-actions.pl utility isn't fully general, even for its
intended purpose of allowing sets of ODP actions to be compared
ignoring unimportant differences in ordering of output actions and
VLAN set actions. I decided that the proper way to do it was to have
a utility that can actually parse the actions, instead of just
doing textual transformations on them. So, this commit replaces
compare-odp-actions.pl by "ovs-dpctl normalize-actions", which is
sufficiently general for the intended purpose.
The new ovs-dpctl functionality can be easily extended to handle
differences in fields other than VLAN, but only VLAN is needed so
far.
This will be needed in an upcoming commit that in some cases
introduces redundant "set vlan" actions into the ODP actions, which
compare-odp-actions.pl doesn't tolerate.
Until now, OVS has handled IP fragments more awkwardly than necessary. It
has not been possible to match on L4 headers, even in fragments with offset
0 where they are actually present. This means that there was no way to
implement ACLs that treat, say, different TCP ports differently, on
fragmented traffic; instead, all decisions for fragment forwarding had to
be made on the basis of L2 and L3 headers alone.
This commit improves the situation significantly. It is still not possible
to match on L4 headers in fragments with nonzero offset, because that
information is simply not present in such fragments, but this commit adds
the ability to match on L4 headers for fragments with zero offset. This
means that it becomes possible to implement ACLs that drop such "first
fragments" on the basis of L4 headers. In practice, that effectively
blocks even fragmented traffic on an L4 basis, because the receiving IP
stack cannot reassemble a full packet when the first fragment is missing.
This commit works by adding a new "fragment type" to the kernel flow match
and making it available through OpenFlow as a new NXM field named
NXM_NX_IP_FRAG. Because OpenFlow 1.0 explicitly says that the L4 fields
are always 0 for IP fragments, it adds a new OpenFlow fragment handling
mode that fills in the L4 fields for "first fragments". It also enhances
ovs-ofctl to allow users to configure this new fragment handling mode and
to parse the new field.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Bug #7557.
Over time we wish to reduce the number of datapath-protocol.h definitions
used directly outside of Linux-specific code. This commit removes use of
"struct ovs_dp_stats" from platform-independent code.
Bug #7559.
Currently ovs is using device stats for Linux devices and count them
itself in other situations. This leads to overlap with hardware stats,
inconsistencies, etc. It's much better to just always count the packets
flowing through the switch and let userspace do any merging that it wants.
Following patch removes vport->get_stats() interface. vport-stat is changed
to use new `struct ovs_vport_stat` rather than rtnl_link_stats64.
Definitions of rtnl_link_stats64 is removed from OVS. dipf_port->stat is also
removed as aggregate stats are only available at netdev layer.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
The prefix "ODP_*" is not overly descriptive in the context of the
larger Linux tree. This commit changes the prefix to "OVS_*" for the
userpace to kernel interactions. The userspace libraries still use
"ODP_" in many of their interfaces since it is more descriptive in the
OVS oeuvre.
Feature #6904
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Until now, each call to netdev_open() for a particular network device
had to either specify a set of network device arguments that was either
empty or (for devices that already existed) equal to the existing device's
configuration. Unfortunately, the definition of "equality" in the latter
case was mostly done in terms of strict equality of string-to-string maps,
which caused problems in cases where, for example, one set of arguments
specified the default value of an optional argument explicitly and the
other omitted it.
The netdev interface does have provisions for defining equality other ways,
but this had only been done in one case that was especially problematic in
practice. One way to solve this particular problem would be to carefully
define equality in all the problematic cases.
This commit takes another approach based on the realization that there is
really no need to do any comparisons. Instead, it removes configuration
at netdev_open() time entirely, because almost all of netdev_open()'s
callers are not interested in creating and configuring a netdev. Most of
them just want to open a configured device and use it. Therefore, this
commit stops providing any configuration arguments to netdev_open() and the
provider functions that it calls. Instead, a caller that does want to
configure a device does so after it opens it, by calling
netdev_set_config().
This change allows us to simplify the netdev interface a bit. There is no
longer any need to implement argument comparisons. As a result, there is
also no need for "struct netdev_dev" to keep track of configuration at all.
Instead, the network devices that have configuration keep track of it in
their own internal form.
This new interface does mean that it becomes possible to accidentally
create and try to use an unconfigured netdev that requires configuration.
Bug #6677.
Reported-by: Paul Ingram <paul@nicira.com>
The Open vSwitch tree only has one user of the ability for a netdev to
receive packets from a network device. Thus, this commit simplifies the
common-case use of the netdev interface by replacing the "ethertype" option
from "struct netdev_options" by a new netdev_listen() call.
The only user of netdev_listen() wants to receive all packets from a
network device, so this commit also removes the ability to restrict the
received packets to a particular protocol. (This ability was once used by
the Open vSwitch integrated DHCP client, but that code has been removed.)
This commit also simplifies and improves the implementation of the code
in netdev-linux that started listening to a network device. Before, I had
not figured out how to avoid receiving all packets on all devices before
binding to a particular device, but I took a closer look at the kernel code
and figured it out.
I've tested that the userspace datapath (dpif-netdev), the only user of
netdev_recv(), still works after this change.
Expose the number of flows present in a datapath to user-space
and to users via ovs-dpctl show.
e.g.:
ovs-dpctl show br3
system@br3:
lookups: frags:0, hit:0, missed:0, lost:0
flows: 0
...
Signed-off-by: Simon Horman <horms@verge.net.au>
[Jesse: Add same logic to userspace datapath.]
Signed-off-by: Jesse Gross <jesse@nicira.com>
This "while" loop in do_add_if() is supposed to split up everything after
the interface name with ',' as the delimiter, but it didn't do that
correctly.
Also corrects a typo in the manpage pointed out by Justin Pettit.
Following this commit, "struct odp_flow_stats" is only used in
Linux-specific parts of OVS userspace code. This allows the actual Linux
datapath interface to evolve more freely.
Reviewed by Justin Pettit.
Following this commit, "struct odp_flow" and related data structures are
only used in Linux-specific parts of OVS userspace code. This allows the
actual Linux datapath interface to evolve more freely.
Reviewed by Justin Pettit.
As with n_flows, n_ports was used regularly by userspace to determine how
much memory to allocate when listing ports, but it is no longer needed for
that. max_ports, on the other hand, is necessary but it is also a fixed
value for the kernel datapath right now and if we expand it we can also
come up with a way to report the expanded value.
The remaining members of odp_stats are actually real statistics that I
intend to keep.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
This queue information will be available through the kernel socket layer
once we move over to Netlink socket as transports, so we might as well get
rid of the redundancy.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Userspace used to use the n_flows information here to decide how much
memory needed to be allocated to list flows, but that isn't necessary any
longer now that listing flows uses an iterator abstraction. The
cur_capacity and max_capacity members are just curiosities and don't
provide much information; if the implementation ever changes away from
the current hash table implementation then they could become meaningless
anyhow.
But more than anything, these aren't really the kind of statistics that
networking people usually care about.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Following this commit, "struct odp_port" is only used in Linux-specific
parts of OVS userspace code. This allows the actual Linux datapath
interface to evolve more freely.
Reviewed by Justin Pettit.
This is cleaner than parsing "odp_port"s directly. It takes one step
toward eliminating use of odp_port from any userspace code outside of
lib/netdev-vport.c and lib/dpif-linux.c.
Reviewed by Justin Pettit.
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software. In
turn, that means that the odp_port structure must become variable-length.
This does not, however, fit in well with the ODP_PORT_LIST ioctl in its
current form, because that would require userspace to know how much space
to allocate for each port in advance, or to allocate as much space as
could possibly be needed. Neither choice is very attractive.
This commit prepares for a different solution, by replacing ODP_PORT_LIST
by a new ioctl ODP_VPORT_DUMP that retrieves information about a single
vport from the datapath on each call. It is much cleaner to allocate the
maximum amount of space for a single vport than to do so for possibly a
large number of vports.
It would be faster to retrieve a number of vports in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.
The Netlink version won't need to take the starting port number from
userspace, since Netlink sockets can keep track of that state as part
of their "dump" feature.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length. This
commit makes that change using Netlink attribute sequences.
This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, because userspace doesn't yet
have enough information to do that intelligently. Upcoming commits will
fix that.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length. This does
not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form,
because that would require userspace to know how much space to allocate
for each flow's key in advance, or to allocate as much space as could
possibly be needed. Neither choice is very attractive.
This commit prepares for a different solution, by replacing ODP_FLOW_LIST
by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath
on each call. It is much cleaner to allocate the maximum amount of space
for a single flow key than to do so for possibly a very large number of
flow keys.
As a side effect, this patch also fixes a race condition that sometimes
made "ovs-dpctl dump-flows" print an error: previously, flows were listed
and then their actions were retrieved, which left a window in which
ovs-vswitchd could delete the flow. Now dumping a flow and its actions is
a single step, closing that window.
Dumping all of the flows in a datapath is no longer an atomic step, so now
it is possible to miss some flows or see a single flow twice during
iteration, if the flow table is modified by another process. It doesn't
look like this should be a problem for ovs-vswitchd.
It would be faster to retrieve a number of flows in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Presumably this function was written to iterate all of the ports because
at some point we didn't have a direct way to do this, but now
dpif_port_query_by_name() is the obvious way to do it.
Acked-by: Jesse Gross <jesse@nicira.com>
When "ovs-dpctl show" is run, return additional information about the
port. For example, tunnel ports will print the remote_ip, local_ip, and
in_key when defined.
In the medium term, we plan to migrate the datapath to use Netlink as its
communication channel. In the short term, we need to be able to have
actions with 64-bit arguments but "struct odp_action" only has room for
48 bits. So this patch shifts to variable-length arguments using Netlink
attributes, which starts in on the Netlink transition and makes 64-bit
arguments possible at the same time.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath. Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath. Similarly, a vport
could be detached and deleted separately.
After some study, I think I understand why this distinction exists. It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them. However, changing it to create them before it tries to
open them is not difficult, so this commit does this.
The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
The "port group" concept seems like a good one, but it has not been
used very much in userspace so far, so before we commit ourselves to
a frozen API that we must maintain forever, remove it. We can always
add it back in later as a new kind of vport.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.