Add northd logical flows in order to report that the controller
received an IP packet for an LB rule with no backends.
This configuration is used by OpenShift to spin up an idle pod.
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Co-authored-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Add trigger_event() ovn action in order to allow ovs-vswitchd to report
CMS related events.
This commit introduces a new event, empty_lb_backends. This event is
raised if a received packet is destined for a load balancer VIP that has
no configured backend destinations. For this event, the event info
includes the load balancer VIP, the load balancer UUID, and the
transport protocol.
The use case for this particular event is for the CMS to supply backend
resources to handle this traffic. For example, in OpenShift, this event
can be used to spin up new containers to handle the incoming traffic.
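As a rough illustration of what the event carries, the sketch below models
the event info as a small record and the condition that raises it. The
class, function and field names are hypothetical and not taken from the OVN
sources; only the three pieces of event info (VIP, load balancer UUID,
transport protocol) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmptyLbBackendsEvent:
    """Hypothetical model of the empty_lb_backends event info."""
    vip: str            # load balancer VIP, e.g. "10.0.0.10:80"
    load_balancer: str  # load balancer UUID (placeholder string here)
    protocol: str       # transport protocol, e.g. "tcp"


def should_raise_event(dst_vip, lb_backends):
    """Return True when dst_vip is a known VIP with no configured backends."""
    return dst_vip in lb_backends and not lb_backends[dst_vip]


# Made-up configuration: one VIP without backends, one with a backend.
lb_backends = {"10.0.0.10:80": [], "10.0.0.20:80": ["192.168.0.5:8080"]}

if should_raise_event("10.0.0.10:80", lb_backends):
    event = EmptyLbBackendsEvent("10.0.0.10:80", "lb-uuid-placeholder", "tcp")
```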
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Co-authored-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
OVS has defines for loops like 'BITMAP_FOR_EACH_1' or
'ULLONG_FOR_EACH_1', but the regexp in checkpatch doesn't match with
numbers and skips these loops while checking.
This patch adds numbers to the regexp and adds some FOR_EACH loops to
the unit tests.
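The effect of allowing digits can be sketched as follows. The two
expressions are illustrative only; the exact patterns used in checkpatch
may differ.

```python
import re

# Letters and underscores only: a trailing digit, as in BITMAP_FOR_EACH_1,
# prevents the macro name from being matched up to the opening parenthesis.
old_for_each = re.compile(r'^\s*([A-Z_]*FOR_EACH[A-Z_]*)\s*\(')

# Same pattern with digits allowed in the macro name suffix.
new_for_each = re.compile(r'^\s*([A-Z_]*FOR_EACH[A-Z0-9_]*)\s*\(')

line = "    BITMAP_FOR_EACH_1 (i, n, bitmap) {"
print(old_for_each.match(line) is None)   # the old pattern skips this loop
print(new_for_each.match(line).group(1))  # the new pattern sees the macro
```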
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
After commit [1], the test cases below fail repeatedly in Travis CI.
2663: ovn -- 4 HV, 1 LS, 1 LR, packet test with HA distributed router gateway port FAILED (ovn.at:8597)
2664: ovn -- 4 HV, 3 LS, 2 LR, packet test with HA distributed router gateway port FAILED (ovn.at:8844)
2667: ovn -- vlan traffic for external network with distributed router gateway port FAILED (ovn.at:9580)
2691: ovn -- router - check packet length - icmp defrag FAILED (ovn.at:13624)
With the commit [1], ovn-controller sends GARPs for the IPs of the distributed
router ports. The failing tests did not handle the situation if multiple GARPs
are sent. The failures are mostly timing related. This patch fixes these issues.
[1] - d65586b6fa97 ("ovn: Send GARP for router port IPs of a router port connected to bridged logical switch")
Fixes: d65586b6fa97 ("ovn: Send GARP for router port IPs of a router port connected to bridged logical switch")
CC: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This doesn't impact the effectiveness of the test; it just fixes an
obvious error in ACL syntax that was noticed when looking at test
logs.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The test creates 2 logical switches and connects them with a logical router.
However, it didn't set the option "router-port", so the 2 LS datapaths
were not connected. This results in missing test coverage for port-binding
incremental processing. Suppose I-P has a bug and a port-binding change
always triggers a recompute: since each HV monitors only its own datapath
(i.e. HV1 -> ls1, HV2 -> ls2), it never gets notified of a port-binding
change on the other datapath, so the recompute is never triggered and the
bug goes unnoticed. With this fix, each HV's local datapaths
include both ls1 and ls2, so port-binding change notifications are
received properly and an unexpected recompute would be caught.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Prior to this patch, only db change aware connections were dropped
on a read/write status change. However, the current schema in OVN does
not allow clients to monitor whether a particular DB changes this
status. In order to accomplish this, we'd need to change the schema
and adapt ovsdb-server and existing clients.
Before tackling that, this patch changes ovsdb-server to drop
*all* the existing connections upon a read/write status change. This
will force clients to reconnect and honor the change.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-July/048981.html
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The chassis_run code didn't take into account the scenario when the
system-id was changed in the Open_vSwitch table. Due to this the code
was trying to insert a new Chassis record in the OVN_Southbound DB with
the same Encaps as the previous Chassis record. The transaction used
to insert the new records was aborting due to the ["type", "ip"]
index constraint violation as we were creating new Encap entries with
the same "type" and "ip" as the old ones.
In order to fix this issue the flow is now:
1. the first time ovn-controller initializes the Chassis (shortly after
start up) we store the chassis-id.
2. for subsequent chassis_run calls we use the last configured
chassis-id stored at the previous step to look up the old Chassis record.
3. when ovn-controller shuts down gracefully we look up the Chassis
record based on the chassis-id stored in memory at steps 1 and 2 above.
This is to avoid failing to clean up the Chassis record in OVN_Southbound
DB if the OVS system-id changes between the last call to chassis_run and
chassis_cleanup.
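The three steps above can be sketched roughly as follows. This is a toy
model, not the ovn-controller code: the class, the dict standing in for the
Southbound Chassis table, and the assumption that the old record is reused
under the new name are all illustrative.

```python
class ChassisTracker:
    """Hypothetical stand-in for the chassis state kept by ovn-controller."""

    def __init__(self):
        self.stored_chassis_id = None  # remembered across iterations

    def run(self, sb_chassis, system_id, encaps):
        """One chassis_run-like iteration.

        sb_chassis is a dict standing in for the Southbound Chassis table,
        keyed by chassis name.
        """
        # Steps 1 and 2: prefer the id stored earlier over the current one.
        lookup_id = self.stored_chassis_id or system_id
        record = sb_chassis.pop(lookup_id, None)
        if record is None:
            record = {"encaps": list(encaps)}
        # Assumption: the old record is reused under the new name, so no
        # second set of Encap rows with the same (type, ip) is created.
        record["name"] = system_id
        sb_chassis[system_id] = record
        self.stored_chassis_id = system_id
        return record

    def cleanup(self, sb_chassis):
        """Step 3: delete the record found via the stored id."""
        if self.stored_chassis_id is not None:
            sb_chassis.pop(self.stored_chassis_id, None)
            self.stored_chassis_id = None
```

With this model, changing the system-id between two run() calls leaves a
single record in the table, and cleanup() removes it even though the name
it was created under no longer matches.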
Reported-at: https://bugzilla.redhat.com/1708146
Reported-by: Haidong Li <haili@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The first users will be added in an upcoming commit.
Also add tests.
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing
The key difference between an overlay logical switch and a
vlan backed logical switch is that for vlan logical switches
packets are not encapsulated.
Hence, if a distributed router port is connected to a vlan backed
logical switch, then the router port mac as source mac could be
seen from multiple hypervisors. From the perspective of a top of
the rack switch (TOR), the same <mac,vlan> pair arriving on
multiple ports could be seen as a security threat, and the TOR could
raise alarms, drop the packets, block the ports, etc.
This patch addresses this by introducing the concept of a chassis mac.
A chassis mac is a CMS-provisioned unique mac per chassis. For any routed
packet (i.e. source mac is router port mac) going on the wire on a vlan type
logical switch, we will replace its source mac with the chassis mac.
This replacing of source mac with chassis mac will happen in table=65
of the logical switch datapath. A flow is added at priority 150, which
matches the source mac and replaces it with chassis mac if the value
is a router port mac.
Example flow:
cookie=0x0, duration=67765.830s, table=65, n_packets=0, n_bytes=0,
idle_age=65534, hard_age=65534, priority=150,reg15=0x1,metadata=0x4,
dl_src=00:00:01:01:02:03 actions=mod_dl_src:aa:bb:cc:dd:ee:ff,
mod_vlan_vid:1000,output:16
Here, 00:00:01:01:02:03 is router port mac and aa:bb:cc:dd:ee:ff
is chassis mac.
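The rewrite rule itself is small. A minimal sketch, using a hypothetical
helper name and the MAC addresses from the example flow:

```python
def rewrite_source_mac(src_mac, router_port_macs, chassis_mac):
    """Return the source MAC to put on the wire for a vlan logical switch.

    Only packets whose source MAC is a router port MAC (i.e. routed
    packets) are rewritten; everything else is left untouched.
    """
    if src_mac.lower() in router_port_macs:
        return chassis_mac
    return src_mac


router_port_macs = {"00:00:01:01:02:03"}
chassis_mac = "aa:bb:cc:dd:ee:ff"

print(rewrite_source_mac("00:00:01:01:02:03", router_port_macs, chassis_mac))
print(rewrite_source_mac("00:00:00:00:00:10", router_port_macs, chassis_mac))
```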
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch handles sending GARPs for
- router port IPs of a distributed router port
- router port IPs of a router port which belongs to gateway router
(with the option - redirect-chassis set in Logical_Router.options)
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
With commit [1], the routing for the provider logical switches
connected to a router is centralized on the master gateway chassis
(if the option reside-on-redirect-chassis is set). When a
failover happens and a standby gateway chassis becomes master,
it should send GARPs for the router port macs. Without this, the
physical switch doesn't learn the new location of the router port macs
immediately and this could result in traffic disruption.
This patch addresses this issue so that the ovn-controller which claims the
distributed gateway router port sends out the GARPs.
ovn-controller sends the GARPs if the Port_Binding.nat_addresses column
is set. This patch makes use of this column instead of adding a new column,
even though the name nat_addresses seems a bit of a misnomer. The
documentation is updated to highlight the usage of this column.
This patch doesn't handle sending the GARPs for the gateway router port IPs.
This will be handled in a separate patch.
[1] - 85706c34d53d ("ovn: Avoid tunneling for VLAN packets redirected to a gateway chassis")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The patch adds ip6gre support. Tunnel type 'ip6gre' with
packet_type=legacy_l2 is a layer 2 GRE tunnel over IPv6, carrying inner
Ethernet packets encapsulated with a GRE header and an outer IPv6 header.
Encapsulation of layer 3 packets over IPv6 GRE, ip6gre, is not supported
yet. I tested it by running:
# make check-kernel TESTSUITEFLAGS='-k ip6gre'
under kernel 5.2 and for userspace:
# make check TESTSUITEFLAGS='-k ip6gre'
Tested-by: Greg Rose <gvrose8192@gmail.com>
Tested-at: https://travis-ci.org/gvrose8192/ovs-experimental/builds/552977116
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
During a code audit, the flow extraction fuzzer target was seen to be
parsing tcp flags from the fuzzer supplied input twice. This is
probably a typo since the second call to `parse_tcp_flags()` is
identical to the first.
Since a call to `parse_tcp_flags()` parses the Ethernet and IP headers
contained in the packet, the second (buggy) call to `parse_tcp_flags()`
creates an expectation that there is a second set of Ethernet and IP
headers beyond the first which is incorrect. This patch fixes this
problem by removing the duplicate code in question.
Signed-off-by: Bhargava Shastry <bshas3@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Previously, '--disable-system' disabled both the system dp and the system
routing table. The patch makes '--disable-system' only disable the system
dp and adds '--disable-system-route' for disabling the route table.
This fixes failures of 'make check-system-userspace' for tunnel cases.
As a consequence, we hit errors because OVS userspace parses the IGMP packet
but its datapaths do not, so odp_flow_key_to_flow() returns
ODP_FIT_TOO_LITTLE (see commit c645550bb249 ("odp-util: Always report
ODP_FIT_TOO_LITTLE for IGMP.")).
Fix it by filtering out the IGMP-related error message.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Co-authored-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
We currently poll all available queues based on the max queue count
exchanged with the vhost peer and rely on the vhost library in DPDK to
check the vring status beneath.
This can lead to some overhead when we have a lot of unused queues.
To improve the situation, we can skip the disabled queues.
On rxq notifications, we make use of the netdev's change_seq number so
that the pmd thread main loop can cache the queue state periodically.
$ ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 1:
isolated : true
port: dpdk0 queue-id: 0 (enabled) pmd usage: 0 %
pmd thread numa_id 0 core_id 2:
isolated : true
port: vhost1 queue-id: 0 (enabled) pmd usage: 0 %
port: vhost3 queue-id: 0 (enabled) pmd usage: 0 %
pmd thread numa_id 0 core_id 15:
isolated : true
port: dpdk1 queue-id: 0 (enabled) pmd usage: 0 %
pmd thread numa_id 0 core_id 16:
isolated : true
port: vhost0 queue-id: 0 (enabled) pmd usage: 0 %
port: vhost2 queue-id: 0 (enabled) pmd usage: 0 %
$ while true; do
ovs-appctl dpif-netdev/pmd-rxq-show |awk '
/port: / {
tot++;
if ($5 == "(enabled)") {
en++;
}
}
END {
print "total: " tot ", enabled: " en
}'
sleep 1
done
total: 6, enabled: 2
total: 6, enabled: 2
...
# Started vm, virtio devices are bound to kernel driver which enables
# F_MQ + all queue pairs
total: 6, enabled: 2
total: 66, enabled: 66
...
# Unbound vhost0 and vhost1 from the kernel driver
total: 66, enabled: 66
total: 66, enabled: 34
...
# Configured kernel bound devices to use only 1 queue pair
total: 66, enabled: 34
total: 66, enabled: 19
total: 66, enabled: 4
...
# While rebooting the vm
total: 66, enabled: 4
total: 66, enabled: 2
...
total: 66, enabled: 66
...
# After shutting down the vm
total: 66, enabled: 66
total: 66, enabled: 2
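For reference, the counting done by the awk script above can also be
written in Python. This is only a sketch: the sample text is made up (it
includes a "(disabled)" queue for illustration), and unlike the awk
pattern it only looks at lines whose first field is "port:".

```python
def count_rxqs(output):
    """Count total and enabled rx queues in pmd-rxq-show style output."""
    total = enabled = 0
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[0] == "port:":
            total += 1
            # Fifth whitespace-separated field, like $5 in the awk script.
            if len(fields) >= 5 and fields[4] == "(enabled)":
                enabled += 1
    return total, enabled


# Made-up sample in the same shape as the command output shown above.
sample = """\
pmd thread numa_id 0 core_id 1:
  isolated : true
  port: dpdk0 queue-id: 0 (enabled) pmd usage: 0 %
pmd thread numa_id 0 core_id 2:
  isolated : true
  port: vhost1 queue-id: 0 (disabled) pmd usage: 0 %
"""

total, enabled = count_rxqs(sample)
print("total: %d, enabled: %d" % (total, enabled))
```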
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
This patch fixes the ofp_port to odp_port translation issue on a patch
port with nxt_resume. When OVS resumes processing a packet from
nxt_resume, OVS does not translate the ofp in_port to the odp in_port
correctly if the packet was originally received from a patch port.
Currently, OVS sets the odp in_port for this resumed packet to ODPP_NONE
and pushes the resumed packet back to the datapath. Later on, if the packet
goes through a recirc, OVS generates the following message, since it
cannot translate the odp in_port (ODPP_NONE) back to an ofp in_port during
upcall, and pushes down a datapath rule to drop the packet.
ofproto_dpif_upcall(handler16)|INFO|received packet on unassociated
datapath port 4294967295
When OVS revalidates the drop datapath flow with ODPP_NONE in_port, we
will see the following warning.
ofproto_dpif_upcall(revalidator18)|WARN|Failed to acquire udpif_key
corresponding to unexpected flow (Invalid argument): ufid:....
This patch resolves this issue by storing the odp in_port in the
continuation messages and restoring the odp in_port before pushing the
packet back to the datapath.
VMWare-BZ: 2364696
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Open vSwitch now supports all OpenFlow 1.5 required features, so enable
it by default.
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
OpenFlow 1.5 changed "meter" from an instruction to an action. This commit
supports it properly.
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Current issues with Flow API:
* OVS calls offloading functions regardless of successful
flow API initialization. (ex. on init_flow_api failure)
* Static initialization of Flow API for a netdev_class forbids
having different offloading types for different instances
of netdev with the same netdev_class. (ex. different vports in
'system' and 'netdev' datapaths at the same time)
Solution:
* Move Flow API from the netdev_class to netdev instance.
* Make Flow API dynamic, i.e. probe the APIs and choose the
suitable one.
Side effects:
* Flow API providers are localized as much as possible in their modules.
* Now we have an ability to make runtime checks. For example,
we could check if particular device supports features we
need, like if dpdk device supports RSS+MARK action.
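The per-instance probing idea can be illustrated with a toy model. All
names below (the provider classes, the dict-based netdev, the
"tc-like"/"dpdk-like" labels) are hypothetical and are not the OVS C API;
the sketch only shows providers being probed per netdev instance and
offload calls being skipped when no provider initialized successfully.

```python
class FlowApi:
    """Hypothetical offload provider interface."""
    name = "base"

    def init_flow_api(self, netdev):
        """Return True if this provider can handle 'netdev'."""
        raise NotImplementedError

    def flow_put(self, netdev, flow):
        return True


class TcLikeApi(FlowApi):
    name = "tc-like"

    def init_flow_api(self, netdev):
        return netdev.get("datapath") == "system"


class DpdkLikeApi(FlowApi):
    name = "dpdk-like"

    def init_flow_api(self, netdev):
        # Runtime check of a device feature, e.g. RSS+MARK support.
        return netdev.get("datapath") == "netdev" and netdev.get("rss_mark", False)


PROVIDERS = [TcLikeApi(), DpdkLikeApi()]


def assign_flow_api(netdev):
    """Probe providers and attach the first suitable one to this instance."""
    netdev["flow_api"] = None
    for provider in PROVIDERS:
        if provider.init_flow_api(netdev):
            netdev["flow_api"] = provider
            break
    return netdev["flow_api"]


def flow_put(netdev, flow):
    """Offload 'flow' only if a provider was initialized for this netdev."""
    api = netdev.get("flow_api")
    if api is None:
        return False  # no offload attempted
    return api.flow_put(netdev, flow)
```

Two ports of the same class but in different datapaths end up with
different providers, which is the situation the static per-class
initialization could not express.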
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Roi Dayan <roid@mellanox.com>
This adds a negative test for almost all of the error messages that
parsing an action or instruction can produce.
This commit removes now-redundant tests from multipath.at.
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Verification can fail for a variety of reasons but the code here always
reported "Incorrect instruction ordering".
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
ICMP/ICMPv6 fails if the src/dst port is set in a common NAT rule.
For example:
actions=ct(nat(dst=172.16.1.100:5000),commit,table=40)
Fixes: 4cd0481c9e8b ("conntrack: Fix wasted work for ICMP NAT.")
CC: Darrell Ball <dlu998@gmail.com>
Signed-off-by: solomon <liwei.solomon@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Co-authored-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
For OpenStack internal DNS functionality we need
to provide support for the domain_name option.
DHCP option 15 was previously used only in parser
tests and according to the RFC it should be renamed to
domain_name [1].
This patch modifies its name in the tests from
'domain' to 'domain_name' and adds support for it
to the code.
[1] https://tools.ietf.org/html/rfc2132#section-3.17
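For context, RFC 2132 encodes the Domain Name option as a code byte (15),
a length byte and the name itself. A minimal sketch of that encoding,
using a hypothetical helper name that is not part of OVN:

```python
DHCP_OPT_DOMAIN_NAME = 15  # option code from RFC 2132, section 3.17


def encode_domain_name_option(domain):
    """Encode a DHCP Domain Name option as code, length, value."""
    data = domain.encode("ascii")
    if not 1 <= len(data) <= 255:
        raise ValueError("domain name must be 1..255 bytes long")
    return bytes([DHCP_OPT_DOMAIN_NAME, len(data)]) + data


option = encode_domain_name_option("example.org")
print(option.hex())
```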
Signed-off-by: Maciej Józefczyk <mjozefcz@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
ovn-macros are needed to run the OVN system tests.
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Google oss-fuzz reported a build failure for the OVN expression parser.
Upon investigation, it turned out that the expr_parse_target fuzzer was
not being run by Google due to the said failure.
The root cause of the build failure turned out to be a change in the
definition of the expr_parse_string() API. Now, this API accepts an
additional parameter of type struct sset * that points to the set of
referenced address sets, which may be NULL if unused.
This patch adds this additional parameter to expr_parse_string(),
passing NULL for the set of referenced address sets.
Once this patch is applied, ossfuzz's expr_parse_target should build
and subsequently be fuzzed.
CC: Han Zhou <hzhou8@ebay.com>
Fixes: 43e6900a7991 ("ovn-controller: Maintain resource references for logical flows.")
Signed-off-by: Bhargava Shastry <bshastry@sect.tu-berlin.de>
Signed-off-by: Ben Pfaff <blp@ovn.org>
From: Jakub Sitnicki <jkbs@redhat.com>
Add a test that performs typical operations of creating & destroying
logical routers, switches, ports, address sets and ACLs while checking
if they trigger full logical flow processing in the ovn-controller.
This way we confirm that incremental processing is taking effect when we
expect it to.
Place the new test in a separate module - tests/ovn-performance.at,
instead of the usual tests/ovn.at as it doesn't test OVN's functionality
but rather a performance aspect of ovn-controller.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
When the content of an address set changes, ovn-controller will
not recompute all flows but only the ones related to the changed
address-set. The performance test result is discussed at [1].
[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2018-June/046880.html
Tested-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch maintains the cross reference between logical flows and
the resources such as address sets and port groups that are used by
logical flows. This data will be needed in address set and port
group incremental processing.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Priority tags is a port configuration that determines how the port treats
priority tags, e.g. zero VLAN ID. Change the type from boolean to enum
as a pre-step towards introducing additional modes. The new options are
"never", equivalent to the previous "false", and "if-nonzero",
equivalent to the previous "true". "true" is still supported for backwards
compatibility.
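The mapping between old and new values can be sketched as below. The
function name is hypothetical; accepting "true" follows the description
above, while also accepting "false" is an assumption made for the sketch.

```python
# Old boolean spellings mapped onto the new enum values.
_ALIASES = {"true": "if-nonzero", "false": "never"}  # "false" is an assumption
_MODES = {"never", "if-nonzero"}


def normalize_priority_tags(value):
    """Map a configured priority tags value onto the new enum."""
    value = _ALIASES.get(value, value)
    if value not in _MODES:
        raise ValueError("unsupported priority tags value: %r" % value)
    return value


print(normalize_priority_tags("true"))
print(normalize_priority_tags("never"))
```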
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Use the sb mac binding table to trigger IP buffer dequeueing instead of
ARP/ND packet reception, since the ARP reply can be handled on a
different chassis if a gw router port is scheduled on a different
node.
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Userspace conntrack cares about IPv4 checksums, so this is a
prerequisite for adding zone limit support to userspace conntrack.
Fixes: 3f1087c70cf9 ("system-traffic: Add conntrack per zone limit test case.")
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Several of the ovn tests did not use the `--wait` flag to wait for a
configuration change to propagate through the system. As a result,
these tests fail when `ovn-northd` is slow.
Fixed by adding `--wait=hv` or `--wait=sb` as appropriate.
Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
The logical-fields.h file was moved, which broke oss-fuzz builds. The
path has been updated accordingly.
CC: Numan Siddique <nusiddiq@redhat.com>
Fixes: 086470cdbe66 ("ovn: Add a new OVN field icmp4.frag_mtu")
Signed-off-by: Toms Atteka <cpp.code.lv@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A veth pair doesn't offload anything to HW, i.e. we should use the 'tc'
type when requesting flows. 'offloaded' is kept just in case, so that the
test does not need updating if veths are HW offloaded someday.
Additionally, added the 'ipv4' fields that were missing for an unknown
reason. Also dropped stripping of the errors from the log.
Fixes test:
2: offloads - ping between two ports - offloads enabled ok
CC: Gavi Teitz <gavi@mellanox.com>
Fixes: d63ca5329ff9 ("dpctl: Properly reflect a rule's offloaded to HW state")
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
ovn-northd deletes and recreates HA_Chassis rows (which belong
to a HA_Chassis_Group) whenever the HA_Chassis_Group/Gateway_Chassis
rows in Northbound DB are out of sync. If a Chassis table row in
Southbound DB is deleted and if this row is referenced by a HA_Chassis
row (in Southbound DB), then the present code syncs the HA_Chassis
rows continuously, which causes the ovn-controllers to wake up
and results in 100% cpu usage.
This was a simple case which commit
1be1e0e5e0d1 ("ovn: Add generic HA chassis group") failed to address.
This patch fixes this issue.
Fixes: 1be1e0e5e0d1 ("ovn: Add generic HA chassis group")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048580.html
Reported-by: Daniel Alvarez Sanchez (dalvarez@redhat.com)
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch adds support for Transport Zones. Transport zones (a.k.a.
TZs) are a way to enable users of OVN to separate Chassis into different
logical groups that will only form tunnels between members of the same
group. Each Chassis can belong to one or more Transport Zones. If
not set, the Chassis will be considered part of a default group.
Configuring Transport Zones is done by creating a key called
"ovn-transport-zones" in the external_ids column of the Open_vSwitch
table from the local OVS instance. The value is a string with the name
of the Transport Zone that this instance is part of. Multiple TZs can
be specified with a comma-separated list. For example:
$ sudo ovs-vsctl set open . external-ids:ovn-transport-zones=tz1
or
$ sudo ovs-vsctl set open . external-ids:ovn-transport-zones=tz1,tz2,tz3
This configuration is also exposed in the Chassis table of the OVN
Southbound Database in a new column called "transport_zones".
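The grouping rule can be sketched as follows. The helper names are
hypothetical, and treating two Chassis without any configured zone as
members of the same default group is an interpretation of the text above,
not the ovn-controller code.

```python
def parse_transport_zones(value):
    """Parse an ovn-transport-zones value such as "tz1,tz2,tz3"."""
    if not value:
        return frozenset()
    return frozenset(tz.strip() for tz in value.split(",") if tz.strip())


def should_tunnel(local_tzs, remote_tzs):
    """Decide whether two Chassis should form a tunnel."""
    if not local_tzs and not remote_tzs:
        return True  # both in the default group (assumption, see above)
    return bool(local_tzs & remote_tzs)


central = parse_transport_zones("tz1,tz2")
edge1 = parse_transport_zones("tz1")
edge2 = parse_transport_zones("tz2")

print(should_tunnel(central, edge1))  # share tz1
print(should_tunnel(edge1, edge2))    # no zone in common
```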
The uses for Transport Zones include, but are not limited to:
* Edge computing: as a way of preventing edge sites from trying to create
tunnels with every node on every other edge site while still allowing
these sites to create tunnels with the central node.
* Extra security layer: where users want to create "trust zones"
and prevent computes in a more secure zone from communicating with a less
secure zone.
This patch is also backward compatible so the upgrade guide for OVN [0]
is still valid and the ovn-controller service can be upgraded before the
OVSDBs.
[0] http://docs.openvswitch.org/en/latest/intro/install/ovn-upgrades/
Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-February/048255.html
Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
If a server claims itself as "disconnected", all clients connected
to that server will try to reconnect to a new server in the cluster.
However, currently a server would claim itself as disconnected even
when it is itself the candidate trying to become the new leader (and
most likely it will be), so all its clients will reconnect to another
node.
During a leader fail-over (e.g. due to a leader failure), it is
expected that all clients of the old leader will have to reconnect
to other nodes in the cluster, but it is unnecessary for all the
clients of a healthy node to reconnect, which could cause more
disturbance in a large scale environment.
This patch fixes the problem by slightly changing the condition under
which a server regards itself as disconnected: if its role is candidate,
it is regarded as disconnected only if the election didn't succeed
at the first attempt. Related failure test cases are also unskipped
and all pass with this patch.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
In commit 2bcb3b70 ("ovsdb raft: Move ovsdb cluster tests to separate
testsuite.") the "clustered transactions" tests were left unexecuted
because they depend on "EXECUTION_EXAMPLES", which is defined in
ovsdb-execution.at.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch adds 2 stages in the router pipeline after ARP_RESOLVE
and adds the logical flows to check the packet length and
generate an ICMPv4 packet.
* S_ROUTER_IN_CHK_PKT_LEN - Which checks the packet length using
check_pkt_larger OVN action
* S_ROUTER_IN_LARGER_PKTS - Which generates icmp packet with
type 3 (Destination Unreachable),
code 4 (Frag Needed and DF was Set)
icmp4.frag_mtu = gw_mtu
In order to add these logical flows, CMS should set the
option 'gateway_mtu' for the distributed logical router port.
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
A previous commit added a new OVS action 'check_pkt_larger'. This
patch supports that action in OVN. The syntax to use this would be
reg0[0] = check_pkt_larger(LEN)
An upcoming commit will make use of this action in ovn-northd and
will generate an ICMPv4 packet if the packet length is greater than
the specified length.
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This action is similar to the existing 'icmp4' OVN action except
that this action is expected to be used to generate an ICMPv4 packet
in response to an error in the original IP packet. When this action
injects the icmpv4 packet, it also copies the original IP datagram
following the icmp4 header as per RFC 1122: 3.2.2
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
In order to support OVN specific fields (which are not yet
supported in Open vSwitch to set or modify values) a generic
OVN field support is added in this patch. These OVN fields
get translated to controller actions.
This patch adds only one field for now - icmp4.frag_mtu.
It should be fairly straightforward to add similar fields in the
near future.
Example usage.
action=(icmp4 {"eth.dst <-> eth.src; "
"icmp4.type = 3; /* Destination Unreachable */ "
"icmp4.code = 4; /* Fragmentation Needed */ "
"icmp4.frag_mtu = 1442; "
...
"next; };")
action=(icmp4.frag_mtu = 1500; ..)
The pinctrl module of ovn-controller will set the specified value
in the low-order 16 bits of the ICMP4 header field that is
labelled "unused" in the ICMP specification, as defined in RFC 1191.
An upcoming patch will use it to send an icmp4 packet if the
source IPv4 packet destined to go via an external gateway needs to
be fragmented.
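The header layout described above (type 3, code 4, 16 unused bits, then
the 16-bit next-hop MTU) can be illustrated with a small sketch. The
function names are hypothetical; this only builds the bytes and is not the
pinctrl code.

```python
import struct


def inet_checksum(data):
    """Standard 16-bit one's complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frag_needed(frag_mtu, original_datagram=b""):
    """Build an ICMPv4 'fragmentation needed' message.

    Layout: type=3, code=4, checksum, 16 unused bits (zero), and the
    next-hop MTU in the low-order 16 bits of the formerly unused field.
    """
    header = struct.pack("!BBHHH", 3, 4, 0, 0, frag_mtu)
    csum = inet_checksum(header + original_datagram)
    return struct.pack("!BBHHH", 3, 4, csum, 0, frag_mtu) + original_datagram


pkt = build_frag_needed(1442)
print(pkt.hex())
```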
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch adds a new action 'check_pkt_larger' which checks if the
packet is larger than the given size and stores the result in the
destination register.
Usage: check_pkt_larger(len)->REGISTER
Eg. match=...,actions=check_pkt_larger(1442)->NXM_NX_REG0[0],next;
This patch makes use of the new datapath action - 'check_pkt_len'
which was recently added in the commit [1].
At the start of ovs-vswitchd, datapath is probed for this action.
If the datapath action is present, then 'check_pkt_larger'
makes use of this datapath action.
Datapath action 'check_pkt_len' takes these nlattrs
* OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
* OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER (optional) - Nested actions
to apply if the packet length is greater than the specified 'pkt_len'
* OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL (optional) - Nested
actions to apply if the packet length is less than or equal to the
specified 'pkt_len'.
Let's say we have these flows added to an OVS bridge br-int
table=0, priority=100,in_port=1,ip,actions=check_pkt_larger(100)->NXM_NX_REG0[0],resubmit(,1)
table=1, priority=200,in_port=1,ip,reg0=0x1/0x1 actions=output:3
table=1, priority=100,in_port=1,ip,actions=output:4
Then the action 'check_pkt_larger' will be translated as
- check_pkt_len(size=100,gt(3),le(4))
The datapath will check the packet length and, if it is greater than 100,
output to port 3; otherwise it will output to port 4.
In case the datapath doesn't support the 'check_pkt_len' action, the OVS
action 'check_pkt_larger' sets SLOW_ACTION so that a datapath flow is not
added.
This OVS action is intended to be used by OVN to check the packet length
and generate an ICMP packet with type 3, code 4 and next hop mtu
in the logical router pipeline if the MTU of the physical interface
is less than the packet length. More information can be found here [2].
[1] - 4d5ec89fc8
[2] - https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Ben Pfaff <blp@ovn.org>
CC: Gregory Rose <gvrose8192@gmail.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>