mirror of https://github.com/openvswitch/ovs synced 2025-08-22 01:51:26 +00:00

ipsec: libreswan: Remove old certs before importing new ones.

If started with --no-restart-ike-daemon, ovs-monitor-ipsec doesn't
clear the NSS database.  This is not a problem if the certificates do
not change while the monitor is down, because completely duplicate
entries cannot be added to the NSS database.  However, if the monitor
is stopped, the certificates change on disk, and the monitor is then
started again, it will add the new tunnel certificates alongside the
old ones and will fail to add the new CA certificate.  So, we'll end
up with multiple certificates for the same tunnel and an outdated CA
certificate.  This prevents new connections from being established,
because certificates issued by the new CA cannot be verified:

  # certutil -L -d sql:/var/lib/ipsec/nss

  Certificate Nickname             Trust Attributes
                                   SSL,S/MIME,JAR/XPI

  ovs_certkey_c04c352b             u,u,u
  ovs_cert_cacert                  CT,,
  ovs_certkey_c04c352b             u,u,u
  ovs_certkey_c04c352b             u,u,u
  ovs_certkey_c04c352b             u,u,u
  ovs_certkey_c04c352b             u,u,u
  ovs_certkey_c04c352b             u,u,u
  ovs_certkey_c04c352b             u,u,u

  pluto: "ovn-c04c35-0-out-1" #459: processing decrypted
   IKE_AUTH request containing SK{IDi,CERT,CERTREQ,IDr,AUTH,SA,
   TSi,TSr,N(USE_TRANSPORT_MODE)}
  pluto: "ovn-c04c35-0-out-1" #459: NSS: ERROR:
   IPsec certificate CN=c04c352b,OU=kind,O=ovnkubernetes,C=US invalid:
   SEC_ERROR_UNKNOWN_ISSUER: Peer's Certificate issuer is not recognized.
  pluto: "ovn-c04c35-0-out-1" #459: NSS: end certificate invalid

Fix that by always checking the certificates in the NSS database
before importing a new one.  If they do not match, remove the old one
from the NSS database and add the new one.
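
A minimal sketch of the comparison step, lifted out of the class so it
can run on its own.  The regular expressions are the ones used by
_pem_get_certs() in this change; the PEM payloads below are made-up
base64 strings rather than real certificates, and the variable names
are illustrative only.

```python
import re


def pem_get_certs(pem_text):
    """Return the set of base64 certificate bodies found in 'pem_text',
    with line breaks and any other non-base64 characters stripped."""
    pattern = r"--*BEGIN CERTIFICATE--*(.*?)--*END CERTIFICATE--*"
    matches = re.findall(pattern, pem_text, re.DOTALL)
    return {re.sub(r"[^A-Za-z0-9+/=]", "", body) for body in matches}


# Made-up payloads: 'on_disk' and 'from_nss' differ only in wrapping
# and line endings, 'other' carries a different body.
on_disk = ("-----BEGIN CERTIFICATE-----\n"
           "QUJD\nREVG\n"
           "-----END CERTIFICATE-----\n")
from_nss = ("-----BEGIN CERTIFICATE-----\r\n"
            "QUJDREVG\r\n"
            "-----END CERTIFICATE-----\r\n")
other = ("-----BEGIN CERTIFICATE-----\n"
         "R0hJ\n"
         "-----END CERTIFICATE-----\n")

same = pem_get_certs(on_disk) == pem_get_certs(from_nss)
changed = pem_get_certs(on_disk) != pem_get_certs(other)
```

Comparing sets of normalized bodies means that formatting differences
between the file on disk and the NSS dump do not count as a mismatch.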

We have to call deletion multiple times in order to remove all the
potential duplicates from previous runs.  This will be useful on
upgrade, but may also save us if one of the deletions ever fails for
any reason and we end up with a duplicate entry anyway.
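
The retry loop can be sketched against an in-memory stand-in for the
NSS database.  FakeNss and its methods are hypothetical and exist only
for this illustration; the real code shells out to certutil, and which
duplicate certutil removes first is not something this sketch models.

```python
class FakeNss:
    """Hypothetical stand-in for the NSS database: a list of
    (nickname, certificate body) pairs that allows duplicate names."""

    def __init__(self, entries):
        self.entries = list(entries)

    def get(self, name):
        """Return the set of bodies stored under 'name', or None."""
        found = {body for nick, body in self.entries if nick == name}
        return found or None

    def delete_one(self, name):
        """Remove a single entry called 'name'; return 0 on success."""
        for i, (nick, _) in enumerate(self.entries):
            if nick == name:
                del self.entries[i]
                return 0
        return 1


def check_and_remove_obsolete(nss, wanted, name):
    """Delete entries called 'name' one at a time until the stored set
    equals 'wanted' or nothing is left.  Return True if 'wanted' is in
    the database in the end, False otherwise."""
    while True:
        current = nss.get(name)
        if current is None:
            return False
        if current == wanted:
            return True
        if nss.delete_one(name) != 0:
            # Deletion failed; assume the wanted entry is not there.
            return False


# Two stale duplicates left over from earlier runs (made-up values).
nss = FakeNss([("ovs_certkey_x", "OLD1"), ("ovs_certkey_x", "OLD2")])
present = check_and_remove_obsolete(nss, {"NEW"}, "ovs_certkey_x")
if not present:
    nss.entries.append(("ovs_certkey_x", "NEW"))
present_again = check_and_remove_obsolete(nss, {"NEW"}, "ovs_certkey_x")
```

A single deletion would have left one stale duplicate behind; looping
until the lookup either matches or comes back empty removes them all.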

One alternative might be to always clear the database, even if the
--no-restart-ike-daemon option is set, but there is a chance that
we'll refresh and ask to re-read secrets before we get all the tunnel
information from the database.  That may affect the dataplane.  Even
if this is really not possible, the logic seems too far apart to rely
on.  Also, Libreswan 4.6 seems to have a bug that prevents re-adding
deleted connections if we remove and re-add the same certificate
(newer versions don't have this issue), so it's better not to touch
certificates that didn't actually change if we're not restarting
the IKE daemon.

The clearing may seem redundant now, but it may still be useful to
clean up certificates for tunnels that disappeared while the monitor
was down.  The approach taken in this change doesn't cover this case.

A test is added to check the described scenario.  The 'on_exit'
command is converted to obtain the monitor PID at exit, since we're
now killing one monitor and starting another.
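
The difference between resolving the PID when the cleanup is registered
and resolving it when the cleanup runs can be sketched as follows.
This is only an analogy in Python for the shell quoting change in the
test; the PID values, the temporary pidfile and the helper names are
made up for the illustration.

```python
import os
import tempfile

pidfile = os.path.join(tempfile.mkdtemp(), "monitor.pid")


def write_pid(pid):
    with open(pidfile, "w") as f:
        f.write("%d\n" % pid)


def read_pid():
    with open(pidfile) as f:
        return int(f.read())


write_pid(1111)            # The first monitor starts (made-up PID).

eager_target = read_pid()  # Like "kill `cat pidfile`": resolved now.
lazy_target = read_pid     # Like 'kill $(cat pidfile)': resolved later.

write_pid(2222)            # First monitor killed, second one started.

# What each variant would try to kill at exit time.
stale = eager_target       # 1111, the monitor that is already gone.
fresh = lazy_target()      # 2222, the monitor that is still running.
```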

Fixes: fe5ff26a49f6 ("ovs-monitor-ipsec: Add option to not restart IKE daemon.")
Reported-at: https://issues.redhat.com/browse/FDP-1473
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Ilya Maximets 2025-06-20 20:53:44 +02:00
parent 80d723736b
commit 83de251fa5
2 changed files with 180 additions and 7 deletions

@@ -77,7 +77,7 @@ RECONCILIATION_INTERVAL = 15  # seconds
 TIMEOUT_EXPIRED = 137  # Exit code for a SIGKILL (128 + 9).


-def run_command(args, description=None):
+def run_command(args, description=None, warn_on_failure=True):
     """ This function runs the process args[0] with args[1:] arguments
     and returns a tuple: return-code, stdout, stderr. """
@@ -99,7 +99,7 @@ def run_command(args, description=None):
         proc.kill()
         ret = TIMEOUT_EXPIRED

-    if proc.returncode or perr:
+    if (proc.returncode or perr) and warn_on_failure:
         vlog.warn("Failed to %s; exit code: %d"
                   % (description, proc.returncode))
         vlog.warn("cmdline: %s" % proc.args)
@@ -920,18 +920,85 @@ conn prevent_unencrypted_vxlan
             elif name.startswith(self.CERTKEY_PREFIX):
                 self._nss_delete_cert_and_key(name)

+    def _nss_get_cert(self, name):
+        """Obtains ASCII-formatted (PEM) certificate from the NSS database
+        by name. If multiple certificates have the same name, all will be
+        returned together."""
+        ret, pout, perr = run_command(
+            ['certutil', '-L', '-d', self.IPSEC_D, '-a', '-n', name],
+            "get certificate %s from NSS" % name, warn_on_failure=False)
+        return None if ret else pout
+
+    def _pem_get_certs(self, pem_text):
+        """Returns a set of all base64-encoded certificates in the 'pem_text'.
+        Can be used to unify slightly different formats between different files
+        and NSS database dumps."""
+        pattern = r"--*BEGIN CERTIFICATE--*(.*?)--*END CERTIFICATE--*"
+        matches = re.findall(pattern, pem_text, re.DOTALL)
+        return {re.sub(r'[^A-Za-z0-9+/=]', '', body) for body in matches}
+
+    def _cert_check_and_remove_obsolete(self, cert, name):
+        """Looks up the certificate in the NSS database, removes all the
+        certificates that have the same 'name' but do not match the 'cert'.
+        Returns False if the 'cert' is not in the database in the end,
+        True otherwise."""
+        certs = set()
+        with open(cert) as f:
+            certs = self._pem_get_certs(f.read())
+        while True:
+            current = self._nss_get_cert(name)
+            if current is None:
+                # There are no certificates in the database.
+                return False
+            current_certs = self._pem_get_certs(current)
+            if certs == current_certs:
+                # The certificates are up to date.
+                return True
+            vlog.info(
+                "Mismatched certificate %s in the NSS database, removing."
+                % name
+            )
+            # Delete one certificate and try again.
+            res = 0
+            if name.startswith(self.CERT_PREFIX):
+                res = self._nss_delete_cert(name)
+            else:
+                res = self._nss_delete_cert_and_key(name)
+            if res != 0:
+                # We failed to remove the certificate. Assume that the
+                # one we need is not there.
+                return False
+
     def _nss_import_cert(self, cert, name, cert_type):
         """Cert_type is 'CT,,' for the CA certificate and 'P,P,P' for the
         normal certificate."""
+        if self._cert_check_and_remove_obsolete(cert, name):
+            # Already in the database.
+            return
         run_command(['certutil', '-A', '-a', '-i', cert, '-d', self.IPSEC_D,
                     '-n', name, '-t', cert_type],
                     "import certificate %s into NSS" % name)

     def _nss_delete_cert(self, name):
-        run_command(['certutil', '-D', '-d', self.IPSEC_D, '-n', name],
-                    "delete certificate %s from NSS" % name)
+        return run_command(['certutil', '-D', '-d', self.IPSEC_D, '-n', name],
+                           "delete certificate %s from NSS" % name)[0]

     def _nss_import_cert_and_key(self, cert, key, name):
+        if self._cert_check_and_remove_obsolete(cert, name):
+            # Already in the database. And if the cert is the same, the
+            # key must be the same as well.
+            return
         # Avoid deleting other files
         path = os.path.abspath('/tmp/%s.p12' % name)
         if not path.startswith('/tmp/'):
@@ -952,8 +1019,9 @@ conn prevent_unencrypted_vxlan
     def _nss_delete_cert_and_key(self, name):
         # Delete certificate and private key
-        run_command(['certutil', '-F', '-d', self.IPSEC_D, '-n', name],
-                    "delete certificate and private key for %s" % name)
+        return run_command(
+            ['certutil', '-F', '-d', self.IPSEC_D, '-n', name],
+            "delete certificate and private key for %s" % name)[0]


 class IPsecTunnel(object):
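
The effect of the new warn_on_failure parameter can be sketched outside
of the daemon.  This is an approximation under stated assumptions, not
the script's actual run_command(): it uses subprocess.run() and the
standard logging module instead of vlog, omits the timeout handling,
and the ListHandler class and the always-failing child process are
hypothetical stand-ins.

```python
import logging
import subprocess
import sys


class ListHandler(logging.Handler):
    """Collects formatted log messages so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log = logging.getLogger("run_command_sketch")
log.propagate = False
handler = ListHandler()
log.addHandler(handler)


def run_command(args, description=None, warn_on_failure=True):
    """Run args[0] with args[1:]; return (return-code, stdout, stderr)."""
    proc = subprocess.run(args, capture_output=True, text=True)
    if (proc.returncode or proc.stderr) and warn_on_failure:
        log.warning("Failed to %s; exit code: %d",
                    description, proc.returncode)
    return proc.returncode, proc.stdout, proc.stderr


# A child process that always exits with status 3 stands in for a
# lookup that fails, e.g. because the nickname is not in the database.
probe = [sys.executable, "-c", "import sys; sys.exit(3)"]

quiet_ret = run_command(probe, "probe quietly", warn_on_failure=False)[0]
quiet_warnings = len(handler.messages)

loud_ret = run_command(probe, "probe loudly")[0]
loud_warnings = len(handler.messages)
```

Callers such as _nss_get_cert() treat a non-zero return code as a
normal "not found" answer, which is why they pass
warn_on_failure=False and only look at the returned tuple.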

@@ -84,7 +84,7 @@ m4_define([IPSEC_ADD_NODE],
                 --ipsec-ctl=$ovs_base/$1/pluto.ctl \
                 m4_if([$6], [], [], [$6]) \
                 --no-restart-ike-daemon --detach ], [0], [], [stderr])
-    on_exit "kill `cat $ovs_base/$1/ovs-monitor-ipsec.pid`"
+    on_exit 'kill $(cat $ovs_base/$1/ovs-monitor-ipsec.pid)'

     dnl Set up OVS bridge
     NS_CHECK_EXEC([$1],
@@ -753,6 +753,111 @@ AT_CHECK([head -n $(grep -c ':' left/sa.before) left/sa.after \
 OVS_TRAFFIC_VSWITCHD_STOP()
 AT_CLEANUP

+AT_SETUP([IPsec -- Libreswan - certificate update while the daemon is down])
+AT_KEYWORDS([ipsec libreswan geneve ca-cert])
+dnl Note: Geneve test may not work on older kernels due to CVE-2020-25645
+dnl https://bugzilla.redhat.com/show_bug.cgi?id=1883988
+CHECK_LIBRESWAN()
+OVS_TRAFFIC_VSWITCHD_START()
+IPSEC_SETUP_UNDERLAY()
+
+dnl Set up dummy hosts.
+IPSEC_ADD_NODE_LEFT(10.1.1.1, 10.1.1.2)
+IPSEC_ADD_NODE_RIGHT(10.1.1.2, 10.1.1.1)
+
+dnl Create and set ca-signed certs.
+AT_CHECK([ovs-pki -b --dir=${ovs_base} -l ${ovs_base}/ovs-pki.log \
+            --force init], [0], [ignore], [ignore])
+AT_CHECK([ovs-pki -b --dir=${ovs_base} -l ${ovs_base}/ovs-pki.log \
+            req+sign -u left], [0], [ignore], [ignore])
+AT_CHECK([ovs-pki -b --dir=${ovs_base} -l ${ovs_base}/ovs-pki.log \
+            req+sign -u right], [0], [ignore], [ignore])
+OVS_VSCTL_LEFT(set Open_vSwitch . \
+    other_config:ca_cert=${ovs_base}/switchca/cacert.pem \
+    other_config:certificate=${ovs_base}/left-cert.pem \
+    other_config:private_key=${ovs_base}/left-privkey.pem)
+OVS_VSCTL_RIGHT(set Open_vSwitch . \
+    other_config:ca_cert=${ovs_base}/switchca/cacert.pem \
+    other_config:certificate=${ovs_base}/right-cert.pem \
+    other_config:private_key=${ovs_base}/right-privkey.pem)
+
+dnl Set up IPsec tunnel on 'left' host.
+IPSEC_ADD_TUNNEL_LEFT([geneve],
+                      [options:remote_ip=10.1.1.2 options:remote_name=right])
+
+dnl Set up IPsec tunnel on 'right' host.
+IPSEC_ADD_TUNNEL_RIGHT([geneve],
+                       [options:remote_ip=10.1.1.1 options:remote_name=left])
+CHECK_ESP_TRAFFIC
+
+dnl Get the numbers of all the current SAs.
+IPSEC_SA_LIST([left], [left/sa.before])
+IPSEC_SA_LIST([right], [right/sa.before])
+
+dnl Kill the ovs-monitor-ipsec on the left host.
+AT_CHECK([kill $(cat ${ovs_base}/left/ovs-monitor-ipsec.pid)])
+OVS_WAIT_WHILE([kill -0 $(cat ${ovs_base}/left/ovs-monitor-ipsec.pid)])
+
+dnl Re-generate the certificates.
+AT_CHECK([rm -rf ${ovs_base}/switchca *.pem])
+AT_CHECK([ovs-pki -b --dir=${ovs_base} -l ${ovs_base}/ovs-pki.log \
+            --force init], [0], [ignore], [ignore])
+AT_CHECK([ovs-pki -b --dir=${ovs_base} -l ${ovs_base}/ovs-pki.log \
+            req+sign -u left], [0], [ignore], [ignore])
+AT_CHECK([ovs-pki -b --dir=${ovs_base} -l ${ovs_base}/ovs-pki.log \
+            req+sign -u right], [0], [ignore], [ignore])
+
+dnl Re-start ovs-monitor-ipsec with --no-restart-ike-daemon.
+NS_CHECK_EXEC([left], [ovs-monitor-ipsec unix:${OVS_RUNDIR}/left/db.sock \
+    --pidfile=${OVS_RUNDIR}/left/ovs-monitor-ipsec.pid \
+    --ike-daemon=libreswan \
+    --ipsec-conf=$ovs_base/left/ipsec.conf \
+    --ipsec-d=$ovs_base/left/ipsec.d \
+    --ipsec-secrets=$ovs_base/left/secrets \
+    --log-file=$ovs_base/left/ovs-monitor-ipsec-2.log \
+    --ipsec-ctl=$ovs_base/left/pluto.ctl \
+    --no-restart-ike-daemon --detach ], [0], [], [stderr])
+OVS_WAIT_UNTIL([grep -q 'Connections for all(1) configured tunnels are Up.' \
+                    $ovs_base/left/ovs-monitor-ipsec-2.log])
+
+dnl Check that the original left-right tunnel still works.
+NS_CHECK_EXEC([left], [ping -q -c 3 -i 0.1 -W 2 192.0.0.2 | FORMAT_PING], [0],
+[3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+NS_CHECK_EXEC([right], [ping -q -c 3 -i 0.1 -W 2 192.0.0.1 | FORMAT_PING], [0],
+[3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+dnl Check that ovs-monitor-ipsec didn't touch the original tunnel.
+IPSEC_SA_LIST([left], [left/sa.after])
+IPSEC_SA_LIST([right], [right/sa.after])
+AT_CHECK([diff -u left/sa.before left/sa.after])
+AT_CHECK([diff -u right/sa.before right/sa.after])
+
+dnl Check that there are no extra certs in the NSS database.
+AT_CHECK([certutil -L -d $ovs_base/left/ipsec.d | grep -c ovs_cert], [0], [2
+])
+
+dnl Check that loaded certs are the new ones. Using openssl for comparison
+dnl to ensure the same format (the file on disk and the dump from the NSS
+dnl database have slightly different formats).
+AT_CHECK([certutil -L -d $ovs_base/left/ipsec.d \
+            -a -n ovs_cert_cacert], [0], [stdout])
+AT_CHECK_UNQUOTED([openssl x509 -in stdout -text -noout], [0],
+[$(openssl x509 -in switchca/cacert.pem -text -noout)
+])
+AT_CHECK([certutil -L -d $ovs_base/left/ipsec.d \
+            -a -n ovs_certkey_left], [0], [stdout])
+AT_CHECK_UNQUOTED([openssl x509 -in stdout -text -noout], [0],
+[$(openssl x509 -in left-cert.pem -text -noout)
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP()
+AT_CLEANUP
+
 AT_SETUP([IPsec -- Libreswan NxN geneve tunnels + reconciliation])
 AT_KEYWORDS([ipsec libreswan scale reconciliation])
 dnl Note: Geneve test may not work on older kernels due to CVE-2020-25645