Debugging Certificate Errors

February 18th, 2022

It is a truth universally acknowledged, that any developer accessing a web service must be in want of using "curl -k".
-- Jane "DevOps" Austen

But why? Ok, sure: to get rid of the annoying cert error. But putting aside that not all errors here can even be ignored or "resolved" by passing -k, let us pretend that we actually cared and wanted to find out what the problem is. Some of the errors encountered when accessing an HTTPS service using e.g., curl(1) are self-explanatory, but for many others we are often left trying to determine just why our tool can't validate the remote certificate.

Here are a few debugging techniques that may help you better make sense of the errors:

Cert expired

Perhaps the most common error you will encounter is that the certificate has expired:

$ curl https://expired.badssl.com/
curl: (60) SSL certificate problem: certificate has expired
More details here: https://curl.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
$

Unfortunately, this error message doesn't tell you when the certificate expired. For that, you need to look at the actual certificate:

$ </dev/null openssl s_client -connect expired.badssl.com:443 2>/dev/null | \
        openssl x509 -noout -dates
notBefore=Apr  9 00:00:00 2015 GMT
notAfter=Apr 12 23:59:59 2015 GMT

Remember that these dates are evaluated against your local clock: it's possible to encounter a certificate that is actually still valid (i.e., has a notAfter date that is objectively in the future), but is rejected because a clock, NTP, or configuration error has your computer thinking it's next week already.

Likewise, although you will probably encounter this much less frequently, the notBefore date may (appear to) be in the future, in which case the error from curl(1) would be "certificate is not yet valid" instead of "certificate has expired".

But you'll soon get tired of typing the lengthy openssl(1) command above, so I'd recommend creating a simple shell function instead:

$ cat >>~/.bashrc <<"EOF"

sitecert() {
        local p=443

        if [ -n "$2" ]; then
                p=$2;
        fi
        leafcert $1 $p | openssl x509 -text -noout
}

leafcert() {
        local p=443
        local chain=""

        if [ x"${1}" = x"full" ]; then
                chain="-showcerts"
                shift
        fi

        if [ -n "$2" ]; then
                p=$2;
        fi
        </dev/null openssl s_client ${chain} -connect $1:$p 2>/dev/null | \
                awk '/-----BEGIN/,/-----END/ { print }'
}

fullchain() {
        local p=443

        if [ -n "$2" ]; then
                p=$2;
        fi
        leafcert full $1 $p
}

EOF
$ . ~/.bashrc
$ sitecert expired.badssl.com | grep "Not "
            Not Before: Apr  9 00:00:00 2015 GMT
            Not After : Apr 12 23:59:59 2015 GMT
$
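
Building on leafcert, here's a minimal sketch of a scripted expiry check, e.g., for a cron job. It relies on the -checkend option of openssl x509; the function name expires_within and the 30-day default are my own choices for illustration, and, as noted above, the answer depends on your local clock:

```shell
# Hypothetical helper: succeed if the PEM certificate on stdin expires
# within the given number of days (default: 30), judged by the local clock.
# Note that unparseable input is also reported as "expiring".
expires_within() {
        local days="${1:-30}"

        # -checkend N exits 0 if the cert is still valid N seconds from
        # now and non-zero otherwise, so we invert the result.
        if openssl x509 -noout -checkend "$((days * 86400))" >/dev/null 2>&1; then
                return 1
        fi
        return 0
}
```

With this in place, "leafcert expired.badssl.com | expires_within 0" succeeds for a certificate that has already expired, while "leafcert www.example.com | expires_within 14" only succeeds if the cert has less than two weeks left.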

Wrong name

Perhaps the second most widely encountered error: the name on the cert doesn't match the name by which you tried to access the service.

$ curl https://wrong.host.badssl.com
curl: (60) SSL: no alternative certificate subject name matches target host name 'wrong.host.badssl.com'
More details here: https://curl.se/docs/sslcerts.html
$
$ sitecert wrong.host.badssl.com | grep badssl
        Subject: C=US, ST=California, L=San Francisco, O=BadSSL Fallback. Unknown subdomain or no SNI., CN=badssl-fallback-unknown-subdomain-or-no-sni
                DNS:badssl-fallback-unknown-subdomain-or-no-sni
$

For this, remember that there are multiple names that can come into play:

the Subject Common Name or CN, found in the cert
the Subject Alternative Names or SANs, found in the cert
the Server Name Indication or SNI, from the TLS handshake
the hostname used to access the service
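
To see the names a certificate actually carries without scanning the full text dump, something like the following might help. It's only a sketch: it assumes the usual openssl x509 -text layout, where the SAN values sit on the line right after the "X509v3 Subject Alternative Name" header, and san_names is a made-up name:

```shell
# Hypothetical helper: print the DNS names from the SAN extension of the
# PEM certificate on stdin, one per line.
san_names() {
        openssl x509 -noout -text | \
                awk '/X509v3 Subject Alternative Name/ { getline; print }' | \
                tr ',' '\n' | sed -n 's/^ *DNS://p'
}
```

Going by the output above, "leafcert wrong.host.badssl.com | san_names" should print the single fallback name, which makes the mismatch with the hostname in the URL easy to spot.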

Wildcards

A common mistake you will encounter is the incorrect use of wildcards, whereby a service operator will assume that a wildcard cert for *.example.com would be valid for foo.bar.example.com.

It is not; wildcards can only match a single label. Per RFC 2818, a wildcard can match a label fragment (*o.example.com or f*.example.com would match foo.example.com, but not oof.example.com), or appear in a label other than the left-most one (foo.*.example.com matches foo.bar.example.com), but RFC 6125 discourages the latter. Different CAs may have additional restrictions, and most browsers prohibit fragment wildcards, so in practice you will almost always see a wildcard only as the complete left-most label.
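
To make the common case concrete, here is a small sketch of the matching rule as most clients apply it: a wildcard is only honored as the complete left-most label and stands for exactly one label. This is my own simplification (the name wildcard_matches is made up), and it deliberately ignores fragment wildcards, IDNs, and any public-suffix restrictions:

```shell
# Hypothetical helper: succeed if the certificate name PATTERN (e.g.
# "*.example.com") covers HOST. Only a wildcard that is the complete
# left-most label is honored; everything else must match exactly.
wildcard_matches() {
        local pattern host suffix label

        # Hostnames compare case-insensitively.
        pattern=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
        host=$(printf '%s' "$2" | tr 'A-Z' 'a-z')

        case "$pattern" in
        \*.*)   ;;                              # wildcard as left-most label
        *)      [ "$pattern" = "$host" ]        # no wildcard: exact match
                return ;;
        esac

        suffix="${pattern#\*}"                  # e.g. ".example.com"
        label="${host%%.*}"                     # left-most label of the host
        [ -n "$label" ] && [ "${host#"$label"}" = "$suffix" ]
}
```

For example, "wildcard_matches '*.example.com' foo.example.com" succeeds, while the same pattern fails for both foo.bar.example.com and the bare example.com.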

CNAMEs

The discrepancy between the name used to access the service and the name on the certificate can be the result of e.g., using a CNAME:

$ host cname.example.com
cname.example.com is an alias for www.example.com
www.example.com has IPv6 address 2001:db8::c2ff:5444:a856:2d24
$ curl https://cname.example.com
curl: (60) SSL: no alternative certificate subject name matches target host name 'cname.example.com'
More details here: https://curl.se/docs/sslcerts.html

Although cname.example.com points to www.example.com, the TLS SNI (RFC 3546, since superseded by RFC 6066) will be set to cname.example.com, a name that is not included in the certificate this server uses. You can usually work around this through clever use of e.g., curl(1)'s --connect-to flag:

$ curl --connect-to www.example.com:443:cname.example.com:443 https://www.example.com

The trick here is to remember to use the name that you want the SNI to be set to in the hostname given in the URL (www.example.com), and then trick curl(1) into connecting to cname.example.com instead. (See below for another possible option using the --resolve flag.)

On the TLS layer, this is a bit more intuitively accomplished by using openssl s_client -servername.
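
As a sketch of what that might look like, in the style of the leafcert function from earlier: the function name snicert is made up, and the OPENSSL variable is my own addition so you can point it at a different openssl binary if you have several installed.

```shell
# Hypothetical helper: fetch the leaf certificate from HOST while sending
# SNI as the TLS server name; PORT defaults to 443.
# Usage: snicert HOST SNI [PORT]
snicert() {
        local host="$1" sni="$2" port="${3:-443}"

        # -connect decides where the TCP connection goes, -servername
        # decides which name is sent in the TLS handshake.
        </dev/null ${OPENSSL:-openssl} s_client -connect "${host}:${port}" \
                -servername "${sni}" 2>/dev/null | \
                awk '/-----BEGIN/,/-----END/ { print }'
}
```

For the CNAME example above, "snicert cname.example.com www.example.com | openssl x509 -noout -subject" connects to the alias but asks for the certificate of www.example.com.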

Wrong IP address

Suppose you want to verify the certificate for a service endpoint against a specific server whose IP address differs from the one the service name resolves to. This is a common use case when you want to verify that a load-balanced origin server has the right cert. To illustrate this, let's assume http1.example.com is a member of a larger rotation some.rotation.example.com, and I want to explicitly hit that origin. To do that, I can use either the --connect-to flag from above, or instruct curl(1) to directly resolve the given name as I'd like, without having to monkey around with /etc/hosts (which, I promise, you would regret):

$ host www.example.com
www.example.com is an alias for some.rotation.example.com
some.rotation.example.com has IPv6 address 2001:db8::6651:c98f:4e48:edf8
$ host http1.example.com
http1.example.com has IPv6 address 2001:db8::896a:f5a7:2828:f7cc
$ curl --connect-to www.example.com:443:http1.example.com:443 https://www.example.com

or

$ curl --resolve www.example.com:443:[2001:db8::896a:f5a7:2828:f7cc] https://www.example.com
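
If you need to repeat this for every member of a rotation, typing out the host:port:address triplets gets old quickly. Here's a tiny sketch that builds the argument for you; resolve_arg is a made-up name, and it simply assumes that anything containing a colon is an IPv6 address that should be bracketed as in the example above:

```shell
# Hypothetical helper: print a --resolve argument for curl(1).
# Usage: resolve_arg HOST ADDRESS [PORT]
resolve_arg() {
        local host="$1" addr="$2" port="${3:-443}"

        case "$addr" in
        *:*)    addr="[${addr}]" ;;     # contains a colon: treat as IPv6
        esac
        printf '%s:%s:%s\n' "$host" "$port" "$addr"
}
```

You'd then call curl(1) along the lines of: curl -s -I --resolve "$(resolve_arg www.example.com 2001:db8::896a:f5a7:2828:f7cc)" https://www.example.com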

Incomplete chain

Another common server misconfiguration occurs when the service provider only offers the leaf certificate instead of also including the intermediate certificate(s). Some clients will try to construct an alternate chain and not complain if they are successful, but in the end, the server needs to include the full chain minus the root certificates.

The error here is not 100% explicit about the issue being the missing intermediate certificate, however:

$ curl https://incomplete-chain.badssl.com
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.se/docs/sslcerts.html

$ </dev/null openssl s_client -connect incomplete-chain.badssl.com:443 >/dev/null
depth=0 C = US, ST = California, L = Walnut Creek, O = Lucas Garron Torres, CN = *.badssl.com
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 C = US, ST = California, L = Walnut Creek, O = Lucas Garron Torres, CN = *.badssl.com
verify error:num=21:unable to verify the first certificate
verify return:1
DONE

"unable to get local issuer certificate" is the relevant supplemental information to "unable to verify the first certificate". The first certificate is the leaf provided by the service; because no intermediate certificate is provided, the client attempts to verify it using only the certificates from the local trust bundle, and fails.

To see what the certificate chain provided by the server is, use the -showcerts option to s_client(1):

$ </dev/null openssl s_client -connect incomplete-chain.badssl.com:443 -showcerts 2>/dev/null
CONNECTED(00000007)
---
Certificate chain
 0 s:C = US, ST = California, L = Walnut Creek, O = Lucas Garron Torres, CN = *.badssl.com
   i:C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA
-----BEGIN CERTIFICATE-----
MIIGqDCCBZCgAwIBAgIQCvBs2jemC2QTQvCh6x1Z/TANBgkqhkiG9w0BAQsFADBN
[...]
CC01zojqS10nGowxzOiqyB4m6wytmzf0QwjpMw==
-----END CERTIFICATE-----
---  
Server certificate
subject=C = US, ST = California, L = Walnut Creek, O = Lucas Garron Torres, CN = *.badssl.com

issuer=C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA

Note that there is only a single certificate, a leaf certificate issued by DigiCert SHA2 Secure Server CA, but this intermediate certificate is not found in the local trust bundle. Nor should it be: the server ought to serve it as part of its chain!

Now compare this to a site with an intact certificate chain:

CONNECTED(00000007)
---
Certificate chain
 0 s:CN = netmeister.org
   i:C = US, O = Let's Encrypt, CN = R3
-----BEGIN CERTIFICATE-----
MIIFeTCCBGGgAwIBAgISA5eh8x6ZtEfv/vg8A6VzsgrbMA0GCSqGSIb3DQEBCwUA
[...]
lbjtFHTyfcyoA/v+iw==
-----END CERTIFICATE-----
 1 s:C = US, O = Let's Encrypt, CN = R3
   i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
-----BEGIN CERTIFICATE-----
MIIFFjCCAv6gAwIBAgIRAJErCErPDBinU/bWLiWnX1owDQYJKoZIhvcNAQELBQAw
[...]
nLRbwHOoq7hHwg==
-----END CERTIFICATE-----
 2 s:C = US, O = Internet Security Research Group, CN = ISRG Root X1
   i:O = Digital Signature Trust Co., CN = DST Root CA X3
-----BEGIN CERTIFICATE-----
MIIFYDCCBEigAwIBAgIQQAF3ITfU6UK47naqPGQKtzANBgkqhkiG9w0BAQsFADA/
[...]
-----END CERTIFICATE-----
---
Server certificate
subject=CN = netmeister.org

issuer=C = US, O = Let's Encrypt, CN = R3

Note that this server includes three certificates: the leaf certificate, an intermediate certificate, and, apparently, a root certificate. Normally, there is no point in including any root certificates, since the client only uses root certificates from its trust bundle. But since Let's Encrypt had their ISRG Root X1 cross-signed for bootstrapping purposes, some clients might not have this certificate in their trust bundle, which effectively makes it an intermediate cert.

(Note: the DST Root CA X3 expired on September 30th, 2021, and the ISRG Root X1 certificate is now widely included in clients' trust stores, so including it in the chain is no longer necessary. I've included it in the example above for illustrative purposes only.)
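
A quick first check for an incomplete chain is to simply count how many certificates the server sends. A minimal sketch (chain_length is a made-up name; it works on anything that contains PEM blobs, including the output of the fullchain function from earlier):

```shell
# Hypothetical helper: count the certificates in the PEM data on stdin,
# e.g. the output of "openssl s_client -showcerts".
chain_length() {
        grep -c -e '-----BEGIN CERTIFICATE-----'
}
```

Going by the two examples above, "fullchain incomplete-chain.badssl.com | chain_length" should report a single certificate, while the intact chain yields three. A count of one doesn't prove the chain is broken (the leaf could be issued directly by a root), but it's a strong hint.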

Unknown root

If the server provides a valid and complete chain, you may run into the problem that it all chains up to a root that you simply do not have in your local trust bundle. Reasons for this may include:

you are using a restricted trust bundle explicitly tuned to only include CA certificates that you expect to see
the certificate was signed by a new public CA cert that is not (yet) included in the client's trust bundle
you are talking to a middlebox, but the middlebox's CA cert is not in your trust store
you are actively being MitM'd

The error you'd see here is, unfortunately, the exact same error as the one for an incomplete chain ("unable to get local issuer certificate"), as again the ultimate problem remains that the client cannot construct a path from the leaf to a trusted root.

Finding your trust bundle

But how do you know whether you have the root cert in your trust bundle? For starters, you have to figure out what trust bundle your app currently uses. That, it turns out, isn't as straightforward as you might think: your browser may use its own trust bundle, which may be different from your OS trust bundle, which may be different from the trust bundle used by the library with which e.g., curl(1) was linked; a static Go executable may use a different bundle than, say, the Python script you use on the same system. Yes, it's a mess.

A few common locations, depending on your OS, include:

/etc/openssl/certs/ca-bundle.crt
/etc/openssl/cert.pem
/etc/ssl/certs/ca-bundle.crt
/etc/ssl/cert.pem
/etc/pki/tls/cert.pem
/etc/pki/tls/certs/ca-bundle.crt
/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
any of the above, but under /usr/local or /opt somewhere
whatever a certain environment variable (e.g., CURL_CA_BUNDLE) points to
whatever a certain inconsistent command-line option (e.g., --cacert (curl(1)), --ca-certificate (wget(1)), -CAfile (s_client(1))) specifies
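
Rather than checking these by hand, you could loop over the list. The following is a rough sketch under the assumption that the list above is what you care about; find_ca_bundles is a made-up name, and the optional argument is a directory prefix such as /usr/local or /opt/something:

```shell
# Hypothetical helper: print which of the usual trust bundle locations
# exist and are readable, optionally below a directory prefix.
# Usage: find_ca_bundles [PREFIX]
find_ca_bundles() {
        local prefix="${1:-}" f

        for f in \
            /etc/openssl/certs/ca-bundle.crt \
            /etc/openssl/cert.pem \
            /etc/ssl/certs/ca-bundle.crt \
            /etc/ssl/cert.pem \
            /etc/pki/tls/cert.pem \
            /etc/pki/tls/certs/ca-bundle.crt \
            /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
        do
                if [ -r "${prefix}${f}" ]; then
                        echo "${prefix}${f}"
                fi
        done
}
```

This only tells you which files exist, not which one a given tool actually reads. For OpenSSL itself, "openssl version -d" prints the directory it was built to look in, which is a useful second data point.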

Some tools may actually show you which bundle they use (curl -v does, although there was a bug in some versions leading to that information not being logged):

$ curl -s -v -I https://www.netmeister.org 2>&1 | more
*   Trying 2001:470:30:84:e276:63ff:fe72:3900:443...
* Connected to www.netmeister.org (2001:470:30:84:e276:63ff:fe72:3900) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /usr/pkg/etc/openssl/cert.pem
*  CApath: /usr/pkg/etc/openssl/certs

...while others leave you guessing. In that case, your best bet is to break out ktrace(1)/strace(1)/dtruss(1), which is that one weird SysAdmin trick to check (for example) exactly what files an application actually opens:

macos$ sudo dtruss curl -s -I -v https://www.netmeister.org 2>&1 | tee /tmp/out
macos$ grep cert.pem /tmp/out
open_nocancel("/usr/local/etc/[email protected]/cert.pem\0", 0x0, 0x0)                 = 6 0

linux$ strace -e trace=open,openat curl -s -I https://www.netmeister.org 2>&1 >/dev/null | grep cert
open("/etc/pki/nssdb/cert9.db", O_RDONLY|O_CLOEXEC) = 4
open("/etc/pki/tls/certs/ca-bundle.crt", O_RDONLY) = 7

netbsd$ ktruss curl -s -I https://www.netmeister.org | grep cert 
  8526      1 curl open("/usr/pkg/etc/openssl/cert.pem", 0, 0x1b6) = 7
  8526      1 curl __stat50("/usr/pkg/etc/openssl/certs/8d33f237.0", 0x7f7fffd738a0) Err#2 ENOENT

(Remember that in order to use dtruss(1) on macOS with System Integrity Protection, you need to reboot into recovery mode and disable SIP. \o/)

Peeking into your trust bundle

Ok, now at least we know which trust bundle we have to look at. But of course trust bundles come in different flavors, too: not all of them include human-readable text next to each certificate, and some contain nothing but a whole bunch of PEM blobs. How do you check whether this giant base64 dump contains the root cert you're looking for?

Trying to copy and paste each individual cert into openssl x509 is painful, but fortunately I have just the tool for you: xpipe(1)!

$ root=$( </dev/null openssl s_client -connect www.netmeister.org:443 2>/dev/null | \
        awk -F= '/i:/ { print $NF }' | tail -1)
$ </usr/local/etc/[email protected]/cert.pem xpipe -p '^-----END CERTIFICATE-----$' \
        openssl x509 -noout -subject | grep "${root}"
subject=C = US, O = Internet Security Research Group, CN = ISRG Root X1
$

In this example, we pulled the CN from the openssl s_client command, then processed each PEM blob by having xpipe(1) feed it into openssl x509 and confirmed the presence by looking for the previously extracted CN.
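
If you don't have xpipe(1) handy, a roughly equivalent sketch using only openssl(1) is to wrap the bundle into a PKCS#7 structure and have openssl print every certificate in it. The function name bundle_subjects is made up, and the exact formatting of the subject lines varies between OpenSSL versions:

```shell
# Hypothetical helper: print the subject of every certificate in the
# PEM bundle given as the first argument.
# Usage: bundle_subjects /path/to/cert.pem
bundle_subjects() {
        openssl crl2pkcs7 -nocrl -certfile "$1" | \
                openssl pkcs7 -print_certs -noout | grep '^subject='
}
```

Pipe the result into grep "${root}" as before to check for the root you're after.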

When you're actually being MitM'd...

One of the reasons why your tool might not have the root in question in your trust bundle is that you are, in fact, being MitM'd. Probably not by a nation state attacker specifically targeting you personally (although, who knows), but more likely by some sort of middlebox.

Oftentimes the name of the middlebox CA makes that obvious, but perhaps you want to verify and check just what the certificate actually used by the server is. Tricky problem, though, since you are being MitM'd - you'd need to check from another device on another network.

Or... you could use whatsthatcert, a very simple service I set up specifically for this purpose: it fetches the cert from the site, thereby telling you what my server sees. I have a few simple shell functions to let me check and compare my local view versus what whatsthatcert sees:

$ cat >>~/.bashrc <<"EOF"
x509fp() {
        local p=443

        if [ -n "$2" ]; then
                p=$2;
        fi
        </dev/null openssl s_client -connect $1:$p 2>/dev/null | \
                openssl x509 -fingerprint -noout
}

whatsthatcert() {
        curl -s "https://www.netmeister.org/whatsthatcert/?h=$1&$2"
}

am-i-being-mitmd() {
        if ! diff <(x509fp $1) <(whatsthatcert $1 out=fp) ; then
                echo "Yep, looks like we're being MitM'd."
        else
                echo "Nope, looks legit."
        fi
}
EOF
$ . ~/.bashrc
$ whatsthatcert badssl.com
-----BEGIN CERTIFICATE-----
MIIGqDCCBZCgAwIBAgIQCvBs2jemC2QTQvCh6x1Z/TANBgkqhkiG9w0BAQsFADBN
[...]
$ am-i-being-mitmd badssl.com
Nope, looks legit.
$ am-i-being-mitmd www.yahoo.com
1c1
< SHA1 Fingerprint=F7:27:7C:0C:BF:D4:53:F4:F9:A3:AF:F2:31:32:ED:88:03:0B:D7:E6
---
> SHA1 Fingerprint=69:F9:48:E4:6D:B5:F8:AE:04:B2:F6:C4:15:77:49:86:D3:1B:25:33
Yep, looks like we're being MitM'd.
$

Other Failures

There are many -- many -- other possible failure modes when talking HTTPS; badssl.com provides many very helpful examples. Some of these are protocol specific, others TLS cipher or key-exchange specific, and many are unique to the browser ecosystem, where a number of requirements are encoded in the CA/B Forum guidelines and enforced in the browsers, but perhaps not normally checked in many other tools.

One of those is, notoriously, a check for certificate revocation. While browsers may automatically perform this lookup against either the certificate revocation list (CRL) or the stapled OCSP response, command-line clients and library functions by and large either do not perform this check at all, or do so only if explicitly configured:

No error despite revoked certificate:
$ curl -s -I https://revoked.badssl.com | grep HTTP
HTTP/1.1 200 OK


To check the CRL, first fetch it, then convert it to PEM, then pass it to curl(1):
$ sitecert revoked.badssl.com 2>/dev/null | grep -i crl | tail -1
                  URI:http://crl4.digicert.com/RapidSSLTLSDVRSAMixedSHA2562020CA-1.crl
$ curl -s http://crl4.digicert.com/RapidSSLTLSDVRSAMixedSHA2562020CA-1.crl | \
        openssl crl -outform PEM -inform DER >/tmp/crl
$ curl -v --crlfile /tmp/crl -s -I https://revoked.badssl.com
*   Trying 104.154.89.105:443...
[...]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, certificate revoked (556):
* SSL certificate problem: certificate revoked
* Closing connection 0
$
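
The "grep -i crl | tail -1" step above is a bit hit-and-miss, so here's a sketch of a slightly more targeted extraction. It assumes the standard openssl x509 -text layout, in which extension headers are indented by 12 spaces and their contents by more; crl_urls is a made-up name:

```shell
# Hypothetical helper: print the CRL distribution point URLs found in
# "openssl x509 -text" output on stdin.
crl_urls() {
        awk '
        /X509v3 CRL Distribution Points/ { cdp = 1; next }
        /^            [^ ]/              { cdp = 0 }   # next extension header
        cdp && sub(/^ *URI:/, "")        { print }
        '
}
```

With the sitecert function from earlier, "sitecert revoked.badssl.com | crl_urls" gives you the URL(s) to feed into the fetch-and-convert step.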

Summary

Having to debug certificate errors is a constant task for any systems engineer, and knowing what error conditions can be diagnosed in what ways is an important skill.

As hinted at above, many problems go well beyond what I tried to cover here, and may only manifest when using a browser; others are specific to e.g., the version of the TLS library in use. SSL Labs is, of course, an excellent and thorough way to check for compatibility issues, as is Hardenize, which performs an even more comprehensive check on multiple services offered on the given domain.

In a nutshell, however, make sure that you understand and practice the use of the openssl s_client and openssl x509 commands as well as the many options and flags for your preferred tool.

curl -k may be the easy way out, but it rarely is the answer you're really looking for.
