The impact of SSL certificate revocation on web performance
Establishing a connection to a website is a complicated process. First the browser must turn the domain name into an IP address (DNS lookup), then it must negotiate a connection with the server via Transmission Control Protocol (TCP). Lastly, if the site is being served over HTTPS (which around 83% of web requests are), the browser needs to negotiate an encrypted connection with the server via Transport Layer Security (TLS). This last TLS stage is where digital certificates are involved, and it is what the rest of this blog post will focus on.
If you’d like more information on the transport stage of the connection process, check out the High Performance Browser Networking chapters here and here.
Types of certificate validation levels
There are three validation levels that are used with digital certificates, each with their own criteria to be met:
To obtain a Domain Validation (DV) certificate you must prove you have control over one of the following criteria related to a domain: the whois records, DNS records file, email or web hosting account.
An Organisation Validation (OV) class certificate will be issued if you can meet two criteria related to a domain: one of the rights to administer a domain (as listed above with DV certificates), and an organisation’s existence as a legal entity.
The Extended Validation (EV) class of verification is the most comprehensive of the levels. To obtain one you must prove you have the right to administer a domain (as with DV and OV certificates) and the organisation’s existence as a legal entity (as with OV certificates), and there will also be verification checks by a verifying agent from the CA.
An important point to note about all three certificate types above is that, in terms of technology, they are exactly the same. They are all X.509 public key certificates. The only difference is what is contained within the certificates. As you might expect, EV certificates will contain more information and have different properties set to distinguish them from DV/OV certificates. For example, EV certs have a different subject which identifies a legal entity rather than a domain.
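To make the difference in the subject concrete, here is a minimal sketch in Python. It assumes certificate data shaped like the dictionary returned by the standard library’s ssl.SSLSocket.getpeercert(); the two sample certificates and the helper names are hypothetical. The check is only a heuristic: an OV certificate also names an organisation, so this cannot tell OV and EV apart.

```python
def subject_fields(cert):
    """Flatten the nested 'subject' tuples into a plain dict."""
    return {key: value for rdn in cert.get("subject", ()) for key, value in rdn}


def names_legal_entity(cert):
    """Heuristic: does the subject name an organisation as well as a domain?"""
    return "organizationName" in subject_fields(cert)


# Hypothetical sample data, shaped like the output of getpeercert().
dv_cert = {"subject": ((("commonName", "example.com"),),)}
ev_cert = {
    "subject": (
        (("countryName", "GB"),),
        (("organizationName", "Example Ltd"),),
        (("commonName", "example.com"),),
    )
}

print(names_legal_entity(dv_cert))
print(names_legal_entity(ev_cert))
```

A real browser does not rely on the subject alone to decide that a certificate is EV, so treat this purely as an illustration of what “a different subject” means.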
So how does your browser ensure that the site being visited is who they say they are?
A Certificate Authority (CA) is the trusted organisation that will issue a certificate once you (or your company) have met the criteria listed above. They are the trusted third party between the browser (user) and the server (website). When communicating with a server we want to make sure that they are who they say they are. And this is achieved by completing one of the validation processes mentioned earlier.
The Chain of Trust
There is something called a Chain of Trust which is relied upon when establishing the encrypted tunnel. And it works a little like this. The CA holds what is known as the root certificate and key, which is used to sign (verify) the intermediate certificate. The intermediate certificate and its key are then used to sign the site’s (leaf) certificate.
The browser trusts the CA (root) which has signed the intermediate certificate. So the browser will also trust the intermediate certificate. The same process happens with the site certificate. It has been signed by the intermediate (via the root) so the browser now trusts the site certificate. With a chain of trust linking site certificate to the root, the browser can now trust the site.
If at any point in the process a site certificate has been modified in any way, it wouldn’t validate against the layer directly above it (intermediate then root). The browser would be able to see that the Chain of Trust has been broken and will report back to the user that the encrypted tunnel has been compromised (an unsecured network).
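The idea of walking a chain from the leaf up to a trusted root can be sketched in a few lines of Python. This is a toy model only: real certificates are signed with public key cryptography, whereas this sketch stands in an HMAC with a shared key for the signature so that it runs with nothing but the standard library. The ToyCert class, the key values and the CA names are all hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class ToyCert:
    subject: str
    issuer: str
    signature: str = ""


def toy_sign(issuer_key, cert):
    """Stand-in for a real signature: an HMAC over the certificate contents."""
    message = f"{cert.subject}|{cert.issuer}".encode()
    return hmac.new(issuer_key, message, hashlib.sha256).hexdigest()


def chain_is_trusted(chain, issuer_keys, trusted_roots):
    """chain is ordered leaf first; each cert must be signed by the next one."""
    if not chain:
        return False
    for position, cert in enumerate(chain):
        key = issuer_keys.get(cert.issuer)
        if key is None:
            return False
        if not hmac.compare_digest(toy_sign(key, cert), cert.signature):
            return False  # contents were modified after signing
        is_last = position == len(chain) - 1
        if not is_last and chain[position + 1].subject != cert.issuer:
            return False  # the next certificate is not the issuer
    # The top of the chain must have been issued by a root we already trust.
    return chain[-1].issuer in trusted_roots


# Hypothetical keys and certificates for the demonstration.
issuer_keys = {"Toy Root CA": b"root-key", "Toy Intermediate CA": b"intermediate-key"}
intermediate = ToyCert("Toy Intermediate CA", "Toy Root CA")
intermediate.signature = toy_sign(issuer_keys["Toy Root CA"], intermediate)
leaf = ToyCert("example.com", "Toy Intermediate CA")
leaf.signature = toy_sign(issuer_keys["Toy Intermediate CA"], leaf)

print(chain_is_trusted([leaf, intermediate], issuer_keys, {"Toy Root CA"}))
```

Changing any field of the leaf after it has been signed, or leaving the intermediate out of the chain, makes the same call return False.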
An SSL error can also happen if the webserver isn’t correctly set up, and doesn’t provide the correct intermediate certificates in the chain to lead back to the root. In a report from 2016, missing intermediates accounted for >10% of all certificate validation errors in Chrome. Thankfully there is an extension in SSL certificates called Authority Information Access (AIA) that provides information about the issuer, and includes a link the browser can use to fetch the intermediate certificates if they are missing. Your users therefore shouldn’t see an ugly NET::ERR_CERT_AUTHORITY_INVALID error message if AIA is used. NOTE: If the browser does AIA, it will slow down the SSL negotiation as the browser needs to fetch the intermediate certificate from the CA before the SSL negotiation can complete. So remember, check that your server is serving your intermediate certificates correctly.
For a little more clarification, there are a few more details on each of the certificate types below:
- Root certificate - usually physically distributed with the operating system and is used to issue intermediate certificates.
- Intermediate certificate(s) - CAs use the intermediate certificates to insulate themselves from having to use the root certificate to sign leaf certificates. A compromised root certificate is bad, so issuing off an intermediate certificate is much safer. It’s next to impossible to revoke a root and replace it, and it would take years for the new one to filter through the ecosystem.
- Leaf certificate - This is the site’s certificate which is issued from the intermediate certificate.
NOTE: It is possible to have multiple intermediate certificates, although not recommended as it adds extra steps to the checking process.
When using HTTPS (as is increasingly common for websites and is a recommended best practice), the connection must be set up and secured cryptographically. The method of encryption and which keys to use are negotiated during the TLS handshake, which must complete before any application data can be sent over the connection.
It’s important to remember that a TLS handshake doesn’t come for free. Time and resources are used to establish the secure connection. It can typically take 2-RTT to establish this encrypted tunnel (depending on the server setup), but if the certificate size is large it could be more. If the server supports TLSv1.3 then it is possible to establish the connection in 1-RTT, or even 0-RTT in some circumstances.
For example, let’s assume our hypothetical server takes 2-RTT to establish this encrypted tunnel, and our user is on a mobile with a 3G connection. A 3G connection will likely have a minimum RTT of around 300 ms. So it will take a minimum of 600 ms to establish the encrypted tunnel, and that’s not even taking into account the DNS lookup, connection negotiation and any network congestion. This is why Transmission Control Protocol (TCP) connections are even more expensive in the new world of HTTPS. Thankfully QUIC, a new internet transport protocol which uses User Datagram Protocol (UDP) has been created which should hopefully solve this issue.
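The arithmetic in that example is simple enough to write down as a quick Python sketch. It only models the TLS round trips, using the 300 ms 3G RTT from the text, and deliberately ignores DNS, the TCP handshake and congestion. The function and constant names are my own.

```python
def tls_setup_ms(rtt_ms, round_trips):
    """Minimum time spent on TLS negotiation for a given number of round trips."""
    return rtt_ms * round_trips


RTT_3G_MS = 300  # minimum 3G round-trip time used in the text

scenarios = [("2-RTT handshake", 2), ("TLSv1.3 1-RTT", 1), ("TLSv1.3 0-RTT", 0)]
for label, round_trips in scenarios:
    print(f"{label}: {tls_setup_ms(RTT_3G_MS, round_trips)} ms")
```

Every round trip saved is worth a full RTT, which is why the TLSv1.3 improvements matter most on high latency connections.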
There are times when the trust between browser and server can be broken. It could be that a certificate has been compromised, and when this happens there needs to be a way to let the browser know about it. If the certificate has been compromised, you don’t want a person with malicious intentions to be able to mimic a trusted website. That would be incredibly damaging to users, the website, and the Certificate Authority involved. To prevent this, ways to revoke certificates have been invented:
Certificate Revocation List
A Certificate Revocation List (CRL) is exactly what the name suggests. It is a large list containing the serial numbers of revoked certificates. Each and every CA updates this list regularly, and the list is shared with browsers. Now as you can imagine, the web has a lot of certificates, and therefore there’s going to be a lot of revoked certificates. So this list will continue to grow and grow. Thankfully expired certificates aren’t revoked as there’s no need to, and their serials are often trimmed from the CRLs.
To use the list, a browser must download it in its entirety, and loop through each serial number to see if the specific certificate it is examining has been revoked. This takes both time and resources, which can slow down the TLS handshake. And what happens if the browser can’t update the CRL? Does it just assume that the certificate it has been sent hasn’t been revoked, then proceed as normal hoping there isn’t an issue with it? (This is known as a ‘soft-fail’.) Given that the vast majority of certificates are NOT revoked, a ‘soft-fail’ is really the only viable option. Unfortunately a ‘soft-fail’ strategy has a critical security flaw, as described here. How a ‘soft-fail’ is handled depends entirely on the browser being used.
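As a rough illustration of the lookup and the ‘soft-fail’ fallback, here is a minimal Python sketch. It is not how any particular browser implements it: the function names are hypothetical, the serial numbers are made up, and a CRL that could not be downloaded is simply represented as None.

```python
def serial_in_crl(serial, crl_serials):
    """Loop through every revoked serial, as described above."""
    for revoked_serial in crl_serials:
        if revoked_serial == serial:
            return True
    return False


def accept_certificate(serial, crl_serials, soft_fail=True):
    """Decide whether to continue the connection.

    crl_serials is None when the list could not be downloaded.
    """
    if crl_serials is None:
        # Soft-fail: carry on and hope. Hard-fail: refuse the connection.
        return soft_fail
    return not serial_in_crl(serial, crl_serials)


crl = ["0A1B", "77FE", "C0DE"]  # made-up serial numbers

print(accept_certificate("1234", crl))                    # not on the list
print(accept_certificate("77FE", crl))                    # revoked
print(accept_certificate("77FE", None))                   # soft-fail accepts it
print(accept_certificate("77FE", None, soft_fail=False))  # hard-fail rejects it
```

The third call shows the flaw: if the list can’t be fetched, a revoked certificate is accepted under ‘soft-fail’.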
Online Certificate Status Protocol
So if CRL doesn’t work particularly well, what else can we use? This is the reason the Online Certificate Status Protocol (OCSP) was created. Essentially, it allows a browser to send information about the certificate it is verifying to an OCSP responder, and the responder will send a message back saying either:
- Good
- Revoked
- Unknown
The payload is small and looping through a huge list of revoked certificate serials isn’t required. Sounds better doesn’t it?
So with the creation of OCSP has the problem of certificate revocation been solved? I’m afraid not. OCSP also comes with a number of issues:
- Checking requires a TCP connection to the OCSP responder - This network overhead takes time to establish and adds latency to the SSL negotiation.
- OCSP slows down the TLS handshake - The SSL negotiation cannot fully complete until a response is received from the OCSP responder. (there is a fallback which I mentioned earlier: ‘soft-fail’)
- An OCSP responder can sometimes fail - What happens if the browser doesn’t get a response? It can either blindly trust the certificate is still valid (‘soft-fail’), or terminate the connection (‘hard-fail’).
- Privacy concerns - By contacting the responder, the browser is leaking what website a user is accessing to the CA. This is a potential way to track a person across the web without their knowledge, and with no way to opt out.
Due to the issues with both CRL and OCSP, browser vendors and Operating System (OS) manufacturers have come up with their own proprietary systems for handling certificate revocations:
- Chrome (Chromium) uses CRLSets
- Firefox uses OneCRL
- Safari handles it at the OS level
- Edge (Chromium) uses CRLSets
- Edge (EdgeHTML) handles it at the OS level
- Internet Explorer handles it at the OS level
- Mobile browsers… could be OS level, I’m not 100% sure.
This is my current understanding from lots of searching. If anyone has any information on how mobile browser revocation works I’d love to know. I’ve struggled to find any definitive answers.
How browsers handle OCSP
Next, let’s examine what browsers do with OCSP when the different types of certificates are encountered, as this is an important consideration when it comes to web performance. For this investigation I’ve used a mixture of WebPageTest for the visualisation of the waterfall, tcpdump and Wireshark for the packet level analysis, and plain old Google detective work.
The spreadsheet I’ve compiled can be found here. There’s some missing information where I either couldn’t test, or couldn’t find any solid information about what happens. If you spot any errors or can fill in the blanks please do let me know.
I have to admit, finding this information isn’t easy. Browsers progress so fast, documentation can quickly become out of date, and analysing the network for OCSP requests isn’t as simple as I first thought it would be. But assuming that the spreadsheet is mostly correct, you will see that a number of browsers actually use OCSP to check the revocation status of SSL certificates.
Web Performance and OCSP
So as OCSP checks are made in quite a number of browsers (and more specifically always made with EV certificates in Chrome), what’s the issue with OCSP and web performance? Let’s go into a few details:
In the example waterfall below you can see two TCP connections to two separate domains. Request 1 is the OCSP check for the intermediate certificate, and request 2 is the OCSP check for the leaf certificate.
Don’t be fooled into thinking you only require a single TCP connection to an OCSP responder’s domain either, as that isn’t always the case. In the example below (taken from the WebPageTest connection view), the browser opens 3 connections to the responder:
And I’m not quite sure what is happening with this connection view below. We have 1 successful attempt to the responder at request 1, and what looks to be 5 failed requests. Then later in the waterfall we see 5 successful connections to the same responder. It’s quite an unusual waterfall chart and I’d like to think there’s something unusual with either the server setup or test itself, and this is a rare occurrence.
When it comes to web performance and loading websites quickly, you may think it is the bandwidth of the connection that is the most important factor. But that isn’t the case. It’s actually the roundtrip time (RTT). The longer it takes a packet to travel to the server and the response coming back again, the slower the whole experience will be for a user. Let’s see what happens if the RTT is large:
Here’s an example setup to demonstrate:
- A WebPageTest run on Apple.com (which uses an EV certificate)
- Google Chrome used as the browser so an OCSP check occurs
- 3G connection with 300ms of latency added to the last mile
- Our user is located in Vietnam
Now you may be asking “Why Vietnam?”. Well, it’s important to remember that the location of your users has a huge effect on the RTT. If your user is located in a country outside of America, but the OCSP responder is only located in America, that’s a fair bit of latency being added to the RTT. So what does the start of a waterfall look like in this scenario:
Notice how the OCSP request isn’t made right away because it needs to get the certificate from the negotiation to know what to check! Before this point the DNS lookup and initial TCP connection need to complete. Both of these stages are also impacted heavily by a large RTT. Once the SSL negotiation starts it takes 1.5 seconds on its own(!), and 67% (1022 ms) of this time is spent waiting for the OCSP response.
It’s hard to know if 1.5 seconds is a long time for an SSL negotiation if we have nothing to compare it with. Thankfully we can use Mozilla Telemetry data to see how long a typical SSL negotiation takes across a large number of requests (from real browser data):
- Firefox beta 72 (all connections types)
- 740 billion measurements
- Median time for SSL completion: 177 ms
- 95th Percentile: 1090 ms
Although the browsers are different in the comparison (Chrome vs Firefox), the numbers give you an idea of what the usual SSL negotiation time is for many requests. The median, and even the 95th percentile, are nowhere near the 1.5 seconds we see above. What’s also interesting is that we know from the browser spreadsheet I compiled earlier in the post that Firefox will OCSP check all certificate types. So even accounting for that in the large data set, the results listed aren’t anywhere close to the 1.5 seconds we see in the demonstration.
OCSP is synchronous
Looking closely at the waterfall where an OCSP request is made, you can see that these requests are synchronous and blocking.
As can be seen in the waterfall above, the OCSP check at request number two must wait for request number 1 to complete. And the whole SSL negotiation has to wait for both checks to complete before it can fully negotiate the encrypted tunnel. Having synchronous and blocking requests at the start of a waterfall doesn’t sound too great for web performance!
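Because the checks run one after another and the negotiation waits for all of them, the added delay is simply the sum of the individual checks. The Python sketch below models that, including a responder that never answers. The response times and the 3 second timeout are hypothetical values picked for illustration, not measurements from any browser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcspCheck:
    name: str
    response_ms: Optional[float]  # None means the responder never answers


def blocking_delay_ms(checks: List[OcspCheck], timeout_ms: float) -> float:
    """Total time the TLS negotiation waits when checks run back to back."""
    total = 0.0
    for check in checks:
        if check.response_ms is None or check.response_ms > timeout_ms:
            total += timeout_ms  # the browser gives up on this check
        else:
            total += check.response_ms
    return total


TIMEOUT_MS = 3000.0  # hypothetical timeout

healthy = [OcspCheck("intermediate", 400.0), OcspCheck("leaf", 600.0)]
one_dead = [OcspCheck("intermediate", 400.0), OcspCheck("leaf", None)]

print(blocking_delay_ms(healthy, TIMEOUT_MS))
print(blocking_delay_ms(one_dead, TIMEOUT_MS))
```

A single unresponsive responder dominates the total, which is why responder reliability matters so much.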
As you can probably guess, the reliability of the OCSP responder in this whole revocation process is very important. After all, if the response is slow (or it times out) this is going to have an impact on the connection negotiation, the Time to First Byte (TTFB), and pretty much the whole waterfall after it. Again Mozilla Telemetry can give us an indication of what Desktop Firefox 71 beta users are experiencing. These results suggest that over 7% of OCSP checks are timing out today.
And this is what a waterfall looks like if the OCSP check(s) take a long time:
This waterfall is from a WebPageTest user who’d recently added EV certificates to their website. They’d then noticed that this change had caused a delay of up to 22 seconds with agents located in China, Russia, and India. They also note it occurs on a number of websites that require OCSP checks, not just their own. You can read the whole forum post here if you are interested. Reading it was one of the reasons why I decided to write this very blog post.
There’s a very interesting section related to this in the EV SSL Certificate Guidelines 1.7.1. Section 13 states that:
The requirements in Section 4.9 of the Baseline Requirements apply equally to EV Certificates. However, CAs MUST ensure that CRLs for an EV Certificate chain can be downloaded in no more than three (3) seconds over an analog telephone line under normal network conditions.
Strangely, there’s next to no mention of OCSP in these guidelines. So I wonder what the minimum response requirements would be for OCSP responders?
So a question you may be asking is what happens if the browser fails to receive an OCSP response? Maybe the connection times out, or the responder is offline, or the request is being blocked (corporate firewall etc)? Since the whole point of the OCSP check is to see if a certificate is still valid and hasn’t been revoked, you may assume that the SSL negotiation will fail and the connection won’t complete. That isn’t the case though. This is where a browser will do what is called a ‘soft-fail’: the browser tries to check the certificate’s revocation status, fails to get a response, and proceeds with the connection regardless instead of killing it. In doing so the browser is assuming that the certificate hasn’t been compromised. Sounds a little risky doesn’t it? There’s a whole post by Adam Langley from 2014 all about this particular issue that’s well worth reading if you want to know more.
Third party domains can require OCSP checks too
It isn’t just the origin domain where these checks can happen. Any third party domain you load from may require the browser to perform an OCSP check on their domain too:
Let’s focus on EV certificates, as I feel there are some unique things to consider when it comes to web performance.
EV Certificates and OCSP
I’d like to bring your attention to the results from Chrome in the browser spreadsheet, as you will see EV certs are always OCSP checked when using it, but other certificate types aren’t. Now considering Chrome is the most popular browser on the planet (with around 56% usage according to W3Counter), that’s a lot of users having to do an OCSP check on sites that use EV certificates. And that’s not even including all the other browsers that require it too. Browsers like:
- Edge (Chromium & EdgeHTML)
- Firefox (Android)
- Samsung Internet
We have already discussed the performance impact of OCSP revocation checking. Since choosing a certain type of certificate forces the most popular browser on the planet to encounter these performance issues, we should examine why you’d choose to use one.
Why use an EV certificate?
So what are the advantages of choosing to use an EV certificate over say, a DV certificate?
One of the driving forces behind their adoption could probably be marketing related. Browsers used to show the name of the website right next to a little green lock in the URL bar:
My guess would be that because you have your site name next to the little green lock, the site looks ‘more legitimate’, ‘more trusted’ and ‘more secure’… maybe? But it isn’t that simple. The name next to the lock may not match the branding of the website, due to websites being owned by parent/subsidiary companies or other legal entities. And if that were the case then it just looks strange. To a user it may even look like the site is broken. They have no idea who the parent company of the site they are visiting is (and probably don’t even care). So why does the little green lock reference a company they’ve never heard of? A user reaction could be: “Is this website trying to steal my bank details?”.
The green lock and name in the certificate have been visible in the past, but what about in 2020: Do modern browsers still show the name next to the green lock for EV certificates?
- Chrome: Has removed the name from the URL bar from version 77, but it’s shown when clicked on. The padlock is no longer green.
- Safari: Removed the name from the URL bar in June 2018, but it’s shown when clicked on. The padlock is still green.
- Firefox: Moved the name out of the URL bar in version 70, but it’s shown when clicked on. The padlock is no longer green.
There’s a whole thread from Scott Helme on Twitter that talks all about these changes. In my opinion the marketing angle of EV certificates was weak before, but now it’s non-existent. Scott also has a blog post here discussing many other aspects of the perceived ‘positives’ of EV certificates, which makes for an interesting read.
How popular are EV Certificates?
So how many sites actually use EV certificates? Mozilla Telemetry data gives us some insightful stats about the number of EV certificates that are encountered in their data set (based on real browser usage):
- Firefox desktop, beta 72 (all connection types)
- 19.2 billion measurements
- 434M (2.24%) invalid certificates
- 18.7B (96.51%) DV certificates
- 242M (1.25%) EV certificates
As you can see, EV certificates are pretty rare on the web. This is exactly what Scott Helme reported back in 2017 some 2.5 years ago. So I’d guess that number may be even lower now? A look at some more of Scott’s stats from February 2019 sees the same downwards trend of EV certificate usage. And according to SSL Labs SSL Pulse, EV certs are only used on 8.3% of the top 150,000 websites from Alexa’s list of the most popular sites in the world.
If you are interested in a whole host of WebPageTest research on EV certificates and OCSP stapling, I highly recommend reading Aaron Peters’ superb blog post ‘EV Certificates Make The Web Slow and Unreliable’.
How to improve performance?
So OCSP doesn’t sound great for performance, but what can we do about it? Thankfully there is something called ‘OCSP stapling’ that can alleviate some of the performance pain (but not all of it!).
The problem with OCSP requests is that every browser of every user must make these requests (assuming the browser does the checks). That sounds like a very wasteful strategy. So what if we got the website’s server to periodically fetch the OCSP status of its own certificate (the response is cryptographically signed so that its validity can be verified)? It could then send this response to the browser during the TLS handshake, at the same time as the certificate. In doing so the user’s browser gets the revocation information it needs during the TLS handshake, so there’s no need for it to make a separate request to the OCSP responder.
What I’ve just described is exactly what OCSP stapling does. You enable it on the server and it will do what I described above, resulting in better privacy for the user, better performance, and less load on the OCSP responder. Excellent stuff! So what’s the catch?
Unfortunately OCSP stapling doesn’t solve all the problems. When a server staples a response it will only do it one ‘level’ deep. So you can staple the leaf certificate check (the one for the site), but you can’t staple the full chain. This is a known issue, and RFC 6961 from June 2013 (‘Multiple Certificate Status Request Extension’) was written to add this ability to TLS. It has since been obsoleted by RFC 8446 (‘The Transport Layer Security (TLS) Protocol Version 1.3’), so maybe this will be possible with TLS 1.3? I’m interested to find out if it is, so if you know please get in touch.
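To picture the ‘one level deep’ limitation, here is a small Python sketch of the decision a client has to make. It is a simplified model rather than real TLS code: the Staple class, the certificate names and the expiry times are all hypothetical, and a real client would also verify the signature on the stapled response.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Staple:
    cert_name: str     # which certificate the stapled response covers
    expires_at: float  # hypothetical expiry time, in seconds


def live_checks_needed(chain: List[str], staple: Optional[Staple], now: float) -> List[str]:
    """Return the certificates that still need a live OCSP request."""
    covered = set()
    if staple is not None and staple.expires_at > now:
        covered.add(staple.cert_name)
    return [name for name in chain if name not in covered]


chain = ["leaf", "intermediate"]  # the root is trusted locally and not checked
fresh = Staple("leaf", expires_at=1000.0)

print(live_checks_needed(chain, None, now=10.0))     # no stapling: two requests
print(live_checks_needed(chain, fresh, now=10.0))    # stapled leaf: one request
print(live_checks_needed(chain, fresh, now=2000.0))  # expired staple: two again
```

The server can only remove the leaf check. The intermediate check is still left to the browser, which is what the rest of this section looks at.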
So what does OCSP stapling look like on a WebPageTest waterfall? NOTE: The site the tests were run on uses an EV certificate.
Two OCSP checks can be seen below. One for the intermediate certificate at request 1 (which contacts the root’s OCSP responder), the other check is for the site’s certificate at request 2 (which contacts the intermediate’s OCSP responder):
With stapling enabled, one of the checks is removed. The OCSP check made to the intermediate responder for the site certificate has now been ‘stapled’, and is sent to the browser along with the site certificate during the TLS handshake:
OCSP stapling and EV certificates in Chrome
So what does OCSP stapling mean for EV certificates? Unfortunately it doesn’t completely solve the performance issue. Even with OCSP stapling enabled, Chrome is going to run an OCSP check on the intermediate certificate. So you’re going to have an OCSP performance hit with EV certificates no matter what:
As they say, the proof is in the WebPageTest waterfall chart. Above I have run a test on Apple.com, which has an EV certificate and a server with OCSP stapling enabled (SSLLabs test). The test is run using Chrome and the OCSP check can still be seen at request number 1. OCSP stapling has stapled the leaf certificate check and sent it along with the leaf certificate during the TLS handshake, but the check on the intermediate certificate to the root OCSP responder is still visible. It is worth noting that commonly used intermediate certificates will be cached between sites, so the impact of this may not be as large as it sounds. But caching of intermediate certificates comes with some privacy concerns in Firefox (I’ve not seen any similar reports for Chrome).
The future of certificate revocation
From the outside looking in (as I was, and still am, a complete ‘n00b’ to the subject until recently) it comes across to me as incredibly complicated and all a bit of a mess. CRL doesn’t scale well, and native OCSP has been all but abandoned by Chrome (apart from for EV certs). OCSP stapling solves some problems with OCSP, but also comes with its own set of problems. So what else is on the horizon?
Well there’s an interesting new technology called CRLite that Mozilla have been testing in the Firefox Nightly builds since December 2019.
How does CRLite differ from other CRL revocation technology?
CRLite promises a lot, but how is it different to other CRL revocation technology available in browsers? I’ll try and summarise below:
As mentioned before, CRL is a list of digital certificates that have been revoked by the issuing CA. The browser downloads a copy of the list which it can then use to check to see if the certificate it has been provided is on the list. A big problem with CRL is the list of serials is large and can be quite a considerable file size over the network. And to make matters worse, if a client doesn’t have a fresh copy of the list, it has to fetch one during the initial connection to the site! This slows down the whole SSL negotiation process.
OneCRL is what Firefox currently uses for CRL revocation checking. It’s been using it since version 37, released back in March 2015. In practice it’s a centralised list of revoked certificates that is periodically checked and updated. This list is then pushed out to the browser. To do this, Firefox extends an existing mechanism called blocklisting, which is already used to protect users from errant add-ons, plugins and buggy graphics drivers.
OneCRL comes with a major benefit: speed. If a certificate is covered by OneCRL, a live OCSP request isn’t required across the network, so the revocation check can all be done locally. No additional latency is incurred. This is very important for EV certs where a positive OCSP response is required. An important point to note is OneCRL only covers CA intermediate certificates so as to limit the size of the blocklist.
CRLSets is the primary method Chrome (and Chromium based browsers) currently use for quickly revoking certificates in emergency situations. A secondary function is to revoke non-emergency certificates. These secondary revocations are found by crawling CRLs published by CAs. The way the sets are delivered to the browser sounds very similar to OneCRL in Firefox. The list is compiled daily and then the delta is pushed to the browser via an auto-update mechanism. Adam Langley wrote a very interesting post about CRLSets back in 2014, which gives some insight into how they work and some of their pitfalls.
There’s a detailed post about how CRLite works here, but as I understand it, the whole revocation list resides inside a user’s browser and it is updated four times per day. This sounds very much like OneCRL, only with a major difference:
CRLite has low bandwidth costs: it can represent all certificates with an initial download of 10 MB (less than 1 byte per revocation) followed by daily updates of 580 KB on average.
That’s an impressive amount of compression! Where OneCRL could only cover intermediate certificates, CRLite should be able to cover them all. If it works, it promises to replace a network round-trip with a local lookup against a much larger set of revoked certificates whenever a revocation check is required. So there would be no web performance overhead like that seen when establishing connections to an OCSP responder. Sounds promising! I’m interested in seeing how it develops, whether it can be standardised, and whether other browser vendors are willing to adopt it too.
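For the curious, the compression comes from a data structure described in the CRLite design: a cascade of Bloom filters. The Python sketch below is a toy version of that idea, written from my understanding of the published design and not taken from Firefox. The filter sizes, the hashing scheme and the made-up serial numbers are all assumptions chosen to keep the example small. It also only gives correct answers for certificates that were known when the cascade was built.

```python
import hashlib


class BloomFilter:
    """A tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, items, salt, bits_per_item=8, num_hashes=2):
        self.salt = salt
        self.num_hashes = num_hashes
        self.size = max(8, len(items) * bits_per_item)
        self.bits = bytearray(self.size)  # one byte per "bit", for readability
        for item in items:
            for position in self._positions(item):
                self.bits[position] = 1

    def _positions(self, item):
        for index in range(self.num_hashes):
            digest = hashlib.sha256(f"{self.salt}:{index}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def __contains__(self, item):
        return all(self.bits[position] for position in self._positions(item))


def build_cascade(revoked, valid, max_levels=32):
    """Each level stores the previous level's false positives, until none remain."""
    levels = []
    include, exclude = set(revoked), set(valid)
    while include:
        if len(levels) == max_levels:
            raise RuntimeError("cascade did not converge")
        level = BloomFilter(include, salt=len(levels))
        levels.append(level)
        false_positives = {item for item in exclude if item in level}
        include, exclude = false_positives, include
    return levels


def is_revoked(serial, levels):
    """Only meaningful for serials that were in 'revoked' or 'valid' at build time."""
    for depth, level in enumerate(levels):
        if serial not in level:
            # Missing from a 'revoked' level means valid, and vice versa.
            return depth % 2 == 1
    return len(levels) % 2 == 1


# Made-up serial numbers: 50 revoked out of 600 known certificates.
all_serials = [f"serial-{n}" for n in range(600)]
revoked = set(all_serials[::12])
valid = set(all_serials) - revoked
levels = build_cascade(revoked, valid)

print(len(levels), "levels")
print(is_revoked("serial-0", levels), is_revoked("serial-1", levels))
```

The lookup is entirely local, and each level is much smaller than the one before it, which is where the space saving comes from.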
So we come to the end of a meandering blog post on different types of SSL certificates and the performance impact of OCSP revocation checks. As we’ve seen, these checks don’t come for free. They can have a significant impact on web performance depending on what browser a user is using, and the location they are browsing from. And if a site uses an EV certificate, it is forcing the most popular browser on the planet (Chrome) to open a revocation check, which will have an impact on performance for those users who visit your site using it. It makes you wonder if the ‘perceived’ advantages of using an EV certificate are really worth it? Personally, I don’t see any advantages, only disadvantages. And there are a number of blog posts that agree with this stance.
So when you are next considering what type of certificate to purchase for your website, it’s worth weighing up the pros and cons of each, and how your choice could affect your site’s performance.
- 26/01/20: Initial blog post published. Thanks Barry Pollard for the (lots of) technical review and feedback.
- 27/01/20: Small tweaks to some sections and added information about the Authority Information Access (AIA) SSL extension. Thanks Barry Pollard.
- 30/01/20: Added link to Aaron Peters’ blog post ‘EV Certificates Make The Web Slow and Unreliable’.