OCSP – Should I stay or should I go?

There has been some buzz around Let’s Encrypt’s announcement that it will end OCSP support in 2025. And since Let’s Encrypt has become a purveyor of best practice to many IT organizations out there, many PKI operators are probably poring over this sentence right now and trying to decide whether or not to adopt this guidance in their own implementations:

“OCSP and CRLs are both mechanisms by which CAs can communicate certificate revocation information, but CRLs have significant advantages over OCSP.”

TL;DR: In my opinion, there are significant advantages in sticking with, or even migrating to, OCSP in locally operated PKI implementations. If you’re interested in the why, read on.

The reasons Let’s Encrypt is moving away from OCSP are (not necessarily in the order that they are of concern to LE):

  1. Performance: While OCSP requires a connection to the Validation Authority (VA) every time a certificate’s status needs to be checked, a CRL gets cached by the client and reused for a large portion of its validity period.
  2. Privacy: Every OCSP request tells the VA which web server a certain client is trying to access, theoretically allowing profile creation and other potential privacy violations. A request for a CRL download only tells the VA that this client tried to access “a” website protected by a certificate issued by the corresponding CA. In the case of Let’s Encrypt, this doesn’t narrow it down all that much.
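
To make the performance point concrete, here is a minimal Python sketch that counts how many requests reach the VA under each model. The class names, the `simulate` helper and the lifetimes are made up for illustration, and the model is deliberately simplified: it assumes the client keeps a CRL for its full lifetime and never caches an OCSP response, which real clients may handle differently.

```python
from dataclasses import dataclass, field


@dataclass
class CachingCrlClient:
    """Downloads a CRL once and reuses the cached copy until it expires."""
    crl_lifetime: float                     # seconds a CRL stays usable (assumed value)
    fetches: int = 0                        # requests the VA actually sees
    _expires_at: float = field(default=float("-inf"), repr=False)

    def check(self, now: float) -> None:
        # Only contact the VA when no unexpired CRL is in the cache.
        if now >= self._expires_at:
            self.fetches += 1
            self._expires_at = now + self.crl_lifetime


@dataclass
class OcspClient:
    """Asks the VA on every single status check (no stapling, no caching)."""
    requests: int = 0

    def check(self, now: float) -> None:
        self.requests += 1


def simulate(check_times, crl_lifetime):
    """Return (CRL downloads, OCSP requests) for the same series of checks."""
    crl = CachingCrlClient(crl_lifetime)
    ocsp = OcspClient()
    for t in check_times:
        crl.check(t)
        ocsp.check(t)
    return crl.fetches, ocsp.requests
```

With one check per hour over a day and a seven-day CRL, `simulate(range(0, 86400, 3600), 7 * 86400)` gives one CRL download against 24 OCSP requests.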

Both issues can be overcome with a technique known as OCSP Stapling, where the OCSP response, signed by the issuing CA, is not requested from the VA but rather sent by the web server itself as part of the TLS handshake. It does introduce a certain staleness to the response, but since many popular OCSP implementations, including Microsoft’s, are based on CRLs anyway, we can safely ignore this latency. Let’s Encrypt has been supporting this functionality for quite a while, but at the time of writing, only about 300k hostnames are requesting the “Must Staple” extension, which is fewer than one cert in a thousand, given the current stats.

In view of the above, in order to understand how Let’s Encrypt’s new guidance applies to your own infrastructure, you have to acknowledge that your local PKI implementation, however big or small, operates under fundamentally different conditions compared to Let’s Encrypt’s.

Let’s start with security, because that’s what certificates are ultimately all about. Let’s Encrypt’s CAs provide certificates of limited impact and short lifetime while being themselves fairly well protected against compromise (make no mistake: there is a nonzero chance of that happening, but in that case the world as we know it is going to end anyway). Your local PKI, on the other hand, runs a significantly higher risk of a CA certificate and its private key falling into the wrong hands, while issuing longer-living certificates for purposes that allow much shorter paths to total domination: Code Signing (an attacker could inject malicious scripts or binaries into your IT processes), Client Authentication (an attacker could penetrate your Network Access Control if it’s based on 802.1X) or maybe even Kerberos Authentication (an attacker could authenticate as a legitimate user or even as an administrator).

To effectively protect against certificate forgery, it is therefore imperative to distinguish between certificates that have not been revoked and certificates that were never issued by the CA in the first place. A CRL cannot answer that question because it is a static list of certificates that have been revoked. It’s a case of “blacklist vs. whitelist”, where a security person will always choose the latter if given a choice. In comparison, OCSP represents an online VA, which means that it is tied to the corresponding issuing CA and theoretically able to provide a deterministic status. Microsoft’s ADCS includes a rather clunky implementation of this feature, but ultimately most enterprise-grade PKI products offer “deterministic GOOD” functionality with OCSP. You cannot deterministically validate a certificate’s status using a CRL, period. You might not need to, but if you do, implement OCSP.
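
The difference is easy to see in a few lines of Python. This is only a sketch under stated assumptions: the function names and the serial numbers are invented, the "issued" set stands in for the CA database that a deterministic responder would consult, and answering UNKNOWN for a never-issued serial is just one possible behaviour (products differ in what they return here).

```python
from enum import Enum


class Status(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


def crl_check(serial, revoked_serials):
    """Blacklist logic: anything that is not on the list passes."""
    return Status.REVOKED if serial in revoked_serials else Status.GOOD


def deterministic_ocsp_check(serial, issued_serials, revoked_serials):
    """Whitelist logic: GOOD only for serials the CA is known to have issued."""
    if serial in revoked_serials:
        return Status.REVOKED
    if serial in issued_serials:
        return Status.GOOD
    # Never issued by this CA, e.g. forged with a stolen CA key.
    return Status.UNKNOWN


# Hypothetical CA state, for illustration only.
issued = {0x1001, 0x1002, 0x1003}
revoked = {0x1002}
forged = 0xBAD  # a serial the CA never handed out

crl_verdict = crl_check(forged, revoked)
ocsp_verdict = deterministic_ocsp_check(forged, issued, revoked)
```

In this toy setup the forged serial passes the CRL check as GOOD simply because nobody ever revoked it, while the lookup against the issued set reports it as UNKNOWN.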

By the way: There is another security-related aspect to a public CA’s decision to move from OCSP to CRL. OCSP is known to have been used for data exfiltration in the past. If *all* public CAs were to abandon OCSP, you could block outbound OCSP requests on your protocol-aware firewall and forget about this attack vector. You should do that even now and provide explicit exceptions for those CAs whose roots you choose to trust within your organization!
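
The "block by default, allow by exception" rule could look roughly like the following. This is a Python sketch of the decision logic only, not firewall configuration; the hostnames are placeholders and the function name is made up. A real protocol-aware firewall would identify OCSP traffic itself and apply an equivalent allow-list.

```python
from urllib.parse import urlsplit

# Placeholder responder hosts for the CAs you have decided to trust.
ALLOWED_OCSP_HOSTS = frozenset({
    "ocsp.pki.example.com",
    "ocsp.trusted-ca.example.net",
})


def allow_outbound_ocsp(url, allowed_hosts=ALLOWED_OCSP_HOSTS):
    """Permit an outbound OCSP request only if it targets an allow-listed host."""
    host = urlsplit(url).hostname  # lower-cased by urlsplit, None if absent
    return host is not None and host in allowed_hosts
```

Anything that claims to be OCSP but targets a host outside the list, such as `http://ocsp.attacker.example.org/`, is rejected.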

Having said that, what about Let’s Encrypt’s original concerns as quoted above? Those are easy:

  1. Your revocation checking traffic is not going to overload your OCSP server(s). If it does, you know how to scale it – up or out. It’s a webserver, albeit a specialized one. On the scale Let’s Encrypt operates, there is a world of difference between serving up a dozen static files that can be delivered via a CDN and running specialized webservers that provide dynamic responses and cannot therefore be amplified using a CDN. Not to sound condescending, but on the scale your internal IT operates, it probably doesn’t matter.
  2. The privacy concerns that exist on the public Internet do not apply to internal traffic within your organization. The owner of the OCSP VA is usually also the owner of the application for which a certificate’s validity is being checked, so the information about who accessed what from where is available anyway.

To summarize, there is nothing to gain from replacing existing internal OCSP infrastructure with HTTP CDPs (CRL distribution points). On the other hand, implementing OCSP allows you to provide “deterministic good” responses to critical applications, including Kerberos Auth if your CA is valid for that.

Happy validating!

Image by Joshua Choate from Pixabay