During a recent discussion about the DarkMatter CA on a Mozilla mailing list, it was found that their 64-bit serial numbers weren’t actually 64 bits, and it opened a can of worms. It turns out that the serial number was effectively 63 bits, which is a violation of the CA/B Forum Baseline Requirements that state it must contain 64 bits of output from a secure random number generator (CSPRNG). As a result of this finding, 2,000,000 certificates or more may need to be replaced by Google, Apple, GoDaddy and various others.
Update: GoDaddy initially said that more than 1.8 million of their certificates were impacted; they have drastically reduced this number in an update posted on 2019-03-12. The full number of certificates impacted by this is still being discussed.
It’s quite likely that the full scope of this problem hasn’t been determined yet.
During an analysis of certificates issued by DarkMatter, it was found that their serial numbers all had a length of exactly 64 bits – not more, not less. If there’s a rule that requires 64 bits of CSPRNG output, and the serial number is always 64 bits, at first glance this seems fine. But there’s a problem, and it’s in RFC 5280, which specifies the following:
The serial number MUST be a positive integer assigned by the CA to each certificate. It MUST be unique for each certificate issued by a given CA (i.e., the issuer name and serial number identify a unique certificate). CAs MUST force the serialNumber to be a non-negative integer.
Requiring a positive integer means that the high bit can’t be set – if it is set, it can’t be used directly as a certificate serial number. As such, if the high bit is set, there are two[1] possible options:
- Pad the serial with an additional byte, so that the full 64 bits of output is used.
- Discard the value, and try again until you get a value without the high bit set. This means that the size is always 64 bits, and the high bit is always 0 – giving you 63 effective bits of output.
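The two options can be sketched in Python. This is a minimal illustration, not any CA software's actual code: the function names are hypothetical, and `secrets.randbits` stands in for whatever CSPRNG a CA would really use.

```python
import secrets


def serial_keep_all_bits() -> int:
    """Option 1: use all 64 random bits as a non-negative integer.

    When bit 63 is set, the DER encoding needs a leading 0x00 byte
    so the value is not read as negative (9 content bytes in total).
    """
    return secrets.randbits(64)


def serial_retry_until_high_bit_clear() -> int:
    """Option 2: discard any value whose high bit is set and try again.

    The result always fits in 8 bytes, but only 63 bits are random.
    """
    while True:
        candidate = secrets.randbits(64)
        if candidate < 2**63:  # high bit (bit 63) is not set
            return candidate


def der_content_length(value: int) -> int:
    """Content bytes needed to encode a non-negative DER INTEGER."""
    # One extra bit is reserved for the sign, then round up to whole bytes.
    return value.bit_length() // 8 + 1
```

With the first option roughly half of all serials come out one byte longer, since `der_content_length(2**63)` is 9 while `der_content_length(2**63 - 1)` is 8. With the second option the length never exceeds 8 bytes, which is exactly the pattern that gives the reduced entropy away.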
A popular software package for CAs, EJBCA, had a default of using 64-bit serial numbers, and used the second strategy for dealing with CSPRNG output with the high bit set. This means that instead of using the full 64-bit output, it effectively reduced it to 63 bits – cutting the number of possible values in half. When we are talking about numbers this large, it’s easy to think that 1 bit wouldn’t make much difference, but the difference between 2^64 and 2^63 is substantial – to be specific, 2^63 is off by over 9 quintillion, or more specifically 9,223,372,036,854,775,808.
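The arithmetic behind the one-bit difference can be checked directly; the variable names below are only illustrative.

```python
full_space = 2**64     # distinct values available with 64 random bits
reduced_space = 2**63  # distinct values once the high bit is always 0

difference = full_space - reduced_space
print(f"{difference:,}")  # 9,223,372,036,854,775,808
```

Losing one bit removes exactly half of the possible values, so the difference is itself equal to 2^63.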
The strategy of calling the CSPRNG until you get a value that has the high bit unset violates the intention of the rule imposed by the Baseline Requirements, meaning that all certificates issued using this method were mis-issued. This is a big deal, at least for a few CAs and their customers.
Now, the simple solution to this is to just increase the length of the serial beyond 64 bits; for CAs that used 72 or more bits of CSPRNG output, this is a non-issue, as even if they coerce the high bit, they are still well above the 64-bit minimum. This is a clear case of following a standard as close to the minimum as possible, which left no margin for error. As the holders of those 2+ million certificates are learning, they cut it too close.
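As a rough sketch of that margin, assuming a hypothetical `generate_long_serial` helper and an arbitrary default of 128 bits (neither is taken from any particular CA's implementation), a longer serial can lose its high bit and still carry far more than 64 random bits:

```python
import secrets


def generate_long_serial(random_bits: int = 128) -> int:
    """Return a positive serial built from `random_bits` of CSPRNG output.

    The top bit is forced to 0 so the value stays positive at a fixed
    encoded length, leaving random_bits - 1 bits of entropy.
    """
    if random_bits < 72:
        # Assumption: mirror the 72-bit figure mentioned in the text.
        raise ValueError("use at least 72 bits to keep a safety margin")
    mask = (1 << (random_bits - 1)) - 1  # all bits set except the top one
    while True:
        serial = secrets.randbits(random_bits) & mask
        if serial > 0:  # serial numbers must be greater than zero
            return serial
```

With the default of 128 bits, coercing the high bit still leaves 127 random bits, so the 64-bit minimum is met with room to spare.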
The Baseline Requirements are the minimum rules[2] that all CAs must follow; these rules are voted on by a group of browser makers and CAs, and often debated in detail. Thankfully for all involved, many of these discussions happen on public mailing lists, so it’s easy to see what’s been discussed and what the views of the different parties were when a change was approved. This is a good thing when it comes to understanding this issue.
The relevant rule in this case is in section 7.1:
Effective September 30, 2016, CAs SHALL generate non-sequential Certificate serial numbers greater than zero (0) containing at least 64 bits of output from a CSPRNG.
On a prima facie reading of this requirement, it appears that the technique that EJBCA used could be valid – it is the output of a CSPRNG, and it is 64 bits. However, the Baseline Requirements can’t be read so simply; you have to look deeper to find the full intention. In this case, the fact that 1 bit would be lost in a purely random serial was pointed out by Ryan Sleevi of Google and Ben Wilson of DigiCert. This fact is not pointed out in the requirement itself, but is available to anyone who spends a few minutes looking at the history[3] of the requirement.
With a deeper reading, it’s clear that a 64-bit serial, the smallest permitted, is quite likely to be a violation of the Baseline Requirements. While you can’t look at a single certificate to determine this, looking at a larger group will reveal if the certificate serial numbers are consistently 64 bits, in which case there could be a problem.
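One way to picture that group-level check is the heuristic below. It is only a sketch: the function name and the `min_sample` threshold are assumptions, and it expects serial numbers that have already been extracted from certificates as integers.

```python
def looks_like_63_bit_serials(serials: list[int], min_sample: int = 50) -> bool:
    """Flag a batch of serials in which the high bit of a 64-bit value is never set.

    With a full 64 bits of random output, about half of all serials would be
    2**63 or larger, so a large batch with none of them is very unlikely to
    happen by chance (roughly 1 in 2**len(serials)).
    """
    if len(serials) < min_sample:
        return False  # too few certificates to draw any conclusion
    return all(0 < serial < 2**63 for serial in serials)
```

A single certificate tells you nothing here, because a serial below 2^63 is expected half of the time even when all 64 bits are random. The signal only appears across a larger sample.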
When a certificate is issued that doesn’t meet the Baseline Requirements, the issuing CA is required to take quick action. Once again, we look to the Baseline Requirements (section 4.9.1.1) to find guidance:
The CA SHOULD revoke a certificate within 24 hours and MUST revoke a Certificate within 5 days if one or more of the following occurs: … 7. The CA is made aware that the Certificate was not issued in accordance with these Requirements or the CA’s Certificate Policy or Certification Practice Statement; …
This makes it clear that the CA has to revoke any certificate that wasn’t properly issued within 5 days. As a result, CAs are under pressure to address this issue as quickly as possible – replacing and revoking certificates with minimal delay to avoid missing this deadline. Google was able to revoke approximately 95% of their mis-issued certificates within the 5 days, Apple announced that they wouldn’t be able to complete the process within 5 days, and GoDaddy stated that they would need 30 days to complete the process. The same reason was cited by all three: minimizing impact. Without robust automation[4], changing certificates can be complex and time-consuming, leaving the CA to choose between complying with requirements or impacting their customers.
Failing to comply with the Baseline Requirements will complicate audits, and could put a CA at risk of being removed from root stores.
The full impact of this issue is far from known. Google and Apple, both in the process of replacing their mis-issued certificates, only issued them to their own organizations – reducing the impact. GoDaddy, on the other hand, which has mis-issued more than 1.8 million certificates[5], is facing a much larger problem, as these certificates were issued to customers who are likely managing their certificates manually and will require substantially longer to complete the process.
It’s also not clear how many other CAs may be impacted by this issue; while a few have come forward, I would be shocked if this is the full list. This is likely an issue that will live on for some time.
[Note on DarkMatter: This post is solely about the issue with serial numbers discovered as a result of the discussion around DarkMatter operating as a trusted CA in the Mozilla root store. It does not take any position on the issue of DarkMatter being deserving of such trust, which is left as an exercise for the reader.]
[Note on Exploitation Risk: Entropy in the serial number is required as a way to prevent hash collisions from being used to forge certificates. Such an attack requires an ability to predict or control certificate contents and the use of a flawed hashing algorithm; adding a random value makes this more difficult. This type of issue has been exploited with MD5, and could someday be exploited with SHA1; there are no known flaws in the SHA2 family (used in all current end-entity certificates) that would allow such an attack. In addition, while this issue reduces the level of protection by half, 2^63 is still a large number and provides a substantial amount of safety margin.]
[1] There may be additional ways of handling this situation, though these are the most likely. Other methods may or may not actually be compliant with the Baseline Requirements.
[2] Root store programs have their own rules which CAs must follow that go beyond the Baseline Requirements (BRs); as such, the BRs are not the final word in what is required, but a set of minimum requirements that all involved have agreed to.
[3] Given the complex and sometimes adversarial nature of the CA/B Forum, even small and obvious changes are sometimes debated for extended periods. This makes updating the BRs more complex than it should be, and appears to drive changes to be as minimal as possible to avoid conflict. In an ideal world, CA/B Forum would produce an annotated version of the BRs that offers additional insight into the rules, their origins, and their intentions. In the world we live in, that would require a level of cooperation and coordination that is exceedingly unlikely.
[4] With events like this, Heartbleed, and others that can lead to certificates being revoked with short notice, using robust automation to manage certificates is the only logical way forward. While this makes some people uncomfortable, manual management exposes organizations to far greater risk.
[5] At the time of writing, these are preliminary numbers; the number of certificates that are being reissued is not clear.