TLS Certificates from the Top Million Sites

Thanks to the recent WoSign / StartCom issues with misused or flawed certificates, there was a question about how many of the certificates issued were actually active – so this seemed to be a good time to run a scan against the Alexa top 1 million sites to see what I could find.

Scan information: The scan was ran with an 8 second timeout – so any server that couldn’t complete a handshake within 8 seconds isn’t counted. The scanner attempted to connect to the domain on port 443, and if that failed, then attempted to connect to the “www” subdomain. No certificate validation was performed. The scan didn’t attempt any other ports or subdomains, so it’s far from a complete picture.

Of the top 1 million sites, 700,275 respond with a certificate on port 443; here are the top certificate authorities identified:


  • Comodo: 192,646
  • GeoTrust Inc: 85,964
  •, Inc: 45,609
  • GlobalSign: 42,111
  • Let’s Encrypt: 38,190
  • Symantec Corporation: 27,612
  • DigiCert Inc: 27,588
  • cPanel, Inc: 24,195
  • thawte, Inc: 18,640
  • Starfield Technologies, Inc: 11,411

The first thing you may notice looking at that list, is that Comodo is the overwhelming leader – accounting for 28% of all certificates returned. Part of the reason for this is the partnerships they have: they are the certificate provider for CloudFlare, which offers TLS support to all customers, including those on their free plan, as well as various hosting providers. Here’s a breakdown:

  • CloudFlare: 73,339
  • HostGator: 9,499
  • BlueHost: 4,410
  • Others: 19,904

It’s quite exciting to see Let’s Encrypt not only in the top 10, but in the top 5 – for an organization that runs on less than $3M, they have done a remarkable job on impacting the industry.

For the two that started it all, StartCom has 10,221 and WoSign comes in at 3,965 – though in discussions with someone who has access to Google’s internal crawl logs, this is far from a complete picture of the usage of their certificates. As this scan only included the root (or www subdomain in case of an error) and port 443, this scan only captures some of the picture.

For invalid certificates (self-signed & otherwise untrusted), certificates with no issuer organization actually made it in the top 10, with 25,954 certificates; this is followed by localhost with 13,450 and Parallels Panel with 11,450 certificates.

The raw data for the scan is available here and was produced with a beta feature in YAWAST.

Thanks to Kenn White for the graphic!

Ruby + GCM Nonce Reuse: When your language sets you up to fail…

A couple hours ago, Mike Santillana posted to oss-security about a rather interesting find in Ruby’s OpenSSL library; in this case, the flaw is subtle – so much so that it’s unlikely that anyone would notice it, and it’s a matter of a seemingly insignificant choice that determines if your code is affected. When performing AES-GCM encryption, if you set the key first, then the IV, and you are fine – set the IV first, you’re in trouble.

Depending on the order you set properties, you can introduce a critical flaw into your application.

If you set the IV before the Key, it’ll use an empty (all zeros) nonce. We all (hopefully) know just how bad nonce reuse in GCM mode can be – in case you don’t recall, there’s a great paper on the topic. The short version is, if you reuse a nonce you are in serious trouble.

The Issue

Here’s the code that demonstrates the issue (based on code from Mike’s post, with some changes to better demonstrate the issue):

Each time this is called, a unique ciphertext should be produced thanks to the random IV (or nonce in this case), yet, that isn’t what happens:

In the first two cases, a random IV is used (cipher.random_iv), and should produce unique ciphertext every time it’s called, in the third case, we explicitly set a null IV – and if you notice, all three produce the same output. What’s happening is that the nonce is all zeros for all three cases – the random nonce isn’t being used at all, and thus the ciphertext is repeated each time the same message is encrypted. The fact that a null IV is used when no value is supplied is actually documented – it just so happens to be that setting an IV prior to setting the key is effectively the same as not setting one at all.

A bug 5 years in the making…

The cause of this issue is a test case that was found five years ago:

ruby -e 'require "openssl";"ECB").update "testtesttesttest"'

What this code did was trigger a segmentation fault due to performing an update before the key was set. The workaround that was added was to initialize the Cipher with a null key to prevent the crash. At the time, this change wasn’t seen as being significant:

Processing data by Cipher#update without initializing key (meaningless usage of Cipher object since we don’t offer a way to export a key) could cause SEGV.

This ‘fix’ set the stage for this issue to come up. Setting a key that was meant to be overwritten caused a change in behavior in OpenSSL’s aes_gcm_init_key – instead of preserving a previously set IV, the IV is instead overwritten with the default null value. This isn’t exactly obvious behavior, and can only be seen by careful examination of the OpenSSL code.

So this is less a single bug, and more of a combination of odd behaviors that combine in a specific way to create a particularly nasty issue.

APIs That Lead To Failure

OpenSSL is notorious for its complicated API – calling it ‘developer unfriendly’ would be a massive understatement. I greatly appreciate the work that the OpenSSL developers do, don’t get me wrong – but seeing issues due to mistakes and misunderstandings of how OpenSSL works is quite common. Even in simple cases, it’s easy to make mistakes that lead to security issues.

It’s clear that this issue is just the latest in a long line caused by a complex and difficult to understand API, that makes what appears to be a simple change have far greater impact than anticipated. The result, is another API that you have to understand in great detail to use safely.

To make it clear, this isn’t the only case where order of operations can lead to failure.

The Workaround

The workaround is to just update code to move the call to cipher.random_iv to some point after the key is set – but this is something that shouldn’t matter. There are discussions going on now to determine how to correct the issue.

Ruby’s OpenSSL library is a core library that’s widely used, and performs security critical tasks in countless applications. For a flaw like this to go unnoticed is more than a little worrying. It’s vital that languages and libraries make mistakes as hard as possible – in this case, they failed.

Testing for SWEET32 with YAWAST

Testing for SWEET32 isn’t simple – when the vulnerability was announced, some argued that the best solution was to assume that if a TLS server supported any of the 3DES cipher suites, consider it vulnerable. The problem is, it’s not that simple. On my employer’s corporate blog, I wrote about practical advice for dealing with SWEET32 – and pointed out that there are ways around the vulnerability, and some are quite simple.

The easy options:

  • Disable the following cipher suites:
  • Limit HTTP Keep-Alive to something reasonable (Apache & Nginx default to 100 requests)

If you don’t need to worry about users on old platforms, it’s easy – just disable the 3DES suites and you’re done. If you do need to worry about customers using Windows XP, disabling the 3DES cipher suites may not be an option, as the only other cipher suites they support use RC4. In that case, you need to limit Keep-Alive sessions.

In cases where 3DES suites are enabled, but reasonable / default limits are in place, SWEET32 isn’t an issue – as there’s no way to issue enough requests within the same session (and thus the same encryption key) to be able to exploit it.

Testing with YAWAST

To test for this, I added a new feature to YAWAST that will connect to the server using a 3DES suite, and issue up to 10,000 HEAD requests, to see how many requests the server will allow in a single session. If the server allows all 10,000 requests, there’s a good chance that the server is vulnerable to SWEET32. This method is used to determine that the server is likely vulnerable, without the massive data transfer and time required to actually verify that it’s vulnerable.

Testing for SWEET32 with YAWAST looks like this:

In this case I’m using YAWAST to run a ssl scan, using the --tdessessioncount parameter to instruct YAWAST to perform the SWEET32 test. In this case, you can see that the TLS session was ended after 100 requests (Connection terminated after 100 requests (TLS Reconnected)) – which is a clear indication that the server isn’t vulnerable. Had the server actually been vulnerable, this message would have been displayed:

[V] TLS Session Request Limit: Connection not terminated after 10,000 requests; possibly vulnerable to SWEET32

The use of 10,000 requests is somewhat arbitrary – it was selected as it’s far above the defaults used for the most common servers, and is small enough that it can be conducted in a reasonable time. It could be argued that it should be larger, to reduce false positives – it could also be argued that it’s excessive, as if it allows 5,000, it’s a reasonable bet that there’s no limit being enforced. This value may change over time based on community feedback, but for now, it’s a safe value that strikes a good balance.

Vulnerable or Possibly Vulnerable?

Unfortunately the only way to confirm that a server is truly vulnerable is to perform the attack, or at least a large part of it. There will be false positives using this technique, but that’s the cost of performing a quick test (that doesn’t involve transferring tens or hundreds of gigabytes of traffic).

When looking at the results, it’s important to understand that it indicates that the server is likely vulnerable, but it is not a confirmation of vulnerability.

Developers: Placing Trust in Strangers

Much has been said, especially recently, about that mess of dependencies that modern applications have – and for those of us working in application security, there is good reason to be concerned about how these dependencies are being handled. While working on YAWAST, I was adding a new feature, and as a result, I needed a new dependency – ssllabs.rb.

While most Ruby dependencies are delivered via Gems, ssllabs.rb is a little different – it pulls directly from Github:

This got me thinking about a pattern that I’m seeing more often, across multiple languages, of developers pulling dependencies directly from the Github account of other people. There are hundreds of thousands of examples just in Ruby, that doesn’t include various build scripts or other places this is done – or all the other languages. To make this worse, this is generally a pull from HEAD, instead of a specific revision.

Absolute Trust

By pulling in a dependency, especially from HEAD, from another developer is placing a remarkable amount of trust in them – as there’s no release process, no opportunity for code signing (which is missing from most dependency systems), nothing but faith in that developer.

But, it’s not just faith in the developer – it’s faith in the strength of their password, faith in the security of their computer, faith in the security of their email account, faith that they have two-factor authentication enabled – I could keep going. If the developer they pull from is compromised – in any of many ways, it’s a trivial matter to commit a change as them. Once committed, that change will be picked up the next time those that use it build (or update) – giving an attacker code running on their machine – code that will most likely end up in production.

I’ve written before about the difficulty in spotting malicious code – an apparently simple refactoring commit can easy hide the insertion of malicious code, and it’s likely that it wouldn’t be quickly noticed.

Once a developer’s Github account or computer is compromised, it’s a simple matter to commit something that appears innocent, but adds a backdoor, which could be picked up by hundreds or thousands of those that depend on the code before they are able to recover access, detect, and address the malicious code. Depending on the attack vector, it could take days for the developer to regain full access – that’s a large window to be deploying malicious code. That’s assuming of course, that the developer even notices it, many popular dependencies have been inactive for months or even years.

A larger problem…

This is, of course, just one front in a larger problem – the external dependencies that are used are rarely reviewed, almost never signed, the dependencies they bring in are almost always ignored. Far too many development shops have no idea of what they are actually pushing to production – or what issues that code is bringing with it.

There are so many opportunities to attack dependency management – thankfully few attacks are seen in the wild, but I don’t take too much comfort in that.

Seamless Phishing

Phishing attacks are a fact of life, especially for users of the largest sites – Facebook being the most common I’m seeing today. Pretty much everybody, from the SEC to antivirus companies have published guides on what users should do to avoid phishing – so I picked one at random and pulled out the key points:

  • 1). Always check the link, which you are going to open. If it has some spelling issues, take a double-take to be sure — fraudsters can try to push on a fake page to you.
  • 2). Enter your username and password only when connection is secured. If you see the “https” prefix before the site URL, it means that everything is OK. If there is no “s” (secure) — beware.
  • 5). Sometimes emails and websites look just the same as real ones. It depends on how decently fraudsters did their “homework.” But the hyperlinks, most likely, will be incorrect — with spelling mistakes, or they can address you to a different place. You can look for these tokens to tell a reliable site from a fraud.

This is from Kaspersky – unfortunately the advice is far from great, but it follows pretty closely to what is generally advised. It’s quite common for people to be told to check the URL to make sure it’s the site they think it is, check for a padlock, and if everything looks right, it should be safe. Except of course, for when this advice isn’t nearly enough.

Seamless Integration

Facebook allows for third-party applications to integrate seamlessly, this has been a key to achieving such a high level of user engagement. When accessing an application via Facebook, you end up at a URL like this:


As you can see, the URL is * – as people would expect. It uses HTTPS, and not just HTTPS, but HTTPS with an Extended Validation certificate. It passes those first critical tests that users rely on to keep them safe. Let’s take a look at the page that URL points to:


The header is from Facebook, as is the side-bar – but the rest of the page is actually an iframe from a malicious third-party. What appears at first glance to be a legitimate Facebook page, is actually a Facebook page that includes a login form that is being used for phishing.


Everything looks right, the style makes sense, there are no obvious errors, the URL is right, and there’s the padlock that everyone is taught to look for. This is a fantastic phishing attack – not at all hard to implement, and it passes all of the basic checks; this is the kind of attack that even those that are careful can fall for.

Because of just how seamless Facebook has made their integration, they have opened the door for extremely effective phishing attacks that few normal users would notice. Anytime an application allows third-parties to embed content blindly, they are doing so at the cost of security. This shows the need for increased vigilance on the part of Facebook – doing a better job of monitoring applications, and that users need to be taught that going through a simple checklist is far from adequate to prevent attacks – as is often the case, checklists don’t solve security problems.

Thanks to @Techhelplistcom for pointing this out.