Cloudbleed: Final Damage Assessment Still Pending

Cloudflare CEO Prince has published a persuasive case that Cloudbleed hasn't caused any detectable damage. Let's go slow on that.

Charles Babcock, Editor at Large, Cloud

March 21, 2017

The impact of Cloudbleed, the blind and inadvertent retrieval of customer data from the active memory banks of Cloudflare's distributed servers, has been minor so far as we know.

But it was an incident that had some unique characteristics. It was not an exposure leading to a break-in. The software fault led to the distribution of customer data by Cloudflare itself. Once the somnolent software bug became hyperactive, it kicked in with a vengeance Feb. 13. That being the case, it may not have sunk in in many quarters that it was a data leak of unprecedented scale. It occurred only 0.00003% of the time at Cloudflare, according to this source. Nevertheless, given the amount of traffic at Cloudflare, once it was booted into greater activity, it became a Niagara Falls, not a trickle or a once-in-a-blue-moon occurrence.

All involved should heave a sigh of relief if no serious consequences emerge in the future. That's a possibility because there's no evidence that any hackers knew about the leak before it had been corrected and the leaked data expunged from publicly available pages in search engine caches. Nice damage containment work was done by Cloudflare. Congratulations would be in order, if it weren't for a lingering, queasy feeling.

This was a near catastrophe, halted in its tracks by effective exposure identification by Tavis Ormandy of Google's Project Zero, its full-time Internet bug detection unit formed in July 2014. Coincidentally, Google Capital was a December 2014 investor in Cloudflare and perhaps one consequence of this detection was to shore up Google's investment in a young and highly successful startup. Nevertheless, it was persistent watchfulness that spotted the bug and a competent technical response that contained it.

For more on the response, check out Cloudflare CEO Matthew Prince's March 1 account on the Cloudflare web site. It should serve as the textbook example of how a post mortem on a Web-scale company error is handled. The sense of highly motivated investigation and straightforward accountability is there. Still, the title, Quantifying The Impact of Cloudbleed, has to be taken with a grain of salt. Prince does quantify the impact, and based on what he finds, he also minimizes the impact.

This is as natural as dairy farmers marketing milk, but it's too early to truly minimize the outcome. Prince may be proven right when no damages emerge three months or three years from now. But I prefer the headline, Cloudflare Breach Had Potential To Be Much Worse. Perhaps more of our focus should go toward examining the still-possible consequences rather than trying to establish how we have escaped serious reverberations this time around.

Want to learn more about how Cloudflare emerged in the market? See Cloudflare Supplies Security At Network's Edge.

After 12 days of investigation, Cloudflare concluded: "1) We have found no evidence based on our logs that the bug was maliciously exploited before it was patched; 2) the vast majority of Cloudflare customers had no data leaked; 3) after a review of tens of thousands of pages of leaked data from search engine caches, we have found a large number of instances of leaked internal Cloudflare headers and customer cookies, but we have not found any instances of passwords, credit card numbers, or health records."

In effect, the conclusion was that plenty of Cloudflare's own data, mostly harmless, had leaked, but that no essential customer data turned up among the leaked pages. That's both a good and a convenient outcome, legally speaking. While "no passwords, credit card numbers or health records" sounds like a strong rejection of customer data being exposed, it touches on three sensitive data types without necessarily ruling out all other types. The statement sounds categorical, but it's not. It leaves open the possibility that customer data might have been spotted but didn't fall into these three categories.

"The vast majority" of customers were not exposed, which means a small minority were. So what types of their data were exposed? We don't know from this eloquent and cleverly phrased account.

Then we have Google's Tavis Ormandy's statement that Project Zero did see passwords in the leaked data. In checking the data leak, he found "encryption keys, cookies, passwords, chunks of POST data…" and other data types, Ormandy wrote in a blog post Feb. 19. When he posted examples, he had to excise "personally identifiable information" from pages retrieved from Uber, Fitbit and the dating site OkCupid. It was Ormandy who unintentionally dubbed the leak "Cloudbleed" because of its analogy to the earlier Heartbleed bug found in OpenSSL.

At this point, I start scratching my head and wishing that someone other than the CEO -- someone who led the Cloudflare forensic investigation -- had provided the accounting. In my opinion, Prince is better equipped than most CEOs to sum up the investigation, but is that the best idea? Prince is also an attorney. Shouldn't the technical people closest to the leak investigate and report to the public? The CEO could comment on their document, if he chose to, rather than issuing the outcome himself. Now that the CEO has said no passwords were found, who on the investigation team is likely to stand up and contradict him?

It's possible Google's Ormandy was referring to Cloudflare passwords or to Cloudflare customers' passwords or to passwords whose owners could not be identified. He just cited plain "passwords." But his statement still appears to contradict Prince's, with no higher-level referee available.

What the numbers say

The bug was buried and inactive in a parser that analyzes data in the Cloudflare network until Sept. 22, 2016, when a new version of the parser was implemented and exposed it. The parser converts HTTP to HTTPS, hides email addresses and provides other functions. When the parser encountered a page with its HTML broken in a specific way, with certain parser features turned on, the parser read the HTML page as directed, then continued to read from adjacent memory in the server, "which contained data from other customers' requests," Prince's blog stated. "The contents of that adjacent memory was then dumped onto the page with the flawed HTML," he continued, and that data appeared to the end user as jumbled text at the bottom of the page.
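To make the mechanism concrete, here is a toy model of that kind of over-read. It is not Cloudflare's parser: the memory layout, the sample data, and the function name are all hypothetical. It only assumes, for illustration, that the parser stops when its position exactly equals the end of the page, and that a tag cut off at the very end of the page makes it step one byte too far.

```python
# Toy model of a buffer over-read. NOT Cloudflare's code: the layout,
# names, and sample data are hypothetical and chosen for illustration.

# One shared block of "server memory": a customer's HTML page followed
# directly by leftover data from an unrelated request.
PAGE = b"<p>hello</p><script"          # HTML that ends in the middle of a tag
NEIGHBOR = b"Cookie: session=SECRET"   # another customer's request data
MEMORY = PAGE + NEIGHBOR
PAGE_END = len(PAGE)


def parse(memory: bytes, end: int, safe: bool) -> bytes:
    """Copy bytes to the output page until the parser decides it is done."""
    out = bytearray()
    pos = 0
    while pos < len(memory):  # stands in for the end of physical memory
        # Safe check: stop at or past the end of the page.
        # Fragile check: stop only if the position lands exactly on the end.
        done = pos >= end if safe else pos == end
        if done:
            break
        if memory[pos:pos + 7] == b"<script" and pos + 7 == end:
            # Unterminated tag at the very end of the page: emit it, then
            # advance one byte too far, jumping over the end position.
            out += memory[pos:pos + 7]
            pos += 8
        else:
            out.append(memory[pos])
            pos += 1
    return bytes(out)


if __name__ == "__main__":
    print(parse(MEMORY, PAGE_END, safe=True))
    print(parse(MEMORY, PAGE_END, safe=False))
```

With the safe check, the output is just the page. With the fragile check, the position hops over the end of the page and the parser keeps copying, so a fragment of the neighboring request lands at the bottom of the page, which is the jumbled text end users saw.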

Unless someone knew what was happening, the hodgepodge of data would make little sense. If, on the other hand, someone were aware of the Cloudflare flaw, the dumped data represented a rich vein to mine. There is no evidence in Cloudflare's server logs that someone persistently attempted to retrieve pages from certain sites on a scale that indicated they were trying to find valuable data, Prince wrote.

How often did this event happen? From Sept. 22, 2016 through Feb. 13, the bug was active in the updated parser code on 180 of the 2 million sites that Cloudflare serves. The pages that triggered the bug were accessed 605,037 times, he wrote.

On Feb. 13, Cloudflare unintentionally expanded the circumstances under which the bug would be activated and between Feb. 13 and Feb. 18, those numbers increased to 6,457 sites, with pages accessed 637,034 times. That would bring the total number of pages with data dumps to 1,242,071, still a small percentage of the traffic passing through Cloudflare's busy network but a rapidly escalating number. Fortunately, the bug was detected and Cloudflare had a fix installed in hours. Unfortunately, over 1.2 million pages with unsolicited data were distributed onto the Internet.
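The escalation is easier to see as a daily rate. The sketch below uses only the access counts and dates cited above; the year 2017 for the February dates is an assumption based on this article's publication date, and treating each window as a simple difference between calendar dates is a rough simplification.

```python
from datetime import date

# Figures as reported in Prince's account. The year for the February
# dates is assumed to be 2017.
PHASES = [
    ("Sept. 22 to Feb. 13", date(2016, 9, 22), date(2017, 2, 13), 605_037),
    ("Feb. 13 to Feb. 18", date(2017, 2, 13), date(2017, 2, 18), 637_034),
]


def daily_rate(start: date, end: date, hits: int) -> float:
    """Average number of leaking page views per day over the window."""
    return hits / (end - start).days


total = sum(hits for _, _, _, hits in PHASES)
rates = [daily_rate(start, end, hits) for _, start, end, hits in PHASES]

if __name__ == "__main__":
    for (label, _, _, hits), rate in zip(PHASES, rates):
        print(f"{label}: {hits:,} page views, about {rate:,.0f} per day")
    print(f"Total: {total:,}")
    print(f"Second window vs. first: about {rates[1] / rates[0]:.0f}x per day")
```

On these assumptions, the five days after Feb. 13 produced leaking pages at roughly 30 times the daily rate of the preceding months, which is why the second window slightly outweighs the first despite being so much shorter.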

Catching up with this mishap was like unspreading butter. The pages were crawled by multiple search engines and their data elements added to the search engine caches. Prince said the incident was different from a data breach, where an intruder steals identifiable data. It's more like "a stranger may have listened in on two employees at your company talking over lunch. The good news is the amount of information for any conversation that's eavesdropped is limited. The bad news is you can't know exactly what the stranger may have heard," Prince said.

This was the weakest element of his summary, to me. A better analogy might be one of Federal Reserve employees accidentally dumping a billion dollars into the Niagara River and observant opportunists at the falls dipping nets into the stream. Even if they didn't understand why the bonanza was happening, it would be hard for the net holders not to come away with something of value.

"For the last twelve days we've been reviewing our logs to see if there's any evidence to indicate that a hacker was exploiting the bug before it was patched. We’ve found nothing so far to indicate that was the case," Prince wrote. The investigation will continue and Cloudflare will share the results, he said.

The logs that the Cloudflare teams were using represent 1% of the total Cloudflare traffic. That may or may not have been enough on which to base a conclusion that no lasting damage is likely to ensue from this incident. My own view is that it was a customer data dump of unprecedented scale. Cloudflare has also provided a textbook example of how to try to undo the damage of such an event and talk about that effort afterward.
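What can a 1% sample actually rule out? A back-of-the-envelope sketch helps, under a loudly stated assumption: that each request independently has a 1% chance of landing in the logs. Prince's account, as summarized here, does not say how the sample was drawn, so treat this as an illustration of the reasoning rather than a model of Cloudflare's logging. The function name and the request counts are hypothetical.

```python
from math import comb


def chance_seen(requests: int, sample_rate: float = 0.01, min_hits: int = 1) -> float:
    """Probability that at least `min_hits` of `requests` appear in the sample.

    Assumes every request is sampled independently with probability
    `sample_rate` (a binomial model), which is an assumption, not a
    description of how Cloudflare's logs were collected.
    """
    missed = sum(
        comb(requests, i) * sample_rate**i * (1 - sample_rate) ** (requests - i)
        for i in range(min_hits)
    )
    return 1 - missed


if __name__ == "__main__":
    for n in (1, 10, 100, 1_000, 10_000):
        print(f"{n:>6} requests: {chance_seen(n):.4f} chance of appearing in the logs")
```

Under this model, a sustained campaign of thousands of requests would almost certainly leave traces in a 1% sample, which supports the finding that nobody was hammering particular sites. A visitor who pulled a few dozen leaking pages would more likely than not leave no trace at all, which is where the lingering uncertainty lies.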

What's not clear is whether a security text in the future will be able to say everything they did was right.

About the Author(s)

Charles Babcock

Editor at Large, Cloud

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
