8 Linux Security Improvements In 8 Years
Linux started getting really serious about security in 2007, and it has made big strides in the past three years. As open source code faces more threats, Linux can't rest on its laurels.
![](https://eu-images.contentstack.com/v3/assets/blt69509c9116440be8/blt36503614458ca1cb/64cb57deb3e245e2203f0506/tux-40601_1280.png?width=700&auto=webp&quality=80&disable=upscale)
At a time when faith in open source code has been rocked by an outbreak of attacks based on the Shellshock and Heartbleed vulnerabilities, it's time to revisit what we know about Linux security. Linux is so widely used in enterprise IT, and deep inside Internet apps and operations, that any surprises related to Linux security would have painful ramifications.
In 2007, Andrew Morton, a no-nonsense colleague of Linus Torvalds known as the "colonel of the kernel," called for developers to spend time removing defects and vulnerabilities. "I would like to see people spend more time fixing bugs and less time on new features. That's my personal opinion," he said in an interview at the time.
So how's that going? Since Morton issued his call, Linux has added several million lines of code and many thousands of patches and new features. The Linux kernel development process has shown marked improvement on the security front. It was as good as, or better than, most commercial code when Morton issued his 2007 challenge. When InformationWeek checked into the kernel's defect-fixing record, we were surprised by how many gains have been made in the last three years.
Linux is better than most commercial code. For example, where one defect per 1,000 lines of code is considered quality, Linux in July 2014 had .55 defects per 1,000 lines. Linux also is better than most other open source projects. That didn't happen overnight, and it didn't happen without changes to the kernel process. What has happened with Linux should serve as a standard by which other projects are measured. As concern grows about the security and maintainability of open source code in the Internet's infrastructure, there may be lessons to learn from Linux's example.
Linux is an extremely large software project. It had 4,100 contributors to its last release, and over half of them were new contributors. It's one thing for a small and practiced software team to ride herd on a tight code base and police each other's bugs. It's another thing entirely to clean up a long-term project with a sprawling and revolving list of contributors. The larger the project, the higher the likely rate of defects. With that in mind, let's look at steps Linux has taken, learn about the people involved in that effort, and explore how Linux stacks up in 2015.
Andrew Morton wasn't the only one concerned about defects creeping into open source code widely used on the Internet. The Department of Homeland Security, smelling trouble in the loosely supervised spread of open source code, issued a large contract in 2006 to a group in the Computer Science Laboratory at Stanford University. It was their job to produce an automated code-checking system that could scan the C, C++, C#, and Java code in many open source code projects. Coverity, a firm spun out of that Stanford research, capitalized on the code analysis service that resulted, looking to make a business out of scanning commercial and open source code after the DHS contract ran out.
The project produced the Coverity Static Analysis Verification Engine (SAVE). The online service could be loaded with a project's latest build in order to perform a static analysis, meaning the code is not running. Its lines would be examined one by one, and code paths analyzed with the target system at rest. By running many tests of different variables passing through the system, the engine could spot buffer overflows, broken authentication, cross-site scripting, code injection opportunities, and other vulnerabilities that a malicious hacker might take advantage of.
The scanning system was slow to build credibility and catch on with independent-minded open source code projects. It would be several years before the Linux project embraced Coverity scans, but once it did, the payoff proved dramatic.
Pictured above is an example of what developer Steve French sees when he logs into his account at Coverity and inspects the Linux scan report.
As Coverity produced free scans of projects, and sought to engage in a discussion about what they meant, it gained converts among a few leading open source projects, including the Apache Web Server and Samba, the file-sharing bridge between Linux and Windows systems. In July 2013 the Linux kernel project named Dave Jones, an experienced bug hunter and fixer on Red Hat's Fedora project, as its Coverity scan manager for Linux builds. Jones subdivided the big kernel project into several code groups, with the bulky device drivers in the project further subdivided and tested by a few large groups reflecting device types. Jones's subdivision of the kernel and the Coverity scans began to produce a picture of where persistent problems lay in the kernel by identifying and listing specific defects.
For years the number of unfixed defects in the kernel had been growing at a faster rate than the number of fixes. In 2013, that trend began to reverse itself, and fixes started to accumulate faster than new unfixed defects.
Coverity's automated identification of bugs, defects, and vulnerabilities proved that knowing the bugs existed was not enough. Even knowing precisely where flaws were located hadn't been the real problem, since for years kernel developers understood that unfixed bugs remained resident in the code. What Linux really needed was a process to make sure the party responsible for a bug paid attention and fixed it. Getting developers to do that work, when all the glory lay in getting new code accepted into the kernel, proved hard to do.
With practice, Dave Jones's scan team could connect the problematic code highlighted by the scans to the developer who submitted it. Then Jones, the code's reviewer for the kernel, or on rare occasions Linus Torvalds himself could point to the scan and ask the submitter to fix the problem. Developers became accountable for the code they wrote. Getting recognition is a big part of an open source project. So when merit was earned not only for the amount of code contributed but also for its lack of bugs, developers started working to eliminate bugs attached to their names.
On August 13, 2014, Jones reported on his first full year of administering Coverity scans for Linux. (He moved from Red Hat to become principal software engineer at Akamai Technologies at the end of 2014.) He wrote:
Many developers have signed up for (Coverity) accounts and are looking over their subsystems [with] each release, which is great. It means I have to spend less time sending email. :)
In other words, rather than waiting for the finger of authority to point at them, kernel code submitters were voluntarily scrutinizing the scan lists of defects to see what they should fix.
In the same blog post, Dave Jones offered a chart to illustrate the declining defect rate. In commercial code a defect rate of one per 1,000 lines of code is considered high-quality code. In the Linux kernel, the defect rate already was better than that mark -- at less than .7 defects per 1,000 lines -- and continued its decline from August 2013 through July 2014.
Another lesson from the Linux experience since 2013 is that speed in spotting defects and vulnerabilities is essential to getting them fixed. By scanning recently built code several times a week, Dave Jones is identifying problems right after the code has been reviewed and committed and is still fresh in the submitter's -- the original developer's -- mind.
Jones commented in the 2013 Coverity Scan Report on the result of the early scans:
Overall there is an increased interest in the bugs that are being found through the Coverity Scan service. We have more people looking at the bugs and taking them more seriously.
The battle with existing bugs in the Linux code base started long before the Coverity scans began in 2013. In 2007, the year that Andrew Morton first sounded his call to arms against defects, the Linux kernel had 3,458,369 lines of code, with 425 new bugs identified that year (217 were fixed). In 2012, with the kernel then containing 7,387,508 lines of code, 5,803 bugs were identified and 5,170 fixed, a ratio much closer to parity. In 2013, the ratio reversed itself for the first time, with more bugs fixed than found their way into the code. A total of 3,299 were identified in a kernel that consisted of 8,578,254 lines of code, and 3,346 were fixed. The kernel has since grown to nearly 20 million lines of code.
Greg Kroah-Hartman, Linux kernel developer and its chief maintainer after Torvalds issues a release, credits Coverity with improving the process of bug fixing. "We all get emails from Coverity when it finds new problems in the tree, so that might be one reason why" the defect rate is declining, he said in an exchange of emails with InformationWeek in April. At the time, he was in Paris at Pierre and Marie Curie University working with doctoral students on producing clean Linux code. He noted that there are tools other than Coverity as well.
"We have great tools like Coccinelle, which allows scripts to be written to analyze code for problems, and it creates fixes for those problems," he wrote. "We get new scripts added to our testing framework every week, to ensure that problems that match patterns that we have fixed in the past never get added again in any new code." In other words, not only is the defect automatically identified, but the fix, if it's a proven one, can be generated automatically by the tool.
The Linux project is using Coccinelle on submitted code before it goes into a kernel release, eliminating bugs before the code gets exposed to a Coverity scan. In addition to scans, Kroah-Hartman said he thinks the decline in bugs can be "attributed to something totally different. We are now doing a _LOT_ of automated testing on every single commit that goes into the kernel tree, before it gets there," he wrote.
Coccinelle was invented by Julia Lawall, a computer science researcher at the LIP6 laboratory and Inria (the French national institute for research in computer science) at Pierre and Marie Curie University. Coccinelle "is built to take rules and then transform code based on those rules. It works very well and is very powerful," wrote Kroah-Hartman. "Because of this, I like to tell people that Julia has fixed more security bugs in the kernel than anyone else, and continues to automatically do so without her having to do additional work. The tool she wrote and maintains does it for us."
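A Coccinelle rule, called a semantic patch, reads like a diff with metavariables that match arbitrary expressions. The fragment below is a simplified sketch of one well-known published rule, which replaces a kmalloc() followed by a memset() with a single kzalloc() call that allocates zeroed memory:

```
@@
expression x, size, flags;
@@
- x = kmalloc(size, flags);
+ x = kzalloc(size, flags);
  ...
- memset(x, 0, size);
```

The `expression` declarations let one rule match every variable name and size calculation in the tree, which is how a single script can generate fixes across thousands of files.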
Greg Kroah-Hartman said the Zero-day Bot tool, developed at Intel and run by Fengguang Wu, "runs through all development trees (where kernel committers place new, approved code) and runs a ton of static analysis tools that we have." These include the xfstests suite for file system testing and the open source smatch static analysis tool. Zero-day Bot sends reports of the problems it finds to kernel committers, who are reviewing developers' patches. It can sometimes offer automated patches with which to fix them as well.
Fengguang Wu is a full-time kernel developer employed by Intel in Shanghai and was educated at the University of Science and Technology of China. At the 2012 Linux Summit, he spoke about the Intel Zero-day Bot, so named because it doesn't allow any time (zero days) for bad code in the Linux kernel to be exposed for possible exploits. The Linux Weekly News reported Fengguang Wu as saying: "The best way to ensure [defects] are resolved is to quickly and accurately determine the cause of the regression and promptly notify the developer …"
The tool emails the developer a notice of a bug's existence one hour after code commits have been merged and tested. The swift notice means the affected developer "is more likely to be 'hot' on the technical details and able to fix the problem quickly," he said.
The process of merging various Linux code trees into a new release takes weeks or months, and means the developer who submitted patches has other things on his mind by the time a Coverity scan is conducted on the new release.
Coverity isn't allowed to release the results of its tests of commercial code, such as Oracle or SAP, but it has tested other open source projects and posted the results. The Linux kernel's .55 defect rate per 1,000 lines of code compares favorably with most other open source projects. It looks particularly good when compared to Java-based projects, such as Hadoop, with 1.71 defects per 1,000, or CloudStack, with 6.96 in 2013. But Java is more prone to defects than C or C++ projects, and a more accurate comparison would be with some of them. Among operating system projects, NetBSD and FreeBSD, both versions of Berkeley Unix, would be more comparable to Linux. They come in at 1.09 and 1.06 per 1,000 lines, respectively. The Samba Project is among the lowest, with a rate that's ranged from .63 to .59 per 1,000 over the last two years, according to the Coverity scans.