Take a crash course in Ubuntu server administration and learn specific details that set Ubuntu Server apart from other platforms in this chapter from The Official Ubuntu Book.
The Story of RAID
If you've got only one hard drive in your server, feel free to skip ahead. Otherwise, let's talk about putting those extra drives to use. The acronym RAID stands for redundant array of inexpensive disks, although if you're a businessperson, you can substitute the word independent for inexpensive. We forgive you. And if you're in France, RAID is short for recherche assistance intervention dissuasion, which is an elite commando unit of the National Police, but if that's the RAID you need help with, you're reading the wrong book. We think RAID is just a really awesome idea for data: When dealing with your information, it provides extra speed, fault tolerance, or both.
At its core, RAID is just a way to spread or replicate information across multiple physical drives. The process can be set up in a number of ways, and specific kinds of drive configurations are referred to as RAID levels. These days, even low to mid-range servers ship with integrated hardware RAID controllers, which operate without any support from the OS. If your new server doesn't come with a RAID controller, you can use the software RAID functionality in the Ubuntu kernel to accomplish the same goal.
Setting up software RAID while installing your Linux system was difficult and unwieldy only a short while ago, but it is a breeze these days: The Ubuntu installer provides a nice, convenient interface for it and then handles all the requisite backstage magic. You can choose from three RAID levels: 0, 1, and 5.
RAID 0: A so-called striped set, RAID 0 allows you to pool the storage space of a number of separate drives into one large, virtual drive. The important thing to keep in mind is that RAID 0 does not simply concatenate the physical drives; it spreads the data across them evenly, which means that no more space will be used on each physical drive than can fit on the smallest one. In practical terms, if you had two 250GB drives and a 200GB drive, the total amount of space on your virtual drive would equal 600GB; 50GB on each of the two larger drives would go unused. Spreading data in this fashion provides excellent performance, because reads and writes are shared among all the drives, but it also significantly decreases reliability. If any of the drives in your RAID 0 array fails, the entire array will come crashing down, taking your data with it.
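To make the striping idea concrete, here is a minimal Python sketch. It is a toy model, not how the kernel's RAID driver is implemented: the function names stripe and surviving_blocks are made up for illustration, and each "block" is just a letter. It shows data being dealt out to the drives in turn, and why losing one drive leaves only useless fragments.

```python
def stripe(blocks, n_drives):
    """Deal blocks out to n_drives in round-robin order (toy RAID 0 layout)."""
    drives = [[] for _ in range(n_drives)]
    for index, block in enumerate(blocks):
        drives[index % n_drives].append(block)
    return drives


def surviving_blocks(drives, failed):
    """Return the blocks still readable after the drive at index `failed` dies."""
    return [block
            for i, drive in enumerate(drives) if i != failed
            for block in drive]


# Six blocks of one file, spread over three drives.
layout = stripe(list("ABCDEF"), 3)
print(layout)                              # [['A', 'D'], ['B', 'E'], ['C', 'F']]

# Drive 1 fails: blocks B and E are gone, so the file has holes throughout.
print(surviving_blocks(layout, failed=1))  # ['A', 'D', 'C', 'F']
```

Because consecutive blocks live on different drives, every file of any size touches every drive, which is where both the speed and the fragility come from.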
RAID 1: This level provides very straightforward data replication. It takes the contents of one physical drive and mirrors them to as many other drives as you'd like. A RAID 1 array does not grow in size with the addition of extra drives; instead, it grows in reliability and read performance. The size of the entire array is limited by the size of its smallest constituent drive.
RAID 5: When the chief goal of your storage is fault tolerance, and you want to use more space than provided by the single physical drive in RAID 1, this is the level you want to use. RAID 5 lets you use n identically sized physical drives (if different-sized drives are present, no more space than the size of the smallest one will be used on each drive) to construct an array whose total available space is that of n–1 drives, and the array tolerates the failure of any one drive, but no more than one, without data loss.
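The capacity rules for the three levels can be summarized in a few lines of Python. This is only a sketch of the arithmetic described above: the function name usable_capacity_gb is hypothetical, and the minimum drive counts it enforces are the conventional ones, assumed here rather than taken from the installer.

```python
# Conventional minimum number of drives per level (an assumption of this sketch).
MIN_DRIVES = {0: 2, 1: 2, 5: 3}


def usable_capacity_gb(level, sizes_gb):
    """Usable space of an array built from drives of the given sizes (in GB)."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    n = len(sizes_gb)
    if n < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    smallest = min(sizes_gb)      # every drive contributes only this much
    if level == 0:
        return n * smallest       # striping: all drives hold data
    if level == 1:
        return smallest           # mirroring: every drive holds the same data
    return (n - 1) * smallest     # RAID 5: one drive's worth goes to parity


drives = [250, 250, 200]
print(usable_capacity_gb(0, drives))  # 600
print(usable_capacity_gb(1, drives))  # 200
print(usable_capacity_gb(5, drives))  # 400
```

With the same three drives, RAID 5 sits between the other two levels: less space than RAID 0, more than RAID 1, and it survives one failed drive.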
The Mythical Parity Drive
If you toss five 200GB drives into a RAID 5 array, the array's total usable size will be 800GB, or that of four drives. This makes it easy to mistakenly believe that a RAID 5 array "sacrifices" one of the drives for maintaining redundancy and parity, but this is not the case. Through some neat but simple mathematics (bitwise XOR, the simplest case of arithmetic over Galois fields; the heavier polynomial machinery belongs to RAID 6), the actual parity information is striped across all drives equally, allowing any single drive to fail without compromising the data. Don't worry, though. We won't quiz you on the math.
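For the curious, the idea fits in a short Python sketch. It assumes plain XOR parity and a simplified rotation of the parity block from stripe to stripe; the real on-disk layout used by the kernel differs in detail, and the names xor_blocks, build_raid5, and rebuild are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_raid5(stripes, n_drives):
    """Lay out stripes of n_drives - 1 data blocks each, adding rotating parity."""
    drives = [[] for _ in range(n_drives)]
    for s, data in enumerate(stripes):
        parity_drive = (n_drives - 1 - s) % n_drives  # simplified rotation
        chunks = list(data)
        chunks.insert(parity_drive, xor_blocks(data))
        for d in range(n_drives):
            drives[d].append(chunks[d])
    return drives


def rebuild(drives, failed):
    """Recompute the contents of one failed drive from all the survivors."""
    survivors = [drive for i, drive in enumerate(drives) if i != failed]
    return [xor_blocks(row) for row in zip(*survivors)]


stripes = [[b"AA", b"BB"], [b"CC", b"DD"], [b"EE", b"FF"]]
drives = build_raid5(stripes, n_drives=3)

# Parity sits on a different drive in each stripe, so no drive is "the" parity drive.
for failed in range(3):
    assert rebuild(drives, failed) == drives[failed]
print("any single drive can be rebuilt from the other two")
```

The trick is that the XOR of every block in a stripe, parity included, is zero, so any one missing block equals the XOR of the rest. Lose two drives, and there is no longer enough information to solve for either.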