SSD Options: Tier Vs. Cache
By George Crump
InformationWeek
Automated tiering and caching often get confused. While each vendor's technology varies a bit, automated tiering is generally seen as a more permanent placement of data on a faster tier of storage. It can also be used to move less active data to a high-capacity but more cost-effective tier. Caching is more temporary in nature, accelerating only the most active data and, in most cases, not moving old data to a third tier of storage.
The challenge in grasping these two methods is that, when applied to solid state storage, they look similar. In the past, caching was thought of as a very small area of memory used to accelerate disk access for a very short period of time; often it held only the most recent minutes of accessed data. The chances of a cache miss were relatively high, which meant performance degraded as data was retrieved from the mechanical hard disk. This led to a very narrow deployment model: either a single server or a specific application on that server.
With the falling cost of today's flash-based SSDs, a very large cache can be created and data can reside in cache for a long period of time. This of course reduces the chance of a cache miss. It also means data can stay in cache for hours, even days, if the flash memory is sized large enough. Flash has allowed large caches to be deployed much more broadly, across multiple servers and applications.
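To show why the miss rate matters so much, the expected service time of a read is just a weighted average of flash and disk latency. The numbers below are hypothetical round figures, not measurements of any particular product.

```python
# Hypothetical latencies: ~0.1 ms for a flash hit, ~8 ms for a miss
# that falls through to a mechanical drive.
FLASH_LATENCY_MS = 0.1
DISK_LATENCY_MS = 8.0

def effective_read_latency(hit_rate):
    """Expected read latency for a given cache hit rate (0.0 to 1.0)."""
    return hit_rate * FLASH_LATENCY_MS + (1.0 - hit_rate) * DISK_LATENCY_MS

for hit_rate in (0.50, 0.90, 0.99):
    print(f"hit rate {hit_rate:.0%}: ~{effective_read_latency(hit_rate):.2f} ms per read")
```

With these assumed figures, moving from a 50% to a 99% hit rate cuts the average read from roughly 4 ms to well under a quarter of a millisecond, which is the whole argument for a cache large enough to hold days of active data.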
A big difference between caching and automated tiering is that the data in cache is always a second copy of data that remains on the hard drive, while automated tiering actually moves the data off the hard drive. Failure of the cache rarely produces data loss, just a performance loss, since everything has to be served from mechanical drives until the cache is replaced.
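The copy-versus-move distinction can be boiled down to a minimal sketch. The classes and method names below are purely illustrative, not any vendor's implementation: the cache leaves the hard-drive copy in place, while the tiering engine relocates the block so the SSD holds the only copy.

```python
class ReadCache:
    """Caching: the SSD holds a second copy; the hard-drive copy stays put."""
    def __init__(self, hdd, ssd):
        self.hdd, self.ssd = hdd, ssd

    def read(self, block):
        if block in self.ssd:            # cache hit: serve from flash
            return self.ssd[block]
        data = self.hdd[block]           # cache miss: fall back to disk
        self.ssd[block] = data           # populate the cache with a second copy
        return data


class TieringEngine:
    """Automated tiering: hot blocks are moved, so flash holds the only copy."""
    def __init__(self, hdd, ssd):
        self.hdd, self.ssd = hdd, ssd

    def promote(self, block):
        if block in self.hdd:
            self.ssd[block] = self.hdd.pop(block)   # block is removed from disk

    def read(self, block):
        return self.ssd[block] if block in self.ssd else self.hdd[block]
```

Throw away the ssd dictionary in ReadCache and nothing is lost but speed; throw it away in TieringEngine and every promoted block is gone, which is exactly why the SSD tier needs its own protection.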
Since the SSD tier potentially holds the only copy of data in an automated tiering system, failure of the SSD tier can't be tolerated, so these systems have to place the SSD tier in a redundant configuration using a RAID-like data protection scheme. The overhead of that protection, RAID parity calculation for example, may impact performance, and any RAID algorithm requires extra disk capacity. Having to purchase extra SSDs to support a RAID-like function makes an already premium-priced technology even more expensive.
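To put a rough number on that capacity overhead, here is a back-of-the-envelope calculation assuming a hypothetical four-drive, single-parity (RAID 5 style) SSD tier; the drive count and capacity are made up for illustration.

```python
# Hypothetical SSD tier: four 400 GB drives with single parity (RAID 5 style).
drives = 4
drive_capacity_gb = 400

raw_gb = drives * drive_capacity_gb
usable_gb = (drives - 1) * drive_capacity_gb   # one drive's worth lost to parity
overhead = 1 - usable_gb / raw_gb

print(f"raw: {raw_gb} GB, usable: {usable_gb} GB, "
      f"capacity lost to protection: {overhead:.0%}")
# raw: 1600 GB, usable: 1200 GB, capacity lost to protection: 25%
```

In this example a quarter of the flash you pay for goes to protection rather than to hot data, on top of any performance cost of the parity calculation itself.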
In most situations, read performance should be about the same between the two options. It depends mostly on how efficiently, and how flexibly, the caching or tiering engine promotes data; the goal is to make sure the right data is on flash at the right moment in time. As we discuss in our recent article "Maximizing SSD Investment With Analytics," we believe this is the largest opportunity for improvement in the technology. Both caching and automated tiering need to become smarter about what they place on flash and when.
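One simple way a smarter system could decide what belongs on flash is to track recent access counts and promote only blocks whose activity crosses a threshold. The sketch below is a generic frequency-based policy of our own construction, not the mechanism of any specific caching appliance or tiering product.

```python
from collections import Counter

class FrequencyPromoter:
    """Flag a block for promotion once it has been read 'threshold' times
    within the current counting window. Purely illustrative."""
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.access_counts = Counter()
        self.promoted = set()

    def record_read(self, block):
        self.access_counts[block] += 1
        if (block not in self.promoted
                and self.access_counts[block] >= self.threshold):
            self.promoted.add(block)
            return True      # caller should copy or move this block to flash
        return False

    def reset_window(self):
        """Start a new counting window so stale activity ages out."""
        self.access_counts.clear()
```

The interesting questions, and where analytics come in, are how big the window should be and whether the threshold should adapt to the workload rather than stay fixed.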
Another area to examine is write performance, which can be a clear area of distinction between automated tiering and caching. We'll cover this in our next entry.
Follow Storage Switzerland on Twitter
George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Storage Switzerland's disclosure statement.