Importance of Striping in the Cloud
By Joe Kinsella & Josh Pasqualetto
Introduction
While the cloud offers numerous means of storing data, the Swiss army knife of cloud storage remains the block store. Until recently, Amazon was the only provider of block storage, having recognized early the importance of attaching variable-sized storage to compute as a means of supporting a wide variety of customer use cases. But as other cloud providers move to add block storage to their offerings, not all are acknowledging the most important feature of the Amazon block store: striping.
Background
A block store is a service that allows users to attach storage volumes to virtual instances. For example, Amazon EBS allows one or more volumes of up to 1 TB to be attached to EC2 instances and used as virtual disks. Block storage remains the only option when there is a need to read/write large amounts of data using a file system, or when an application requires high I/O performance. Striping multiple volumes together (RAID 0) is one way to achieve that performance. It allows the host system to drive multiple parallel input/output data streams to multiple disks while they appear to the host machine as a single disk.
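To make the striping idea concrete, here is a minimal sketch of how a RAID 0 layout might map a logical block on the single visible disk to one of the underlying volumes. The function name `locate_block` and the parameter `blocks_per_stripe_unit` are hypothetical names chosen for illustration; real striping is handled by the operating system's RAID or volume manager, not by application code like this.

```python
from typing import Tuple


def locate_block(logical_block: int, num_volumes: int,
                 blocks_per_stripe_unit: int = 1) -> Tuple[int, int]:
    """Map a logical block number to (volume index, block offset on that volume).

    Illustrative round-robin RAID 0 layout; names and defaults are assumptions.
    """
    if logical_block < 0 or num_volumes < 1 or blocks_per_stripe_unit < 1:
        raise ValueError("arguments must be non-negative / positive")
    # Which stripe unit (chunk) the block falls in, counted across the whole array.
    stripe_unit = logical_block // blocks_per_stripe_unit
    # Position of the block inside that stripe unit.
    within_unit = logical_block % blocks_per_stripe_unit
    # Stripe units are dealt out to the volumes in round-robin order.
    volume = stripe_unit % num_volumes
    # Number of complete rows of stripe units that precede this one.
    row = stripe_unit // num_volumes
    return volume, row * blocks_per_stripe_unit + within_unit


# Eight consecutive logical blocks spread over four volumes.
layout = [locate_block(b, num_volumes=4) for b in range(8)]
print(layout)
```

Because consecutive stripe units land on different volumes, a large sequential read or write touches all volumes at once, which is where the parallelism comes from.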
Amazon first introduced EBS in August 2008, and it almost instantly became a foundational component of their cloud. It has also been a sometimes maligned component of the Amazon Web Services infrastructure, due in large part to its historically high performance variability and the role it played in the April 2011 outage. Since the outage, though, an increasing focus on EBS has improved the reliability of the service, providing a standard for other vendors to match.
Striping By the Numbers
A benchmark test on a single EBS volume on Amazon shows it performs around 60 input/output operations per second (IOPS), which is about 60% as fast as the 7200 RPM disk in a typical laptop or desktop. While this performance is not overly impressive, its ability to support a write rate of about 70 MB/sec still makes it an effective storage option for a variety of use cases (by comparison, one external analysis of the Amazon S3 object store shows a single-instance write rate of about 15 MB/sec).
To demonstrate the importance of striping to block storage, we performed benchmark tests on different configurations of Amazon EBS volumes, producing the results below. Note: we have included physical hardware in the results for a side-by-side comparison.
[table id=13 /]
As you can see, when you attach more than one volume to an instance and expose it as a striped disk, the less than impressive performance of a single EBS volume becomes high performance. When you stripe 8 or more volumes, you begin to achieve disk performance that mirrors, and in some cases exceeds, the speed of high-performance disk drives. Note: it is not generally recommended to stripe more than 8 volumes, because aggregating volumes reduces the Mean Time Between Failures (MTBF) of the striped disk: the loss of any one volume takes the whole disk with it.
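The reliability cost of aggregating volumes can be estimated with a rough sketch. It assumes that volumes fail independently at a constant rate, so a stripe of N volumes, which is lost as soon as any one volume fails, has roughly 1/N the MTBF of a single volume. The function name `striped_mtbf` and the 100,000-hour figure are hypothetical placeholders chosen for illustration; they are not measured or published values for EBS.

```python
def striped_mtbf(volume_mtbf_hours: float, num_volumes: int) -> float:
    """Approximate MTBF of a RAID 0 stripe.

    Assumes independent volumes with constant failure rates, so the
    failure rates add and the combined MTBF is the single-volume MTBF
    divided by the number of volumes.
    """
    if num_volumes < 1:
        raise ValueError("num_volumes must be at least 1")
    return volume_mtbf_hours / num_volumes


# Hypothetical per-volume MTBF, for illustration only (not an EBS figure).
ASSUMED_VOLUME_MTBF_HOURS = 100_000.0

for n in (1, 2, 4, 8, 16):
    print(n, striped_mtbf(ASSUMED_VOLUME_MTBF_HOURS, n))
```

Under these assumptions, each doubling of the stripe width halves the expected time to failure, which is the trade-off to weigh against the added throughput.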
The table below gives a more detailed breakdown of random and sequential reads and writes to disks with different striping configurations, further demonstrating the value of striping on a block store.
[table id=14 /]
Conclusions
If a block store is the Swiss army knife of cloud storage, striping is the indispensable main blade. It gives cloud customers the ability to support a wide variety of use cases and performance requirements for which other forms of storage (e.g. object, table) are not appropriate. It also allows fine-grained control over the trade-off between the MTBF and the performance of cloud storage. Cloud providers take note: striping is an essential feature for the success of any block store.