In computing, specifically computer storage, RAID (Redundant Array of Independent Drives or Disks, also known as Redundant Array of Inexpensive Drives or Disks) is an umbrella term for data storage schemes that divide and/or replicate data among multiple hard drives. RAID can be designed to provide increased data reliability, increased I/O performance, or both.
A number of standard schemes have evolved which are referred to as levels. There were five RAID levels originally conceived, but many more variations have evolved, notably several nested levels and many non-standard levels (mostly proprietary).
Overview
RAID combines physical hard disks into a single logical unit by using either special hardware or software. Hardware solutions are often designed to present themselves to the attached system as a single hard drive, so the operating system is unaware of the technical workings. Software solutions are typically implemented in the operating system, and likewise present the RAID array as a single drive to applications.
There are three key concepts in RAID: mirroring, the copying of data to more than one disk; striping, the splitting of data across more than one disk; and error correction, where redundant data is stored so that problems can be detected and possibly fixed (known as fault tolerance). Different RAID levels use one or more of these techniques, depending on the system requirements. The main aims of using RAID are to improve reliability, which matters when protecting information that is critical to a business (for example, a database of customer orders), and to improve speed, which matters in, for example, a system that delivers on-demand TV programs to many viewers.
The configuration affects reliability and performance in different ways. The problem with using more disks is that it becomes more likely that one of them will fail, but with error checking the total system can be made more reliable because it can survive and repair the failure. Basic mirroring can speed up reading, as a system can read different data from each of the disks, but it may be slow for writing if the system insists that every disk must confirm that the data is correctly written. Striping is often used for performance, as it allows sequences of data to be read off multiple disks at the same time. Error checking typically slows the system down, as data needs to be read from several places and compared. The design of a RAID system is therefore a compromise, and understanding the requirements of the system is important. Modern disk arrays typically provide the facility to select the appropriate RAID configuration.
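The trade-off between adding disks and adding redundancy can be put into rough numbers. The following is a minimal sketch, assuming that disks fail independently and with the same probability p over some period; the value of p and the function names are hypothetical and chosen only for illustration.

```python
def prob_any_disk_fails(p: float, n: int) -> float:
    """Probability that at least one of n independent disks fails."""
    return 1.0 - (1.0 - p) ** n


def prob_mirror_lost(p: float, copies: int = 2) -> float:
    """Probability that every copy in a mirror fails, so the data is lost."""
    return p ** copies


if __name__ == "__main__":
    p = 0.03  # hypothetical per-disk failure probability, for illustration only
    for n in (1, 2, 4, 8):
        print(f"{n} disks: P(at least one fails) = {prob_any_disk_fails(p, n):.4f}")
    print(f"2-way mirror: P(data lost) = {prob_mirror_lost(p):.4f}")
```

Under these assumptions the chance of seeing some failure grows with the number of disks, while the chance of losing mirrored data is much smaller than the chance of losing a single disk. The model ignores repair and correlated failures.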
RAID systems can be designed to keep working when there is a failure: disks can be hot swapped and data recovered automatically while the system keeps running. Other systems have to be shut down while the data is recovered. RAID is often used in high availability systems, where it is important that the system keeps running as much of the time as possible.
RAID is typically used on servers but can also be used on workstations, especially storage-intensive computers such as those used for video and audio editing.
RAID Level 0
RAID 0: Striped Set (2 disks minimum) without parity. Provides improved performance and additional storage but no fault tolerance against disk errors or disk failure. Any disk failure destroys the array, and a failure becomes more likely as disks are added to the array. A single disk failure destroys the entire array because data written to a RAID 0 array is broken into fragments, with the number of fragments dictated by the number of disks in the array. The fragments are written to their respective disks simultaneously, at the same location on each disk. The fragments can then be read back in parallel, which gives this arrangement high bandwidth. When a sector on one of the disks fails, however, the corresponding sectors on the other disks are rendered useless because part of the data is now missing. RAID 0 does not implement error checking, so any error is unrecoverable. More disks in the array means higher bandwidth but greater risk of data loss.
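The fragmenting described above can be illustrated with a small in-memory sketch. It assumes a fixed chunk size and a simple round-robin placement, and it models each disk as a Python list; the names stripe and unstripe are hypothetical, and real implementations work on block devices rather than byte strings.

```python
from typing import List, Optional


def stripe(data: bytes, num_disks: int, chunk_size: int) -> List[List[bytes]]:
    """Split data into chunks and deal them round-robin across the disks."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: List[Optional[List[bytes]]]) -> bytes:
    """Reassemble the data; a failed disk (None) makes the array unreadable."""
    if any(d is None for d in disks):
        raise IOError("RAID 0 has no redundancy: one failed disk loses the array")
    out = []
    for row in range(max(len(d) for d in disks)):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


if __name__ == "__main__":
    disks = stripe(b"ABCDEFGHIJ", num_disks=3, chunk_size=2)
    print(disks)
    print(unstripe(disks))
```

Marking any one disk as None in this sketch makes unstripe raise an error, which mirrors the point that RAID 0 cannot recover from a single failure.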
RAID Level 1
RAID 1: Mirrored Set (2 disks minimum) without parity. Provides fault tolerance against disk errors and single disk failure. Read performance increases when a multi-threaded operating system that supports split seeks is used; write performance is reduced only slightly. The array continues to operate as long as at least one drive is functioning.
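Mirroring can be sketched in a few lines. The class below is a hypothetical in-memory model, not any real driver's interface: each disk is a dictionary from block number to data, writes go to every working disk, and reads are served by the first working disk.

```python
class Mirror:
    """Toy model of a mirrored set; disks are dicts of block number -> bytes."""

    def __init__(self, num_disks: int = 2) -> None:
        self.disks = [dict() for _ in range(num_disks)]
        self.failed = set()

    def fail(self, disk_index: int) -> None:
        """Mark one disk as failed."""
        self.failed.add(disk_index)

    def write(self, block: int, data: bytes) -> None:
        """Write the same data to every disk that is still working."""
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block] = data

    def read(self, block: int) -> bytes:
        """Read from the first working disk; fail only if none is left."""
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block]
        raise IOError("all mirrors have failed")


if __name__ == "__main__":
    m = Mirror(2)
    m.write(0, b"orders")
    m.fail(0)
    print(m.read(0))  # still readable from the surviving disk
```

A real implementation could also spread reads across the working disks, which is where the read performance gain mentioned above comes from; this sketch does not model that.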
RAID Level 3 and Level 4
RAID 3 and RAID 4: Striped Set (3 disks minimum) with Dedicated Parity. The parity data is computed from the data stored at the same position on each of the data disks (typically as their bitwise exclusive-or), so the contents of any single failed disk can be recalculated from the surviving disks. This mechanism provides improved performance and fault tolerance similar to RAID 5, but with a dedicated parity disk rather than rotated parity stripes. The single parity disk is a bottleneck for writing, since every write requires updating the parity data. One minor benefit is that if the dedicated parity drive fails, operation continues without parity and without a performance penalty.
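The write bottleneck follows from how parity is maintained. The sketch below assumes parity is the bytewise exclusive-or (XOR) of the data blocks, which is the usual choice; the function names are hypothetical. It shows that changing one data block also requires rewriting the parity block.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity after one data block changes: remove the old data, add the new."""
    return xor_blocks(old_parity, old_data, new_data)


if __name__ == "__main__":
    d0, d1 = b"\x0f\xf0", b"\x33\x33"
    parity = xor_blocks(d0, d1)           # stored on the dedicated parity disk
    print(xor_blocks(parity, d1) == d0)   # d0 recovered from parity and d1
    new_d0 = b"\xff\x00"
    parity = update_parity(parity, d0, new_d0)
    print(parity == xor_blocks(new_d0, d1))
```

Because every data write ends with a write to the same parity disk, that disk sees more traffic than any single data disk.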
RAID Level 5
RAID 5: Striped Set (3 disks minimum) with Distributed Parity. Distributed parity requires all but one drive to be present to operate; a failed drive must be replaced, but the array is not destroyed by a single drive failure. After a drive failure, any subsequent reads can be calculated from the distributed parity, so the drive failure is masked from the end user. The array will lose data in the event of a second drive failure and is vulnerable until the data that was on the failed drive is rebuilt onto a replacement drive.
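A degraded read can be sketched as follows, assuming XOR parity and one simple rotation of the parity position from stripe to stripe. The rotation rule and the function names are hypothetical; real implementations use several different layouts.

```python
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Bytewise XOR of a list of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk holding parity for a stripe (one possible rotation scheme)."""
    return (num_disks - 1 - stripe) % num_disks


def build_stripe(data_chunks: List[bytes], stripe: int, num_disks: int) -> List[bytes]:
    """Lay out num_disks - 1 data chunks plus one parity chunk for a stripe."""
    p = parity_disk(stripe, num_disks)
    row: List[Optional[bytes]] = [None] * num_disks
    row[p] = xor_blocks(data_chunks)
    chunks = iter(data_chunks)
    for disk in range(num_disks):
        if disk != p:
            row[disk] = next(chunks)
    return row


def read_degraded(row: List[bytes], failed_disk: int) -> bytes:
    """Recompute the failed disk's chunk from the chunks on all surviving disks."""
    return xor_blocks([c for i, c in enumerate(row) if i != failed_disk])


if __name__ == "__main__":
    row = build_stripe([b"AA", b"BB"], stripe=0, num_disks=3)
    print(read_degraded(row, failed_disk=0))  # the chunk that was on disk 0
```

With a second failed disk there would be two unknown chunks and only one parity equation, which is why a second failure causes data loss.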
RAID Level 6
RAID 6: Striped Set (4 disks minimum) with Dual Distributed Parity. Provides fault tolerance against two drive failures; the array continues to operate with up to two failed drives. This makes larger RAID groups more practical, especially for high availability systems. As drives grow in size (a single drive may be 1 terabyte), they become more prone to errors, and the array is exposed to a further failure for longer while a failed drive is being rebuilt. Single parity RAID levels are vulnerable to data loss until the failed drive is rebuilt: the larger the drive, the longer the rebuild will take. Dual parity gives time to rebuild the array by recreating a failed drive while retaining the ability to sustain a failure on another drive in the same array.
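The link between drive size and rebuild time is simple arithmetic. The sketch below assumes the rebuild is limited by a constant sustained transfer rate; the 50 MB/s figure is a hypothetical value chosen for illustration, not a measured one.

```python
def rebuild_hours(capacity_bytes: float, rate_bytes_per_second: float) -> float:
    """Time to rewrite a whole drive at a constant sustained rate, in hours."""
    return capacity_bytes / rate_bytes_per_second / 3600.0


if __name__ == "__main__":
    one_terabyte = 1e12   # the 1 terabyte drive mentioned in the text
    assumed_rate = 50e6   # hypothetical sustained rebuild rate of 50 MB/s
    print(f"{rebuild_hours(one_terabyte, assumed_rate):.1f} hours")
```

Under these assumptions the rebuild takes about five and a half hours, during which a single parity array has no remaining redundancy. Doubling the drive size at the same rate doubles that window.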
Nested RAID levels
Many storage controllers allow RAID levels to be nested. That is, one RAID can use another as its basic element, instead of using physical drives. It is instructive to think of these arrays as layered on top of each other, with physical drives at the bottom.
Nested RAIDs are usually signified by joining the numbers indicating the RAID levels into a single number, sometimes with a '+' in between. For example, RAID 10 (or RAID 1+0) conceptually consists of multiple level 1 arrays stored on physical drives with a level 0 array on top, striped over the level 1 arrays. RAID 0+1 is most often written that way, rather than as RAID 01, to avoid confusion with RAID 1. However, when the top array is a RAID 0 (such as in RAID 10 and RAID 50), most vendors choose to omit the '+', though RAID 5+0 is more informative.
Common nested RAID levels
* RAID 0+1: Striped Set + Mirrored Set (4 disks minimum; even number of disks) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. The array continues to operate with one or more drives failed on the same side of the mirror, but if drives fail on both sides of the mirror, the data on the RAID system is lost.
* RAID 1+0: Mirrored Set + Striped Set (4 disks minimum; even number of disks) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses as long as no mirror loses all of its drives.
* RAID 5+0: A stripe across distributed parity RAID systems
* RAID 5+1: A mirror of striped sets with distributed parity (some manufacturers label this as RAID 53)
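The difference in failure tolerance between RAID 0+1 and RAID 1+0 can be checked by enumeration. The sketch below assumes four drives numbered 0 to 3, grouped either as two mirrored pairs (RAID 1+0) or as two striped sets (RAID 0+1); the function names and the grouping are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def raid10_survives(mirror_pairs: Iterable[Tuple[int, ...]], failed: Set[int]) -> bool:
    """RAID 1+0 survives while every mirror still has at least one working drive."""
    return all(not set(pair) <= failed for pair in mirror_pairs)


def raid01_survives(stripe_sets: Iterable[Tuple[int, ...]], failed: Set[int]) -> bool:
    """RAID 0+1 survives while at least one striped set has no failed drive."""
    return any(not (set(s) & failed) for s in stripe_sets)


if __name__ == "__main__":
    groups = [(0, 1), (2, 3)]  # pairs for RAID 1+0, striped sets for RAID 0+1
    two_failures = [set(c) for c in combinations(range(4), 2)]
    print("RAID 1+0 survives", sum(raid10_survives(groups, f) for f in two_failures),
          "of", len(two_failures), "two-drive failures")
    print("RAID 0+1 survives", sum(raid01_survives(groups, f) for f in two_failures),
          "of", len(two_failures), "two-drive failures")
```

With four drives there are six possible two-drive failures; in this model RAID 1+0 survives four of them and RAID 0+1 survives two.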
Software RAID
Software implementations are now provided by many operating systems. A software layer sits above the (generally block based) disk device drivers and provides an abstraction layer between the logical drives (RAID arrays) and physical drives. Software RAID is typically limited to RAID 0 (striping across multiple drives for increased space and performance), RAID 1 (mirroring two drives) and RAID 5 (data striping with parity).
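The abstraction layer is essentially an address translation. The following minimal sketch assumes a stripe unit of one block and hypothetical function names; it maps a logical block number on the RAID array to the physical drive or drives, and the block on each, where the data lives.

```python
from typing import List, Tuple


def map_raid0(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """RAID 0: each logical block lives on exactly one (disk, physical block)."""
    return logical_block % num_disks, logical_block // num_disks


def map_raid1(logical_block: int, num_disks: int) -> List[Tuple[int, int]]:
    """RAID 1: each logical block lives at the same position on every disk."""
    return [(disk, logical_block) for disk in range(num_disks)]


if __name__ == "__main__":
    print(map_raid0(5, num_disks=3))
    print(map_raid1(7, num_disks=2))
```

The layers above the mapping only ever see logical block numbers, which is why applications can treat the array as a single drive.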
In a multi-threaded operating system (such as Linux, FreeBSD, Mac OS X, Windows NT/2000/XP/Vista and Novell NetWare) the operating system can perform overlapped I/O, allowing multiple read or write requests to be initiated without waiting for each request to complete. This is the capability that makes RAID 0 and RAID 1 possible in an operating system. However, most operating systems do not support striping or mirroring with parity, due to the substantial processing demands of calculating parity[citation needed].
Since the software must run on a host server attached to the storage, the processor (as mentioned above) on that host must dedicate processing time to running the RAID software. As with hardware-based RAID, if the server experiences a hardware failure, the attached storage could be inaccessible for a period of time.
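Overlapped I/O can be imitated in user space with a thread pool. This is only an analogy under stated assumptions: read_chunk is a hypothetical stand-in that sleeps to simulate disk latency, whereas a real operating system issues the requests through its block device drivers.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def read_chunk(disk_id: int, delay: float = 0.05) -> bytes:
    """Hypothetical stand-in for a disk read; the sleep simulates latency."""
    time.sleep(delay)
    return f"chunk-from-disk-{disk_id}".encode()


def read_stripe_overlapped(disk_ids: List[int]) -> List[bytes]:
    """Start one read per disk without waiting for the previous one to finish."""
    with ThreadPoolExecutor(max_workers=len(disk_ids)) as pool:
        return list(pool.map(read_chunk, disk_ids))  # results keep input order


if __name__ == "__main__":
    start = time.perf_counter()
    chunks = read_stripe_overlapped([0, 1, 2, 3])
    print(chunks, f"{time.perf_counter() - start:.2f}s")
```

Because the four simulated reads overlap, the total time should be close to one read's latency rather than the sum of all four.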
Software implementations can allow RAID arrays to be created from partitions rather than entire physical drives.
Hardware RAID
A hardware implementation of RAID requires at a minimum a special-purpose RAID controller. On a desktop system, this may be a PCI expansion card, or a capability built into the motherboard. In industrial applications the controller and drives are provided as a standalone enclosure. The drives may be IDE/ATA, SATA, SCSI, SSA, Fibre Channel, or any combination thereof. The host system can be directly attached to the controller or, more commonly, connected via a SAN. The controller hardware handles the management of the drives and performs any parity calculations required by the chosen RAID level.
Most hardware implementations provide a read/write cache which, depending on the I/O workload, will improve performance. Cached RAID controllers are most commonly used in industrial applications. Sometimes the write cache is non-volatile, so pending writes are not lost on power failure.
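The role of a write cache can be shown with a toy write-back cache. The class below is a hypothetical in-memory model, not a controller's actual design: writes are acknowledged once they are in the cache and reach the backing store only when flushed, which is why losing a volatile cache loses the pending writes.

```python
from typing import Dict


class WriteBackCache:
    """Toy write-back cache in front of a backing store (a dict of blocks)."""

    def __init__(self, backing: Dict[int, bytes]) -> None:
        self.backing = backing
        self.dirty: Dict[int, bytes] = {}  # pending writes not yet on the drives

    def write(self, block: int, data: bytes) -> None:
        """Accept the write immediately; the drives are updated later."""
        self.dirty[block] = data

    def read(self, block: int) -> bytes:
        """Serve the newest data, preferring pending writes over the drives."""
        if block in self.dirty:
            return self.dirty[block]
        return self.backing[block]

    def flush(self) -> None:
        """Push all pending writes to the backing store."""
        self.backing.update(self.dirty)
        self.dirty.clear()


if __name__ == "__main__":
    drives: Dict[int, bytes] = {}
    cache = WriteBackCache(drives)
    cache.write(1, b"order")
    print(drives)   # the write has not reached the drives yet
    cache.flush()
    print(drives)   # now it has
```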
Hardware implementations provide predictable performance, add no overhead to the host CPU and can support many operating systems, as the controller simply presents a logical disk to the operating system.
Hardware implementations also typically support hot swapping, allowing failed drives to be replaced while the system is running.