RAID (Redundant Array of Independent Disks) is a technology used to increase the performance and/or reliability of data storage.
A RAID system consists of two or more disks working in parallel. These can be hard drives, but there is a growing trend toward SSDs (solid-state drives). There are different RAID levels, each optimized for a specific situation. The levels are not standardized by an industry group or standardization committee, which explains why companies sometimes come up with their own unique numbers and implementations.
This article covers the following RAID levels:
RAID 0 - striping
RAID 1 - mirroring
RAID 5 - striping with parity
RAID 6 - striping with double parity
RAID 10 - combining mirroring and striping
The software to perform the RAID functions and drive management can be located on a separate controller board (hardware RAID controller) or can simply be a driver. Some versions of Windows, such as Windows Server 2012, as well as Mac OS X, include software RAID functionality. Hardware RAID controllers cost more than pure software, but they also offer better performance, especially with RAID 5 and 6.
RAID systems can be used with multiple interfaces, including SCSI, IDE, SATA, or FC (Fibre Channel). Some systems use SATA drives internally but present a FireWire or SCSI interface to the host system.
Sometimes disks in a storage system are defined as JBOD, which stands for "just a bunch of disks". This means that these drives do not use a specific RAID level and act as stand-alone drives. This is often done for drives that contain swap files or spooled data.
RAID level 0 - Striping
In a RAID 0 system, data is divided into blocks that are written across all disks in the array. Using multiple drives at the same time (minimum 2) provides excellent I/O performance. This performance can be improved further by using multiple controllers, ideally one controller per drive.
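The block distribution described above can be sketched in a few lines. This is a minimal illustration, not how any particular controller works: the function names `locate_block` and `stripe`, the round-robin order, and the tiny block size are all assumptions made for the example.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least 2 disks")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    # Round-robin: consecutive blocks land on consecutive disks.
    return logical_block % num_disks, logical_block // num_disks


def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into blocks and distribute them across the disks."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        disk, _ = locate_block(start // block_size, num_disks)
        disks[disk].append(data[start:start + block_size])
    return disks


if __name__ == "__main__":
    # Four 2-byte blocks spread over two disks.
    print(stripe(b"ABCDEFGH", num_disks=2, block_size=2))
```

Because neighbouring blocks sit on different disks, a large sequential read or write keeps every drive busy at once, which is where the performance gain comes from.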
Advantages
RAID 0 provides high performance in both read and write operations. There is no overhead caused by parity checking.
The full storage capacity is used; no capacity is lost to redundancy.
The technology is simple to implement.
Disadvantages
RAID 0 is not fault tolerant.
If one drive fails, all data in the RAID 0 array will be lost.
It should not be used for critical systems.
Ideal use
RAID 0 is ideal for non-critical storage of data that must be read/written at high speed, for example, for image retouching or a video editing station.
If you want to use RAID 0 purely to combine the storage capacity of several drives into a single volume, consider mounting one drive at a folder path on the other drive instead. This is supported on Linux, OS X, and Windows, and has the advantage that the failure of one drive does not affect the data on the other drive or SSD.
RAID level 1 - Mirroring
Data is stored twice by writing it to both a data disk (or set of data disks) and a mirror disk (or set of mirror disks). If a disk fails, the controller uses the surviving data disk or mirror disk to recover the data and continue operation. You need at least 2 drives for a RAID 1 array.
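As a rough illustration of this behaviour, the sketch below models each disk as a Python dictionary. The `Mirror` class and its method names are hypothetical and exist only for this example; a real controller works on physical blocks and handles many more failure cases.

```python
class Mirror:
    """Toy model of a RAID 1 set: every write goes to all healthy disks."""

    def __init__(self, num_copies: int = 2):
        if num_copies < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        # Each disk is modelled as a dict: block number -> data.
        self.disks = [dict() for _ in range(num_copies)]
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy copy can serve the read.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all copies have failed")

    def fail(self, index: int) -> None:
        self.failed.add(index)
        self.disks[index] = {}

    def replace(self, index: int) -> None:
        # No parity maths needed: copy a surviving disk onto the new one.
        survivors = [d for i, d in enumerate(self.disks) if i not in self.failed]
        if not survivors:
            raise IOError("no surviving copy to rebuild from")
        self.disks[index] = dict(survivors[0])
        self.failed.discard(index)


if __name__ == "__main__":
    m = Mirror()
    m.write(0, b"ledger")
    m.fail(0)
    print(m.read(0))   # still readable from the second disk
    m.replace(0)
```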
Advantages
RAID 1 offers excellent read speeds and write speeds comparable to a single drive.
In the event of a disk failure, the data does not need to be rebuilt; it only needs to be copied to the replacement disk.
RAID 1 is a very simple technology.
Disadvantages
The main disadvantage is that the effective storage capacity is only half of the total disk capacity, since all data is written twice.
Software RAID 1 solutions do not always allow hot swapping of a failed drive. This means that a faulty disk can only be replaced after powering down the computer it is attached to.
For servers that are used by many people at the same time, this may not be acceptable. Such systems typically use hardware controllers that support hot swapping.
Ideal use
RAID 1 is ideal for mission-critical storage, such as accounting systems. It is also suitable for small servers in which only two data disks will be used.
RAID level 5 - Striping with parity
RAID 5 is the most common secure RAID level. It requires at least 3 disks, but can work with up to 16. Blocks of data are striped across the disks, and for each stripe a parity checksum of all the data blocks in that stripe is written to one disk. The parity data is not written to a fixed disk; it is spread across all disks, as shown in the figure below. Using the parity data, the computer can recalculate the contents of any one data block in a stripe if that block is no longer available. This means that a RAID 5 array can withstand a single drive failure without losing data or access to it. Although RAID 5 can be implemented in software, a hardware controller is recommended. These controllers often carry additional cache memory to improve write performance.
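The parity mechanism can be shown with XOR, which is the operation commonly used for single parity. The sketch below is only an illustration: the helper names are made up for this example, and the rotation order in `parity_disk_for_stripe` is one possible layout, since real implementations differ in how they rotate parity.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk_for_stripe(stripe_index: int, num_disks: int) -> int:
    """Pick the disk holding parity for a stripe (assumed rotation order)."""
    return (num_disks - 1 - stripe_index) % num_disks


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC"]      # three data blocks of one stripe
    parity = xor_blocks(data)               # stored on a fourth disk

    # The disk holding the second block fails: XOR the survivors with parity.
    recovered = xor_blocks([data[0], data[2], parity])
    print(recovered == data[1])
```

XOR-ing the parity block with all surviving data blocks cancels them out and leaves exactly the missing block, which is why one lost disk per stripe can always be recalculated, but two cannot.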
Advantages
Data read transactions are very fast, while data write transactions are somewhat slower (due to the parity that needs to be calculated).
In the event of a disk failure, you still have access to all of your data, even while the failed disk is being replaced and the storage controller rebuilds the data onto the new disk.
Disadvantages
Disk failures affect throughput, although this is still acceptable.
This is a complex technology. If one of the drives in an array of 4 TB drives fails and is replaced, rebuilding the data (the rebuild time) can take a day or more, depending on array load and controller speed. If another drive fails during this time, the data is lost forever.
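To get a feel for why a rebuild can take that long, a back-of-the-envelope estimate helps. The function below is a simple sketch; the sustained rebuild rates used in the example are assumptions for illustration, not measurements, and real rates depend heavily on array load and controller speed.

```python
def rebuild_hours(drive_capacity_tb: float, rebuild_rate_mb_s: float) -> float:
    """Estimate rebuild time, assuming the whole drive is rewritten at a constant rate."""
    if rebuild_rate_mb_s <= 0:
        raise ValueError("rebuild rate must be positive")
    capacity_mb = drive_capacity_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return capacity_mb / rebuild_rate_mb_s / 3600


if __name__ == "__main__":
    # Assumed rates: 50 MB/s for a busy array, 150 MB/s for an idle one.
    for rate in (50, 150):
        print(f"4 TB at {rate} MB/s: {rebuild_hours(4, rate):.1f} hours")
```

At an assumed 50 MB/s, a 4 TB drive needs roughly 22 hours, which matches the "a day or more" order of magnitude once the array also has to serve normal traffic.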
Ideal use
RAID 5 is a good all-round system that combines efficient storage with excellent security and decent performance. It is ideal for file servers and application servers with a limited number of data disks.
RAID level 6 - Striping with double parity
RAID 6 is similar to RAID 5, but the parity data is written to two disks. This means it needs at least 4 drives and can withstand 2 drives failing at the same time. The probability of two drives failing at exactly the same moment is, of course, very small. However, if a drive in a RAID 5 system fails and is replaced with a new one, it takes hours or even more than a day to rebuild the replaced drive. If another drive fails during this time, you lose all your data. With RAID 6, the array survives even this second failure.
Advantages
As in RAID 5, data read operations are performed very quickly.
If two drives fail, you still have access to all of your data, even while the failed drives are being replaced. RAID 6 is therefore more secure than RAID 5.
Disadvantages
Data write operations are slower than with RAID 5 because of the additional parity data that must be calculated. Write performance can theoretically be 20% lower.
Disk failures affect throughput, although this is still acceptable.
This is a complex technology. Rebuilding an array in which a single disk has failed can take a long time.
Ideal use
RAID 6 is a good all-round system that combines efficient storage with excellent security and decent performance. It is preferable to RAID 5 on file servers and application servers that use many large disks to store data.
RAID level 10 - Combining RAID 1 and RAID 0
It is possible to combine the advantages (and disadvantages) of RAID 0 and RAID 1 in one system. This is a nested or hybrid RAID configuration. It provides security by mirroring all data on secondary disks, while striping across the mirrored sets of disks to speed up data transfers.
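One way to picture this nesting is as a stripe laid across mirrored pairs. The sketch below assumes an even number of at least 4 disks, with disks 0 and 1 forming the first pair, 2 and 3 the second, and so on; this pairing and the function names are assumptions for the example, since actual layouts vary between controllers.

```python
def raid10_locate(logical_block: int, num_disks: int) -> tuple[tuple[int, int], int]:
    """Return ((primary disk, mirror disk), block offset) for a logical block."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch assumes an even number of disks, at least 4")
    pairs = num_disks // 2
    pair = logical_block % pairs          # RAID 0 step: stripe across the pairs
    offset = logical_block // pairs
    return (2 * pair, 2 * pair + 1), offset   # RAID 1 step: both disks hold the block


def raid10_survives(failed_disks: set[int], num_disks: int) -> bool:
    """The array survives as long as no mirrored pair has lost both disks."""
    failed = set(failed_disks)
    return all(not {2 * p, 2 * p + 1} <= failed for p in range(num_disks // 2))


if __name__ == "__main__":
    print(raid10_locate(5, 4))
    print(raid10_survives({0, 2}, 4))   # one disk lost from each pair
    print(raid10_survives({0, 1}, 4))   # both disks of one pair lost
```

The second function shows a property the level numbers alone do not reveal: a RAID 10 array can survive several failures if they hit different pairs, but not two failures inside the same pair.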
Advantages
If one of the drives in a RAID 10 configuration fails, the rebuild is very fast because all that is required is to copy the data from the surviving mirror to the new drive. This can take as little as 30 minutes for 1 TB drives.
Disadvantages
Half of the storage capacity goes to mirroring, so compared to larger RAID 5 or RAID 6 arrays, this is an expensive way to provide redundancy.
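The capacity trade-off between the levels covered in this article can be put into numbers. The helper below is a small sketch with a made-up name; it assumes identical disks and treats RAID 1 as half of the total capacity, as described earlier.

```python
def usable_capacity_tb(level: int, num_disks: int, disk_size_tb: float) -> float:
    """Usable capacity of an array of identical disks for RAID 0, 1, 5, 6 or 10."""
    minimum_disks = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")
    if level in (1, 10) and num_disks % 2:
        raise ValueError("mirrored levels need an even number of disks here")

    if level == 0:
        data_disks = num_disks            # no redundancy
    elif level in (1, 10):
        data_disks = num_disks // 2       # half goes to mirroring
    elif level == 5:
        data_disks = num_disks - 1        # one disk's worth of parity
    else:
        data_disks = num_disks - 2        # RAID 6: two disks' worth of parity
    return data_disks * disk_size_tb


if __name__ == "__main__":
    for level in (0, 5, 6, 10):
        print(f"RAID {level}: {usable_capacity_tb(level, 8, 4.0)} TB usable")
```

With eight 4 TB disks this gives 28 TB for RAID 5, 24 TB for RAID 6 and 16 TB for RAID 10, which illustrates why mirroring becomes the expensive option as arrays grow.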
What about RAID levels 2, 3, 4 and 7?
These levels exist, but they are not common (RAID 3 is essentially like RAID 5, but the parity data is always written to the same disk). This article covers only the general classification of RAID systems and gives general information about the technology of combining drives.
RAID will not replace a backup!
All RAID levels except RAID 0 provide protection against a single drive failure. A RAID 6 system will continue to work even if 2 drives fail at the same time. To be completely safe, you still need to back up your data from the RAID system.
This backup will come in handy if all drives fail at the same time due to a power spike.
It also protects against theft of the storage system.
Backups can be stored outside the server room or data center, in another location. This can come in handy in the event of an emergency, large-scale system failure, fire, etc.
The most important reason for keeping multiple generations of backups is user error. If someone accidentally deletes important data and it goes unnoticed for hours, days or weeks, a good set of backups ensures that you can still restore those files.
How can we help?
Server Solutions sells Dell PowerEdge R760 and Dell PowerEdge R760xs servers throughout Ukraine; our customers include small, medium and large businesses. If you or your company need advice on, or want to purchase, high-quality server equipment, contact us.