A SQL Server DBA given choice to select his Disk configuration should know the different Raid configurations and should select them appropriately.
Common RAID Levels used are:
RAID 0 which is also called a strip set. It is Very fast but have no redundancy.
A RAID 0 array must have 2 or more disks. A RAID 0 array has no redundancy. As data is written to the array it is spread across the disks for maximum speed. This RAID level should never be used for a SQL Server. In the event of a failure of a single disk in the array all data is lost. To calculate the space created from a RAID 0 array multiple the number of drives times the size of the drives. (Example: 3 drives x 50GB = 150 GB)
RAID 1 which is also called a mirrored set. It is not very fast but has maximum redundancy.
A RAID 1 array can have only 2 disks in the array. A RAID 1 array has maximum redundancy as every byte of data is written to both disks at the same time. From a speed point of view a RAID 1 array is no faster or slower than a single disk. In the event of a single disk failure the second disk continues to serve the data to the server. The software does not know that a disk has failed. To calculate the space created for a RAID 1 array it is simply the size of one drive. (Example: 2 drives x 50GB = 50GB)
RAID 5 which is fast and good redundancy. It does a parity check on every write.
A RAID 5 array must have at least 3 disks in the array. RAID 5 writes are expensive in terms of disk operations and traffic between the disks and the controller. The parity blocks are not read on data reads, since this would add unnecessary overhead and would diminish performance.A RAID 5 array has a nice balance of speed and redundancy. A RAID 5 array is the most cost effective RAID level (the least cost effective are RAID 1 and RAID 0+1). A RAID 5 array can survive a single disk failure without affecting the data. RAID 5 arrays are best used for arrays that are read intensive. The reason for this is that for every byte written to the disk parity has to be determined. This does not have to be done when reading data from the disk. To calculate the space created for a RAID 4 array take the number of drives, subtract 1 and multiply by the size of the drives. (Example: 6 drives x 50 GB = 250GB)
RAID 0+1 which is also called a mirrored strip. It is very fast and maximum redundancy.(Most recommended for its speed and redundancy).A RAID 0+1 array must have at least 4 disks, and it must have an even number of disks in the array (a 2 disk RAID 0+1 array is simply a RAID 1 array). The basic idea of a RAID 0+1 array is that the RAID controller makes two RAID 0 arrays and mirrors them together. This gives your maximum speed and maximum redundancy. However this takes a lot of disks, and a therefore costs a lot of money. To calculate the space created for a RAID 0+1 array take the number of drives, divide by 2 and multiply by the size of the drives. (Example: 6 drives x 50GB = 150GB)
A RAID 5+1 array must have at least 6 disks, and it must have an even number of disks in the array. The basic idea of a RAID 5+1 array is that the RAID controller makes two RAID 5 arrays, and mirrors them together. This configuration gives you better redundancy than a RAID 0+1 without giving up to much performance. To calculate the space created for a RAID 5+1 array take the number of drives, divide by 2, subtract 1 and multiply by the size of the drives. (Example: 6 drives x 50GB = 100GB) A RAID 5+1 array will give you the same performance as a RAID 5 array but with increased redundancy. Only a few select RAID cards can handle this RAID level. For the most part this RAID level is used within a SAN environment.
Now coming to different files placed on Servers:
1. Operating System files are best places on Raid 1 (with hot spare if possible). This includes page file which is 1:1 Read Write ratio.
2. Binaries – I usually place the binaries on the OS drive. Mostly they are read from once (when the SQL Services are started) and not again. The only thing that will be written here are the error log and sql agent log files.
3. System databases – Each system database has different requirements. Overall a Raid 5 is best if going with least expensive option otherwise if budget permits Raid 0+1..
• master – The master database is mostly read, seldom write.
• msdb – The msdb database is about 50/50 read/write.
• model – The model database is ready only when a new database is created or the SQL Service is started. It should almost never be written to.
4. User databases data files – If your databases are mostly read, seldom write then a RAID 5 array is the best option. If your database written to often then a RAID 0+1 would be your best option.
5. User databases log files – Transaction Log files are very write intensive. They should be placed on a RAID 1 or RAID 0+1 array.
6. Tempdb is very write intensive. It should be placed on a RAID 1 or RAID 0+1 array. On most of the SQL Servers, the data file and log file for the tempdb database can be placed on the same array. But on SQL Servers with truly massive tempdb usage you may want to split the tempdb data file and log file onto their own RAID 1 or RAID 0+1 arrays for giving it separate spindle for data and log Files.)
Depending on the SQL Server size, traffic, criticality of data, performance required the above requirements change.
1. For a small Server with only a few users and a low number of TPS (transactions per second) putting the databases, transaction logs, and tempdb all on the same RAID 5 array is perfectly fine. The load on the system is low enough that the system speed won’t suffer by having them all on the same drives.
2. For an average sized database server with a few hundred users and > 100-150 TPS (transactions per second)then ideally the databases and logs should be placed on separate drives. If you can manage then tempdb should go on its own RAID 1 array.
3. For a large database server with thousands of users and some hundreds of TPS(transactions per second) should be specked/Built out VERY carefully. Few configurations which may be helpful in designing a Large database server are listed below:
• Any database file with high usage should be placed on its on Drive with Raid 0+1 configuration should be fine.
• To separate out the sequential writes with Random Writes, Data File and Log Files should be placed on their own drives.
• Tempdb should go on its own RAID 1 array with Data Files and Log Files on Separate Drives if possible.
• Moreover, you can breakup the database files and the logs creating additional file groups and moving objects into those file groups.
• If needed to improve the performance on indexes, we can place the non-clustered indexes into a separate database file on a separate drive.
• If needed to improve the performance on Full text data (the data within a text, ntext, or image data type field) can be place on a separate drive.
Any above Files with only few writes with lot of reads can go on Raid 5 otherwise a Raid 1 or Raid 1+0 are better choices with having Redundancy.
When working with SAN, we must know some of the terms like
• Rebuilding the Array – When you replace a disk in a RAID array the array has to copy the data to that disk. This is called rebuilding the array.
• RAID Cache – This is memory built onto the RAID controller and is used to speed up operations to and from the disks. Having Cache on the controller makes a huge difference.
• Hot Spare – A hot spare is a spare hard drive installed in your server. In the event of a disk failure the array will automatically rebuild to the hot spare. This means that your data is redundant in the fastest possible time. When you replace the failed drive the new drive now becomes the hot spare. (RAID 0 can not make use of a hot spare as it is not redundant)
• Parity – Parity is used to create the redundancy in RAID 5 arrays.