RAID-configured Serial Attached SCSI (SAS) hard drives form complex storage arrays used in enterprise servers, such as Dell PowerEdge® and HP ProLiant®. Compared to parallel SCSI (pronounced “scuzzy”) drives, SAS hard drives are faster, and a SAS setup can combine Serial ATA (SATA) and SAS drives in a single configuration. That is one reason servers with SAS hard drives are preferred in high-performance enterprise computing environments where high data availability and fast input/output rates are mission-critical.
However, the complexity of SAS storage servers means that data stored on RAID-configured SAS hard drives can face a higher risk of loss, or become inaccessible, if the storage disks fail or another issue leaves the server dead.
This blog post outlines data loss problems encountered with SAS hard drives and the ways to recover the data from a failed (or dead) server.
SAS Storage Server Failure: Common Data Loss Scenarios
The following are some data loss situations associated with the SAS hard drives of a failed server:
RAID server failure due to hard drive crash
This situation arises when one or more SAS hard drives in the RAID array fail, leading to a server crash. Notably, a problem in a single hard drive usually doesn’t crash the server; it only degrades RAID performance. However, delays in swapping the crashed hard drive for a healthy one can lead to subsequent failures, resulting in a dead server. Sometimes, the RAID controller may even fail to rebuild the array using the pre-configured hot spare due to glitches in the auto-replacement mechanism.
Data recovery from the SAS drives of a dead server is a challenging task. It requires advanced in-lab expertise to restore the crashed drive(s) to a functional state, rebuild the RAID, and recover data from the SAS hard drives.
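To see why a single failed drive degrades the array rather than destroying data, here is a minimal sketch of XOR parity. It assumes a parity-based level such as RAID 5 (the post does not name a specific level), and the four-drive stripe and the helper name xor_blocks are illustrative only.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 4-drive array: three data strips plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The drive holding data[1] fails: its strip is the XOR of the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

With two strips missing from the same stripe, the XOR no longer has enough information to recover either of them, which is why a delayed drive swap followed by a second failure is so damaging.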
[Suggested reading]: Stellar® recovers data from 8 TB SAS drives of a crashed Dell® storage server
RAID failure due to the foreign configuration of hard drives
Sometimes, the server may fail even after the timely replacement of the failed hard drive. In this situation, the RAID controller does not recognize the original RAID configuration and therefore cannot rebuild the array after the hard drive is replaced. Instead, the controller marks the replaced drives as “foreign configuration.” As a result, the server cannot detect the SAS hard drives and the data becomes inaccessible. If not fixed, this problem can lead to permanent data loss.
This scenario is highly complex, requiring experienced technicians who can diagnose the failed hard drives, restore them to an operational state, and determine the original RAID parameters. After a series of laboratory procedures, the RAID system is rebuilt based on parameters such as strip size, data flow pattern, and parity flow to allow data recovery from the SAS drives of a dead server.
[Suggested reading]: Stellar® recovers 34-TB archived data from failed Barracuda® Message Archiver
RAID server failure due to file system corruption
Another complicated data loss situation emerges from the corruption of file systems used in RAID arrays. Typically, the root cause of file system corruption is a crashed hard drive that degrades the RAID server performance. If not addressed promptly, this issue can result in a catastrophic failure through intermittent server reboot cycles and permanent RAID degradation.
File system corruption combined with physically crashed SAS hard drives makes for an extremely difficult data loss situation, which cannot be resolved without the aid of certified data recovery experts operating in a Class 100 Clean Room lab.
[Suggested reading]: Data recovery from NetApp® server in WAFL inconsistent state
Data Recovery from SAS Hard Drive of a Failed Server: Key Steps
Typically, the data recovery process for physically failed servers is “invasive,” i.e., it involves opening the hard drive casing, performing in-lab diagnosis, and replacing components such as the head assembly if they are found broken. The entire process is done strictly inside a Class 100 Clean Room laboratory through the following steps:
The first step is an initial inspection: the SAS hard drives constituting the RAID server are checked for functional status and failure markers such as a clicking or beeping sound. The suspected hard drives are segregated for in-lab diagnosis and recovery operations.
The next step is in-lab diagnosis: the hard drive casing is opened inside a Class 100 Clean Room and the drive’s components, such as the head assembly, arms, actuators, and servo controller, are examined. Typically, hard drive crashes happen due to the failure of mechanical components like the head assembly.
Head assembly replacement and disk cloning
This is a crucial step for data recovery from the SAS drives of a failed server. Advanced laboratory-based techniques are employed in this step to transplant new components, such as the head assembly, onto the physically crashed hard drive. Aside from the skills, techniques, and tools needed for head assembly replacement, another key challenge in this step is finding the appropriate replacement component.
The purpose of head assembly replacement is to restore the drive’s functioning to allow “disk cloning,” the process of replicating the hard drive’s data “as is” on a brand new hard drive. This clone hard drive is then operated upon using various data recovery techniques to recover the data. Working on the clone means recovery attempts do not put further stress on the repaired original, which could fail again at any time.
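The idea of cloning a drive “as is” can be sketched as a sector-by-sector copy that keeps going when a sector cannot be read. This is only an illustration under stated assumptions: the 512-byte sector size, the clone_disk helper, and the in-memory stand-ins for the drives are all hypothetical, and lab-grade imaging hardware does considerably more (retries, head maps, read timeouts).

```python
import io

SECTOR = 512  # assumed sector size in bytes

def clone_disk(src, dst, total_sectors, sector_size=SECTOR):
    """Copy src to dst sector by sector.

    Unreadable sectors are zero-filled on the clone and their numbers
    are returned so they can be retried or reported later.
    """
    bad = []
    for n in range(total_sectors):
        try:
            src.seek(n * sector_size)
            chunk = src.read(sector_size)
            if len(chunk) != sector_size:
                raise OSError("short read")
        except OSError:
            chunk = b"\x00" * sector_size
            bad.append(n)
        dst.seek(n * sector_size)
        dst.write(chunk)
    return bad

# In-memory stand-ins for the repaired source drive and the new clone drive.
source = io.BytesIO(b"\xab" * SECTOR * 4)
clone = io.BytesIO()
bad_sectors = clone_disk(source, clone, total_sectors=4)
assert clone.getvalue() == source.getvalue() and bad_sectors == []
```

Recording the bad sectors instead of aborting keeps the offsets of everything else intact on the clone, which matters later when the array layout is reconstructed.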
Determine the RAID’s original configuration
This step is technically challenging, as there is no straightforward method or mechanism to determine RAID parameters. Without knowing parameters like strip size, data flow pattern, parity flow, etc., it is difficult to rebuild the RAID, which hampers data recovery from SAS or any other RAID-configured drives of a dead server. Typically, RAID configuration is determined using proprietary techniques and highly specialized equipment.
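To make parameters like strip size and parity flow concrete, the sketch below shows what one candidate set of values implies for where data lives. It assumes a RAID 5 array with left-symmetric parity rotation, which is only one of several possible layouts; the function name locate_strip and the example values are hypothetical, and this is not the proprietary method mentioned above.

```python
def locate_strip(logical_strip, num_disks, strip_size):
    """Map a logical data strip to (disk index, byte offset on that disk)
    for RAID 5 with left-symmetric parity rotation (assumed layout)."""
    data_per_stripe = num_disks - 1
    stripe, k = divmod(logical_strip, data_per_stripe)
    # Parity starts on the last disk and moves one disk left per stripe.
    parity_disk = (num_disks - 1) - (stripe % num_disks)
    # Data strips start on the disk after the parity disk and wrap around.
    disk = (parity_disk + 1 + k) % num_disks
    return disk, stripe * strip_size

# Candidate parameters (assumed): 4 disks, 64 KiB strips.
for strip in range(6):
    print(strip, locate_strip(strip, num_disks=4, strip_size=64 * 1024))
```

A wrong guess for any one parameter sends reads to the wrong disk or offset, so the reassembled volume looks like scrambled data. That is why the original configuration has to be established before extraction can begin.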
Data recovery from SAS drive
Once the RAID configuration is known, the array is rebuilt from the cloned drives and a single clone of the rebuilt volume is created. A specialized data recovery tool is then used to scan this clone and extract the data.
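One common way a recovery tool can scan a clone is to search for known file headers (“signatures”). The sketch below illustrates that general idea only; it is not a description of any particular product, the scan_signatures helper is hypothetical, and real tools normally read file system metadata first and fall back on signatures when that metadata is damaged.

```python
# Well-known header bytes for a few file types.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
}

def scan_signatures(image: bytes):
    """Return sorted (offset, file type) pairs for every known header found."""
    hits = []
    for name, magic in SIGNATURES.items():
        pos = image.find(magic)
        while pos != -1:
            hits.append((pos, name))
            pos = image.find(magic, pos + 1)
    return sorted(hits)

# A tiny stand-in for a clone image: a PDF header, padding, a JPEG header.
sample = b"\x00" * 16 + b"%PDF-1.7" + b"\x00" * 8 + b"\xff\xd8\xff\xe0"
print(scan_signatures(sample))  # [(16, 'pdf'), (32, 'jpeg')]
```

A header hit marks only where a file may begin; finding where it ends, and whether it is fragmented, takes format-specific parsing.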
Best Method to Recover Data from SAS Hard Drive of a Dead Server
Considering the inherent difficulty of retrieving data from the hard drives of failed servers, the best way to rescue the data is to seek help from a reliable data recovery service expert. Be aware that you may get only one chance, so making a quick and well-informed decision can greatly improve the chances of success.
Stellar® brings 25+ years of data recovery expertise in all types of complex storage systems, like RAID, NAS, Cloud, Virtual Machine, etc. Get a free quote for your requirement now.