Self-Monitoring Analysis and Reporting Technology (SMART)
In an effort to help users avoid data loss, drive manufacturers are now incorporating logic into their drives that acts as an "early warning system" for pending drive problems. This system is called Self-Monitoring Analysis and Reporting Technology or SMART. The hard disk's integrated controller works with various sensors to monitor aspects of the drive's performance, determines from this information whether the drive is behaving normally, and makes status information available to software that probes the drive.
The fundamental principle behind SMART is that many problems with hard disks don't occur suddenly. They result from a slow degradation of various mechanical or electronic components. SMART evolved from a technology developed by IBM called Predictive Failure Analysis or PFA. PFA divides failures into two categories: those that can be predicted and those that cannot. Predictable failures occur slowly over time, and often provide clues to their gradual failing that can be detected. An example of such a predictable failure is spindle motor bearing burnout: this will often occur over a long time, and can be detected by paying attention to how long the drive takes to spin up or down, by monitoring the temperature of the bearings, or by keeping track of how much current the spindle motor uses. An example of an unpredictable failure would be the burnout of a chip on the hard disk's logic board: often, this will "just happen" one day. Clearly, these sorts of unpredictable failures cannot be planned for.
The drive manufacturer's reliability engineers analyze failed drives, along with the mechanical and electronic characteristics of the drive, to find correlations between predictable failures and the values and trends in those characteristics that suggest slow degradation. The exact characteristics monitored depend on the particular manufacturer and model. Commonly used ones include measurements like those mentioned above, such as spin-up time and temperature.
(Some of the quality and reliability features I am describing in this part of the site are in fact used to feed data into the SMART software.)
Using statistical analysis, the "acceptable" values of the various characteristics are programmed into the drive. If the measurement for any monitored attribute falls outside the acceptable range, or if the trend in a characteristic shows an unacceptable decline, an alert condition is written into the drive's SMART status register to warn that a problem with the drive may be occurring.
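To make the two alert conditions concrete, here is a minimal sketch in Python of a range check plus a simple trend check. It is only an illustration: the Attribute class, the alert_reasons function, the limits, and the idea of measuring the trend as the change between the first and last sample are all assumptions made for this example. Real drives do this in firmware with manufacturer-specific attributes and thresholds.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Attribute:
    """One monitored characteristic. All names and limits are hypothetical."""
    name: str
    low: float        # lowest acceptable measurement
    high: float       # highest acceptable measurement
    max_drift: float  # largest acceptable change across the sample window


def alert_reasons(attr: Attribute, samples: Sequence[float]) -> List[str]:
    """Return the reasons this attribute should raise an alert (empty if none)."""
    if not samples:
        return []  # nothing measured yet, so nothing to judge

    reasons = []
    latest = samples[-1]

    # Condition 1: the latest measurement is outside the acceptable range.
    if not (attr.low <= latest <= attr.high):
        reasons.append(f"{attr.name}: value {latest} is outside "
                       f"{attr.low} to {attr.high}")

    # Condition 2: the value has moved too far over the window, even if it
    # is still in range. This is a deliberately crude stand-in for a trend.
    drift = abs(samples[-1] - samples[0])
    if drift > attr.max_drift:
        reasons.append(f"{attr.name}: changed by {drift:.2f} over the window")

    return reasons


if __name__ == "__main__":
    spin_up = Attribute("spin-up time (s)", low=0.0, high=6.0, max_drift=1.5)
    for reason in alert_reasons(spin_up, [4.0, 5.0, 5.8]):
        print("ALERT:", reason)
```

In this example the latest spin-up time is still within the acceptable range, but it has risen far enough across the window that the trend check alone raises the alert. This is the kind of slow degradation SMART is meant to catch.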
SMART requires a hard disk that supports the feature and some sort of software to check the status of the drive. All major drive manufacturers now incorporate the SMART feature into their drives, and most newer PC systems and motherboards have BIOS routines that will check the SMART status of the drive. So do operating systems such as Windows 98. If your PC doesn't have built-in SMART support, some utility software (like Norton Utilities and similar packages) can be set up to check the SMART status of drives. This is an important point to remember: the hard disk doesn't generate SMART alerts by itself; it only makes status information available. That status data must be checked regularly for this feature to be of any value.
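Because the drive only exposes its status, something on the host has to keep asking for it. The sketch below shows that polling pattern in Python. It does not talk to real hardware: read_status is a hypothetical callable standing in for whatever the BIOS, operating system or utility uses to read the SMART status, and watch_smart_status, on_alert and the one-hour default interval are likewise assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def watch_smart_status(read_status: Callable[[], bool],
                       on_alert: Callable[[], None],
                       interval_seconds: float = 3600.0,
                       max_checks: Optional[int] = None,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll the drive status until an alert is seen or max_checks is reached.

    read_status is a hypothetical hook that returns True when the drive
    reports an alert condition. Returns the number of checks performed.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if read_status():
            on_alert()  # the host, not the drive, turns status into a warning
            break
        # Wait before the next check, unless this was the last one allowed.
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
    return checks


if __name__ == "__main__":
    # Simulated drive: healthy on the first two checks, alert on the third.
    readings = iter([False, False, True])
    watch_smart_status(lambda: next(readings),
                       lambda: print("SMART alert: back up your data now"),
                       interval_seconds=0.0,
                       max_checks=10)
```

If nothing ever calls the status check, the alert condition sits in the drive unnoticed, which is why regular checking matters.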
Clearly, SMART is a useful tool but not one that is foolproof: it can detect some sorts of problems, but others it has no clue about. A good analogy for this feature would be to consider it like the warning lights on the dashboard of your car: something to pay attention to, but not to rely upon. You should not assume that because SMART generated an alert, there is definitely a drive problem, or conversely, that the lack of an alarm means the drive cannot possibly be having a problem. It certainly is no replacement for proper hard disk care and maintenance, or routine and current backups.
If you get a SMART alert while using your drive, you should immediately stop using it and contact your drive manufacturer's technical support department for instructions. Some companies consider a SMART alert sufficient evidence that the drive is bad, and will immediately issue an RMA for its replacement; others require additional steps, such as running diagnostic software on the drive. In no event should you ignore the alert. Sometimes I see people asking others how they can "turn off those annoying SMART messages" on their PCs. Doing that is, well, like putting electrical tape over your car's oil pressure light so it won't bother you while you're driving! :^)