HEXUS.net - Definitive Technology News and Reviews

Manufacturers' MTTF Underestimates HDD Failure Rates

Storage

Published: Sunday 4th March, 2007 | Author: Navin Maini


A study conducted by Carnegie Mellon University concludes that hard disk drive failure rates are up to fifteen times higher than indicated by the MTTF information provided by manufacturers.

Using a sample of roughly 100,000 drives, the study, which was presented last month at the 5th USENIX Conference on File and Storage Technologies in San Jose, also found that SATA drives were no less reliable than the faster and more expensive Fibre Channel varieties.

Computerworld reports that a further study presented at the same conference, which used drive samples from Google-run data centres, found that drive temperature appeared to have little effect on drive reliability.

The Carnegie Mellon University study notes that manufacturer data sheets for the drives sampled indicated MTTF (Mean Time To Failure) values of between 1 and 1.5 million hours, which implies a worst-case annual failure rate of some 0.88%. The study found, however, that the drives sampled, which came from large production systems, Internet service sites and so on, had an annual failure rate of somewhere between 2 and 4%. Some systems delivered failure rates of an astounding 13%.
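The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration, assuming the simple approximation that annual failure rate equals the hours in a year divided by the MTTF; the function name and constant are hypothetical and do not come from the study.

```python
# Assumption: annual failure rate ~= hours in a year / MTTF (simple linear estimate).
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def annual_failure_rate(mttf_hours: float) -> float:
    """Approximate annual failure rate (as a fraction) implied by an MTTF in hours."""
    return HOURS_PER_YEAR / mttf_hours


# Data-sheet MTTF range quoted in the article.
for mttf in (1_000_000, 1_500_000):
    print(f"MTTF {mttf:>9,} h -> {annual_failure_rate(mttf):.2%} per year")

# Compare the observed rates quoted in the article with the data-sheet worst case.
worst_case = annual_failure_rate(1_000_000)
for observed in (0.02, 0.04, 0.13):
    print(f"observed {observed:.0%} is about {observed / worst_case:.1f}x the worst case")
```

Under that assumption, an MTTF of 1 million hours works out to roughly 0.88% a year, and the 13% seen on some systems is roughly fifteen times that figure.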

Commenting on the study, co-author Garth Gibson, an associate professor of computer science, stressed that its goal was to help manufacturers improve not only the design of drives but also the testing processes used. He made clear that he had no vendor-specific material, and that the study did not necessarily track actual drive failures but rather customer-diagnosed failures, where the customer felt the drive in question required replacement. Lastly, he stated that helping users to distinguish between the best and worst vendors was not a goal of the study.

Mr. Gibson went on to echo an opinion held by analysts and vendors: that perhaps as many as 50% of drives returned by customers have no fault, and that failures in general can occur for a multitude of reasons, ranging from the extreme environments to which a drive may be subjected to random read-write or otherwise intensive workloads that simply cause premature wear and tear to the drive's mechanical components.

Several of the drive vendors asked to comment on the study declined. A spokesperson from California-based Seagate responded via e-mail that 'The conditions that surround true drive failures are complicated and require a detailed failure analysis to determine what the failure mechanisms were'. Echoing Mr. Gibson's point above, the spokesperson went on to state that 'It is important to not only understand the kind of drive being used, but the system or environment in which it was placed and its workload'.

Not everyone, however, was surprised by the study's results. Ashish Nadkarni, a principal consultant at Massachusetts-based storage services provider GlassHouse Technologies Inc., expressed no surprise at the replacement rates quoted, citing the distinct differences between the environment in which drive vendors test drives and the dust, noise and vibration that may be present in a data centre.

Mr. Nadkarni elaborated by describing how, in his view, the overall quality of drives has been falling over time owing to price competition within the industry. He suggested that customers track their disk drive records and press vendors to review their internal testing procedures.

HEXUS.community :: your right2reply

Not surprising really. Working in a computer repair shop we regularly see drives failing after 3/4 years....sooner in laptops. But on the other hand we sell machines with 98 on still going strong...

BUT the best way to keep your drive alive IMO is to make sure you've got enough ram - drives die much much earlier if they're constantly acting as your virtual mem. This can be shown with the equation :

256mb + xp + Norton = dead hard drive

There was a particular batch of emachines from 2002/3 that shipped with xp and 128mb !! Those poor suckers died after 18 months.

oh, and i would avoid maxtor :D those suckers drop like mayflies


f

Something tells me the MTBF testing method involves testing in an airconditioned room with the drive powered on but only writing or reading once a day with a special casing to dampen its vibrations. They are probably also read bedtime stories and fed only the finest conditioned power.

Quote: arthurleung
Operating temperature having no effect is quite interesting, does that mean I can run my hard drives at 55'C and they will not age faster than at 40'C?

No, no - operating temperature has a limited effect, but within reason.

And frankly, if you're even getting a hard drive failure a year, you've got some serious problems. Or you keep buying Maxtors.

Even all 3 of my Western Digitals from 2001 are still working now, and until around September, they were used almost daily (first at home, and now in a RAID0 [edit: well 2 of them of course] array in my machine at work, on 24/7 [edit: the other was all-but-retired sometime in 2005, but I still use it externally sometimes for dumping temp files]).


Quote: funnelhead
BUT the best way to keep your drive alive IMO is to make sure you've got enough ram - drives die much much earlier if they're constantly acting as your virtual mem. This can be shown with the equation :

256mb + xp + Norton = dead hard drive

:D :D

You could even take out the "256mb" and "XP" and that would be an equally valid equation.
Or

Maxtor + oem = rma

XP + Norton / 256 > 1 (dead Hard Disk)

XP + ipod / itunes = 0 (optical drives, once their gear drivers b0rk ur optical drivers)

PC + Boy * [X] Mbps internet * [y] hours home alone = X^3 (Gb of pr0n) + Y (hours of tendonitis)

Sony Vaio * out-of-warranty-fault / your credit rating = -300 (pounds from your bank account)

Sony + lithium Ion = BANG

Quote: funnelhead
oh, and i would avoid maxtor :D those suckers drop like mayflies


f

I'll second that, 3 in 3 months, one of which is a warranty replacement :rolleyes:
