Review of HDD failure statistics


RichF

Backblaze publishes data on which of their drives fail and how long they last before failing. This is an excellent resource if you are in the market for an HDD.

I also recommend Backblaze for cloud storage. Unlimited storage for $70/year.

 
I believe Backblaze provides the most impressive set of data on HDD reliability, but remember that they run their drives in a controlled environment, with little variation in temperature and pressure, and the drives are not switched off and on. So if you run your HDD in a closet at home where the temperature varies and you switch off the equipment from time to time, brand reliability might look a little different (but not much).
Backblaze's price is sharp compared to AWS or Azure.
 
Thanks, Rich!

This is a good resource. Backblaze has enough volume to use statistical data in a meaningful way.

It also points out:
  • All drives fail
  • Sometimes specific models are problematic compared to others
  • It's a balance of low failure rate and a reasonable price. They could buy drives that are less likely to fail but they cost a lot more.
  • You have to really watch model numbers - many of these models are not readily available for consumers.
I used this data to avoid 3 TB drives a number of years ago. I also use it as confirmation that 8 TB drives generally work pretty well, so I have 6 of them.
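For anyone comparing models from the published numbers, here is a minimal sketch of the annualized failure rate (AFR) arithmetic: failures divided by drive-years of operation, as a percentage. The model names and counts below are made up for illustration and are not taken from any Backblaze report.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR in percent: failures per drive-year of operation."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return 100.0 * failures / drive_years


# Hypothetical fleet: model -> (failures, total drive-days in service).
fleet = {
    "MODEL-A 8TB": (12, 1_095_000),
    "MODEL-B 3TB": (90, 365_000),
}

for model, (failures, drive_days) in fleet.items():
    afr = annualized_failure_rate(failures, drive_days)
    print(f"{model}: {afr:.2f}% AFR")
```

Using drive-days rather than drive counts matters because a model that has only been in service for a few months has had less time to fail than one that has run for years.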

The biggest problem with drives is on start up, but Backblaze also measures run time to get an idea of run time to failure and the risk associated with drives that have not failed yet. As a consumer of hard drives, I go 1-4 weeks without turning on my external drives - or longer if it is an offsite drive. Yes - I face the issue of startup risk, but I don't have uncontrolled power outages and restarts, overheating, or constant use that will wear out a drive. I'm not sure what the right answer is, but I suspect there is value in considering constant operation vs. intermittent operation.
 
Hi Eric

A couple of things (you may already know). I run a certification cycle on all new drives: I test them, HARD, for at least 24 hours before I put them into use. I have heard of, but never seen, a drive failing early in its life.
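As a rough illustration of what one pass of such a test can look like, here is a minimal write-and-verify sketch. It is not the tool described above, just an assumption about the general idea: write random data, read it back, compare checksums. The function name and file path are hypothetical. Note that on a small file the read may be served from the operating system's cache rather than the disk, so a real test would write far more data than fits in RAM and repeat for hours.

```python
import hashlib
import os
import tempfile


def write_verify_pass(path: str, total_bytes: int, block_size: int = 1 << 20) -> bool:
    """Write random data to `path`, read it back, and compare SHA-256 digests."""
    written = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            block = os.urandom(min(block_size, remaining))
            f.write(block)
            written.update(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            read_back.update(block)
    return written.digest() == read_back.digest()


if __name__ == "__main__":
    # Small demo in a temp directory; point `target` at the new drive for real use.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "burnin.bin")
        ok = write_verify_pass(target, 8 * 1024 * 1024)
        print("pass OK" if ok else "MISMATCH")
```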

I never have problems with my backup drives. One set is in the house, which I connect every week or two. The other set is in the vault. During the peak of the Covid pandemic this set was not connected for months at a time (I avoided going into the bank). Now I update them every month or two, or after every major trip. So far no problem.

I have only had one disk go bad, and SoftRAID gave me an early heads up, so I was able to replace it under warranty without losing any data.
 
There's a version of Gresham's Law that applies to computer hardware. The Law states that 'bad money drives out good money', the corollary being that bad hardware drives out good hardware. For example, things like highly reliable ECC memory are driven out by cheap commodity memory, and much the same is true of disk drives. The current crop of 'enterprise grade' disk drives are far better than the commodity drives, but aren't as reliable as they might be (the HDD market collapsed in 2001 and we have many fewer vendors). I used to work on big servers and file systems.

A residue of that time is that I rarely turn my RAID arrays and NAS servers off, unless I'll be gone more than 4-5 days. Start up shock kills electronics faster than anything besides heat.
 
Most suppliers of HDDs (and other electronics) provide a warranty for their products, typically 3 or 5 years for an HDD. Of course the warranty does not cover your data; it only replaces the drive when it fails. So the warranty term is an indication of how long the supplier expects the majority of the delivered units to last in normal use. For some units you can get information on how the supplier expects the device to be operated to reach its expected lifetime; it may be that they don't expect the device to be power cycled often. This applies to server and NAS drives, so don't put them in a desktop that is powered off when not in use.

Over time an HDD wears out; this applies to the actuator, the spindle motor and the capacitors. Failure in any of those elements leads to a failed drive that is most unlikely to be repaired.
Some years ago the saying was that an HDD supplier did not expect to replace more than 1-2% of all the drives produced - they would go bankrupt if the rate increased a lot. But HDD buyers tend to prioritize a long warranty, so producers need to balance the warranty term against the replacement rate. So my thesis is that when you run a device beyond the warranty term you face 2 issues. First, you will pay for the replacement yourself, but an HDD more than 5 years old might be too small for your current needs anyhow. Second - and worse - the failure rate is known to increase over time, and definitely after the end of the warranty term.

So at some point in time you should consider replacing old drives, but when? It is definitely hard to define the optimum time for replacement.

My current desktop actually runs an 8-9 year old 1 TB drive, and it runs fine (touch wood). The data on it is not the most frequently used, but it's nice to spread IO over more drives, and it is included in the backup schedule - going to a newer drive in the NAS and from there to the cloud.

On my NAS I recently replaced a 2 TB drive that was only 7 years old. It had started to report errors - mostly bad blocks - so it was replaced with a newer, faster and bigger drive. I recommend that you keep an eye on your logs, because installing upgrades is fun, restores are boring, and data loss is horrible.
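If you want to automate the log watching, here is a minimal sketch that scans a SMART attribute table for non-zero bad-block counters. The sample text is invented for illustration, laid out the way `smartctl -A` from smartmontools commonly prints attributes for ATA drives; real output varies by drive and tool version, and the list of watched attributes is my own assumption about which counters relate to bad blocks.

```python
# Invented sample in the common `smartctl -A` table layout (not real drive output).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   040   040   000    Old_age   Always       -       61320
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

# Counters that usually indicate bad blocks when their raw value is above zero.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def bad_block_warnings(report: str) -> dict[str, int]:
    """Return watched attributes whose raw value is greater than zero."""
    warnings = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the name is column 2, the raw value column 10.
        if len(fields) >= 10 and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) > 0:
                warnings[fields[1]] = int(raw)
    return warnings


print(bad_block_warnings(SAMPLE))
```

With the sample above, only the reallocated sector counter is flagged. A growing count over successive checks is a stronger signal than a single small non-zero value.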

When reading the Backblaze statistics you can see that some drives have very low error rates even after more than 6 years of operation. So maybe we shouldn't worry too much about old HDDs, but of course new ones provide more space, less noise, less heat, lower power consumption, and faster access, so you can easily find a reason to replace :cool:
 