Truth In Hard Drive Space Marketing
One question I get asked from time to time by, well, pretty much anyone I've ever worked for or with, is why their hard drive was marketed at one size yet the available size is noticeably smaller. For example, they may buy a hard drive that's marketed at 320 Gigabytes, but when they get it installed there are only 286 Gigabytes of space available. On the surface this can be pretty frustrating in a "Bait and Switch" kind of way: you buy one thing and get another, lesser, thing. But like most things about computers, the reality is actually pretty rational, if not simple.
Traditionally, when confronted with the question I had always just gone with the tried and true line that I cribbed from Tom's Hardware:
Hard Disk Drive (HDD) makers define "GB" or "Gigabyte" as 1,000,000,000 bytes. Microsoft (in Windows) defines it as 1,024 x 1,024 x 1,024 bytes, or 1,073,741,824 bytes.
Thanks to Stack Overflow I now know that the real, thorough, answer is:
There are 3 reasons why the amount of space you can actually use is different from that listed for the drive, all of which work against you:
- Hard drive manufacturers treat 1GB as one billion bytes, while the operating system calls it 1,073,741,824 bytes (1000 * 1000 * 1000 vs 1024 * 1024 * 1024).
- You lose some space for file tables when formatting.
- Disk space is divided into chunks larger than 1 byte (typically 4K). Using typical Windows defaults, a 1 byte file takes up 4K of space on disk.
Of these, the first two can influence the amount of space reported by the drive (though IIRC the 2nd one was more of an issue with FAT32 than NTFS). The last one only influences the amount of free space remaining, but will still prevent you from using the full capacity of your 80GB drive.
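The third point in that answer, space being handed out in whole chunks, is easy to see with a little arithmetic. What follows is a minimal sketch, assuming the 4K allocation unit mentioned in the quote; the function name `size_on_disk` is my own, and real file systems have exceptions (NTFS, for instance, can keep very small files inside its file table), so treat it as an illustration rather than a description of any particular file system.

```python
CLUSTER_SIZE = 4096  # bytes; assumed 4K allocation unit, as in the quote above


def size_on_disk(file_size: int, cluster_size: int = CLUSTER_SIZE) -> int:
    """Return the bytes a file occupies when space is handed out in whole clusters."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    # Integer ceiling division: round up to the next whole cluster.
    clusters = (file_size + cluster_size - 1) // cluster_size
    return clusters * cluster_size


if __name__ == "__main__":
    for size in (1, 4096, 4097, 10_000):
        used = size_on_disk(size)
        print(f"{size:>6} byte file -> {used:>6} bytes on disk ({used - size} wasted)")
```

The wasted bytes per file are small, but they add up across thousands of small files, which is why this factor eats into free space rather than into the capacity the drive reports.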
It's almost reminiscent of the Mars Climate Orbiter debacle about metric vs imperial measurements; HDD makers count in decimal while the operating system counts in binary. Anywho, this means that if you have a 320 Gigabyte HDD, Windows will report roughly 298 Gigabytes from the unit difference alone, with formatting and other overhead accounting for anything else that's missing. Pretty straightforward really, and aside from having to dodge the occasional rant about how illogical that is and how come I just don't know more about the ins and outs of HDD business practices, it's not really been that big of a concern for me professionally. Well, now that my clients' needs for storage are increasing it's, in a sorta kinda way, becoming a bit of an annoyance.
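For anyone who wants to check the unit math themselves, here is a small sketch of the decimal-to-binary conversion. The names `DECIMAL_GB`, `BINARY_GB` and `reported_gb` are mine, and the sketch only covers the unit difference; it deliberately ignores formatting overhead, recovery partitions and the like. For a 320 GB drive the unit difference alone accounts for a drop to roughly 298, so any further shortfall has to come from those other factors.

```python
DECIMAL_GB = 1000 ** 3  # what the drive box means by "GB"
BINARY_GB = 1024 ** 3   # what Windows means by "GB" (formally a gibibyte, GiB)


def reported_gb(marketed_gb: float) -> float:
    """Convert a marketed (decimal) gigabyte figure to the binary figure the OS shows."""
    total_bytes = marketed_gb * DECIMAL_GB
    return total_bytes / BINARY_GB


if __name__ == "__main__":
    for marketed in (80, 320, 2000):
        print(f"{marketed} GB on the box -> {reported_gb(marketed):.1f} GB in Windows")
```

The ratio is the same at every size (about 93.1% at the gigabyte level), but because each prefix step multiplies in another 1.024, the gap grows to about 9% once you are talking terabytes.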
A recent project I had was to design a system with an obscene amount of storage space; the client is a civil engineering firm, so they deal in those HUGE CAD files for city planning and architectural plans, and they want to do version control on them. So lots and lots of space was needed, and while investigating a possible solution that involved a dedicated RAID array it dawned on me that the specs were a little more complicated than they ought to be. Math was going to be required above and beyond the normal calculations you have to do when setting up a RAID array. I mean, God forbid I say they'll have one amount of space when the reality is they'll get a lesser amount; I don't need that kind of attention.
On the surface, setting up a RAID 5 device with eight 2 terabyte HDDs would yield a total of 16 terabytes before the RAID and, according to the RAID calculator, around 13 terabytes of usable space after the RAID setup. The reality, though, is that each 2 terabyte HDD is only going to yield about 1.8 terabytes, and once the RAID is configured the total size is going to be 11.7 terabytes. A little over a full terabyte is missing; a nontrivial amount, to be sure.
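The same sanity check can be scripted for the array. This is an idealized sketch, assuming textbook RAID 5 (one drive's worth of capacity goes to parity) and nothing else; the function name `raid5_usable_bytes` is mine. Under those assumptions eight 2 terabyte drives give 14 terabytes in decimal units and about 12.73 in binary units, a gap of roughly 1.27 terabytes. Real controllers, file systems and calculators take out further overhead, which may be why figures like the ones quoted above come out lower.

```python
TB = 1000 ** 4   # decimal terabyte, as marketed
TIB = 1024 ** 4  # binary terabyte, as most operating systems count it


def raid5_usable_bytes(drive_count: int, drive_bytes: int) -> int:
    """Idealized RAID 5 capacity: one drive's worth of space goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_bytes


if __name__ == "__main__":
    drives, size = 8, 2 * TB
    usable = raid5_usable_bytes(drives, size)
    print(f"raw:      {drives * size / TB:.2f} TB (decimal)")
    print(f"usable:   {usable / TB:.2f} TB (decimal)")
    print(f"usable:   {usable / TIB:.2f} TB (binary)")
    print(f"per disk: {size / TIB:.2f} TB (binary)")
```

Whatever the exact overhead turns out to be, quoting the binary figure to a client is the safer promise, since that is the number their operating system will show them.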
So it's a knowable problem and it just requires a little math, but the more I think about it the more I'm starting to empathize with the reactions of previous clients. Put simply: wtf man? Why don't the HDD makers put on the box the amount of space most people will experience, at least as subtext? For example, putting both the decimal and the binary amounts side by side, or maybe listing the amount of space available for the big 3 operating systems (Windows, Mac and *nix), would be way more helpful and truthful than just listing the decimal amount and leaving customers feeling ripped off.