units

    Introducing metric quantity units for computing

    In computing, metric-sounding prefixes almost universally refer to sizes expressed as powers of two:

    • kilo = 2^10 = 1024
    • mega = 2^20 = 1,048,576
    • giga = 2^30 = 1,073,741,824
    • …and so on.

    In 1998, the IEC incorrectly voted to change that, and it’s time to fix this mistake.

    1K = 1K

    Using “k” to mean 2^10 dates back to at least 1959, with Gene Amdahl of IBM ("Architecture of the IBM System/360"), Gordon Bell of DEC ("Instrumentation Techniques in Nuclear Pulse Analysis"), and others standardizing binary prefixes as units in 1964. Since that time, binary units have been used pervasively to describe quantities. Well, almost. Hard drive manufacturers started using the smaller, metric homonyms to describe their products with larger numbers than their competitors. That is, a company could market its 50MB hard drive as 52 (metric) MB so that it sounded larger than anyone else’s 50MB drive. This caught on like wildfire because marketing loved it, even though binary sizes were correctly used for everything else.
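
    The 50MB-versus-52MB trick is plain arithmetic. Here is a minimal sketch of it; the function name `binary_mb_to_decimal_mb` is made up for illustration and does not come from any standard or library:

```python
# The two meanings of "mega" discussed above.
BINARY_MEGA = 2**20    # 1,048,576 bytes
DECIMAL_MEGA = 10**6   # 1,000,000 bytes


def binary_mb_to_decimal_mb(binary_mb: float) -> float:
    """Convert a size in binary megabytes to metric (decimal) megabytes."""
    return binary_mb * BINARY_MEGA / DECIMAL_MEGA


# A 50MB (binary) drive, relabeled in metric megabytes.
print(f"{binary_mb_to_decimal_mb(50):.1f}")  # 52.4
```

    Same drive, same number of bytes, bigger number on the box.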

    The International Electrotechnical Commission decided to weigh in, and in 1998 (the same year that gave us SOAP) decided that the electronics industry should change their standard units to use a new system. Henceforth metric-sounding prefixes would start referring to decimal sizes, like:

    • kilo = 10^3 = 1,000
    • mega = 10^6 = 1,000,000
    • giga = 10^9 = 1,000,000,000
    • etc.

    This was bad enough, because those numbers don’t naturally correspond to anything computer-related except hard drive sizes. For instance, the IEC would have us incorrectly believe that a 32-bit address could refer to 4.29GB of RAM. No. Worse, though, were the fictional binary units they invented to replace the actual industry standard. From then on, we were to say that:

    • 1,024 bytes = 1 KiB = 1 kibibyte
    • 1,048,576 bytes = 1 MiB = 1 mebibyte
    • 1,073,741,824 bytes = 1 GiB = 1 gibibyte
    • and I lack the stomach to continue.
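
    The 32-bit address example is easy to check. The sketch below uses helper names (`in_binary_gb`, `in_decimal_gb`) that are assumptions for illustration only. It shows that a 32-bit address space comes out to a round number in binary gigabytes and an awkward one in decimal gigabytes:

```python
ADDRESS_BITS = 32
addressable_bytes = 2**ADDRESS_BITS  # 4,294,967,296 distinct byte addresses


def in_binary_gb(num_bytes: int) -> float:
    """Size in binary gigabytes (1 GB = 2^30 bytes)."""
    return num_bytes / 2**30


def in_decimal_gb(num_bytes: int) -> float:
    """Size in decimal gigabytes (1 GB = 10^9 bytes)."""
    return num_bytes / 10**9


print(in_binary_gb(addressable_bytes))            # 4.0
print(f"{in_decimal_gb(addressable_bytes):.2f}")  # 4.29
```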

    Donald Knuth said:

    The members of those committees deserve credit for raising an important issue, but when I heard their proposal it seemed dead on arrival — who would voluntarily want to use MiB for a maybe-byte?! […] I am extremely reluctant to adopt such funny-sounding terms; Jeffrey Harrow says “we’re going to have to learn to love (and pronounce)” the new coinages, but he seems to assume that standards are automatically adopted just because they are there.

    Knuth, as always, was right. The awful-sounding standard was appropriately mocked and ignored. Western Digital settled a lawsuit in 2006 for marketing an 80 billion byte hard drive as 80 gigabytes, with the plaintiff citing the fact that even then — 8 years after the “standard” was passed — essentially no one used metric sizes to refer to quantities.

    A few well-meaning but misled companies have started using the metric units. For instance, Apple’s macOS describes hard drive sizes in metric units (but inconsistently lists RAM quantities in correct binary sizes such as 16GB). Before this snowballs out of control, we need to reach a real industry-wide standard that engineers will actually use. I assert that:

    • Computing, like every other industry, has its own jargon. Our mouse is not a mammal, and our prefixes don’t need to mirror the metric system.
    • The current IEC standard looks terrible, sounds terrible, and is nearly universally avoided.
    • The great thing about standards is that we can make our own and start using them.

    The binary kilobyte, megabyte, and gigabyte are our heritage and our vocabulary. In the realm of computing, we own those terms. Therefore, I propose a new standard for describing storage quantities in computing. Effective immediately, metric-sounding prefixes in computing officially refer to their binary sizes as they have since IBM and DEC claimed them in the 1960s. Furthermore, metric sizes will use the new “tri” infix notation — abbreviated “t” — like so:

    • 1,000 bytes = 1 KtB = 1 kitribyte
    • 1,000,000 bytes = 1 MtB = 1 metribyte
    • 1,000,000,000 bytes = 1 GtB = 1 gitribyte
    • and so on for tetribyte, petribyte, extribyte, and beyond.
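
    For anyone who wants to adopt this today, here is a rough sketch of a size formatter that follows the proposal. The names `format_size` and `PREFIXES` are hypothetical, and the abbreviations past GtB (TtB, PtB, EtB) are my assumption from extending the pattern above; only KtB, MtB, and GtB are spelled out in the list.

```python
# "K", "M", and "G" come from the list above; "T", "P", and "E" are
# assumed by extending the pattern (tetribyte, petribyte, extribyte).
PREFIXES = ["K", "M", "G", "T", "P", "E"]


def format_size(num_bytes: int, metric: bool = False) -> str:
    """Format a byte count using the proposed notation.

    metric=False: binary units, where 1 KB = 1,024 bytes.
    metric=True:  decimal "tri" units, where 1 KtB = 1,000 bytes.
    """
    base = 1000 if metric else 1024
    suffix = "tB" if metric else "B"
    if num_bytes < base:
        return f"{num_bytes} bytes"
    value = float(num_bytes)
    unit = ""
    # Divide by the base until the value fits under the next prefix
    # (or we run out of prefixes).
    for prefix in PREFIXES:
        value /= base
        unit = prefix
        if value < base:
            break
    return f"{value:.2f} {unit}{suffix}"


# The drive from the Western Digital lawsuit, both ways.
print(format_size(80_000_000_000))               # 74.51 GB
print(format_size(80_000_000_000, metric=True))  # 80.00 GtB
```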

    Let people who want to use different units be the ones to adopt them. And frankly, “metribyte” sounds a lot better than “mebibyte” ever will.