Kibioctets

The Sensible Way Of Measuring Binary Data

A proposal by Lyle Zapato

The Problem

In 1884, the Lord Kelvin had the following to say about inadequate measurement systems:

"You, in this country, are subjected to the British insularity in weights and measures; you use the foot, inch and yard. I am obliged to use that system, but must apologize to you for doing so, because it is so inconvenient, and I hope Americans will do everything in their power to introduce the French metrical system. ... I look upon our English system as a wickedly, brain-destroying system of bondage under which we suffer. The reason why we continue to use it, is the imaginary difficulty of making a change, and nothing else; but I do not think in America that any such difficulty should stand in the way of adopting so splendidly useful a reform." [Source]

120 years later, America (and, sadly, much of Cascadia) still hasn't heeded His words. To the contrary, we have shackled ourselves with an additional, modern form of measuremental bondage that is even more brain-destroying than anything the most wicked Brit could have devised -- one that even perverts the system that Kelvin advocated. I am speaking of the units we use to measure data on computers: the bytes, kilobytes, megabytes, etc. that we have all become so familiar with, and yet can be so confused by.

Let's start with the basic units. The bit, in case you didn't know, is the smallest unit of information in a binary system. There are no fractional bits, and the term is unambiguous. This is an acceptable unit. The byte is a little more convoluted. In present-day usage, 1 byte = 8 bits. However, the term originally referred to the number of bits needed to encode a character. Consequently, there were computer systems where bytes were different numbers of bits. This system-specific functional term only later became a general unit of information when the 8-bit character size became a standard, resulting in one term having two incompatible meanings (albeit one now considered obsolete) in the same field.

But the real confusion comes when bit and byte are used together. The abbreviation or symbol for byte is uppercase B, whereas the symbol for bit is lowercase b. In theory this seems simple and even elegant, but in practice people often use B/b indiscriminately, usually out of ignorance of the difference (not to mention problems caused by caps lock scofflaws and e. e. cummings wannabes). Oddly, the original "bite" was given a "y" so that it wouldn't be misspelled "bit," but this rather obvious abbreviation problem was overlooked (even more odd considering the widespread use at the time of text-based computer systems with no lowercase letters).

The next level of trouble comes from the prefixes used with these two terms. How many bytes are in a kilobyte? The answer depends on whom you ask. Computer science people would say 1 kilobyte = 1,024 bytes (1,024 is a round number in binary notation: 10000000000). But this ignores the accepted meanings of the SI prefixes (kilo = 1,000, mega = 1,000,000, etc.), which means proponents of the Metric System correctly reject this usage as improper.

Ambiguous Meaning Of Prefixes
Unit	CS Use	SI Use
kilobyte	1,024 bytes	1,000 bytes
megabyte	1,048,576 bytes	1,000,000 bytes
gigabyte	1,073,741,824 bytes	1,000,000,000 bytes

Now, if this were just people in an unrelated field being persnickety, then maybe it wouldn't really be a practical problem for computer users. However, the proper Metric meanings of the prefixes are used by some in the computing industry, although often out of ulterior motives. For instance, if a hard drive manufacturer says a drive has 10 gigabytes, then it actually has 10,000,000,000 bytes, not the 10,737,418,240 bytes that your operating system would count as 10 gigabytes -- you may be getting less than you think you're getting. The result of this mix of proper and improper use of Metric prefixes is ambiguity and the potential for errors (or dishonest pricing).
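To put a number on that shortfall, here is a minimal Python sketch of the 10-gigabyte example. The function name `advertised_vs_binary` is made up for illustration; the only assumption is that the manufacturer uses the SI meaning (1,000³) while the operating system uses the CS meaning (1,024³).

```python
def advertised_vs_binary(gigabytes):
    """Compare the SI and CS readings of the same 'gigabyte' figure."""
    si_bytes = gigabytes * 1000**3      # SI reading (assumed: the manufacturer's)
    binary_bytes = gigabytes * 1024**3  # CS reading (assumed: the operating system's)
    shortfall = binary_bytes - si_bytes
    percent = 100 * shortfall / binary_bytes
    return si_bytes, binary_bytes, shortfall, percent


si, binary, shortfall, percent = advertised_vs_binary(10)
print(f"{si:,} vs {binary:,} bytes: short by {shortfall:,} ({percent:.1f}%)")
```

For the 10-gigabyte drive, the gap works out to 737,418,240 bytes, or a little under 7% of what the CS reading would lead you to expect.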

When we combine the two problems above, things get even worse:

Compounded Confusion
Symbol Used	Possible Meanings	Symbol Used	Possible Meanings
kb or KB	1,000 bits	kB	8,000 bits
	1,024 bits		8,192 bits
	8,000 bits		Person is l33t?
	8,192 bits

There's also the possibility that various mixes of these different interpretations could all end up being fed into a single calculation, resulting in errors even greater than above and potentially leading to disaster (much like when the mixing of metric and imperial measurements resulted in NASA's Mars Climate Orbiter being lost in space).
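The four readings of "kb" are simply every combination of two prefix meanings and two unit meanings. A rough Python sketch of that enumeration follows; the dictionary and function names are illustrative only.

```python
from itertools import product

# The two competing prefix meanings and the two units a sloppy "b" might stand for.
PREFIX_VALUES = {"SI kilo": 1000, "CS kilo": 1024}
UNIT_BITS = {"bit": 1, "byte": 8}


def possible_meanings(count=1):
    """Return every reading, in bits, of `count` ambiguous 'kb'."""
    return {
        f"{prefix} + {unit}": count * factor * bits
        for (prefix, factor), (unit, bits) in product(
            PREFIX_VALUES.items(), UNIT_BITS.items()
        )
    }


meanings = possible_meanings()
for reading, bits in meanings.items():
    print(f"{reading}: {bits:,} bits")

# Ratio between the largest and smallest reading of the same symbol.
spread = max(meanings.values()) / min(meanings.values())
print(f"largest reading is {spread:.3f} times the smallest")
```

The largest reading is more than eight times the smallest, which is a lot of room for error in a single two-letter symbol.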

The Solution

So how should we solve these problems? For starters, we need to replace the term byte with one that has a more obvious abbreviated distinction from bit. Just as Kelvin urged Americans to follow the lead of the French, I am urging everyone to use the French term for 8 bits: octet. This term -- born out of anglophobia -- is both unambiguous and descriptive. Octet literally means a group of eight. In the context of informational measurement, it means 8 bits. Octet is abbreviated o, so there's no confusing it with bit. Plus, the French have already been using it for years, with no problems. Thus:

Base Units Of Binary Information
Unit	Symbol	Value
bit	b	the quantum of binary information
octet	o	8 bits (formerly 1 byte)

Next, we need to stop misusing the Metric prefixes. Kilo is defined as 1,000 and it should never be used for anything else. Instead, we should widely adopt the binary prefixes that were approved as a standard in 1998 by the International Electrotechnical Commission (IEC -- whose first president, incidentally, was Lord Kelvin). These prefixes are as follows:

Binary Multiple Prefixes
Prefix	Symbol	Value
kibi-	Ki	2¹⁰ (1,024)
mebi-	Mi	2²⁰ (1,048,576)
gibi-	Gi	2³⁰ (1,073,741,824)
tebi-	Ti	2⁴⁰ (1,099,511,627,776)
pebi-	Pi	2⁵⁰ (1,125,899,906,842,624)
exbi-	Ei	2⁶⁰ (1,152,921,504,606,846,976)
zebi-	Zi	2⁷⁰ (1,180,591,620,717,411,303,424)
yobi-	Yi	2⁸⁰ (1,208,925,819,614,629,174,706,176)

(For more on these prefixes, see the official standard in IEC 60027-2, the IEC article "When is a kilobyte a kibibyte?", and the NIST reference page on prefixes for binary multiples.)
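Each value in the prefix table is just the next power of 2¹⁰. As a quick check, this short Python sketch regenerates the table; the list and function names are my own, not part of any standard.

```python
# Binary prefixes in ascending order, as listed in the table above.
BINARY_PREFIXES = ["kibi", "mebi", "gibi", "tebi", "pebi", "exbi", "zebi", "yobi"]


def prefix_value(prefix):
    """Return the multiplier for a binary prefix: 2 ** (10 * n) for the n-th prefix."""
    n = BINARY_PREFIXES.index(prefix) + 1
    return 2 ** (10 * n)


for name in BINARY_PREFIXES:
    print(f"{name}-\t{prefix_value(name):,}")
```

Python integers have arbitrary precision, so even the yobi- value comes out exact.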

Combining these prefixes with the base unit bit, we get the following binary exponential units:

Binary Exponential Units of Data (Bit)
Unit	Symbol	Number Of Bits
kibibit	Kib	1,024 bits
mebibit	Mib	1,048,576 bits
gibibit	Gib	1,073,741,824 bits
tebibit	Tib	1,099,511,627,776 bits
pebibit	Pib	1,125,899,906,842,624 bits
exbibit	Eib	1,152,921,504,606,846,976 bits
zebibit	Zib	1,180,591,620,717,411,303,424 bits
yobibit	Yib	1,208,925,819,614,629,174,706,176 bits

And combining the prefixes with the base unit octet, we get:

Binary Exponential Units of Data (Octet)
Unit	Symbol	Number Of Bits (Deprecated Unit)
kibioctet	Kio	8,192 bits (kilobyte)
mebioctet	Mio	8,388,608 bits (megabyte)
gibioctet	Gio	8,589,934,592 bits (gigabyte)
tebioctet	Tio	8,796,093,022,208 bits (terabyte)
pebioctet	Pio	9,007,199,254,740,992 bits (petabyte)
exbioctet	Eio	9,223,372,036,854,775,808 bits (exabyte)
zebioctet	Zio	9,444,732,965,739,290,427,392 bits (zettabyte)
yobioctet	Yio	9,671,406,556,917,033,397,649,408 bits (yottabyte)
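To show how these units might look in practice, here is a small Python sketch that renders a raw octet count using the symbols from the table. The function `format_octets` and its two-decimal output format are assumptions made for illustration, not part of the IEC standard.

```python
# Symbols from the octet table, preceded by the bare octet.
SYMBOLS = ["o", "Kio", "Mio", "Gio", "Tio", "Pio", "Eio", "Zio", "Yio"]


def format_octets(octets):
    """Render a non-negative integer octet count with the largest fitting binary prefix."""
    if octets < 0:
        raise ValueError("octet count must be non-negative")
    index = 0
    # Step up one prefix for each full factor of 1,024 the count contains.
    while index < len(SYMBOLS) - 1 and octets >= 1024 ** (index + 1):
        index += 1
    if index == 0:
        return f"{octets} o"
    return f"{octets / 1024 ** index:.2f} {SYMBOLS[index]}"


print(format_octets(512))
print(format_octets(1536))
print(format_octets(10_000_000_000))
```

Fed the 10,000,000,000 octets of the hard drive example, this reports 9.31 Gio, with no room for doubt about which "giga" was meant.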

(For brevity, I have referred to the system above and the two sets of binary exponential units derived from it as the "kibioctet standard" -- or informally just "kibioctets", as in "promote kibioctets" below -- since the unit kibioctet concisely shows the two differences separating the standard from the deprecated nonstandard standard currently in use.)

Promote Kibioctets

Feel free to use these badges to show your site's compliance with the kibioctet standard and to promote kibioctet usage and understanding:
