What is a bit? A bit, or binary digit, is a digit that can take on one of two values: 0 or 1.
This topic is discussed in Episode 19 of the Local Maximum.
When viewed as a digit, bits can come only in whole amounts; there is no way to receive part of a digit. However, there is another way to view a bit: as a unit of information that measures either the number of possible configurations of a piece of data, or how much that data subverts the expectations of the receiver. In both cases, the idea of a bit retains its original meaning when it refers to a single binary digit, but is expanded to include other objects as well.
As a Measure of Configurations in Data
A bit carries data that can take exactly 2 different values. If 2 bits are received, 4 configurations are possible, and if n bits are received, 2^n configurations are possible. Working backwards, this means that if there are N possible configurations, the number of bits used to represent this data is log_2(N), or ln(N)/ln(2).
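As a quick sketch of this counting in both directions (the specific value of N here is just an arbitrary example):

```python
import math

# n bits can distinguish 2**n configurations.
for n in range(1, 5):
    print(n, "bits ->", 2 ** n, "configurations")

# Working backwards: N configurations take log_2(N) = ln(N)/ln(2) bits.
N = 256
bits = math.log(N) / math.log(2)   # equivalently math.log2(N)
print(N, "configurations ->", bits, "bits")   # 8.0
```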
Sometimes the number of configurations is not a power of 2. For example, consider a decimal digit (the usual digit with values 0 through 9), which has 10 different values. If we are required to represent this in bits alone, 4 bits are needed, but they are overkill: 4 bits can actually represent 16 values.
Therefore, as a measure of “bits of information”, a normal base-10 digit is worth ln(10)/ln(2) bits, or about 3.3219…
In this situation, the number of bits no longer has to be whole, but the number of configurations still does. Therefore, the value will always be ln(N)/ln(2), where N is a positive integer. It also means that a piece of data cannot be worth more than zero bits but less than one bit.
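A small sketch of the decimal-digit arithmetic, simply evaluating the formula above:

```python
import math

N = 10                               # configurations of a decimal digit
info_bits = math.log(N) / math.log(2)
whole_bits = math.ceil(info_bits)    # smallest whole number of bits that suffices

print(info_bits)     # 3.3219... bits of information
print(whole_bits)    # 4 whole bits, which are overkill: they can represent 16 values
```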
But there is another way to think about it where this is indeed possible - explained below!
As Subversion of Expectation
The definition of a bit as a digit does not consider the relative likelihood of receiving a 1 or a 0. But if that likelihood is taken into account, then one bit of information corresponds to a situation where 0 and 1 are equally likely. If only 0s are possible, then a received 0 does not convey any information, and if 0s are more likely than 1s, we can say that the information received from a 0 is less than one bit (but more than zero).
Using the observation above, we can conclude that if a particular piece of information has probability p, then the number of “bits” of information received by its confirmation is ln(1/p)/ln(2), or -ln(p)/ln(2). This also suggests a natural unit of information, the nat, which is equal to -ln(p), or the number of bits times ln(2).
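A minimal sketch of these two formulas, where the example probabilities are arbitrary illustrations:

```python
import math

def info_bits(p):
    """Information, in bits, conveyed by confirming an event of probability p."""
    return -math.log(p) / math.log(2)

def info_nats(p):
    """The same quantity in nats: the number of bits times ln(2)."""
    return -math.log(p)

print(info_bits(0.5))   # 1.0   -> an even coin flip is worth exactly one bit
print(info_bits(0.9))   # 0.152 -> a likely message carries less than one bit
print(info_nats(0.5))   # 0.693 -> ln(2) nats per bit
```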
A few points must be made about this information measurement scheme:
First, these units of information can actually be mapped back into the concept of bits that we know and love by receiving the same message (with the same expectation of frequency) many times over. For example, if 0 is very common, then we might build an encoding scheme where a short block of bits represents a long string of 0s, whereas 1s (which are rare) take more bits to encode. Looking at these schemes over the long run, it turns out that the average number of bits it takes to represent a 0 converges to the information content of a 0 in bits, and the same holds for 1s. (A rough sketch of this convergence appears at the end of this section.)
Second, the amount of information contained in a message should not be confused with the value of that information. In the real world, petabytes of information can be mostly useless while a single bit could contain all the value in the world.
Third, note that this measurement scheme relies on the message receiver's subjective expectation of how probable or improbable each message is. Because these are subjective determinations, the information content of a message depends on the observer and their view of the world.
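As a rough sketch of the convergence mentioned in the first point, one could encode blocks of symbols from a biased 0/1 source with a Huffman code (one possible encoding scheme, not necessarily the one intended above) and watch the average bits per symbol approach the per-symbol information content; the probability of 0.9 for a 0 is an arbitrary illustration:

```python
import heapq
import itertools
import math

def huffman_lengths(probs):
    """Return the Huffman code length for each symbol, given symbol -> probability."""
    counter = itertools.count()                  # tie-breaker so dicts are never compared
    heap = [(p, next(counter), {sym: 0}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

p0 = 0.9                                         # a 0 is common, a 1 is rare
entropy = -(p0 * math.log2(p0) + (1 - p0) * math.log2(1 - p0))

for block in (1, 2, 4, 8):
    # Treat every run of `block` symbols as one super-symbol and code the super-symbols.
    symbols = (''.join(t) for t in itertools.product('01', repeat=block))
    probs = {s: p0 ** s.count('0') * (1 - p0) ** s.count('1') for s in symbols}
    lengths = huffman_lengths(probs)
    avg_bits = sum(probs[s] * lengths[s] for s in probs) / block
    print(f"block={block}: {avg_bits:.3f} bits/symbol (information content ~ {entropy:.3f})")
```

As the blocks get longer, the average cost per symbol drops from 1 bit toward the roughly 0.469 bits of information each symbol carries when 0s arrive 90% of the time.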