Redundancy (information theory)

Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. Data compression is a way to eliminate unwanted redundancy, while checksums are a way of adding desired redundancy for purposes of error detection when communicating over a noisy channel of limited capacity.

Quantitative definition
Recall that the rate of a source of information is (in the most general case)


 * $$r=\mathbb E H(M_t|M_{t-1},M_{t-2},M_{t-3}, \dots), $$

the expected, or average, conditional entropy per message (i.e. per unit time) given all the previous messages generated. It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a memoryless source is simply $$H(M)$$, since by definition there is no interdependence of the successive messages of a memoryless source.
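
As a minimal sketch of these two cases, the Python below computes the entropy of a memoryless source and the rate of a stationary first-order Markov source, for which conditioning on the whole past reduces to conditioning on the previous message. The function names, the example probabilities, and the two-state transition matrix are illustrative assumptions.

```python
import math


def entropy_bits(probs):
    """Shannon entropy, in bits, of one probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Zero-probability messages contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def markov_rate_bits(stationary, transition):
    """Rate of a stationary first-order Markov source, in bits per message.

    stationary[i] is the long-run probability of state i, and
    transition[i] is the distribution of the next message given state i.
    The rate is the entropy of each row, averaged over the states.
    """
    return sum(pi * entropy_bits(row) for pi, row in zip(stationary, transition))


# Hypothetical memoryless source over four messages: r = H(M).
memoryless_rate = entropy_bits([0.5, 0.25, 0.125, 0.125])

# Hypothetical two-state source that repeats its last message 90% of the time.
sticky_rate = markov_rate_bits([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]])

print(memoryless_rate)  # 1.75
print(sticky_rate)      # about 0.469, well below the 1 bit of a fair coin
```

The second source emits each of its two messages half the time, yet its rate is less than half a bit per message, because the previous message makes the next one largely predictable.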

The absolute rate of a language or source is simply


 * $$R = \log |M| ,\,$$

the logarithm of the cardinality of the message space, or alphabet. (This formula is sometimes called the Hartley function.) This is the maximum possible rate of information that can be transmitted with that alphabet. (The logarithm should be taken to a base appropriate for the unit of measurement in use.) The absolute rate is equal to the rate if and only if the source is memoryless and has a uniform distribution.
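
The effect of the logarithm base on the unit can be sketched in a few lines of Python. The helper name `absolute_rate` and the 26-letter alphabet (ignoring spaces and punctuation) are assumptions made for illustration.

```python
import math


def absolute_rate(alphabet_size, base=2.0):
    """Absolute rate R = log|M| for an alphabet of the given size.

    base=2 gives bits, base=math.e gives nats, base=10 gives hartleys.
    """
    if alphabet_size < 1:
        raise ValueError("alphabet must contain at least one symbol")
    return math.log(alphabet_size) / math.log(base)


# Assumed example: a 26-letter alphabet with no spaces or punctuation.
print(absolute_rate(26))          # about 4.70 bits per letter
print(absolute_rate(26, math.e))  # about 3.26 nats per letter
print(absolute_rate(26, 10))      # about 1.41 hartleys per letter
```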

The absolute redundancy can then be defined as


 * $$ D = R - r ,\,$$

the difference between the absolute rate and the rate.

The quantity $$\frac D R $$ is called the relative redundancy and gives the maximum possible data compression ratio, when expressed as the percentage by which a file size can be decreased. (When expressed as a ratio of original file size to compressed file size, the quantity $$R : r$$ gives the maximum compression ratio that can be achieved.) Complementary to the concept of relative redundancy is efficiency, defined as $$\frac r R .$$  A memoryless source with a uniform distribution has zero redundancy (and thus 100% efficiency), and cannot be compressed.
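
The quantities defined in this section can be checked numerically. The sketch below assumes a memoryless source, so that the rate is simply $$H(M)$$, and works in bits. The function name `redundancy_summary` and both example distributions are hypothetical.

```python
import math


def redundancy_summary(probs):
    """Redundancy figures, in bits, for a memoryless source."""
    if len(probs) < 2:
        raise ValueError("need at least two symbols so that R > 0")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    rate = -sum(p * math.log2(p) for p in probs if p > 0)  # r = H(M)
    absolute = math.log2(len(probs))                       # R = log2 |M|
    redundancy = absolute - rate                           # D = R - r
    return {
        "rate": rate,
        "absolute_rate": absolute,
        "absolute_redundancy": redundancy,
        "relative_redundancy": redundancy / absolute,      # D / R
        "efficiency": rate / absolute,                     # r / R
        # R : r, unbounded when the source carries no information.
        "max_compression_ratio": absolute / rate if rate > 0 else math.inf,
    }


skewed = redundancy_summary([0.5, 0.25, 0.125, 0.125])
uniform = redundancy_summary([0.25, 0.25, 0.25, 0.25])

print(skewed["absolute_redundancy"])   # 0.25
print(skewed["relative_redundancy"])   # 0.125
print(skewed["efficiency"])            # 0.875
print(uniform["relative_redundancy"])  # 0.0
```

Under these assumptions, a file from the skewed source that is stored at a fixed 2 bits per symbol could shrink by at most 12.5%, while the uniform source has zero redundancy and leaves nothing to remove.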