Coding and Compression: A Happy Union of Theory and Practice

The mathematical theory behind coding and compression began in 1948, a little more than fifty years ago, with the publication of Claude Shannon's (1948) paper "A Mathematical Theory of Communication" in the Bell System Technical Journal. This paper laid the foundation for what is now known as information theory, in a mathematical framework that is probabilistic (see, e.g., Cover and Thomas 1991, Verdu 1998). That is, Shannon modeled the signal or message process by a random process and a communication channel by a random transition matrix that may distort the message. In the five decades that followed, information theory provided fundamental limits for communication in general and coding and compression in particular. These limits, predicted by information theory under probabilistic models, are now being approached in real products such as computer modems. Since these limits, or fundamental communication quantities such as entropy and channel capacity, vary from signal process to signal process or from channel to channel, they have to be estimated for each communication setup. In this sense, information theory is intrinsically statistical. Moreover, the algorithmic theory of information has inspired an extension of Shannon's ideas that provides a formal measure of information of the kind long sought in statistical inference and modeling. This measure has led to the Minimum Description Length (MDL) principle for modeling in general and model selection in particular (see Rissanen 1978, Rissanen 1989, Barron, Rissanen and Yu 1998, Hansen and Yu 1998).
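To make concrete the point that quantities such as entropy must be estimated from data, the following minimal sketch computes a plug-in entropy estimate from the empirical symbol frequencies of an observed message; the function name and the sample message are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter

def empirical_entropy(symbols):
    """Plug-in estimate of source entropy in bits per symbol,
    using observed relative frequencies in place of the true
    (unknown) symbol probabilities."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical example: estimate the entropy of a short binary-alphabet message.
message = "AABABBBAABAAABAB"
print(f"Estimated entropy: {empirical_entropy(message):.3f} bits/symbol")
```

The estimate converges to the true per-symbol entropy only under assumptions on the source (for instance, independent and identically distributed symbols); for dependent sources, more elaborate estimators are needed, which is precisely where the statistical character of information theory enters.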