
Information Theory

Some theories excite the popular imagination so strongly that they acquire broad cultural influence, and ideas and terminology from the technical core reach well beyond the original domain of application. Information theory is one such body of ideas, canonized by the name we give to our times ("the age of information"). The initial developments of information theory, notably at Bell Laboratories, were undertaken by engineers and mathematicians such as Harry Nyquist and Ralph Hartley, who were trying to rationalize the design of communication systems and drew on concepts from statistical physics and signal processing. Claude Shannon, in a series of papers beginning with "A Mathematical Theory of Communication," gave the field its definitive shape and used similar ideas to develop a theory of cryptographic systems. Important contributions were also made by Norbert Wiener and Andrei Kolmogorov. The resulting framework remains one of the most conceptually well-characterized engineering theories. The phrase "information theory" can be used with a broad or a narrow scope, depending on taste; it is used in this entry in a relatively narrow way, to refer to the "mathematical theory of communication," as Shannon did.

Concepts from information theory and signal processing enter into the study of perception in two ways. The first is to study some aspect of perception from the point of view of a communication engineer: How good are our visual, auditory, gustatory, olfactory, and other "sensors"? Do they pick up signals with high fidelity, and how rapidly can they sense changes in the environment? The second, related usage is in statistical measures for analyzing recordings of behavior or neural activity.

The original concern of information theory was to understand the communication of messages over noisy channels. Consider a binary channel in which one symbol can be sent per time slot, with two possible values ("yes," "no"). Now suppose that the channel is noisy, so that with probability p a "yes" becomes a "no" during transit through the channel, and vice versa. If this happens relatively rarely (p < 1/2), one could still send a reliable "yes" message through the channel by sending a long block of "yes" symbols. Some of these will become "no" because of the noise process, but by taking the majority vote, one should be able to recover the original message. For a finite block length, there is always a possibility that a chance fluctuation will lead to the wrong majority vote, but this possibility of error should intuitively decrease as the block is made longer. Mathematically, the law of large numbers can be used to show that as long as the probability of changing a "yes" message to a "no" message is less than one half, the probability of obtaining the wrong message by majority vote goes to zero as the block size becomes infinitely large. This corresponds to sending an unambiguous, error-free message over a noisy channel.
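This majority-vote scheme is easy to simulate. The sketch below is illustrative Python (the function names are my own, not from the entry): it sends a block of "yes" symbols, encoded as 1s, through a binary symmetric channel with crossover probability p and decodes by majority vote.

```python
import random

def send_through_bsc(bits, p):
    """Binary symmetric channel: flip each bit independently with probability p."""
    return [b ^ (random.random() < p) for b in bits]

def majority_vote(bits):
    """Decode a repetition block as 1 if more than half the received bits are 1."""
    return int(sum(bits) > len(bits) / 2)

def error_rate(block_len, p, trials=100_000):
    """Estimate how often majority-vote decoding of an all-1 block fails."""
    errors = 0
    for _ in range(trials):
        received = send_through_bsc([1] * block_len, p)
        if majority_vote(received) != 1:
            errors += 1
    return errors / trials

p = 0.2  # crossover probability, assumed < 1/2
for n in (1, 5, 25, 125):  # odd lengths avoid ties in the vote
    print(f"block length {n:4d}: estimated error rate {error_rate(n, p):.5f}")
```

In a typical run with p = 0.2, a single symbol is received in error 20% of the time, while a 125-symbol block is misdecoded only vanishingly often, which is the law-of-large-numbers behavior described above.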

There is a catch with this strategy, however: Because infinitely long blocks are needed to make a single "yes" message truly noise-free, it would take infinitely long to send a single noise-free binary message. The rate of error-free communication (which is inversely proportional to the time taken to send a single error-free binary message) then drops to zero. Shannon's insight was to realize that by appropriately introducing redundancy into a time-varying signal, error-free communication is possible at a finite rate per unit time, called the information capacity of the channel.
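The entry does not state the capacity formula here, but for the binary channel described above it is a standard result: a binary symmetric channel with crossover probability p has capacity C = 1 - H(p) bits per symbol, where H(p) = -p log2(p) - (1 - p) log2(1 - p) is the binary entropy. The illustrative sketch below computes this.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0  # by convention, 0 * log(0) = 0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

# A noiseless channel (p = 0) carries 1 bit per use; a channel that flips
# symbols half the time (p = 1/2) carries no information at all.
for p in (0.0, 0.1, 0.2, 0.5):
    print(f"p = {p:.1f}: capacity = {bsc_capacity(p):.3f} bits per channel use")
```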

...
