
Central Limit Theorem

The central limit theorem (CLT) is, along with the theorems known as laws of large numbers, the cornerstone of probability theory. In simple terms, the theorem describes the distribution of the sum of a large number of random numbers, all drawn independently from the same probability distribution. It predicts that, regardless of this distribution, as long as it has finite variance, the sum, once centered and rescaled, approximately follows a precise law, or distribution, known as the normal distribution.

Let us describe the normal distribution with mean μ and variance σ². It is defined through its density function,

$$ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), $$

where the variable x ranges from −∞ to +∞. This means that if a random variable follows this distribution, then the probability that it is larger than a and smaller than b is equal to the integral of the function f(x) (the area under the graph of the function) from x = a to x = b. The normal density is also known as Gaussian density, named for Carl Friedrich Gauss, who used this function to describe astronomical data. If we put μ = 0 and σ² = 1 in the above formula, then we obtain the so-called standard normal density.
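The "area under the graph" reading of the density can be checked numerically. The following is a minimal sketch, not part of the original article: the function names `normal_pdf`, `normal_cdf`, and `prob_between_numeric` are illustrative choices. It integrates the density with the midpoint rule and compares the result with the closed form based on the error function.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_between_numeric(a, b, mu=0.0, sigma=1.0, steps=10_000):
    """Approximate the integral of the density from a to b with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(normal_pdf(a + (k + 0.5) * h, mu, sigma) for k in range(steps))

a, b = -1.0, 1.0
area_numeric = prob_between_numeric(a, b)      # area under the standard normal density
area_closed = normal_cdf(b) - normal_cdf(a)    # same probability via the error function
print(area_numeric, area_closed)               # both close to 0.6827
```

Both numbers estimate the probability that a standard normal variable lies between −1 and 1.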

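In precise mathematical language, the CLT states the following: Suppose that X_1, X_2, … are independent random variables with the same distribution, having mean μ and variance σ² but being otherwise arbitrary. Let S_n = X_1 + … + X_n be their sum. Then, for all a < b,

$$ \lim_{n\to\infty} P\left(a < \frac{S_n - n\mu}{\sigma\sqrt{n}} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2}\,dx. $$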
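The limit statement of the CLT can be illustrated by simulation. The sketch below is an illustration under stated assumptions rather than anything from the original article: it takes the X_i to be Uniform(0, 1) variables (mean 1/2, variance 1/12), fixes n = 50, and uses the hypothetical helper names `standardized_sum`, `empirical_prob`, and `std_normal_prob`. It estimates the probability that the centered and rescaled sum (S_n − nμ)/(σ√n) falls between a and b, and compares it with the corresponding standard normal probability.

```python
import math
import random

def standardized_sum(n, rng):
    """Draw n Uniform(0, 1) values and return (S_n - n*mu) / (sigma * sqrt(n))."""
    mu = 0.5                       # mean of Uniform(0, 1)
    sigma = math.sqrt(1.0 / 12.0)  # standard deviation of Uniform(0, 1)
    s_n = sum(rng.random() for _ in range(n))
    return (s_n - n * mu) / (sigma * math.sqrt(n))

def empirical_prob(a, b, n, trials, seed=0):
    """Fraction of simulated standardized sums that land strictly between a and b."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if a < standardized_sum(n, rng) < b)
    return hits / trials

def std_normal_prob(a, b):
    """Integral of the standard normal density from a to b."""
    def phi(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return phi(b) - phi(a)

estimate = empirical_prob(-1.0, 1.0, n=50, trials=20_000)
limit = std_normal_prob(-1.0, 1.0)
print(estimate, limit)  # the simulated frequency should be near the normal probability
```

Replacing the uniform draws with any other finite-variance distribution (and its own μ and σ) should give a similar agreement, which is the point of the theorem.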

It is more appropriate to define the standard normal density as the density of a random variable ζ with zero mean and variance 1 with the property that, for every a and b, there is c such that if ζ_1, ζ_2 are independent copies of ζ, then

$$ a\zeta_1 + b\zeta_2 $$
is a copy of cζ (that is, it has the same distribution as cζ). It follows that
$$ c^2 = a^2 + b^2 $$
holds and that there is only one choice for the density of ζ, namely, the standard normal density.
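The relation between a, b, and c comes from comparing variances: for independent ζ_1, ζ_2 with variance 1, the combination aζ_1 + bζ_2 has variance a² + b², while cζ has variance c². The following sketch checks this numerically for standard normal samples. The values a = 3 and b = 4, the sample size, and the variable names are arbitrary choices made for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)
a, b = 3.0, 4.0
c = math.sqrt(a * a + b * b)  # c = 5 when a = 3 and b = 4

# Samples of a*zeta_1 + b*zeta_2 with independent standard normal zeta_1, zeta_2.
combo = [a * rng.gauss(0.0, 1.0) + b * rng.gauss(0.0, 1.0) for _ in range(50_000)]

sample_var = statistics.pvariance(combo)
# If the combination is distributed like c*zeta, then |combo| < c should occur
# about as often as |zeta| < 1 does for a standard normal zeta.
frac_within_c = sum(1 for v in combo if abs(v) < c) / len(combo)

print(sample_var, c * c)  # sample variance should be near c^2 = 25
print(frac_within_c)      # should be near P(|zeta| < 1)
```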

As an example, consider tossing a fair coin n = 1,000 times and determining the probability that at most 450 heads are obtained. The CLT can be used to give a good approximation of this probability. Indeed, if we let X_i be a random variable that takes value 1 if heads show up at the ith toss or value 0 if tails show up, then we see that the assumptions of the CLT are satisfied because the random variables are independent and have the same distribution, with mean μ = 1/2 and variance σ² = 1/4. On the other hand,

$$ S_n = X_1 + \cdots + X_n $$
is the number of heads. Since S_n ≤ 450 if and only if
$$ \frac{S_n - n\mu}{\sigma\sqrt{n}} \le \frac{450 - 500}{\tfrac{1}{2}\sqrt{1000}} \approx -3.162, $$
we find, by the CLT, that the probability that we get at most 450 heads approximately equals the integral of the standard normal density from −∞ to −3.162. This integral can be computed with the help of a computer (or, in earlier times, with tables) and is found to be about 0.00078, which is a reasonable approximation. Incidentally, this kind of reasoning underlies statistical hypothesis testing: If we toss a coin 1,000 times and see 430 heads and 570 tails, then we should suspect that the coin is not fair.
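The arithmetic of the coin example can be reproduced in a few lines. This is a minimal sketch with illustrative variable names, not code from the original article: it computes the standardized threshold, the normal approximation through the error function, and, for comparison, the exact binomial probability of at most 450 heads (using `math.comb`, which requires Python 3.8 or later).

```python
import math

n, p = 1000, 0.5
mu = p                            # mean of a single toss: 1/2
sigma = math.sqrt(p * (1 - p))    # standard deviation of a single toss: 1/2

# Standardized threshold (450 - n*mu) / (sigma * sqrt(n)).
z = (450 - n * mu) / (sigma * math.sqrt(n))

# Normal approximation: integral of the standard normal density from -inf to z.
clt_approx = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Exact probability of at most 450 heads in 1,000 fair tosses.
exact = sum(math.comb(n, k) for k in range(451)) / 2 ** n

print(z)           # about -3.162
print(clt_approx)  # normal approximation
print(exact)       # exact binomial tail, for comparison
```

The exact sum uses integer arithmetic until the final division, so it does not suffer from floating-point underflow in the individual terms.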

...
