
Significance Level, Concept of

The concept of significance level originates from the discipline of statistical inference, which may be summarized as the application of the scientific method to observed or experimentally collected data. In this setting, the data are assumed to be described by some stochastic model, meaning that random variation is associated with the variables being measured.

First, an appropriate model for the data must be determined. Then two hypotheses, known as the null hypothesis (typically denoted H0) and the alternative hypothesis (typically denoted Ha), are formulated. These hypotheses are stated precisely in terms of a parameter of the chosen model, such as a mean, proportion, or standard deviation. The null hypothesis specifies a particular value for the parameter of interest, and the alternative hypothesis covers a range of possibilities different from that value. The strength of the evidence against the null hypothesis is then assessed by computing the probability, assuming the null hypothesis is in fact correct, of observing data at least as extreme as the data actually obtained; this probability is called the p value of the statistical test. Because the data are assumed to be the product of a random process, it can never be concluded with perfect certainty that the null hypothesis is right or wrong, because even highly unlikely outcomes occur from time to time. Rejecting a null hypothesis that is actually true is called a Type I error. Common practice is to reject the null hypothesis when the p value falls below a fixed threshold, such as .05 or .01, chosen before the data are examined, so that the probability of a Type I error is kept small. These thresholds are known as significance levels, and statements such as “the test was significant at the .05 level” are often made as an alternative to reporting a precise p value.
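The reject/fail-to-reject decision described above reduces to a simple comparison of the p value against a pre-chosen significance level. The following is a minimal sketch of that comparison; the function name and wording are illustrative, not taken from the source.

```python
# Sketch of the decision rule: reject H0 when the p value falls
# below the chosen significance level alpha. Names are illustrative.
def decide(p_value, alpha=0.05):
    """Return the test decision at significance level alpha."""
    if p_value < alpha:
        return f"reject H0 at the {alpha} significance level"
    return f"fail to reject H0 at the {alpha} significance level"

print(decide(0.03))  # reject H0 at the 0.05 significance level
print(decide(0.11))  # fail to reject H0 at the 0.05 significance level
```

Note that "fail to reject" is deliberately not "accept": a large p value signals insufficient evidence against H0, not evidence in its favor.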

As an example, suppose a referee for an upcoming professional football game is responsible for the official coin toss. Before the game, the referee decides to check whether the coin is “fair” by performing the simple experiment of tossing it ten times and counting the number of heads observed. If the coin is fair, then the number of heads follows a binomial distribution with parameters n = 10 (number of tosses) and p = .5 (probability of a head on each toss). The referee formulates the null and alternative hypotheses, H0: p = .5 and Ha: p ≠ .5. This is known as a two-sided alternative, which allows for the possibility that the true value of p is either larger or smaller than the value specified in the null hypothesis. After tossing the coin ten times, the referee observes eight heads and two tails. The probability of an outcome at least as extreme as this one (eight or more heads, or eight or more tails), assuming the coin is actually fair, turns out to be .11. In other words, even a fair coin would produce a result this lopsided about 11% of the time, so the observed tosses are not strong evidence against fairness. Because the p value is not smaller than .05, the referee cannot reject the null hypothesis at that significance level, nor at the less conservative .10 level. This does not mean, however, that the coin is fair and that the referee can accept the null hypothesis. It simply means that the available data provide insufficient evidence to reject this hypothesis with a high degree of statistical confidence.
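The two-sided p value quoted for the coin-toss example can be computed exactly from the binomial distribution. The sketch below recreates that calculation with standard-library Python; the helper names are illustrative, not from the source.

```python
# Exact two-sided binomial p value for the coin-toss example:
# 8 heads in 10 tosses under H0: p = .5. Names are illustrative.
from math import comb

n, p0 = 10, 0.5          # number of tosses; fair-coin null value
observed_heads = 8

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n tosses when P(head) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# An outcome "at least as extreme" is one at least as far from the
# expected 5 heads as the observed 8, i.e. 8 or more, or 2 or fewer.
extreme = abs(observed_heads - n * p0)  # distance from expectation = 3
p_value = sum(binom_pmf(k, n, p0)
              for k in range(n + 1)
              if abs(k - n * p0) >= extreme)

print(round(p_value, 2))  # 0.11, the value quoted in the text
```

The exact value is 112/1024 ≈ .109, which rounds to the .11 reported above; since it exceeds both .05 and .10, the null hypothesis is not rejected at either level.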

...
