
Significance, Statistical

Statistical significance refers to a difference between two measurements that results from something more than chance. Every research project, experiment, or study that counts, measures, quantifies, or otherwise collects data must ultimately compare those data with some other measurement or standard. If a difference between two measurements, or between a measurement and a standard, is detected, that difference is statistically significant if it reflects an actual difference between the two rather than mere random variation.

For example, a psychologist might be interested in determining which of two treatments is more effective in treating depression. Having chosen some measure of each treatment's effectiveness, the researcher might then administer the two treatments to selected individuals as part of a designed experiment. Of course, any measure of a treatment's effectiveness, no matter how objective, will only be an estimate of the actual effect. This estimate will vary from the true amount because of many factors, including idiosyncrasies in the testing process, errors in subjective judgment, flaws in measurement, or any number of other sources. Because of the inherent variability in the estimation of any unknown population parameter (in this case, the true effectiveness of each treatment), in comparing these two measures, the researcher must determine whether the difference between the two results is due only to variability in the estimation process or to an actual difference between the measurements. If the latter is true, then the difference in the measurements is said to be statistically significant. Although there are other methods, this determination is made most frequently by means of hypothesis testing. After briefly discussing the history, this entry discusses hypothesis testing and multiple comparisons and then the objections to statistical significance testing.
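The role of random variation described above can be made concrete with a small simulation. In this sketch (the numbers are invented for illustration and do not come from the entry), two "treatments" have exactly the same true effectiveness, yet their sample means still differ simply because of sampling noise:

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

# Two hypothetical treatments with the SAME true mean effectiveness.
# Any observed difference between the sample means is pure sampling noise.
true_mean, sd, n = 50.0, 10.0, 30

group_a = [random.gauss(true_mean, sd) for _ in range(n)]
group_b = [random.gauss(true_mean, sd) for _ in range(n)]

diff = statistics.mean(group_a) - statistics.mean(group_b)
print(f"observed difference in sample means: {diff:.2f}")
```

The printed difference is nonzero even though the true difference is exactly zero. Hypothesis testing, discussed below, is the standard tool for judging whether an observed difference is larger than this kind of noise can plausibly explain.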

History

Although the formal development of hypothesis testing would not begin until 1925, less formal, ad hoc testing for statistical significance was being done around the turn of the 20th century. In 1908, William Gosset, who is commonly known as "Student," developed his t test for the mean of a normally distributed population with unknown population standard deviation, and before that, in 1900, Karl Pearson published his chi-square test for significance with frequency distributions. Perhaps the earliest example of testing for statistical significance is a paper entitled "An Argument for Divine Providence Taken From the Constant Regularity of the Births of Both Sexes" by John Arbuthnot, written in 1710, in which he examined birth records in London and concluded that there was good reason to think that the birth rate of males was higher than that of females (i.e., significantly higher). It was not until 1925, however, that R. A. Fisher began the formal development of testing for statistical significance. His work, along with that of Jerzy Neyman and Egon Pearson a few years later, is the foundation for what is known today as hypothesis testing.

Hypothesis Testing

Fisher, and others writing on this topic at that time, were influenced by the view, later advanced most forcefully by Karl Popper, that scientific theories must be falsifiable. To that end, the chief purpose of hypothesis testing is not to determine the actual size of the difference between two measurements, but rather to demonstrate that the difference exists (i.e., is not zero) given some observed data. Specifically, hypothesis testing requires two hypotheses: the null hypothesis (often written H0) and the alternative hypothesis (often written Ha or H1). The null hypothesis is a straw man; it is the theory that the researcher is attempting to falsify by experimentation. The alternative hypothesis is a statement of what the researcher believes to be the true state of affairs. For instance, if a sociologist performs research to determine whether after-school programs reduce the likelihood that participants will be involved in violent crime, the appropriate null hypothesis is that such programs do not reduce the likelihood that participants will be involved in violent crime, whereas one alternative hypothesis might be that these programs do, in fact, reduce such crimes. An educational researcher might want to determine whether preschool attendance increases test scores in at-risk children. That researcher's null hypothesis would be that preschool does not increase test scores, whereas the alternative hypothesis might suggest that it does. In practice, the null hypothesis generally involves the "equals" sign, whereas the alternative hypothesis employs some sort of inequality. Typically, two types of alternative hypotheses are used: one-sided and two-sided. Although the null hypothesis states the simple and specific equality that the researcher seeks to disprove, the one-sided alternative hypothesis gives the direction in which the true value differs from the hypothesized value.
A one-sided alternative hypothesis can be right-tailed, indicating that the true value of the population parameter under consideration is greater than the value hypothesized in H0, or left-tailed, indicating that the true value is less than the hypothesized value. For example, if a particular null hypothesis states that the true mean of a given population is, say, 5, then the corresponding right-tailed alternative hypothesis would be that the true mean is greater than 5, whereas the corresponding left-tailed hypothesis is that the true mean is less than 5. A two-tailed alternative hypothesis differs only in that it does not indicate direction (e.g., the true mean is not equal to 5). These hypotheses must be chosen before the data are collected. If the researcher allows the data to influence the choice of hypotheses, then the test for statistical significance loses its validity.
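The ideas above can be sketched with Student's t test mentioned earlier. In this illustration the data are invented (hypothetical test scores echoing the preschool example; they are not from the entry), the t statistic is computed with the standard pooled formula, and the approximate 5% critical values for 18 degrees of freedom are used to show how one-sided and two-sided alternatives lead to different rejection rules:

```python
import math
import statistics

# Hypothetical test scores (illustrative numbers only).
preschool    = [78, 85, 90, 72, 88, 83, 95, 80, 86, 91]
no_preschool = [70, 75, 82, 68, 77, 73, 85, 71, 79, 74]

def two_sample_t(x, y):
    """Pooled two-sample t statistic for H0: the two population means are equal."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)  # sample variances
    pooled = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    se = math.sqrt(pooled * (1 / nx + 1 / ny))  # standard error of the difference
    return (statistics.mean(x) - statistics.mean(y)) / se

t = two_sample_t(preschool, no_preschool)
print(f"t statistic: {t:.2f}")

# Two-tailed alternative (means differ in either direction):
# reject H0 when |t| exceeds the 5% two-tailed critical value (about 2.10 for 18 df).
print("two-tailed:", "reject H0" if abs(t) > 2.10 else "fail to reject H0")

# Right-tailed alternative (preschool raises scores):
# reject H0 only when t is large and positive (about 1.73 for 18 df at the 5% level).
print("right-tailed:", "reject H0" if t > 1.73 else "fail to reject H0")
```

Note how the one-sided test puts the entire 5% rejection region in one tail, so a positive t statistic clears a lower bar than under the two-sided rule; this is precisely why the choice between alternatives must be made before the data are seen.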

...
