Normality Assumption

The normal distribution (also called the Gaussian distribution, named after Carl Friedrich Gauss, the German scientist and mathematician who justified the least squares method in 1809) is the most widely used family of statistical distributions, and many statistical tests are based on it. Many measurements of physical and psychological phenomena can be approximated by the normal distribution, hence the distribution's widespread utility. In many areas of research, a sample is identified on which measurements of particular phenomena are made. These measurements are then subjected to hypothesis testing to determine whether the observed differences could have arisen by chance. Assuming the test is valid, an inference can be made about the population from which the sample was drawn.

Hypothesis testing involves assumptions about the underlying distribution of the sample data. Three key assumptions, in order of importance, are independence, common variance, and normality. The term normality assumption arises when the researcher asserts that the distribution of the data follows a normal distribution. Parametric and nonparametric tests are commonly based on the same assumptions, with the exception that nonparametric tests do not require the normality assumption.
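
To illustrate, the normality assumption can be examined in standard software. The following minimal Python sketch applies SciPy's Shapiro-Wilk test; the simulated data and parameter values are illustrative assumptions, not part of this entry:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Simulated measurements that are roughly normal by construction.
    sample = rng.normal(loc=100, scale=15, size=50)

    # Shapiro-Wilk tests the null hypothesis that the sample was drawn
    # from a normal distribution.
    stat, p_value = stats.shapiro(sample)

    # A small p-value (e.g., below .05) suggests a departure from normality;
    # a large p-value is consistent with the normality assumption.
    print(f"Shapiro-Wilk W = {stat:.3f}, p = {p_value:.3f}")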

Independence refers to the correlation between observations in a sample. For example, if the observations in a sample could be ordered by time, and observations closer together in time were more similar than observations further apart in time, the observations would not be independent but correlated, or dependent on time. If the correlation between observations is positive, the Type I error is inflated (the Type I error level is the probability of rejecting the null hypothesis when it is true; it is traditionally denoted by alpha and set at .05). If the correlation is negative, the Type I error is deflated. Even modest levels of correlation can have substantial impacts on the Type I error level (for a correlation of .2, the actual alpha is .11, whereas for a correlation of .5, it is .26). Independence of observations is difficult to assess. With no formal statistical tests widely in use, knowledge of the substantive area is paramount, and a thorough understanding of how the data were generated is required for valid statistical analysis and interpretation.
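
The inflation of Type I error under positive correlation can be demonstrated by simulation. The sketch below, an illustration rather than part of this entry, repeatedly applies a one-sample t test (which assumes independence) to autocorrelated data generated under a true null hypothesis; the empirical rejection rate then runs well above the nominal .05, though the exact figure depends on the sample size and correlation structure assumed here:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n, rho, n_sims, alpha = 30, 0.5, 10_000, 0.05

    rejections = 0
    for _ in range(n_sims):
        # AR(1) series with lag-1 correlation rho and mean 0,
        # so the null hypothesis (population mean = 0) is true.
        e = rng.standard_normal(n)
        x = np.empty(n)
        x[0] = e[0]
        for t in range(1, n):
            x[t] = rho * x[t - 1] + np.sqrt(1 - rho**2) * e[t]
        # The t test assumes independent observations, violated here.
        _, p = stats.ttest_1samp(x, popmean=0.0)
        rejections += p < alpha

    # The empirical Type I error exceeds the nominal alpha of .05.
    print(f"Empirical Type I error: {rejections / n_sims:.3f}")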

Common variance (often referred to as homogeneity of variance) refers to the assumption that all samples are drawn from populations with similar variance. For example, if you were testing the difference in height between two samples of people, one from Town A and the other from Town B, the test assumes that the variance of height in Town A is similar to that in Town B. In 1953, G. E. P. Box demonstrated that, for even modest sample sizes, most tests are robust to violations of this assumption, and differences of up to 3-fold in variance do not greatly affect the Type I error level. Many statistical tests are available to ascertain whether the variances are equal among different samples, including the Bartlett–Kendall test, Levene's test, and the Brown–Forsythe test. These tests for homogeneity of variance are, however, sensitive to departures from normality, so they might indicate that the common variance assumption does not hold even though the validity of the main test is not in question.
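
Such homogeneity-of-variance tests are available in standard software. The following minimal Python sketch, with made-up town data that are illustrative assumptions rather than part of this entry, applies Bartlett's test, Levene's test, and the Brown–Forsythe test (in SciPy, the latter is Levene's test computed around the median):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    # Hypothetical height samples (cm) for Town A and Town B.
    town_a = rng.normal(loc=170, scale=8, size=60)
    town_b = rng.normal(loc=172, scale=10, size=60)

    for name, result in [
        ("Bartlett", stats.bartlett(town_a, town_b)),
        ("Levene", stats.levene(town_a, town_b, center="mean")),
        ("Brown-Forsythe", stats.levene(town_a, town_b, center="median")),
    ]:
        # A small p-value suggests the common variance assumption is doubtful;
        # Bartlett's test in particular is sensitive to non-normality.
        print(f"{name}: stat = {result.statistic:.3f}, p = {result.pvalue:.3f}")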

...
