Correlation is a descriptive statistical technique for assessing the relationship between pairs of variables. The strength of the association between two variables can be assessed qualitatively, with a scatterplot, quantitatively, with a correlation coefficient, or both. A scatterplot is a two-dimensional graph with one variable plotted on each axis. The slope of the least squares regression line (i.e., a linear ‘fit’ through the cluster of points) indicates the overall direction of the association. If the linear trend of the ordered pairs slopes upward (downward) from left to right, the correlation between the two variables is positive (negative); a flat line indicates that the variables have no linear relationship. The closer the data points cluster to this line, the stronger the association between the two variables.
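As a rough illustration of the least squares line described above, the following Python sketch fits a line to a handful of ordered pairs and reads the direction of the association from the sign of the slope. The function names and the data values are hypothetical and chosen only for illustration.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least squares line through the points."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def direction(slope, tol=1e-12):
    """Classify the direction of the association from the sign of the slope."""
    if slope > tol:
        return "positive"
    if slope < -tol:
        return "negative"
    return "none"


# Hypothetical data, invented for this sketch.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(x, y)
print(slope, intercept, direction(slope))
```

The slope only conveys direction here; how tightly the points cluster around the line is what reflects the strength of the association.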

The most common quantitative measure of correlation is the Pearson product-moment correlation coefficient, or simply Pearson's correlation (r), expressed as the ratio of the covariance between two variables to the product of their individual standard deviations. Pearson's correlation assumes a linear relationship between the variables under consideration and that the data are of interval or ratio scale and normally distributed. Correlation values range between −1.00 and 1.00, with more positive (negative) values indicating a stronger direct (inverse) relationship and values closer to zero signifying little or no linear association between the variables. The categorization of the strength of correlation coefficient values varies by discipline and inquiry. Typically, meaningful r values in the behavioral sciences are on the order of |r| = 0.60 to 1.00 for strong, 0.40 to 0.59 for moderate, and 0.20 to 0.39 for weak associations. The researcher must evaluate correlation coefficients with caution, as an association between two variables does not equate with causation, and, in some cases, spurious relationships that defy logical explanation may be found between variables.
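The definition of r as covariance divided by the product of the standard deviations can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the data are invented, and the "negligible" label for |r| below 0.20 is an assumption, since the bands quoted above do not name that range.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and standard deviations (n - 1 denominators);
    # the n - 1 terms cancel in the ratio, so population formulas give the same r.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return cov_xy / (sd_x * sd_y)


def strength_label(r):
    """Label |r| using the behavioral-science bands quoted in the text."""
    size = abs(r)
    if size >= 0.60:
        return "strong"
    if size >= 0.40:
        return "moderate"
    if size >= 0.20:
        return "weak"
    return "negligible"  # assumption: the text gives no label below 0.20


# Hypothetical data, invented for this sketch.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 4), strength_label(r))
```

Because the bands vary by discipline, the cutoffs in `strength_label` should be treated as adjustable rather than fixed.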

To test whether a correlation coefficient is significant, the t-test is the most commonly used statistic, especially for sample sizes smaller than 30. The t statistic is the ratio of Pearson's correlation coefficient (r) to its standard error, √((1 − r²)/(n − 2)), a measure of the sampling variability of r, and is evaluated with n − 2 degrees of freedom. The t values are often converted to probability values (p values), which give the probability of obtaining a correlation at least as extreme as the one observed if the null hypothesis is, in fact, true. Significant p values are typically those that are less than 0.05, corresponding to a 95% (or greater) confidence level and, thus, a probability of less than 5% of making a Type I error (i.e., erroneously rejecting a true null hypothesis). In the case of a significant t-test, the null hypothesis that the two variables are not correlated is rejected in favor of the alternate hypothesis that the variables are related. Based on the tail of the probability distribution, the alternate hypothesis is designated as either one-tailed (directional) or two-tailed (nondirectional), the latter being more appropriate if there is no a priori knowledge of the direction of the correlation between the two variables.
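The significance test can be sketched as follows, assuming the standard form t = r / √((1 − r²)/(n − 2)) with n − 2 degrees of freedom. To stay within the Python standard library, the two-tailed p value is approximated by integrating the Student's t density numerically with Simpson's rule; a statistics package would normally be used instead. The function names and the illustrative values of r and n are assumptions made for this sketch.

```python
import math


def t_statistic(r, n):
    """t = r / SE_r, where SE_r = sqrt((1 - r^2) / (n - 2))."""
    if n < 3:
        raise ValueError("need at least 3 pairs of observations")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    se_r = math.sqrt((1 - r ** 2) / (n - 2))
    return r / se_r


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def two_tailed_p(t, df, steps=20000):
    """Approximate two-tailed p value, 1 - P(-|t| < T < |t|), by Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # the density is symmetric around zero
    return max(0.0, 1.0 - central_area)


# Illustrative values (assumed for this sketch): n = 50 pairs.
n = 50
for r in (0.78, 0.08):
    t = t_statistic(r, n)
    p = two_tailed_p(t, n - 2)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"r = {r:.2f}, t = {t:.3f}, p = {p:.4f}, {verdict} at the 0.05 level")
```

For a one-tailed (directional) hypothesis in the direction of the observed r, the corresponding p value is half of the two-tailed value.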

The use of Pearson's product-moment correlation coefficient in educational psychology is illustrated in the following example. Suppose a researcher wanted to examine the relationship between scores on the verbal portion of a standardized college entrance exam and the number of hours spent in preparation for a group of students (n = 50). To interpret the correlation coefficient for this relationship, the researcher should inspect both the sign (i.e., positive or negative) and the magnitude of the correlation. If the correlation coefficient is high and positive (e.g., .78), students who score high on the verbal section of the test are very likely to have spent more hours in preparation than other students. If the correlation is close to zero (e.g., .08), there seems to be little or no linear relationship between the test score and how long the students prepared. Finally, if the correlation is high and negative (e.g., −.62), the students who prepared more are not likely to do as well on the test as those who prepared less.

...
