
Coefficients of Correlation, Alienation, and Determination

The coefficient of correlation evaluates the similarity of two sets of measurements (i.e., two dependent variables) obtained on the same observations. The coefficient of correlation indicates the amount of information common to the two variables. This coefficient takes values between −1 and +1 (inclusive). A value of +1 shows that the two series of measurements are measuring the same thing. A value of −1 indicates that the two measurements are measuring the same thing, but one measurement varies inversely to the other. A value of 0 indicates that the two series of measurements have nothing in common. It is important to note that the coefficient of correlation measures only the linear relationship between two variables and that its value is very sensitive to outliers.

The squared correlation gives the proportion of common variance between two variables and is also called the coefficient of determination. Subtracting the coefficient of determination from unity gives the proportion of variance not shared between two variables. This quantity is called the coefficient of alienation.
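In symbols, with $r$ denoting the coefficient of correlation, the three coefficients are related as follows (the two proportions always sum to one):

```latex
\underbrace{r^{2}}_{\text{determination}}
\;+\;
\underbrace{\left(1 - r^{2}\right)}_{\text{alienation}}
\;=\; 1 .
```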

The significance of the coefficient of correlation can be tested with an F or a t test. This entry presents three different approaches that can be used to obtain p values: (1) the classical approach, which relies on Fisher's F distributions; (2) the Monte Carlo approach, which relies on computer simulations to derive empirical approximations of sampling distributions; and (3) the nonparametric permutation (also known as randomization) test, which evaluates the likelihood of the actual data against the set of all possible configurations of these data. In addition to p values, confidence intervals can be computed using Fisher's Z transform or Efron's bootstrap, a more modern, computationally based, nonparametric approach.
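As an illustration of the permutation (randomization) approach, the sketch below shuffles one variable relative to the other to build an empirical null distribution of |r|. The function names, the example data, and the add-one correction on the p value are illustrative choices, not part of the original entry:

```python
import math
import random

def corr(w, y):
    """Pearson coefficient of correlation between two equal-length sequences."""
    s = len(w)
    mw, my = sum(w) / s, sum(y) / s
    scp = sum((a - mw) * (b - my) for a, b in zip(w, y))
    ssw = sum((a - mw) ** 2 for a in w)
    ssy = sum((b - my) ** 2 for b in y)
    return scp / math.sqrt(ssw * ssy)

def permutation_p(w, y, n_perm=2000, seed=0):
    """Two-sided permutation test: shuffle y relative to w and count how
    often the shuffled |r| meets or exceeds the observed |r|."""
    rng = random.Random(seed)
    r_obs = abs(corr(w, y))
    y = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y)
        if abs(corr(w, y)) >= r_obs:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)

# Illustrative data: a strong linear trend with small alternating noise.
w = list(range(10))
y = [2 * i + (-1) ** i * 0.5 for i in range(10)]
p = permutation_p(w, y)
```

With only S observations there are S! possible orderings, so for small samples the full set of permutations can be enumerated instead of sampled.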

Note that the coefficient of correlation always overestimates the intensity of the correlation in the population and needs to be “corrected” in order to provide a better estimation. The corrected value is called shrunken or adjusted.
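A common form of this correction (treat the exact constant as an assumption here, since texts give slightly different variants) shrinks the squared coefficient toward zero as a function of the number of observations S:

```latex
\tilde{r}^{2} \;=\; 1 - \left(1 - r^{2}\right)\frac{S - 1}{S - 2} .
```

The correction matters most for small samples; as S grows, the shrunken value approaches the uncorrected one.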

Notations and Definition

Suppose we have $S$ observations, and for each observation $s$, we have two measurements, denoted $W_s$ and $Y_s$, with respective means denoted $M_W$ and $M_Y$. For each observation, we define the cross-product as the product of the deviations of each variable from its mean. The sum of these cross-products, denoted $SCP_{WY}$, is computed as

$$SCP_{WY} = \sum_{s=1}^{S}\left(W_s - M_W\right)\left(Y_s - M_Y\right).$$

The sum of the cross-products reflects the association between the variables. When the deviations have the same sign, they indicate a positive relationship, and when they have different signs, they indicate a negative relationship.

The average value of the $SCP_{WY}$ is called the covariance (just like the variance, the covariance can be computed by dividing by $S$ or by $S - 1$):

$$\mathrm{cov}_{WY} = \frac{SCP_{WY}}{S}.$$

The covariance reflects the association between the variables, but it is expressed in the original units of measurement. In order to eliminate the units, the covariance is normalized by division by the standard deviation of each variable. This defines the coefficient of correlation, denoted $r_{W.Y}$, which is equal to

$$r_{W.Y} = \frac{\mathrm{cov}_{WY}}{\sigma_W\,\sigma_Y}.$$

Rewriting the previous formula gives a more practical formula:

$$r_{W.Y} = \frac{SCP_{WY}}{\sqrt{SS_W \times SS_Y}},$$

where $SCP_{WY}$ is the sum of the cross-products and $SS_W$ and $SS_Y$ are the sums of squares of $W$ and $Y$, respectively.
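To check numerically that the practical formula agrees with the definitional one, here is a small sketch on made-up data (the values of w and y are purely illustrative):

```python
import math

# Illustrative data: S = 4 observations, two measurements each.
w = [1.0, 3.0, 4.0, 4.0]
y = [1.0, 2.0, 3.0, 5.0]
s = len(w)
mw, my = sum(w) / s, sum(y) / s

# Sum of cross-products and sums of squares.
scp = sum((a - mw) * (b - my) for a, b in zip(w, y))
ssw = sum((a - mw) ** 2 for a in w)
ssy = sum((b - my) ** 2 for b in y)

# Practical formula: SCP over the square root of the product of sums of squares.
r_practical = scp / math.sqrt(ssw * ssy)

# Definitional formula: covariance over the product of standard deviations
# (dividing consistently by S in both the covariance and the deviations).
cov = scp / s
sd_w = math.sqrt(ssw / s)
sd_y = math.sqrt(ssy / s)
r_definitional = cov / (sd_w * sd_y)
```

The two expressions are algebraically identical because the factors of $S$ in the covariance and the two standard deviations cancel.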

...
