Canonical Correlation Analysis

Canonical correlation analysis (CCA) is a multivariate statistical method that analyzes the relationship between two sets of variables, in which each set contains at least two variables. It is the most general form of the general linear model: multiple regression, multivariate analysis of variance, analysis of variance, and discriminant function analysis are all special cases of CCA.

Although the method has been available for more than 70 years, its use was limited until fairly recently because common statistical programs did not include it and its calculations are labor-intensive. Many computer programs now include CCA, and the method has become more widely used as a result.

This entry begins by explaining the basic logic of and defining important terms associated with CCA. Next, this entry discusses the interpretation of CCA results, statistical assumptions, and limitations of CCA. Last, it provides an example from the literature.

Basic Logic

The logic of CCA is fairly straightforward and is best explained by likening it to a “multiple-multiple regression.” In multiple regression, a researcher is interested in discovering which variables, among a set of variables, best predict a single variable. The set of variables may be termed the independent, or predictor, variables; the single variable may be considered the dependent, or criterion, variable. CCA is similar, except that there are multiple dependent variables as well as multiple independent variables. The goal is to discover the pattern of variables (on both sides of the equation) that combine to produce the strongest possible relationship between the two sets. The resulting combination of variables for each side may then be thought of as a kind of latent or underlying variable that describes the relation between the two sets of variables.
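
The goal stated above can be written compactly as an optimization problem. The notation is an assumption introduced here for illustration and does not come from the original entry: X holds the predictor scores (one column per variable), Y holds the criterion scores, a and b are the weights applied to each side, and the Σ terms are the within-set and between-set covariance matrices.

```latex
\rho_1 \;=\; \max_{a,\,b}\ \operatorname{corr}(Xa,\; Yb)
       \;=\; \max_{a,\,b}\ \frac{a^{\top}\Sigma_{XY}\,b}
                                {\sqrt{a^{\top}\Sigma_{XX}\,a}\;\sqrt{b^{\top}\Sigma_{YY}\,b}}
```

When Y has a single column, b reduces to a scalar and the maximum is the multiple correlation from an ordinary multiple regression, which is one way to see why multiple regression is a special case of CCA.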

A simple example from the literature illustrates its use: A researcher is interested in investigating the relationships among gender, social dominance orientation, right-wing authoritarianism, and three forms of prejudice (stereotyping, opposition to equality, and negative affect). Gender, social dominance orientation, and right-wing authoritarianism constitute the predictor set; the three forms of prejudice are the criterion set. Rather than computing three separate multiple regression analyses (viz., regressing each criterion variable on the three predictor variables, one at a time), the researcher instead computes a CCA on the two sets of variables to discern the most important predictor(s) of the three forms of prejudice. In this example, the CCA revealed that social dominance orientation was the most important overall dimension underlying all three forms of prejudice.

Important Terms

To appreciate the various terms associated with CCA, it is necessary to have a basic understanding of the analytic procedure itself. The first step in CCA involves collapsing each person's scores on the variables within each of the two variable sets into a single composite, or “synthetic,” variable, one for each set. These synthetic variables are created such that the correlation between them is maximal. This is done by weighting each person's scores and then summing the weighted scores within the respective variable sets. Pairs of linear synthetic variables created by this maximization process are called canonical variates. The bivariate correlation between the two variates in a pair is the canonical correlation (sometimes called the canonical function). Two canonical variates are produced for each canonical correlation, one representing the predictor variables and the other representing the criterion variables. The total number of canonical variate pairs produced equals the number of variables in the criterion set or the predictor set, whichever is smaller. Finally, squaring the canonical correlation coefficient yields the proportion of variance that the pair of canonical variates (not the original variables) linearly share.
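
The procedure described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the computation used by any particular statistical program: the function name `canonical_correlations`, the QR-plus-SVD route to the weights, and the synthetic data are all assumptions made here. It assumes more people than variables and no redundant variables within a set.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations and weights for two sets of variables.

    X is (n, p), Y is (n, q); rows are people, columns are variables.
    Assumes n > max(p, q) and no linearly dependent columns.
    """
    # Center every variable so that inner products behave like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # QR gives an orthonormal basis for each centered variable set.
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)

    # Singular values of Qx' Qy are the canonical correlations, largest first.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)

    # Number of variate pairs = size of the smaller variable set.
    k = min(X.shape[1], Y.shape[1])

    # Map back from the orthonormal bases to weights on the original variables.
    A = np.linalg.solve(Rx, U[:, :k])     # predictor-side weights, shape (p, k)
    B = np.linalg.solve(Ry, Vt.T[:, :k])  # criterion-side weights, shape (q, k)
    return s[:k], A, B

# Synthetic data, invented purely for illustration: 3 predictors and
# 2 criterion variables, all driven in part by one shared latent score.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
X = latent[:, None] + rng.normal(size=(n, 3))
Y = latent[:, None] + rng.normal(size=(n, 2))

r, A, B = canonical_correlations(X, Y)

# Canonical variates: weighted sums of each person's (centered) scores.
x_variates = (X - X.mean(axis=0)) @ A
y_variates = (Y - Y.mean(axis=0)) @ B

print("canonical correlations:", r)
print("shared variance (squared):", r ** 2)
```

With three variables on one side and two on the other, the sketch returns two canonical correlations, matching the "whichever is smaller" rule. Correlating the first column of `x_variates` with the first column of `y_variates` reproduces the first canonical correlation, and squaring it gives the variance the two variates share.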

...
