
Canonical Correlation

Canonical correlation is a statistical measure of the relationship between two sets of variables. Formally, given two random vectors $x \in \mathbb{R}^{d_x}$ and $y \in \mathbb{R}^{d_y}$ with some (unknown) joint distribution $D$, canonical correlation analysis (CCA) seeks vectors $u \in \mathbb{R}^{d_x}$ and $v \in \mathbb{R}^{d_y}$ such that the projections of the random vectors along these directions, that is, the variables $u^{\top}x$ and $v^{\top}y$, are maximally correlated. Equivalently, we can write CCA as the following optimization problem: find $u \in \mathbb{R}^{d_x}$ and $v \in \mathbb{R}^{d_y}$ that solve

$\underset{u\in \mathbb{R}^{d_x},\,v\in \mathbb{R}^{d_y}}{\text{maximize}}\ \rho \left(u^{\top}x,\,v^{\top}y\right),$

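where the correlation $\rho \left(u^{\top}x,\,v^{\top}y\right)$ between the two random variables is defined as $\rho \left(u^{\top}x,\,v^{\top}y\right)=\frac{\mathrm{cov}\left(u^{\top}x,\,v^{\top}y\right)}{\sqrt{\mathrm{var}\left(u^{\top}x\right)}\sqrt{\mathrm{var}\left(v^{\top}y\right)}}$. Assuming that the vectors x and y have zero mean, we can write CCA as the problem var(u⊤x) var(u⊤ ...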
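
The objective in the optimization problem above can be illustrated numerically. The following is a minimal sketch, not a CCA solver: it only evaluates the sample correlation of the projections for fixed candidate directions u and v. The function names (`project`, `correlation`, `cca_objective`) and the synthetic data, in which x and y share a latent signal z, are assumptions made for illustration and do not come from the entry.

```python
import math
import random


def project(rows, w):
    """Return the scalar projection w^T r for every sample row r."""
    return [sum(wi * ri for wi, ri in zip(w, row)) for row in rows]


def correlation(a, b):
    """Sample correlation: cov(a, b) / (sqrt(var(a)) * sqrt(var(b)))."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    # The common 1/n factors in cov and the variances cancel in the ratio.
    return cov / math.sqrt(var_a * var_b)


def cca_objective(X, Y, u, v):
    """Evaluate rho(u^T x, v^T y) on paired samples X and Y."""
    return correlation(project(X, u), project(Y, v))


# Synthetic data (an assumption for illustration): the first coordinate of x
# and the second coordinate of y both carry a shared latent signal z.
rng = random.Random(0)
X, Y = [], []
for _ in range(2000):
    z = rng.gauss(0, 1)
    X.append([z + 0.1 * rng.gauss(0, 1), rng.gauss(0, 1)])
    Y.append([rng.gauss(0, 1), z + 0.1 * rng.gauss(0, 1)])

# Directions that pick out the shared signal give a high correlation ...
rho_aligned = cca_objective(X, Y, [1.0, 0.0], [0.0, 1.0])
# ... while directions that pick out independent noise give one near zero.
rho_misaligned = cca_objective(X, Y, [0.0, 1.0], [1.0, 0.0])
# Rescaling u or v by a positive constant leaves the correlation unchanged.
rho_rescaled = cca_objective(X, Y, [3.0, 0.0], [0.0, 2.0])

print(rho_aligned, rho_misaligned, rho_rescaled)
```

In this sketch the aligned directions recover the shared signal, so their projected correlation is close to 1, while the misaligned pair is close to 0. CCA searches over all u and v for the pair that maximizes this quantity. Because the correlation is unchanged when u or v is rescaled by a positive constant, only the directions matter, not their lengths.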