Principal Components Analysis

Also known as empirical orthogonal function analysis, principal components analysis (PCA) is a multivariate data analysis technique that is employed to reduce the dimensionality of large data sets and simplify the representation of the data field under consideration. PCA is used to understand the interdependencies among variables and trim down the redundant (or significantly correlated) variables that are measuring the same construct. Data sets with a considerable proportion of interrelated variables are transformed into a set of new hypothetical variables known as principal components, which are uncorrelated or orthogonal to one another. These new variables are ordered so that the first few components retain most of the variation present in the original data matrix. The components reflect both common and unique variance of the variables (as opposed to common factor analysis, which excludes unique variance), with the last few components identifying directions in which there is negligible variation or a near linear relationship with the original variables. Thus, PCA reduces the number of variables under examination and allows one to detect and recognize groups of interrelated variables. PCA is frequently not the final product of an analysis and is used in combination with other statistical techniques (e.g., cluster analysis) to uncover, model, and explain the leading multivariate relationships. The method was first introduced in 1901 by Karl Pearson and modified three decades later by Harold Hotelling for the objective of exploring correlation structures; it has since been used extensively in both the physical and social sciences.

Mathematical Origins and Matrix Constructs

PCA describes the variation in a set of multivariate data in terms of a new assemblage of variables that are uncorrelated to one another. Mathematically, the statistical method can be described briefly as a linear transformation from the original variables, $x_1, \ldots, x_p$, to new variables, $y_1, \ldots, y_p$ (as described succinctly by Geoff Der and Brian Everitt), where

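$$
\begin{aligned}
y_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1p}x_p \\
y_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2p}x_p \\
&\;\;\vdots \\
y_p &= a_{p1}x_1 + a_{p2}x_2 + \cdots + a_{pp}x_p
\end{aligned}
$$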

The coefficients, $a_{11}, \ldots, a_{pp}$, defining each new variable are selected in such a way that the $y$ variables, or principal components, are orthogonal, meaning that the coordinate axes are rotated such that they remain at right angles to each other while the variance captured along each successive axis is maximized. The components are arranged in decreasing order of the variance they account for in the original data matrix. The number of possible principal components is equal to the number of input variables, but not all components will be retained in the analysis because a primary goal of PCA is simplification of the data matrix (see the subsequent section on principal component truncation methods).
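The selection of coefficients described above can be sketched numerically. The following is a minimal illustration, not the method of any particular author: it assumes the coefficients are taken as the eigenvectors of the sample covariance matrix (one common way to obtain them), and the function name `principal_components` and the simulated data are hypothetical.

```python
import numpy as np

def principal_components(X):
    """Return (variances, A) from the sample covariance matrix of X.

    Row j of A holds the coefficients a_j1, ..., a_jp of the j-th component.
    Assumption: coefficients are the eigenvectors of the covariance matrix.
    """
    Xc = X - X.mean(axis=0)                 # center each variable
    S = np.cov(Xc, rowvar=False)            # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to decreasing variance
    return eigvals[order], eigvecs[:, order].T

rng = np.random.default_rng(0)
# Hypothetical data: three variables, the first two strongly interrelated
z = rng.normal(size=(200, 1))
X = np.hstack([
    z + 0.1 * rng.normal(size=(200, 1)),
    2 * z + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

variances, A = principal_components(X)
Y = (X - X.mean(axis=0)) @ A.T              # values of the new variables y_1..y_p
explained = variances / variances.sum()     # share of total variance per component

print("coefficient rows are orthonormal:", np.allclose(A @ A.T, np.eye(3)))
print("proportion of variance by component:", np.round(explained, 3))
```

In this sketch the covariance matrix of `Y` is diagonal, which is the numerical counterpart of the components being uncorrelated, and the `explained` vector is the quantity usually inspected when deciding how many components to retain.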

The original coordinates of the ith data point, $x_{ij}$, $j = 1, \ldots, p$, become in the new system (as explained by Trevor Bailey and Anthony Gatrell):

$$
y_{ij} = a_{j1}x_{i1} + a_{j2}x_{i2} + \cdots + a_{jp}x_{ip}
$$

The jth new variable $y_j$ is normally referred to as the jth principal component, whereas $y_{ij}$ is termed the score of the ith observation on the jth principal component. The relationship between the jth principal component and the kth original variable is described by the covariance between them, given as

$$
r_{jk} = \frac{a_{jk}\sqrt{\lambda_j}}{\sqrt{s_{kk}}}
$$

where $\lambda_j$ is the variance of the jth principal component (the jth eigenvalue of S) and $s_{kk}$ is the estimated variance of the kth original variable, that is, the kth diagonal element of the covariance matrix S. Dividing by both standard deviations expresses the covariance on a correlation scale. This relationship is referred to as the loading of the kth original variable on the jth principal component. Component loadings are essentially correlations between the variables and the component and are interpreted similarly to product-moment correlation coefficients (or Pearson's r). Values of component loadings range from -1.0 to 1.0. Loadings closer to 1.0 (-1.0) indicate a stronger positive (negative) linkage of a variable with a particular component, and values closer to zero signify that the variable is not being represented by that component.
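Scores and loadings can be checked against one another numerically. The sketch below is illustrative only: it assumes the variables are centered before scores are computed, that the coefficients are eigenvectors of the sample covariance matrix S, and that a loading is the coefficient scaled by the ratio of the component's standard deviation to the variable's standard deviation. The simulated data and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data matrix: n = 300 observations on p = 3 variables
n = 300
base = rng.normal(size=n)
X = np.column_stack([
    base + 0.5 * rng.normal(size=n),
    base + 0.5 * rng.normal(size=n),
    rng.normal(size=n),
])

Xc = X - X.mean(axis=0)                  # assumption: variables are centered
S = np.cov(Xc, rowvar=False)             # sample covariance matrix S
lam, vecs = np.linalg.eigh(S)            # eigenvalues in ascending order
order = np.argsort(lam)[::-1]
lam, A = lam[order], vecs[:, order].T    # row j of A = a_j1, ..., a_jp

# Score of observation i on component j: sum over k of a_jk * x_ik
scores = Xc @ A.T

# Loading of variable k on component j: a_jk * sqrt(lambda_j) / sqrt(s_kk)
s_kk = np.diag(S)
loadings = A * np.sqrt(lam)[:, None] / np.sqrt(s_kk)[None, :]

# Pearson correlations between each component's scores and each variable
corr = np.corrcoef(np.hstack([scores, Xc]), rowvar=False)[:3, 3:]

print("loadings match Pearson correlations:", np.allclose(loadings, corr))
print(np.round(loadings, 3))
```

Row j of `loadings` describes component j, and each entry can be read like a correlation coefficient: entries near zero mark variables that the component does not represent.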

...
