
Partial least squares (PLS) regression is a recent technique that generalizes and combines features from principal components analysis and multiple regression. It is particularly useful when we need to predict a set of dependent variables from a (very) large set of independent variables (i.e., predictors). It originated in the social sciences (specifically, economics; Wold, 1966) but became popular first in chemometrics, due in part to Wold's son Svante (see, e.g., Geladi & Kowalski, 1986), and in sensory evaluation (Martens & Naes, 1989). But PLS regression is also becoming a tool of choice in the social sciences as a multivariate technique for nonexperimental and experimental data alike (e.g., neuroimaging; see McIntosh, Bookstein, Haxby, & Grady, 1996). It was first presented as an algorithm akin to the power method (used for computing eigenvectors) but was rapidly interpreted in a statistical framework (Frank & Friedman, 1993; Helland, 1990; Höskuldsson, 1988; Tenenhaus, 1998).

Prerequisite Notions and Notations

The I observations described by K dependent variables are stored in an I × K matrix denoted Y, and the values of the J predictors measured on these I observations are stored in the I × J matrix X.

Goal

The goal of PLS regression is to predict Y from X and to describe their common structure. When Y is a vector and X is full rank, this goal could be accomplished using ordinary least squares (OLS). When the number of predictors is large compared to the number of observations, X is likely to be singular, and the regression approach is no longer feasible (i.e., because of multicollinearity). Several approaches have been developed to cope with this problem. One approach is to eliminate some predictors (e.g., using stepwise methods); another one, called principal component regression, is to perform a principal components analysis (PCA) of the X matrix and then use the principal components of X as regressors on Y.
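The principal component regression approach just described can be sketched in a few lines. The following is an illustrative sketch only (the data and variable names are made up), using scikit-learn's PCA and LinearRegression: even though X has more predictors than observations, OLS on a few orthogonal principal components is well-posed.

```python
# Principal component regression (PCR) sketch: regress y on the first few
# principal components of X instead of on the (singular) X itself.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))            # J = 50 predictors, only I = 20 observations
y = X[:, 0] + 0.1 * rng.normal(size=20)  # y depends mainly on the first predictor

pca = PCA(n_components=3)                # keep only a few components
T = pca.fit_transform(X)                 # orthogonal scores replace X as regressors
pcr = LinearRegression().fit(T, y)       # OLS on the components is now well-posed
y_hat = pcr.predict(T)
```

Note that nothing in this procedure looks at y when choosing the components, which is exactly the weakness discussed next.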

The orthogonality of the principal components eliminates the multicollinearity problem. But the problem of choosing an optimum subset of predictors remains. A possible strategy is to keep only a few of the first components. But they are chosen to explain X rather than Y, and so nothing guarantees that the principal components, which “explain” X, are relevant for Y.

By contrast, PLS regression finds components from X that are also relevant for Y. Specifically, PLS regression searches for a set of components (called latent vectors) that perform a simultaneous decomposition of X and Y with the constraint that these components explain as much as possible of the covariance between X and Y. This step generalizes PCA. It is followed by a regression step in which the decomposition of X is used to predict Y.

Simultaneous Decomposition of Predictors and Dependent Variables

PLS regression decomposes both X and Y as a product of a common set of orthogonal factors and a set of specific loadings. So, the independent variables are decomposed as X = TPᵀ with TᵀT = I, with I being the identity matrix (some variations of the technique do not require T to have unit norms). By analogy with PCA, T is called the score matrix and P the loading matrix (in PLS regression, the loadings are not orthogonal). Likewise, Y is estimated as Ŷ = TBCᵀ, where B is a diagonal matrix with the “regression weights” as diagonal elements and C is the weight matrix of the dependent variables (see below for more details on these weights). The columns of T are the latent vectors. When their number is equal to the rank of X, they perform an exact decomposition of X. Note, however, that they only estimate Y (i.e., in general, Ŷ is not equal to Y).

...
