Principal Components Analysis
Principal components analysis (PCA) is the workhorse of exploratory multivariate data analysis, especially when a researcher wants an overview of the relationships among a set of variables and an evaluation of individuals with respect to those variables. The basic technique is designed for continuous variables, but variants have been developed for variables of categorical and ordinal MEASUREMENT LEVELS, as well as for sets of variables that mix different measurement levels. In addition, the technique is used in conjunction with other techniques, such as REGRESSION ANALYSIS. In this entry, we will concentrate on standard PCA, but overviews of PCA at work in different contexts can, for instance, be found in the books by Jolliffe (1986) and Jackson (1991), and an exposé of PCA for variables with different measurement levels is contained in Meulman and Heiser (2000).
Theory
Suppose that we have the scores of I individuals on J variables and that the relationships between the variables are such that no variable can be perfectly predicted by all the remaining variables. Then these variables form the axes of a J-dimensional space, and the scores of the individuals on these J variables can be portrayed in this J-dimensional space. However, looking at a high-dimensional space is not easy; moreover, most of the variability of the high-dimensional arrangement of the individuals can often be displayed in a low-dimensional space with little loss. As an example, we see in Figure 1 that the two-dimensional ellipse A of scores of Sample A can be reasonably well represented in one dimension by the first principal component, and one only needs to interpret the variability along this single dimension. However, for the scores of Sample B, the one-dimensional representation is much worse (i.e., the variance accounted for by the first principal component is much lower in Case B than in Case A, and interpreting a single dimension might not suffice in Case B).
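The contrast between the two samples can be sketched numerically. In the following sketch, the data, the random seed, and the function name are all hypothetical, chosen only to mimic an elongated ellipse (Sample A) and a round cloud (Sample B):

```python
import numpy as np

rng = np.random.default_rng(0)

def first_pc_variance_fraction(X):
    """Fraction of total variance accounted for by the first principal component."""
    Xc = X - X.mean(axis=0)            # center the variables
    cov = np.cov(Xc, rowvar=False)     # sample covariance matrix
    eigvals = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    return eigvals[-1] / eigvals.sum()

# Sample A: two strongly correlated variables (an elongated ellipse)
z = rng.normal(size=500)
sample_a = np.column_stack([z + 0.2 * rng.normal(size=500),
                            z + 0.2 * rng.normal(size=500)])

# Sample B: two nearly uncorrelated variables (a round cloud)
sample_b = rng.normal(size=(500, 2))

print(first_pc_variance_fraction(sample_a))  # close to 1
print(first_pc_variance_fraction(sample_b))  # close to 0.5
```

For Sample A, the first principal component accounts for nearly all of the variance, so a one-dimensional representation loses little; for Sample B, no single direction dominates, and a single dimension would not suffice.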
The coordinate axes of the low-dimensional space are commonly called components. If the components are such that they successively account for most of the variability in the data, they are called principal components. The coordinates of the individuals on the components are called component scores. To interpret components, the coordinates for the variables on these components need to be derived as well, and the common approach to do this is via EIGENVALUE-EIGENVECTOR techniques. If both the variables and the components are standardized, the variable coordinates are the correlations between variables and components. By inspecting these correlations, commonly known as component loadings, one may assess the extent to which the components measure the same quantities as (groups of) variables. In particular, when a group of variables has high correlations with a component, the component has something in common with all of them, and on the basis of the substantive content of the variables, one may try to ascertain what the common element between the variables may be and hypothesize that the component is measuring this common element.
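These quantities can be sketched from an eigenvalue-eigenvector decomposition of the correlation matrix. The data below are hypothetical (two constructed groups of correlated variables), and the variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 100 individuals, 4 variables; the first two and the
# last two variables are constructed to form correlated groups.
g1, g2 = rng.normal(size=100), rng.normal(size=100)
X = np.column_stack([g1 + 0.5 * rng.normal(size=100),
                     g1 + 0.5 * rng.normal(size=100),
                     g2 + 0.5 * rng.normal(size=100),
                     g2 + 0.5 * rng.normal(size=100)])

# Standardize the variables and eigendecompose the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)   # ascending order
order = np.argsort(eigvals)[::-1]      # descending: principal components first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component scores: coordinates of the individuals on the components.
scores = Z @ eigvecs

# Component loadings: correlations between the (standardized) variables and
# the standardized components; each eigenvector scaled by sqrt(eigenvalue).
loadings = eigvecs * np.sqrt(eigvals)

# High loadings on a component indicate which variables it has in common.
print(loadings[:, :2].round(2))
```

Note that the sum of the eigenvalues equals the number of variables (the trace of the correlation matrix), so each eigenvalue divided by that sum gives the proportion of variance accounted for by the corresponding component.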
...