The roots of structural equation modeling (SEM) lie in the invention of least squares about 200 years ago, the invention of factor analysis about 100 years ago, the invention of path analysis about 75 years ago, and the invention of simultaneous equation models about 50 years ago. The primary focus of SEM is on testing the causal processes inherent in our theories. Before SEM, measurement error was assessed separately and not explicitly included in tests of theory. This separation has been one of the primary obstacles to advancing theory. With SEM, measurement error is estimated and theoretical parameters are adjusted accordingly; that is, the effect of measurement error is removed from the parameter estimates. Thus, SEM is a fundamental advancement in theory construction because it integrates measurement with substantive theory. It is a general statistical methodology, extending correlation, regression, factor analysis, and path analysis.

SEM is sometimes referred to as ‘latent variable modeling’ because it uses the relationships among observed variables to infer latent variables. Many variables in epidemiological research are observable and can be measured directly (e.g., weight, pathogens, mortality). However, many variables are also inherently unobservable or latent, such as wellbeing, health, socioeconomic status, addiction, and quality of life. Measuring and interpreting latent variables requires a measurement theory. Latent variables and their respective measurement theories can be tested using an SEM technique called ‘confirmatory factor analysis.’ This involves specifying which observed variables are affected by which latent variables and which latent variables are correlated with each other.
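To make the confirmatory factor analysis specification concrete, the following is a minimal sketch rather than a fitted model. It assumes a hypothetical two-factor measurement model with three indicators per factor; the loadings, the latent correlation of 0.4, and all variable names are invented for illustration. It shows how such a specification implies a covariance matrix for the observed variables through the standard expression Sigma = Lambda Phi Lambda' + Theta.

```python
import numpy as np

# Hypothetical measurement model (all numbers are illustrative assumptions).
# Rows are observed variables, columns are latent variables. A zero means the
# observed variable is specified NOT to be affected by that latent variable.
loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.9],
    [0.0, 0.7],
    [0.0, 0.5],
])

# Latent variances are fixed to 1; the two latent variables are allowed
# to correlate (assumed correlation of 0.4).
latent_cov = np.array([[1.0, 0.4],
                       [0.4, 1.0]])

# Unique (measurement error) variances, chosen here so that every observed
# variable has unit variance.
error_var = 1.0 - np.sum(loadings**2, axis=1)
error_cov = np.diag(error_var)

# Model-implied covariance matrix: Sigma = Lambda Phi Lambda' + Theta.
implied_cov = loadings @ latent_cov @ loadings.T + error_cov

print(np.round(implied_cov, 3))
```

In this sketch, two indicators of the same latent variable covary through their loadings alone (0.8 × 0.7 = 0.56), while indicators of different latent variables covary only through the latent correlation (0.8 × 0.4 × 0.9 = 0.288). Estimation software chooses the free parameters so that this implied matrix is as close as possible to the sample covariance matrix.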

SEM also provides a way of systematically examining reliability and validity. Reliability is the consistency of measurement and represents the part of a measure that is free from random error. In SEM, reliability is assessed as the magnitude of the direct relations that all variables except random ones have on an observed variable. This capability of SEM to assess the reliability of each observed variable and simultaneously estimate theoretical and measurement parameters is a fundamental methodological advancement. The potential for distortion in theoretical parameters is high when measurement error is ignored, and the more complicated the model, the more important it becomes to take measurement error into account. Validity is the strength of the direct (invariant) structural relation between a latent variable and a measured variable. SEM offers several ways of assessing validity. Validity differs from reliability because a measure can be consistent and still be invalid. The R² value of an observed variable offers a straightforward measure of reliability. This R² sets an upper limit for validity because the validity of a measure cannot exceed its reliability.
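The distortion caused by ignoring measurement error can be illustrated with a small simulation. This is a sketch under stated assumptions, not the estimation procedure SEM software uses: the true latent correlation (0.6), the loadings (0.8 and 0.7), and the variable names are all hypothetical. The latent scores are available here only because the data are simulated; in practice they are unobserved, and SEM estimates reliability from multiple indicators.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two standardized latent variables with an assumed true correlation of 0.6.
true_corr = 0.6
latent = rng.multivariate_normal(
    mean=[0.0, 0.0],
    cov=[[1.0, true_corr], [true_corr, 1.0]],
    size=n,
)

# One observed indicator per latent variable:
# observed = loading * latent + random measurement error (unit total variance).
loading_x, loading_y = 0.8, 0.7
x = loading_x * latent[:, 0] + rng.normal(0.0, np.sqrt(1 - loading_x**2), n)
y = loading_y * latent[:, 1] + rng.normal(0.0, np.sqrt(1 - loading_y**2), n)

# Reliability as R^2: share of each indicator's variance explained by its
# latent variable (about 0.8**2 and 0.7**2 in this setup).
reliability_x = np.corrcoef(x, latent[:, 0])[0, 1] ** 2
reliability_y = np.corrcoef(y, latent[:, 1])[0, 1] ** 2

# Ignoring measurement error: the observed correlation is attenuated.
observed_corr = np.corrcoef(x, y)[0, 1]

# Adjusting for measurement error recovers the latent correlation.
corrected_corr = observed_corr / np.sqrt(reliability_x * reliability_y)

print(f"reliability of x:     {reliability_x:.3f}")
print(f"reliability of y:     {reliability_y:.3f}")
print(f"observed correlation: {observed_corr:.3f}")
print(f"corrected correlation:{corrected_corr:.3f}")
```

Under these assumptions the observed correlation is close to 0.8 × 0.7 × 0.6 = 0.336, well below the latent value of 0.6, and dividing by the square root of the product of the two reliabilities brings it back near 0.6.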

Major Assumptions

Like other kinds of analyses, SEM is based on a number of assumptions. For example, it assumes that the data represent a population. Unlike traditional methods, however, SEM tests models by comparing sample data with the implied population parameters. This is particularly important because the distinction between sample and population parameters has often been ignored in practice. SEM generally assumes that variables are measured at the interval or ratio level and that ordinal variables, if used at all, are truncated versions of interval or ratio variables. Hypothesized relationships are assumed to be linear in their parameters. All variables in a model are assumed to have a multivariate Gaussian (normal) distribution. Therefore, careful data screening and cleaning are essential to working successfully with SEM.
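One common screening step for the multivariate normality assumption is Mardia's multivariate kurtosis. The sketch below is one possible check, not a procedure prescribed by this entry; the function name `mardia_kurtosis` and the simulated data sets are hypothetical. Under multivariate normality with p variables, the statistic is expected to be close to p(p + 2), and the standardized value is approximately standard normal in large samples.

```python
import numpy as np

def mardia_kurtosis(data):
    """Return (statistic, expected value under normality, z score)."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    centered = data - data.mean(axis=0)
    # Maximum likelihood covariance estimate (divisor n), as in Mardia's definition.
    cov = centered.T @ centered / n
    inv_cov = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every case from the centroid.
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    statistic = np.mean(d2**2)
    expected = p * (p + 2)
    z = (statistic - expected) / np.sqrt(8 * p * (p + 2) / n)
    return statistic, expected, z

rng = np.random.default_rng(1)
n, p = 20_000, 3

# Simulated data that satisfy the assumption.
normal_data = rng.normal(size=(n, p))
# Simulated heavy-tailed data (t distribution, 5 degrees of freedom) that violate it.
heavy_data = rng.standard_t(df=5, size=(n, p))

stat_normal, expected, z_normal = mardia_kurtosis(normal_data)
stat_heavy, _, z_heavy = mardia_kurtosis(heavy_data)

print(f"expected under normality: {expected}")
print(f"normal data:       kurtosis={stat_normal:.2f}, z={z_normal:.2f}")
print(f"heavy-tailed data: kurtosis={stat_heavy:.2f}, z={z_heavy:.2f}")
```

A large positive z score, as for the heavy-tailed data here, suggests that the normality assumption is doubtful and that outlier checks, transformations, or estimators designed for nonnormal data may be worth considering.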

...
