
This entry provides a nontechnical description of loglinear models, which were developed to analyze multivariate cross-tabulation tables. Although a detailed exposition is beyond its scope, the entry describes when loglinear models are necessary, what these models do, how they are tested, and the more familiar extensions of binomial and multinomial logistic regression.

Why Loglinear Models?

Many social science phenomena, such as designated college major or type of exercise, are non-numeric, and categories of the variable cannot even be ordered from highest to lowest. Thus, the phenomenon is a nominal dependent variable; its categories form a set of mutually exclusive qualities or traits. Any two cases might fall into the same or different categories, but we cannot assert that the value of one case is more or less than that of a second.

Many popular statistical techniques assume the dependent or criterion variable is numeric (e.g., years of formal education). What can the analyst investigating a nominal dependent variable do? Several techniques are available, many of which are discussed in the next section. (Those described in this entry can also be used with ordinal dependent variables. The categories of an ordinal variable can be rank ordered from highest to lowest, or most to least.)

One alternative is logistic regression. However, many analysts have learned binomial logistic regression using only dichotomous or “dummy” dependent variables scored 1 or 0. Furthermore, the uninitiated interpret logistic regression coefficients as if they were ordinary least squares (OLS) regression coefficients. A second analytic possibility uses three-way cross-tabulation tables and control variables with nonparametric statistical measures. This venerable tradition of “physical” (rather than “statistical”) control presents its own problems, as follows:

  • Inference tests for potential three-variable statistical interactions are limited.
  • The analysis is limited to one independent, one dependent, and one control variable.
  • There is no “system” for testing whether one variable affects a second indirectly through a third variable; for example, education usually influences income indirectly through its effects on occupational level.
  • The three-variable model has limited utility for researchers who want to compare several causes of a phenomenon.

A third option is the linear probability model (LPM) for a dependent dummy variable scored 1 or 0. In this straightforward, typical OLS regression model, B coefficients are interpreted as raising or lowering the probability of a score of 1 on the dependent variable.

However, the LPM also has several problems. The regression often suffers from heteroscedasticity, in which the variance of the dependent variable depends on the scores of the independent variable(s). The variance of the dependent variable is also truncated: for a dummy variable it equals p(1 − p), where p is the proportion of cases scored 1, so it can never exceed 0.25. Finally, the LPM can predict impossible values for the dependent variable, that is, values larger than 1 or less than 0.

Thus the following dilemma arises: Many of the variables that researchers would like to explain are non-numeric. Using OLS statistics to analyze them can produce nonsensical or misleading results. Some common methods taught in early statistics classes (e.g., three-way cross-tabulations) are overly restrictive or lack tests of statistical significance. Other techniques (e.g., the LPM) have serious shortcomings of their own.

Loglinear models were developed to address these issues. Although these models have a relatively long history in statistical theory, their practical application awaited the use of high-speed computers.

...
