General Linear Model

Neil J. Salkind

doi:10.4135/9781412961288

Edited by:
Neil J. Salkind
In: Encyclopedia of Research Design
Chapter DOI: https://doi.org/10.4135/9781412961288.n166
Subject: Research Design
Keywords: general linear models; independent variables

The general linear model (GLM) provides a general framework for a large set of models whose common goal is to explain or predict a quantitative dependent variable from a set of independent variables that can be categorical or quantitative. The GLM encompasses techniques such as Student's t test, simple and multiple linear regression, analysis of variance, and analysis of covariance. The GLM is adequate only for fixed-effect models. To take random effects into account, the GLM must be extended, and it then becomes the mixed-effect model.

Notations

Vectors are denoted with boldface lower-case letters (e.g., y), and matrices are denoted with boldface upper-case letters (e.g., X). The transpose of a matrix is denoted by the superscript T, and the inverse of a matrix is denoted by the superscript −1. There are I observations. The values of a quantitative dependent variable describing the I observations are stored in an I by 1 vector denoted y. The values of the independent variables describing the I observations are stored in an I by K matrix denoted X. K is smaller than I, and X is assumed to have rank K (i.e., X has full column rank). A quantitative independent variable can be stored directly in X, but a qualitative independent variable needs to be recoded with as many columns as there are degrees of freedom for this variable. Common coding schemes include dummy coding, effect coding, and contrast coding.
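As an illustration of the recoding step, here is a minimal sketch of dummy coding and effect coding for a qualitative variable with three levels (hence two degrees of freedom and two coded columns). The data, the function names `dummy_code` and `effect_code`, the choice of reference level, and the added intercept column are all assumptions made for this example, not part of the entry.

```python
# Hypothetical factor with three levels observed on I = 6 cases.
groups = ["a", "a", "b", "b", "c", "c"]

def dummy_code(labels, reference):
    """Return the non-reference levels and one 0/1 column per such level."""
    levels = sorted(set(labels) - {reference})
    rows = [[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]
    return levels, rows

def effect_code(labels, reference):
    """Like dummy coding, but rows of the reference level are set to -1."""
    levels, rows = dummy_code(labels, reference)
    coded = [[-1.0] * len(levels) if lab == reference else row
             for lab, row in zip(labels, rows)]
    return levels, coded

levels, dummy = dummy_code(groups, reference="c")
_, effect = effect_code(groups, reference="c")

# Prepend a column of ones for an intercept (an assumption of this sketch),
# giving an I by K matrix with K = 3.
X_dummy = [[1.0] + row for row in dummy]
X_effect = [[1.0] + row for row in effect]

for row_d, row_e in zip(X_dummy, X_effect):
    print(row_d, row_e)
```

Both matrices have rank 3 here; they differ only in how the coefficients are interpreted (differences from the reference level for dummy coding, deviations from the unweighted mean of the level means for effect coding).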

Core Equation

For the GLM, the values of the dependent variable are obtained as a linear combination of the values of the independent variables. The coefficients of the linear combination are stored in a K by 1 vector denoted b. In general, the values of y cannot be perfectly obtained by a linear combination of the columns of X, and the difference between the actual and the predicted values is called the prediction error. The values of the error are stored in an I by 1 vector denoted e. Formally, the GLM is stated as

\[ \mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e}. \tag{1} \]

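The predicted values are stored in an I by 1 vector denoted ŷ, and therefore, Equation 1 can be rewritten as

\[ \mathbf{y} = \hat{\mathbf{y}} + \mathbf{e} \quad \text{with} \quad \hat{\mathbf{y}} = \mathbf{X}\mathbf{b}. \tag{2} \]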

Putting together Equations 1 and 2 shows that

\[ \mathbf{e} = \mathbf{y} - \hat{\mathbf{y}}. \tag{3} \]
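The relations among y, ŷ, and e can be checked numerically. The following is a minimal sketch using NumPy, assuming the model y = Xb + e described above; the data and the coefficient values are hypothetical, and b is chosen arbitrarily here rather than estimated.

```python
import numpy as np

# Hypothetical data: I = 5 observations, K = 2 columns
# (a column of ones plus one quantitative predictor).
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0],
              [1.0, 5.0]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Any K by 1 coefficient vector defines predictions and errors;
# these values are arbitrary, not a fitted estimate.
b = np.array([0.0, 2.0])

y_hat = X @ b      # predicted values: linear combination of the columns of X
e = y - y_hat      # prediction error: actual minus predicted

# The dependent variable is recovered as prediction plus error.
assert np.allclose(y, X @ b + e)
print(y_hat)
print(e)
```

The decomposition holds for any choice of b; what distinguishes one b from another is the size of the resulting error vector.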

Additional Assumptions

The independent variables are assumed to be fixed variables (i.e., their values will not change for a replication of the experiment analyzed by the GLM, and they are measured without error). The error is interpreted as a random variable; in addition, the I components of the error are assumed to be independently and identically distributed (i.i.d.), and their distribution is assumed to be a normal distribution with a zero mean and a variance denoted $\sigma_e^2$. The values of the dependent variable are assumed to be a random sample of a population of interest. Within this framework, the vector b is seen as an estimate of the population parameter vector β.
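These assumptions can be made concrete with a small simulation: X is held fixed across replications, only the error vector is redrawn, and each replication yields its own b. The sketch below is illustrative only; the values of β and of the error standard deviation, the design, and the number of replications are hypothetical, and NumPy's `np.linalg.lstsq` is used to obtain the least squares b for each replication.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is reproducible

I, R = 50, 2000                  # observations per replication, replications
beta = np.array([1.0, 2.0])      # hypothetical population parameters
sigma_e = 1.0                    # hypothetical error standard deviation

# Fixed design: the same X is reused in every replication.
X = np.column_stack([np.ones(I), np.linspace(0.0, 1.0, I)])

# Each column of E is one replication's error vector, i.i.d. N(0, sigma_e^2).
E = rng.normal(loc=0.0, scale=sigma_e, size=(I, R))
Y = (X @ beta)[:, None] + E

# One least squares estimate b per replication (the columns of B).
B, *_ = np.linalg.lstsq(X, Y, rcond=None)

mean_b = B.mean(axis=1)
print(mean_b)   # should lie close to beta
```

Individual columns of B scatter around β, while their average lies close to it, which is the sense in which b estimates β under these assumptions.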

Least Square Estimate

Under the assumptions of the GLM, the population parameter vector β is estimated by b, which is computed as

\[ \mathbf{b} = \left(\mathbf{X}^{T}\mathbf{X}\right)^{-1}\mathbf{X}^{T}\mathbf{y}. \tag{4} \]

This value of b minimizes the residual sum of squares (i.e., b is such that $\mathbf{e}^{T}\mathbf{e}$ is minimum).
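A minimal numerical sketch of the least squares estimate follows, assuming it is obtained from the normal equations (XᵀX)b = Xᵀy. The data are hypothetical. The system is solved with `np.linalg.solve` rather than by forming the inverse explicitly, which is a common numerical choice and not something the entry prescribes.

```python
import numpy as np

# Hypothetical data: I = 5 observations, K = 2 columns (ones + one predictor).
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0],
              [1.0, 5.0]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Solve the normal equations (X^T X) b = X^T y.
b = np.linalg.solve(X.T @ X, X.T @ y)

e = y - X @ b
rss = e @ e                      # residual sum of squares, e^T e

# Cross-check against NumPy's least squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(b, b_lstsq)

# Moving away from b increases the residual sum of squares.
b_other = b + np.array([0.1, -0.05])
e_other = y - X @ b_other
assert e_other @ e_other > rss

print(b, rss)
```

At the minimum, the error vector is orthogonal to every column of X (Xᵀe = 0), which is another way of stating the normal equations.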

...
