Residuals

Neil J. Salkind

doi:10.4135/9781412961288

Edited by:
Neil J. Salkind
In: Encyclopedia of Research Design
Chapter DOI: https://doi.org/10.4135/9781412961288.n385
Subject: Research Design
Keywords: residuals

In statistics, a residual (e) is the observable deviation of an individual observation in a sample from the corresponding sample mean. It should not be confused with the statistical error (ε), which is the deviation of an individual observation from the population mean; because the population mean is usually unknown, the error is usually unobservable. The residual serves as an estimate of the statistical error when a sample is drawn from the population of interest but the population mean cannot be computed because the researcher does not have all the individual observations belonging to the target population.
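The distinction can be illustrated with a minimal Python sketch. The data values and the "known" population mean below are hypothetical and chosen only for illustration; in practice the population mean is usually not available, which is why the residuals stand in for the errors.

```python
from statistics import mean

# Hypothetical example: we pretend the population mean is known,
# which is rarely the case in practice.
population_mean = 50.0
sample = [47.0, 52.0, 55.0, 49.0, 51.0]

sample_mean = mean(sample)

# Residuals: observable deviations from the sample mean.
residuals = [y - sample_mean for y in sample]

# Statistical errors: deviations from the (usually unknown) population mean.
errors = [y - population_mean for y in sample]

print("sample mean:", sample_mean)
print("residuals:", [round(e, 2) for e in residuals])
print("errors:", [round(eps, 2) for eps in errors])
```

By construction the residuals around a sample mean sum to zero (up to floating-point rounding), whereas the errors generally do not.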

In this entry, the concept of residuals in regression analysis is defined, the use of residuals for checking the standard regression assumptions is discussed, and the role of the residuals in assessing the quality of the regression model and in hypothesis testing of the estimated regression coefficients is explained. In addition, the need for different types of residuals is described.

Regression Analysis

The concept of the residual is of utmost importance in regression analysis, particularly when one is applying the regression method of ordinary least squares. In ordinary least squares regression analysis, the goal is to fit a line to the observed data or, when more than one explanatory variable is used, a (hyper)plane; residuals are the vertical distances between the observed values and the corresponding fitted values. Conceptually, the residual is the part of the dependent variable (Y) that is not linearly related to the explanatory variables (X1, X2, …, Xp). In other words, it is the part of Y that cannot be explained by the estimated regression model.

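In a regression model defined as

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \varepsilon_i, \qquad i = 1, 2, \ldots, n,$$

the ith fitted value, the value lying on the (hyper)plane, can be calculated as

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{i1} + \hat{\beta}_2 X_{i2} + \cdots + \hat{\beta}_p X_{ip},$$

where p refers to the number of explanatory variables and n to the number of observations.

Analogously, the vertical distance between the observed and fitted values for the ith observation, the ith residual ($e_i$), is defined as

$$e_i = Y_i - \hat{Y}_i,$$

which is graphically shown for the simple regression model in Figure 1.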

Figure 1. Residual ($e_i$) in Ordinary Least Squares Regression Analysis
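As a rough illustration of fitted values and residuals in the simple regression case (one explanatory variable), the following Python sketch fits a least-squares line using the textbook closed-form estimates. The data and the helper name `fit_simple_ols` are hypothetical and are not taken from the entry.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)

# Fitted values lie on the estimated line; residuals are the vertical
# distances between the observed and the fitted values.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for xi, yi, fi, ei in zip(x, y, fitted, residuals):
    print(f"x={xi:.1f}  observed={yi:.2f}  fitted={fi:.2f}  residual={ei:+.2f}")
```

When the model includes an intercept, the least-squares residuals sum to zero and are uncorrelated with the explanatory variable, which is a quick sanity check on any implementation.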

Standard Regression Assumptions

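The assumptions about the error terms for ordinary least squares estimation are that the error terms are independently and identically distributed normal random variables, each with a mean of zero and a common variance, often expressed as

$$\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}_n),$$

where $\sigma^2 \mathbf{I}_n$ denotes an (n × n) matrix with the individual error variances on the main diagonal being equal to $\sigma^2$ and all the covariances off the diagonal being zero. For instance, for n = 6, the matrix $\sigma^2 \mathbf{I}_n$ is

$$\sigma^2 \mathbf{I}_6 =
\begin{pmatrix}
\sigma^2 & 0 & 0 & 0 & 0 & 0 \\
0 & \sigma^2 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^2 & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma^2 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma^2 & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma^2
\end{pmatrix}.$$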

Because of these error term assumptions, a standard practice in applied regression analysis is to include a series of tests and/or residual plots to check whether the residuals violate some of these assumptions. The literature is rich in statistical tests based on the calculated residuals, including the Durbin-Watson test for autocorrelation (i.e., a violation of the assumption that the errors are independent of each other) and the Breusch-Pagan test for heteroscedasticity (i.e., a violation of the assumption that the errors have constant variance).
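To make one of these tests concrete, the Durbin-Watson statistic can be computed directly from the residuals as the sum of squared successive differences divided by the sum of squared residuals. The sketch below is a minimal illustration with hypothetical residual series; it computes only the statistic, not the critical values needed for a formal test.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals ordered in time (or sequence)."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # sign flips every step
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # long runs of the same sign

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)

print("alternating residuals:", round(dw_alternating, 3))
print("persistent residuals: ", round(dw_persistent, 3))
```

The statistic lies between 0 and 4. Values near 2 are consistent with no first-order autocorrelation, values well below 2 point toward positive autocorrelation (as in the persistent series), and values well above 2 point toward negative autocorrelation (as in the alternating series).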

In addition to these statistical tests, residual plots are an excellent and necessary tool for checking potential violations of the standard regression assumptions. As explained below, however, for better interpretation most residual plots do not use the original estimated residuals but rather transformed residuals, such as standardized or studentized residuals. For instance, the normal probability plot of studentized residuals is suitable for checking the normality assumption, while the scatterplot of studentized residuals versus the explanatory variables is useful for checking the linearity and constant variance assumptions. Further, the index plot of studentized residuals will point toward violations of the independence of errors, such as autocorrelation or spatial dependence of the error terms.
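The entry's own definitions of the transformed residuals are not reproduced here, and terminology differs between authors. As a sketch under one common definition, the internally studentized residual divides each residual by its estimated standard deviation, $s\sqrt{1 - h_{ii}}$, where $s$ is the residual standard error and $h_{ii}$ is the leverage of the ith observation. The code below assumes a simple regression with one explanatory variable, and the data and function name are hypothetical.

```python
from math import sqrt
from statistics import mean

def studentized_residuals(x, y):
    """Internally studentized residuals for a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar

    # Raw residuals: observed minus fitted values.
    e = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Residual standard error with n - 2 degrees of freedom
    # (two estimated coefficients: intercept and slope).
    s = sqrt(sum(ei ** 2 for ei in e) / (n - 2))

    # Leverage of each observation in simple regression.
    h = [1.0 / n + (xi - x_bar) ** 2 / s_xx for xi in x]

    return [ei / (s * sqrt(1.0 - hi)) for ei, hi in zip(e, h)]

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = studentized_residuals(x, y)
for xi, ri in zip(x, r):
    print(f"x={xi:.1f}  studentized residual={ri:+.3f}")
```

These values are what would be placed on the vertical axis of the plots described above: against normal quantiles for the normal probability plot, against an explanatory variable for the scatterplot, or against the observation index for the index plot.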

...
