Ordinary Least Squares Regression

Michael W. Kattan

doi:10.4135/9781412971980

Edited by:
Michael W. Kattan
In: Encyclopedia of Medical Decision Making
Chapter DOI: https://doi.org/10.4135/9781412971980.n245
Subject: Medical Decision Making
Keywords: ordinary least squares regression

The treatment of errors has a long tradition, beginning with attempts to combine repeated measurements in astronomy and geodesy in the early 18th century. In 1805, Adrien-Marie Legendre introduced the method of least squares as a tool for using models with specification errors to fit data collected to determine the shape and circumference of the earth. Specifying the earth's shape to be a sphere, he had to estimate three parameters using five observations from the 1795 survey of the French meridian arc. With three unknowns and five equations, any estimate of the unknown parameters led to errors when fitted to the five observations. He then proposed to choose those estimates that make “the sum of squares of the errors a minimum” (Legendre, 1805, pp. 72–73).

A formal statistical theory of errors was developed by Gauss in 1809 and Laplace in 1810. The method of least squares was shown to possess many desirable statistical properties. Over the more than 200 years since, a method invented to deal with experimental errors in the physical sciences has become universal and is now used, with little or no modification, in the biological and social sciences.

The scientific method in the biological sciences often involves stating a causal relationship between observable variables and specifying a statistical model to estimate the relationship and test hypotheses about it.

Three common medical decision problems involving statistical methods are screening, diagnosis, and treatment. Data used in statistical analysis include medical history, clinical symptoms, and laboratory tests. For many medical conditions, there is no perfect test comparable to an X-ray for detecting a bone fracture. Decisions have to be made using one or more associated, observable factors.

Two problems arise with this approach: (1) How does one formulate a decision rule using the associated factors? and (2) Since no decision rule will be perfect, how is one to compare the decision rules, that is, the errors associated with these rules?

The ordinary least squares (OLS) regression method provides a solution. Suppose the medical condition is type 2 diabetes, and the gold standard is the oral glucose tolerance test. For a screening rule, we want to use readily available data on risk factors such as age, gender, body mass index, and race to predict blood glucose and identify high-risk individuals for follow-up tests. Any function of the risk factors will provide an estimate of blood glucose and hence be useful in diagnosing diabetes. Errors associated with the estimates are calculated using the observed blood glucose. The OLS method can be used to select a set of weights to combine the risk factors and estimate blood glucose as follows: For every set of weights, there will be corresponding predicted values of blood glucose. Prediction errors are the differences between the observed and predicted values. One can then calculate the sum of squares of the errors and choose the set of weights with the least sum.
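The weight-selection procedure can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are simulated rather than clinical, only two risk factors (age and body mass index) are used, and the coefficients that generate the simulated glucose values are invented. NumPy's `numpy.linalg.lstsq` is used to find the weights with the least sum of squared errors.

```python
import numpy as np

# Simulated (not real) data: 200 hypothetical individuals.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(30, 70, n)   # years
bmi = rng.uniform(18, 40, n)   # kg/m^2

# Invented relationship used only to generate example glucose values.
glucose = 60 + 0.5 * age + 1.5 * bmi + rng.normal(0, 10, n)

# Design matrix: a column of ones (intercept) plus the two risk factors.
X = np.column_stack([np.ones(n), age, bmi])


def sum_squared_errors(weights):
    """Sum of squared prediction errors for a given set of weights."""
    predicted = X @ weights
    errors = glucose - predicted
    return float(np.sum(errors ** 2))


# OLS weights: the set of weights with the least sum of squared errors.
ols_weights, *_ = np.linalg.lstsq(X, glucose, rcond=None)

# Any other set of weights yields a larger sum of squared errors.
other_weights = ols_weights + np.array([1.0, 0.05, -0.05])
sse_ols = sum_squared_errors(ols_weights)
sse_other = sum_squared_errors(other_weights)

print("OLS weights:", ols_weights)
print("Sum of squares at OLS weights:  ", sse_ols)
print("Sum of squares at other weights:", sse_other)
```

Comparing `sse_ols` with `sse_other` mirrors the comparison of decision rules described above: each candidate set of weights is judged by the sum of its squared prediction errors.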

Why square the errors before summing? Why not simply sum the errors? A simple sum of errors will be 0 whenever the positive errors exactly offset the negative errors, and hence will be misleading. The sum of squares of errors, on the other hand, will be 0 if and only if every error is 0, that is, only if there are no errors.
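A small numerical illustration, using made-up error values, shows how a simple sum can hide large errors while the sum of squares does not:

```python
# Hypothetical prediction errors: large, but positive and negative cancel.
errors = [5.0, -5.0, 3.0, -3.0]

sum_of_errors = sum(errors)                   # positives offset negatives
sum_of_squares = sum(e ** 2 for e in errors)  # every nonzero error adds to the total

print(sum_of_errors)   # 0.0
print(sum_of_squares)  # 68.0
```

The simple sum suggests a perfect fit even though every prediction is off by 3 or 5 units; the sum of squares reports the misfit.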

...
