A good understanding of degrees of freedom (df) is important in statistics, but most statistics textbooks do not really explain what the term means. In most cases, degrees of freedom are thought of as a parameter used to define statistical distributions and conduct hypothesis tests. For instance, the sampling distribution for the t statistic is a continuous distribution called the t distribution. The shape of the t distribution depends on one parameter, the degrees of freedom. In a sample of size n, the t distribution has n − 1 df.
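
As a concrete illustration (a minimal sketch in Python using NumPy and SciPy; the data and the hypothesized mean of 5 are made up), a one-sample t statistic computed from n observations is referred to a t distribution with n − 1 df:

```python
# A minimal sketch: the t statistic from a sample of size n is compared
# against a t distribution with n - 1 degrees of freedom.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.2, scale=2.0, size=20)  # sample of n = 20 observations
n = len(x)

t_stat = (x.mean() - 5.0) / (x.std(ddof=1) / np.sqrt(n))  # test H0: mu = 5
df = n - 1                                                # df of the t distribution
p_value = 2 * stats.t.sf(abs(t_stat), df)                 # two-sided p-value

print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```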

Degrees of freedom can also be thought of in other ways. The degrees of freedom indicate the number of independent pieces of information that are allowed to vary in a system. A simple example is given by imagining a four-legged table. When three of the legs are free to be any length, the fourth leg must be a specific length if the table is to stand steadily on the floor. Thus, the degrees of freedom for the table legs are three. Another example involves dividing a sample of n observations into k groups. Once k − 1 of the cell counts are fixed, the kth cell count is determined by the total number of observations. Therefore, there are k − 1 df in this design, as the sketch below shows.
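
The cell-count example can be verified directly (a minimal sketch; the particular counts are arbitrary):

```python
# A minimal sketch: with the total n fixed, only k - 1 cell counts are free.
n = 100                  # total number of observations
counts = [23, 41, 17]    # k - 1 = 3 freely chosen cell counts
last = n - sum(counts)   # the kth count is fully determined by n
print(last)              # 19 -- no freedom left for the fourth cell
```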

Generally, every time a statistic is estimated, 1 df is lost. A sample of n observations has n df. A statistic calculated from that sample, such as the mean, also has n df. The sample variance is given by the following equation:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

where $\bar{x}$ is the sample mean. The degrees of freedom for the sample variance are n − 1 because the number of independent pieces of information that are allowed to vary is restricted: once the sample mean is computed and held fixed, it cannot vary, and 1 df is lost. Another way to see that there are n − 1 degrees of freedom is that the sample variance is restricted by the condition that the sum of errors

$$\sum_{i=1}^{n} (x_i - \bar{x})$$

is zero. More generally, when m linear functions of the sample data are held constant, there are n − m df.
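
Both restrictions are easy to check numerically (a minimal sketch using NumPy; the data are made up): the deviations from the sample mean sum to zero, and dividing the sum of squared deviations by n − 1 matches NumPy's variance with ddof=1.

```python
# A minimal sketch: the deviations from the sample mean sum to zero,
# so only n - 1 of them are free; the sample variance divides by n - 1.
import numpy as np

x = np.array([4.0, 7.0, 6.0, 9.0, 4.0])
deviations = x - x.mean()
print(deviations.sum())                    # ~0 (up to floating-point error)

s2 = np.sum(deviations**2) / (len(x) - 1)  # divide by n - 1, not n
print(s2, np.var(x, ddof=1))               # ddof=1 gives the same value
```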

Simple regression offers another way to look at degrees of freedom. Often, we want to compare results from a regression model (the full model) with another model that includes fewer parameters and, therefore, has fewer degrees of freedom (the reduced model). The difference in degrees of freedom between the full and reduced models is the number of estimated parameters in the full model, p(f), minus the number of estimated parameters in the reduced model, p(r). The full regression model is

$$y = \beta_0 + \beta_1 x + \varepsilon$$

There are two parameters to be estimated in this model, so p(f) = 2. The reduced model is constructed under the null hypothesis that β1 equals zero. Therefore, the reduced model is

$$y = \beta_0 + \varepsilon$$

There is only one parameter to be estimated in this model, so p(r) = 1. This means that there is p(f) − p(r) = 1 additional piece of information available to the full model beyond the reduced model. A test statistic used to compare the two models (the F-change statistic, for instance) will therefore have 1 numerator df.
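
A minimal sketch of this comparison (simulated data; the intercept 2.0, slope 0.8, and noise level are arbitrary choices) fits both models by least squares and forms the F-change statistic with 1 numerator df:

```python
# A minimal sketch: comparing a full model y = b0 + b1*x + e against the
# reduced model y = b0 + e on simulated data.
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.8 * x + rng.normal(scale=2.0, size=n)

# Full model: intercept and slope via least squares (p_f = 2 parameters).
X_full = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X_full, y, rcond=None)[0]
rss_full = np.sum((y - X_full @ beta) ** 2)

# Reduced model: intercept only (p_r = 1 parameter); the fit is just ybar.
rss_reduced = np.sum((y - y.mean()) ** 2)

# F-change statistic: numerator df = p_f - p_r = 1, denominator df = n - p_f.
df_num = 2 - 1
df_den = n - 2
F = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
print(f"F = {F:.2f} on ({df_num}, {df_den}) df")
```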
