R2

Neil J. Salkind

doi:10.4135/9781412961288


Edited by:
Neil J. Salkind
In: Encyclopedia of Research Design
Chapter DOI: https://doi.org/10.4135/9781412961288.n357
Subject: Research Design
Keywords: linear models; measures of variability


R-squared ($R^2$) is a statistic that measures the proportion of variance in one variable that is accounted for by its relationship with one or more other variables. $R^2$ is sometimes called the coefficient of determination, and it is given as the square of a correlation coefficient.

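Given paired variables $(X_i, Y_i)$, $i = 1, \ldots, n$, a linear model that explains the relationship between the variables is given by

$$Y_i = \beta_0 + \beta_1 X_i + e_i,$$

where $e_i$ is a mean-zero error. The parameters of the linear model can be estimated using the least squares method; the estimates are denoted by $\hat{\beta}_0$ and $\hat{\beta}_1$, respectively. The parameters are estimated by minimizing the sum of squared residuals between the variable $Y_i$ and the model $\beta_0 + \beta_1 X_i$, that is,

$$(\hat{\beta}_0, \hat{\beta}_1) = \arg\min_{\beta_0, \beta_1} \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2.$$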

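It can be shown that the least squares estimates are

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X},$$

where $\bar{X}$ and $\bar{Y}$ are the sample means and the sample cross-covariance $S_{xy}$ is defined as

$$S_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}),$$

with $S_{xx}$ obtained by replacing $Y$ with $X$. The normalizing constant cancels in the ratio $S_{xy}/S_{xx}$, so the estimates are the same whether one divides by $n$ or by $n - 1$.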

Statistical packages such as SAS, S-PLUS, and R provide routines for obtaining the least squares estimates. The estimated model is denoted as

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i.$$
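As a rough illustration of what such a routine computes for a single predictor, the closed-form least squares estimates can be sketched in a few lines of Python. The function name `least_squares_fit` and the toy data are assumptions made for this example and do not come from the entry. The slope is taken as the centered cross-product sum divided by the centered sum of squares of X (any normalizing constant cancels), and the intercept as the mean of Y minus the slope times the mean of X.

```python
def least_squares_fit(x, y):
    """Return (b0, b1): least squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Centered cross-product sum and centered sum of squares of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    b1 = sxy / sxx           # slope estimate
    b0 = y_bar - b1 * x_bar  # intercept estimate
    return b0, b1


# Hypothetical toy data, used only to exercise the function.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
b0, b1 = least_squares_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]  # fitted values from the estimated model
print(b0, b1)
```

In practice one would normally rely on a statistical package rather than hand-rolled code; the sketch is only meant to make the formulas concrete.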

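With the above notation, the sum of squared errors (SSE), or the sum of squared residuals, is given by

$$\mathrm{SSE} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2.$$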

SSE measures the amount of variability in $Y$ that is not explained by the model. How, then, does one measure the amount of variability in $Y$ that is explained by the model? To answer this question, one needs to know the total variability present in the data. The total sum of squares (SST) is the measure of total variation in the $Y$ variable and is defined as

$$\mathrm{SST} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2,$$

where $\bar{Y}$ is the sample mean of the $Y$ variables, that is,

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i.$$

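SSE is the minimum of the sum of squared residuals over all linear models, and the constant model $Y = \bar{Y}$ (that is, $\beta_0 = \bar{Y}$ and $\beta_1 = 0$) is a linear model whose sum of squared residuals equals SST. Therefore SSE can never exceed SST. The amount of variability explained by the model is then SST − SSE, which is denoted as the regression sum of squares (SSR), that is,

$$\mathrm{SSR} = \mathrm{SST} - \mathrm{SSE}.$$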

The ratio SSR/SST = (SST − SSE)/SST measures the proportion of variability explained by the model. The coefficient of determination ($R^2$) is defined as this ratio:

$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}.$$
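The decomposition can be made concrete with a short Python sketch. The function name `sums_of_squares`, the dictionary keys, and the toy data are assumptions introduced for illustration only. The function takes already estimated coefficients as arguments, and the identity SSR = SST − SSE is meaningful here only when those coefficients are the least squares estimates.

```python
def sums_of_squares(x, y, b0, b1):
    """Return SSE, SST, SSR and R^2 for the fitted line y = b0 + b1 * x.

    b0 and b1 are assumed to be the least squares estimates for (x, y).
    """
    n = len(y)
    y_bar = sum(y) / n
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    ssr = sst - sse                                         # explained variation
    return {"SSE": sse, "SST": sst, "SSR": ssr, "R2": ssr / sst}


# Hypothetical toy data; b0 = 0.0 and b1 = 1.9 are its least squares estimates.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
result = sums_of_squares(x, y, b0=0.0, b1=1.9)
print(result)
```

Passing the coefficients in explicitly keeps the sketch independent of how the fit was obtained, at the cost of trusting the caller to supply least squares estimates.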

The coefficient of determination is thus the ratio of the variation explained by the model to the total variation present in $Y$. Note that the coefficient of determination ranges between 0 and 1. The $R^2$ value is interpreted as the proportion of variation in $Y$ that is explained by the model. $R^2 = 1$ indicates that the model exactly explains the variability in $Y$, and hence the model must pass through every measurement $(X_i, Y_i)$. On the other hand, $R^2 = 0$ indicates that the model does not explain any variability in $Y$. An $R^2$ value larger than .5 is usually considered to indicate a strong relationship.
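The opening statement that $R^2$ is the square of a correlation coefficient can be checked for the one-predictor model with a short derivation. The sketch below writes everything in terms of raw centered sums, so it does not depend on any normalizing constant; the symbol $r$ for the sample correlation between $X$ and $Y$ is notation introduced here for the example. It uses the fact that the least squares line passes through $(\bar{X}, \bar{Y})$, so that $\hat{Y}_i - \bar{Y} = \hat{\beta}_1 (X_i - \bar{X})$, and that for a least squares fit with an intercept SSR equals $\sum_i (\hat{Y}_i - \bar{Y})^2$.

$$
\begin{aligned}
\mathrm{SSR} &= \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2
 = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2
 = \frac{\left[\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})\right]^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \\
R^2 &= \frac{\mathrm{SSR}}{\mathrm{SST}}
 = \frac{\left[\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})\right]^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}
 = r^2 .
\end{aligned}
$$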

Case Study and Data

Consider the following paired measurements from Moore and McCabe (1989), based on occupational mortality records from 1970 to 1972 in England and Wales. The figures represent smoking rates and deaths from lung cancer for a number of occupational groups.

Smoking index	Lung cancer mortality index
77	84
137	116
117	123
94	128
116	155
102	101
111	118
93	113
88	104
102	88
91	104
104	129
107	86
112	96
113	144
110	139
125	113
133	146
115	128
105	115
87	79
91	85
100	120
76	60
66	51

For a set of occupational groups, the first variable is the smoking index (average 100), and the second variable is the lung cancer mortality index (average 100). Suppose we are interested in determining how much the lung cancer mortality index (Y variable) is influenced by the smoking index (X variable). Figure 1 shows the scatterplot of the smoking index versus the lung cancer mortality index. The straight line is the estimated linear model, and it is given

...
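For readers who want to reproduce the fit themselves, the following Python sketch computes the least squares line and $R^2$ directly from the 25 paired measurements in the table above, taking the smoking index as X and the lung cancer mortality index as Y. The variable names are illustrative assumptions, and the printed values are whatever the computation yields from the tabulated data; they are not quoted from the entry.

```python
# Smoking index (X) and lung cancer mortality index (Y), copied from the table.
smoking = [77, 137, 117, 94, 116, 102, 111, 93, 88, 102, 91, 104, 107,
           112, 113, 110, 125, 133, 115, 105, 87, 91, 100, 76, 66]
mortality = [84, 116, 123, 128, 155, 101, 118, 113, 104, 88, 104, 129, 86,
             96, 144, 139, 113, 146, 128, 115, 79, 85, 120, 60, 51]

n = len(smoking)
x_bar = sum(smoking) / n
y_bar = sum(mortality) / n

# Centered sums; the normalizing constant cancels in the slope.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(smoking, mortality))
sxx = sum((x - x_bar) ** 2 for x in smoking)

b1 = sxy / sxx           # least squares slope
b0 = y_bar - b1 * x_bar  # least squares intercept

fitted = [b0 + b1 * x for x in smoking]
sse = sum((y - f) ** 2 for y, f in zip(mortality, fitted))  # unexplained
sst = sum((y - y_bar) ** 2 for y in mortality)              # total
r2 = (sst - sse) / sst                                      # SSR / SST

print(f"estimated line: Y = {b0:.3f} + {b1:.3f} X")
print(f"R^2 = {r2:.3f}")
```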
