
Root Mean Square Error

Root mean square error (RMSE) is the square root of the mean squared error (MSE). RMSE measures the differences between the values predicted by a model and the observed values. In other words, it measures the quality of the fit between the actual data and the model's predictions. RMSE is one of the most frequently used measures of the goodness of fit of generalized regression models.

RMSE for Regression

In the application of regression models, unless the relationship or correlation is perfect, the predicted values differ more or less from the actual observations. These differences are the prediction errors, or residuals, and they are measured by the vertical distances between the actual values and the regression line. Large distances indicate large errors. However, for a given fitted regression line, the average (or sum) of the residuals equals zero, because overestimates of some scores are canceled out by underestimates of others. Thus, a common practice in statistical work is to square the residuals so that the magnitudes of the differences accumulate rather than cancel. Indeed, a primary goal in linear regression is to minimize the sum of the squared prediction errors in order to best model the relations among variables.
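The cancellation described above can be seen numerically. The sketch below (with hypothetical data) fits a least-squares line and shows that the raw residuals sum to essentially zero while the squared residuals do not:

```python
import numpy as np

# Hypothetical data roughly following y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit a straight line y = a*x + b by ordinary least squares.
a, b = np.polyfit(x, y, 1)
residuals = y - (a * x + b)

print(residuals.sum())        # ~0: over- and underestimates cancel
print((residuals ** 2).sum())  # squared errors remain positive
```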

To obtain the RMSE, one squares the individual prediction errors and averages them over the whole sample. This average of the squared errors is the MSE. The MSE, in essence, summarizes the spread of the data around the regression line and reflects how large the "typical" prediction error is. Taking the square root of the MSE then yields the RMSE, which expresses the prediction error in the same units as the data rather than in squared units.
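The two steps just described (average the squared errors, then take the square root) can be sketched as follows, using hypothetical observed and predicted values:

```python
import numpy as np

# Hypothetical observed values and model predictions.
actual = np.array([3.0, -0.5, 2.0, 7.0])
predicted = np.array([2.5, 0.0, 2.0, 8.0])

errors = actual - predicted   # individual prediction errors
mse = np.mean(errors ** 2)    # MSE: average of the squared errors
rmse = np.sqrt(mse)           # RMSE: back in the units of the data

print(mse, rmse)
```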

Mathematical Definition

Since RMSE is the square root of MSE, a thorough knowledge of MSE is important to an understanding of the mathematical definition and properties of RMSE.

MSE

MSE is the mean of the overall squared prediction errors. It takes into account the bias, or the tendency of the estimator to overestimate or underestimate the actual values, and the variability of the estimator, or the standard error.

Suppose that $\hat{\theta}$ is an estimate for a population parameter $\theta$. The MSE of an estimator $\hat{\theta}$ is the expected value of $(\hat{\theta} - \theta)^2$, which is a function of the variance and bias of the estimator:

$$\mathrm{MSE}(\hat{\theta}) = E\big[(\hat{\theta} - \theta)^2\big] = V(\hat{\theta}) + \big[B(\hat{\theta})\big]^2,$$

where $V(\hat{\theta})$ denotes the variance of $\hat{\theta}$ and $B(\hat{\theta}) = E(\hat{\theta}) - \theta$ denotes the bias of the estimator $\hat{\theta}$.
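The decomposition of MSE into variance plus squared bias can be checked by simulation. The sketch below (hypothetical setup: a deliberately biased estimator of a normal population mean) confirms that the sample-level identity holds:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 5.0, 2.0, 10  # hypothetical population and sample size

# A deliberately biased estimator of mu: 0.9 times the sample mean.
estimates = 0.9 * rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)

mse = np.mean((estimates - mu) ** 2)
variance = np.var(estimates)
bias = np.mean(estimates) - mu

print(mse, variance + bias ** 2)  # the two quantities agree
```

With 0.9 as the shrinkage factor, the bias is $-0.1\mu = -0.5$, so the squared-bias term contributes noticeably to the MSE.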

An MSE of zero means that the estimator $\hat{\theta}$ predicts observations of the parameter $\theta$ perfectly. Different values of MSE can be compared to determine how well different models explain a given data set. The smaller the MSE is, the closer the fit is to the data.
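Comparing MSE values across candidate models can be illustrated with a small sketch (hypothetical data): a least-squares line versus a constant prediction at the sample mean, fit to the same data.

```python
import numpy as np

# Hypothetical, roughly linear data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.1, 3.9, 5.2])

# Model A: least-squares line; Model B: constant prediction at the mean.
a, b = np.polyfit(x, y, 1)
mse_line = np.mean((y - (a * x + b)) ** 2)
mse_const = np.mean((y - y.mean()) ** 2)

print(mse_line, mse_const)  # the line has the smaller MSE here
```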

RMSE

RMSE measures the typical vertical distance of the actual data points from the fitted line. Mathematically, RMSE is the square root of MSE:

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$$

For an unbiased estimator, RMSE is equivalent to the standard error of the estimate, and it can be calculated using the formula

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n}},$$

where $n$ denotes the size of the sample or the number of observations, $X_i$ represents individual values, and $\bar{X}$ represents the sample mean.
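For a quick worked example of this calculation (hypothetical sample values), the deviations from the mean are squared, averaged over $n$, and square-rooted:

```python
import math

# Hypothetical sample; mean is 5.0.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = sum(xs) / len(xs)

# Root of the mean squared deviation from the sample mean (n in the denominator).
rmse = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

print(rmse)  # 2.0 for this sample
```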

In many statistical procedures, such as analysis of variance and linear regression, the RMSE values are used to determine the statistical significance of the variables or factors under study. RMSE is also used in regression models to determine how many predictors to include in a model for a particular sample.
