
Research in educational psychology is often interested in examining the relationship between two or more variables and using this relationship to predict future behavior. Regression is the process of modeling the relationship between variables. In statistical form, regression models can be used to predict the value of an outcome variable given knowledge of the value of one or more other variables.

Regression can be used to answer questions related to performance on standardized examinations in subject areas such as mathematics, reading, and science. For instance, student performance on a mathematics assessment is likely related to performance in a mathematics-related class such as algebra or geometry. The relationship between these variables can be used to devise a regression equation that allows a prediction to be made. That is, given knowledge of performance in an algebra class, the regression equation can be used to predict (with error) performance on the mathematics assessment. The regression equation also explains a certain amount of variability in the outcome measure.

Regression models take many forms, but linear models are most common. Linear regression models describe a pattern of change in which the outcome measure, or dependent variable (DV), Y, changes at a constant rate as the independent variable (IV), X, increases: Y rises with X when the relationship is positive and falls when it is negative. Theoretically, the linear regression equation can be expressed as

Y = a + bX + e

where a and b are estimated values for the intercept and slope, respectively, and e represents measurement error or the amount of unexplained variability in the DV. The intercept is defined as the value of the DV when the IV is equal to zero, and the slope provides information on the direction and magnitude of the relationship between the IV and DV.
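As a minimal sketch of how the equation is used for prediction, the Python snippet below plugs values of X into a + bX and computes e as the difference between observed and predicted Y. The coefficients, the data pairs, and the names predict, observations, and residuals are all invented for illustration and do not come from any real assessment.

```python
# Hypothetical coefficients chosen for illustration only (not from the source).
a = 20.0   # intercept: predicted assessment score when the algebra grade is 0
b = 0.75   # slope: change in predicted score per one-point change in algebra grade

def predict(x, intercept=a, slope=b):
    """Return the predicted DV value a + b*x for one IV value x."""
    return intercept + slope * x

# Invented (algebra grade, observed assessment score) pairs.
observations = [(60, 68.0), (80, 78.0), (90, 90.0)]

# e = observed Y minus predicted Y: the part of Y the line leaves unexplained.
residuals = [y - predict(x) for x, y in observations]

for (x, y), e in zip(observations, residuals):
    print(f"X={x}  observed Y={y}  predicted Y={predict(x)}  e={e}")
```

A positive residual means the student scored higher than the line predicts, and a negative residual means the student scored lower.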

The goal in formulating regression models is to minimize the amount of measurement error by explaining as much variability in the DV as possible. This can be accomplished through either a simple regression model or a multiple regression model. The aforementioned example is considered a simple regression model using a single IV, performance in an algebra class, to predict performance on the mathematics assessment. Including an additional variable, such as performance in a geometry class, would likely result in a better explanation of the variability in mathematics assessment scores. Theoretically, the multiple regression equation takes the same form as the simple regression equation but contains an additional slope value:

Y = a + b_1 X_1 + b_2 X_2 + e
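To make the two-predictor case concrete, the following Python sketch estimates the intercept and both slopes by ordinary least squares, solving the two normal equations directly from sums of cross-products. The function name fit_two_predictors and the data are hypothetical. The scores are constructed to follow Y = 1 + 2X_1 + 3X_2 with no error, so the recovered coefficients can be checked by eye; real data would leave nonzero residuals.

```python
def mean(values):
    return sum(values) / len(values)

def cross_products(us, vs):
    """Sum of products of deviations from the means of us and vs."""
    mu, mv = mean(us), mean(vs)
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs))

def fit_two_predictors(x1, x2, y):
    """Least-squares estimates (a, b1, b2) for Y = a + b1*X1 + b2*X2 + e."""
    s11 = cross_products(x1, x1)
    s22 = cross_products(x2, x2)
    s12 = cross_products(x1, x2)
    s1y = cross_products(x1, y)
    s2y = cross_products(x2, y)
    det = s11 * s22 - s12 ** 2
    if det == 0:
        # The two IVs are perfectly collinear, so the slopes are not unique.
        raise ValueError("predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return a, b1, b2

# Invented scores: x1 = algebra, x2 = geometry, y = mathematics assessment.
algebra = [1, 2, 3, 4, 5]
geometry = [2, 1, 4, 3, 6]
assessment = [1 + 2 * g1 + 3 * g2 for g1, g2 in zip(algebra, geometry)]

a, b1, b2 = fit_two_predictors(algebra, geometry, assessment)
print(a, b1, b2)
```

Each slope here describes the change in Y per unit change in one IV with the other IV held constant, which is why it can differ from the slope obtained when that IV is used alone.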

The slope in a simple linear regression equation is calculated as

b = s_{XY} / s_X^2

where s_{XY} is the covariance between X and Y and s_X^2 is the variance of X. The Y intercept is

a = \bar{Y} - b\bar{X}

where \bar{Y} is the mean of Y and \bar{X} is the mean of X. The slope is reported in either standardized or unstandardized form. The unstandardized slope b expresses the magnitude of the relationship between the IV and DV in the original units of the variables. If the variables are first standardized (converted to z scores), a standardized slope can be calculated, which expresses the relationship between X and Y in standard deviation units. Standardized slopes are useful in multiple regression for judging which IV contributes more toward explaining the variance in the DV.
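The calculations described above can be sketched in a few lines of Python. The function names and the five data points are invented for illustration. The sketch assumes the sample covariance and variance (n - 1 in the denominator), and it obtains the standardized slope by rescaling b with the two standard deviations, which in simple regression gives the same value as the Pearson correlation between X and Y.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance s_XY (n - 1 in the denominator)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def simple_regression(xs, ys):
    """Return (a, b, beta): intercept, unstandardized slope, standardized slope."""
    s_xy = covariance(xs, ys)
    s_xx = covariance(xs, xs)   # variance of X
    s_yy = covariance(ys, ys)   # variance of Y
    b = s_xy / s_xx                       # slope in original units
    a = mean(ys) - b * mean(xs)           # intercept
    beta = b * sqrt(s_xx) / sqrt(s_yy)    # slope in standard deviation units
    return a, b, beta

# Invented scores for five students.
algebra = [1, 2, 3, 4, 5]
assessment = [2, 4, 5, 4, 5]

a, b, beta = simple_regression(algebra, assessment)
print(f"a={a:.3f}  b={b:.3f}  beta={beta:.3f}")
```

Rescaling either variable (for example, reporting algebra grades out of 10 instead of 100) changes b but leaves beta unchanged, which is what makes the standardized form comparable across IVs measured in different units.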

...
