A predictor variable is an independent variable used in regression analysis. It provides information about an associated dependent variable with respect to a particular outcome. The term arises from an area of applied mathematics that uses probability theory to estimate future occurrences of an event based on collected quantitative evidence.

Talk of predicted outcomes has entered everyday language. People speak about the higher risks associated with certain lifestyle choices, or about students' predicted academic performance at a university based on scores from standardized assessments. Examples of predictor variables abound in common language, yet confusion and fundamental misunderstandings surround the topic. Because predictor variables are used and reported so frequently in the behavioral and medical sciences, it is important for researchers to understand them and to communicate information about them clearly.

This entry addresses the basic assumptions associated with predictor variables and the basic statistical formula used in linear regression. It begins with an explanation of the basic concepts and key terms commonly associated with predictor variables. It then examines basic linear regression formulas, with an emphasis on predictor variables. The entry concludes with a familiar example of predictor variables used in common applications.

Basic Concepts and Terms

At the most fundamental level, predictor variables are variables that are linked with particular outcomes. As such, predictor variables are extensions of correlational statistics, so it is important to understand basic correlational concepts. As the strength of a correlational relationship increases, slope, directional trends, and clustering effects become increasingly well defined.

Slope describes how much the y variable changes for each unit of change in the x variable; when both variables are measured on a standardized scale, stronger relationships evidence a steeper slope. The degree of slope varies depending on the relationship of the examined variables. Directional trends are either positive (as the x variable increases, the y variable increases) or negative (as one variable increases, the other decreases). An example of a positive relationship is height and weight: as people grow taller, their weight typically increases. The relationship is not necessarily a one-to-one ratio, and exceptions certainly exist; however, this relationship is frequently observed. The relationship between the weight of clothing people wear and temperature is an example of a negative relationship. People are unlikely to wear fewer clothing items on days when the temperature is below the freezing point, and they are just as unlikely to wear layers of clothing when the temperature nears 100°F.

As the previous example suggests, variation will exist in the population: some will choose to wear a light jacket on warm days, whereas others may wear T-shirts and shorts. This variation is described in terms of clustering. Correlational clusters describe the degree to which observations gather near an estimated line that represents the linear trend in the scatterplot. Researchers commonly refer to this linear estimate as the line of best fit. As the observed spread in the cluster decreases, the observations exhibit less variance from the line of best fit; this increases the precision of the linear estimate. In a perfect correlational relationship, no deviations from the line of best fit occur, although this is a rare phenomenon.
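
The concepts above (direction, slope, clustering around the line of best fit, and the perfect-relationship case) can be illustrated with a minimal Python sketch. It is not taken from the entry: the function names are hypothetical, the temperature and clothing-weight figures are invented purely for illustration, and the line of best fit is assumed to be the ordinary least-squares line.

```python
from math import sqrt

def _sums(xs, ys):
    """Return both means plus the sums of squares and cross-products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy

def pearson_r(xs, ys):
    """Correlation coefficient: sign gives direction, magnitude gives strength."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)

def best_fit_line(xs, ys):
    """Least-squares slope and intercept (assumed form of the line of best fit)."""
    mean_x, mean_y, sxx, _, sxy = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_spread(xs, ys):
    """Sum of squared vertical deviations from the line of best fit."""
    slope, intercept = best_fit_line(xs, ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Invented illustrative data: temperature (deg F) vs. weight of clothing worn (kg).
temps_f = [20, 35, 50, 65, 80, 95]
clothing_kg = [3.0, 2.7, 2.1, 1.4, 1.2, 0.6]

# A perfect negative relationship: every point lies exactly on one line.
perfect_kg = [4.0 - 0.03 * t for t in temps_f]

r = pearson_r(temps_f, clothing_kg)
slope, intercept = best_fit_line(temps_f, clothing_kg)
print(f"r = {r:.3f}, slope = {slope:.4f} kg per deg F, intercept = {intercept:.2f} kg")
print(f"spread around line (observed): {residual_spread(temps_f, clothing_kg):.4f}")
print(f"spread around line (perfect):  {residual_spread(temps_f, perfect_kg):.4f}")
```

In this sketch the negative sign of r and of the slope reflects the negative directional trend, while the residual spread plays the role of clustering: the tighter the points sit around the line, the smaller it is, and it falls to essentially zero in the perfect case.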

...
