
For a linear regression model, the consistency of the ordinary least squares (OLS) estimator depends heavily on the assumption that the explanatory variables and the statistical disturbances are uncorrelated. When the regressors are uncorrelated with the disturbance term, we say that we have exogenous explanatory variables, whereas a regressor correlated with the error term is said to be endogenous. The terms exogenous and endogenous originated in simultaneous equations analysis, but the term endogenous explanatory variable covers any case where a regressor is correlated with the disturbance term. A usual source of endogeneity is omission of important variables. Other sources include simultaneity with one equation forgotten and autoregressive models with serially correlated errors.

When an explanatory variable is endogenous, it is not possible to separate variation in the explanatory variable from variation in the disturbance term, and as a result, OLS yields a biased and inconsistent estimator. To separate the variation in the explanatory variables from the variation in the error term, we need additional information in the form of variables called instrumental variables, or simply instruments. In linear models, any variable uncorrelated with the error term is an instrumental variable.
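The inconsistency of OLS under endogeneity can be checked with a small simulation. The sketch below is only an illustration under assumed values: the data-generating process, the true slope of 2, and the names `w`, `ols_slope`, `x_endog`, and `x_exog` are hypothetical and do not come from the text. The omitted variable `w` enters both the regressor and the disturbance, which is one of the sources of endogeneity mentioned above.

```python
import numpy as np

def ols_slope(x, y):
    """Slope coefficient from regressing y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
n = 100_000
true_slope = 2.0                 # hypothetical value chosen for the illustration

w = rng.normal(size=n)           # omitted variable; it ends up in the disturbance
u = w + rng.normal(size=n)       # disturbance term

# Endogenous case: the regressor also depends on w, so cov(x, u) = 1 and var(x) = 2.
x_endog = w + rng.normal(size=n)
y_endog = 1.0 + true_slope * x_endog + u
slope_endog = ols_slope(x_endog, y_endog)

# Exogenous case: a regressor with the same variance, drawn independently of u.
x_exog = np.sqrt(2.0) * rng.normal(size=n)
y_exog = 1.0 + true_slope * x_exog + u
slope_exog = ols_slope(x_exog, y_exog)

print(f"true slope      : {true_slope:.3f}")
print(f"OLS, exogenous  : {slope_exog:.3f}")
print(f"OLS, endogenous : {slope_endog:.3f}")
```

In this design the endogenous estimate settles near the true slope plus cov(x, u)/var(x) = 2 + 1/2, and collecting more observations does not remove the gap.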

Example

Suppose that we are interested in estimation of the demand curve for a good, and gather data for the price P and the quantity purchased Q of the good. One might set up the following model for the demand curve:

$$\log Q_D = \alpha_D + \beta_D \log P + u_D$$

and run a regression of log quantity on log price. However, although the above equation describes the functional relationship between the price P and the quantity demanded QD, the collected price and quantity data are equilibrium prices and equilibrium quantities, that is, the solutions to the simultaneous equations

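$$
\begin{aligned}
\log Q &= \alpha_D + \beta_D \log P + u_D, \\
\log Q &= \alpha_S + \beta_S \log P + u_S,
\end{aligned}
$$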

where the first equation is the demand equation, and the second is for supply. In other words, although we are interested in the slope of the demand curves D1, D2, D3, and so on in Figure 1, the observed data are the equilibria E1, E2, E3, and so on, which are affected by the demand shocks uD and the supply shocks uS. Therefore, least squares regression using the observed data is unlikely to yield an unbiased estimator for the demand function.
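A short simulation makes the problem concrete. This is a sketch under assumed values, not the article's own calculation: the log-linear demand and supply curves, the coefficients `a_d`, `b_d`, `a_s`, `b_s`, and the unit-variance shocks are all hypothetical. Equilibrium log price is found by equating the two curves, and log quantity is then regressed on log price.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed for reproducibility
n = 100_000

# Hypothetical structural parameters (assumptions for this illustration).
a_d, b_d = 2.0, -1.0             # demand intercept and slope
a_s, b_s = 0.0, 1.0              # supply intercept and slope

u_d = rng.normal(size=n)         # demand shocks
u_s = rng.normal(size=n)         # supply shocks

# Equilibrium: a_d + b_d*p + u_d = a_s + b_s*p + u_s, solved for log price p.
log_p = (a_d - a_s + u_d - u_s) / (b_s - b_d)
log_q = a_d + b_d * log_p + u_d  # equilibrium log quantity, read off the demand curve

# Least squares regression of log quantity on a constant and log price.
X = np.column_stack([np.ones(n), log_p])
ols_coef, *_ = np.linalg.lstsq(X, log_q, rcond=None)
ols_slope = ols_coef[1]

print(f"demand slope : {b_d:.3f}")
print(f"supply slope : {b_s:.3f}")
print(f"OLS slope    : {ols_slope:.3f}")
```

With these particular values the fitted slope lies close to zero, between the demand and supply slopes, because the equilibrium price moves with the demand shock as well as the supply shock.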

Figure 1 Demand and Supply When Both Demand and Supply Shocks Are Present


An intuitive solution to this problem is to consider a third variable that shifts the supply curve but does not affect the demand. Then, the equilibrium prices and quantities trace out the demand curve as shown in Figure 2, and proper exploitation of the additional variable will lead to a consistent estimator. This variable, the instrumental variable, is correlated with the price P, but is uncorrelated with the demand shocks uD.

Figure 2 Demand and Supply With Only Supply Shocks

Note: A variable that affects only supply is an instrumental variable.
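The role of a supply shifter can be sketched in the same kind of simulation. Everything specific here is an assumption for illustration: the variable `z` (a hypothetical cost shifter that enters only the supply curve), the coefficient values, and the simple ratio-of-covariances form of the instrumental variable estimate for a single regressor and a single instrument.

```python
import numpy as np

rng = np.random.default_rng(2)   # fixed seed for reproducibility
n = 100_000

# Hypothetical structural parameters (assumptions for this illustration).
a_d, b_d = 2.0, -1.0             # demand intercept and slope
a_s, b_s, c_s = 0.0, 1.0, 1.0    # supply intercept, slope, and effect of the shifter

z = rng.normal(size=n)           # supply shifter: enters supply, not demand
u_d = rng.normal(size=n)         # demand shocks, independent of z
u_s = rng.normal(size=n)         # supply shocks

# Equilibrium: a_d + b_d*p + u_d = a_s + b_s*p + c_s*z + u_s, solved for log price p.
log_p = (a_d - a_s - c_s * z + u_d - u_s) / (b_s - b_d)
log_q = a_d + b_d * log_p + u_d

# OLS slope of log quantity on log price (still contaminated by demand shocks).
ols_slope = np.cov(log_p, log_q)[0, 1] / np.var(log_p, ddof=1)

# IV slope: use only the price variation that is driven by z.
iv_slope = np.cov(z, log_q)[0, 1] / np.cov(z, log_p)[0, 1]

print(f"demand slope : {b_d:.3f}")
print(f"OLS slope    : {ols_slope:.3f}")
print(f"IV slope     : {iv_slope:.3f}")
```

Because `z` is correlated with the price but not with the demand shocks, the ratio cov(z, log Q)/cov(z, log P) isolates the demand slope, while the OLS slope remains far from it.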

Instrumental Variable Estimation

Consider the linear regression model

$$y_t = x_{1t}'\beta_1 + x_{2t}'\beta_2 + u_t$$

where the variables in x_{1t} are exogenous and x_{2t} is the vector of endogenous regressors. By definition, the exogenous variables x_{1t} are all uncorrelated with the error term, so they all serve as instruments. In addition, we may collect further variables z_{2t} that are also uncorrelated with the errors. Together, the instruments are z_t = (x_{1t}, z_{2t}), and by assumption, these are uncorrelated with the errors u_t; that is, E(z_t u_t) = 0, or

...
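The moment condition that the instruments are uncorrelated with the errors can be illustrated numerically. The sketch below is an assumption-laden example, not the estimator as the article goes on to define it: the data-generating process, the coefficient values, and the names `x1`, `x2`, `z2`, `beta_iv`, and `beta_ols` are hypothetical. It uses one endogenous regressor and one extra instrument, so that setting the sample analogue of the moment condition, Z'(y - Xb)/n = 0, to zero gives a square linear system. The errors `u` are observable here only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(3)   # fixed seed for reproducibility
n = 200_000

# Exogenous regressors x1: a constant and one variable (hypothetical design).
x1 = np.column_stack([np.ones(n), rng.normal(size=n)])
z2 = rng.normal(size=n)          # additional instrument, independent of the errors
u = rng.normal(size=n)           # errors

# Endogenous regressor x2: depends on the instruments and on the error u.
x2 = 0.8 * z2 + 0.5 * x1[:, 1] + 0.6 * u + rng.normal(size=n)

beta_true = np.array([1.0, 0.5, -2.0])   # coefficients on (constant, x1, x2)
X = np.column_stack([x1, x2])            # all regressors
Z = np.column_stack([x1, z2])            # all instruments
y = X @ beta_true + u

# Sample analogues of the moments: near zero for the instruments, not for x2.
moment_z = Z.T @ u / n
moment_x2 = x2 @ u / n

# Method-of-moments estimate: solve Z'(y - X b) = 0 for b.
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print("sample moments, instruments:", np.round(moment_z, 3))
print("sample moment, x2          :", round(float(moment_x2), 3))
print("IV estimate :", np.round(beta_iv, 3))
print("OLS estimate:", np.round(beta_ols, 3))
```

In this setup the instrument moments are close to zero and the moment-based estimate recovers the assumed coefficients, while the OLS coefficient on the endogenous regressor is pulled away from its true value.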
