
Disturbance Terms

In research design, researchers often want to know whether there is a relationship between one observed variable (say, y) and another observed variable (say, x). To answer the question, they may construct a model in which y depends on x. Because y is not necessarily explained by x alone, a discrepancy generally exists between the observed value of y and the predicted value of y obtained from the model. This discrepancy is treated as a disturbance term, also called an error term.

Suppose that n sets of data, (x1, y1), (x2, y2), …, (xn, yn), are observed, where yi is a scalar and xi is a vector (say, a 1 × k vector). We assume that there is a relationship between x and y, which is represented as the model y = f(x), where f(x) is a function of x. We say that y is explained by x, or that y is regressed on x. Thus y is called the dependent or explained variable, and x is a vector of the independent or explanatory variables. Suppose that a vector of unknown parameters (say β, which is a k × 1 vector) is included in f(x). Using the n sets of data, we consider estimating β in f(x). If we add a disturbance term (say u, which is also called an error term), we can express the relationship between y and x as y = f(x) + u. The disturbance term u represents the part of y that cannot be explained by x. Usually, x is assumed to be nonstochastic, that is, to take fixed values. Thus f(x) is deterministic, while u is stochastic. The researcher must specify f(x). Most commonly, it is specified as the linear function f(x) = xβ.
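The model y = f(x) + u with the linear specification f(x) = xβ can be illustrated with a small simulation. The following is a minimal sketch, not a definitive implementation: it assumes a single regressor plus an intercept, normally distributed disturbances, and least-squares estimation, none of which is stated in the entry. The function names and the parameter values (intercept 1.0, slope 2.0, standard deviation 1.0) are hypothetical and chosen only for illustration.

```python
import random


def simulate(n, beta0, beta1, sigma, seed=0):
    """Generate n observations from y = beta0 + beta1 * x + u.

    x is nonstochastic (a fixed grid); u is the stochastic disturbance.
    Gaussian disturbances are an assumption made for this sketch.
    """
    rng = random.Random(seed)
    xs = [10.0 * i / (n - 1) for i in range(n)]      # fixed, nonrandom values
    us = [rng.gauss(0.0, sigma) for _ in range(n)]   # disturbance terms
    ys = [beta0 + beta1 * x + u for x, u in zip(xs, us)]
    return xs, ys, us


def ols_fit(xs, ys):
    """Least-squares estimates of the intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(xs, ys, b0, b1):
    """Observed y minus the value predicted by the fitted model."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


xs, ys, us = simulate(n=200, beta0=1.0, beta1=2.0, sigma=1.0)
b0, b1 = ols_fit(xs, ys)
res = residuals(xs, ys, b0, b1)
```

The residuals `res` are the observable counterparts of the disturbances `us`. In real data `us` is never observed; the simulation exposes it only so that the two can be compared.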

The reasons a disturbance term u is necessary are as follows: (a) there are unpredictable elements of randomness in human responses, (b) the effect of a large number of omitted variables is not captured by x, (c) there is measurement error in y, and (d) the functional form of f(x) is generally unknown. Corresponding examples are as follows: (a) Gross domestic product data are observed as a result of human behavior, which is usually unpredictable and is thought of as a source of randomness. (b) We cannot know all the explanatory variables on which y depends. Most of the variables are omitted, and only the important variables needed for analysis are included in x. The influence of the omitted variables is thought of as a source of u. (c) Some kind of error is included in almost all data, either because of data collection difficulties or because the explained variable is inherently unmeasurable and a proxy variable has to be used in its place. (d) Conventionally we specify f(x) as f(x) = xβ. However, there is no inherent reason to choose the linear function. Exceptionally, the functional form of f(x) may come from an underlying theory. Even in this case, however, f(x) is derived from a very limited part of the theory, not from every theoretical consideration.
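Reason (b) can be made concrete with a short simulation. The sketch below is illustrative only: it assumes that y truly depends on an included variable x and an omitted variable z, that z is generated independently of x, and that the pure noise is Gaussian. All names and coefficient values are hypothetical. When y is regressed on x alone, the effect of z ends up in the residuals, so their spread is far larger than that of the pure noise.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for a single regressor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def mean_square(values):
    """Average of the squared values."""
    return sum(v * v for v in values) / len(values)


rng = random.Random(1)
n = 500
xs = [10.0 * i / (n - 1) for i in range(n)]      # included variable (fixed grid)
zs = [rng.gauss(0.0, 1.0) for _ in range(n)]     # omitted variable
es = [rng.gauss(0.0, 0.5) for _ in range(n)]     # pure noise

# Assumed data-generating process: y depends on both x and z.
ys = [1.0 + 2.0 * x + 3.0 * z + e for x, z, e in zip(xs, zs, es)]

# The researcher includes only x, so the contribution of z is left in u.
b0, b1 = fit_line(xs, ys)
u_hat = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

noise_ms = mean_square(es)        # spread of the pure noise alone
residual_ms = mean_square(u_hat)  # spread of noise plus omitted-variable effect
```

Because z was generated independently of x, the slope estimate should remain close to 2.0 here; if the omitted variable were correlated with x, the estimate of the slope would also be affected.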

...
