Most statistical techniques assume that relationships are constant over space (the assumption of spatial stationarity). Geographically weighted regression (GWR) is a statistical technique that allows relationships to vary spatially within a study region (it therefore allows spatial nonstationarity). The relaxation of the stationarity assumption in regression has proven to be effective in modeling many spatial processes. This entry summarizes the conceptual and mathematical background to GWR.

Background

Regression analysis, in its various guises, is arguably the most widely practiced form of statistical analysis. Its popularity arises because it allows researchers to quantify the relationship between two variables while accounting for (holding constant) the potentially confounding relationships with other variables. This is an extremely useful feature that can be employed in a wide variety of application areas. In a linear framework, these relationships can be represented with the following general model:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_n x_{ni} + \varepsilon_i$$

where $y_i$ is the value of the dependent variable observed at location $i$; $x_{1i}, x_{2i}, \ldots, x_{ni}$ are the values of the independent variables observed at $i$; $\beta_0, \beta_1, \ldots, \beta_n$ are parameters to be estimated; and $\varepsilon_i$ is an error term that is assumed to be normally distributed.

Within this model, researchers are interested in obtaining accurate estimates of the parameters $\beta_0, \beta_1, \ldots, \beta_n$, which describe the nature of the various statistical relationships and, in turn, can be used to infer the determinants of variation in the dependent variable $y$. Calibrating such a model yields parameter estimates from the following estimator:

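$$\beta^{*} = (X^{T} X)^{-1} X^{T} y$$

where $\beta^{*}$ represents an estimate of $\beta$, $X$ is the matrix of observations on the independent variables (with a leading column of ones for the intercept $\beta_0$), and $y$ is the vector of observations on the dependent variable.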
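
As a rough illustration, the calibration step can be sketched in a few lines of Python with NumPy. The sketch assumes the estimator intended here is the standard ordinary least squares form, β* = (XᵀX)⁻¹Xᵀy; the function name `ols_estimate`, the simulated data, and the "true" parameter values are hypothetical and chosen only for demonstration.

```python
import numpy as np

def ols_estimate(X, y):
    """Solve the normal equations (X^T X) beta = X^T y for beta.

    X is assumed to already contain a leading column of ones
    so that the first coefficient plays the role of the intercept.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    # Solving the linear system avoids forming an explicit inverse.
    return np.linalg.solve(XtX, Xty)

# Simulated data (illustrative values, not taken from the entry).
rng = np.random.default_rng(0)
n_obs = 200
x1 = rng.normal(size=n_obs)
x2 = rng.normal(size=n_obs)
true_beta = np.array([1.0, 2.0, -0.5])  # intercept, slope on x1, slope on x2

X = np.column_stack([np.ones(n_obs), x1, x2])
y = X @ true_beta + rng.normal(scale=0.1, size=n_obs)

beta_star = ols_estimate(X, y)
print(beta_star)
```

With many observations and little noise, `beta_star` should land close to `true_beta`. Solving the system with `np.linalg.solve` rather than inverting XᵀX is a common numerical choice, not something the entry prescribes.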

To make clear what is happening here, multiple observations on each of the variables $y, x_1, x_2, \ldots, x_n$ are needed to obtain a single estimate of each parameter in the model. It is impossible to determine any set of estimates based on a single observation of $y, x_1, x_2, \ldots, x_n$, since an infinite number of parameter values would then fit the equation exactly. Consequently, researchers have to estimate the parameters in the model statistically from multiple observations of the variables $y, x_1, x_2, \ldots, x_n$, and the more observations they have on each of the variables, the more reliable their estimates become as representatives of the true relationships within the model, ceteris paribus.
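
Both points in the preceding paragraph can be checked numerically. The sketch below uses made-up numbers throughout: it first shows that a single observation is fitted exactly by several different parameter vectors, then compares the average estimation error from small and large simulated samples. The helper `mean_abs_error` and all values are hypothetical.

```python
import numpy as np

# One observation: a single equation with three unknown parameters.
x_single = np.array([[1.0, 2.0, 3.0]])  # row: intercept term, x1, x2
y_single = 10.0

# The design matrix has rank 1, fewer than the 3 parameters sought.
rank = np.linalg.matrix_rank(x_single)

# Three different parameter vectors that all reproduce y exactly.
candidates = [
    np.array([10.0, 0.0, 0.0]),
    np.array([0.0, 5.0, 0.0]),
    np.array([1.0, 0.0, 3.0]),
]
fitted = [float((x_single @ b)[0]) for b in candidates]

def mean_abs_error(n_obs, n_reps=200, seed=0):
    """Average absolute error of least squares estimates over repeated samples."""
    rng = np.random.default_rng(seed)
    true_beta = np.array([1.0, 2.0, -0.5])
    errors = []
    for _ in range(n_reps):
        X = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, 2))])
        y = X @ true_beta + rng.normal(size=n_obs)
        beta_star, *_ = np.linalg.lstsq(X, y, rcond=None)
        errors.append(np.abs(beta_star - true_beta).mean())
    return float(np.mean(errors))

err_small = mean_abs_error(10)
err_large = mean_abs_error(1000)
print(rank, fitted, err_small, err_large)
```

Under these assumptions the error from 1,000 observations per sample should be markedly smaller than the error from 10, which is the sense in which more observations give more reliable estimates.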

The question then is where do these data come from? If the data are time series, then multiple observations of $y, x_1, x_2, \ldots, x_n$ can be obtained at the same location and a regression run for that location (while taking into account serial autocorrelation issues). If the data are spatial (i.e., one has observations at multiple locations), then an estimate of the parameters in the model can be obtained by pooling the data from the various locations at which the data are recorded within a study area. It follows that the parameters estimated in the regression analysis represent averages of the relationships that exist at each of the locations at which the data are recorded. This is the method by which regression is performed in spatial analysis, and two potential problems arise that are unique to spatial data (although equivalent problems surface in aspatial data, they are not as complex as those found in spatial data).
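
The averaging effect of pooling can be seen in a small simulation. This is only a sketch under stated assumptions: locations lie along a hypothetical one-dimensional transect `u`, the true slope drifts linearly from 1 in the west to 3 in the east, and the independent variable is unrelated to location. It is not an implementation of GWR; it simply contrasts one pooled regression with separate regressions on two halves of the study area.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs = 2000

u = rng.uniform(0.0, 1.0, size=n_obs)  # location along the transect
x = rng.normal(size=n_obs)             # independent variable
local_slope = 1.0 + 2.0 * u            # relationship varies over space
y = 0.5 + local_slope * x + rng.normal(scale=0.5, size=n_obs)

def fit_slope(x_vals, y_vals):
    """Least squares slope of y on x, with an intercept in the model."""
    X = np.column_stack([np.ones(len(x_vals)), x_vals])
    beta_star, *_ = np.linalg.lstsq(X, y_vals, rcond=None)
    return float(beta_star[1])

# Pooled ("global") regression over every location.
global_slope = fit_slope(x, y)

# Separate regressions for the western and eastern halves.
west = u < 0.5
west_slope = fit_slope(x[west], y[west])
east_slope = fit_slope(x[~west], y[~west])

print(global_slope, local_slope.mean(), west_slope, east_slope)
```

In this setup the pooled slope should sit near the mean of the local slopes, while the two regional fits fall on either side of it. The single global estimate therefore hides variation that is visible once the data are split by location.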

...
