
Geographic information is collected to support scientific investigations and decision making in a wide variety of domains (e.g., ecology, environmental engineering and science, geography, geosciences, public health, and social sciences). Spatial interpolation, one of the most widely used spatial analysis methods, is often integrated within geographic information systems (GIS) to estimate unknown or unavailable information associated with locations of interest based on collected information. Spatial interpolation is described in most textbooks on GIS and spatial analysis. This entry provides a concise but comprehensive review of spatial interpolation methods, with a focus on the computation of spatial interpolation.

Taxonomy

Spatial interpolation methods can be developed based on different assumptions, purposes, and evaluations. These methods exploit Tobler's (1970) famous first law of geography: “Everything is related to everything else, but near things are more related than distant things” (p. 236). For example, local perspectives for interpolation often assume positive spatial autocorrelation, while global perspectives tend to find trend surfaces across the entire space being considered. Spatial interpolation also depends on the type of spatial entities (e.g., point and area) where measurements are made and estimations drawn. The difference between point and area entities contributes to the distinction between point-based and areal interpolation methods, developed for the purposes of different applications such as point-based temperature surface interpolation and the areal interpolation of population characteristics between different areal units (e.g., census tracts and blocks). Consistent with the general concept of interpolation, spatial interpolation is often assessed by whether an interpolated value matches a given sample (i.e., exact interpolation) or not (i.e., approximate interpolation).
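The areal interpolation mentioned above can be illustrated with a minimal sketch. This is not a method prescribed by the entry; it is a simple area-weighted allocation, with zones reduced to hypothetical axis-aligned rectangles (real applications would use polygon overlays between, e.g., census tracts and blocks):

```python
# Minimal sketch of area-weighted areal interpolation: counts from source
# zones are reallocated to target zones in proportion to overlapping area.
# Zones are simplified to rectangles (xmin, ymin, xmax, ymax).

def overlap_area(a, b):
    """Area of intersection between two axis-aligned rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

def areal_interpolate(sources, targets):
    """sources: list of (rect, value) pairs; targets: list of rects.
    Returns one estimated value per target zone."""
    estimates = []
    for t in targets:
        total = 0.0
        for rect, value in sources:
            src_area = (rect[2] - rect[0]) * (rect[3] - rect[1])
            # Allocate the source value in proportion to shared area.
            total += value * overlap_area(rect, t) / src_area
        estimates.append(total)
    return estimates

# Example: one source tract of 1,000 people split across two target blocks.
sources = [((0, 0, 10, 10), 1000.0)]
targets = [(0, 0, 5, 10), (5, 0, 10, 10)]
print(areal_interpolate(sources, targets))  # [500.0, 500.0]
```

The area-weighting step embodies an implicit assumption that the variable (here, population) is uniformly distributed within each source zone.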

Another way to classify spatial interpolation focuses on whether statistical properties are addressed. Deterministic interpolation (e.g., inverse distance weighted, referred to as IDW) does not incorporate consideration of the randomness and probability of spatial variables, while geostatistical interpolation (e.g., kriging) does. A particular spatial interpolation method may take a deterministic or geostatistical approach, hold a local or global perspective, be tailored to point-based or areal entities, and produce approximate or exact outcomes. These various aspects of taxonomy are often combined with each other and integrated with additional dimensions (e.g., temporal) based on application requirements and the constraints of computational tractability, leading to many kinds of spatial interpolation techniques and algorithms.

Computation

Computation of spatial interpolation is often intensive for sizable data sets, and thus high-performance computing environments based on parallel and distributed processing have been employed to enable large-scale interpolation. For example, IDW interpolation has been extensively studied to improve the performance of the computation required to find the nearest neighbors on which calculations are based. Finding nearest neighbors, formally known as k-nearest neighbor search, is a well-defined computational problem that is notoriously intensive for large data sets. Various IDW interpolation algorithms have been developed for parallel and distributed computing architectures. Most of these algorithms exploit the spatial distribution characteristics of data sets to reduce the computational intensity of k-nearest neighbor search.
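A minimal sequential sketch makes the structure of IDW concrete: a spatial index (here a k-d tree via SciPy, assumed available) answers the k-nearest neighbor queries, and the estimate is a weighted average with weights inversely proportional to a power of distance. The parameters and the synthetic temperature example are illustrative, not drawn from the entry:

```python
import numpy as np
from scipy.spatial import cKDTree

def idw_interpolate(sample_xy, sample_vals, query_xy, k=8, power=2.0):
    """Inverse distance weighted (IDW) interpolation.

    For each query point, find its k nearest samples with a k-d tree
    (the step that dominates cost for large data sets), then average
    their values with weights 1 / distance**power. Because the nearest
    sample dominates as distance goes to zero, IDW is an exact
    interpolator at sample locations.
    """
    tree = cKDTree(sample_xy)               # spatial index over samples
    dists, idx = tree.query(query_xy, k=k)  # k-nearest neighbor search
    dists = np.maximum(dists, 1e-12)        # guard against division by zero
    weights = 1.0 / dists**power
    vals = np.take(sample_vals, idx)
    return (weights * vals).sum(axis=1) / weights.sum(axis=1)

# Example: interpolate a temperature surface from scattered stations.
rng = np.random.default_rng(0)
stations = rng.uniform(0, 100, size=(500, 2))   # synthetic station locations
temps = 20 + 0.1 * stations[:, 0]               # synthetic west-east trend
grid = np.array([[50.0, 50.0], [10.0, 90.0]])   # two query locations
print(idw_interpolate(stations, temps, grid, k=8))
```

Parallel implementations typically partition the query points (and often the spatial index itself) across processors, which is why the spatial distribution of the data matters for load balancing.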

Another example of computationally intensive spatial interpolation is Bayesian kriging, which provides realistic estimation of interpolation error and is able to combine information from disparate sources. However, Bayesian kriging based on Markov chain Monte Carlo (MCMC) methods poses even more significant computational challenges than classical kriging. Both require linear algebra operations that are computationally intensive when the number of measurement locations is large. Because an MCMC sampler typically must be run for thousands of iterations, each requiring numerous operations, the run time for sequential Bayesian kriging algorithms quickly becomes unacceptable. Even with parallel MCMC algorithms running on single high-performance computers, run times may be in the range of several hours, especially for vast geographic data sets. Therefore, the emerging cyberinfrastructure provides an ideal platform to develop parallel MCMC algorithms for Bayesian kriging by taking advantage of dynamically configurable computer resources.
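The linear-algebra cost referred to above can be seen in classical ordinary kriging, sketched minimally below (this is the classical, not Bayesian, form; the exponential variogram model and its parameters are illustrative assumptions). The dense solve of the kriging system is roughly O(n^3) in the number of measurement locations, and in Bayesian kriging comparable work recurs at every MCMC iteration:

```python
import numpy as np

def exponential_variogram(h, sill=1.0, range_param=10.0, nugget=0.0):
    """Assumed exponential variogram model (illustrative parameters)."""
    return nugget + sill * (1.0 - np.exp(-h / range_param))

def ordinary_kriging(sample_xy, sample_vals, query_xy):
    """Classical ordinary kriging for a single query point.

    Builds and solves the (n+1) x (n+1) kriging system; the dense
    factorization is the linear-algebra cost that grows quickly with
    the number of measurement locations.
    """
    n = len(sample_xy)
    # Pairwise distances among samples, then the variogram matrix.
    d = np.linalg.norm(sample_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exponential_variogram(d)
    A[n, n] = 0.0                            # Lagrange-multiplier corner
    b = np.ones(n + 1)
    b[:n] = exponential_variogram(
        np.linalg.norm(sample_xy - query_xy, axis=1))
    w = np.linalg.solve(A, b)                # dense solve: roughly O(n^3)
    estimate = w[:n] @ sample_vals           # weighted average of samples
    variance = w @ b                         # kriging error variance
    return estimate, variance

# Example: with a zero nugget, kriging is exact at a sampled location.
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
vals = np.array([1.0, 2.0, 3.0, 4.0])
est, var = ordinary_kriging(xy, vals, np.array([0.0, 0.0]))
print(est, var)  # est ~ 1.0, var ~ 0.0
```

Unlike IDW, kriging returns an error variance alongside each estimate, which is the kind of uncertainty information that Bayesian kriging extends at further computational cost.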

...
