All events have space and time coordinates attached to them: they happen somewhere at some time. In many areas of epidemiology, recording the place of individual events and exposure is vitally important. The recent surge in the availability of desktop computing power and geographical information systems (GIS) software, together with growing interest in the effect of neighborhood conditions on the development of disease, has caused a resurgence of interest in spatial data analysis.

Types of Spatial Data

Spatial data consist of measurements or observations taken at specific locations or within specific spatial areas. In addition to values for various attributes of interest, spatial data sets also include the locations or relative positions of the data values. There are three main types of spatial data. The first type, geostatistical data, consists of measurements taken at fixed locations. In most cases, the locations are spatially continuous, that is, data locations are available in close proximity to each other. An example of geostatistical data is the concentration of pollutants measured at monitoring stations. The second type is lattice data, which are area-referenced data with observations specific to a region or area. An example of lattice data is the rate of specific types of cancer deaths by state from the National Cancer Institute's cancer mortality atlas. The areas can be regularly or irregularly spaced and are often referenced by their centroid. The third type is point pattern data, which arise when the locations themselves are of interest. Spatial point patterns consist of a finite number of locations observed in a spatial area. An example of point pattern data is the set of locations of women diagnosed with breast cancer on Long Island.
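
As a rough illustration of how the three types differ in structure, the sketch below represents each as a small Python record. The class names, field names, and all values are hypothetical and chosen only for illustration; they do not come from any particular GIS package or data set.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeostatisticalObservation:
    """A measurement taken at a fixed location, e.g. a monitoring station."""
    x: float
    y: float
    value: float  # e.g. a pollutant concentration


@dataclass
class LatticeObservation:
    """A value attached to an area, here referenced by the area's centroid."""
    area_id: str
    centroid: Tuple[float, float]
    value: float  # e.g. a mortality rate for the whole area


@dataclass
class PointPattern:
    """Event locations within a study region; the locations are the data."""
    region: Tuple[float, float, float, float]  # xmin, ymin, xmax, ymax
    events: List[Tuple[float, float]]


# Invented example values, for illustration only.
station = GeostatisticalObservation(x=3.2, y=7.5, value=12.4)
state = LatticeObservation(area_id="S1", centroid=(40.0, -75.0), value=18.9)
cases = PointPattern(region=(0.0, 0.0, 10.0, 10.0),
                     events=[(1.5, 2.0), (4.2, 8.8), (9.1, 0.3)])

print(station, state, len(cases.events))
```

Note that the geostatistical and lattice records carry an attribute value at a location or area, whereas the point pattern record has no attribute value at all: its events are the observations.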

Spatial Scale

The spatial scale or resolution is an important issue in the analysis of spatial data. Patterns observed in spatial data may be the result of different processes operating at different scales, so the results of an analysis can depend on the areal units chosen. This is known as the modifiable areal unit problem (MAUP), and the concept of the ecological fallacy is closely related to it. The MAUP consists of both a scale problem and an aggregation problem. The scale problem refers to the variation that can occur when data from one scale of areal units are aggregated into more or fewer areal units. For example, much of the variation among counties changes or is lost when the data are aggregated to the state level. The aggregation problem arises because the choice of spatial areas is often arbitrary in nature, and different areal units can be just as meaningful in displaying the same base-level data. Where possible, it is more meaningful to use ‘naturally’ defined areas (e.g., neighborhoods or hospital catchment areas) rather than arbitrary administrative units.
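
The effect of aggregation can be seen with a few lines of arithmetic. The sketch below uses four hypothetical counties (all counts and populations are invented) and compares rates at the county level, at the state level, and under an alternative grouping of the same counties. It is a toy illustration of the idea, not a method for correcting the MAUP.

```python
from statistics import pvariance

# Hypothetical counties: (county, state, cases, population). Invented values.
counties = [
    ("c1", "A", 10, 10_000),
    ("c2", "A", 90, 30_000),
    ("c3", "B", 40, 20_000),
    ("c4", "B", 40, 20_000),
]


def rate_per_1000(cases, population):
    """Crude rate per 1,000 population."""
    return 1000 * cases / population


def aggregate(groups):
    """Pool cases and population within each group, then compute the rate.

    `groups` maps a county name to the label of the larger unit it joins.
    """
    totals = {}
    for name, _, cases, pop in counties:
        label = groups[name]
        c, p = totals.get(label, (0, 0))
        totals[label] = (c + cases, p + pop)
    return {label: rate_per_1000(c, p) for label, (c, p) in totals.items()}


county_rates = [rate_per_1000(c, p) for _, _, c, p in counties]

# Scale effect: aggregate counties into their states.
state_rates = aggregate({name: state for name, state, _, _ in counties})

# Aggregation effect: same number of units, different boundaries.
alt_rates = aggregate({"c1": "X", "c3": "X", "c2": "Y", "c4": "Y"})

print("county rates:", county_rates)
print("state rates:", state_rates)
print("alternative grouping:", alt_rates)
print("variance, counties:", pvariance(county_rates))
print("variance, states:", pvariance(state_rates.values()))
```

In this toy data set, the county rates range from 1 to 3 per 1,000, while the two state rates differ by only 0.5 per 1,000, so most of the between-county variation disappears on aggregation. Regrouping the same counties into two different units gives yet another pair of rates, even though the underlying data have not changed.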

Frequentist versus Bayesian Analysis

Traditionally, epidemiologists have used the frequentist approach for the analysis of data, including spatial data. However, disease rates may fluctuate widely in areas where the data are sparse, for example where populations or case counts are small. In addition, data are often spatially correlated, meaning that adjacent areas tend to have similar rates of disease incidence. In many cases, there are no valid or accepted frequentist methods for tackling these problems, or the frequentist methods are complex and difficult to interpret.
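
The instability caused by sparse data can be illustrated with simple arithmetic. The sketch below compares how a single additional case changes the crude rate in a small area and in a large one. The populations and case counts are hypothetical, and the example shows only the problem, not any particular frequentist or Bayesian remedy.

```python
def rate_per_100k(cases, population):
    """Crude rate per 100,000 population."""
    return 100_000 * cases / population


# Hypothetical areas with the same underlying rate but very different sizes.
small_pop = 2_000
large_pop = 2_000_000

small_before = rate_per_100k(1, small_pop)
small_after = rate_per_100k(2, small_pop)        # one extra case
large_before = rate_per_100k(1_000, large_pop)
large_after = rate_per_100k(1_001, large_pop)    # one extra case

print(f"small area: {small_before:.2f} -> {small_after:.2f} per 100,000")
print(f"large area: {large_before:.2f} -> {large_after:.2f} per 100,000")
```

Both areas start at 50 per 100,000. One extra case doubles the rate in the small area but moves the rate in the large area by only 0.05 per 100,000, which is why maps of raw rates tend to show extreme values in sparsely populated areas.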

...
