
Geographical Analysis Machine (GAM)

A geographical analysis machine (GAM) is an exploratory method for the detection of the raised incidence of some phenomenon in an at-risk population. The name and its acronym were coined by Stan Openshaw and others in 1987. The method was developed at the Department of Geography at the University of Newcastle in the mid-1980s in an attempt to determine whether spatial clustering was evident in the incidence of various cancers in children in northern England. The locations of the individual patients were known, and the at-risk population was available for small areas with about 200 households known as enumeration districts (ED). Digitized boundaries for the enumeration districts were not known but a “centroid” was available.

In Openshaw's original conception, the “machine” had four components: (a) a means of generating spatial hypotheses, (b) a significance test procedure, (c) a data retrieval system, and (d) a graphical system to display the results.

The hypothesis generator lies at the heart of GAM. A lattice is placed over the study area, and at the mesh points of the lattice, the disease cases and the at-risk population are counted within a circular search region. To allow the circles to overlap, the lattice spacing is taken as 0.8 of the radius of the circles. For each circle, a statistical test is undertaken to determine whether the number of observed cases is likely to have arisen by chance. Results for significant circles are stored for later plotting. When all the circles of a given radius have been tested, the radius is increased, the grid mesh size is changed accordingly, and the process of data extraction and significance testing is repeated for every circle. In the original GAM, circle radii from 1 km to 20 km were used in 1 km increments.
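The scanning procedure described above can be sketched as follows. This is a minimal illustration, not Openshaw's FORTRAN implementation: the function name, the point-based data layout, and the use of NumPy are assumptions made for the example, and the statistical test itself is omitted (each circle simply returns its observed and at-risk counts for later testing).

```python
import numpy as np

def gam_scan(case_xy, pop_xy, pop_counts, radii, overlap=0.8):
    """Sketch of the GAM hypothesis generator: for each search radius,
    lay a lattice with spacing 0.8 * radius over the study area and,
    at every mesh point, count the cases and the at-risk population
    falling inside the circular search region.

    case_xy    : (n, 2) array of case locations
    pop_xy     : (m, 2) array of ED centroid locations
    pop_counts : (m,) array of at-risk population per centroid
    radii      : iterable of circle radii to scan
    Returns a list of (cx, cy, r, case_count, pop_count) tuples.
    """
    results = []
    # Study-area bounds taken from the population centroids (a
    # simplification; the original used administrative boundaries).
    xmin, ymin = pop_xy.min(axis=0)
    xmax, ymax = pop_xy.max(axis=0)
    for r in radii:
        step = overlap * r  # lattice spacing = 0.8 * radius -> circles overlap
        for cx in np.arange(xmin, xmax + step, step):
            for cy in np.arange(ymin, ymax + step, step):
                in_case = np.hypot(case_xy[:, 0] - cx, case_xy[:, 1] - cy) <= r
                in_pop = np.hypot(pop_xy[:, 0] - cx, pop_xy[:, 1] - cy) <= r
                results.append((cx, cy, r,
                                int(in_case.sum()),
                                int(pop_counts[in_pop].sum())))
    return results
```

In a full GAM run, each returned tuple would be passed to the significance test, and only the significant circles would be written to the results file for plotting.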

The significance test used was based on Hope's Monte Carlo test. The observed count was compared with 499 simulated counts, which gives a smallest attainable significance level of 1/500 = 0.002 and was intended to minimize the identification of false positives. The data retrieval overheads are enormous, so an efficient data structure is required. The chosen data structure was Robinson's K-D-B tree, which at the time was perhaps one of the better spatial retrieval methods for large spatial databases.
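The Monte Carlo test can be sketched as follows. The choice of a Poisson null model (with the expected count derived from the at-risk population) and the Knuth sampling routine are illustrative assumptions; the source specifies only that the observed count was ranked against 499 simulated counts.

```python
import math
import random

def monte_carlo_test(observed, expected, n_sims=499, seed=0):
    """Sketch of Hope's Monte Carlo test as used in GAM: simulate
    n_sims counts under an assumed null (here, Poisson with mean
    `expected`) and rank the observed count among them.  With 499
    simulations the smallest attainable p-value is 1/500 = 0.002.
    """
    rng = random.Random(seed)

    def poisson(lam):
        # Knuth's method: count uniform draws until their running
        # product falls below exp(-lam).
        limit, k, p = math.exp(-lam), 0, 1.0
        while p > limit:
            k += 1
            p *= rng.random()
        return k - 1

    sims = [poisson(expected) for _ in range(n_sims)]
    # p-value: proportion of the (simulations + observation) pool
    # that is at least as extreme as the observed count.
    rank = sum(1 for s in sims if s >= observed) + 1
    return rank / (n_sims + 1)
```

A circle would be flagged as significant when the observed count exceeds every simulated count, i.e. when the returned p-value equals 0.002.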

The plotting of the significant circles was the task of a separate program. This software read the file of results, which consisted of the coordinates of the center and the radius of each significant circle. The circles were plotted on an outline of the administrative districts of northern England. In these days of laser printers, it is perhaps worth recalling a technical problem: drawing thousands of overlapping circles with a rollerball pen on thin paper would often cause local saturation of the paper, eventually leading to tears and holes appearing in the plot. The problem was obviated by plotting the circles in a randomized order.

The whole system was hand coded in FORTRAN, and the graphical output was plotted on a CalComp large-format pen plotter. The software ran on a powerful Amdahl 5860 mainframe computer. It was organized so that the statistical computing was undertaken by one program, which would be run over several night shifts, and the graphics were handled by another program, which received the output from the first as a text file. Openshaw quoted a run time of 22,758 CPU seconds (about 6.3 hours) for one of the runs; in 1986, this was viewed as a major problem. On a present-day personal computer (PC), one might expect the run time to be of the order of a few minutes.

...
