Automated georeferencing uses advanced geographic information technologies, such as geoparsing, gazetteer lookup, uncertainty calculation, and outlier checking, to automatically decode and extract geographic information stored in digital text references. For example, field records for biological specimens often include location references in forms such as “2 1/4 mi N of Columbia,” “at the junction of Route 3 and High Street,” and “approximately 3 miles upriver of the Johnson ford on White River.” This entry explains the need for this process and introduces the BioGeomancer Web Services/Workbench developed by an international consortium of natural history and geospatial data experts to address this need.

The Need for Automated Georeferencing

For hundreds of years, biologists have been going into the field to make observations and to collect plant and animal specimens, which have then been stored in museums and herbaria. Along with the collections and observations, a great deal of information has been recorded, including information on their locations. This information is now commonly archived in electronic databases.

It is estimated that there are currently between 2.5 and 3 billion biological collections or observations held in the world's natural history collections, but so far, only about 1% have actually been georeferenced. While the task is enormous, it is of major international importance that this huge store of legacy information be georeferenced, as it is often the only record of the previous location of many species (some now extinct) that were originally collected in areas that have since been turned into agricultural land, built over by urban areas, or submerged behind man-made dams. Many of the historic locations recorded no longer exist, or their circumscription has changed over time.

BioGeomancer

The BioGeomancer Project was established to bring together a range of experts in a collaborative project to focus efforts on developing automated ways to georeference biodiversity data from the world's natural history collections and other biological data archives. The BioGeomancer Project will lower the cost of georeferencing to a point where it is cost-effective for all those creating digital databases of their records to simultaneously georeference them.

The BioGeomancer Workbench uses automatic geoparsing of locality descriptions, links to online gazetteers, and outlier detection algorithms to generate georeferences for these billions of biological records. Not only are the individual georeferences determined, but also their spatial accuracy and uncertainty. These techniques are being set up in such a way that they can easily be applied to any other earthly feature that requires georeferencing. The Workbench builds on the work of several existing projects and expands and enhances that work. The Workbench takes data through a number of key steps in order to produce a validated georeference.

Geoparsing. Data are passed through several different geoparsing engines to separate the data into their component parts and to interpret the semantic components of each. Each part of the text string is converted into north-south (NS) and east-west (EW) distance and heading components, along with their associated units of measurement, and into a number of feature components at different levels in the hierarchy (e.g., town, county, state).
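
As a rough illustration of this step, the sketch below parses one simple locality form, “<distance> <unit> <heading> of <feature>”, and derives NS and EW distance components from it. It is not the BioGeomancer implementation: the pattern, the function name parse_offset_locality, the ParsedLocality record, and the choice of kilometres as the working unit are all assumptions made for this example, and real geoparsing engines handle far more varied text.

```python
import math
import re
from fractions import Fraction
from typing import NamedTuple, Optional

# Conversion factors to kilometres for the units this sketch recognises.
KM_PER_UNIT = {"mi": 1.609344, "km": 1.0, "m": 0.001}

# Compass bearings in degrees clockwise from north.
BEARING_DEG = {"N": 0, "NE": 45, "E": 90, "SE": 135,
               "S": 180, "SW": 225, "W": 270, "NW": 315}

# Hypothetical pattern for "<distance> <unit> <heading> of <feature>".
OFFSET_PATTERN = re.compile(
    r"^\s*(?P<dist>\d+(?:\.\d+)?(?:\s+\d+/\d+)?|\d+/\d+)\s*"
    r"(?P<unit>mi|km|m)\.?\s+"
    r"(?P<heading>NE|NW|SE|SW|N|S|E|W)\.?\s+of\s+"
    r"(?P<feature>.+?)\s*$",
    re.IGNORECASE,
)


class ParsedLocality(NamedTuple):
    distance: float   # distance in the original unit
    unit: str         # "mi", "km", or "m"
    heading: str      # compass heading, e.g. "N"
    feature: str      # named feature the offset is measured from
    north_km: float   # NS component, positive northward
    east_km: float    # EW component, positive eastward


def parse_offset_locality(text: str) -> Optional[ParsedLocality]:
    """Return the parsed components, or None if the text does not fit the pattern."""
    match = OFFSET_PATTERN.match(text)
    if match is None:
        return None
    # "2 1/4" becomes 2 + 1/4; Fraction also accepts decimals such as "2.5".
    distance = float(sum(Fraction(part) for part in match["dist"].split()))
    unit = match["unit"].lower()
    heading = match["heading"].upper()
    bearing = math.radians(BEARING_DEG[heading])
    km = distance * KM_PER_UNIT[unit]
    return ParsedLocality(distance, unit, heading, match["feature"],
                          north_km=km * math.cos(bearing),
                          east_km=km * math.sin(bearing))


parsed = parse_offset_locality("2 1/4 mi N of Columbia")
print(parsed)
```

A description that does not fit the pattern, such as “at the junction of Route 3 and High Street,” returns None here, which hints at why several different engines are combined in practice.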

Gazetteer Lookup. Each of the features output from the geoparsing is then checked against a number of online and in-house gazetteers, and a footprint is determined for each.
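
The lookup can be pictured with a toy in-memory gazetteer, as in the sketch below. Everything in it is an assumption for illustration: the Footprint record, the lookup function, and the representation of a footprint as a point plus a radius are hypothetical, the coordinates are rough, and the radii are invented placeholders. It does not reflect the gazetteers or data model that BioGeomancer actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Footprint:
    name: str
    state: str        # higher level in the feature hierarchy
    lat: float        # decimal degrees (rough value)
    lon: float        # decimal degrees (rough value)
    radius_km: float  # extent of the feature (invented placeholder)


# Toy gazetteer: two places that share a name, to show ambiguity.
TOY_GAZETTEER = [
    Footprint("Columbia", "Missouri", 38.95, -92.33, 8.0),
    Footprint("Columbia", "South Carolina", 34.00, -81.03, 10.0),
]


def lookup(feature: str, state: Optional[str] = None,
           gazetteers: Sequence[Sequence[Footprint]] = (TOY_GAZETTEER,)
           ) -> List[Footprint]:
    """Return every candidate footprint for a feature name across all gazetteers.

    If a state was parsed from the locality text, use it to narrow the candidates.
    """
    wanted = feature.strip().casefold()
    wanted_state = state.strip().casefold() if state is not None else None
    hits: List[Footprint] = []
    for gazetteer in gazetteers:
        for entry in gazetteer:
            if entry.name.casefold() != wanted:
                continue
            if wanted_state is not None and entry.state.casefold() != wanted_state:
                continue
            if entry not in hits:  # skip duplicates shared between gazetteers
                hits.append(entry)
    return hits


print(lookup("Columbia"))                    # both candidates
print(lookup("Columbia", state="Missouri"))  # narrowed by the hierarchy
```

The unfiltered lookup returns more than one candidate, which is why the higher-level feature components from geoparsing (county, state) matter for choosing a footprint.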

...
