Outliers

Shuming Bao

doi:10.4135/9781412953962


By: Shuming Bao
In: Encyclopedia of Geographic Information Science
Chapter DOI: https://doi.org/10.4135/9781412953962.n157
Subject: Geographic Information Systems

Outliers are atypical, infrequent observations that differ markedly from the bulk of the observations (in location, scale, or distributional pattern). An observed outlier may be caused by an error in measurement or processing, influenced by a disruptive event (such as a strike, a natural disaster, or a political or economic crisis), or generated by a different mechanism. Although an outlier is not necessarily “wrong,” the effect of outliers on inference procedures can be substantial: a small number of outliers may have a disproportionate influence on the estimated correlation coefficient or the slope of the regression line (see Figure 1); the real efficiency of optimal statistical methods may be reduced; and the inference drawn from the statistical analysis may be unreliable or even invalid.
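The influence described above can be illustrated with a minimal sketch. The data are hypothetical (ten points lying exactly on y = 2x, plus one added outlying observation), and the helper names `ols_slope` and `pearson_r` are chosen here for illustration only; they do not come from the entry.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical clean data: ten points exactly on the line y = 2x.
x = list(range(1, 11))
y = [2 * v for v in x]

# The same data with a single outlying observation appended.
x_out = x + [20]
y_out = y + [0]

print("clean:        slope =", ols_slope(x, y), " r =", pearson_r(x, y))
print("with outlier: slope =", ols_slope(x_out, y_out), " r =", pearson_r(x_out, y_out))
```

In this constructed case a single added point is enough to pull the fitted slope from 2 down to roughly 0.07 and to reduce the correlation from 1 to below 0.1, even though the other ten observations are unchanged.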

Outlier detection is important for effective data analysis and modeling. Various graphical tools (such as histograms, boxplots, and scatterplots) can be used to detect outliers in data analysis. If outliers are detected, they should not simply be excluded from the data set. It is important to find out whether they represent a purely random phenomenon or whether they indicate some misspecification in the systematic part of the model. In some cases, an outlier may be corrected by error control in measurement or recording. In the case of a highly asymmetric data distribution, an outlier may become a normal observation after a data transformation.
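As a rough sketch of boxplot-based detection and of the effect of a transformation, the example below flags values lying more than 1.5 interquartile ranges beyond the quartiles. The 1.5 multiplier is the common boxplot convention and is an assumption here, since the entry does not specify a rule; the right-skewed data and the function name `boxplot_outliers` are likewise hypothetical.

```python
import math
import statistics


def boxplot_outliers(values, k=1.5):
    """Return the values lying outside the boxplot fences.

    The fences are placed k interquartile ranges below the first
    quartile and above the third quartile (k = 1.5 by convention).
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical, strongly right-skewed observations.
raw = [1, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 100]

# Natural-log transformation of the same observations.
logged = [math.log(v) for v in raw]

print("flagged on the raw scale:", boxplot_outliers(raw))
print("flagged on the log scale:", boxplot_outliers(logged))
```

On the raw scale the value 100 falls outside the upper fence, whereas after the log transformation no value is flagged. Whether a transformation is appropriate still depends on the data and the model, so the result should be read as an illustration rather than a recommendation.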

Figure 1 The Effects of Outlier on the Slope Coefficients of Linear

In most cases, the outliers are the most interesting observations in the data set, since they may reveal unusual and interesting phenomena. A thorough investigation of outliers helps achieve a better understanding of the data structure and gives more confidence in data modeling. To control the excessive influence of outliers, resistant methods (such as weighted-median polish) may be used in exploratory data analysis to help identify data structure, and robust methods (such as robust regression) may be used in confirmatory data analysis to produce efficient parameter estimates. Some geovisualization methods, such as brushing and linking, are also useful for exploring outliers.
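One possible illustration of a robust regression method is the Theil-Sen estimator, which takes the median of the slopes between all pairs of points. It is used here only as an example of the general idea; the entry does not name a specific robust estimator, and the data and the function name `theil_sen` are hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Theil-Sen line fit: median pairwise slope and median intercept."""
    points = list(zip(x, y))
    # Slopes between every pair of points with distinct x values.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1
    ]
    slope = median(slopes)
    # Intercept taken as the median residual once the slope is removed.
    intercept = median(b - slope * a for a, b in points)
    return slope, intercept


# Hypothetical data: ten points on y = 2x plus one outlying observation.
x = list(range(1, 11)) + [20]
y = [2 * v for v in range(1, 11)] + [0]

slope, intercept = theil_sen(x, y)
print("Theil-Sen slope:", slope, " intercept:", intercept)
```

Because the single outlying point contributes only 10 of the 55 pairwise slopes, the median slope remains 2 for these data, whereas an ordinary least-squares fit to the same points would be pulled strongly toward the outlier.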

Shuming Bao
