Scatterplot

Fei Gu; Neal M. Kingston

doi:10.4135/9781071812082


In: The SAGE Encyclopedia of Research Design
Chapter DOI: https://doi.org/10.4135/9781071812082.n548
Subject: Research Methods & Evaluation (general), Research Design
Keywords: correlation; scatterplot


A scatterplot is a graphic representation of the relationship between two or three variables. Each observation is represented by a point in n-dimensional space, where n is the number of variables. The most common type of scatterplot involves two variables, with each observation located by its bivariate coordinates, usually denoted by X and Y. Trivariate plots are also common. Representing data in more than three dimensions requires multiple bivariate and/or trivariate plots, so in practice all scatterplots are of one of two types: bivariate or trivariate.

Manipulated data points are used to present a bivariate scatterplot in Figure 1 and a trivariate plot in Figure 2.

Scatterplots in Data Analysis

Scatterplots can be used to represent variables that have a linear relationship, a nonlinear relationship, or no relationship. They often give researchers the information needed to decide whether to fit a linear model to their data. This is particularly important if they plan to use a statistical technique that assumes linearity, such as ordinary least squares regression. The scatterplot in Figure 3, for example, would show a researcher that fitting a linear model would be misleading.
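The point about nonlinearity can be illustrated numerically. The following is a minimal sketch using hypothetical data (it is not the data behind Figure 3): Y is completely determined by X through a curve, yet the Pearson correlation coefficient, introduced in the next section, is zero. The function name `pearson_r` and the data values are assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a symmetric U-shaped (quadratic) relationship.
xs = list(range(-5, 6))
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(f"r for the curved data: {r_curved:.3f}")
```

A linear summary alone would suggest that these variables are unrelated, whereas a scatterplot of the same points would show the curve immediately.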

Correlation and Regression

Correlation is a good way to examine the linear relationship between two variables, for example, a person’s weight and height, or a student’s high school grade point average and SAT/ACT score. The strength of the linear relationship is usually described by the correlation coefficient (also known as the Pearson correlation coefficient), which ranges from −1 to 1. A scatterplot gives researchers a graphic view of what tends to happen to one score when the other increases or decreases.

Figure 1 Bivariate Scatterplot

Figure 2 Trivariate Scatterplot

Given a pair of variables, it is fairly easy to draw the scatterplot by plotting one score on the vertical axis and the other on the horizontal axis. Figures 4 through 7 provide examples of variable pairs with correlations of −0.8, −0.3, 0.3, and 0.8, respectively.

A positive correlation describes the situation in which an increase in variable X is associated with an increase in variable Y, whereas a negative correlation implies that an increase in variable X is associated with a decrease in variable Y. The sign indicates only the direction of the relationship; the strength is given by the magnitude. A correlation of 0.8 is therefore no stronger than a correlation of −0.8: the two are equally strong and simply run in opposite directions.
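As a rough illustration of sign versus magnitude, the sketch below computes the correlation coefficient as the average product of z-scores, one standard formulation of the Pearson coefficient. The data values and the names `z_scores` and `correlation` are hypothetical. Reversing the direction of Y flips the sign of the coefficient but leaves its magnitude unchanged.

```python
import math

def z_scores(values):
    """Standardize values using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def correlation(xs, ys):
    """Pearson correlation as the mean product of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

# Hypothetical scores: y_up tends to rise with x, y_down is its mirror image.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]
y_down = [-v for v in y_up]

r_pos = correlation(x, y_up)
r_neg = correlation(x, y_down)
print(f"r_pos = {r_pos:.4f}, r_neg = {r_neg:.4f}")
```

The two coefficients have the same absolute value, so the two relationships are equally strong even though they point in opposite directions.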

In certain extreme situations, all of the dots fall on a straight line. This is called a perfect correlation. Figures 8 and 9 show perfect positive and perfect negative correlations, respectively.

The line passing through all the dots in Figures 8 and 9 is the line of best fit. In real-world data analysis, the line of best fit does not necessarily pass through all the dots in a scatterplot. Instead, it is the line that lies as close as possible to the dots as a whole. The vertical distances between the dots and the line are called residuals. In statistics, the least squares method is used to obtain a regression line, which minimizes the sum of the squared residuals. The line of best fit is therefore generally also referred to as the regression line.
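The least squares idea can be sketched in a few lines. The example below uses hypothetical data and the standard closed-form solution for a single predictor (slope equal to the sum of cross-products divided by the sum of squared deviations of X). The function names `fit_line` and `sum_squared_residuals` are assumptions for illustration. It then checks that nudging the slope away from the fitted value increases the sum of squared residuals.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the dots and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
# Any other line, such as one with a slightly steeper slope, fits worse.
nudged = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, SSR={best:.2f}, nudged SSR={nudged:.2f}")
```

Note that the fitted line does not pass through every point here; it only makes the total squared vertical distance as small as possible.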

...
