
Bootstrapping is a computer-intensive, nonparametric approach to statistical inference. Rather than making assumptions about the sampling distribution of a statistic, bootstrapping uses the variability within a sample to estimate that sampling distribution empirically. This is done by randomly resampling with replacement from the sample many times in a way that mimics the original sampling scheme. There are various approaches to constructing confidence intervals with this estimated sampling distribution, which can then be used to make statistical inferences.

Goal

The goal of statistical inference is to make probability statements about a population parameter, θ, from a statistic, θ̂, calculated from sample data drawn randomly from a population. At the heart of such analysis is the statistic's sampling distribution, which is the range of values it could take on in a random sample of a given size from a given population and the probabilities associated with those values. In the standard parametric inferential statistics that social scientists learn in graduate school (with the ubiquitous Z-tests and t-values), a statistic's sampling distribution is derived using basic assumptions and mathematical analysis. For example, the central limit theorem gives one good reason to believe that the sampling distribution of a sample mean is normal in shape, with an expected value of the population mean and a standard deviation of approximately the standard deviation of the variable in the population divided by the square root of the sample size. However, there are situations in which either no such parametric statistical theory exists for a statistic or the assumptions needed to apply it do not hold. In analyzing survey data, even using well-known statistics, the latter problem may arise. In these cases, one may be able to use bootstrapping to make a probability-based inference to the population parameter.
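In symbols, the central limit theorem result described above can be written as follows. The notation is an assumption introduced here for illustration: X̄ is the sample mean, μ and σ are the population mean and standard deviation of the variable, and n is the sample size. The approximation holds for sufficiently large n.

```latex
\bar{X} \;\overset{\text{approx.}}{\sim}\; N\!\left(\mu,\ \frac{\sigma^{2}}{n}\right),
\qquad
\operatorname{SE}(\bar{X}) \approx \frac{\sigma}{\sqrt{n}}
```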

Procedure

Bootstrapping is a general approach to statistical inference that can be applied to virtually any statistic. The basic procedure has two steps: (1) estimating the statistic's sampling distribution through resampling, and (2) using this estimated sampling distribution to construct confidence intervals to make inferences to population parameters.

Resampling

First, a statistic's sampling distribution is estimated by treating the sample as the population and conducting a form of Monte Carlo simulation on it. This is done by randomly resampling with replacement a large number of samples of size n from the original sample of size n. Replacement sampling causes the resamples to be similar to, but slightly different from, the original sample, because an individual case in the original sample may appear once, more than once, or not at all in any given resample.
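The effect of replacement sampling on the makeup of a resample can be seen in a short Python sketch. The case labels and the seed are hypothetical, chosen only for illustration. The sketch draws one resample and counts how often each original case appears, then checks the average share of cases left out of a resample against the known value (1 - 1/n)^n.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical original sample of n = 8 case identifiers.
original = ["a", "b", "c", "d", "e", "f", "g", "h"]
n = len(original)

# One resample: n draws with replacement from the original sample.
resample = rng.choices(original, k=n)
counts = Counter(resample)

for case in original:
    print(case, "appears", counts[case], "time(s)")

# Across many resamples, the share of original cases missing from a given
# resample averages (1 - 1/n)**n, which approaches exp(-1) as n grows.
n_resamples = 5000
omitted = 0
for _ in range(n_resamples):
    omitted += n - len(set(rng.choices(original, k=n)))
share_omitted = omitted / (n_resamples * n)

print(f"average share of cases omitted: {share_omitted:.3f}")
print(f"theoretical value (1 - 1/n)**n: {(1 - 1 / n) ** n:.3f}")
```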

For the resulting estimate of the statistic's sampling distribution to be unbiased, resampling needs to be conducted to mimic the sampling process that generated the original sample. Any stratification, weighting, clustering, stages, and so forth used to draw the original sample need to be used to draw each resample. In this way, the random variation that was introduced into the original sample will be introduced into the resamples in a similar fashion. The ability to make inferences from complex random samples is one of the important advantages of bootstrapping over parametric inference. In addition to mimicking the original sampling procedure, resampling ought to be conducted only on the random component of a statistical model. For example, an analyst would resample the error term of a regression model to make inferences about regression parameters, unless the data are all drawn from the same source, as in the case of using data from a single survey as both the dependent and independent variables in a model. In such a case, since the independent variables have the same source of randomness and error as the dependent variable, the proper approach is to resample whole cases of data.
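The distinction between resampling the error term and resampling whole cases can be sketched in Python for a simple one-predictor regression. This is an illustration under stated assumptions rather than a prescribed method: the data are simulated, the function names are invented for this sketch, and simple random sampling is assumed, so no stratification, weighting, or clustering is mimicked.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_bootstrap_slopes(xs, ys, n_resamples, rng):
    """Hold x fixed and resample only the fitted error term."""
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    slopes = []
    for _ in range(n_resamples):
        drawn = rng.choices(residuals, k=len(xs))
        new_ys = [f + e for f, e in zip(fitted, drawn)]
        slopes.append(fit_line(xs, new_ys)[1])
    return slopes

def case_bootstrap_slopes(xs, ys, n_resamples, rng):
    """Resample whole (x, y) cases together."""
    cases = list(zip(xs, ys))
    slopes = []
    while len(slopes) < n_resamples:
        draw = rng.choices(cases, k=len(cases))
        new_xs = [x for x, _ in draw]
        if len(set(new_xs)) < 2:
            continue  # slope is undefined when every drawn x is identical
        slopes.append(fit_line(new_xs, [y for _, y in draw])[1])
    return slopes

# Simulated (hypothetical) data: y = 1 + 0.5x + noise.
rng = random.Random(7)
xs = [float(x) for x in range(1, 21)]
ys = [1.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

intercept, slope = fit_line(xs, ys)
residual_slopes = residual_bootstrap_slopes(xs, ys, 1000, rng)
case_slopes = case_bootstrap_slopes(xs, ys, 1000, rng)

print(f"fitted slope = {slope:.3f}")
print(f"residual-resampling SE = {statistics.stdev(residual_slopes):.3f}")
print(f"case-resampling SE = {statistics.stdev(case_slopes):.3f}")
```

In the residual version the independent variable keeps its observed values in every resample, which suits a design where it is treated as fixed. In the case version both variables vary from resample to resample, which matches the single-survey situation described above.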

...
