Bayesian Data Analysis

Guillermo Campitelli

doi:10.4135/9781071812082

In: The SAGE Encyclopedia of Research Design
Chapter DOI: https://doi.org/10.4135/9781071812082.n42
Subject: Research Methods & Evaluation (general), Research Design
Keywords: anxiety; Bayesian statistics; confidence intervals; data analysis; distribution; likelihood functions; parameter estimation; parameters; probability


Bayesian data analysis is an umbrella term for data analysis approaches that share two features: the use of Bayes’s rule as a guiding principle, and the goal of quantifying evidence in favor of possible parameter values, models, or hypotheses, rather than deciding whether a parameter value differs from a specific value (e.g., zero). This entry describes two approaches in detail, Bayesian parameter estimation and Bayesian hypothesis testing, and briefly describes two others: Bayesian model comparison and hierarchical Bayesian models.

Bayesian Parameter Estimation

The goal of Bayesian parameter estimation is the same as that of traditional frequentist parameter estimation: to estimate the value of a population parameter such as a mean, a difference between two means, a correlation between two variables, or a regression coefficient. The two approaches differ in what they report. In the traditional approach, the estimate consists of a point estimate and a confidence interval, typically a 95% confidence interval. In the Bayesian approach, the estimate consists of a posterior distribution, which can be summarized, for instance, by its mean and by its 2.5th and 97.5th percentiles (i.e., a 95% credible interval) or by the 95% highest-density interval.
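As a minimal sketch of the Bayesian summary described above, the following Python snippet computes the mean and the 2.5th and 97.5th percentiles from draws of a posterior distribution. The posterior used here, Beta(61, 141), is an assumption made purely for illustration (it is what a uniform prior and a hypothetical 60 positive cases out of 200 would yield); the function name `summarize_posterior` is likewise hypothetical and does not come from the entry.

```python
import random
import statistics


def summarize_posterior(samples, mass=0.95):
    """Return (mean, lower, upper) for an equal-tailed credible interval."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    # Empirical percentiles taken directly from the sorted draws.
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return statistics.fmean(ordered), lower, upper


# ASSUMPTION: an illustrative Beta(61, 141) posterior, not taken from the entry.
rng = random.Random(0)
draws = [rng.betavariate(61, 141) for _ in range(20000)]

post_mean, ci_low, ci_high = summarize_posterior(draws)
print(f"posterior mean: {post_mean:.3f}")
print(f"95% credible interval: [{ci_low:.3f}, {ci_high:.3f}]")
```

The interval reported here is the equal-tailed one based on percentiles; a highest-density interval would need a different computation and can differ when the posterior is skewed.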

The advantage of the 95% credible (or highest-density) interval in Bayesian parameter estimation is that it provides the information researchers are typically interested in: the probability that the actual value of the parameter of interest (e.g., the population mean of a variable) lies within the interval is 95%. In other words, the researcher can be 95% confident that the actual parameter value is in the interval. The traditional 95% confidence interval is more difficult to interpret: it is the interval generated by a procedure that, over repeated use, produces intervals that include the actual parameter value 95% of the time. The process of obtaining a posterior distribution for a parameter of interest consists of four steps: choosing the distribution used to create the likelihood function, choosing the prior distribution, collecting the data and computing the relevant statistics, and obtaining the posterior distribution.
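The repeated-use reading of the traditional confidence interval given above can be illustrated with a small simulation. The sketch below assumes a true proportion of 0.30, samples of 200, and the normal-approximation (Wald) interval for a proportion; none of these choices come from the entry, and the function names are hypothetical.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage(true_pi, n, repetitions, seed=0):
    """Fraction of simulated studies whose interval contains true_pi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # One simulated study: n independent yes/no observations.
        successes = sum(rng.random() < true_pi for _ in range(n))
        low, high = wald_interval(successes, n)
        hits += low <= true_pi <= high
    return hits / repetitions


# ASSUMPTION: illustrative settings, not taken from the entry.
rate = coverage(true_pi=0.30, n=200, repetitions=5000)
print(f"share of intervals containing the true value: {rate:.3f}")
```

The 95% figure describes the long-run behavior of the procedure across many samples. Any single interval either contains the true value or does not, which is why the statement is harder to apply to one study than the Bayesian credible interval.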

Choice of Distribution to Create the Likelihood Function

Let’s consider the case in which a researcher aims to estimate the proportion of people currently experiencing anxiety in a specific city by administering an anxiety scale to 200 people in that city; the scale classifies each person as experiencing anxiety (1) or not (0). The distribution used to create the likelihood function resembles the sampling distribution in the traditional approach. In this case, the binomial distribution is the most appropriate choice. It gives the probability of obtaining a specific number of people with anxiety in the sample of 200 participants, given that the proportion of people with anxiety in the population (denoted by π) takes a particular value. For example, if π = 0.50, the most probable count in the sample is 100 people with anxiety, with 99 and 101 the second most probable counts, 98 and 102 next, and so forth; counts close to 0 or 200 have an extremely low probability of occurrence. Once the data are observed, the binomial distribution is used to construct a likelihood function over parameter values. This likelihood function comes into play again when it is combined with the prior distribution to produce the posterior distribution.
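The two uses of the binomial distribution described above can be sketched in a few lines of Python. The first part fixes π = 0.50 and varies the count, as in the example; the second fixes a count and varies π, which is the likelihood function. The observed count of 60 and the grid of candidate π values are assumptions made for illustration only.

```python
import math


def binomial_pmf(k, n, pi):
    """Probability of k positive cases among n when the proportion is pi."""
    return math.comb(n, k) * pi**k * (1.0 - pi) ** (n - k)


n = 200

# Sampling view: pi is fixed at 0.50 and the count varies from 0 to 200.
probs = [binomial_pmf(k, n, 0.5) for k in range(n + 1)]
most_probable = max(range(n + 1), key=probs.__getitem__)
print("most probable count when pi = 0.50:", most_probable)

# Likelihood view: the count is fixed and pi varies.
# ASSUMPTION: 60 positive cases is a hypothetical result, not from the entry.
observed = 60
grid = [i / 100 for i in range(1, 100)]  # candidate values 0.01 to 0.99
likelihood = [binomial_pmf(observed, n, pi) for pi in grid]
best_pi = grid[likelihood.index(max(likelihood))]
print("grid value of pi with the highest likelihood:", best_pi)
```

The same formula serves both purposes. Read as a function of the count, its values sum to 1 over all possible counts; read as a function of π for a fixed count, it is a likelihood and need not sum or integrate to 1.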

...
