
The Bayesian information criterion (BIC) is a statistic used for comparison and selection of statistical models. BIC is given by a simple formula that uses only elements of standard output for fitted models. It is calculated for each model under consideration, and models with small values of BIC are then preferred for selection. The BIC formula and the sense in which the model with the smallest BIC is the “best” one are motivated by one approach to model selection in Bayesian statistical inference.

Definition

Suppose that we are analyzing a set of data D of size n. Here n is the sample size if D consists of statistically independent observations, and the “effective sample size” in some appropriate sense when the observations are not independent. Suppose that alternative models Mk are considered for D, and that each model is fully specified by a parameter vector θk with pk parameters. Let p(D | θk; Mk) denote the likelihood function for model Mk, l(θk) = log p(D | θk; Mk) the corresponding log-likelihood, and θ̂k the maximum likelihood estimate of θk.

Let Ms denote a saturated model that fits the data exactly. One form of the BIC statistic for a model Mk is

BICk = G²k − dfk log n = 2[l(θ̂s) − l(θ̂k)] − dfk log n,    (1)

where

l(θ̂s) is the log-likelihood for the saturated model, G²k = 2[l(θ̂s) − l(θ̂k)] is the deviance statistic for model Mk, and

dfk = ps − pk is its degrees of freedom.

This version of BIC is most appropriate when the idea of a saturated model is natural, such as for models for contingency tables and structural equation models for covariance structures. The deviance and its degrees of freedom are then typically included in standard output for the fitted model. In other cases, other forms of BIC may be more convenient. These variants, all of which are equivalent for purposes of model comparison, are described at the end of this entry.
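As a concrete illustration (not part of the original entry), this version of BIC can be computed directly from the deviance and degrees of freedom reported in standard output for a fitted model. The function name and the example numbers below are hypothetical.

```python
import math

def bic_from_deviance(deviance: float, df: int, n: int) -> float:
    """Compute BIC_k = G2_k - df_k * log(n) for model M_k.

    deviance -- G2_k, the deviance of M_k against the saturated model
    df       -- df_k, the degrees of freedom of the deviance
    n        -- the (effective) sample size
    """
    return deviance - df * math.log(n)

# Hypothetical deviances for two competing models fitted to n = 200
# observations; the model with the smaller BIC is preferred.
bic_a = bic_from_deviance(deviance=35.2, df=12, n=200)  # sparser model
bic_b = bic_from_deviance(deviance=20.1, df=8, n=200)   # richer model
preferred = "A" if bic_a < bic_b else "B"
```

With these illustrative numbers the larger penalty saved by the sparser model outweighs its larger deviance, so model A has the smaller BIC.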

Motivation as an Approximate Bayes Factor

The theoretical motivation of BIC is based on the idea of a Bayes factor, which is a statistic used for comparison of models in Bayesian statistical analysis. First, define for model Mk the integrated likelihood

p(D | Mk) = ∫ p(D | θk; Mk) p(θk | Mk) dθk,    (2)

where p(θk | Mk) is the density function of a prior distribution specified for the parameters θk, and the integral is over the range of possible values of θk. Defining p(D | Ms) similarly for the saturated model, the Bayes factor between models Ms and Mk is the ratio BFk = p(D | Ms)/p(D | Mk). It is a measure of the evidence provided by the data in favor of Ms over Mk. The evidence favors Ms if BFk is greater than 1 and Mk if BFk is less than 1.

BICk is an approximation of 2 log BFk. The approximation is particularly accurate when each of the prior distributions p(θk | Mk) and p(θs | Ms) is a multivariate normal distribution with a variance matrix comparable to that of the sampling distribution of the maximum likelihood estimate of the parameters based on a hypothetical sample of size n = 1. An assumption of such prior distributions, which are known as unit information priors, thus implicitly underlies the BIC of Equation 1. Their motivation and the derivation of BIC are discussed in detail in the Further Reading list below.
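To see the quality of the approximation at work, the sketch below (an illustration assumed for this rewrite, not taken from the entry) compares 2 log BF with its BIC-based approximation in a simple conjugate case where the integrated likelihoods have closed forms: n observations from N(μ, 1), comparing M0: μ = 0 against M1: μ free with a unit information prior μ ~ N(0, 1).

```python
import math
import random

def two_log_bf_exact(xs):
    """Exact 2*log BF for M1 (mu free, prior N(0,1)) vs M0 (mu = 0),
    with known unit variance.  Conjugate integration of Equation 2 gives
    2 log BF = S^2/(n+1) - log(n+1), where S = sum(xs)."""
    n, S = len(xs), sum(xs)
    return S * S / (n + 1) - math.log(n + 1)

def two_log_bf_bic(xs):
    """BIC approximation: 2*(l1_hat - l0_hat) - log(n) = S^2/n - log(n),
    since the two models differ by one parameter."""
    n, S = len(xs), sum(xs)
    return S * S / n - math.log(n)

random.seed(0)
xs = [random.gauss(0.3, 1.0) for _ in range(500)]
exact = two_log_bf_exact(xs)
approx = two_log_bf_bic(xs)
# Under the unit information prior the two quantities are close for
# moderate n, mirroring the relation BIC ~ 2 log BF described above.
```

The closed forms follow from completing the square in μ inside the integral; with other priors the exact 2 log BF would differ from the BIC value by a larger, O(1) amount.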
