
Statistical inference is the science of making conclusions, decisions, or inferences about a population based on information obtained from a sample. The procedure that leads to rejecting or not rejecting specific statements about a population is called hypothesis testing. If there is only one population under investigation, researchers conduct a one-sample test. If they are comparing two populations, they conduct a two-sample test. Additionally, there are multisample tests (also known as k-sample tests) for comparing more than two populations.

A statistical hypothesis is an assumption, statement, or inference regarding one or more parameters of a population distribution, or the type or nature of a population. Both of the following are examples of statistical hypotheses: (1) the specificities of two diagnostic tests are the same, and (2) the average score on the ABC test is 70. Statistical hypotheses are always about population parameters, while a decision to accept or reject a hypothesis is based on a test statistic derived from a sample.

Hypothesis testing is a decision-making process for evaluating claims about a population. A hypothesis test determines whether an observed value of a statistic (from a sample) differs enough from a hypothesized value of a parameter (from a population) to draw the inference that the hypothesized value is not the true value. ‘Enough’ difference is measured by using subjective judgment and statistical conventions to set, in advance, an acceptable probability of making an inference error due to sampling error.

The most commonly used current method for hypothesis testing is formally called ‘null hypothesis significance testing.’ Most people today simply refer to this method as significance testing or hypothesis testing and often use the terms interchangeably. However, this method combines the two historical procedures that are known separately as ‘significance testing’ and ‘hypothesis testing.’ In this article, unless a distinction is necessary, the term hypothesis test will refer to the current method of the ‘null hypothesis significance test.’

History

Fisher's significance test concentrates on a Type I error and the associated p value. In the 1920s, R. A. Fisher developed his significance test of a null hypothesis, a statistical inference based on deductive probabilities yielding a p value (which measures the discrepancy between the null hypothesis and the data). This test specifies only one hypothesis (the null hypothesis); the alternative hypothesis is only implicitly defined. The outcome of Fisher's test is a statement of whether significant or nonsignificant results are obtained. Significance or nonsignificance is decided by comparing the p value with a predetermined level of allowable Type I error (α).
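The comparison of a p value with a prespecified α can be sketched as follows. This is a minimal illustration, not a method prescribed in this entry: it assumes hypothetical sample scores, a known population standard deviation (so a one-sample z test applies), and the entry's example null hypothesis that the average score is 70.

```python
from statistics import NormalDist, mean

# Hypothetical data: ten test scores (illustrative values, not from the entry)
scores = [72, 75, 68, 71, 77, 74, 69, 73, 76, 70]
mu0, sigma = 70.0, 10.0          # H0: population mean is 70; sigma assumed known
alpha = 0.05                     # predetermined allowable Type I error rate

# Standardized discrepancy between the sample mean and the hypothesized mean
n = len(scores)
z = (mean(scores) - mu0) / (sigma / n ** 0.5)

# Two-sided p value: probability, under H0, of a discrepancy at least this large
p = 2 * (1 - NormalDist().cdf(abs(z)))

# The result is declared significant only if p falls below alpha
significant = p < alpha
```

With these particular values the p value is large (about 0.43), so the result would be reported as nonsignificant at the α = 0.05 level.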

In the early 1930s, Jerzy Neyman and Egon Pearson developed a method called the Neyman-Pearson hypothesis test, which requires that researchers specify two point hypotheses, as well as Type I (α) and Type II (β) error rates, in advance of conducting the experiment. These prespecifications are used to create a decision rule for rejecting or accepting the null hypothesis. Common problems encountered with this approach are that (1) most researchers are unwilling to specify values for the alternative hypothesis and (2) point hypotheses are always untrue if calculations are carried to sufficient decimal places.
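The prespecification step above can be sketched numerically. The example values here are hypothetical and assume a known population standard deviation: two point hypotheses about a mean, together with α and β, determine both the sample size needed and the cutoff that constitutes the decision rule.

```python
from math import ceil
from statistics import NormalDist

# Hypothetical prespecification (illustrative values, not from the entry):
mu0, mu1, sigma = 70.0, 75.0, 10.0   # H0: mean = 70 vs H1: mean = 75; sigma known
alpha, beta = 0.05, 0.20             # allowed Type I and Type II error rates

z_a = NormalDist().inv_cdf(1 - alpha)   # one-sided critical z for alpha
z_b = NormalDist().inv_cdf(1 - beta)    # z corresponding to power 1 - beta

# Smallest sample size meeting both error-rate requirements
n = ceil(((z_a + z_b) * sigma / (mu1 - mu0)) ** 2)

# Decision rule fixed before the experiment: reject H0 when the
# sample mean exceeds this cutoff
cutoff = mu0 + z_a * sigma / n ** 0.5
```

The point is that, unlike Fisher's procedure, everything needed for the accept/reject decision (n and the cutoff) is fixed before any data are collected.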

...
