A p value expresses the probability of obtaining a statistical result at least as extreme as the one observed if chance alone were at work. Analysis packages such as SAS, SPSS, and Stata automatically produce p values for many statistical procedures, and they are commonly cited in epidemiological and medical research as evidence that results should be considered significant. For instance, a clinical trial concerning the effect of two different drugs on a medical outcome would almost certainly cite one or more p values as evidence of the influence, or lack thereof, of the drugs on the outcome. In this example, if the two drugs do not differ in their effect, we would expect the outcome to be the same in each group. The expected difference between the two groups would therefore be zero, and we would be interested in determining whether the difference actually observed could be attributed to chance or whether it is likely to indicate a true difference between the groups.

Generally speaking, the greater the difference between the expected and observed results, the smaller the p value and, therefore, the less likely it is that the difference is due to chance (random variation). As chance becomes a less plausible explanation for the difference, it gives way to an alternative explanation: that the difference is due not to chance but to the expected value being wrong.

We will illustrate the concept of the p value using the simple example of 10 tosses of a coin. Absent other information, it is reasonable to believe the coin is fair: There is an equal chance of either heads or tails on each toss. Following this line of reasoning, it is logical to expect 5 heads from 10 tosses, so the expected value of heads is 5. On 10 tosses, however, a variety of outcomes are not at all remarkable. For example, 6 heads (H) and 4 tails (T), 5H and 5T, or 4H and 6T do not surprise or call into question the reasonable presumption of a fair coin. Larger departures from 5H and 5T, however, do raise doubt, and the doubt increases with the size of the departure.

Consider the result 8H and 2T. Such an outcome is not expected and causes doubt about the fairness of the coin (or process). Certainly, chance could have produced the outcome, but it is unlikely. The p value is the measure of that chance. The probability of obtaining exactly this result by chance is derived from the binomial distribution and is .0439. This calculation assumes that the coin is fair, that is, that the probability of heads on each toss is .5. Therefore, the p value calculation is a conditional probability, with the condition being that the expected value is true. Such an assumption is important because we are determining the probability of the observed difference (departure) from the expected value if the expected value were correct in the first place.
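The binomial calculation behind the .0439 figure can be sketched in a few lines of Python. This is a minimal illustration, assuming a fair coin and independent tosses; the function name `binomial_pmf` is hypothetical, and only the standard library is used.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials,
    each with success probability p (p = .5 assumes a fair coin)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 8 heads in 10 tosses of a fair coin: 45/1024.
prob_8_heads = binomial_pmf(8, 10)
print(f"{prob_8_heads:.4f}")  # 0.0439
```

Changing `p` shows how strongly the result depends on the fairness assumption, which is the condition under which the probability is computed.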

In statistics, we are usually concerned with the probability not of achieving a particular result but of obtaining results at least as extreme as our result. In our example, the expected result is 5H and 5T, so we consider deviations from that expectation to be more extreme the less likely they are. So 9H and 1T is even less likely than 8H and 2T (the probability of 9H and 1T, given a fair coin, is .0098), and 10H and 0T is yet more extreme (with a probability of .0010). To calculate the probability of a result at least as extreme as 8H and 2T, we add together these probabilities: .0439 + .0098 + .0010 = .0547.

A final comment on this illustration: The calculation of .0547 is based on a one-sided p value. In other words, it considers only the probability of getting 8 or more heads in 10 tosses of a fair coin and ignores the probability of getting 8 or more tails, which would be equally extreme. If we would consider a deviation toward either more heads or more tails to be a significant result, then the p value should be a two-sided calculation incorporating all the outcomes at least as extreme as the one observed. In our illustration, the set of outcomes would number six: 10H and 0T; 9H and 1T; 8H and 2T; 2H and 8T; 1H and 9T; 0H and 10T. The probability would be the sum of the probabilities of these six mutually exclusive outcomes: .0010 + .0098 + .0439 + .0439 + .0098 + .0010 = .1094. (This is exactly two times the one-sided p value because the binomial distribution is symmetrical when the probability of a success is .5, the expected probability of an H when assuming a fair coin.) Thus, there is approximately an 11% chance (.1094) that 10 tosses of a fair coin would produce any one of these six "extreme" outcomes.
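The one-sided and two-sided sums for the coin example can be checked with a short Python sketch. It assumes a fair coin and 10 independent tosses; the helper name `tail_probability` is hypothetical and not taken from any statistics package.

```python
from math import comb

def tail_probability(head_counts, n=10, p=0.5):
    """Sum the binomial probabilities of the listed head counts
    for n tosses (p = .5 assumes a fair coin)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in head_counts)

# One-sided: 8, 9, or 10 heads.
one_sided = tail_probability(range(8, 11))

# Two-sided: 8 or more heads, or 8 or more tails (2 or fewer heads).
two_sided = tail_probability([0, 1, 2, 8, 9, 10])

print(f"{one_sided:.4f}")  # 0.0547
print(f"{two_sided:.4f}")  # 0.1094
```

The two-sided value is exactly double the one-sided value here only because the distribution is symmetrical at p = .5; with any other value of `p` the two tails would differ.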

...
