Alpha, Significance Level of a Test

Michael S. Lewis-Beck; Alan Bryman; Tim Futing Liao

doi:10.4135/9781412950589


Edited by: Michael S. Lewis-Beck, Alan Bryman & Tim Futing Liao
In: The SAGE Encyclopedia of Social Science Research Methods
Chapter DOI: https://doi.org/10.4135/9781412950589.n11
Subject: Research Methods


The development of significance testing is attributed to Ronald Fisher. His work in agrarian statistics in the 1920s led him to choose the significance threshold of .05, a level that has been accepted by research communities in many fields. The underlying motivations for the acceptance of hypothesis testing and the p value have been reviewed (Goodman, 1999). However, it must be emphasized that no mathematical theory points to .05 as the optimum Type I error level; the choice rests on tradition alone.

The alpha error level is the level prospectively chosen during the design phase of the research; the p value is the error level computed from the research data. An alpha level states how much risk the investigators will accept that random sampling error alone has produced the result of the study, so it may be set at levels other than .05. The selection of an alpha level greater than .05 suggests that the investigators are willing to accept a greater level of Type I error to increase the likelihood of identifying a positive effect in their research. Alternatively, the consequences of finding a positive effect in the research effort may be so important or cause such upheaval that the investigators set stringent requirements for the alpha error rate (e.g., α = .01). Each of these decisions is appropriate if it (a) is made prospectively and (b) is adequately explained to the scientific community.
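The distinction between a prospectively chosen alpha and a computed p value can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not part of the original entry: it assumes a two-sided z test with a known standard deviation of 1, a true null hypothesis, and hypothetical function names. Under those assumptions, the share of simulated studies whose p value falls below alpha should be close to alpha itself.

```python
import random
from statistics import NormalDist, mean


def two_sided_p_value(sample, sigma=1.0, mu0=0.0):
    """Two-sided p value for a z test of H0: mean == mu0 (sigma assumed known)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha, trials=5000, n=20, seed=0):
    """Fraction of simulated null studies that reject H0 at the chosen alpha."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rejections = 0
    for _ in range(trials):
        # Data are drawn with the null hypothesis true (population mean 0).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p_value(sample) < alpha:
            rejections += 1
    return rejections / trials


rate_05 = type_i_error_rate(0.05)  # alpha fixed before any data are seen
rate_01 = type_i_error_rate(0.01)  # a more stringent prospective choice
print(f"alpha = .05: rejection rate under the null = {rate_05:.3f}")
print(f"alpha = .01: rejection rate under the null = {rate_01:.3f}")
```

Because alpha is fixed before the data are generated, it controls the long-run rate of false positives; each individual p value is simply compared against it.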

One adaptation of significance testing has been the use of one-tailed testing. However, one-sided testing has been a subject of contention in medical research and in the social sciences as well. An adaptation of two-tailed testing, which leads to nontraditional alpha error levels, is the implementation of asymmetric regions of significance. When testing for the effect of an intervention in improving education in elementary school age children, it is possible that the intervention may have a paradoxical effect, reducing rather than increasing testing scores. In this situation, the investigator can divide the Type I error rate so that 80% of the available Type I error level (.04 if the total alpha error level is .05) is placed in the "harm" tail of the distribution and the remaining 20% (.01) is placed in the "benefit" tail. The prospective declaration of such a procedure is critical and, when in place, demonstrates the investigator's sensitivity to the possibility that the intervention may be unpredictably harmful. Guidelines for the development of such asymmetric testing have been produced (Moyé, 2000).
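The asymmetric split described above can be turned into critical values. The following is a minimal sketch, assuming a standard normal (z) test statistic in which negative values indicate harm (lower test scores) and positive values indicate benefit; the function names are hypothetical and the procedure in Moyé (2000) may differ in detail.

```python
from statistics import NormalDist


def asymmetric_critical_values(alpha_harm, alpha_benefit):
    """Return (lower, upper) z cutoffs for an asymmetric two-tailed test."""
    std_normal = NormalDist()
    lower = std_normal.inv_cdf(alpha_harm)            # cutoff for the harm tail
    upper = std_normal.inv_cdf(1.0 - alpha_benefit)   # cutoff for the benefit tail
    return lower, upper


def classify(z, lower, upper):
    """Label an observed z statistic against the prospectively set cutoffs."""
    if z <= lower:
        return "significant harm"
    if z >= upper:
        return "significant benefit"
    return "not significant"


total_alpha = 0.05
alpha_harm = 0.8 * total_alpha             # .04 in the harm tail
alpha_benefit = total_alpha - alpha_harm   # .01 in the benefit tail

lower, upper = asymmetric_critical_values(alpha_harm, alpha_benefit)
print(f"harm cutoff: z <= {lower:.3f}; benefit cutoff: z >= {upper:.3f}")
print(classify(-1.80, lower, upper))
print(classify(2.00, lower, upper))
```

Under these assumptions the harm cutoff is less extreme than the symmetric value of about 1.96 in absolute terms, while the benefit cutoff is more extreme, so a z of 2.00 that would be significant in a symmetric test at .05 is not significant here.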

When investigators carry out several hypothesis tests within a clinical trial, the family-wise or overall Type I error level increases. A research effort that produces two statistical hypotheses resulting in p values at the .045 level may seem to suggest that the research sample has produced two results that are very likely to reflect findings in the larger population from which the sample was derived. However, an important issue is the likelihood that at least one of these conclusions is wrong. Assuming each of the two tests is independent, then the probability that at least one of these analyses produces a misleading result through sampling error alone is 1 − (.955)² = .088. This can be controlled by making adjustments in the Type I error levels of the individual tests (e.g., Bonferroni techniques), which are useful in reducing the alpha error level for each statistical hypothesis test that was carried out. As an example, an investigator who is interested in examining the effect of a smoking cessation program on teenage smoking may place a Type I error rate of .035 on the effect of the intervention on self-reports of smoking and place the remaining .015 on the intervention's effect on attitudes about smoking. This approach maintains the overall error rate at .05 while permitting a positive result for the effect of therapy on either of these two endpoint measurements of smoking.
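The arithmetic in this paragraph can be checked directly. The sketch below assumes independent tests, as the text does, and uses hypothetical function names; it reproduces the .088 figure and compares an equal Bonferroni split with the unequal .035/.015 allocation from the smoking example.

```python
def familywise_error_rate(alphas):
    """P(at least one Type I error) across independent tests run at the given levels."""
    prob_no_error = 1.0
    for alpha in alphas:
        prob_no_error *= 1.0 - alpha
    return 1.0 - prob_no_error


def bonferroni_bound(alphas):
    """Upper bound on the family-wise rate; holds with or without independence."""
    return sum(alphas)


# Two independent tests, each at the .045 level, as in the text.
two_tests = familywise_error_rate([0.045, 0.045])
print(f"two tests at .045: {two_tests:.3f}")

# Equal Bonferroni split of a .05 budget across two endpoints.
equal_split = [0.05 / 2, 0.05 / 2]
# Unequal split from the smoking example: self-reports and attitudes.
unequal_split = [0.035, 0.015]

for label, alphas in (("equal", equal_split), ("unequal", unequal_split)):
    fwer = familywise_error_rate(alphas)
    bound = bonferroni_bound(alphas)
    print(f"{label} split: family-wise rate = {fwer:.4f}, bound = {bound:.3f}")
```

Either allocation keeps the overall rate at or below .05 because the individual levels sum to .05; the unequal split simply spends more of the budget on the endpoint the investigator considers most important.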

...
