
Newman–Keuls Test and Tukey Test

An analysis of variance (ANOVA) indicates whether several means come from the same population. Such a procedure is called an omnibus test because it tests the whole set of means at once (omnibus means “for all” in Latin). In an ANOVA omnibus test, a significant result indicates that at least two groups differ from each other, but it does not identify which groups differ. An ANOVA is therefore generally followed by an analysis whose goal is to identify the pattern of differences in the results. This analysis is often performed by evaluating all the pairs of means to decide which ones show a significant difference. In a general framework, this approach, which is called pairwise comparison, is a specific case of an “a posteriori contrast analysis,” but it is specific enough to be studied in itself. Two of the most common methods of pairwise comparison are the Tukey test and the Newman–Keuls test. Both tests are based on the “Studentized range,” or “Student's q.” They differ in that the Newman–Keuls test is a sequential test designed to have more power than the Tukey test.
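To make the idea of pairwise comparison concrete, the short Python sketch below enumerates every pair among A group means; with A groups there are A(A − 1)/2 such pairs. The group labels, the mean values, and the function name are hypothetical and chosen only for illustration. The sketch lists the differences and does not test them for significance.

```python
from itertools import combinations

# Hypothetical group means (illustrative values, not from any real study).
group_means = {"G1": 30.0, "G2": 35.0, "G3": 38.0, "G4": 41.0}


def pairwise_differences(means):
    """Return (label_i, label_j, absolute difference) for every pair of groups."""
    return [
        (a, b, abs(means[a] - means[b]))
        for a, b in combinations(means, 2)
    ]


pairs = pairwise_differences(group_means)

# List the pairs from the largest to the smallest difference.
for a, b, diff in sorted(pairs, key=lambda p: p[2], reverse=True):
    print(f"{a} vs {b}: |difference| = {diff:.1f}")
```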

Choosing between the Tukey and Newman–Keuls tests is not straightforward, and there is no consensus on this issue. The Newman–Keuls test is most frequently used in psychology, whereas the Tukey test is most commonly used in other disciplines. An advantage of the Tukey test is that it keeps the level of the Type I error (i.e., finding a difference when none exists) equal to the chosen alpha level (e.g., α = .05 or α = .01). An additional advantage of the Tukey test is that it allows the computation of confidence intervals for the differences between the means. Although the Newman–Keuls test has more power than the Tukey test, the exact probability of making a Type I error with the Newman–Keuls test cannot be computed because of the sequential nature of this test. In addition, because the criterion changes for each level of the Newman–Keuls test, confidence intervals cannot be computed around the differences between means. Therefore, the choice between the Tukey and Newman–Keuls tests depends on whether additional power is required to detect significant differences between means.

Studentized Range and Student's q

Both the Tukey and Newman–Keuls tests use a sampling distribution derived by William Gosset (who was working for Guinness and decided to publish under the pseudonym of “Student” because of Guinness's confidentiality policy). This distribution, which is called the Studentized range or Student's q, is similar to a t-distribution. It corresponds to the sampling distribution of the largest difference between two means coming from a set of A means (when A = 2, the q distribution corresponds to the usual Student's t).

In practice, one computes a criterion denoted q_observed, which evaluates the difference between the means of two groups. This criterion is computed as

$$q_{\text{observed}} = \frac{M_i - M_j}{\sqrt{MS_{\text{error}}\left(\frac{1}{S}\right)}}$$

where M_i and M_j are the group means being compared, MS_error is the mean square error from the previously computed ANOVA (i.e., the mean square used for the denominator of the omnibus F ratio), and S is the number of observations per group (the groups are assumed to be of equal size).
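As a minimal sketch of this computation, assuming the standard Studentized range statistic for equal group sizes, q_observed = (M_i − M_j) / sqrt(MS_error / S), the Python functions below evaluate the criterion for one pair of means. The function names and all numerical values are hypothetical. The absolute value is taken so that the order of the two means does not matter. The sketch also illustrates the two-group case: with equal group sizes, q equals √2 times the usual t statistic computed from the same MS_error.

```python
import math


def q_observed(mean_i, mean_j, ms_error, s):
    """Studentized range criterion for two group means (equal group size s)."""
    return abs(mean_i - mean_j) / math.sqrt(ms_error / s)


def t_statistic(mean_i, mean_j, ms_error, s):
    """Two-group t statistic using the same error mean square (equal group size s)."""
    return abs(mean_i - mean_j) / math.sqrt(2 * ms_error / s)


# Hypothetical values for illustration only.
M_i, M_j = 41.0, 30.0   # the two group means being compared
MS_error = 80.0         # mean square error from the omnibus ANOVA
S = 10                  # observations per group

q = q_observed(M_i, M_j, MS_error, S)
t = t_statistic(M_i, M_j, MS_error, S)

print(f"q_observed = {q:.3f}")
print(f"t = {t:.3f}, sqrt(2) * t = {math.sqrt(2) * t:.3f}")
```

Deciding whether q_observed is significant requires a critical value from the Studentized range distribution, which this sketch does not compute.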

...
