Intracluster Homogeneity

Intracluster (or intraclass) homogeneity refers to the degree of similarity among elements in the same cluster. The intracluster (or intraclass) correlation coefficient, ρ, measures this homogeneity among population elements within the sampling clusters. It is computed as the Pearson correlation coefficient between all pairs of distinct elements that belong to the same cluster.
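The pairwise definition above can be sketched in a few lines of Python. This is a minimal illustration, not a method given in the entry: it assumes a fully enumerated population with equal-size clusters, and the function name `intracluster_correlation` and the toy data are hypothetical.

```python
def intracluster_correlation(clusters):
    """Pairwise intracluster correlation for a population of equal-size clusters.

    Assumption: `clusters` is a list of lists of numbers, all of the same
    length m >= 2, covering the whole population.
    """
    m = len(clusters[0])
    if m < 2 or any(len(c) != m for c in clusters):
        raise ValueError("clusters must all have the same size, at least 2")

    values = [y for c in clusters for y in c]
    mean = sum(values) / len(values)
    total_ss = sum((y - mean) ** 2 for y in values)
    if total_ss == 0:
        raise ValueError("rho is undefined when all elements are identical")

    # Sum of cross-products over all ordered pairs j != k within each cluster,
    # using the identity sum_{j != k} d_j * d_k = (sum_j d_j)^2 - sum_j d_j^2.
    cross = 0.0
    for c in clusters:
        devs = [y - mean for y in c]
        cross += sum(devs) ** 2 - sum(d * d for d in devs)

    # Each element appears in (m - 1) pairs, hence the (m - 1) factor.
    return cross / ((m - 1) * total_ss)


# Hypothetical toy populations with two clusters of three elements each.
rho_identical = intracluster_correlation([[1, 1, 1], [5, 5, 5]])  # 1.0
rho_mixed = intracluster_correlation([[1, 2, 3], [4, 5, 6]])      # 23/35, about 0.657
rho_spread = intracluster_correlation([[1, 2, 3], [1, 2, 3]])     # -0.5
```

In the first population every cluster is internally identical, in the last every cluster reproduces the full spread of the population, and the middle case sits between the two.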

In terms of the variance components in an analysis of variance (ANOVA), intracluster homogeneity measures the extent to which the total element variance in the population is due to the between-cluster variance. In other words, ρ measures intracluster homogeneity as the portion of the total variance that is attributable to cluster membership. When there is complete homogeneity within clusters, the between-cluster variance accounts for all the variance in the population and ρ is equal to 1.0. When there is complete heterogeneity within clusters, the within-cluster variance accounts for all the variance in the population and ρ takes its most negative value, -1/(m - 1), where m is the cluster size; that is, the negative of the inverse of the cluster size minus 1.0. Finally, when the clusters are composed of elements drawn at random from the population, with no relationship to each other, ρ is expected to be zero.
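The link between the pairwise definition and the variance components can be written out explicitly. The notation below is an assumption made for illustration and does not come from the entry: N clusters of equal size m, element values y_ij, population mean Ȳ, and the usual decomposition of the total sum of squares into between-cluster and within-cluster parts.

```latex
% Assumed notation: N clusters of equal size m, values y_{ij},
% population mean \bar{Y}, and SS_T = SS_B + SS_W.
\[
\rho
  = \frac{\sum_{i=1}^{N} \sum_{j \neq k} (y_{ij} - \bar{Y})(y_{ik} - \bar{Y})}
         {(m - 1) \sum_{i=1}^{N} \sum_{j=1}^{m} (y_{ij} - \bar{Y})^{2}}
  = 1 - \frac{m}{m - 1} \cdot \frac{SS_W}{SS_T}
\]
% Limiting cases:
\[
SS_W = 0 \;\Rightarrow\; \rho = 1,
\qquad
SS_B = 0 \;(SS_W = SS_T) \;\Rightarrow\; \rho = 1 - \frac{m}{m - 1} = -\frac{1}{m - 1},
\qquad
\frac{SS_W}{SS_T} = \frac{m - 1}{m} \;\Rightarrow\; \rho = 0 .
\]
```

Under these assumptions, ρ falls as the within-cluster share of the total variation rises, which matches the three cases described above.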

In practice, the intracluster correlation coefficient typically is positive, but usually not very close to 1.0. This implies that there is some homogeneity within clusters, with elements from the same cluster being more similar to each other than elements selected at random from the population. In these cases, cluster sampling is less efficient than simple random sampling of the same size, so some other gain, such as cost savings, is needed to justify the loss of efficiency.
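The efficiency loss is often quantified with the design effect, a standard approximation that this entry does not itself state: for clusters of equal size m, deff = 1 + (m - 1)ρ, the factor by which the variance of a cluster-sample mean exceeds that of a simple random sample of the same size. The sketch below assumes that approximation; the function names and the numbers used are hypothetical.

```python
def design_effect(cluster_size, rho):
    """Approximate design effect for equal-size clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * rho


def effective_sample_size(n, cluster_size, rho):
    """Size of a simple random sample with roughly the same precision."""
    return n / design_effect(cluster_size, rho)


# Hypothetical design: 2,000 students sampled as 100 schools of 20 students,
# with a modest intracluster correlation of 0.05.
deff = design_effect(cluster_size=20, rho=0.05)                  # about 1.95
n_eff = effective_sample_size(n=2000, cluster_size=20, rho=0.05)  # about 1026
```

Even a small positive ρ roughly halves the effective sample size here, because the loss grows with the number of elements taken per cluster.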

Cluster sampling is frequently used in practice because it is often not feasible to compile sampling frames that consist of all population elements, especially when sampling large human populations. In addition, the costs of face-to-face interview data collection are often prohibitive when sampling large human populations that are geographically dispersed.

For example, a complete sampling frame of all K-12 public school students in the United States does not exist, and it would be prohibitively expensive for any survey organization to construct such a sampling frame. On the other hand, a complete frame of all K-12 public schools in the United States may be available from various sources, and a complete frame of students within each school is usually available. Therefore, a sample of students may be selected in two stages. In the first stage, a sample of schools is selected from the frame of all schools. In the second stage, a sample of students is selected from the frame of all students within each selected school.

Under this sample design, each school constitutes a sampling cluster, and the final sample consists of all sampled students from all sampled schools. This two-stage cluster sample design may be expanded to incorporate additional sampling stages. For example, one possible four-stage design is to select school districts in the first stage, schools in the second stage, classrooms in the third stage, and students in the fourth and final stage. Thus, cluster sampling allows the sample to be selected in successive stages. The sampling frame at each stage is either readily available or can be conveniently constructed.
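The two-stage school and student design described above can be sketched as follows. The entry does not say how units are selected at each stage, so this example assumes equal-probability simple random sampling without replacement at both stages; the function `two_stage_sample`, the frame layout, and the school and student labels are all hypothetical.

```python
import random


def two_stage_sample(school_rosters, n_schools, n_students, seed=None):
    """Select schools, then students within each selected school.

    Assumption: `school_rosters` maps a school identifier to the list of
    students enrolled there (the second-stage frame for that school).
    """
    rng = random.Random(seed)

    # Stage 1: simple random sample of schools from the frame of all schools.
    selected_schools = rng.sample(sorted(school_rosters), n_schools)

    sample = {}
    for school in selected_schools:
        roster = school_rosters[school]
        # Stage 2: simple random sample of students from this school's roster
        # (take every student if the roster is smaller than requested).
        k = min(n_students, len(roster))
        sample[school] = rng.sample(roster, k)
    return sample


# Hypothetical frame: 10 schools with 30 students each.
frame = {
    f"school_{s:02d}": [f"s{s:02d}_student_{i:02d}" for i in range(30)]
    for s in range(10)
}
sample = two_stage_sample(frame, n_schools=3, n_students=5, seed=1)
```

Only the list of schools and the rosters of the selected schools are needed, which is the practical appeal of the design: no frame of all students is ever compiled. Further stages, such as districts or classrooms, would nest the same selection step one level deeper.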

...
