The Effect of Cluster Randomization on Sample Size in Prevention Research

The Journal of Family Practice. 2001 March;50(03):242

March 1, 2001|Family Medicine

In the example presented, the ICC for the outcome measure “up-to-datedness” was approximately 0.04, whereas the ICC for inappropriateness was 0.18. The required sample size would be 21 physicians per group for “up-to-datedness” and 25 physicians per group for inappropriateness. If instead the study dealt with improving smoking cessation counseling or reducing chest x-rays in smokers, the required sample size would be 27 or 42 physicians per group. Treating the unit of analysis and the unit of randomization as the same, that is, ignoring clustering, would require only 19 physicians per group.
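
The inflation from 19 physicians per group can be illustrated with the standard design effect for cluster randomization, 1 + (m - 1) × ICC, where m is the mean number of physicians per practice. The sketch below is illustrative only: the mean cluster size of 3 is an assumed value that is not reported in this section, and the function names are hypothetical, so the results are not expected to reproduce the figures above exactly.

```python
import math


def design_effect(icc: float, mean_cluster_size: float) -> float:
    """Standard variance inflation factor for cluster randomization."""
    if mean_cluster_size < 1:
        raise ValueError("mean_cluster_size must be at least 1")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflated_sample_size(n_unadjusted: int, icc: float,
                         mean_cluster_size: float) -> int:
    """Multiply the unadjusted per-group size by the design effect, rounding up."""
    return math.ceil(n_unadjusted * design_effect(icc, mean_cluster_size))


if __name__ == "__main__":
    N_UNADJUSTED = 19          # per-group size when clustering is ignored (from the text)
    ASSUMED_CLUSTER_SIZE = 3.0  # ASSUMPTION: not reported in this section
    for measure, icc in [("up-to-datedness", 0.04), ("inappropriateness", 0.18)]:
        n = inflated_sample_size(N_UNADJUSTED, icc, ASSUMED_CLUSTER_SIZE)
        print(f"{measure}: ICC={icc}, physicians per group={n}")
```

Because the design effect grows linearly with both the ICC and the cluster size, a measure with a larger ICC demands a larger sample even when the practices themselves are unchanged.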

Campbell and colleagues¹⁹ looked at a number of primary and secondary care study data sets and found that ICCs for measures in primary care were generally between 0.05 and 0.15. In contrast, in this study the ICCs ranged from 0.005 to 0.66, depending on the measure. The difference in ICC between measures and across studies is interesting, and we can only speculate why some measures show more interdependence. It is possible that inappropriateness taps phenomena such as policies at the practice level, which physicians cannot easily influence, while up-to-datedness may help explain how physicians, even when working in the same practice setting, behave independently when it comes to delivering recommended preventive care. It is important not to assume that because one measure shows independence, all measures under study show the same independence. For example, blood pressure measurement and urine proteinuria screening differ in terms of ICC. Differences between outcome measures should be taken into account when calculating the required sample size, and in the statistical analysis when the unit of randomization and the unit of analysis are not the same.
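
To make the measure-by-measure comparison concrete, the sketch below computes an ICC for a single outcome from physician-level scores grouped by practice, using the one-way analysis of variance estimator. This is one common estimator and is shown as an assumption: this section does not state which estimator the authors used, and the function name and input layout are hypothetical.

```python
from typing import List


def anova_icc(clusters: List[List[float]]) -> float:
    """One-way ANOVA estimate of the ICC.

    clusters: one inner list of physician-level scores per practice.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    if k < 2 or n_total <= k or min(sizes) < 1:
        raise ValueError("need at least 2 non-empty clusters and more physicians than clusters")

    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]

    # Between-cluster and within-cluster sums of squares.
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ssw = sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)

    # Adjusted mean cluster size, which allows for unequal cluster sizes.
    m0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denominator = msb + (m0 - 1) * msw
    if denominator == 0:
        raise ValueError("ICC is undefined when there is no variation in the data")
    return (msb - msw) / denominator
```

Running such a function separately for each outcome measure is what reveals differences like those described above; a single ICC taken from one measure should not be carried over to the others.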

Limitations

This research has 2 limitations. First, analysis of respondents and nonrespondents to the recruitment effort showed that the study participants were more likely to be younger and to be women. This implies that our findings may not be generalizable to the HSO population as a whole. Second, the measures of preventive performance were based on a chart audit and as a consequence are susceptible to the potential problems associated with chart documentation. A low level of preventive performance does not necessarily mean that prevention is not being practiced or that it is being performed inconsistently within a group practice. It may instead indicate that a less sophisticated documentation process is being used.

Conclusion

Physicians clustered together in the same practice do not necessarily deliver preventive services equally. As demonstrated by the measure “up-to-datedness,” there is relatively little correlation among physicians working together in the performance of many preventive maneuvers. For some maneuvers, most notably those that may be automatically performed as part of practice policy, there is modest correlation among physicians who work together. We hope that these findings assist other researchers in deciding whether to adjust sample sizes for the effect of clustering.

References