Cross-sectional surveys of disease prevalence, including those for tuberculosis (TB), often use a sampling procedure with two or more stages. Sampling clusters of people at random from all possible clusters, rather than sampling individuals directly, reduces the logistical costs of the survey. However, clustering also increases the statistical uncertainty in the estimate of prevalence, so the reduction in cost must be balanced against the increase in uncertainty. Here we describe cluster sampling and consider ways to determine the optimal survey design, as well as the extent to which deviations from the optimal design matter. We illustrate the results using data from a recent survey in Cambodia in which TB was diagnosed using sputum smears, cultures and X-rays.
Document's year of publication: 2006-2010