Chapter 7 Covariance Matrix

The current convey software does not account for the covariance across groups when using linearization-based variance estimators (that is, with design objects created using svydesign). Under ideal circumstances, the software would account for this covariance properly. Modifying it to do so under linearization remains a future goal; in the meantime, users can rely on resampling-based variance estimation for this purpose. Converting a svydesign object into a replicate-weights design with survey::as.svrepdesign is a practical way around this limitation.
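The workaround described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the convey authors' own example: it uses the `apiclus1` data shipped with the survey package, `svymean` as a stand-in for a convey estimator, and the default replication type chosen by `as.svrepdesign` (a jackknife for this design). The object names `des_lin`, `des_rep`, `est` and `V` are hypothetical.

```r
library(survey)

# Example data shipped with the survey package (assumed here for illustration)
data(api)

# Linearization-based design: one-stage cluster sample of school districts
des_lin <- svydesign(ids = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# Convert to a replicate-weights design; the default type is used here,
# and type = "bootstrap" is one of the alternatives
des_rep <- as.svrepdesign(des_lin)

# Estimates by group (school type), keeping the covariances across groups
est <- svyby(~api00, ~stype, des_rep, svymean, covmat = TRUE)

# Full covariance matrix: variances on the diagonal,
# covariances between groups off the diagonal
V <- vcov(est)
V
```

With `covmat = TRUE`, the off-diagonal entries of `V` are estimated from the replicates rather than assumed to be zero, which is what inference on differences between groups requires.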

Accounting for the covariance between estimates is particularly important for inference about net changes. Many countries use rotating panel schemes with overlapping samples for their labour surveys, meaning that part of the sample is interviewed again in the next round. For example, the PNADC and the Basic Monthly CPS both use a rotating panel scheme, where a household is interviewed a number of times before dropping out of the sample. For net changes, the overlapping samples tend to produce positive covariances between estimates from successive rounds, so accounting for that covariance can improve inference by producing narrower, less conservative confidence intervals.

The practical implications can be made clear by studying the variance of a difference. Consider two estimators \(\widehat{T}_1\) and \(\widehat{T}_2\), where we are interested in making inferences about \(T_1 - T_2\). We can estimate this difference using \(\widehat{T}_1 - \widehat{T}_2\). The variance of this estimated difference is given by

\[ Var \big( \widehat{T}_1 - \widehat{T}_2 \big) = Var \big( \widehat{T}_1 \big) + Var \big( \widehat{T}_2 \big) - 2 Cov \big( \widehat{T}_1 , \widehat{T}_2 \big) \]

where \(Var \big( \widehat{T}_1 \big)\) and \(Var \big( \widehat{T}_2 \big)\) are the variances of the estimators \(\widehat{T}_1\) and \(\widehat{T}_2\), and \(Cov \big( \widehat{T}_1 , \widehat{T}_2 \big)\) is the covariance between them. Usually, the estimators are assumed to be independent, so the covariance term is ignored. But when the covariance is strongly positive, ignoring it results in an overly conservative variance estimator for the difference. The covariance can also be negative; in that case, the variance of the difference is larger than the sum of the variances. While this is theoretically possible, we have not yet observed it in practice.
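A small worked example may help fix ideas. The numbers are hypothetical and chosen only for easy arithmetic: suppose \(Var \big( \widehat{T}_1 \big) = Var \big( \widehat{T}_2 \big) = 4\) and \(Cov \big( \widehat{T}_1 , \widehat{T}_2 \big) = 3\). Then

\[ Var \big( \widehat{T}_1 - \widehat{T}_2 \big) = 4 + 4 - 2 \times 3 = 2 \]

whereas ignoring the covariance would give \(4 + 4 = 8\). The corresponding standard errors are \(\sqrt{2} \approx 1.41\) and \(\sqrt{8} \approx 2.83\), so in this illustration the confidence interval that ignores the covariance is twice as wide as the one that accounts for it.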