7.1 Using the survey package

Influence functions and “resampling” replicates can be used to improve inference about differences and changes between estimates, because they make it possible to account for the covariance between the estimates being compared.

The survey package already provides a way to estimate the variance-covariance matrix of a set of domain estimates: the svyby function, called with covmat = TRUE. Based on the examples in ?survey::svyby, we have:

# load the survey library
library(survey)

# load data set
data( api )

# declare sampling design
dclus1 <- svydesign( id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc )

# estimate means by school type, keeping the covariances between domains
mns <- svyby( ~api99, ~stype, dclus1, svymean, covmat = TRUE )

# collect variance-covariance matrix of estimates
( m <- vcov( mns ) )
##          E         H         M
## E 520.5973  573.0404  684.6562
## H 573.0404 1744.2317  747.1989
## M 684.6562  747.1989 1060.1954
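# compute variance terms for the E and M means:
# sum of the two variances (ignores the covariance)
var.naive <- sum( diag( m[c(1,3),c(1,3)] ) )
# sum of the two off-diagonal entries, i.e. twice the covariance
cov.term <- sum( diag( m[ c(1,3),c(3,1)] ) )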

# "naive" SE of the difference
sqrt( var.naive )
## [1] 39.75918
# SE of the difference, accounting for the covariance
sqrt( var.naive - cov.term )
## [1] 14.54236
#... or using svycontrast
svycontrast( mns , c(E = 1, M = -1) )
##          contrast     SE
## contrast -0.80833 14.542
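
As a cross-check, the contrast's standard error can also be reproduced by hand as a quadratic form in the variance-covariance matrix. The following is only a minimal sketch in base R: the matrix is hard-coded from the rounded vcov output printed above, and the names m.rounded, cvec, var.diff and se.diff are illustrative, not part of the survey package.

```r
# variance-covariance matrix of the three domain means,
# copied from the rounded vcov( mns ) output above
m.rounded <- matrix(
  c( 520.5973,  573.0404,  684.6562,
     573.0404, 1744.2317,  747.1989,
     684.6562,  747.1989, 1060.1954 ),
  nrow = 3, byrow = TRUE,
  dimnames = list( c("E","H","M"), c("E","H","M") )
)

# contrast vector for the difference E - M (H gets zero weight)
cvec <- c( E = 1, H = 0, M = -1 )

# variance of the contrast as the quadratic form c' V c
var.diff <- drop( t( cvec ) %*% m.rounded %*% cvec )

# SE of the difference; should agree with svycontrast up to rounding
se.diff <- sqrt( var.diff )
se.diff
```

Writing the computation this way makes it easy to swap in other contrast vectors, for example c( E = 1, H = -1, M = 0 ) for the difference between the E and H means.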

Notice that, because the covariance between the E and M means is positive, the actual variance of the difference is smaller than the “naive” variance, which treats the two estimates as if they were independent.
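
This follows from the usual identity for the variance of a difference. As a quick check, plugging in the rounded entries of the matrix printed above (the notation $\hat\theta_E$ and $\hat\theta_M$ for the two estimated means is introduced here only for illustration):

$$
\widehat{Var}( \hat\theta_E - \hat\theta_M ) = \widehat{Var}( \hat\theta_E ) + \widehat{Var}( \hat\theta_M ) - 2 \, \widehat{Cov}( \hat\theta_E , \hat\theta_M ) \approx 520.5973 + 1060.1954 - 2 \times 684.6562 = 211.4803
$$

The square root of this value is about 14.54, whereas dropping the covariance term gives $\sqrt{1580.7927} \approx 39.76$.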

A similar idea can be implemented with other estimators, such as inequality and poverty measures. However, this has not yet been implemented for linearization/influence function methods. In the next section, we show an example with the Gini index.