7.1 Using the survey package | Poverty and Inequality with Complex Survey Data

7.1 Using the `survey` package

Influence functions and “resampling” replicates can be used to improve the inference about differences and changes between estimates.

The survey package already provides a way to estimate the variance-covariance matrix of a set of domain estimates: pass covmat = TRUE to the svyby function. Based on the ?survey::svyby examples, we have:

# load the survey library
library(survey)

# load data set
data( api )

# declare sampling design
dclus1 <- svydesign( id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc )

# estimate means of api99 by school type, keeping the covariances between them
mns <- svyby( ~api99, ~stype, dclus1, svymean, covmat=TRUE )

# collect variance-covariance matrix of estimates
( m <- vcov( mns ) )

##          E         H         M
## E 520.5973  573.0404  684.6562
## H 573.0404 1744.2317  747.1989
## M 684.6562  747.1989 1060.1954

# sum of the variances of the E and M estimates
var.naive <- sum( diag( m[c(1,3),c(1,3)] ) )
# twice the covariance between the E and M estimates
cov.term <- sum( diag( m[ c(1,3),c(3,1)] ) )

# "naive" SE of the difference
sqrt( var.naive )

## [1] 39.75918

# SE of the difference
sqrt( var.naive - cov.term )

## [1] 14.54236

#... or using svycontrast
svycontrast( mns , c(E = 1, M = -1) )

##          contrast     SE
## contrast -0.80833 14.542

Notice that, because the covariance between the E and M estimates is positive, the (actual) variance of the difference is smaller than the “naive” variance: the standard error falls from about 39.76 to 14.54.
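
The arithmetic behind this comparison can be written out as a short derivation. The notation is introduced here for illustration only and does not come from the package: $\hat\theta_E$ and $\hat\theta_M$ denote the estimated api99 means for the E and M school types, and the numbers are taken from the variance-covariance matrix printed above.

```latex
\begin{aligned}
\widehat{\mathrm{Var}}(\hat\theta_E - \hat\theta_M)
  &= \widehat{\mathrm{Var}}(\hat\theta_E) + \widehat{\mathrm{Var}}(\hat\theta_M)
     - 2\,\widehat{\mathrm{Cov}}(\hat\theta_E, \hat\theta_M) \\
  &= 520.5973 + 1060.1954 - 2 \times 684.6562 \\
  &= 211.4803 \\[4pt]
\widehat{\mathrm{SE}}(\hat\theta_E - \hat\theta_M)
  &= \sqrt{211.4803} \approx 14.54
\end{aligned}
```

The “naive” calculation keeps only the first two terms, giving $\sqrt{1580.7927} \approx 39.76$. Treating the two estimates as independent in this way overstates the standard error whenever the covariance is positive.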

A similar idea can be implemented with other estimators, such as inequality and poverty measures. However, this has not yet been implemented for linearization/influence function methods. In the next section, we show an example with the Gini index.