## 2.3 Relative Median Income Ratio (svyrmir)

The relative median income ratio (rmir) is the ratio of the median income of people older than a cutoff age (65) to the median income of people at or below that age. In mathematical terms,

$rmir = \frac{\mathrm{median}\{y_i;\ age_i > 65\}}{\mathrm{median}\{y_i;\ age_i \leq 65\}}.$
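
To make the definition concrete, here is a minimal base R sketch on made-up data. The function name `toy_rmir` and the eight observations are hypothetical, and the sketch ignores sampling weights, whereas the survey estimator uses weighted medians.

```r
# Hypothetical helper: unweighted relative median income ratio
toy_rmir <- function(income, age, cutoff = 65) {
  # median income of people older than the cutoff
  median_above <- median(income[age > cutoff])
  # median income of people at or below the cutoff
  median_below <- median(income[age <= cutoff])
  median_above / median_below
}

# made-up incomes and ages, for illustration only
income <- c(12, 18, 25, 30, 40, 9, 15, 21)
age    <- c(30, 45, 52, 60, 65, 66, 70, 81)

# medians are 15 (age > 65) and 25 (age <= 65), so the ratio is 0.6
toy_rmir(income, age)
```

A value below 1 means that the older group has a lower median income than the younger group.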

The details of the linearization of the rmir are discussed by Deville (1999) (Deville, Jean-Claude. 1999. “Variance Estimation for Complex Statistics and Estimators: Linearization and Residual Techniques.” Survey Methodology 25 (2): 193–203. http://www.statcan.gc.ca/pub/12-001-x/1999002/article/4882-eng.pdf).

A replication example

The R vardpoor package (Breidaks, Liberts, and Ivanova 2016: “Vardpoor: Estimation of Indicators on Social Exclusion and Poverty and Its Linearization, Variance Estimation.” Riga, Latvia: CSB), created by researchers at the Central Statistical Bureau of Latvia, includes an rmir coefficient calculation using the ultimate cluster method. The example below reproduces those statistics.

Load and prepare the same data set:

# load the convey package
library(convey)

library(survey)

library(vardpoor)

library(laeken)

# load the synthetic EU statistics on income & living conditions
data(eusilc)

# make all column names lowercase
names( eusilc ) <- tolower( names( eusilc ) )

# add a column with the row number
dati <- data.table::data.table(IDd = 1 : nrow(eusilc), eusilc)

# calculate the rmir coefficient
# using the R vardpoor library
varpoord_rmir_calculation <-
varpoord(

# analysis variable
Y = "eqincome",

# weights variable
w_final = "rb050",

# row number variable
ID_level1 = "IDd",

# row number variable
ID_level2 = "IDd",

# strata variable
H = "db040",

N_h = NULL ,

# clustering variable
PSU = "rb030",

# data.table
dataset = dati,

# age variable
age = "age",

# rmir coefficient function
type = "linrmir",

# quantile order, in percent (50 = median)
order_quant = 50L ,

# get linearized variable
outp_lin = TRUE

)

# construct a survey.design
# using our recommended setup
des_eusilc <-
svydesign(
ids = ~ rb030 ,
strata = ~ db040 ,
weights = ~ rb050 ,
data = eusilc
)

# immediately run the convey_prep function on it
des_eusilc <- convey_prep( des_eusilc )

# coefficients do match
varpoord_rmir_calculation$all_result$value
## [1] 0.9330361
coef( svyrmir( ~ eqincome , des_eusilc, age = ~age ) ) 
##  eqincome
## 0.9330361
# linearized variables do match
# vardpoor
lin_rmir_varpoord <- varpoord_rmir_calculation$lin_out$lin_rmir
# convey
lin_rmir_convey <- attr(svyrmir( ~ eqincome , des_eusilc, age = ~age ),"lin")

# check equality
all.equal(lin_rmir_varpoord, lin_rmir_convey[,1] )
## [1] TRUE
# variances do not match exactly
attr( svyrmir( ~ eqincome , des_eusilc, age = ~age ) , 'var' ) 
##             eqincome
## eqincome 0.000127444
varpoord_rmir_calculation$all_result$var
## [1] 0.0001272137
# standard errors do not match exactly
varpoord_rmir_calculation$all_result$se
## [1] 0.0112789
SE( svyrmir( ~ eqincome , des_eusilc , age = ~age) ) 
##            eqincome
## eqincome 0.01128911

The variance estimate is computed using the approximation defined in (1.1), where the linearized variable $z$ is defined by (1.2). The functions convey::svyrmir and vardpoor::linrmir produce the same linearized variable $z$.
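
As a hedged reminder of how a linearized variable for a ratio is usually built: write $M_{>65}$ and $M_{\leq 65}$ for the two medians and $z_{>65,i}$, $z_{\leq 65,i}$ for their linearized variables (symbols introduced here for illustration, not taken from the source). The standard linearization rule for a ratio then gives

$z_i = \frac{1}{M_{\leq 65}} \left( z_{>65,i} - rmir \cdot z_{\leq 65,i} \right).$

This is a sketch of the general rule rather than a transcription of either package's code. The linearized variables of the two domain medians themselves are not derived here.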

However, the measures of uncertainty do not line up, because vardpoor defaults to an ultimate cluster method. That method can be replicated with an alternative setup of the survey.design object, one that supplies a finite population correction (fpc) based on the within-stratum sum of weights.

# within each stratum, sum up the weights
cluster_sums <- aggregate( eusilc$rb050 , list( eusilc$db040 ) , sum )

# name the within-stratum sums of weights cluster_sum
names( cluster_sums ) <- c( "db040" , "cluster_sum" )

# merge this column back onto the data.frame
eusilc <- merge( eusilc , cluster_sums )

# construct a survey.design
# with the fpc using the cluster sum
des_eusilc_ultimate_cluster <-
svydesign(
ids = ~ rb030 ,
strata = ~ db040 ,
weights = ~ rb050 ,
data = eusilc ,
fpc = ~ cluster_sum
)

# again, immediately run the convey_prep function on the survey.design
des_eusilc_ultimate_cluster <- convey_prep( des_eusilc_ultimate_cluster )

# matches
attr( svyrmir( ~ eqincome , des_eusilc_ultimate_cluster , age = ~age ) , 'var' ) 
##              eqincome
## eqincome 0.0001272137
varpoord_rmir_calculation$all_result$var
## [1] 0.0001272137
# matches
varpoord_rmir_calculation$all_result$se
## [1] 0.0112789
SE( svyrmir( ~ eqincome , des_eusilc_ultimate_cluster, age = ~age ) ) 
##           eqincome
## eqincome 0.0112789
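
The small gap between the two setups can be illustrated with a minimal base R sketch of an ultimate cluster variance estimator for the weighted total of a linearized variable. The function name `ultimate_cluster_var` and the toy data are hypothetical. The sketch assumes that the finite population correction takes the form $(1 - n_h / N_h)$, with $N_h$ set to the within-stratum sum of weights, as the `fpc = ~ cluster_sum` setup suggests. It is not the code of either package.

```r
# Hypothetical sketch: ultimate cluster variance of sum(w * z)
# under a stratified design with PSUs; N_h is an optional named
# vector of stratum population sizes used for the fpc
ultimate_cluster_var <- function(z, w, strata, psu, N_h = NULL) {
  # weighted total of the linearized variable within each PSU
  psu_totals <- aggregate(
    list(total = w * z),
    by = list(strata = strata, psu = psu),
    FUN = sum
  )
  by_stratum <- split(psu_totals$total, psu_totals$strata)
  contributions <- sapply(names(by_stratum), function(h) {
    t_h <- by_stratum[[h]]
    n_h <- length(t_h)
    # between-PSU variability within stratum h
    v_h <- n_h / (n_h - 1) * sum((t_h - mean(t_h))^2)
    # optional finite population correction
    if (!is.null(N_h)) v_h <- v_h * (1 - n_h / N_h[[h]])
    v_h
  })
  sum(contributions)
}

# made-up data: two strata, two PSUs each, two people per PSU
strata <- c("A", "A", "A", "A", "B", "B", "B", "B")
psu    <- c(1, 1, 2, 2, 3, 3, 4, 4)
w      <- c(10, 10, 10, 10, 5, 5, 5, 5)
z      <- c(1, 2, 3, 4, 2, 2, 4, 6)

# without fpc
v_no_fpc <- ultimate_cluster_var(z, w, strata, psu)

# with fpc based on the within-stratum sums of weights
N_h <- sapply(split(w, strata), sum)
v_fpc <- ultimate_cluster_var(z, w, strata, psu, N_h = N_h)

c(no_fpc = v_no_fpc, fpc = v_fpc)
```

On this toy data the correction shrinks the variance from 2500 to 2330, the same direction as the difference between the two survey.design setups above.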

For additional usage examples of svyrmir, type ?convey::svyrmir in the R console.