3.2 Quintile Share Ratio (svyqsr)

Unlike the previous measure, the quintile share ratio is an inequality measure in itself, depending only on the income distribution to evaluate the degree of inequality. By definition, it is the ratio between the income share held by the richest 20% of the population and the income share held by the poorest 20%.

In plain terms, it expresses how many times richer the wealthiest part of the population is than the poorest part. For instance, a \(QSR = 4\) implies that the richest 20% hold 4 times as much of the total income as the poorest 20%.
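To make the definition concrete, here is a minimal sketch on made-up data. It assumes ten equally weighted people and cuts the fifths by simple counting; svyqsr itself works with survey weights and estimated quantiles, so this is only an illustration of the idea, not of the package's internals.

```r
# toy incomes for ten equally weighted people (hypothetical values)
income <- c( 5 , 5 , 10 , 10 , 10 , 10 , 10 , 10 , 20 , 20 )

# sort from poorest to richest
income <- sort( income )

# number of people in each fifth of the population
n_fifth <- length( income ) / 5

# income shares held by the poorest and the richest fifth
share_bottom <- sum( head( income , n_fifth ) ) / sum( income )
share_top <- sum( tail( income , n_fifth ) ) / sum( income )

# the quintile share ratio is the ratio of the two shares
toy_qsr <- share_top / share_bottom
toy_qsr
## [1] 4
```

Because both shares are divided by the same total income, the ratio of shares equals the ratio of the two group totals (here 40 over 10).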

The quintile share ratio can be modified to a more general function of fractile share ratios. For instance, Cobham, Schlogl, and Sumner (2015) present interesting arguments for using the Palma index, defined as the ratio between the income share of the richest 10% and the share held by the poorest 40%.

Cobham, Alex, Luke Schlogl, and Andy Sumner. 2015. “Inequality and the Tails: The Palma Proposition and Ratio Revisited.” Working Papers 143. United Nations, Department of Economic and Social Affairs. http://www.un.org/esa/desa/papers/2015/wp143_2015.pdf.
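The generalization to fractile share ratios can be written compactly. The notation here is an assumption introduced for illustration: \(y_i\) is the income of unit \(i\), \(w_i\) its sampling weight, \(Q_{\alpha}\) the \(\alpha\)-quantile of income, and \(1\{\cdot\}\) the indicator function. Whether the boundaries are treated as strict or weak inequalities is a convention that may differ between implementations.

\[
R(\alpha_1, \alpha_2) = \frac{\sum_{i} w_i \, y_i \, 1\{ y_i > Q_{\alpha_2} \}}{\sum_{i} w_i \, y_i \, 1\{ y_i \leq Q_{\alpha_1} \}}
\]

Under this notation, the quintile share ratio corresponds to \(\alpha_1 = 0.2\) and \(\alpha_2 = 0.8\), while the Palma index corresponds to \(\alpha_1 = 0.4\) and \(\alpha_2 = 0.9\). Total income cancels between numerator and denominator, so a ratio of totals and a ratio of shares coincide.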

The details of the linearization of the QSR are discussed by Deville (1999) and Osier (2009).

Deville, Jean-Claude. 1999. “Variance Estimation for Complex Statistics and Estimators: Linearization and Residual Techniques.” Survey Methodology 25 (2): 193–203. http://www.statcan.gc.ca/pub/12-001-x/1999002/article/4882-eng.pdf.

Osier, Guillaume. 2009. “Variance Estimation for Complex Indicators of Poverty and Inequality.” Journal of the European Survey Research Association 3 (3): 167–95. http://ojs.ub.uni-konstanz.de/srm/article/view/369.


A replication example

The R vardpoor package (Breidaks, Liberts, and Ivanova 2016), created by researchers at the Central Statistical Bureau of Latvia, includes a QSR coefficient calculation using the ultimate cluster method. The example below reproduces those statistics.

Breidaks, Juris, Martins Liberts, and Santa Ivanova. 2016. “Vardpoor: Estimation of Indicators on Social Exclusion and Poverty and Its Linearization, Variance Estimation.” Riga, Latvia: CSB.

Load and prepare the same data set:

# load the convey package
library(convey)

# load the survey library
library(survey)

# load the vardpoor library
library(vardpoor)

# load the synthetic european union statistics on income & living conditions
data(eusilc)

# make all column names lowercase
names( eusilc ) <- tolower( names( eusilc ) )

# add a column with the row number
# (data.table:: is spelled out in case library(vardpoor) does not attach data.table)
dati <- data.table::data.table(IDd = 1 : nrow(eusilc), eusilc)

# calculate the qsr coefficient
# using the R vardpoor library
varpoord_qsr_calculation <-
    varpoord(
    
        # analysis variable
        Y = "eqincome", 
        
        # weights variable
        w_final = "rb050",
        
        # row number variable
        ID_level1 = "IDd",
        
        # row number variable
        ID_level2 = "IDd",
        
        # strata variable
        H = "db040", 
        
        N_h = NULL ,
        
        # clustering variable
        PSU = "rb030", 
        
        # data.table
        dataset = dati, 
        
        # qsr coefficient function
        type = "linqsr",
        
        # quantile order, in percent, for the poverty threshold
        order_quant = 50L ,
        
        # get linearized variable
        outp_lin = TRUE
        
    )



# construct a survey.design
# using our recommended setup
des_eusilc <- 
    svydesign( 
        ids = ~ rb030 , 
        strata = ~ db040 ,  
        weights = ~ rb050 , 
        data = eusilc
    )

# immediately run the convey_prep function on it
des_eusilc <- convey_prep( des_eusilc )

# coefficients do match
varpoord_qsr_calculation$all_result$value
## [1] 3.970004
coef( svyqsr( ~ eqincome , des_eusilc ) )
## eqincome 
## 3.970004
# linearized variables do match
# vardpoor
lin_qsr_varpoord <- varpoord_qsr_calculation$lin_out$lin_qsr
# convey 
lin_qsr_convey <- attr(svyqsr( ~ eqincome , des_eusilc ),"lin")

# check equality
all.equal(lin_qsr_varpoord, lin_qsr_convey )
## [1] TRUE
# variances do not match exactly
attr( svyqsr( ~ eqincome , des_eusilc ) , 'var' )
##             eqincome
## eqincome 0.001810537
varpoord_qsr_calculation$all_result$var
## [1] 0.001807323
# standard errors do not match exactly
varpoord_qsr_calculation$all_result$se
## [1] 0.04251263
SE( svyqsr( ~ eqincome , des_eusilc ) )
##            eqincome
## eqincome 0.04255041

The variance estimate is computed by using the approximation defined in (1.1), where the linearized variable \(z\) is defined by (1.2). The functions convey::svyqsr and vardpoor::linqsr produce the same linearized variable \(z\).
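The idea of estimating a variance from a linearized variable can be sketched in base R. This assumes that the approximation amounts to the variance of the weighted total of \(z\), and it uses the usual with-replacement formula for a stratified cluster sample with no finite population correction. The helper name and the toy data are hypothetical and belong to neither convey nor vardpoor.

```r
# hypothetical helper, not part of convey or vardpoor:
# with-replacement variance of the weighted total of a linearized variable
linearized_total_var <- function( z , w , strata , psu ) {

    # weighted total of z within each primary sampling unit
    psu_totals <- aggregate( w * z , list( strata = strata , psu = psu ) , sum )

    # within each stratum: n_h / ( n_h - 1 ) times the sum of squared
    # deviations of the PSU totals around their stratum mean
    stratum_var <- tapply(
        psu_totals$x ,
        psu_totals$strata ,
        function( t ) length( t ) / ( length( t ) - 1 ) * sum( ( t - mean( t ) )^2 )
    )

    # strata are sampled independently, so their contributions add up
    sum( stratum_var )
}

# toy data: two strata with three single-person PSUs each (made-up values)
z <- c( 1 , 2 , 3 , 4 , 5 , 6 )
w <- rep( 2 , 6 )
strata <- c( "a" , "a" , "a" , "b" , "b" , "b" )
psu <- 1:6

toy_var <- linearized_total_var( z , w , strata , psu )
toy_var
## [1] 24
```

In the toy data, each stratum has weighted PSU totals that deviate from their mean by -2, 0 and 2, giving 3 / 2 * 8 = 12 per stratum and 24 overall.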

However, the measures of uncertainty do not line up, because library(vardpoor) defaults to an ultimate cluster method that can be replicated with an alternative setup of the survey.design object.

# within each stratum, sum up the weights
cluster_sums <- aggregate( eusilc$rb050 , list( eusilc$db040 ) , sum )

# name the within-stratum sums of weights `cluster_sum`
names( cluster_sums ) <- c( "db040" , "cluster_sum" )

# merge this column back onto the data.frame
eusilc <- merge( eusilc , cluster_sums )

# construct a survey.design
# with the fpc using the cluster sum
des_eusilc_ultimate_cluster <- 
    svydesign( 
        ids = ~ rb030 , 
        strata = ~ db040 ,  
        weights = ~ rb050 , 
        data = eusilc , 
        fpc = ~ cluster_sum 
    )

# again, immediately run the convey_prep function on the `survey.design`
des_eusilc_ultimate_cluster <- convey_prep( des_eusilc_ultimate_cluster )

# matches
attr( svyqsr( ~ eqincome , des_eusilc_ultimate_cluster ) , 'var' )
##             eqincome
## eqincome 0.001807323
varpoord_qsr_calculation$all_result$var
## [1] 0.001807323
# matches
varpoord_qsr_calculation$all_result$se
## [1] 0.04251263
SE( svyqsr( ~ eqincome , des_eusilc_ultimate_cluster ) )
##            eqincome
## eqincome 0.04251263
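As a rough check on the printed figures, the link between the variances and the standard errors, and the size of the gap between the two designs, can be recomputed by hand. This sketch only reuses numbers shown in the output above; the object names are illustrative.

```r
# variances printed above, copied by hand
var_without_fpc <- 0.001810537
var_with_fpc <- 0.001807323

# standard errors are the square roots of the variances
sqrt( var_without_fpc )
sqrt( var_with_fpc )

# ratio of the two variances: how much the fpc shrinks the estimate
variance_ratio <- var_with_fpc / var_without_fpc
variance_ratio
```

The ratio comes out slightly below one, so adding the fpc lowers the variance estimate by less than 0.2% in this example.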

For additional usage examples of svyqsr, type ?convey::svyqsr in the R console.