3.1 At Risk of Poverty Threshold (svyarpt)

✔️ commonly used by statistical agencies in the European Union working group on statistics on income & living conditions (Eurostat)
✔️ not tied to the inflation rate nor to a basket of goods or consumable products
✔️ generic calculation that can be broadly applied to different nations or regions
✔️ easy to understand: defaults to 60% of median income
❌ the 60% of median income used in ARPT might appear arbitrary for non-EU analyses
❌ does not account for the intensity/severity of poverty
❌ not really a poverty measure, but an estimated poverty threshold/poverty line

The at-risk-of-poverty threshold (ARPT) identifies people whose incomes imply a low standard of living relative to the general population. Some people below the ARPT are not below the effective poverty line, yet they can still be considered “almost deprived”.

This measure is defined as \(0.6\) times the median income for the entire population:

\[ arpt = 0.6 \times median(y), \]

where \(y\) is the income variable and the median is estimated for the whole population. The details of the linearization of the ARPT are discussed by Deville (1999) and Osier (2009).

Deville, Jean-Claude. 1999. “Variance Estimation for Complex Statistics and Estimators: Linearization and Residual Techniques.” Survey Methodology 25 (2): 193–203. http://www.statcan.gc.ca/pub/12-001-x/1999002/article/4882-eng.pdf.

Osier, Guillaume. 2009. “Variance Estimation for Complex Indicators of Poverty and Inequality.” Journal of the European Survey Research Association 3 (3): 167–95. http://ojs.ub.uni-konstanz.de/srm/article/view/369.
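As a rough illustration of the definition above, the base R sketch below computes a weighted median and scales it by 0.6. The toy incomes, the weights, and the helper `weighted_median()` are made up for this example and are not part of convey. The svyarpt function relies on its own quantile routine, so results on real data may differ slightly from this simple step-function version.

```r
# hypothetical toy data: five incomes with sampling weights
income <- c(8000, 12000, 15000, 20000, 45000)
weight <- c(10, 20, 30, 25, 15)

# illustrative helper (not part of convey):
# smallest income at which the cumulative weight share reaches `prob`
weighted_median <- function(y, w, prob = 0.5) {
  ord <- order(y)
  y <- y[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)
  y[which(cum_share >= prob)[1]]
}

# at-risk-of-poverty threshold: 60% of the weighted median
med <- weighted_median(income, weight)
arpt <- 0.6 * med

arpt
## [1] 9000
```

Here the weighted median is 15000, because the cumulative weight share first reaches one half at the third income, so the threshold is 9000.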


3.1.1 Replication Example

The R vardpoor package (Breidaks, Liberts, and Ivanova 2016), created by researchers at the Central Statistical Bureau of Latvia, includes an ARPT coefficient calculation using the ultimate cluster method. The example below reproduces those statistics.

Breidaks, Juris, Martins Liberts, and Santa Ivanova. 2016. “Vardpoor: Estimation of Indicators on Social Exclusion and Poverty and Its Linearization, Variance Estimation.” Riga, Latvia: CSB.

Load and prepare the same data set:

# load the convey package
library(convey)

# load the survey library
library(survey)

# load the vardpoor library
library(vardpoor)

# load the laeken library
library(laeken)

# load the synthetic EU statistics on income & living conditions
data(eusilc)

# make all column names lowercase
names(eusilc) <- tolower(names(eusilc))

# add a column with the row number
dati <- data.table::data.table(IDd = 1:nrow(eusilc), eusilc)

# calculate the arpt coefficient
# using the R vardpoor library
varpoord_arpt_calculation <-
  varpoord(
    # analysis variable
    Y = "eqincome",
    
    # weights variable
    w_final = "rb050",
    
    # row number variable
    ID_level1 = "IDd",
    
    # row number variable
    ID_level2 = "IDd",
    
    # strata variable
    H = "db040",
    
    # population counts of PSUs by stratum (not supplied)
    N_h = NULL ,
    
    # clustering variable
    PSU = "rb030",
    
    # data.table
    dataset = dati,
    
    # arpt coefficient function
    type = "linarpt",
    
    # get linearized variable
    outp_lin = TRUE
  )


# construct a survey.design
# using our recommended setup
des_eusilc <-
  svydesign(
    ids = ~ rb030 ,
    strata = ~ db040 ,
    weights = ~ rb050 ,
    data = eusilc
  )

# immediately run the convey_prep function on it
des_eusilc <- convey_prep(des_eusilc)

# coefficients do match
varpoord_arpt_calculation$all_result$value
## [1] 10859.24
coef(svyarpt( ~ eqincome , des_eusilc))
## eqincome 
## 10859.24
# linearized variables do match
# vardpoor
lin_arpt_varpoord <- varpoord_arpt_calculation$lin_out$lin_arpt
# convey
lin_arpt_convey <- attr(svyarpt( ~ eqincome , des_eusilc), "lin")

# check equality
all.equal(lin_arpt_varpoord, lin_arpt_convey)
## [1] TRUE
# variances do not match exactly
attr(svyarpt( ~ eqincome , des_eusilc) , 'var')
##          eqincome
## eqincome 2564.027
varpoord_arpt_calculation$all_result$var
## [1] 2559.442
# standard errors do not match exactly
varpoord_arpt_calculation$all_result$se
## [1] 50.59093
SE(svyarpt( ~ eqincome , des_eusilc))
##          eqincome
## eqincome 50.63622

The variance estimator and the linearized variable \(z\) are both defined in the section Linearization-Based Variance Estimation. The functions convey::svyarpt and vardpoor::linarpt produce the same linearized variable \(z\).

However, the measures of uncertainty do not match, because library(vardpoor) defaults to an ultimate cluster method. That method can be replicated with an alternative setup of the survey.design object:

# within each stratum, sum up the weights
cluster_sums <-
  aggregate(eusilc$rb050 , list(eusilc$db040) , sum)

# name the within-stratum sum of weights `cluster_sum`
names(cluster_sums) <- c("db040" , "cluster_sum")

# merge this column back onto the data.frame
eusilc <- merge(eusilc , cluster_sums)

# construct a survey.design
# with the fpc using the cluster sum
des_eusilc_ultimate_cluster <-
  svydesign(
    ids = ~ rb030 ,
    strata = ~ db040 ,
    weights = ~ rb050 ,
    data = eusilc ,
    fpc = ~ cluster_sum
  )

# again, immediately run the convey_prep function on the `survey.design`
des_eusilc_ultimate_cluster <-
  convey_prep(des_eusilc_ultimate_cluster)



# variances now match
stopifnot(all.equal(
  attr(svyarpt( ~ eqincome , des_eusilc_ultimate_cluster) , 'var')[1] ,
  varpoord_arpt_calculation$all_result$var
))

# standard errors now match
stopifnot(all.equal(varpoord_arpt_calculation$all_result$se ,
                    SE(
                      svyarpt( ~ eqincome , des_eusilc_ultimate_cluster)
                    )[1]))
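To make the idea of an ultimate cluster variance more concrete, here is a minimal base R sketch of a stratified estimator applied to a linearized variable, with an optional finite population correction. All data values, the `fpc` counts, and the helper name `ultimate_cluster_var()` are hypothetical. The sketch illustrates the general textbook estimator under those assumptions, not the internals of either package.

```r
# hypothetical toy sample: two strata, three PSUs each
toy <- data.frame(
  stratum = c("a", "a", "a", "a", "b", "b", "b", "b"),
  psu     = c(1, 1, 2, 3, 4, 5, 5, 6),
  w       = c(10, 10, 12, 8, 20, 15, 15, 25),
  z       = c(0.5, -0.2, 0.1, -0.4, 0.3, -0.1, 0.2, -0.3)
)

# illustrative helper (not part of convey or vardpoor)
ultimate_cluster_var <- function(df, fpc = NULL) {
  # weighted PSU totals of the linearized variable
  df$wz <- df$w * df$z
  psu_totals <- aggregate(wz ~ stratum + psu, data = df, FUN = sum)
  totals_by_stratum <- split(psu_totals$wz, psu_totals$stratum)

  # per-stratum contribution: n / (n - 1) * sum of squared deviations
  by_stratum <- sapply(totals_by_stratum, function(t) {
    n <- length(t)
    n / (n - 1) * sum((t - mean(t))^2)
  })

  # optional finite population correction (1 - n_h / N_h) per stratum
  if (!is.null(fpc)) {
    n_h <- sapply(totals_by_stratum, length)
    by_stratum <- by_stratum * (1 - n_h / fpc[names(by_stratum)])
  }

  sum(by_stratum)
}

# with-replacement version, then with assumed population PSU counts
v_wr  <- ultimate_cluster_var(toy)
v_fpc <- ultimate_cluster_var(toy, fpc = c(a = 40, b = 75))

c(with_replacement = v_wr, with_fpc = v_fpc)
```

Because the correction factor is below one in every stratum, the corrected variance is always somewhat smaller than the with-replacement version, which is the same direction as the small gap between the two sets of results above.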

For additional usage examples of svyarpt, type ?convey::svyarpt in the R console.
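As one such usage sketch, the block below moves the poverty line from 60% to 50% of the median and adds a breakout by sex. It assumes the `quantiles` and `percent` arguments listed on the svyarpt help page and the `rb090` sex column of the laeken `eusilc` data; please confirm both against your installed versions.

```r
library(survey)
library(convey)
library(laeken)

# synthetic EU-SILC data shipped with laeken
data(eusilc)
names(eusilc) <- tolower(names(eusilc))

# same design setup as in the replication example
des_eusilc <-
  svydesign(
    ids = ~ rb030 ,
    strata = ~ db040 ,
    weights = ~ rb050 ,
    data = eusilc
  )
des_eusilc <- convey_prep(des_eusilc)

# default threshold: 60% of the median
arpt_60 <- svyarpt( ~ eqincome , des_eusilc)

# alternative threshold: 50% of the median
arpt_50 <-
  svyarpt( ~ eqincome , des_eusilc , quantiles = 0.5 , percent = 0.5)

# thresholds computed separately within each sex
arpt_by_sex <- svyby( ~ eqincome , ~ rb090 , des_eusilc , svyarpt)
```

Since the threshold is a fixed share of the same estimated median, the 50% coefficient is simply five-sixths of the 60% coefficient.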

3.1.2 Real World Examples

This section displays example results using nationally representative surveys from both the United States and Brazil. We present a variety of surveys, levels of analysis, and subpopulation breakouts to provide users with points of reference for the range of plausible values of the svyarpt function.

To understand the construction of each survey design object and respective variables of interest, please refer to section 1.4 for CPS-ASEC, section 1.5 for PNAD Contínua, and section 1.6 for SCF.

3.1.2.1 CPS-ASEC Household Income

svyarpt(~ htotval , cps_household_design)
##          arpt     SE
## htotval 44521 391.72
svyby(~ htotval , ~ sex , cps_household_design , svyarpt)
##           sex htotval se.htotval
## male     male 50623.2   455.4800
## female female 39000.0   185.2314
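For a sense of the precision behind the overall household estimate above, the arithmetic below turns the printed coefficient and standard error into a normal-approximation 95% confidence interval and a coefficient of variation. The two inputs are copied by hand from the output above. The normal approximation is an assumption of this sketch, not something the output itself guarantees.

```r
# values copied from the svyarpt output above
est <- 44521
se <- 391.72

# normal-approximation 95% confidence interval
ci <- est + c(-1, 1) * qnorm(0.975) * se

# coefficient of variation, in percent
cv_pct <- 100 * se / est

round(ci)
round(cv_pct, 2)
```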

3.1.2.2 CPS-ASEC Family Income

svyarpt(~ ftotval , cps_family_design)
##          arpt     SE
## ftotval 55680 473.82
svyby(~ ftotval , ~ sex , cps_family_design , svyarpt)
##           sex ftotval se.ftotval
## male     male 61320.0   504.0308
## female female 48856.2   453.7270

3.1.2.3 CPS-ASEC Worker Earnings

svyarpt(~ pearnval , cps_ftfy_worker_design)
##           arpt     SE
## pearnval 36000 355.77
svyby(~ pearnval , ~ sex , cps_ftfy_worker_design , svyarpt)
##           sex pearnval se.pearnval
## male     male    37200    378.1736
## female female    31200    185.5967

3.1.2.4 PNAD Contínua Per Capita Income

svyarpt( ~ deflated_per_capita_income , pnadc_design , na.rm = TRUE)
##                              arpt     SE
## deflated_per_capita_income 598.16 1.8529
svyby(~ deflated_per_capita_income ,
      ~ sex ,
      pnadc_design ,
      svyarpt ,
      na.rm = TRUE)
##           sex deflated_per_capita_income se.deflated_per_capita_income
## male     male                   607.2863                      4.266030
## female female                   593.8436                      2.711941

3.1.2.5 PNAD Contínua Worker Earnings

svyarpt( ~ deflated_labor_income , pnadc_design , na.rm = TRUE)
##                         arpt     SE
## deflated_labor_income 955.28 1.9433
svyby( ~ deflated_labor_income , ~ sex , pnadc_design , svyarpt , na.rm = TRUE)
##           sex deflated_labor_income se.deflated_labor_income
## male     male             1074.9787                  1.88908
## female female              890.7655                  1.51126

3.1.2.6 SCF Family Net Worth

scf_MIcombine(with(scf_design , svyarpt( ~ networth)))
## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(scf_design, svyarpt(~networth)))
##           results       se
## networth 115250.4 4678.086
scf_MIcombine(with(scf_design , svyby( ~ networth, ~ hhsex , svyarpt)))
## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(scf_design, svyby(~networth, ~hhsex, svyarpt)))
##         results       se
## male   157268.4 6523.353
## female  44178.0 3877.152

3.1.2.7 SCF Family Income

scf_MIcombine(with(scf_design , svyarpt( ~ income)))
## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(scf_design, svyarpt(~income)))
##         results       se
## income 42285.27 775.3165
scf_MIcombine(with(scf_design , svyby( ~ income, ~ hhsex , svyarpt)))
## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(scf_design, svyby(~income, ~hhsex, svyarpt)))
##         results        se
## male   54348.25 1512.9449
## female 24255.66  852.8023