4.9 Atkinson index (svyatk)

✔️ is defined in terms of an interpretable utility function
✔️ has a direct (but non-linear) relationship to `svygei`
✔️ has an interpretable inequality aversion parameter
❌ does not handle zeroes or negative incomes
❌ the decomposition interpretation is not exactly the same as the GEI, but very similar (not implemented, though)
❌ not very common

Dalton (1920) [Dalton, Hugh. 1920. “The Measurement of the Inequality of Incomes.” The Economic Journal 30 (September). https://doi.org/10.2307/2223525.] pointed out one of the most important facts about the economic theory of income inequality measurement: every income inequality measure has an implicit social welfare function (SWF). (This is the subject of an interesting discussion between the British and the Italian schools of inequality measurement.) More formally, Atkinson (1970) [Atkinson, Anthony B. 1970. “On the Measurement of Inequality.” Journal of Economic Theory 2 (3): 244–63. https://ideas.repec.org/a/eee/jetheo/v2y1970i3p244-263.html.] represents the SWF by an additively separable and symmetric function

\[ W ( y ) = \int_0^{+\infty} U ( y ) f( y ) dy \] where \(U( y )\) is a twice differentiable, increasing and concave utility function. The utility function determines how much an individual income \(y\) contributes to overall social welfare. (This utility is not the welfare that individuals attribute to their own income, but the social welfare that society attributes to an individual with that level of income.) Saying that “\(U ( y )\) is increasing in \(y\)” means that more income means more individual utility, while “\(U ( y )\) is concave” means that each additional unit of income adds less welfare than the previous one.

Atkinson (1970, 250) argued that we should measure inequality by using the Equally Distributed Equivalent level of income \(y_{EDE}\). The \(y_{EDE}\) is the level of income that would lead to the observed level of welfare if income were equally distributed. In mathematical terms, the \(y_{EDE}\) is defined as:

\[ \begin{aligned} W( y_{EDE} \cdot \mathbb{1} ) &= W( y ) \\ \int_0^{+\infty} U ( y_{EDE} ) f( y ) dy &= \int_0^{+\infty} U ( y ) f( y ) dy \\ U ( y_{EDE} ) \int_0^{+\infty} f( y ) dy &= \int_0^{+\infty} U ( y ) f( y ) dy \\ \therefore \quad y_{EDE} :\quad U ( y_{EDE} ) &= \int_0^{+\infty} U ( y ) f( y ) dy \end{aligned} \]

Using this concept, Atkinson (1970) proposed measuring inequality as

\[ I = 1 - \frac{ y_{EDE}}{ \mu } \]

where \(\mu\) is the average income. Once a suitable form of \(U\) is defined, this index has a direct interpretation in terms of social welfare: for a given level \(I\), if we could redistribute income equally from the current distribution, we would need only \(100 \times (1 - I)\%\) of the current average income to reach the same level of social welfare.
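As a rough numerical illustration of \(y_{EDE}\) and \(I\), the sketch below uses a made-up vector of four incomes and assumes a logarithmic utility function. Both the data and the choice of \(U\) are assumptions for this example only; neither comes from Atkinson's paper or from the `convey` package.

```r
# toy income vector (hypothetical values, for illustration only)
y <- c(10, 20, 30, 40)

# assumed utility function: log is increasing and concave
U <- function(income) log(income)
U_inverse <- function(utility) exp(utility)

# average welfare of the observed distribution (equal weight per person)
welfare <- mean(U(y))

# equally distributed equivalent income solves U(y_ede) = welfare
y_ede <- U_inverse(welfare)

# inequality measure: 1 - y_ede / mu
mu <- mean(y)
inequality <- 1 - y_ede / mu

round(c(mu = mu, y_ede = y_ede, inequality = inequality), 4)
```

Under these assumptions the average income is 25 while \(y_{EDE}\) is about 22.1, so \(I\) is roughly 0.115: an equal distribution would reach the same welfare with about 88.5% of the current average income.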

However, in order to completely rank distributions, we need to impose more restrictions on the form of \(U(y)\). Based on results from the theory of choice under uncertainty, Atkinson (1970) suggests a class of utility functions given by

\[ U( y , \epsilon ) = \begin{cases} A + B \frac{y^{1 - \epsilon}}{1 - \epsilon} , &\epsilon \neq 1 \\ \log y , &\epsilon = 1 \end{cases} \]

where \(\epsilon\) is a scalar inequality-aversion parameter: the larger it is, the more weight is given to lower incomes.
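One consequence of this class is that the constants \(A\) and \(B\) drop out of the equally distributed equivalent income. A short derivation for \(\epsilon \neq 1\), assuming \(B > 0\) (so that \(U\) is increasing) and using \(\int_0^{+\infty} f( y ) dy = 1\), substitutes the utility function into the condition \(U ( y_{EDE} ) = \int_0^{+\infty} U ( y ) f( y ) dy\):

\[ \begin{aligned} A + B \frac{ y_{EDE}^{1 - \epsilon} }{ 1 - \epsilon } &= \int_0^{+\infty} \bigg( A + B \frac{ y^{1 - \epsilon} }{ 1 - \epsilon } \bigg) f( y ) dy \\ y_{EDE}^{1 - \epsilon} &= \int_0^{+\infty} y^{1 - \epsilon} f( y ) dy \\ y_{EDE} &= \bigg[ \int_0^{+\infty} y^{1 - \epsilon} f( y ) dy \bigg]^{1/(1 - \epsilon)} \end{aligned} \]

For \(\epsilon = 1\), the same condition with \(U( y ) = \log y\) gives \(y_{EDE} = \exp \big[ \int_0^{+\infty} \log y \, f( y ) dy \big]\), the geometric mean of income.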

Combining this with his proposed approach, Atkinson (1970) derives what came to be known as the Atkinson index:

\[ I_\epsilon = \begin{cases} 1 - \bigg[ \int_0^{+ \infty} \big( {\frac{y}{\mu} } \big)^{1-\epsilon} f(y) dy \bigg]^{1/(1-\epsilon)} , &\epsilon \neq 1 \\ 1 - \exp \bigg[ \int_0^{+ \infty} \log \frac{y}{\mu} f(y) dy \bigg] , &\epsilon \to 1 \end{cases} \]
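The direct but non-linear relationship to the generalized entropy index noted at the top of this section can be written out explicitly. The notation \(GE_\alpha\) is introduced here for illustration, assuming the usual definitions \(GE_\alpha = \frac{1}{\alpha ( \alpha - 1 )} \big[ \int_0^{+\infty} \big( \frac{y}{\mu} \big)^{\alpha} f(y) dy - 1 \big]\) for \(\alpha \notin \{ 0 , 1 \}\) and \(GE_0 = - \int_0^{+\infty} \log \frac{y}{\mu} f(y) dy\). Setting \(\alpha = 1 - \epsilon\), so that \(\alpha ( \alpha - 1 ) = \epsilon ( \epsilon - 1 )\), gives:

\[ I_\epsilon = \begin{cases} 1 - \big[ 1 + \epsilon ( \epsilon - 1 ) \, GE_{1 - \epsilon} \big]^{1/(1-\epsilon)} , &\epsilon > 0 , \, \epsilon \neq 1 \\ 1 - \exp \big( - GE_0 \big) , &\epsilon \to 1 \end{cases} \]

Under these definitions, each Atkinson index is a monotone transformation of the generalized entropy index with parameter \(1 - \epsilon\), so the two measures rank distributions in the same order.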

Although the above form is given in terms of a continuous distribution, the equivalent plug-in estimator we use comes from Biewen and Jenkins (2003) [Biewen, Martin, and Stephen Jenkins. 2003. “Estimation of Generalized Entropy and Atkinson Inequality Indices from Complex Survey Data.” Discussion Papers of DIW Berlin 345. DIW Berlin, German Institute for Economic Research. http://EconPapers.repec.org/RePEc:diw:diwwpp:dp345.]:

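\[ \widehat{A}_\epsilon = \begin{cases} 1 - \widehat{U}_0^{ - \epsilon/(1 - \epsilon) } \widehat{U}_1^{ -1 } \widehat{U}_{1 - \epsilon}^{ 1/(1 - \epsilon) } , &\text{if } \epsilon \in \mathbb{R}_+ \setminus\{ 1 \} \\ 1 - \widehat{U}_0 \widehat{U}_1^{-1} \exp( \widehat{T}_0 \widehat{U}_0^{-1} ), &\text{if } \epsilon \rightarrow 1 \end{cases} \]

where, following Biewen and Jenkins (2003), \(\widehat{U}_\gamma = \sum_{i \in S} w_i y_i^\gamma\) and \(\widehat{T}_\gamma = \sum_{i \in S} w_i y_i^\gamma \log y_i\) are weighted sample totals and \(w_i\) is the sampling weight of unit \(i\).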
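To make the plug-in estimator concrete, here is a minimal base R sketch of the point estimate only. It assumes the weighted totals \(\widehat{U}_\gamma = \sum w_i y_i^\gamma\) and \(\widehat{T}_0 = \sum w_i \log y_i\), and that the \(\epsilon \to 1\) case equals one minus the weighted geometric mean over the weighted mean. The function name `atkinson_plugin` and the toy data are hypothetical; this is not the code inside `svyatk`, and it does not compute standard errors.

```r
# point estimate of the Atkinson index from incomes y and weights w
# (illustrative sketch; not the convey implementation)
atkinson_plugin <- function(y, w, epsilon = 1) {
  # the index is undefined for zero or negative incomes
  stopifnot(all(y > 0), all(w >= 0), epsilon > 0)

  # weighted total of y^gamma
  U <- function(gamma) sum(w * y^gamma)

  if (isTRUE(all.equal(epsilon, 1))) {
    # limiting case: weighted geometric mean over weighted mean
    T0 <- sum(w * log(y))
    1 - (U(0) / U(1)) * exp(T0 / U(0))
  } else {
    1 - U(0)^(-epsilon / (1 - epsilon)) *
      U(1)^(-1) *
      U(1 - epsilon)^(1 / (1 - epsilon))
  }
}

# toy data (hypothetical values)
y <- c(10, 20, 30, 40)
w <- c(1, 1, 1, 1)

# estimates for a few inequality-aversion parameters
sapply(c(0.5, 1, 2), function(e) atkinson_plugin(y, w, epsilon = e))
```

With equal weights and \(\epsilon = 2\), the estimate reduces to one minus the ratio of the harmonic mean (19.2) to the arithmetic mean (25), that is, 0.232. This gives a quick sanity check on the exponents.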

4.9.1 Replication Example

In July 2006, Jenkins (2008) [Jenkins, Stephen. 2008. “Estimation and Interpretation of Measures of Inequality, Poverty, and Social Welfare Using Stata.” North American Stata Users' Group Meetings 2006. Stata Users Group. http://EconPapers.repec.org/RePEc:boc:asug06:16.] presented the Stata Atkinson index command at the North American Stata Users’ Group Meetings. The example below reproduces those statistics.

Load and prepare the same data set:

# load the convey package
library(convey)

# load the survey library
library(survey)

# load the foreign library
library(foreign)

# create a temporary file on the local disk
tf <- tempfile()

# store the location of the presentation file
presentation_zip <-
  "https://web.archive.org/web/20150928053959/http://repec.org/nasug2006/nasug2006_jenkins.zip"

# download jenkins' presentation to the temporary file
download.file(presentation_zip , tf , mode = 'wb')

# unzip the contents of the archive
presentation_files <- unzip(tf , exdir = tempdir())

# load the institute for fiscal studies' 1981, 1985, and 1991 data.frame objects
x81 <-
  read.dta(grep("ifs81" , presentation_files , value = TRUE))
x85 <-
  read.dta(grep("ifs85" , presentation_files , value = TRUE))
x91 <-
  read.dta(grep("ifs91" , presentation_files , value = TRUE))

# stack each of these three years of data into a single data.frame
x <- rbind(x81 , x85 , x91)

Replicate the author’s survey design statement from Stata code..

. * account for clustering within HHs 
. version 8: svyset [pweight = wgt], psu(hrn)
pweight is wgt
psu is hrn

.. into R code:

# initiate a linearized survey design object
y <- svydesign( ~ hrn , data = x , weights = ~ wgt)

# immediately run the `convey_prep` function on the survey design
z <- convey_prep(y)

Replicate the author’s subset statement and each of his svyatk results with Stata..

. svyatk x if year == 1981
 
Warning: x has 20 values = 0. Not used in calculations

Complex survey estimates of Atkinson inequality indices
 
pweight: wgt                                   Number of obs    = 9752
Strata: <one>                                  Number of strata = 1
PSU: hrn                                       Number of PSUs   = 7459
                                               Population size  = 54766261
---------------------------------------------------------------------------
Index    |  Estimate   Std. Err.      z      P>|z|     [95% Conf. Interval]
---------+-----------------------------------------------------------------
A(0.5)   |  .0543239   .00107583    50.49    0.000      .0522153   .0564324
A(1)     |  .1079964   .00245424    44.00    0.000      .1031862   .1128066
A(1.5)   |  .1701794   .0066943    25.42    0.000       .1570588      .1833
A(2)     |  .2755788   .02597608    10.61    0.000      .2246666    .326491
A(2.5)   |  .4992701   .06754311     7.39    0.000       .366888   .6316522
---------------------------------------------------------------------------

..using R code:

z81 <- subset(z , year == 1981)

svyatk( ~ eybhc0 , subset(z81 , eybhc0 > 0) , epsilon = 0.5)

##        atkinson     SE
## eybhc0 0.054324 0.0011

svyatk( ~ eybhc0 , subset(z81 , eybhc0 > 0))

##        atkinson     SE
## eybhc0    0.108 0.0025

svyatk( ~ eybhc0 , subset(z81 , eybhc0 > 0) , epsilon = 1.5)

##        atkinson     SE
## eybhc0  0.17018 0.0067

svyatk( ~ eybhc0 , subset(z81 , eybhc0 > 0) , epsilon = 2)

##        atkinson    SE
## eybhc0  0.27558 0.026

svyatk( ~ eybhc0 , subset(z81 , eybhc0 > 0) , epsilon = 2.5)

##        atkinson     SE
## eybhc0  0.49927 0.0675

Confirm this replication applies for subsetted objects as well, comparing Stata code..

. svyatk x if year == 1981 & x >= 1

Complex survey estimates of Atkinson inequality indices
 
pweight: wgt                                   Number of obs    = 9748
Strata: <one>                                  Number of strata = 1
PSU: hrn                                       Number of PSUs   = 7457
                                               Population size  = 54744234
---------------------------------------------------------------------------
Index    |  Estimate   Std. Err.      z      P>|z|     [95% Conf. Interval]
---------+-----------------------------------------------------------------
A(0.5)   |  .0540059   .00105011    51.43    0.000      .0519477   .0560641
A(1)     |  .1066082   .00223318    47.74    0.000      .1022313   .1109852
A(1.5)   |  .1638299   .00483069    33.91    0.000       .154362   .1732979
A(2)     |  .2443206   .01425258    17.14    0.000      .2163861   .2722552
A(2.5)   |   .394787   .04155221     9.50    0.000      .3133461   .4762278
---------------------------------------------------------------------------

..to R code:

# mirror the stata condition `x >= 1`
z81_two <- subset(z , year == 1981 & eybhc0 >= 1)

svyatk( ~ eybhc0 , z81_two , epsilon = 0.5)

##        atkinson     SE
## eybhc0 0.054006 0.0011

svyatk( ~ eybhc0 , z81_two)

##        atkinson     SE
## eybhc0  0.10661 0.0022

svyatk( ~ eybhc0 , z81_two , epsilon = 1.5)

##        atkinson     SE
## eybhc0  0.16383 0.0048

svyatk( ~ eybhc0 , z81_two , epsilon = 2)

##        atkinson     SE
## eybhc0  0.24432 0.0143

svyatk( ~ eybhc0 , z81_two , epsilon = 2.5)

##        atkinson     SE
## eybhc0  0.39479 0.0416

For additional usage examples of svyatk, type ?convey::svyatk in the R console.
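When several values of the inequality-aversion parameter are of interest, as in the Stata tables above, the calls can be collected into one table. The sketch below does this on simulated data rather than the IFS files. The `toy` data frame, its `income` and `w` columns, and the simple one-stage design are all assumptions made for illustration.

```r
library(survey)
library(convey)

# simulated incomes and weights (hypothetical data, not the IFS files)
set.seed(1)
toy <- data.frame(
  income = rlnorm(200 , meanlog = 10 , sdlog = 0.6) ,
  w = runif(200 , 50 , 150)
)

# simple design without clustering, prepared for convey functions
toy_design <- convey_prep(svydesign(ids = ~ 1 , weights = ~ w , data = toy))

# grid of inequality-aversion parameters
eps_grid <- c(0.5 , 1 , 1.5 , 2 , 2.5)

# one row per epsilon, with the estimate and its standard error
results <- do.call(rbind , lapply(eps_grid , function(e) {
  est <- svyatk( ~ income , toy_design , epsilon = e)
  data.frame(epsilon = e ,
             atkinson = coef(est)[[1]] ,
             se = SE(est)[[1]])
}))

results
```

For any distribution that is not perfectly equal, the estimates should increase with `epsilon`, which is a useful check that the loop passed each parameter through correctly.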

4.9.2 Real World Examples

This section displays example results using nationally-representative surveys from both the United States and Brazil. We present a variety of surveys, levels of analysis, and subpopulation breakouts to provide users with points of reference for the range of plausible values of the svyatk function.

To understand the construction of each survey design object and respective variables of interest, please refer to section 1.4 for CPS-ASEC, section 1.5 for PNAD Contínua, and section 1.6 for SCF.

4.9.2.1 CPS-ASEC Household Income

svyatk(
  ~ htotval ,
  subset(cps_household_design , htotval > 0)
)

##         atkinson     SE
## htotval  0.38424 0.0029

svyby(
  ~ htotval ,
  ~ sex ,
  subset(cps_household_design , htotval > 0) ,
  svyatk
)

##           sex   htotval  se.htotval
## male     male 0.3621620 0.004018133
## female female 0.4001271 0.003879890

4.9.2.2 CPS-ASEC Family Income

svyatk(
  ~ ftotval ,
  subset(cps_family_design , ftotval > 0)
)

##         atkinson     SE
## ftotval  0.33941 0.0033

svyby(
  ~ ftotval ,
  ~ sex ,
  subset(cps_family_design , ftotval > 0) ,
  svyatk
)

##           sex   ftotval  se.ftotval
## male     male 0.3134257 0.004928748
## female female 0.3615489 0.004371817

4.9.2.3 CPS-ASEC Worker Earnings

svyatk(
  ~ pearnval ,
  subset(cps_ftfy_worker_design , pearnval > 0)
)

##          atkinson    SE
## pearnval  0.25625 0.003

svyby(
  ~ pearnval ,
  ~ sex ,
  subset(cps_ftfy_worker_design , pearnval > 0) ,
  svyatk
)

##           sex  pearnval se.pearnval
## male     male 0.2619172 0.003659413
## female female 0.2373733 0.005095477

4.9.2.4 PNAD Contínua Per Capita Income

svyatk(
  ~ deflated_per_capita_income ,
  subset(pnadc_design , deflated_per_capita_income > 0),
  na.rm = TRUE
)

##                            atkinson     SE
## deflated_per_capita_income   0.3773 0.0038

svyby(
  ~ deflated_per_capita_income ,
  ~ sex ,
  subset(pnadc_design , deflated_per_capita_income > 0),
  svyatk ,
  na.rm = TRUE
)

##           sex deflated_per_capita_income se.deflated_per_capita_income
## male     male                  0.3793418                   0.004049461
## female female                  0.3750622                   0.003852270

4.9.2.5 PNAD Contínua Worker Earnings

svyatk(
  ~ deflated_labor_income ,
  subset(pnadc_design , deflated_labor_income > 0) ,
  na.rm = TRUE
)

##                       atkinson     SE
## deflated_labor_income  0.34911 0.0041

svyby(
  ~ deflated_labor_income ,
  ~ sex ,
  subset(pnadc_design , deflated_labor_income > 0) ,
  svyatk ,
  na.rm = TRUE
)

##           sex deflated_labor_income se.deflated_labor_income
## male     male             0.3498899              0.004738257
## female female             0.3375239              0.004399895

4.9.2.6 SCF Family Net Worth

scf_MIcombine(with(subset(scf_design , networth > 0) , svyatk(~ networth)))

## Warning in subset.svyimputationList(scf_design, networth > 0): subset differed
## between imputations

## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(subset(scf_design, networth > 0), svyatk(~networth)))
##            results          se
## networth 0.8534903 0.004069562

scf_MIcombine(with(
  subset(scf_design , networth > 0) ,
  svyby(~ networth, ~ hhsex , svyatk)
))

## Warning in subset.svyimputationList(scf_design, networth > 0): subset differed
## between imputations

## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(subset(scf_design, networth > 0), svyby(~networth, 
##     ~hhsex, svyatk)))
##          results          se
## male   0.8402561 0.005154381
## female 0.8257367 0.013566334

4.9.2.7 SCF Family Income

scf_MIcombine(with(subset(scf_design , income > 0) , svyatk(~ income)))

## Warning in subset.svyimputationList(scf_design, income > 0): subset differed
## between imputations

## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(subset(scf_design, income > 0), svyatk(~income)))
##          results         se
## income 0.4943538 0.01340503

scf_MIcombine(with(
  subset(scf_design , income > 0) ,
  svyby(~ income, ~ hhsex , svyatk)
))

## Warning in subset.svyimputationList(scf_design, income > 0): subset differed
## between imputations

## Multiple imputation results:
##       m <- length(results)
##       scf_MIcombine(with(subset(scf_design, income > 0), svyby(~income, 
##     ~hhsex, svyatk)))
##          results         se
## male   0.4814661 0.01510394
## female 0.3182340 0.01475070