Retooling Poverty Targeting Using Out-of-Sample Validation and Machine Learning

Proxy means test (PMT) tools have become common instruments for beneficiary targeting and poverty assessment where full means tests are costly. Currently popular estimation procedures for generating these tools prioritize minimizing in-sample prediction error; however, the objective in generating such tools is out-of-sample prediction. This paper presents evidence that prioritizing minimal out-of-sample error, identified through cross-validation and stochastic ensemble methods, can substantially improve the out-of-sample performance of PMT tools. The USAID poverty assessment tool and its base data are used to demonstrate these methods; however, the methods should be considered for the development of PMT and other poverty-targeting tools more broadly.
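The distinction between in-sample and out-of-sample error can be illustrated with a minimal sketch. It is not the procedure used in this paper: it swaps in a one-variable least-squares fit and plain k-fold cross-validation, and the data are synthetic. The variable names (an "asset index" predicting "log consumption"), the sample size, and the helper functions `fit_ols`, `mse`, and `kfold_mse` are all assumptions made for illustration.

```python
import random


def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted (intercept, slope) model."""
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def kfold_mse(xs, ys, k=5, seed=0):
    """Average squared error on held-out folds (k-fold cross-validation)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all rows
    squared_error = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit only on the training rows, then score only the held-out rows.
        intercept, slope = fit_ols([xs[i] for i in train], [ys[i] for i in train])
        squared_error += sum(
            (ys[i] - (intercept + slope * xs[i])) ** 2 for i in held_out
        )
    return squared_error / len(xs)


# Synthetic data (hypothetical): an asset index and noisy log consumption.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

in_sample = mse(fit_ols(xs, ys), xs, ys)
out_of_sample = kfold_mse(xs, ys, k=5)
print(f"in-sample MSE:       {in_sample:.3f}")
print(f"cross-validated MSE: {out_of_sample:.3f}")
```

For a least-squares fit, the cross-validated error is never smaller than the in-sample error, which is why a tool tuned only on in-sample fit tends to look more accurate than it will be on new households.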
