New EM-type algorithms for the Heckman selection model

Abstract: The Heckman selection model is widely used to analyse data in which the outcome is only partially observed and the missing part is not missing at random. The 2-step method, maximum likelihood estimation (MLE), and EM algorithms have been developed for this model; however, each has certain limitations. Three new algorithms (ECM, ECM(NR), and ECME) are proposed that retain the advantages of the EM algorithm: easy implementation and numerical stability. In terms of bias and mean squared error (MSE), simulations with different correlation values suggest that MLE performs similarly to the proposed algorithms, and that both MLE and the proposed algorithms yield better estimates than the 2-step method. A simulation study that also considers standard errors demonstrates that the new algorithms are more robust than MLE and yield slightly better estimates than the 2-step and robust 2-stage methods. Real data analyses are provided to compare the performance of MLE, the 2-step method, and the proposed algorithms. A further real data analysis concerning robustness illustrates that, under certain conditions, the proposed algorithms are more efficient and stable.
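For orientation, the model the abstract refers to can be written in its standard textbook form. The notation below ($x_i$, $w_i$, $\beta$, $\gamma$, $\sigma$, $\rho$, $\lambda$) is an assumption made for illustration and may differ from the symbols used in the paper itself.

```latex
\begin{align}
  y_i^* &= x_i^\top \beta + \varepsilon_i
        && \text{(outcome equation)} \\
  s_i^* &= w_i^\top \gamma + \eta_i
        && \text{(selection equation)} \\
  s_i   &= \mathbf{1}\{ s_i^* > 0 \}, \qquad
  y_i = y_i^* \ \text{is observed only if } s_i = 1 \\
  \begin{pmatrix} \varepsilon_i \\ \eta_i \end{pmatrix}
        &\sim N_2\!\left(
            \begin{pmatrix} 0 \\ 0 \end{pmatrix},
            \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}
          \right) \\
  \mathrm{E}[\, y_i \mid x_i, w_i, s_i = 1 \,]
        &= x_i^\top \beta + \rho\sigma\, \lambda(w_i^\top \gamma),
  \qquad \lambda(u) = \frac{\phi(u)}{\Phi(u)}
\end{align}
```

Here $\phi$ and $\Phi$ are the standard normal density and distribution function, and $\lambda$ is the inverse Mills ratio. Under this formulation, the 2-step method first estimates $\gamma$ by a probit fit of $s_i$ on $w_i$, then regresses the observed $y_i$ on $x_i$ and $\lambda(w_i^\top \hat\gamma)$. MLE and EM-type algorithms instead work with the joint likelihood of $(y_i, s_i)$, and the correlation $\rho$ is the quantity varied across the simulations mentioned above.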
