Addressing the Zeros Problem: Regression Models for Outcomes with a Large Proportion of Zeros, with an Application to Trial Outcomes

In law-related and other social science contexts, researchers often need to account for data with an excess number of zeros. Dollar damages in legal cases are also often skewed. This article reviews strategies for dealing with this type of data. Tobit models are often applied to deal with the excess zeros, but they are more appropriate in cases of true censoring (e.g., when all negative values are recorded as zeros) and less appropriate when a zero is itself frequently the observed amount awarded. Heckman selection models are also applied in this setting, yet they were developed for potential outcomes rather than actual ones. Two-part models account for actual outcomes and avoid the collinearity problems that often attend selection models. A two-part hierarchical model is developed here that accounts for both the skewed, zero-inflated nature of damages data and the fact that punitive damage awards may be correlated within case type, jurisdiction, or time. Inference is conducted using a Markov chain Monte Carlo sampling scheme. Tobit models, selection models, and two-part models are fit to two punitive damage award data sets, and the results are compared. We illustrate that the nonsignificance of coefficients in a selection model can be a consequence of collinearity, whereas this does not occur with two-part models.
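To make the two-part idea concrete, the following is a minimal sketch under stated assumptions, not the hierarchical Bayesian model developed in the article. It fits the two parts separately by maximum likelihood and least squares rather than by Markov chain Monte Carlo, uses a single covariate, and assumes a logit link for whether an award is positive and a lognormal distribution for positive amounts. The data are simulated, and every name (`fit_logistic`, `fit_ols`, `expected_award`) and every coefficient value is hypothetical.

```python
import math
import random

random.seed(0)

# Simulated data (hypothetical; not the article's punitive damages data).
# Part 1 truth: logit P(y > 0) = -0.5 + 1.0 * x
# Part 2 truth: log(y) given y > 0 is Normal(10 + 0.5 * x, variance 1)
n = 2000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = []
for xi in x:
    p = 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * xi)))
    if random.random() < p:
        y.append(math.exp(random.gauss(10.0 + 0.5 * xi, 1.0)))
    else:
        y.append(0.0)  # an actual zero award, not a censored value


def fit_logistic(x, z, iters=25):
    """Newton-Raphson for logit P(z = 1) = a + b * x (one covariate)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, zi in zip(x, z):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += zi - p            # score for the intercept
            g1 += (zi - p) * xi     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def fit_ols(x, y):
    """Closed-form simple linear regression y = c + d * x."""
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    d = sxy / sxx
    return my - d * mx, d


# Part 1: model whether the outcome is positive, using all observations.
z = [1 if yi > 0 else 0 for yi in y]
a_hat, b_hat = fit_logistic(x, z)

# Part 2: model log(amount) using only the positive outcomes.
x_pos = [xi for xi, yi in zip(x, y) if yi > 0]
log_y_pos = [math.log(yi) for yi in y if yi > 0]
c_hat, d_hat = fit_ols(x_pos, log_y_pos)
resid_var = sum(
    (ly - c_hat - d_hat * xi) ** 2 for xi, ly in zip(x_pos, log_y_pos)
) / (len(x_pos) - 2)


def expected_award(x_new):
    """E[y | x] = P(y > 0 | x) * E[y | y > 0, x] under the lognormal assumption."""
    p_pos = 1.0 / (1.0 + math.exp(-(a_hat + b_hat * x_new)))
    mean_pos = math.exp(c_hat + d_hat * x_new + resid_var / 2.0)
    return p_pos * mean_pos


print("part 1 (logit):", a_hat, b_hat)
print("part 2 (log amount):", c_hat, d_hat, resid_var)
```

Because the two parts are fit on actual outcomes with separate equations, the second part needs no selection-correction term derived from the first. In this sketch, that is the sense in which a two-part model sidesteps the collinearity that can arise in a selection model when the correction term is close to a linear function of the same covariates.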
