IDCeMPy: Python Package for Inflated Discrete Choice Models

Scholars and data scientists often use discrete choice models to analyze polytomous outcomes: the ordered probit model for ordered dependent variables and the multinomial logit (MNL) estimator for unordered ones (Greene, 2002; Richards & Bonnet, 2018; Sarrias, 2016). These models, however, cannot account for the possibility that, in many ordered and unordered polytomous outcomes, a disproportionate share of observations falls into a single category because that category pools observations from two distinct data generating processes (DGPs); the category is thus “inflated.” For instance, ordered measures of self-reported smoking behavior that range from 0 for “no smoking” to 3 for “smoking 20 cigarettes or more daily” contain excess observations in the zero (no smoking) category, which includes both individuals who never smoke cigarettes and those who smoked previously but have temporarily stopped because of an increase in cigarette costs (Greene et al., 2015; Harris & Zhao, 2007). Similarly, the “indifference” middle category in ordered measures of immigration attitudes is inflated because it includes both respondents who are genuinely indifferent about immigration and those who select “indifference” for social desirability reasons (Bagozzi & Mukherjee, 2012; Brown et al., 2020). The baseline category of unordered polytomous measures of presidential vote choice is also often inflated, as it includes both non-voters who abstain owing to temporary factors and routine non-voters who are disengaged from the political process (Bagozzi & Marchetti, 2017; Campbell & Monson, 2008). Inflated discrete choice models have been developed to address such category inflation in ordered and unordered polytomous outcome variables, since failing to account for it leads to model misspecification and incorrect inferences (Bagozzi & Mukherjee, 2012; Brown et al., 2020; Harris & Zhao, 2007).
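The two-DGP idea behind category inflation can be illustrated with a small simulation. The sketch below is not IDCeMPy code and does not use its API; it relies only on the Python standard library, and every coefficient, cutpoint, and function name in it is a hypothetical choice made for illustration. It draws a four-category ordered outcome (0 to 3, as in the smoking example) once from a plain ordered probit process and once from a process in which a first-stage "split" equation sends some observations to the zero category regardless of the ordered stage.

```python
import random

# Hypothetical cutpoints separating the four ordered categories 0..3.
CUTPOINTS = (-0.5, 0.5, 1.5)


def simulate_outcomes(n, inflated, seed=0):
    """Simulate n ordered outcomes in {0, 1, 2, 3}.

    If `inflated` is True, a first-stage probit "split" equation decides
    whether an observation is always zero (e.g. a never-smoker); only the
    remaining observations pass through the ordered probit stage, where
    they can still end up at zero. All coefficients are illustrative.
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # a single covariate
        if inflated:
            # Stage 1: latent propensity to enter the ordered regime.
            in_ordered_regime = 0.3 + 0.8 * x + rng.gauss(0.0, 1.0) > 0
            if not in_ordered_regime:
                outcomes.append(0)  # "always zero" regime
                continue
        # Stage 2: ordered probit; count how many cutpoints the latent
        # variable exceeds to obtain the observed category.
        latent = 0.5 * x + rng.gauss(0.0, 1.0)
        outcomes.append(sum(latent > c for c in CUTPOINTS))
    return outcomes


def category_shares(outcomes):
    """Return the share of observations in each category 0..3."""
    n = len(outcomes)
    return [outcomes.count(k) / n for k in range(len(CUTPOINTS) + 1)]


plain_outcomes = simulate_outcomes(20000, inflated=False)
inflated_outcomes = simulate_outcomes(20000, inflated=True)
plain_shares = category_shares(plain_outcomes)
inflated_shares = category_shares(inflated_outcomes)

print("plain ordered shares:   ", [round(s, 3) for s in plain_shares])
print("inflated ordered shares:", [round(s, 3) for s in inflated_shares])
```

Under these assumed parameters the zero category in the inflated sample mixes "always zero" observations with ordinary zeros from the ordered stage, so its share exceeds what the ordered stage alone produces. A standard ordered probit fitted to such data has no way to separate the two sources, which is the misspecification that inflated discrete choice models are designed to address.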

[1] Mauricio Sarrias. Discrete Choice Models with Random Parameters in R: The Rchoice Package, 2016.

[2] Ricardo A. Daziano, et al. Multinomial Logit Models with Continuous and Discrete Individual Heterogeneity in R: The gmnl Package, 2017.

[3] Benjamin E. Bagozzi, et al. A Mixture Model for Middle Category Inflation in Ordered Survey Responses, 2012, Political Analysis.

[4] Tianji Cai, et al. gidm: A command for generalized inflated discrete models, 2019, The Stata Journal: Promoting communications on statistics and Stata.

[5] Michel Bierlaire, et al. PythonBiogeme: a short introduction, 2016.

[6] Erik R. Tillman, et al. Exposure to European Union Policies and Support for Membership in the Candidate Countries, 2007.

[7] Benjamin E. Bagozzi, et al. Supplemental Appendix for: Distinguishing Occasional Abstention from Routine Indifference in Models of Vote Choice, 2022.

[8] Mark N. Harris, et al. A zero-inflated ordered probit model, with an application to modelling tobacco consumption, 2007.

[9] W. Greene, et al. Inflated Responses in Measures of Self-Assessed Health, 2014, American Journal of Health Economics.

[10] William N. Venables, et al. Modern Applied Statistics with S, 2010.

[11] Alireza S. Mahani, et al. Fast Estimation of Multinomial Logit Models: R Package mnlogit, 2014, arXiv:1404.3177.

[12] Andrei Sirchenko, et al. Estimation of nested and zero-inflated ordered probit models, 2018, The Stata Journal: Promoting communications on statistics and Stata.

[13] William H. Greene, et al. NLOGIT version 3.0: reference guide, 2002.

[14] J. Quin Monson, et al. The Religion Card: Gay Marriage and the 2004 Presidential Election, 2008.

[15] Sarah Brown, et al. Modelling Category Inflation with Multiple Inflation Processes: Estimation, Specification and Testing, 2020.

[16] Panagiotis Ch. Anastasopoulos, et al. Analysis of accident injury-severities using a correlated random parameters ordered probit approach with time variant covariates, 2018, Analytic Methods in Accident Research.

[17] Sylvia Richardson, et al. PReMiuM: An R Package for Profile Regression Mixture Models Using Dirichlet Processes, 2013, Journal of Statistical Software.

[18] Wes McKinney, et al. Data Structures for Statistical Computing in Python, 2010, SciPy.

[19] B. Ripley, et al. Random and Mixed Effects, 2002.