mixl: An open-source R package for estimating complex choice models on large datasets

Abstract This paper introduces mixl, a new R package for the estimation of advanced choice models. Estimating such models typically relies on simulation methods that need a large number of random draws to obtain stable results. mixl exploits the structure of the log-likelihood problem to greatly reduce both the memory usage and the runtime of the estimation procedure for specific types of mixed multinomial logit models. Functions for prediction and posterior analysis are included. Parallel computing is also supported, with near-linear speedups observed on up to 24 cores. mixl is available on CRAN and can be used directly from R. We show that mixl is fast, easy to use, and scales to very large datasets. This paper presents the architecture and performance of the package, details its use, and reports results obtained with real-world data and models.
