Competing methods for representing random taste heterogeneity in discrete choice models

The representation of random taste heterogeneity has been a prime research interest in discrete choice modelling in recent years. Introducing random taste heterogeneity brings highly valued gains in model flexibility and in the ability of models to fit particular data. There are, however, also drawbacks that must be addressed. Aside from the heightened cost of estimation, the main complication arising with the use of mixture models is the specification of a distribution for those taste parameters that vary randomly across respondents. It is therefore of interest to specify models using mixture distributions that allow the range to be controlled while also yielding sufficient flexibility to fit the data. We further require that flexibility be scalable, so that the flexibility of the mixture can be increased gradually as desired in any given application. This would allow practitioners to start with a standard model and then adapt it to the situation at hand. Finally, we ask that increased flexibility be achievable at minimal additional computational cost, so that there is hope that the methods will be applied in large-scale applications. We summarise these conditions as (a) range control, (b) flexibility, (c) scalable flexibility, and (d) economy. Some effort has gone into advocating the use of discrete mixture models and non-parametric distributions. The distributions afforded by these methods are as flexible as the data allow and also give direct control over the range of the mixture distribution. They thus meet requirements (a) to (c) but not requirement (d). With discrete mixtures, the number of mass points required may be excessively high and there may be substantial numerical problems involved. Non-parametric methods are generally very computationally intensive. For these reasons, these methods are unlikely to be considered for large-scale applications.
Some authors have investigated the use of more advanced continuous distributions such as the Johnson SB or Johnson SU. This is a step in the right direction, but the flexibility of such distributions is still not scalable. They are also mostly unimodal, which might not hold for the true distribution to be estimated. In this paper, we stage a competition between two alternative approaches to the specification of a mixture distribution that both meet our requirements. The competition takes place over a number of matches, where each match is the estimation of a model on simulated datasets generated from a true distribution that is to be estimated. These distributions are specified by us in advance so as to pose challenging estimation problems. We mimic what a practitioner might do: we fix the estimation methods without using our a priori knowledge of the true distribution, scale the flexibility as indicated by the data, and in each match evaluate which approach performs best in terms of our criteria. The first of our contenders is a mixture distribution that is itself a discrete mixture of continuous distributions. In principle, the component distributions can be any continuous parametric distributions. However, we restrict attention to the Normal distribution as the base distribution and so obtain a discrete mixture of Normals. This approach is scalable via the number of Normal distributions used and is a straightforward extension of the standard Normal mixture. It can easily accommodate a multimodal distribution. The second contender is essentially semi-nonparametric (SNP) in nature and uses a representation of densities from Bierens (2005) that can approximate virtually any continuous distribution. For the covering abstract see ITRD E135582.
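
To make the first contender concrete, the following is a minimal sketch of a discrete mixture of Normals used as a mixing distribution. It is not the authors' implementation: the function names (`mixture_pdf`, `mixture_draws`) and all parameter values are hypothetical and chosen only for illustration. It shows how flexibility scales with the number of components K: K = 1 reproduces the standard Normal mixture, while K = 2 can already yield a bimodal taste distribution.

```python
import math
import random


def mixture_pdf(x, weights, means, sds):
    """Density at x of a discrete mixture of Normals.

    weights must be non-negative and sum to one; all names and values
    here are illustrative assumptions, not taken from the paper.
    """
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, sds)
    )


def mixture_draws(n, weights, means, sds, seed=0):
    """Draw n taste coefficients: pick a component, then draw from its Normal."""
    rng = random.Random(seed)
    components = range(len(weights))
    draws = []
    for _ in range(n):
        k = rng.choices(components, weights=weights)[0]
        draws.append(rng.gauss(means[k], sds[k]))
    return draws


# K = 1: the standard Normal mixing distribution (hypothetical parameters).
one_component = dict(weights=[1.0], means=[0.0], sds=[1.0])

# K = 2: one extra component is enough to produce a bimodal distribution.
two_components = dict(weights=[0.4, 0.6], means=[-2.0, 1.5], sds=[0.5, 0.8])

draws = mixture_draws(20000, **two_components)
sample_mean = sum(draws) / len(draws)
```

In a simulation-based estimator, draws of this kind would stand in for the randomly distributed taste parameter when the choice probabilities are averaged over the mixing distribution; adding a component adds three parameters (a weight, a mean, and a standard deviation), subject to the weights summing to one.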
