Bayesian and maximum likelihood analysis of large-scale panel choice models with unobserved heterogeneity

Abstract: This paper considers estimation and inference for a logistic panel regression model with interactive fixed effects, in which multiple individual effects are allowed and high-dimensional cross-section dependence can be captured. The model also allows for heterogeneous regression coefficients. We introduce new Bayesian and non-Bayesian approaches to estimating the model parameters and investigate the asymptotic behavior of the resulting estimators. We show that the estimated regression coefficients and the estimated interactive fixed effects are consistent and asymptotically normal when both the cross-section and time-series dimensions of the panel go to infinity. We also prove that the dimensionality of the interactive effects can be consistently estimated by the proposed information criterion. Monte Carlo simulations demonstrate the satisfactory performance of the proposed method. Finally, the method is applied to study the performance of New York City medallion drivers in terms of efficiency.
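
To make the model class concrete, the following is a minimal simulation sketch of a binary panel with heterogeneous slopes and interactive fixed effects. The functional form is an assumption made for illustration: the abstract does not state the specification, so the sketch uses the common form in which the log-odds of y_it = 1 equal x_it'beta_i + lambda_i'f_t. The function name `simulate_panel`, all parameter names, and the distributions of the regressors, slopes, loadings, and factors are hypothetical choices, not taken from the paper.

```python
import math
import random


def simulate_panel(n_units=30, n_periods=20, n_regressors=2, n_factors=2, seed=0):
    """Simulate y[i][t] in {0, 1} from an assumed logistic interactive-effects model.

    Assumed (illustrative) specification:
        P(y_it = 1) = 1 / (1 + exp(-(x_it' beta_i + lambda_i' f_t)))
    """
    rng = random.Random(seed)

    # Unit-specific (heterogeneous) regression coefficients beta_i.
    beta = [[1.0 + 0.5 * rng.gauss(0, 1) for _ in range(n_regressors)]
            for _ in range(n_units)]
    # Factor loadings lambda_i (one vector per unit) and common factors f_t
    # (one vector per period); their inner product is the interactive effect.
    loadings = [[rng.gauss(0, 1) for _ in range(n_factors)] for _ in range(n_units)]
    factors = [[rng.gauss(0, 1) for _ in range(n_factors)] for _ in range(n_periods)]

    x, y = [], []
    for i in range(n_units):
        x_i, y_i = [], []
        for t in range(n_periods):
            x_it = [rng.gauss(0, 1) for _ in range(n_regressors)]
            index = sum(a * b for a, b in zip(x_it, beta[i]))
            index += sum(a * b for a, b in zip(loadings[i], factors[t]))
            prob = 1.0 / (1.0 + math.exp(-index))
            x_i.append(x_it)
            y_i.append(1 if rng.random() < prob else 0)
        x.append(x_i)
        y.append(y_i)
    return y, x, beta, loadings, factors


if __name__ == "__main__":
    y, x, beta, loadings, factors = simulate_panel()
    share = sum(map(sum, y)) / (len(y) * len(y[0]))
    print(f"{len(y)} units x {len(y[0])} periods, share of ones: {share:.3f}")
```

Because the loadings and factors enter only through their product, they are identified only up to rotation, which is why estimation of such models typically imposes normalization restrictions. The sketch generates data only and does not implement either of the paper's estimation approaches.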
