Group structure estimation for panel data -- a general approach

Consider a panel data setting where repeated observations on individuals are available. Often it is reasonable to assume that there exist groups of individuals that share similar effects of observed characteristics, but the grouping is typically unknown in advance. We propose a novel approach to estimate such unobserved groupings for general panel data models. Our method explicitly accounts for the uncertainty in individual parameter estimates and remains computationally feasible with a large number of individuals and/or repeated measurements on each individual. The developed ideas can be applied even when individual-level data are not available and only parameter estimates together with some quantification of uncertainty are given to the researcher.
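
To make the last point concrete, the following is a minimal sketch of how a grouping might be recovered from individual parameter estimates and their standard errors alone. It is not the procedure proposed in the paper, which the abstract does not specify. The function names `standardized_distances` and `two_group_split`, the Gaussian affinity, the two-group spectral split, and the toy numbers are all illustrative assumptions, as is the independence of estimates across individuals that justifies pooling the standard errors.

```python
import numpy as np


def standardized_distances(estimates, std_errors):
    """Pairwise |b_i - b_j| scaled by the standard error of the difference.

    Assumes scalar estimates that are independent across individuals, so
    that Var(b_i - b_j) = se_i^2 + se_j^2 (an assumption of this sketch).
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    diff = est[:, None] - est[None, :]
    pooled_se = np.sqrt(se[:, None] ** 2 + se[None, :] ** 2)
    return np.abs(diff) / pooled_se


def two_group_split(dist):
    """Split individuals into two groups with a basic spectral cut.

    Builds a Gaussian affinity from the standardized distances, forms the
    symmetric normalized graph Laplacian, and thresholds the rescaled
    eigenvector of the second-smallest eigenvalue at zero.
    """
    affinity = np.exp(-0.5 * dist ** 2)
    inv_sqrt_deg = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = (
        np.eye(len(dist))
        - inv_sqrt_deg[:, None] * affinity * inv_sqrt_deg[None, :]
    )
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1] * inv_sqrt_deg
    return (fiedler > 0).astype(int)


# Toy input: eight individual slope estimates (hypothetical values), the
# first four near 0 and the last four near 1, each with standard error 0.2.
estimates = np.array([0.00, 0.10, -0.10, 0.05, 1.00, 1.10, 0.90, 1.05])
std_errors = np.full(8, 0.2)

labels = two_group_split(standardized_distances(estimates, std_errors))
print(labels)  # group labels are only defined up to relabeling
```

Scaling each difference by its standard error means that two imprecisely estimated individuals are treated as closer than two precisely estimated ones with the same raw gap, which is one simple way to let estimation uncertainty enter the grouping step.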
