Dependent Modeling of Temporal Sequences of Random Partitions

Abstract. We consider modeling a dependent sequence of random partitions. It is well known in Bayesian nonparametrics that a discrete random measure induces a distribution over random partitions. The community has therefore assumed that the best way to obtain a dependent sequence of random partitions is to model dependent random measures. We argue that this approach is problematic: the random partition model induced by dependent Bayesian nonparametric priors exhibits counter-intuitive dependence among partitions, even though the dependence in the sequence of random probability measures is intuitive. We therefore suggest modeling the sequence of random partitions directly when clustering is of principal interest. To this end, we develop a class of dependent random partition models that explicitly models dependence in a sequence of partitions. We derive conditional and marginal properties of the joint partition model and devise computational strategies for using the method in Bayesian modeling. In the case of temporal dependence, we demonstrate through simulation that the methodology produces partitions that evolve gently and naturally over time. We further illustrate the utility of the method by applying it to an environmental data set that exhibits spatio-temporal dependence.
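To make concrete the statement that a discrete random measure induces a random partition, the following is a minimal sketch and not the authors' model. It assumes a Dirichlet process built by stick-breaking with a hypothetical concentration parameter `concentration`. Units that draw the same atom fall in the same block. The function names are illustrative only.

```python
import random


def sample_atom_labels(n, concentration, rng):
    """Draw n atom indices from one stick-breaking realization of a
    Dirichlet process (assumed prior; sticks are generated lazily)."""
    weights = []      # stick-breaking weights generated so far
    remaining = 1.0   # mass of the stick not yet broken off
    labels = []
    for _ in range(n):
        u = rng.random()
        cum = 0.0
        k = 0
        while True:
            if k == len(weights):
                # Break a new piece off the remaining stick.
                v = rng.betavariate(1.0, concentration)
                weights.append(remaining * v)
                remaining *= 1.0 - v
            cum += weights[k]
            # Stop at the atom whose interval contains u; the second
            # condition is a numerical guard once the stick is exhausted.
            if u < cum or remaining < 1e-12:
                break
            k += 1
        labels.append(k)
    return labels


def labels_to_blocks(labels):
    """Group unit indices by shared atom; ties define the partition."""
    blocks = {}
    for unit, lab in enumerate(labels):
        blocks.setdefault(lab, []).append(unit)
    # Order blocks by their smallest member for a canonical form.
    return sorted(blocks.values(), key=lambda block: block[0])


rng = random.Random(0)
labels = sample_atom_labels(n=10, concentration=1.0, rng=rng)
print(labels_to_blocks(labels))
```

In this sketch the partition is only a by-product of the measure. A sequence of dependent measures would yield a sequence of partitions in the same indirect way, which is the route the abstract argues against when clustering is the main target.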
