Towards scalable Bayesian nonparametric methods for data analytics

Turning big data into actionable information means confronting four dimensions of challenge, known as the four V's: volume, variety, velocity, and veracity. In this study, we seek novel Bayesian nonparametric models and scalable learning algorithms that can address these challenges of the big data era.
