Sparse Three-Parameter Restricted Indian Buffet Process for Understanding International Trade

This paper presents a Bayesian nonparametric latent feature model especially suited to exploratory analysis of high-dimensional count data. We perform a non-negative doubly sparse matrix factorization that has two main advantages: not only are we able to better approximate the row input distributions, but the inferred topics are also easier to interpret. By combining the three-parameter and restricted Indian buffet processes into a single prior, we increase the model's flexibility, allowing for a full spectrum of sparse solutions in the latent space. We demonstrate the usefulness of our approach in the analysis of countries' economic structure. Empirical results show that, compared with other approaches, our model yields information that is easier to interpret and better captures the underlying sparsity structure of the data.
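To make the idea of a doubly sparse, non-negative factorization of count data concrete, the following is a minimal sketch only. It uses the standard one-parameter Indian buffet process rather than the three-parameter restricted prior proposed here, and the Poisson likelihood, the gamma weights, the Bernoulli mask, and all names (`sample_ibp`, `sample_counts`, `alpha`, `weight_sparsity`) are illustrative assumptions, not the paper's specification.

```python
import numpy as np


def sample_ibp(num_rows, alpha, rng):
    """Draw a binary feature matrix Z from the standard one-parameter IBP.

    Simplified stand-in: NOT the three-parameter restricted IBP of the paper.
    """
    counts = []  # counts[k] = number of rows that already use feature k
    rows = []
    for n in range(1, num_rows + 1):
        # Row n reuses existing feature k with probability counts[k] / n.
        row = [int(rng.random() < m / n) for m in counts]
        for k, z in enumerate(row):
            counts[k] += z
        # Row n also introduces Poisson(alpha / n) brand-new features.
        num_new = int(rng.poisson(alpha / n))
        row.extend([1] * num_new)
        counts.extend([1] * num_new)
        rows.append(row)
    Z = np.zeros((num_rows, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row  # features created later stay 0 for earlier rows
    return Z


def sample_counts(Z, num_cols, weight_sparsity, rng):
    """Generate counts X ~ Poisson(Z @ B) with sparse non-negative B.

    The gamma weights and Bernoulli mask are assumptions for illustration.
    """
    num_features = Z.shape[1]
    weights = rng.gamma(shape=1.0, scale=1.0, size=(num_features, num_cols))
    mask = rng.random((num_features, num_cols)) < weight_sparsity
    B = weights * mask  # second source of sparsity, on the feature side
    X = rng.poisson(Z @ B)
    return X, B


rng = np.random.default_rng(0)
Z = sample_ibp(num_rows=20, alpha=2.0, rng=rng)
X, B = sample_counts(Z, num_cols=15, weight_sparsity=0.3, rng=rng)
print("Z shape:", Z.shape, "X shape:", X.shape)
```

In this sketch, sparsity enters twice: each row of `Z` activates only a few latent features, and each feature in `B` touches only a few columns. That is the sense in which such a factorization is doubly sparse.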
