Thurstonian Boltzmann Machines: Learning from Multiple Inequalities

We introduce Thurstonian Boltzmann Machines (TBM), a unified architecture that can naturally incorporate a wide range of data types simultaneously. Our motivation rests on the Thurstonian view that many discrete data types can be regarded as generated from a subset of underlying latent continuous variables, and on the observation that each realisation of a discrete type imposes certain inequalities on those variables. Learning and inference in TBM thus reduce to making sense of a set of inequalities. TBM naturally supports the following types: Gaussian, interval, censored, binary, categorical, multicategorical, ordinal, and complete or incomplete rankings with or without ties. We demonstrate the versatility and capacity of the proposed model on three applications of very different natures: handwritten digit recognition, collaborative filtering, and complex social survey analysis.
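
To make the "observations as inequalities" idea concrete, the following is a minimal sketch of how a few of the listed data types might be translated into constraints on latent continuous utilities. The function names, the zero threshold for binary data, and the cutpoint representation for ordinal data are illustrative assumptions, not the paper's formulation or notation.

```python
import math

def binary_bounds(y, threshold=0.0):
    """Binary y=1 is read as u >= threshold, y=0 as u < threshold.
    The threshold value is an assumption made for illustration."""
    return (threshold, math.inf) if y == 1 else (-math.inf, threshold)

def ordinal_bounds(level, cutpoints):
    """Ordinal level (0-based) confines u to the interval between two
    adjacent cutpoints; the outermost intervals are unbounded."""
    edges = [-math.inf] + list(cutpoints) + [math.inf]
    return (edges[level], edges[level + 1])

def categorical_constraints(choice, n_categories):
    """Choosing one category is read as: its utility is at least as large
    as every other category's. Each pair (a, b) means u[a] >= u[b]."""
    return [(choice, j) for j in range(n_categories) if j != choice]

def rank_constraints(ranking):
    """A ranking (best first) yields a chain of pairwise inequalities
    between consecutive items. Each pair (a, b) means u[a] >= u[b]."""
    return list(zip(ranking, ranking[1:]))
```

Under these assumptions, a binary observation and an ordinal observation each become an interval on a single latent variable, while a categorical choice or a ranking becomes a set of pairwise comparisons among several latent variables. An incomplete ranking would simply contribute a shorter chain.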
