Learning Kolmogorov Models for Binary Random Variables

We summarize our recent findings, in which we proposed a framework for learning a Kolmogorov model for a collection of binary random variables. More specifically, we derive conditions that link the outcomes of specific random variables, and we use them to extract relations from the data. We also propose an algorithm for computing the model and show its first-order optimality, despite the combinatorial nature of the learning problem. We apply the proposed algorithm to recommendation systems, although it is applicable to other scenarios. We believe that the work is a significant step toward interpretable machine learning.
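To make the idea concrete, the following is a minimal sketch, not the algorithm proposed in the work. It assumes a formulation that this summary does not spell out: each binary random variable is described by a binary indicator vector `psi` over a small set of elementary events, a shared probability vector `theta` lies on the simplex, and P(X = 1) is the inner product of the two. The names `km_probability` and `implies`, and the numbers used, are hypothetical and chosen only for illustration.

```python
import numpy as np


def km_probability(theta, psi):
    """P(X = 1) under the assumed model: total mass of the elementary
    events on which the variable equals 1."""
    theta = np.asarray(theta, dtype=float)
    psi = np.asarray(psi)
    # theta is assumed to be a probability vector (nonnegative, sums to 1).
    if np.any(theta < 0) or not np.isclose(theta.sum(), 1.0):
        raise ValueError("theta must lie on the probability simplex")
    # psi is assumed to be a binary indicator vector.
    if not np.all((psi == 0) | (psi == 1)):
        raise ValueError("psi must be a binary indicator vector")
    return float(theta @ psi)


def implies(psi_a, psi_b):
    """True if the support of psi_a is contained in the support of psi_b,
    in which case X_a = 1 forces X_b = 1 under the assumed model."""
    return bool(np.all(np.asarray(psi_a) <= np.asarray(psi_b)))


# Illustrative values only (not taken from any dataset).
theta = [0.5, 0.3, 0.2]
psi_a = [1, 0, 0]
psi_b = [1, 1, 0]

p_a = km_probability(theta, psi_a)
p_b = km_probability(theta, psi_b)
a_implies_b = implies(psi_a, psi_b)
b_implies_a = implies(psi_b, psi_a)
```

Under these assumptions, the support-containment check is one simple way to read a relation between two variables directly off the learned indicator vectors, which is the kind of interpretability the summary refers to.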
