Bayesian EDDI: Sequential Variable Selection with Bayesian Partial VAE

Obtaining more relevant information enables better decision making, but acquiring it may be costly. Optimal sequential decision making trades off the value of acquiring further information against the cost of that acquisition. To this end, we propose a principled framework, EDDI (Efficient Dynamic Discovery of high-value Information), based on the theory of Bayesian experimental design. Within EDDI, we propose a novel partial variational autoencoder (Partial VAE) to efficiently handle missing data with diverse missingness patterns. We further extend this VAE-based framework with a Bayesian treatment of the weights, which improves performance in the small-data regime. EDDI then combines the Partial VAE with an acquisition function that, at each step, maximizes the expected information gain on a set of target variables.
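To give intuition for the acquisition step, the following is a minimal sketch, not the paper's actual Partial-VAE estimator: in a toy jointly Gaussian model, the expected information gain about a target y from observing a feature x_i has a closed form, the entropy drop of the conditional Gaussian, and the acquisition rule simply picks the feature with the largest gain. All variable names and the covariance values here are illustrative assumptions.

```python
import numpy as np

# Toy joint covariance over (x1, x2, y): x1 is strongly correlated
# with the target y, x2 only weakly. Zero means assumed throughout.
cov = np.array([
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.2],
    [0.9, 0.2, 1.0],
])

def info_gain(cov, i, target=2):
    """Mutual information I(y; x_i) for a zero-mean Gaussian model.

    For Gaussians this equals the expected entropy reduction of y:
    0.5 * log(var(y) / var(y | x_i)).
    """
    var_y = cov[target, target]
    var_y_given_xi = var_y - cov[target, i] ** 2 / cov[i, i]
    return 0.5 * np.log(var_y / var_y_given_xi)

# One EDDI-style acquisition step: score every unobserved feature and
# greedily pick the one with maximal expected information gain.
gains = [info_gain(cov, i) for i in (0, 1)]
best = int(np.argmax(gains))
print(gains, best)
```

In the paper's setting the model is a Partial VAE rather than a Gaussian, so the information gain has no closed form and must be estimated from samples, but the greedy select-observe-update loop has the same shape as this sketch.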
