Design of Experiments with Imputable Feature Data: An Entropy-Based Approach

Strategically selecting experiments to estimate an underlying model is a fundamental task across many fields. Since each experiment has an associated cost, it becomes necessary to select the experiments that are statistically most informative. Classic linear experimental design selects experiments so as to minimize (functions of) the variance in the estimate of the regression parameters. Standard algorithms for this problem typically assume that the data associated with each experiment is fully known. This is often not true, since missing data is a common problem; for instance, remote sensors often miss data due to poor connectivity. Experiment selection under such scenarios is therefore a widespread but challenging task. Decoupling the tasks, by first applying a standard data imputation method such as matrix completion and then selecting experiments, might seem a way forward, but it performs sub-optimally since the two tasks are naturally interdependent. Standard design of experiments is an NP-hard problem, and the additional objective of imputing missing data amplifies the computational complexity. In this paper, we propose a maximum-entropy-principle-based framework that simultaneously addresses the design of experiments and the imputation of missing data. Our algorithm exploits a homotopy from a suitably chosen convex function to the non-convex cost function, thereby avoiding poor local minima. Further, our proposed framework is flexible enough to incorporate additional application-specific constraints. Simulations on various datasets show an improvement in the cost value of over 60% in comparison to benchmark algorithms applied sequentially to the imputation and experiment selection problems.
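To make the decoupled baseline mentioned above concrete, the following is a minimal sketch of a sequential "impute, then select" pipeline. It is not the proposed maximum-entropy framework. Several choices here are assumptions made purely for illustration: column-mean imputation stands in for matrix completion, the A-optimality criterion (trace of the inverse information matrix) stands in for the "functions of variance", and a simple greedy rule stands in for the benchmark selection algorithms. The function names `mean_impute`, `a_optimal_cost`, and `greedy_select`, and the small `ridge` term, are likewise hypothetical.

```python
import numpy as np

def mean_impute(X):
    """Replace NaN entries with the mean of the observed values in each column.
    (Assumed stand-in for matrix completion.)"""
    X = np.array(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    missing = np.isnan(X)
    # For every missing entry, look up the mean of its column.
    X[missing] = np.take(col_means, np.nonzero(missing)[1])
    return X

def a_optimal_cost(X, selected, ridge=1e-6):
    """A-optimality cost: trace((X_S^T X_S + ridge * I)^{-1}).
    The ridge term (an assumption) keeps the matrix invertible
    when fewer rows than features are selected."""
    Xs = X[list(selected)]
    d = X.shape[1]
    return float(np.trace(np.linalg.inv(Xs.T @ Xs + ridge * np.eye(d))))

def greedy_select(X, k, ridge=1e-6):
    """Greedily add the experiment (row) that most reduces the A-optimality cost."""
    selected = []
    remaining = set(range(X.shape[0]))
    for _ in range(k):
        best = min(sorted(remaining),
                   key=lambda i: a_optimal_cost(X, selected + [i], ridge))
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic example: 20 candidate experiments, 3 features, about 20% missing.
rng = np.random.default_rng(0)
X_obs = rng.normal(size=(20, 3))
X_obs[rng.random(X_obs.shape) < 0.2] = np.nan

X_imp = mean_impute(X_obs)           # step 1: impute, ignoring the design goal
chosen = greedy_select(X_imp, k=5)   # step 2: select, treating imputed values as exact
cost = a_optimal_cost(X_imp, chosen)
```

In this sketch the selection step treats the imputed entries as if they had been observed, so any imputation error flows directly into the design cost. This illustrates the interdependence between the two tasks that motivates handling them jointly.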
