Interactive Elicitation of Knowledge on Feature Relevance Improves Predictions in Small Data Sets

Providing accurate predictions is challenging for machine learning algorithms when the number of features exceeds the number of samples in the data. Prior knowledge can improve machine learning models by indicating relevant variables and parameter values. Yet this prior knowledge is often tacit and available only from domain experts. We present a novel approach that uses interactive visualization to elicit this tacit prior knowledge and uses it to improve the accuracy of prediction models. The main component of our approach is a user model that captures the domain expert's knowledge of the relevance of different features for a prediction task. In particular, based on the expert's earlier input, the user model guides the selection of the features on which to elicit the user's knowledge next. The results of a controlled user study show that the user model significantly improves prior knowledge elicitation and prediction accuracy when predicting the relative citation counts of scientific documents in a specific domain.
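
The abstract does not specify how the user model chooses the next feature, so the following is only a minimal sketch of one possible approach: a linear upper-confidence-bound rule that scores each not-yet-asked feature by its estimated relevance plus an uncertainty bonus. The class name `LinearUCBUserModel`, the per-feature descriptor vectors, the binary feedback encoding, and the parameters `alpha` and `ridge` are all illustrative assumptions, not details taken from the paper.

```python
import math


class LinearUCBUserModel:
    """Hypothetical user model: linear estimate of relevance feedback
    plus an exploration bonus (an assumed stand-in, not the paper's model)."""

    def __init__(self, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # weight of the uncertainty bonus (assumed parameter)
        # Inverse of the regularised design matrix, initialised to (ridge * I)^-1.
        self.a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # feedback-weighted sum of descriptors

    def _a_inv_times(self, x):
        return [sum(row[k] * x[k] for k in range(len(x))) for row in self.a_inv]

    def score(self, x):
        """Estimated relevance of descriptor x plus alpha times its uncertainty."""
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, a_inv_x)), 0.0))
        return mean + self.alpha * width

    def update(self, x, feedback):
        """Record expert feedback (1.0 = relevant, 0.0 = not relevant) for x."""
        # Sherman-Morrison rank-one update keeps a_inv equal to the inverse
        # of (ridge * I + sum of x x^T) without a full matrix inversion.
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(len(x)):
            self.b[i] += feedback * x[i]

    def next_feature(self, descriptors, asked):
        """Return the not-yet-asked feature with the highest score."""
        candidates = [name for name in descriptors if name not in asked]
        return max(candidates, key=lambda name: self.score(descriptors[name]))


# Toy usage with made-up keyword features and two-dimensional descriptors.
descriptors = {
    "bandit": [1.0, 0.0],
    "regret": [0.9, 0.1],
    "protein": [0.0, 1.0],
    "genome": [0.1, 0.9],
}
model = LinearUCBUserModel(dim=2, alpha=1.0)
model.update(descriptors["bandit"], 1.0)   # expert marks "bandit" as relevant
model.update(descriptors["protein"], 0.0)  # expert marks "protein" as not relevant
suggestion = model.next_feature(descriptors, asked={"bandit", "protein"})
```

In this toy run the model suggests asking about "regret" next, because its descriptor resembles the feature the expert already marked as relevant. How the elicited relevance then enters the prediction model is outside the scope of this sketch.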
