A glass-box interactive machine learning approach for solving NP-hard problems with the human-in-the-loop

The goal of Machine Learning is to automatically learn from data, extract knowledge, and make decisions without any human intervention. Such automatic (aML) approaches show impressive success. Recent results even demonstrate that deep learning applied to the automatic classification of skin lesions is on par with the performance of dermatologists, and even outperforms the average dermatologist. As human perception is inherently limited, such approaches can discover patterns, e.g. that two objects are similar, in arbitrarily high-dimensional spaces, which no human is able to do. Humans can deal only with limited amounts of data, whilst big data is beneficial for aML; however, in health informatics, we are often confronted with a small number of data sets, where aML suffers from insufficient training samples and many problems are computationally hard. Here, interactive machine learning (iML) may be of help, where a human-in-the-loop contributes to reducing the complexity of NP-hard problems. A further motivation for iML is that standard black-box approaches lack transparency and hence do not foster trust in, and acceptance of, ML among end-users. Rising legal and privacy requirements, e.g. the new European General Data Protection Regulation, make black-box approaches difficult to use, because they often cannot explain why a decision has been made. In this paper, we present experiments that demonstrate the effectiveness of the human-in-the-loop approach, particularly in opening the black-box to a glass-box and thus enabling a human to interact directly with a learning algorithm. We selected the Ant Colony Optimization framework and applied it to the Traveling Salesman Problem, which is a good example due to its relevance for health informatics, e.g. for the study of protein folding. Fundamental ML research may also benefit from studies of how humans extract so much from so little data.
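To make the glass-box idea concrete, the following is a minimal sketch of Ant Colony Optimization on a small Euclidean Traveling Salesman instance. It is not the authors' implementation: the function names, the parameter values (`alpha`, `beta`, `rho`, `n_ants`, `n_iters`) and in particular the `human_hook` callback are assumptions made for illustration. The hook stands in for the human-in-the-loop: after every iteration it receives the pheromone matrix and may modify it in place, for example to reinforce an edge the human believes belongs to a good tour. The sketch also assumes that all points are distinct, so no distance is zero.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour that returns to its starting city."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


def build_tour(n, dist, pher, alpha, beta, rng):
    """One ant builds a tour, choosing each next city with probability
    proportional to pheromone**alpha * (1 / distance)**beta."""
    start = rng.randrange(n)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        i = tour[-1]
        candidates = sorted(unvisited)
        weights = [
            (pher[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
            for j in candidates
        ]
        j = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(j)
        unvisited.remove(j)
    return tour


def aco_tsp(points, n_ants=10, n_iters=50, alpha=1.0, beta=3.0,
            rho=0.5, human_hook=None, seed=0):
    """Basic ant-system loop; human_hook is a hypothetical interaction point."""
    rng = random.Random(seed)
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    pher = [[1.0] * n for _ in range(n)]
    best, best_len = None, float("inf")

    for iteration in range(n_iters):
        tours = [build_tour(n, dist, pher, alpha, beta, rng)
                 for _ in range(n_ants)]

        # Evaporation: all pheromone trails decay by the factor (1 - rho).
        for i in range(n):
            for j in range(n):
                pher[i][j] *= (1.0 - rho)

        # Deposit: each ant reinforces the edges of its tour,
        # with shorter tours depositing more pheromone.
        for tour in tours:
            length = tour_length(tour, dist)
            if length < best_len:
                best, best_len = tour, length
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                pher[a][b] += 1.0 / length
                pher[b][a] += 1.0 / length

        # Glass-box step (assumption): the human may inspect and edit
        # the pheromone matrix before the next iteration.
        if human_hook is not None:
            human_hook(iteration, pher, best)

    return best, best_len


def boost_edge_0_1(iteration, pher, best):
    """Example hook: the human reinforces the edge between cities 0 and 1."""
    pher[0][1] += 1.0
    pher[1][0] += 1.0
```

A call such as `aco_tsp(points, human_hook=boost_edge_0_1)` runs the same loop with the human intervention enabled; passing `human_hook=None` gives the fully automatic variant for comparison.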
