Making machine learning useable by revealing internal states update - a transparent approach

Machine learning (ML) techniques are often difficult to apply effectively in practice because of their complexity, and making ML useable has recently emerged as an active research field. To most users, an ML algorithm is still a 'black box', which makes complicated ML models hard to understand. As a result, users are uncertain about the usefulness of ML results, and this uncertainty reduces the effectiveness of ML methods. This paper focuses on making a 'black-box' ML process transparent by explicitly presenting real-time updates of the internal status of the ML process to users. A user study investigated the impact of revealing internal status updates on three measures: the ease of understanding the data analysis process, the meaningfulness of real-time status updates, and the convincingness of ML results. The study showed that revealing the internal states of the ML process can make the data analysis process easier to understand, make real-time status updates more meaningful, and make ML results more convincing.
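The abstract does not describe how the status updates are produced, so the following is only a minimal sketch of the general idea, not the authors' system. It assumes a toy gradient-descent line fit and a hypothetical `on_status` callback through which the learner exposes its internal state (step, loss, current parameters) while it runs; all names and parameter values here are illustrative assumptions.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical callback type: receives a snapshot of the learner's internal state.
StatusCallback = Callable[[Dict[str, float]], None]


def fit_line(
    points: Sequence[Tuple[float, float]],
    steps: int = 2000,
    lr: float = 0.05,
    report_every: int = 50,
    on_status: Optional[StatusCallback] = None,
) -> Tuple[float, float]:
    """Fit y = w * x + b by batch gradient descent on mean squared error,
    reporting internal state every `report_every` steps."""
    w, b = 0.0, 0.0
    n = len(points)
    for step in range(1, steps + 1):
        loss = grad_w = grad_b = 0.0
        for x, y in points:
            err = (w * x + b) - y
            loss += err * err
            grad_w += 2.0 * err * x
            grad_b += 2.0 * err
        loss /= n

        # Reveal the state *before* this step's update, so the reported
        # loss matches the reported parameters.
        if on_status is not None and step % report_every == 0:
            on_status({"step": step, "loss": loss, "w": w, "b": b})

        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Illustrative data lying exactly on y = 2x + 1 (an assumption for the demo).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# A user interface would render each snapshot; here they are just collected.
history: List[Dict[str, float]] = []
w, b = fit_line(data, on_status=history.append)
```

Each entry in `history` is one status update that an interface could display as the process runs, for example as a loss curve, instead of showing only the final `w` and `b`.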
