You Are the Only Possible Oracle: Effective Test Selection for End Users of Interactive Machine Learning Systems
Alex Groce | Weng-Keen Wong | Todd Kulesza | Chaoqiang Zhang | Shalini Shamasunder | Margaret M. Burnett | Simone Stumpf | Shubhomoy Das | Amber Shinsel | Forrest Bice | Kevin McIntosh