Mark Harman | Lei Ma | Yang Liu | Jie M. Zhang
[1] Aws Albarghouthi,et al. Repairing Decision-Making Programs Under Uncertainty , 2017, CAV.
[2] Daniel Kroening,et al. Concolic Testing for Deep Neural Networks , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[3] Pratik Gajane,et al. On formalizing fairness in prediction with machine learning , 2017, ArXiv.
[4] Bram van Ginneken,et al. A survey on deep learning in medical image analysis , 2017, Medical Image Anal..
[5] Algirdas A. Avi. The Methodology of N-Version Programming , 1995 .
[6] Paul Barford,et al. Data Poisoning Attacks against Autoregressive Models , 2016, AAAI.
[7] C. D. Gelatt,et al. Optimization by Simulated Annealing , 1983, Science.
[8] Standard Glossary of Software Engineering Terminology , 1990 .
[9] Günther Ruhe,et al. Search Based Software Engineering , 2013, Lecture Notes in Computer Science.
[10] David J. Robson,et al. The state-based testing of object-oriented programs , 1993, 1993 Conference on Software Maintenance.
[11] Berkman Sahiner,et al. Test data reuse for evaluation of adaptive machine learning algorithms: over-fitting to a fixed 'test' dataset and a potential solution , 2018, Medical Imaging.
[12] Nathan Srebro,et al. Equality of Opportunity in Supervised Learning , 2016, NIPS.
[13] David A. Landgrebe,et al. A survey of decision tree classifier methodology , 1991, IEEE Trans. Syst. Man Cybern..
[14] Lionel C. Briand,et al. Testing advanced driver assistance systems using multi-objective search and neural networks , 2016, 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).
[15] Lei Ma,et al. DeepHunter: Hunting Deep Neural Network Defects via Coverage-Guided Fuzzing , 2018, 1809.01266.
[16] Russ Tedrake,et al. Evaluating Robustness of Neural Networks with Mixed Integer Programming , 2017, ICLR.
[17] Nancy G. Leveson,et al. An empirical evaluation of the MC/DC coverage criterion on the HETE-2 satellite software , 2000, 19th Digital Avionics Systems Conference (DASC).
[18] Carlos Guestrin,et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.
[19] Xiaoxing Ma,et al. Structural Coverage Criteria for Neural Networks Could Be Misleading , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER).
[20] Nikolai Tillmann,et al. Test generation via Dynamic Symbolic Execution for mutation testing , 2010, 2010 IEEE International Conference on Software Maintenance.
[21] Yadong Wang,et al. Combinatorial Testing for Deep Learning Systems , 2018, ArXiv.
[22] Ameet Talwalkar,et al. Foundations of Machine Learning , 2012, Adaptive computation and machine learning.
[23] Junfeng Yang,et al. DeepXplore: Automated Whitebox Testing of Deep Learning Systems , 2017, SOSP.
[24] Xiaoxing Ma,et al. Manifesting Bugs in Machine Learning Code: An Explorative Study with Mutation Testing , 2018, 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS).
[25] Yang Liu,et al. Metamorphic Relation Based Adversarial Attacks on Differentiable Neural Computer , 2018, ArXiv.
[26] Pushmeet Kohli,et al. Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures , 2018, ICLR.
[27] Mykel J. Kochenderfer,et al. Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks , 2017, CAV.
[28] Julio Cesar Sampaio do Prado Leite,et al. On Non-Functional Requirements in Software Engineering , 2009, Conceptual Modeling: Foundations and Applications.
[29] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[30] Zhi-Hua Zhou,et al. Isolation Forest , 2008, 2008 Eighth IEEE International Conference on Data Mining.
[31] Xin Zhang,et al. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform , 2017, KDD.
[32] Toshiaki Yasue,et al. A Survey of Software Quality for Machine Learning Applications , 2018, 2018 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW).
[33] Toniann Pitassi,et al. Fairness through awareness , 2011, ITCS '12.
[34] P. Bickel,et al. Sex Bias in Graduate Admissions: Data from Berkeley , 1975, Science.
[35] Brandon M. Greenwell,et al. Interpretable Machine Learning , 2019, Hands-On Machine Learning with R.
[36] Michael P. Wellman,et al. Towards the Science of Security and Privacy in Machine Learning , 2016, ArXiv.
[37] Lu Zhang,et al. Search-based inference of polynomial metamorphic relations , 2014, ASE.
[38] Ravishankar K. Iyer,et al. ML-Based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection , 2019, 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[39] Yi Li,et al. DeepCruiser: Automated Guided Testing for Stateful Deep Learning Systems , 2018, ArXiv.
[40] Douglas M. Hawkins,et al. The Problem of Overfitting , 2004, J. Chem. Inf. Model..
[41] Dave Towey,et al. A Monte Carlo Method for Metamorphic Testing of Machine Translation Services , 2018, 2018 IEEE/ACM 3rd International Workshop on Metamorphic Testing (MET).
[42] David Lo,et al. An Empirical Study of Bugs in Machine Learning Systems , 2012, 2012 IEEE 23rd International Symposium on Software Reliability Engineering.
[43] Chung-Hao Huang,et al. Towards Dependability Metrics for Neural Networks , 2018, 2018 16th ACM/IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE).
[44] Samuel R. Bowman,et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference , 2017, NAACL.
[45] Sanjay Krishnan,et al. ActiveClean: Interactive Data Cleaning For Statistical Modeling , 2016, Proc. VLDB Endow..
[46] Indre Zliobaite,et al. Fairness-aware machine learning: a perspective , 2017, ArXiv.
[47] Jingyi Wang,et al. Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing , 2018, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).
[48] N. Japkowicz. Why Question Machine Learning Evaluation Methods? (An illustrative review of the shortcomings of current methods) , 2006 .
[49] Arnaud Gotlieb,et al. Towards Testing of Deep Learning Systems with Training Set Reduction , 2019, ArXiv.
[50] Sanjay Krishnan,et al. AlphaClean: Automatic Generation of Data Cleaning Pipelines , 2019, ArXiv.
[51] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[52] Erhard Rahm,et al. Data Cleaning: Problems and Current Approaches , 2000, IEEE Data Eng. Bull..
[53] Tao Xie,et al. Multiple-Implementation Testing of Supervised Learning Software , 2016, AAAI Workshops.
[54] Lei Ma,et al. Secure Deep Learning Engineering: A Software Quality Assurance Perspective , 2018, ArXiv.
[55] John Langford,et al. A Reductions Approach to Fair Classification , 2018, ICML.
[56] Christopher Potts,et al. A large annotated corpus for learning natural language inference , 2015, EMNLP.
[57] Ravishankar K. Iyer,et al. Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors , 2019, ArXiv.
[58] Konrad Rieck,et al. DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.
[59] Yuanyuan Zhang,et al. A search based approach to fairness analysis in requirement assignments to aid negotiation, mediation and decision making , 2009, Requirements Engineering.
[60] D. Sculley,et al. The ML test score: A rubric for ML production readiness and technical debt reduction , 2017, 2017 IEEE International Conference on Big Data (Big Data).
[61] Xiaoxing Ma,et al. Boosting operational DNN testing efficiency through conditioning , 2019, ESEC/SIGSOFT FSE.
[62] Weijie Chen,et al. Classifier variability: Accounting for training and testing , 2012, Pattern Recognit..
[63] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[64] Mark Harman,et al. A Theoretical and Empirical Study of Search-Based Testing: Local, Global, and Hybrid Search , 2010, IEEE Transactions on Software Engineering.
[65] Tao Xie,et al. Detecting Failures of Neural Machine Translation in the Absence of Reference Translations , 2019, 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks – Industry Track.
[66] Cody Fleming,et al. Towards Improved Testing For Deep Learning , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER).
[67] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[68] Gail E. Kaiser,et al. Properties of Machine Learning Applications for Use in Metamorphic Testing , 2008, SEKE.
[69] Lu Zhang,et al. Predictive Mutation Testing , 2016, IEEE Transactions on Software Engineering.
[70] R. P. Jagadeesh Chandra Bose,et al. Identifying implementation bugs in machine learning based image classifiers using metamorphic testing , 2018, ISSTA.
[71] Gail E. Kaiser,et al. Using JML Runtime Assertion Checking to Automate Metamorphic Testing in Applications without Test Oracles , 2009, 2009 International Conference on Software Testing Verification and Validation.
[72] Samuel Madden,et al. MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis , 2018, SIGMOD Conference.
[73] Sarfraz Khurshid,et al. DeepRoad: GAN-Based Metamorphic Testing and Input Validation Framework for Autonomous Driving Systems , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[74] Shin Nakajima,et al. Dataset Coverage for Testing Machine Learning Computer Programs , 2016, 2016 23rd Asia-Pacific Software Engineering Conference (APSEC).
[75] Jun Sun,et al. Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing , 2018, ArXiv.
[76] Yuriy Brun,et al. Fairness testing: testing software for discrimination , 2017, ESEC/SIGSOFT FSE.
[77] Bernease Herman,et al. The Promise and Peril of Human Evaluation for Model Interpretability , 2017, ArXiv.
[78] Reid A. Johnson,et al. Calibrating Probability with Undersampling for Unbalanced Classification , 2015, 2015 IEEE Symposium Series on Computational Intelligence.
[79] Satoshi Masuda,et al. A Test Architecture for Machine Learning Product , 2018, 2018 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW).
[80] Felix Bießmann,et al. Automating Large-Scale Data Quality Verification , 2018, Proc. VLDB Endow..
[81] Seth Flaxman,et al. European Union Regulations on Algorithmic Decision-Making and a "Right to Explanation" , 2016, AI Mag..
[82] J. Voas,et al. Software Testability: The New Verification , 1995, IEEE Softw..
[83] Patrick D. McDaniel,et al. Cleverhans V0.1: an Adversarial Machine Learning Library , 2016, ArXiv.
[84] Annibale Panichella,et al. Testing Autonomous Cars for Feature Interaction Failures using Many-Objective Search , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[85] David Clark,et al. Squeeziness: An information theoretic measure for avoiding fault masking , 2012, Inf. Process. Lett..
[86] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[87] Lionel C. Briand,et al. Testing Vision-Based Control Systems Using Learnable Evolutionary Algorithms , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).
[88] Matt J. Kusner,et al. Counterfactual Fairness , 2017, NIPS.
[89] Antonio Criminisi,et al. Measuring Neural Net Robustness with Constraints , 2016, NIPS.
[90] Claes Wohlin,et al. Guidelines for snowballing in systematic literature studies and a replication in software engineering , 2014, EASE '14.
[91] Yifan Chen,et al. An empirical study on TensorFlow program bugs , 2018, ISSTA.
[92] R. Avery,et al. Credit Scoring and Its Effects on the Availability and Affordability of Credit , 2009 .
[93] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[94] Peter W. O'Hearn,et al. From Start-ups to Scale-ups: Opportunities and Open Problems for Static and Dynamic Program Analysis , 2018, 2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM).
[95] Neoklis Polyzotis,et al. Data Validation for Machine Learning , 2019, SysML.
[96] Ting Chen,et al. State of the art: Dynamic symbolic execution for automated test generation , 2013, Future Gener. Comput. Syst..
[97] Yuriy Brun,et al. Offline Contextual Bandits with High Probability Fairness Guarantees , 2019, NeurIPS.
[98] Jinqiu Yang,et al. A Study of Oracle Approximations in Testing Deep Learning Libraries , 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[99] Danfeng Zhang,et al. Detecting Violations of Differential Privacy , 2018, CCS.
[100] Sudipta Chattopadhyay,et al. Grammar Based Directed Testing of Machine Learning Systems , 2019, ArXiv.
[101] Yuriy Brun,et al. Themis: automatically testing software for discrimination , 2018, ESEC/SIGSOFT FSE.
[102] Ron Kohavi,et al. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection , 1995, IJCAI.
[103] Eric Horvitz,et al. On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems , 2016, AAAI.
[104] Andrew McCallum,et al. A comparison of event models for naive bayes text classification , 1998, AAAI 1998.
[105] Shin Nakajima,et al. [Invited] Quality Assurance of Machine Learning Software , 2018, 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE).
[106] Paul Voigt,et al. The EU General Data Protection Regulation (GDPR): A Practical Guide , 2017 .
[107] Mark Harman,et al. An Analysis and Survey of the Development of Mutation Testing , 2011, IEEE Transactions on Software Engineering.
[108] Yuanyuan Zhang,et al. “Fairness Analysis” in Requirements Assignments , 2008, 2008 16th IEEE International Requirements Engineering Conference.
[109] Wasif Afzal,et al. A systematic review of search-based testing for non-functional system properties , 2009, Inf. Softw. Technol..
[110] F. Maxwell Harper,et al. The MovieLens Datasets: History and Context , 2016, TIIS.
[111] James C. King,et al. Symbolic execution and program testing , 1976, CACM.
[112] Ravishankar K. Iyer,et al. Towards a Bayesian Approach for Assessing Fault Tolerance of Deep Neural Networks , 2019, 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks – Supplemental Volume (DSN-S).
[113] Heike Wehrheim,et al. Testing Machine Learning Algorithms for Balanced Data Usage , 2019, 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST).
[114] Eric P. Xing,et al. What If We Simply Swap the Two Text Fragments? A Straightforward yet Effective Way to Test the Robustness of Methods to Confounding Signals in Nature Language Inference Tasks , 2018, AAAI.
[115] Liqun Sun,et al. Metamorphic testing of driverless cars , 2019, Commun. ACM.
[116] Days,et al. “Feedback Loop”: The Civil Rights Act of 1964 and its Progeny , 2005 .
[117] Shin Yoo,et al. Guiding Deep Learning System Testing Using Surprise Adequacy , 2018, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).
[118] Corina S. Pasareanu,et al. DeepSafe: A Data-Driven Approach for Assessing Robustness of Neural Networks , 2018, ATVA.
[119] Phil McMinn,et al. Search‐based software test data generation: a survey , 2004, Softw. Test. Verification Reliab..
[120] Lin Tan,et al. CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).
[121] Tom Schaul,et al. Unit Tests for Stochastic Optimization , 2013, ICLR.
[122] Atif M. Memon. GUI Testing: Pitfalls and Process , 2002, Computer.
[123] Bin Li,et al. An Empirical Study on Real Bugs for Machine Learning Programs , 2017, 2017 24th Asia-Pacific Software Engineering Conference (APSEC).
[124] Ali Shahrokni,et al. A systematic review of software robustness , 2013, Inf. Softw. Technol..
[125] M. Kenward,et al. An Introduction to the Bootstrap , 2007 .
[126] David C. Parkes,et al. How Do Fairness Definitions Fare?: Examining Public Attitudes Towards Algorithmic Definitions of Fairness , 2018, AIES.
[127] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[128] Chris Murphy,et al. An Approach to Software Testing of Machine Learning Applications , 2007, SEKE.
[129] Meng Wang,et al. Do Pseudo Test Suites Lead to Inflated Correlation in Measuring Test Effectiveness? , 2019, 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST).
[130] Mihai Oltean,et al. Fruit recognition from images using deep learning , 2017, Acta Universitatis Sapientiae, Informatica.
[131] Eugene Wu,et al. DeepBase: Deep Inspection of Neural Networks , 2018, SIGMOD Conference.
[132] Keinosuke Fukunaga,et al. Effects of Sample Size in Classifier Design , 1989, IEEE Trans. Pattern Anal. Mach. Intell..
[133] D. Sculley,et al. The Data Linter: Lightweight Automated Sanity Checking for ML Data Sets , 2017 .
[134] Sharad Goel,et al. The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning , 2018, ArXiv.
[135] Francisco José García-Peñalvo,et al. Enabling Adaptability in Web Forms Based on User Characteristics Detection Through A/B Testing and Machine Learning , 2018, IEEE Access.
[136] Zhenyu Zhang,et al. A Noise-Sensitivity-Analysis-Based Test Prioritization Technique for Deep Neural Networks , 2019, ArXiv.
[137] Zhenchang Xing,et al. Neural-Machine-Translation-Based Commit Message Generation: How Far Are We? , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[138] Lei Ma,et al. DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[139] Sanjai Rayadurgam,et al. Input Prioritization for Testing Neural Networks , 2019, 2019 IEEE International Conference On Artificial Intelligence Testing (AITest).
[140] Shin Nakajima,et al. Dataset Diversity for Metamorphic Testing of Machine Learning Software , 2018, SOFL+MSVL.
[141] Mohit Bansal,et al. Analyzing Compositionality-Sensitivity of NLI Models , 2018, AAAI.
[142] D. Sculley,et al. TensorFlow Debugger: Debugging Dataflow Graphs for Machine Learning , 2016 .
[143] Tao Xie,et al. Telemade: A Testing Framework for Learning-Based Malware Detection Systems , 2018, AAAI Workshops.
[144] Aws Albarghouthi,et al. Fairness-Aware Programming , 2019, FAT.
[145] Ricardo Baeza-Yates,et al. Quality-efficiency trade-offs in machine learning for text processing , 2017, 2017 IEEE International Conference on Big Data (Big Data).
[146] Sudipta Chattopadhyay,et al. Automated Directed Fairness Testing , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[147] Saikat Dutta,et al. Storm: program reduction for testing and debugging probabilistic programming systems , 2019, ESEC/SIGSOFT FSE.
[148] Foutse Khomh,et al. On Testing Machine Learning Programs , 2018, J. Syst. Softw..
[149] Koushik Sen,et al. CUTE: a concolic unit testing engine for C , 2005, ESEC/FSE-13.
[150] R. F. Wagner,et al. Classifier design for computer-aided diagnosis: effects of finite sample size on the mean performance of classical and neural network classifiers. , 1999, Medical physics.
[151] Yue Zhao,et al. DLFuzz: differential fuzzing testing of deep learning systems , 2018, ESEC/SIGSOFT FSE.
[152] Zachary Chase Lipton. The mythos of model interpretability , 2016, ACM Queue.
[153] David Wagner,et al. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods , 2017, AISec@CCS.
[154] Roland Vollgraf,et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms , 2017, ArXiv.
[155] Andrew Y. Ng,et al. Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .
[156] Daniel Kang,et al. Model Assertions for Debugging Machine Learning , 2018 .
[157] Luciano Baresi,et al. An Introduction to Software Testing , 2006, FoVMT.
[158] Mark Harman,et al. Constructing Subtle Faults Using Higher Order Mutation Testing , 2008, 2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation.
[159] Baowen Xu,et al. Testing and validating machine learning classifiers by metamorphic testing , 2011, J. Syst. Softw..
[160] John Mylopoulos,et al. Non-Functional Requirements in Software Engineering , 2000, International Series in Software Engineering.
[161] Tao Xie,et al. Testing Untestable Neural Machine Translation: An Industrial Case , 2018, 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion).
[162] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[163] Weiming Xiang,et al. Verification for Machine Learning, Autonomy, and Neural Networks Survey , 2018, ArXiv.
[164] Wei Li,et al. DeepBillboard: Systematic Physical-World Testing of Autonomous Driving Systems , 2018, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
[165] Daniel Kroening,et al. Global Robustness Evaluation of Deep Neural Networks with Provable Guarantees for L0 Norm , 2018, ArXiv.
[166] Yann LeCun,et al. Measuring the VC-Dimension of a Learning Machine , 1994, Neural Computation.
[167] Mark Harman,et al. Automatic Testing and Improvement of Machine Translation , 2020, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
[168] Russ Tedrake,et al. Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation , 2018, NeurIPS.
[169] Tim Menzies,et al. Easy over hard: a case study on deep learning , 2017, ESEC/SIGSOFT FSE.
[170] Alberto L. Sangiovanni-Vincentelli,et al. Systematic Testing of Convolutional Neural Networks for Autonomous Driving , 2017, ArXiv.
[171] Suman Jana,et al. DeepTest: Automated Testing of Deep-Neural-Network-Driven Autonomous Cars , 2017, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).
[172] Jameleddine Hassine,et al. Validation of Machine Learning Classifiers Using Metamorphic Testing and Feature Selection Techniques , 2017, MIWAI.
[173] Ian J. Goodfellow,et al. Technical Report on the CleverHans v2.1.0 Adversarial Examples Library , 2016 .
[174] Sanjay Krishnan,et al. ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning , 2016, SIGMOD Conference.
[175] Gail E. Kaiser,et al. Automatic system testing of programs without test oracles , 2009, ISSTA.
[176] Mark Harman,et al. An analysis of the relationship between conditional entropy and failed error propagation in software testing , 2014, ICSE.
[177] Matthew Wicker,et al. Feature-Guided Black-Box Safety Testing of Deep Neural Networks , 2017, TACAS.
[178] Roxana Geambasu,et al. FairTest: Discovering Unwarranted Associations in Data-Driven Applications , 2015, 2017 IEEE European Symposium on Security and Privacy (EuroS&P).
[179] Sumit Kumar Jha,et al. Integrating symbolic and statistical methods for testing intelligent systems: Applications to machine learning and computer vision , 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[180] Jan Kautz,et al. Unsupervised Image-to-Image Translation Networks , 2017, NIPS.
[181] Zhendong Su,et al. Compiler validation via equivalence modulo inputs , 2014, PLDI.
[182] Yu. L. Karpov,et al. Adaptation of General Concepts of Software Testing to Neural Networks , 2018, Programming and Computer Software.
[183] Timon Gehr,et al. DP-Finder: Finding Differential Privacy Violations by Sampling and Optimization , 2018, CCS.
[184] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[185] Berkman Sahiner,et al. Calibration of medical diagnostic classifier scores to the probability of disease , 2016, Statistical methods in medical research.
[186] Tim Miller,et al. Explanation in Artificial Intelligence: Insights from the Social Sciences , 2017, Artif. Intell..
[187] Mark Harman,et al. Perturbed Model Validation: A New Framework to Validate Model Relevance , 2019, ArXiv.
[188] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SKDD.
[189] Wen-Chuan Lee,et al. MODE: automated neural network model debugging via state differential analysis and input selection , 2018, ESEC/SIGSOFT FSE.
[190] Carlos Eduardo Scheidegger,et al. Assessing the Local Interpretability of Machine Learning Models , 2019, ArXiv.
[191] Andrew D. Selbst,et al. Big Data's Disparate Impact , 2016 .
[192] Jianxiong Xiao,et al. DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[193] Fei-Fei Li,et al. Visualizing and Understanding Recurrent Networks , 2015, ArXiv.
[194] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[195] Neoklis Polyzotis,et al. Data Management Challenges in Production Machine Learning , 2017, SIGMOD Conference.
[196] Yi Qin,et al. SynEva: Evaluating ML Programs by Mirror Program Synthesis , 2018, 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS).
[197] Uri Alon,et al. code2vec: learning distributed representations of code , 2018, Proc. ACM Program. Lang..
[198] P. Massart,et al. Concentration inequalities and model selection , 2007 .
[199] Gaétan Hains,et al. Towards formal methods and software engineering for deep learning: Security, safety and productivity for dl systems development , 2018, 2018 Annual IEEE International Systems Conference (SysCon).
[200] Mark Harman,et al. The Oracle Problem in Software Testing: A Survey , 2015, IEEE Transactions on Software Engineering.
[201] Li Li,et al. An Orchestrated Empirical Study on Deep Learning Frameworks and Platforms , 2018, ArXiv.
[202] Seyed-Mohsen Moosavi-Dezfooli,et al. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[203] Tsong Yueh Chen,et al. METTLE: A METamorphic Testing Approach to Assessing and Validating Unsupervised Machine Learning Systems , 2018, IEEE Transactions on Reliability.
[204] Fuyuki Ishikawa. Concepts in Quality Assessment for Machine Learning - From Test Data to Arguments , 2018, ER.
[205] Julia Rubin,et al. Fairness Definitions Explained , 2018, 2018 IEEE/ACM International Workshop on Software Fairness (FairWare).
[206] Lei Ma,et al. DeepMutation: Mutation Testing of Deep Learning Systems , 2018, 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE).
[207] Matthias Woehrle,et al. Open Questions in Testing of Learned Computer Vision Functions for Automated Driving , 2019, SAFECOMP Workshops.
[208] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[209] Joachim Wegener,et al. Evaluation of Different Fitness Functions for the Evolutionary Testing of an Autonomous Parking System , 2004, GECCO.
[210] Jenna Burrell,et al. How the machine ‘thinks’: Understanding opacity in machine learning algorithms , 2016 .
[211] Kang Li,et al. Security Risks in Deep Learning Implementations , 2017, 2018 IEEE Security and Privacy Workshops (SPW).
[212] Jun Wan,et al. MuNN: Mutation Analysis of Neural Networks , 2018, 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C).
[213] Richard Torkar,et al. A systematic review of search-based testing for non-functional system properties , 2009 .
[214] Julian Dolby,et al. Ariadne: analysis for machine learning programs , 2018, MAPL@PLDI.
[215] David A. Wagner,et al. Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).
[216] W. M. McKeeman,et al. Differential Testing for Software , 1998, Digit. Tech. J..
[217] Xin-Hua Hu,et al. Validating a deep learning framework by metamorphic testing , 2017 .
[218] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..
[219] Andreas Geiger,et al. Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..
[220] Michael I. Jordan,et al. Machine learning: Trends, perspectives, and prospects , 2015, Science.
[221] Or Biran,et al. Explanation and Justification in Machine Learning: A Survey , 2017 .
[222] Yuriy Brun,et al. Preventing undesirable behavior of intelligent machines , 2019, Science.
[223] Ravishankar K. Iyer,et al. Hands Off the Wheel in Autonomous Vehicles?: A Systems Perspective on over a Million Miles of Field Data , 2018, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[224] Dongmei Zhang,et al. A Framework for Ensuring the Quality of a Big Data Service , 2016, 2016 IEEE International Conference on Services Computing (SCC).
[225] Mark Harman,et al. A multi-objective approach to search-based test data generation , 2007, GECCO '07.
[226] Dave Towey,et al. Metamorphic Relations for Enhancing System Understanding and Use , 2020, IEEE Transactions on Software Engineering.
[227] Jan Hendrik Metzen,et al. On Detecting Adversarial Perturbations , 2017, ICLR.
[228] Diptikalyan Saha,et al. Automated Test Generation to Detect Individual Discrimination in AI Models , 2018, ArXiv.
[229] Zhi Quan Zhou,et al. Metamorphic Testing for Machine Translations: MT4MT , 2018, 2018 25th Australasian Software Engineering Conference (ASWEC).
[230] Peter L. Bartlett,et al. The Rademacher Complexity of Co-Regularized Kernel Classes , 2007, AISTATS.
[231] Harald C. Gall,et al. Software Engineering for Machine Learning: A Case Study , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).
[232] Tsong Yueh Chen,et al. Metamorphic Testing: A New Approach for Generating Next Test Cases , 2020, ArXiv.
[233] Yves Le Traon,et al. Test Selection for Deep Learning Systems , 2019, ACM Trans. Softw. Eng. Methodol..
[234] Lei Ma,et al. DeepCT: Tomographic Combinatorial Testing for Deep Learning Systems , 2019, 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER).
[235] Ananthram Swami,et al. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks , 2015, 2016 IEEE Symposium on Security and Privacy (SP).
[236] Aurélien Garivier,et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models , 2014, J. Mach. Learn. Res..
[237] Georgios Fainekos,et al. Simulation-based Adversarial Test Generation for Autonomous Vehicles with Machine Learning Components , 2018, 2018 IEEE Intelligent Vehicles Symposium (IV).
[238] Matthew Johnson-Roberson,et al. Failing to Learn: Autonomously Identifying Perception Failures for Self-Driving Cars , 2017, IEEE Robotics and Automation Letters.
[239] Daniel Kroening,et al. Testing Deep Neural Networks , 2018, ArXiv.
[240] Krishna P. Gummadi,et al. The Case for Process Fairness in Learning: Feature Selection for Fair Decision Making , 2016 .
[241] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[242] Ravishankar K. Iyer,et al. AVFI: Fault Injection for Autonomous Vehicles , 2018, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W).
[243] Sarfraz Khurshid,et al. Symbolic Execution for Deep Neural Networks , 2018, ArXiv.
[244] András György,et al. Detecting Overfitting via Adversarial Examples , 2019, NeurIPS.
[245] Sanjay Krishnan,et al. PALM: Machine Learning Explanations For Iterative Debugging , 2017, HILDA@SIGMOD.
[246] Berkman Sahiner,et al. On the assessment of the added value of new predictive biomarkers , 2013, BMC Medical Research Methodology.
[247] Daniel Kroening,et al. Safety and Trustworthiness of Deep Neural Networks: A Survey , 2018, ArXiv.
[248] Sanjay Krishnan,et al. BoostClean: Automated Error Detection and Repair for Machine Learning , 2017, ArXiv.
[249] Yuriy Brun,et al. Causal Testing: Understanding Defects' Root Causes , 2018, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
[250] A. Hartman. Software and Hardware Testing Using Combinatorial Covering Suites , 2005 .
[251] A. Jefferson Offutt,et al. MuJava: an automated class mutation system , 2005, Softw. Test. Verification Reliab..
[252] Baowen Xu,et al. Application of Metamorphic Testing to Supervised Classifiers , 2009, 2009 Ninth International Conference on Quality Software.
[253] Lubomir M. Hadjiiski,et al. Feature selection and classifier performance in computer-aided diagnosis: the effect of finite sample size. , 2000, Medical physics.
[254] Krishna P. Gummadi,et al. Fairness Constraints: Mechanisms for Fair Classification , 2015, AISTATS.
[255] Been Kim,et al. Towards A Rigorous Science of Interpretable Machine Learning , 2017, 1702.08608.
[256] Shin Nakajima. Generalized Oracle for Testing Machine Learning Computer Programs , 2017, SEFM Workshops.
[257] V. Barnett,et al. Applied Linear Statistical Models , 1975 .
[258] Cewu Lu,et al. Virtual to Real Reinforcement Learning for Autonomous Driving , 2017, BMVC.
[259] Qiang Yang,et al. Lifelong Machine Learning Test , 2015, AAAI 2015.
[260] Ian Goodfellow,et al. TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing , 2018, ICML.
[261] Paulo Cortez,et al. A data-driven approach to predict the success of bank telemarketing , 2014, Decis. Support Syst..
[262] Christian Murphy,et al. Parameterizing random test data according to equivalence classes , 2007, RT '07.