Balancing Effectiveness and Flakiness of Non-Deterministic Machine Learning Tests
[1] Lingming Zhang,et al. Fuzzing Automatic Differentiation in Deep-Learning Libraries , 2023, 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE).
[2] Aurojit Panda,et al. NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers , 2022, ASPLOS.
[3] Lingming Zhang,et al. Fuzzing deep-learning libraries via automated relational API inference , 2022, ESEC/SIGSOFT FSE.
[4] Sasa Misailovic,et al. To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects , 2022, 2022 IEEE Conference on Software Testing, Verification and Validation (ICST).
[5] Lingming Zhang,et al. Coverage-guided tensor compiler fuzzing with joint IR-pass mutation , 2022, Proc. ACM Program. Lang..
[6] Lingming Zhang,et al. Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source , 2022, 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE).
[7] Sasa Misailovic,et al. FLEX: fixing flaky tests in machine learning projects by updating assertion bounds , 2021, ESEC/SIGSOFT FSE.
[8] Sasa Misailovic,et al. TERA: optimizing stochastic regression tests in machine learning projects , 2021, ISSTA.
[9] Yepang Liu,et al. To what extent do DNN-based image classification models make unreliable inferences? , 2021, Empirical Software Engineering.
[10] Darko Marinov,et al. Domain-Specific Fixes for Flaky Tests with Wrong Assumptions on Underdetermined Specifications , 2021, 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE).
[11] Wei Yang,et al. An Empirical Analysis of UI-Based Flaky Tests , 2021, 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE).
[12] Liqian Chen,et al. Detecting numerical bugs in neural network architectures , 2020, ESEC/SIGSOFT FSE.
[13] Chao Shen,et al. Audee: Automated Testing for Deep Learning Frameworks , 2020, 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[14] Sasa Misailovic,et al. Detecting flaky tests in probabilistic and machine learning applications , 2020, International Symposium on Software Testing and Analysis.
[15] T. Chen,et al. Metamorphic Testing: A New Approach for Generating Next Test Cases , 2020, ArXiv.
[16] Jinqiu Yang,et al. A Study of Oracle Approximations in Testing Deep Learning Libraries , 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[17] Lei Ma,et al. DeepMutation++: A Mutation Testing Framework for Deep Learning Systems , 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[18] Gabriele Bavota,et al. Taxonomy of Real Faults in Deep Learning Systems , 2019, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
[19] Jie M. Zhang,et al. Automatic Testing and Improvement of Machine Translation , 2019, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
[20] Saikat Dutta,et al. Storm: program reduction for testing and debugging probabilistic programming systems , 2019, ESEC/SIGSOFT FSE.
[21] Tao Xie,et al. iFixFlakies: a framework for automatically fixing order-dependent flaky tests , 2019, ESEC/SIGSOFT FSE.
[22] Pinjia He,et al. Structure-Invariant Testing for Machine Translation , 2019, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
[23] Darko Marinov,et al. Mitigating the effects of flaky tests on mutation testing , 2019, ISSTA.
[24] Suman Nath,et al. Root causing flaky tests in a large-scale industrial setting , 2019, ISSTA.
[25] Mark Harman,et al. Machine Learning Testing: Survey, Landscapes and Horizons , 2019, IEEE Transactions on Software Engineering.
[26] T. Davenport,et al. The potential for artificial intelligence in healthcare , 2019, Future Healthcare Journal.
[27] Sasa Misailovic,et al. Statistical Algorithmic Profiling for Randomized Approximate Programs , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).
[28] Lin Tan,et al. CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).
[29] Wing Lam,et al. iDFlakies: A Framework for Detecting and Partially Classifying Flaky Tests , 2019, 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST).
[30] Liqun Sun,et al. Metamorphic testing of driverless cars , 2019, Commun. ACM.
[31] Peter Henderson,et al. An Introduction to Deep Reinforcement Learning , 2018, Found. Trends Mach. Learn..
[32] Saikat Dutta,et al. Testing probabilistic programming systems , 2018, ESEC/SIGSOFT FSE.
[33] David S. Rosenblum,et al. Verifying the long-run behavior of probabilistic system models in the presence of uncertainty , 2018, ESEC/SIGSOFT FSE.
[34] Peter W. O'Hearn,et al. From Start-ups to Scale-ups: Opportunities and Open Problems for Static and Dynamic Program Analysis , 2018, 2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM).
[35] Sarfraz Khurshid,et al. DeepRoad: GAN-Based Metamorphic Testing and Input Validation Framework for Autonomous Driving Systems , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[36] Benoit Baudry,et al. Descartes: A PITest Engine to Detect Pseudo-Tested Methods: Tool Demonstration , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).
[37] Yifan Chen,et al. An empirical study on TensorFlow program bugs , 2018, ISSTA.
[38] R. P. Jagadeesh Chandra Bose,et al. Identifying implementation bugs in machine learning based image classifiers using metamorphic testing , 2018, ISSTA.
[39] Darko Marinov,et al. DeFlaker: Automatically Detecting Flaky Tests , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).
[40] Sarfraz Khurshid,et al. Approximate Transformations as Mutation Operators , 2018, 2018 IEEE 11th International Conference on Software Testing, Verification and Validation (ICST).
[41] Pushmeet Kohli,et al. Adversarial Risk and the Dangers of Evaluating Against Weak Attacks , 2018, ICML.
[42] Suman Jana,et al. DeepTest: Automated Testing of Deep-Neural-Network-Driven Autonomous Cars , 2017, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).
[43] Jiqiang Guo,et al. Stan: A Probabilistic Programming Language. , 2017, Journal of statistical software.
[44] Timon Gehr,et al. PSI: Exact Symbolic Inference for Probabilistic Programs , 2016, CAV.
[45] Jeffrey M. Voas,et al. Metamorphic Testing for Cybersecurity , 2016, Computer.
[46] Darko Marinov,et al. Detecting Assumptions on Deterministic Implementations of Non-deterministic Specifications , 2016, 2016 IEEE International Conference on Software Testing, Verification and Validation (ICST).
[47] Tsong Yueh Chen,et al. Metamorphic Testing for Software Quality Assessment: A Study of Search Engines , 2016, IEEE Transactions on Software Engineering.
[48] 김종영. Introduction to Google TensorFlow [구글 TensorFlow 소개] , 2015 .
[49] Yves Le Traon,et al. Trivial Compiler Equivalence: A Large Scale Empirical Study of a Simple, Fast and Effective Equivalent Mutant Detection Technique , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.
[50] Fernando A. Mujica,et al. An Empirical Evaluation of Deep Learning on Highway Driving , 2015, ArXiv.
[51] Darko Marinov,et al. An empirical analysis of flaky tests , 2014, SIGSOFT FSE.
[52] Vance W. Berger,et al. Kolmogorov–Smirnov Test: Overview , 2014 .
[53] Sarfraz Khurshid,et al. Operator-based and random mutant selection: Better together , 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[54] Gilles Pokam,et al. Selective mutation testing for concurrent code , 2013, ISSTA.
[55] Mark Harman,et al. An Analysis and Survey of the Development of Mutation Testing , 2011, IEEE Transactions on Software Engineering.
[56] R. Marler,et al. The weighted sum method for multi-objective optimization: new insights , 2010 .
[57] Andreas Zeller,et al. The Impact of Equivalent Mutants , 2009, 2009 International Conference on Software Testing, Verification, and Validation Workshops.
[58] Joshua B. Tenenbaum,et al. Church: a language for generative models , 2008, UAI.
[59] Anirban DasGupta,et al. Best constants in Chebyshev inequalities with various applications , 2000 .
[60] J. Doye,et al. Global Optimization by Basin-Hopping and the Lowest Energy Structures of Lennard-Jones Clusters Containing up to 110 Atoms , 1997, cond-mat/9803344.
[61] F. J. Anscombe,et al. Distribution of the Kurtosis Statistic b2 for Normal Samples. , 1983 .
[62] V. V. Buldygin,et al. Sub-Gaussian random variables , 1980 .
[63] S. Shapiro,et al. An Analysis of Variance Test for Normality (Complete Samples) , 1965 .
[64] F. Massey. The Kolmogorov-Smirnov Test for Goodness of Fit , 1951 .
[65] Lingming Zhang,et al. Fuzzing Deep-Learning Libraries via Large Language Models , 2022, ArXiv.
[66] Sasa Misailovic,et al. AQUA: Automated Quantized Inference for Probabilistic Programs , 2021, ATVA.
[67] S. Sagar Imambi,et al. PyTorch , 2021, Programming with TensorFlow.
[68] I. Comparison. Faster Mutation Testing Inspired by Test Prioritization and Reduction , 2013 .