Towards a Common Testing Terminology for Software Engineering and Artificial Intelligence Experts

Analytical quality assurance, especially testing, is an integral part of software-intensive system development. With the increasing use of Artificial Intelligence (AI) and Machine Learning (ML) in such systems, testing becomes more difficult, as well-understood software testing approaches cannot be applied directly to the AI-enabled parts of the system. Adapting classical testing approaches and developing new testing concepts for AI would benefit from a deeper understanding of and exchange between AI and software engineering experts. We see a major obstacle in the different terminologies used in the two communities. Since we consider a mutual understanding of testing terminology to be key, this paper contributes a mapping between the most important concepts of classical software testing and AI testing. In the mapping, we highlight differences in the relevance and naming of the mapped concepts.
