Construction of a quality model for machine learning systems

Systems containing components based on machine learning (ML) methods are becoming increasingly widespread. To ensure the intended behavior of a software system, standards such as ISO/IEC 25010 define the necessary qualities of the system and its components. Because ML differs in nature from conventional software, existing qualities have to be re-interpreted for ML systems, or new ones (such as trustworthiness) have to be added. We have to be precise about which quality property is relevant for which entity of interest (such as completeness of training data or correctness of the trained model) and about how to objectively evaluate adherence to quality requirements. In this article, we present how to systematically construct quality models for ML systems, based on an industrial use case. Such a quality model enables practitioners to specify and assess qualities for ML systems objectively. In addition to the overall construction process, the main outcomes include (1) a meta-model for specifying quality models for ML systems; (2) reference elements regarding relevant views, entities, quality properties, and measures for ML systems, based on existing research; (3) an example instantiation of a quality model for a concrete industrial use case; and (4) lessons learned from applying the construction process. We found that following a systematic process is crucial for arriving at measurable quality properties that can be evaluated in practice. In the future, we want to learn how the notion of quality differs between different types of ML systems and to develop reference quality models for evaluating qualities of ML systems.
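
To make the relationship between views, entities, quality properties, and measures more concrete, the following is a minimal sketch in Python. It is not the meta-model from the article: the class names (`Measure`, `QualityProperty`), the `non_missing_ratio` measure, the threshold of 0.75, and the sample records are all hypothetical and chosen only for illustration. The sketch attaches the quality property "completeness" to the entity "training data" and evaluates it with one objective measure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class Measure:
    """A computable indicator with an acceptance threshold (hypothetical)."""
    name: str
    compute: Callable[[Sequence[dict]], float]
    threshold: float

    def evaluate(self, data: Sequence[dict]) -> Tuple[float, bool]:
        # Return the measured value and whether it meets the threshold.
        value = self.compute(data)
        return value, value >= self.threshold


@dataclass
class QualityProperty:
    """A quality property tied to one entity within one view (hypothetical)."""
    name: str
    entity: str
    view: str
    measures: List[Measure] = field(default_factory=list)

    def assess(self, data: Sequence[dict]) -> Dict[str, Tuple[float, bool]]:
        # Evaluate every measure attached to this property.
        return {m.name: m.evaluate(data) for m in self.measures}


def non_missing_ratio(records: Sequence[dict]) -> float:
    """Share of records in which no field value is None."""
    if not records:
        return 0.0
    complete = sum(all(v is not None for v in r.values()) for r in records)
    return complete / len(records)


# Assumed example: completeness of training data, seen from a "data" view.
completeness = QualityProperty(
    name="completeness",
    entity="training data",
    view="data",
    measures=[Measure("non_missing_ratio", non_missing_ratio, threshold=0.75)],
)

# Made-up records; one of the four has a missing value.
training_data = [
    {"temperature": 21.5, "label": "ok"},
    {"temperature": None, "label": "ok"},
    {"temperature": 19.0, "label": "defect"},
    {"temperature": 22.1, "label": "ok"},
]

result = completeness.assess(training_data)
print(result)
```

Keeping the property, its entity, and its measures as separate objects reflects the point that a quality requirement is only assessable once it is bound to a specific entity and to a measure with an explicit acceptance criterion.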
