Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology

Machine learning is an established and frequently used technique in industry and academia, but a standard process model to improve the success and efficiency of machine learning applications is still missing. Project organizations and machine learning practitioners face manifold challenges and risks when developing machine learning applications and need guidance to meet business expectations. This paper therefore proposes a process model for the development of machine learning applications, covering six phases from defining the scope to maintaining the deployed machine learning application. Business and data understanding are executed simultaneously in the first phase, as both have considerable impact on the feasibility of the project. The next phases comprise data preparation, modeling, evaluation, and deployment. Special focus is placed on the last phase, as a model running in changing real-time environments requires close monitoring and maintenance to reduce the risk of performance degradation over time. For each task of the process, this work proposes a quality assurance methodology suited to addressing the challenges in machine learning development, which are identified in the form of risks. The methodology is drawn from practical experience and scientific literature, and has proven to be general and stable. The process model expands on CRISP-DM, a data mining process model that enjoys strong industry support but fails to address machine-learning-specific tasks. The presented work proposes an industry- and application-neutral process model tailored for machine learning applications, with a focus on technical tasks for quality assurance.
