Challenges in the Deployment and Operation of Machine Learning in Practice

Machine learning has recently emerged as a powerful technique to increase operational efficiency or to develop new value propositions. However, translating a prediction algorithm into an operationally usable machine learning model is a time-consuming and, in several respects, challenging task. In this work, we aim to systematically elicit the challenges in deployment and operation to enable broader practical dissemination of machine learning applications. To this end, we first identify relevant challenges through a structured literature analysis. Subsequently, we conduct an interview study with machine learning practitioners across various industries, perform a qualitative content analysis, and identify challenges organized along three distinct categories as well as six overarching clusters. Finally, we evaluate the results from both literature and interviews in a comparative analysis. Key issues identified include automated strategies for data drift detection and handling, standardization of machine learning infrastructure, and appropriate communication and expectation management.
