Machine Learning Software Engineering in Practice: An Industrial Case Study

SAP is the market leader in enterprise software, offering an end-to-end suite of applications and services that enable its customers worldwide to run their businesses. SAP's retail customers, in particular, handle millions of sales transactions in their day-to-day business. Transactions are created at point-of-sale (POS) terminals during retail sales and then sent to central servers for validation and other business operations. A considerable proportion of these retail transactions may contain inconsistencies caused by technical and human errors. SAP provides an automated process for error detection, but correction still requires manual work by dedicated employees using workbench software. These manual corrections are time-consuming and labor-intensive, and incorrect modifications may introduce further errors. This not only slows down the customers' business workflow but also incurs high operational costs. Automated detection and correction of transaction errors therefore has considerable potential business value and can improve the business workflow. In this paper, we present an industrial case study in which we apply machine learning (ML) to automatically detect transaction errors and propose corrections. We identify and discuss the challenges we faced during this collaborative research and development project from three distinct perspectives: software engineering, machine learning, and industry-academia collaboration. We report our experience and insights from the project, together with guidelines for addressing the identified challenges. We believe that our findings and recommendations can help researchers and practitioners embarking on similar endeavors.
