Dynamic Data Management for Machine Learning in Embedded Systems: A Case Study

Dynamic data and continuously evolving sets of records are essential for a wide variety of today’s data management applications. These applications range from large, social, content-driven Internet applications to highly focused data processing verticals such as data-intensive science, telecommunications, and intelligence applications. However, the dynamic and multimodal nature of such data makes it challenging to transform into machine-readable and machine-interpretable forms. In this paper, we report on an action research study conducted in collaboration with a multinational company in the embedded systems domain. Working in the context of a real-world industrial application of dynamic data management, we provide insights for the data science community that can guide discussion and future research into dynamic data management in embedded systems. Our study identifies key challenges in the data collection, data storage, and data cleaning phases that can significantly impact the overall performance of the system.
