Big data analysis for sensor time-series in automation

Large-scale data production is no longer confined to web companies; it is spreading to other domains, including industrial automation. Smart sensors and smart devices generate growing amounts of data that need to be processed. Examples of such processing include prediction for better control, clustering for more effective maintenance, and analysis aimed at improving overall production. The so-called Big Data paradigm offers new ways of handling larger volumes of heterogeneous data, along with technologies that can process them efficiently. This paper examines the use of Big Data technologies in the industrial automation domain. The approach is illustrated on time-series data measured in a passive house, with the goal of detecting specific events. We show how Big Data technologies enable data analysis that would be hard with traditional approaches.
