Challenges for Data Mining on Sensor Data of Interlinked Processes

In industries such as steel production, interlinked production processes leave no time to assess the physical quality of intermediate products. Failures during the process can cause high internal costs when defective products are passed through the entire value chain. However, process data such as machine parameters and sensor readings, which are directly linked to quality, can be recorded. Based on a rolling mill case study, the paper discusses how decentralized data mining and intelligent machine-to-machine communication could be used to predict the physical quality of intermediate products online and in real time, so that quality issues are detected as early as possible. The large volume of recorded data and the distributed but sequential nature of the problem lead to challenging research questions for the next generation of
