A field test with self-organized modeling for knowledge discovery in a fleet of city buses

Fleets of commercial vehicles represent an excellent real-life setting for ubiquitous knowledge discovery. A modern bus or truck carries many electronic control units, which exchange hundreds of signals over the controller area network. The growing complexity of the vehicles has led to a strong demand for systems for fault detection, remote diagnostics, and maintenance prediction. This paper aims to show that a self-organized algorithm can discover useful diagnostic knowledge in a fleet of city buses. The approach is a process consisting of two parts: unsupervised modeling (where interesting features are discovered) and guided search (where the previously found features are coupled to additional information sources). The modeling part searches for simple linear models in a group of vehicles, selecting interesting features based on both non-randomness in the relations and variability within the group. An eight-month data collection study shows that this approach discovered features related to broken wheel-speed sensors. Strikingly, for the vehicles with broken sensors, deviations in these features can be observed up to several months before a breakdown occurs. This potentially allows sufficient time to schedule the vehicle for maintenance and to supply the workshop with the relevant components.
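The unsupervised modeling step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_line`, `rank_relations`, `most_deviating`), the use of R² as a proxy for non-randomness, the spread of fitted slopes as a proxy for variability, and the product of the two as a score are all assumptions made for illustration. The abstract only states that simple linear models are fitted across a group of vehicles and ranked by those two criteria.

```python
from itertools import permutations
from statistics import mean, median, pstdev


def fit_line(x, y):
    """Least-squares fit y ~ a*x + b. Returns (a, b, r2)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        # A constant signal carries no usable relation.
        return 0.0, my, 0.0
    a = sxy / sxx
    return a, my - a * mx, (sxy * sxy) / (sxx * syy)


def rank_relations(fleet):
    """Rank signal pairs by (assumed) interestingness.

    fleet: {vehicle_id: {signal_name: [samples]}}, same signals on every vehicle.
    Returns a list of (score, (x_name, y_name), slopes_by_vehicle), best first.
    """
    signals = sorted(next(iter(fleet.values())))
    ranked = []
    for x_name, y_name in permutations(signals, 2):
        fits = {v: fit_line(s[x_name], s[y_name]) for v, s in fleet.items()}
        slopes = {v: f[0] for v, f in fits.items()}
        # Assumption: mean R^2 stands in for "non-randomness in relations".
        non_randomness = mean(f[2] for f in fits.values())
        # Assumption: slope spread stands in for "variability in the group".
        variability = pstdev(list(slopes.values()))
        ranked.append((non_randomness * variability, (x_name, y_name), slopes))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


def most_deviating(slopes):
    """Return the vehicle whose slope lies farthest from the fleet median."""
    centre = median(slopes.values())
    return max(slopes, key=lambda v: abs(slopes[v] - centre))
```

A relation that fits well on every vehicle but whose parameters differ between vehicles scores highest under this heuristic. The vehicle returned by `most_deviating` is then a candidate for the guided search against additional information sources, such as workshop records.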
