Knowledge Extraction from Logged Truck Data using Unsupervised Learning Methods

The goal was to extract knowledge from data that is logged by the electronic system of every Volvo truck. This allowed the evaluation of large populations of trucks without requiring additional measuring devices and facilities.

An evaluation cycle, similar to the knowledge discovery from databases model, was developed and applied to extract knowledge from the data. The focus was on extracting information in the logged data that is related to the class labels of different populations, but the cycle also supported extracting knowledge inherent in the given classes. The methods used come from unsupervised learning, a sub-field of machine learning, and include self-organizing maps, multi-dimensional scaling and fuzzy c-means clustering.

The developed evaluation cycle was exemplified by the evaluation of three data sets. Two data sets were arranged from populations of trucks differing in their operating environment with regard to road condition or gross combination weight. The results showed that the logged data contains relevant information describing these differences in the operating environment. A third data set consisted of populations with different engine configurations, so that the two groups of trucks differed in power. Using the knowledge extracted in this task, engines that were sold in one of the two configurations and modified later could be detected.

Information in the logged data that describes the vehicle's operating environment makes it possible to detect trucks that are operated differently from their intended use. Initial experiments to find such vehicles were conducted, and recommendations for an automated application were given.
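Of the methods named above, fuzzy c-means clustering is compact enough to illustrate briefly. The following is a minimal sketch of the standard algorithm, not the implementation used in this work, which the abstract does not describe. The function name `fuzzy_c_means`, its parameters, and the two-feature sample data are hypothetical and chosen only for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Start from random memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of all points weighted by membership**m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(len(points))]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])

        # Membership update from the relative distances to all centres.
        max_change = 0.0
        for i, p in enumerate(points):
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # Point coincides with a centre: full membership there.
                zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
                new_row = [z / sum(zeros) for z in zeros]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        # Stop once the memberships no longer change noticeably.
        if max_change < tol:
            break
    return centers, u


# Made-up data: two well-separated groups of three vehicles each,
# described by two hypothetical logged features.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, u = fuzzy_c_means(points, n_clusters=2)
# Hard labels: the cluster with the highest membership for each point.
labels = [max(range(2), key=lambda j: row[j]) for row in u]
```

Unlike hard clustering, each vehicle keeps a graded membership in every cluster, so vehicles that sit between two populations can be recognised by memberships close to each other rather than being forced into one group.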
