Integrating human knowledge within a hybrid clustering-classification scheme for detecting patterns within large movement data sets

The visual analysis of large movement data sets can be a challenging task. This study proposes an approach for identifying interesting movement patterns that combines human knowledge and decision making with a hybrid clustering-classification method. Rather than performing an unsupervised clustering of the entire data set, a stratified random sample of the full data set is used to identify initial clusters that are verified and labelled by the analyst, and then used as input patterns for classifying the remainder of the data set using an iterative genetic program. Classifications suggested after each iteration are presented to the analyst for refinement based on their knowledge and experience. A geovisual analytics environment is provided both to show the outcomes of the clustering and classification and to obtain the analyst’s input during the hybrid clustering-classification process. Our approach allows data to be classified without a priori specification of classification patterns. Instead, the process takes advantage of human decision making within the automatic analysis of the data. The approach was tested with fishing vessel movement data in Eastern Canada.
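
The workflow described above (stratified sample, clustering of the sample, analyst labelling, classification of the remainder) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic records and their two features (speed and turn rate), the plain k-means clustering, the `analyst_label` function that stands in for the human analyst, and above all the nearest-centroid classifier, which is a deliberately simple substitute for the iterative genetic program used in the study. The analyst's iterative refinement is reduced to a single pass.

```python
import random
from collections import defaultdict
from math import dist

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def make_record(vessel_id, behaviour):
    """Hypothetical record: (vessel_id, speed, turn_rate, true_behaviour)."""
    if behaviour == "fishing":  # slow movement with frequent turns
        return (vessel_id, rng.gauss(2, 0.5), rng.gauss(40, 3), behaviour)
    return (vessel_id, rng.gauss(10, 0.5), rng.gauss(5, 3), behaviour)

records = [make_record(v, "fishing" if i % 2 else "transit")
           for v in range(4) for i in range(50)]

def stratified_sample(items, stratum_of, fraction):
    """Draw the same fraction from every stratum (at least one item each)."""
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, max(1, round(fraction * len(members)))))
    return chosen

def k_means(points, k, iterations=20):
    """Plain k-means; an assumed clustering step, not the paper's method."""
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Keep the old centroid if a group ends up empty.
        centroids = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def analyst_label(centroid):
    """Stand-in for the analyst's judgement: slow clusters are called fishing."""
    return "fishing" if centroid[0] < 6 else "transit"

def classify(point, labelled_centroids):
    """Nearest-centroid stand-in for the genetic program classifier."""
    return min(labelled_centroids, key=lambda lc: dist(point, lc[0]))[1]

# 1. Stratify by vessel and cluster only the sample.
sample = stratified_sample(records, lambda r: r[0], 0.2)
centroids = k_means([r[1:3] for r in sample], 2)

# 2. The "analyst" verifies and labels each cluster.
labelled = [(c, analyst_label(c)) for c in centroids]

# 3. Classify the records that were not part of the sample.
sampled = set(sample)
remainder = [r for r in records if r not in sampled]
predicted = [classify(r[1:3], labelled) for r in remainder]

accuracy = sum(p == r[3] for p, r in zip(predicted, remainder)) / len(remainder)
print(f"classified {len(remainder)} records, agreement with ground truth: {accuracy:.2f}")
```

In an interactive setting, step 2 would be replaced by input gathered through the geovisual analytics environment, and step 3 would be repeated, with the analyst correcting suggested classifications after each iteration.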
