A swarm-trained k-nearest prototypes adaptive classifier with automatic feature selection for interval data

Some complex data types can model variability and imprecision in data. These data types are studied in the field of symbolic data analysis. One such type is interval data, which represents ranges of values and is more versatile than classic point data in many domains. This paper proposes a new prototype-based classifier for interval data, trained by a swarm optimization method. Our work makes two main contributions: a swarm method that performs both automatic feature selection and pruning of unused prototypes, and a generalized weighted squared Euclidean distance for interval data. By discarding unnecessary features and prototypes, the proposed algorithm addresses typical limitations of prototype-based methods, such as sensitivity to prototype initialization. The proposed distance is useful for learning classes with different shapes, sizes and structures in interval datasets. Compared with other prototype-based methods, the proposed method achieves lower error rates on both synthetic and real interval datasets.
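To make the ingredients concrete, the following is a minimal sketch of a k-nearest-prototypes decision rule over interval data. It is an illustration and not the method proposed in the paper. The abstract does not give the generalized distance, so the sketch assumes a common simpler form: per-feature weights applied to the squared differences of the lower and upper interval bounds. The function names, the array layout (one `[lower, upper]` pair per feature) and the use of a zero weight to mimic a discarded feature are all assumptions made for illustration. The swarm training that would learn the prototypes and weights is omitted.

```python
import numpy as np


def weighted_sq_euclidean_interval(x, p, w):
    """Weighted squared Euclidean distance between two interval vectors.

    x, p: array-likes of shape (n_features, 2), each row is [lower, upper].
    w:    array-like of shape (n_features,), non-negative feature weights.
    Assumed form (not taken from the paper):
        sum_j w_j * ((x_lower_j - p_lower_j)^2 + (x_upper_j - p_upper_j)^2)
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    # Squared differences of both bounds, summed within each feature.
    per_feature = ((x - p) ** 2).sum(axis=1)
    return float((w * per_feature).sum())


def predict_k_nearest_prototypes(x, prototypes, labels, w, k=1):
    """Majority vote among the k prototypes closest to x."""
    dists = np.array(
        [weighted_sq_euclidean_interval(x, p, w) for p in prototypes]
    )
    # Stable sort keeps the original prototype order among equal distances.
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    # On a tied vote, the label encountered first among the nearest wins.
    return max(votes, key=votes.get)


# Hypothetical toy data: two prototypes, two interval-valued features.
prototypes = [
    [[0.0, 1.0], [0.0, 1.0]],
    [[5.0, 6.0], [5.0, 6.0]],
]
labels = ["a", "b"]
sample = [[0.5, 1.5], [5.0, 6.0]]

# A zero weight removes a feature from the distance entirely.
pred_first_feature_only = predict_k_nearest_prototypes(
    sample, prototypes, labels, w=[1.0, 0.0]
)
pred_second_feature_only = predict_k_nearest_prototypes(
    sample, prototypes, labels, w=[0.0, 1.0]
)
```

In this toy setup the predicted class depends on which feature carries weight, which is the intuition behind letting the training procedure drive the weights of uninformative features toward zero.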
