Learning similarity measures from data with fuzzy sets and particle swarms

Gauging the similarity among objects is a common task that underpins many popular machine learning endeavours such as classification and clustering. Uncertainty representation mechanisms such as rough set theory, and information processing paradigms such as granular computing, also rely on well-defined similarity measures to better model the objects in the universe of discourse. In this information-laden world, the responsibility for designing these crucial granular constructs is shifting from domain experts to intelligent systems that learn them automatically from data. An approach that hybridizes particle swarm optimization with elements of rough set theory has recently been proposed [1] to build these similarity measures from scratch. However, this scheme remains fairly sensitive to the values of the similarity thresholds in both the input attribute space and the decision space. In this paper, we tackle this limitation by employing fuzzy sets to categorize the domain of both similarity thresholds. The efficacy of the proposed methodology is illustrated with the K-nearest neighbor classifier. Empirical results over several well-known repositories confirm that this approach preserves classification accuracy while reducing the number of system parameters and enhancing interpretability.
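To make the role of a learned similarity measure concrete, the following is a minimal sketch of a weighted attribute similarity plugged into a K-nearest neighbor classifier. It is not the method of this paper: the function names (weighted_similarity, knn_predict), the linear per-attribute comparison, and the fixed weights are all assumptions made for illustration. In a scheme like the one described above, the weights would be the quantities an optimizer such as particle swarm optimization tunes; the fuzzy categorization of the similarity thresholds is not modelled here.

```python
from collections import Counter


def weighted_similarity(x, y, weights, ranges):
    """Weighted average of per-attribute similarities, each in [0, 1].

    Assumed form (hypothetical, for illustration): the local similarity of
    a numeric attribute is 1 - |x_i - y_i| / range_i.
    """
    total = 0.0
    for xi, yi, w, r in zip(x, y, weights, ranges):
        local = 1.0 - abs(xi - yi) / r if r > 0 else 1.0
        total += w * local
    return total / sum(weights)


def knn_predict(query, data, labels, weights, ranges, k=3):
    """Majority vote among the k training objects most similar to query."""
    scored = sorted(
        zip(data, labels),
        key=lambda pair: weighted_similarity(query, pair[0], weights, ranges),
        reverse=True,  # most similar objects first
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: two attributes, both assumed to span a range of 1.0.
data = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 0.8)]
labels = ["a", "a", "b", "b"]
weights = [0.7, 0.3]  # hypothetical attribute weights
ranges = [1.0, 1.0]

prediction = knn_predict((0.2, 0.1), data, labels, weights, ranges, k=3)
```

With these toy values the query lies closest to the two objects labelled "a", so the majority vote among its three nearest neighbors selects that class. Changing the weights changes which neighbors count as nearest, which is why learning them from data matters.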

[1] Lotfi A. Zadeh, et al. Similarity relations and fuzzy orderings, 1971, Inf. Sci.

[2] Larry A. Rendell, et al. A Practical Approach to Feature Selection, 1992, ML.

[3] Gwanggil Jeon, et al. Learning Collaboration Links in a Collaborative Fuzzy Clustering Environment, 2007, MICAI.

[4] Gin-Shuh Liang, et al. Cluster analysis based on fuzzy equivalence relation, 2005.

[5] David G. Stork, et al. Pattern Classification, 1973.

[6] Witold Pedrycz, et al. Granular Computing - The Emerging Paradigm, 2007.

[7] Piero P. Bonissone, et al. On heuristics as a fundamental constituent of soft computing, 2008, Fuzzy Sets Syst.

[8] Samer Al Hawari, et al. A Comprehensive Comparative Study Using Vector Space Model with K-Nearest Neighbor on Text Categorization Data, 2008.

[9] Michael M. Richter, et al. Case-Based Reasoning: A Textbook, 2013.

[10] Rafael Bello, et al. A Method for Building Prototypes in the Nearest Prototype Approach Based on Similarity Relations for Problems of Function Approximation, 2012, MICAI.

[11] Włodzisław Duch, et al. Weighting and Selection of Features, 1999.

[12] Andrzej Skowron, et al. Rough sets: Some extensions, 2007, Inf. Sci.

[13] Rafael Bello, et al. A method to build similarity relations into extended Rough Set Theory, 2010, 2010 10th International Conference on Intelligent Systems Design and Applications.

[14] Rafael Bello, et al. An analysis about the measure quality of similarity and its applications in machine learning, 2013.

[15] Wen-June Wang, et al. New similarity measures on fuzzy sets and on elements, 1997, Fuzzy Sets Syst.

[16] Rafael Bello, et al. Using PSO and RST to Predict the Resistant Capacity of Connections in Composite Structures, 2010, NICSO.

[17] Seok-Beom Roh, et al. A design of granular fuzzy classifier, 2014, Expert Syst. Appl.

[18] Riccardo Poli, et al. Particle swarm optimization, 1995, Swarm Intelligence.

[19] Yailé Caballero Mota, et al. Improving the MLP Learning by Using a Method to Calculate the Initial Weights of the Network Based on the Quality of Similarity Measure, 2011, MICAI.

[20] Andrzej Skowron, et al. Toward Perception Based Computing: A Rough-Granular Perspective, 2006, WImBI.

[21] Witold Pedrycz, et al. Building granular fuzzy decision support systems, 2014, Knowl. Based Syst.

[22] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[23] Rafael Falcon, et al. Learning Membership Functions for an Associative Fuzzy Neural Network, 2008.

[24] Shigeo Abe. Pattern Classification, 2001, Springer London.