New similarity index based on the aggregation of membership functions through OWA operator

In data analysis, metrics are the classical way to assess pairwise similarity. Unfortunately, popular distances are often ineffective because of noise, high dimensionality, and the heterogeneous nature of the data. These drawbacks lead us to propose a similarity index based on fuzzy set theory. Each object of the dataset is described by the vector of its fuzzy attributes. Aggregation operators then combine these fuzzy attributes to fuzzify the object, so that each object becomes a fuzzy subset of the dataset. The similarity of another object to a reference object is assessed through the membership function of the fuzzified reference object and an aggregation method based on the ordered weighted averaging (OWA) operator.
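The aggregation step described above can be illustrated with a minimal sketch. The abstract does not specify the membership functions or the OWA weights, so everything below is an assumption for illustration only: the triangular membership function, the `spreads` parameter, and the function names `owa`, `triangular_membership`, and `similarity` are hypothetical and are not the authors' definitions. Only the OWA operator itself follows the standard definition (weights applied to the values sorted in descending order).

```python
def owa(values, weights):
    """Standard OWA: weights are applied to the values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def triangular_membership(x, ref, spread):
    """Assumed per-attribute membership: 1 at the reference value,
    decreasing linearly to 0 at distance `spread` (hypothetical choice)."""
    return max(0.0, 1.0 - abs(x - ref) / spread)


def similarity(reference, other, spreads, weights):
    """Similarity of `other` to `reference`: per-attribute membership degrees
    in the fuzzified reference, aggregated with an OWA operator."""
    degrees = [
        triangular_membership(x, r, s)
        for x, r, s in zip(other, reference, spreads)
    ]
    return owa(degrees, weights)


if __name__ == "__main__":
    reference = [1.0, 10.0, 0.5]
    other = [1.5, 12.0, 0.5]
    spreads = [1.0, 4.0, 1.0]  # assumed tolerance per attribute

    # Weights concentrated on the smallest degrees behave like a soft "and".
    print(similarity(reference, other, spreads, [0.2, 0.3, 0.5]))
    # Extreme weight vectors recover max and min aggregation.
    print(similarity(reference, other, spreads, [1.0, 0.0, 0.0]))
    print(similarity(reference, other, spreads, [0.0, 0.0, 1.0]))
```

Note that the index built this way is not symmetric in general: the membership degrees are computed relative to the reference object, so swapping the two objects can change the result when the spreads differ per object. The choice of weight vector controls how tolerant the index is to a few strongly dissimilar attributes.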
