Pattern recognition with mixed and incomplete data

This paper reviews an increasingly active line of study in pattern recognition (PR) called “logical combinatorial pattern recognition” (LCPR). Briefly, LCPR addresses pattern recognition problems in which object descriptions are mixed and incomplete, i.e., objects are described simultaneously by numerical and non-numerical features and may have missing values, and in which objects are compared using similarity functions less restricted than a distance. The similarity function is not necessarily the opposite or the inverse of a distance function, and it may be asymmetric. The paper concisely presents the need for this branch of PR, the ways in which these problems have been approached, its basic concepts and tools, the principal theoretical and practical results of the last 30 years, and the most important directions for future work.
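As an illustration of the kind of similarity function described above, the following is a minimal sketch and not the method of any particular LCPR algorithm. It assumes that each feature has its own comparison criterion, that `None` marks a missing value, and that similarity is averaged over the features known in the first object only. The names `numeric_criterion`, `nominal_criterion`, and `similarity`, and the averaging rule itself, are hypothetical choices made for this example.

```python
from typing import Callable, Optional, Sequence, Union

# None marks a missing value (an assumption of this sketch).
Value = Optional[Union[float, str]]
Criterion = Callable[[Value, Value], float]


def numeric_criterion(scale: float) -> Criterion:
    """Build a comparison criterion in [0, 1] for a numeric feature.

    `scale` is the assumed range of the feature.
    """
    def compare(x: float, y: float) -> float:
        return max(0.0, 1.0 - abs(x - y) / scale)
    return compare


def nominal_criterion(x: str, y: str) -> float:
    """Comparison criterion for a non-numeric feature: exact match or not."""
    return 1.0 if x == y else 0.0


def similarity(a: Sequence[Value], b: Sequence[Value],
               criteria: Sequence[Criterion]) -> float:
    """Similarity of `b` to `a`, averaged over the features known in `a`.

    Features missing in `a` are skipped; features missing only in `b`
    count as no agreement. This makes the function asymmetric.
    """
    total = 0.0
    known = 0
    for x, y, compare in zip(a, b, criteria):
        if x is None:
            continue  # unknown in the reference object: not comparable
        known += 1
        if y is not None:
            total += compare(x, y)
    return total / known if known else 0.0


# Two objects described by one numeric and two non-numeric features.
criteria = [numeric_criterion(100.0), nominal_criterion, nominal_criterion]
obj_a = (40.0, "red", None)   # third feature is missing
obj_b = (50.0, "red", "yes")

s_ab = similarity(obj_a, obj_b, criteria)
s_ba = similarity(obj_b, obj_a, criteria)
print(s_ab, s_ba)
```

With these two objects the two directions disagree (roughly 0.95 versus 0.63), so the function is not symmetric and cannot be obtained as the opposite or the inverse of a distance.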
