Taxonomy Formation by Approximate Equivalence Relations, Revisited

Unsupervised classification of objects involves the formation of classes and the construction of one or more taxonomies that include those classes. Meaningful classes can be formed in feedback with the acquisition of knowledge about each class. We demonstrate how contingency tables can be used to construct one-level taxonomy elements by relying only on approximate equivalence relations between attribute pairs, and how multi-level taxonomy formation can be guided by a partition utility function. The approach can handle databases with different types of attributes and a large number of records.
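
To make the idea of an approximate equivalence between two attributes concrete, the following is a minimal Python sketch, not the authors' algorithm. It builds a contingency table for one attribute pair and accepts the pair as approximately equivalent when every value of one attribute has a mutually dominant partner value in the other and the share of records outside those dominant cells stays within a tolerance. The function names, the mutual-dominance criterion, and the `tolerance` parameter are assumptions made for illustration only.

```python
from collections import Counter


def contingency_table(records, attr_a, attr_b):
    """Count how often each (value of attr_a, value of attr_b) pair occurs."""
    return Counter((r[attr_a], r[attr_b]) for r in records)


def approximate_equivalence(table, tolerance=0.1):
    """Return a value-to-value mapping if the table is close to one-to-one.

    Illustrative criterion (an assumption, not taken from the paper):
    each value of A must pair with a mutually dominant value of B, and
    the fraction of records outside those cells must not exceed tolerance.
    """
    total = sum(table.values())
    if total == 0:
        return None

    # Dominant partner per row (value of A) and per column (value of B).
    best_for_a, best_for_b = {}, {}
    for (a, b), n in table.items():
        if n > best_for_a.get(a, (None, 0))[1]:
            best_for_a[a] = (b, n)
        if n > best_for_b.get(b, (None, 0))[1]:
            best_for_b[b] = (a, n)

    # Keep only pairs where the dominance holds in both directions.
    pairs = {a: b for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a}
    if len(pairs) != len(best_for_a) or len(pairs) != len(best_for_b):
        return None  # some value has no one-to-one partner

    exceptions = total - sum(table[(a, b)] for a, b in pairs.items())
    return pairs if exceptions / total <= tolerance else None


if __name__ == "__main__":
    # Hypothetical toy data: colour and shape agree except for one record.
    records = (
        [{"color": "red", "shape": "round"}] * 9
        + [{"color": "green", "shape": "square"}] * 9
        + [{"color": "red", "shape": "square"}]
    )
    table = contingency_table(records, "color", "shape")
    print(approximate_equivalence(table))
```

Under these assumptions, each accepted value pair (for example, red with round) would describe one candidate class, and the set of such classes would form a one-level taxonomy element for that attribute pair.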
