Mutagenicity Risk Analysis by Using Class Association Rules

Mutagenicity analysis of chemical compounds is crucial for investigating the causes of modern diseases, including cancers. Such analysis requires accurate and comprehensive classification of mutagenicity. In particular, the choice of appropriate features of the chemical compounds plays a key role in the interpretability of the classification results. In this paper, a classification approach named “Levelwise Subspace Clustering based Classification by Aggregating Emerging Patterns (LSC-CAEP)”, which is known to be accurate and to provide interpretable rules, is applied to a mutagenicity data set. A demonstration shows promising results for the analysis.
