Two Way Focused Classification

In this paper we propose TwoWayFocused classification, which performs feature selection and tuple selection on the data before classification. Although feature selection and tuple selection have each been studied in research areas such as machine learning and data mining, they have rarely been studied together. Our contribution is a novel distance measure for selecting the most representative features and tuples. We conduct experiments on microarray gene expression datasets and on UCI machine learning and KDD datasets. The results show that the proposed method outperforms the existing methods by a substantial margin.
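The two-way reduction described above can be illustrated with a minimal sketch. The abstract does not define the proposed distance measure, so the code below substitutes plain stand-ins that are assumptions, not the paper's method: features are ranked by the gap between class means, tuples are ranked by squared Euclidean distance to their class centroid, and a 1-nearest-neighbour rule serves as the classifier. All function names (`select_features`, `select_tuples`, `two_way_reduce`, `predict_1nn`) are hypothetical.

```python
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance (a stand-in, not the paper's measure)."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def select_features(X, y, k):
    """Keep the k features whose per-class means differ the most."""
    classes = sorted(set(y))

    def score(j):
        means = [mean(row[j] for row, c in zip(X, y) if c == cls)
                 for cls in classes]
        return max(means) - min(means)

    ranked = sorted(range(len(X[0])), key=score, reverse=True)
    return sorted(ranked[:k])


def select_tuples(X, y, per_class):
    """Keep, for each class, the tuples closest to that class's centroid."""
    keep = []
    for cls in sorted(set(y)):
        idx = [i for i, c in enumerate(y) if c == cls]
        centroid = [mean(X[i][j] for i in idx) for j in range(len(X[0]))]
        idx.sort(key=lambda i: sq_dist(X[i], centroid))
        keep.extend(idx[:per_class])
    return sorted(keep)


def two_way_reduce(X, y, k, per_class):
    """Reduce columns first, then rows; return kept features and data."""
    feats = select_features(X, y, k)
    Xf = [[row[j] for j in feats] for row in X]
    rows = select_tuples(Xf, y, per_class)
    return feats, [Xf[i] for i in rows], [y[i] for i in rows]


def predict_1nn(train_X, train_y, feats, x):
    """Classify x with 1-nearest-neighbour on the reduced training set."""
    xf = [x[j] for j in feats]
    best = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i], xf))
    return train_y[best]
```

In this sketch feature selection runs before tuple selection, so that tuple distances are computed only over the retained features; the abstract does not state which order the paper uses, so that ordering is also an assumption.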
