Pattern Synthesis Using Fuzzy Partitions of the Feature Set for Nearest Neighbor Classifier Design

Nearest neighbor classifiers need a large training set to achieve good classification accuracy. For high-dimensional data, a small training set exposes the classifier to the curse of dimensionality, and its performance degrades. Partition based pattern synthesis is an existing technique for generating a larger set of artificial training patterns from a chosen partition of the feature set. If the blocks of the partition are statistically independent, the quality of the generated synthetic patterns is high. Such a partition, however, often does not exist for real-world problems. Earlier studies therefore derived an approximate partition from the correlation coefficients between pairs of features. That is, the synthesis used an approximate hard partition, in which each feature belongs to exactly one cluster (block) of the partition. The present paper proposes an improvement: instead of a hard approximate partition, a soft approximate partition based on fuzzy set theory could be beneficial. It introduces such a fuzzy partitioning method for the feature set, called fuzzy partition around medoids (fuzzy-PAM). Experiments on some standard data sets demonstrate that synthetic patterns based on the fuzzy partition are better as far as classification accuracy is concerned.
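The hard-partition synthesis described above can be illustrated with a minimal Python sketch. It assumes that synthesis means recombining, within one class, the sub-patterns that each block of the partition selects from different training patterns. The function name `synthesize`, the toy patterns, and the two-block partition are hypothetical and chosen only for illustration; the sketch does not cover the fuzzy-PAM method itself, whose details are not given here.

```python
from itertools import product


def synthesize(patterns, partition):
    """Mix block-wise sub-patterns of same-class patterns (illustrative sketch).

    patterns  : list of equal-length tuples, all from ONE class
    partition : list of blocks, each a list of feature indices; the blocks
                are assumed to cover every feature exactly once (hard partition)
    """
    n_features = len(patterns[0])
    # Distinct sub-patterns (projections) observed in each block.
    projections = [
        sorted({tuple(p[i] for i in block) for p in patterns})
        for block in partition
    ]
    synthetic = []
    # Every combination of one sub-pattern per block yields one pattern.
    for combo in product(*projections):
        new = [None] * n_features
        for block, sub in zip(partition, combo):
            for i, value in zip(block, sub):
                new[i] = value
        synthetic.append(tuple(new))
    return synthetic


# Hypothetical toy data: three 4-feature patterns from a single class.
class_a = [(1, 2, 3, 4), (5, 6, 7, 8), (1, 2, 7, 9)]
# Hypothetical hard partition: features {0, 1} and {2, 3}.
blocks = [[0, 1], [2, 3]]

synthetic = synthesize(class_a, blocks)
print(len(synthetic), "patterns from", len(class_a), "originals")
```

The result contains every original pattern plus new combinations such as `(5, 6, 3, 4)`. These combinations are plausible members of the class only to the extent that the two blocks are close to independent, which is why the choice of partition matters.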
