Dimensionality Reduction in Statistical Pattern Recognition and Low Loss Dimensionality Reduction

First, the authors review the prevailing feature selection methods, such as Exhaustive Search, Genetic Algorithm, Sequential Forward Floating Selection, and Best Individual Features, and feature extraction approaches, such as Principal Component Analysis, Fisher Discriminant Analysis, and Projection Pursuit, for reducing the dimensionality of the feature space in statistical pattern recognition. Second, the authors discuss the characteristics and applicable domains of these techniques. Third, the authors propose a novel feature selection method based on the so-called optimal classifier, the Bayesian classifier. The new method, low loss dimensionality reduction (LLDR), is applied to automatic text categorization and compared with the prevailing feature selection methods in that field: Mutual Information (MI), Chi-square Statistic (CHI), and Document Frequency (DF). Experimental results on the well-known Reuters-21578 dataset show that the dimensionality reduction ability of LLDR is comparable to that of MI and CHI, and higher than that of DF. Considering that LLDR is computationally more efficient than MI and CHI, LLDR is a promising feature selection method for automatic text categorization.
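To make the baseline methods concrete, the following is a minimal sketch of two of the compared feature scoring criteria, Document Frequency (DF) and the Chi-square Statistic (CHI), in their standard textbook form. It is not the authors' LLDR method, which the abstract does not specify. The toy corpus `DOCS`, the category labels, and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: (tokens, category) pairs. Not Reuters-21578 data.
DOCS = [
    (["wheat", "price", "rise"], "grain"),
    (["wheat", "export"], "grain"),
    (["oil", "price", "fall"], "crude"),
    (["oil", "export"], "crude"),
]

def document_frequency(docs):
    """DF: number of documents in which each term occurs at least once."""
    df = Counter()
    for tokens, _category in docs:
        df.update(set(tokens))  # count each term once per document
    return df

def chi_square(docs, term, category):
    """CHI: chi-square statistic of a term/category 2x2 contingency table."""
    n = len(docs)
    a = sum(1 for t, c in docs if term in t and c == category)          # term, in category
    b = sum(1 for t, c in docs if term in t and c != category)          # term, other category
    c_ = sum(1 for t, c in docs if term not in t and c == category)     # no term, in category
    d = sum(1 for t, c in docs if term not in t and c != category)      # no term, other category
    denom = (a + c_) * (b + d) * (a + b) * (c_ + d)
    if denom == 0:
        return 0.0  # degenerate table: term or category never varies
    return n * (a * d - c_ * b) ** 2 / denom

def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _score in ranked[:k]]

if __name__ == "__main__":
    df = document_frequency(DOCS)
    chi = {term: chi_square(DOCS, term, "grain") for term in df}
    print(select_top_k(df, 3))
    print(select_top_k(chi, 2))
```

In this toy corpus, "price" appears equally often in both categories and so receives a chi-square score of zero, while DF ranks it as highly as "wheat". This illustrates why a class-aware criterion can discard terms that a frequency-only criterion keeps.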