Unsupervised Extraction and Supervised Selection of Features Based on Information Gain

For robust recognition we first extract features from sensory data without considering the class labels, and then select the features most important for classification. The unsupervised feature extraction may incorporate Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). For the supervised selection of features we adopt the Fisher Score and Information Gain (IG). To avoid computing multivariate joint probability density functions, we replace the IG with the Mutual Information (MI) between each feature and the class variable. However, in this case the MI among the selected features themselves reduces the effectiveness of the selection, and the statistically independent ICA-based features yield the best performance.
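The two-stage pipeline described above can be sketched in NumPy: PCA (via the SVD of the centered data) serves as the unsupervised extraction step, and the MI between each discretized component and the class variable serves as the supervised selection criterion. This is a minimal illustration, not the paper's implementation; the function names, bin count, and toy data are assumptions made for the example.

```python
import numpy as np

def pca_extract(X, n_components):
    """Unsupervised extraction: project centered data onto the top principal components."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def mutual_information(feature, labels, n_bins=8):
    """Estimate I(feature; class) by discretizing one feature into histogram bins,
    avoiding any multivariate joint density estimation."""
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    f = np.digitize(feature, edges[1:-1])  # bin index 0 .. n_bins-1 per sample
    mi = 0.0
    for b in np.unique(f):
        for c in np.unique(labels):
            p_xy = np.mean((f == b) & (labels == c))
            p_x = np.mean(f == b)
            p_y = np.mean(labels == c)
            if p_xy > 0:
                mi += p_xy * np.log(p_xy / (p_x * p_y))
    return mi

def select_features(Z, y, k):
    """Supervised selection: rank the extracted features by MI with the class variable."""
    scores = np.array([mutual_information(Z[:, j], y) for j in range(Z.shape[1])])
    return np.argsort(scores)[::-1][:k]

# Toy data: two classes separated along the first raw dimension (illustrative only).
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, (100, 5))
X1 = rng.normal(0.0, 1.0, (100, 5))
X1[:, 0] += 3.0
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

Z = pca_extract(X, 4)           # unsupervised extraction
top = select_features(Z, y, 2)  # supervised MI-based selection
print(top)
```

Because the class shift dominates the total variance here, the first principal component carries the discriminative information and receives the highest MI score. The same selection step applies unchanged to ICA- or NMF-extracted features; only `pca_extract` would be swapped out.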
