An efficient feature selection using multi-criteria in text categorization for naïve Bayes classifier

Feature selection is one of the most interesting problems in machine learning in general and in text categorization in particular. Previous research on feature selection has often focused on choosing an appropriate measure for evaluating features. This works well for structured data but is considerably harder for text, which is unstructured. Our main contribution in this paper is a new approach to feature selection based on multi-criteria ranking of features. We propose a new model for feature selection and, based on a threshold value for each criterion, a new selection procedure, which we apply to text categorization. Experiments show that the proposed model outperforms conventional feature selection methods.
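
To make the idea of per-criterion thresholds concrete, the following is a minimal sketch and not the procedure proposed in the paper. The abstract does not name the criteria or say how they are combined, so everything here is an assumption for illustration: the two criteria (document frequency and a simple class-skew score), the rule that a term must pass every threshold, and the function names `document_frequency`, `class_skew`, and `select_features`.

```python
from collections import Counter


def document_frequency(docs):
    """Criterion 1 (assumed): number of documents in which each term occurs."""
    df = Counter()
    for tokens, _label in docs:
        df.update(set(tokens))
    return df


def class_skew(docs, positive):
    """Criterion 2 (assumed): absolute difference between a term's occurrence
    rate in the positive class and in the remaining documents.
    Assumes both groups contain at least one document."""
    n_pos = sum(1 for _tokens, label in docs if label == positive)
    n_neg = len(docs) - n_pos
    pos, neg = Counter(), Counter()
    for tokens, label in docs:
        (pos if label == positive else neg).update(set(tokens))
    terms = set(pos) | set(neg)
    return {t: abs(pos[t] / n_pos - neg[t] / n_neg) for t in terms}


def select_features(scores_by_criterion, thresholds):
    """Keep the terms whose score meets the threshold of every criterion
    (an assumed combination rule)."""
    terms = set.intersection(*(set(s) for s in scores_by_criterion.values()))
    return sorted(
        t for t in terms
        if all(scores_by_criterion[c][t] >= thresholds[c] for c in thresholds)
    )


if __name__ == "__main__":
    # Toy labelled corpus: (tokens, class label).
    docs = [
        (["ball", "goal", "team"], "sport"),
        (["team", "match", "goal"], "sport"),
        (["vote", "party", "team"], "politics"),
        (["vote", "election", "party"], "politics"),
    ]
    scores = {
        "df": document_frequency(docs),
        "skew": class_skew(docs, "sport"),
    }
    print(select_features(scores, {"df": 2, "skew": 0.75}))
```

In this toy corpus, "team" is frequent but appears in both classes, so it passes the document-frequency threshold and fails the class-skew threshold. This is the kind of case where a single criterion and a combination of criteria would select different features.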