Feature Reduction Based on Relative Document Frequency Balance Information Gain

To overcome the shortcomings of information gain in text categorization, this paper proposes a feature reduction method based on relative document frequency balance information gain (RDFBIG). Experimental results show that RDFBIG effectively eliminates the effect of differing corpus sizes across classes and improves text categorization results.
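
The abstract does not give the RDFBIG formula, so the following is only an illustrative sketch of the general idea, not the paper's method. It computes standard information gain for a term from per-class document counts. It also offers an optional "balanced" variant that replaces raw document frequencies with relative ones (documents containing the term divided by class size), so every class carries equal weight regardless of corpus size. The function name `information_gain`, the `balance` flag, and the equal-weight normalization are all assumptions introduced here.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(df_per_class, docs_per_class, balance=False):
    """Information gain of a term with respect to the class variable.

    df_per_class[i]   : number of documents in class i that contain the term
    docs_per_class[i] : total number of documents in class i
    balance           : ASSUMPTION (not from the paper). If True, use relative
                        document frequencies so each class has unit weight.
    """
    if balance:
        # Relative document frequency: fraction of the class containing the term.
        present = [df / n for df, n in zip(df_per_class, docs_per_class)]
        totals = [1.0] * len(docs_per_class)
    else:
        present = [float(df) for df in df_per_class]
        totals = [float(n) for n in docs_per_class]

    absent = [t - p for t, p in zip(totals, present)]
    total = sum(totals)

    # H(C): entropy of the class distribution.
    h_class = entropy([t / total for t in totals])

    # H(C | term): weighted entropy over "term present" and "term absent".
    h_cond = 0.0
    for part in (present, absent):
        size = sum(part)
        if size > 0:
            h_cond += (size / total) * entropy([x / size for x in part])

    return h_class - h_cond


# A term found in 5% of a 900-document class and 50% of a 10-document class.
raw = information_gain([45, 5], [900, 10])
balanced = information_gain([45, 5], [900, 10], balance=True)
```

In this example the raw score is capped by the small entropy of the skewed class distribution, while the balanced score is not. This is the kind of corpus-size effect the abstract refers to.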