Paper information - Applying the conjugate gradient method for text document categorization

Applying the conjugate gradient method for text document categorization

We investigate the effectiveness of two methods for solving the linear least squares fit (LLSF) problem in document categorization. The first is the singular value decomposition (SVD) method, which has previously been used for document categorization. The second is the conjugate gradient (CG) method, one of the most effective algorithms for solving systems of linear equations. To the best of our knowledge, however, the CG method has never been applied to document classification. We therefore compare the effectiveness of these two LLSF methods for categorizing text documents. In addition, we examine how different term weighting schemes affect their classification performance. Lastly, we compare the LLSF classifiers against the neighborhood-based Dt-kNN classifier, our best variant of the kNN classifier integrated with a dynamic threshold scheme, on the Reuters-21578 dataset. Besides being the first proposal to use the CG method for document classification, our work opens up many exciting directions for future investigation.

Vincent Tam | Rudy Setiono | Ardi Santoso
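
The two LLSF solvers named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulation details, so the sketch assumes a document-by-term weight matrix `A`, a document-by-category indicator matrix `B`, and a weight matrix `W` minimizing the Frobenius norm of `A W - B`. The function names `llsf_svd` and `llsf_cg` and the toy data are hypothetical, and the CG variant shown applies conjugate gradient to the normal equations, which is one common choice and may differ from what the paper uses.

```python
import numpy as np


def llsf_svd(A, B, rcond=1e-10):
    """Least squares fit via the SVD pseudoinverse: W = A^+ B."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > rcond * s.max()          # drop negligible singular values
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv[:, None] * (U.T @ B))


def llsf_cg(A, B, tol=1e-10, max_iter=1000):
    """Least squares fit via CG on the normal equations A^T A W = A^T B.

    Each category (column of B) is solved independently. A^T A is never
    formed explicitly; only matrix-vector products with A and A^T are used.
    """
    n_terms, n_cats = A.shape[1], B.shape[1]
    W = np.zeros((n_terms, n_cats))
    for j in range(n_cats):
        w = np.zeros(n_terms)
        r = A.T @ B[:, j]               # residual of the normal equations at w = 0
        p = r.copy()
        rs_old = r @ r
        rs_stop = (tol ** 2) * rs_old   # relative stopping threshold (squared)
        for _ in range(max_iter):
            if rs_old <= rs_stop:
                break
            Ap = A.T @ (A @ p)
            alpha = rs_old / (p @ Ap)
            w += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            p = r + (rs_new / rs_old) * p
            rs_old = rs_new
        W[:, j] = w
    return W


# Hypothetical toy data: 6 documents, 4 terms, 2 categories.
A = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 3.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 1.0, 0.0],
])
B = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
    [1.0, 0.0],
    [0.0, 1.0],
])

W_svd = llsf_svd(A, B)
W_cg = llsf_cg(A, B)

# Category scores for a new document are its term weights times W.
new_doc = np.array([1.0, 0.0, 0.0, 1.0])
scores = new_doc @ W_cg
```

On a full-rank problem such as this toy one, both solvers should agree up to numerical tolerance. A practical difference is that the CG version needs only matrix-vector products, which suits large sparse term matrices, whereas the SVD version factorizes the whole matrix.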