Paper Information - Research on Text Classification Based on TextRank

Research on Text Classification Based on TextRank

Keywords are extracted from the word segmentation result with an improved TextRank algorithm. The relative position of each word in the article is used to calculate the influence of position, the coverage position of a word is extended to the sentence in which the word occurs, and the resulting keywords serve as the features of the text. A naive Bayes classifier, programmed on Hadoop, then performs the text classification. Experiments show that the improved TextRank substantially improves classification performance: the naive Bayes classifier reaches 93% accuracy when the number of keywords is 40. Compared with the traditional method, the accuracy rate increases by about 10%.
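The keyword extraction step can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `position_weighted_textrank`, the co-occurrence window, and in particular the position weight (earlier first occurrence gives a larger weight) are assumptions made for illustration, since the abstract does not give the exact formula. The sentence-level coverage extension and the Hadoop naive Bayes stage are not shown.

```python
from collections import defaultdict


def position_weighted_textrank(tokens, window=3, damping=0.85,
                               iterations=50, top_k=5):
    """Rank words with a TextRank-style iteration biased by word position.

    Assumption (not from the paper): a word's position weight is
    1 - first_index / len(tokens), so earlier words weigh more.
    """
    # Undirected co-occurrence graph: each word is linked to the
    # window - 1 tokens that follow it.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Index of the first occurrence of every distinct word.
    first_seen = {}
    for i, word in enumerate(tokens):
        first_seen.setdefault(word, i)

    # Hypothetical position weights, normalized to sum to 1.
    raw = {w: 1.0 - idx / len(tokens) for w, idx in first_seen.items()}
    total = sum(raw.values())
    bias = {w: value / total for w, value in raw.items()}

    # PageRank-style update; the position bias replaces the uniform
    # jump term of the original TextRank.
    scores = dict(bias)
    for _ in range(iterations):
        updated = {}
        for w in scores:
            incoming = sum(scores[v] / len(neighbors[v])
                           for v in neighbors[w])
            updated[w] = (1 - damping) * bias[w] + damping * incoming
        scores = updated

    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k], scores


# Usage on an already segmented toy document.
tokens = ["text", "classification", "keyword", "text", "keyword", "bayes"]
keywords, scores = position_weighted_textrank(tokens, top_k=3)
print(keywords)
```

In a pipeline like the one the abstract describes, the top-ranked words (40 in the reported experiment) would form the feature set passed to the naive Bayes classifier.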

Guangming Lu | Yule Xia | Jiamei Wang | Zhenling Yang

[1] Zhiyuan Liu, et al. Automatic Keyphrase Extraction via Topic Decomposition, 2010, EMNLP.

[2] Xiaojun Wan, et al. Single Document Keyphrase Extraction Using Neighborhood Knowledge, 2008, AAAI.

[3] Xiaojun Wan, et al. CollabRank: Towards a Collaborative Approach to Single-Document Keyphrase Extraction, 2008, COLING.

[4] Li Jing. Application of naive Bayes classifier to text classification, 2003.

[5] Rada Mihalcea, et al. TextRank: Bringing Order into Text, 2004, EMNLP.

[6] Xiaojun Wan, et al. Towards an Iterative Reinforcement Approach for Simultaneous Document Summarization and Keyword Extraction, 2007, ACL.

[7] Yang Song, et al. Topical Keyphrase Extraction from Twitter, 2011, ACL.

[8] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.

[9] Xia Tian. Study on Keyword Extraction Using Word Position Weighted TextRank, 2013.

[10] Mark Last, et al. Graph-Based Keyword Extraction for Single-Document Summarization, 2008, COLING.