Paper information - Extracting Comparative Sentences from Korean Text Documents Using Comparative Lexical Patterns and Machine Learning Techniques


This paper proposes a method for automatically identifying Korean comparative sentences in text documents. We first examine a large number of comparative sentences, drawing on previous studies, and define a set of comparative keywords from them. A sentence that contains one or more elements of the keyword set is called a comparative-sentence candidate. Finally, we use machine learning techniques to eliminate non-comparative sentences from the candidates. In experiments on various web documents, this approach achieved an F1-score of 88.54%.
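The two-step procedure described in the abstract (keyword-based candidate selection, then a classifier that filters out non-comparative candidates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the keyword set, the toy training sentences, and the names `is_candidate`, `NaiveBayes`, and `extract_comparatives` are all hypothetical, and a small Naive Bayes classifier over whitespace tokens stands in for whatever learner and features the paper actually uses.

```python
import math
from collections import Counter

# Hypothetical keyword set for illustration; the paper's actual set is not given here.
COMPARATIVE_KEYWORDS = {"보다", "가장", "비해"}


def is_candidate(sentence):
    """Step 1: a sentence is a candidate if it contains any keyword."""
    return any(kw in sentence for kw in COMPARATIVE_KEYWORDS)


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (assumed stand-in)."""

    def __init__(self):
        self.word_counts = {}          # label -> Counter of token counts
        self.class_counts = Counter()  # label -> number of training sentences
        self.vocab = set()

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            self.class_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for token in sentence.split():
                counts[token] += 1
                self.vocab.add(token)
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            for token in sentence.split():
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def extract_comparatives(sentences, clf):
    """Step 2: keep only candidates that the classifier labels comparative."""
    return [s for s in sentences if is_candidate(s) and clf.predict(s)]


# Toy training data (invented for this sketch). "보다" marks a comparison in
# the first sentence but is the verb "to watch/read" in the last two.
train_sentences = [
    "A가 B보다 빠르다",
    "이 제품이 가장 좋다",
    "영화를 보다 잠들었다",
    "책을 보다 말았다",
]
train_labels = [True, True, False, False]
clf = NaiveBayes().fit(train_sentences, train_labels)

docs = ["A가 B보다 빠르다", "영화를 보다 잠들었다", "오늘은 날씨가 맑다"]
result = extract_comparatives(docs, clf)
```

In this toy run the second sentence passes the keyword filter because it contains "보다", which is the kind of false candidate the classification step is meant to remove.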

Youngjoong Ko | Seon Yang
