Paper Information - Unsupervised Citation Sentence Identification Based on Similarity Measurement

Unsupervised Citation Sentence Identification Based on Similarity Measurement

Citation context analysis has attracted the interest of many researchers in the field of bibliometrics. Its first step is to extract the context of each citation from a citing paper. In this paper, we propose a novel unsupervised approach for identifying implicit citation sentences, that is, citation sentences that carry no citation tag. Our approach selects the sentences neighboring an explicit citation sentence as candidates, calculates the similarity between each candidate and both the cited paper and the citing paper, and deems the candidates that are more similar to the cited paper to be implicit citation sentences. To calculate text similarity, we propose four methods based on the Doc2vec model, the Vector Space Model (VSM), and the LDA model. The experimental results show that the hybrid method combining the probabilistic TF-IDF weighted VSM with the TF-IDF weighted Doc2vec achieved the best performance. Unlike supervised methods, our approach does not need an annotated training corpus and thus, in theory, can be easily applied to other domains.
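
The candidate-selection and comparison steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses a plain smoothed TF-IDF vector space with cosine similarity in place of the probabilistic TF-IDF, Doc2vec, or LDA variants named above, and the function names, the tokenizer, and the `window` parameter (how many neighboring sentences count as candidates) are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens only (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (term -> weight dicts) for a list of texts.

    IDF is smoothed as log((1 + N) / (1 + df)) + 1 so no term gets zero weight.
    """
    token_lists = [tokenize(t) for t in texts]
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_implicit_citations(sentences, explicit_idx, cited_text, citing_text, window=2):
    """Return indices of neighboring sentences that are more similar to the
    cited paper than to the citing paper (hypothetical helper, assumed window)."""
    lo = max(0, explicit_idx - window)
    hi = min(len(sentences), explicit_idx + window + 1)
    candidates = [i for i in range(lo, hi) if i != explicit_idx]
    vectors = tfidf_vectors([cited_text, citing_text] + [sentences[i] for i in candidates])
    cited_vec, citing_vec = vectors[0], vectors[1]
    return [
        i for i, vec in zip(candidates, vectors[2:])
        if cosine(vec, cited_vec) > cosine(vec, citing_vec)
    ]


# Toy data invented for illustration only.
sentences = [
    "We study citation analysis in digital libraries.",
    "Smith (2010) proposed a hidden Markov model for tagging.",  # explicit citation
    "The model decodes tag sequences with the Viterbi algorithm.",
    "Our digital libraries corpus covers citation analysis papers.",
]
cited_text = ("A hidden Markov model for tagging. "
              "The Viterbi algorithm decodes tag sequences from the model.")
citing_text = ("Citation analysis in digital libraries. "
               "We build a corpus of citation analysis papers for digital libraries.")

implicit = find_implicit_citations(sentences, 1, cited_text, citing_text)
print(implicit)
```

In this toy input only the sentence directly after the explicit citation shares vocabulary with the cited text, so it is the one flagged. A real pipeline would need sentence splitting, citation-tag detection, and one of the similarity models the paper actually evaluates.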

Shiyan Ou | Hyonil Kim
