Paper information - LIPN at SemEval-2017 Task 10: Filtering Candidate Keyphrases from Scientific Publications with Part-of-Speech Tag Sequences to Train a Sequence Labeling Model

This paper describes the system used by the LIPN team in SemEval 2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications. The team participated in Scenario 1, which includes three subtasks: identification of keyphrases (Subtask A), classification of identified keyphrases (Subtask B), and extraction of relationships between two identified keyphrases (Subtask C). The system focuses mainly on using part-of-speech tag sequences to filter candidate keyphrases for Subtask A. Subtasks A and B were addressed as a sequence labeling problem using Conditional Random Fields (CRFs). Although Subtask C was outside the scope of this approach, one rule was included to identify synonyms.
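The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the allowed part-of-speech tag sequences are collected from annotated keyphrases, that candidates are word n-grams whose tag sequence matches one of them, and that the surviving candidates are turned into B/I/O labels of the kind a CRF would be trained on. The function names, the `max_len` limit, the longest-first overlap rule, and the toy Penn Treebank-style tags are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]          # (word, POS tag); tags are supplied by hand here
Pattern = Tuple[str, ...]        # a POS tag sequence


def collect_patterns(gold_phrases: List[List[Token]]) -> Set[Pattern]:
    """Collect the POS tag sequences observed in annotated keyphrases."""
    return {tuple(tag for _, tag in phrase) for phrase in gold_phrases}


def candidate_spans(sentence: List[Token], patterns: Set[Pattern],
                    max_len: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) spans whose tag sequence matches a known pattern."""
    spans = []
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(start + max_len, n) + 1):
            tags = tuple(tag for _, tag in sentence[start:end])
            if tags in patterns:
                spans.append((start, end))
    return spans


def bio_labels(sentence: List[Token],
               spans: List[Tuple[int, int]]) -> List[str]:
    """Convert spans to B/I/O labels, preferring longer spans on overlap."""
    labels = ["O"] * len(sentence)
    for start, end in sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0])):
        if all(label == "O" for label in labels[start:end]):
            labels[start] = "B"
            for i in range(start + 1, end):
                labels[i] = "I"
    return labels


# Toy data (hypothetical): two annotated keyphrases and one new sentence.
gold = [
    [("conditional", "JJ"), ("random", "JJ"), ("fields", "NNS")],
    [("keyphrase", "NN"), ("extraction", "NN")],
]
sentence = [
    ("We", "PRP"), ("use", "VBP"), ("sequence", "NN"), ("labeling", "NN"),
    ("for", "IN"), ("noisy", "JJ"), ("scientific", "JJ"), ("texts", "NNS"),
]

patterns = collect_patterns(gold)
spans = candidate_spans(sentence, patterns)
labels = bio_labels(sentence, spans)
print(spans)
print(labels)
```

In a complete pipeline, labels such as these, together with token features, would be passed to a CRF toolkit for training. That step is omitted here because the abstract does not specify the feature set or the toolkit.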

Davide Buscaldi | Thierry Charnois | Simon Hernandez
