Paper information - Learning Graph Walk Based Similarity Measures for Parsed Text

We consider a parsed text corpus as an instance of a labelled directed graph, where nodes represent words and weighted directed edges represent the syntactic relations between them. We show that graph walks, combined with existing techniques of supervised learning, can be used to derive a task-specific word similarity measure in this graph. We also propose a new path-constrained graph walk method, in which the graph walk process is guided by high-level knowledge about meaningful edge sequences (paths). Empirical evaluation on the task of named entity coordinate term extraction shows that this framework is preferable to vector-based models for small-sized corpora. It is also shown that the path-constrained graph walk algorithm yields both performance and scalability gains.
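The two ideas in the abstract, a walk over a labelled directed graph and a walk restricted to meaningful edge-label sequences, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the edge labels, the function name `graph_walk`, and the parameters `label_weights` and `allowed_paths` are all assumptions made for illustration. In the paper the edge weights are learned with supervision, whereas here they are simply passed in by hand.

```python
from collections import defaultdict

# Hypothetical toy graph: (source, edge_label, target) triples.
# Labels with the "-inv" suffix stand for inverse syntactic relations.
EDGES = [
    ("london", "obj-inv", "visited"),
    ("visited", "obj", "paris"),
    ("visited", "subj", "alice"),
    ("london", "mod", "rainy"),
]


def graph_walk(edges, start, steps, label_weights=None, allowed_paths=None):
    """Return the probability of reaching each node after exactly `steps` steps.

    label_weights: optional dict mapping edge label -> weight (default 1.0).
    allowed_paths: optional list of label tuples; if given, the walk may only
    follow label sequences that are prefixes of one of these paths.
    """
    out = defaultdict(list)
    for src, label, dst in edges:
        out[src].append((label, dst))

    # Each walker state is (node, labels followed so far) -> probability mass.
    frontier = {(start, ()): 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for (node, hist), mass in frontier.items():
            candidates = []
            for label, dst in out[node]:
                new_hist = hist + (label,)
                if allowed_paths is not None and not any(
                    path[: len(new_hist)] == new_hist for path in allowed_paths
                ):
                    continue  # this label sequence is not a meaningful path
                weight = (label_weights or {}).get(label, 1.0)
                candidates.append((dst, new_hist, weight))
            total = sum(w for _, _, w in candidates)
            if total <= 0:
                continue  # dead end: this mass is dropped
            for dst, new_hist, weight in candidates:
                nxt[(dst, new_hist)] += mass * weight / total
        frontier = nxt

    scores = defaultdict(float)
    for (node, _), mass in frontier.items():
        scores[node] += mass
    return dict(scores)


# Unconstrained two-step walk: mass is split over every reachable node.
print(graph_walk(EDGES, "london", 2))
# → {'paris': 0.25, 'alice': 0.25}

# Path-constrained walk: only the obj-inv -> obj sequence is followed.
print(graph_walk(EDGES, "london", 2, allowed_paths=[("obj-inv", "obj")]))
# → {'paris': 1.0}
```

In this toy setting the constraint concentrates all probability on the coordinate term `paris` and skips the `mod` branch entirely, which hints at why a path-constrained walk can be both more accurate and cheaper than an unconstrained one.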

William W. Cohen | Einat Minkov

[1] Mirella Lapata, et al. Dependency-Based Construction of Semantic Space Models, 2007, CL.

[2] James P. Callan, et al. Structured retrieval for question answering, 2007, SIGIR.

[3] Kevyn Collins-Thompson, et al. Query expansion using random walk models, 2005, CIKM.

[4] Gregory Grefenstette, et al. Explorations in automatic thesaurus discovery, 1994.

[5] Michael Collins, et al. Discriminative Reranking for Natural Language Parsing, 2005.

[6] Christopher D. Manning, et al. Generating Typed Dependency Parses from Phrase Structure Parses, 2006, LREC.

[7] Razvan C. Bunescu, et al. A Shortest Path Dependency Kernel for Relation Extraction, 2005, HLT.

[8] Dekang Lin, et al. Automatic Retrieval and Clustering of Similar Words, 1998, ACL.

[9] William W. Cohen, et al. Contextual search and name disambiguation in email using graphs, 2006, SIGIR.

[10] Daniel Jurafsky, et al. Learning Syntactic Patterns for Automatic Hypernym Discovery, 2004, NIPS.

[11] Thad Hughes, et al. Lexical Semantic Relatedness with Random Graph Walks, 2007, EMNLP.