Asknet: Creating and Evaluating Large Scale Integrated Semantic Networks

Extracting semantic information from multiple natural language sources and combining that information into a single unified resource is an important and fundamental goal for natural language processing. Large scale resources of this kind can be useful for a wide variety of tasks including question answering, word sense disambiguation and knowledge discovery. A single resource representing the information in multiple documents can provide significantly more semantic information than is available from the documents considered independently. The ASKNet system utilises existing NLP tools and resources, together with spreading activation based techniques, to automatically extract semantic information from a large number of English texts, and combines that information into a large scale semantic network. The initial emphasis of the ASKNet system is on wide-coverage, robustness and speed of construction. In this paper we show how a network consisting of over 1.5 million nodes and 3.5 million edges, more than twi...

[1]  R. Schvaneveldt,et al.  Facilitation in recognizing pairs of words: evidence of a dependence between retrieval operations. , 1971, Journal of experimental psychology.

[2]  Lucy Vanderwende,et al.  MindNet: Acquiring and Structuring Semantic Information from Text , 1998, COLING-ACL.

[3]  Uwe Reyle,et al.  From Discourse to Logic - Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory , 1993, Studies in linguistics and philosophy.

[4]  Mark Steedman,et al.  The syntactic process , 2004, Language, speech, and communication.

[5]  Allan Collins,et al.  A spreading-activation theory of semantic processing , 1975 .

[6]  Hugo Liu,et al.  ConceptNet — A Practical Commonsense Reasoning Tool-Kit , 2004 .

[7]  Henry Lieberman,et al.  A commonsense approach to predictive text entry , 2004, CHI EA '04.

[8]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[9]  Brian Harrington,et al.  ASKNet: Automated Semantic Knowledge Network , 2007, AAAI.

[10]  James R. Curran,et al.  Language Independent NER using a Maximum Entropy Tagger , 2003, CoNLL.

[11]  Patrick Pantel,et al.  Espresso: Leveraging Generic Patterns for Automatically Harvesting Semantic Relations , 2006, ACL.

[12]  Lenhart K. Schubert,et al.  Extracting and evaluating general world knowledge from the Brown Corpus , 2003, HLT-NAACL 2003.

[13]  James R. Curran,et al.  Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models , 2007, Computational Linguistics.

[14]  Emden R. Gansner,et al.  An open graph visualization system and its applications to software engineering , 2000 .

[15]  Michael J. Witbrock,et al.  An Introduction to the Syntax and Content of Cyc , 2006, AAAI Spring Symposium: Formalizing and Compiling Background Knowledge and Its Applications to Knowledge Representation and Question Answering.

[16]  Julia Hockenmaier,et al.  Data and models for statistical parsing with combinatory categorial grammar , 2003 .

[17]  Edsger W. Dijkstra,et al.  A note on two problems in connexion with graphs , 1959, Numerische Mathematik.

[18]  David Baxter,et al.  On the Effective Use of Cyc in a Question Answering System , 2005 .

[19]  Mark Steedman,et al.  Wide-Coverage Semantic Representations from a CCG Parser , 2004, COLING.

[20]  Douglas B. Lenat,et al.  CYC: a large-scale investment in knowledge infrastructure , 1995, CACM.

[21]  Jean Carletta,et al.  Assessing Agreement on Classification Tasks: The Kappa Statistic , 1996, CL.