Predication-based Semantic Indexing: Permutations as a Means to Encode Predications in Semantic Space

Corpus-derived distributional models of semantic distance between terms have proved useful in a number of applications. For both theoretical and practical reasons, it is desirable to extend these models to encode discrete concepts and the ways in which they are related to one another. In this paper, we present a novel vector space model that encodes semantic predications derived from MEDLINE by the SemRep system into a compact spatial representation. The associations captured by this method differ from, and are complementary to, those derived by traditional vector space models, and the encoding of predication types presents new possibilities for knowledge discovery and information retrieval.
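To make the idea of encoding predication types with permutations concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes one common formulation: each concept gets a sparse random "elemental" vector, each predicate type gets its own cyclic shift, and a subject's "semantic" vector accumulates the shifted elemental vectors of its objects. The dimensionality, the number of nonzero entries, the function names, and the toy predications are all illustrative assumptions.

```python
import random

DIM = 512    # vector dimensionality (assumed for illustration)
SEEDS = 10   # nonzero entries per elemental vector (assumed)


def elemental_vector(rng, dim=DIM, seeds=SEEDS):
    """Sparse random vector: half of the chosen positions +1, half -1."""
    vec = [0.0] * dim
    for i, pos in enumerate(rng.sample(range(dim), seeds)):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec


def rotate(vec, shift):
    """Cyclic shift; a negative shift undoes the matching positive one."""
    shift %= len(vec)
    return vec[-shift:] + vec[:-shift]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_space(predications, seed=0):
    """Encode (subject, predicate, object) triples into semantic vectors."""
    rng = random.Random(seed)
    concepts = sorted({c for s, _, o in predications for c in (s, o)})
    predicates = sorted({p for _, p, _ in predications})
    elemental = {c: elemental_vector(rng) for c in concepts}
    # One distinct shift per predicate type (hypothetical assignment).
    shifts = {p: i + 1 for i, p in enumerate(predicates)}
    semantic = {c: [0.0] * DIM for c in concepts}
    for s, p, o in predications:
        shifted = rotate(elemental[o], shifts[p])
        semantic[s] = [a + b for a, b in zip(semantic[s], shifted)]
    return elemental, shifts, semantic


def decode(subject, predicate, elemental, shifts, semantic):
    """Undo the predicate's shift and return the closest elemental concept."""
    probe = rotate(semantic[subject], -shifts[predicate])
    return max(elemental, key=lambda c: cosine(probe, elemental[c]))


# Toy predications, invented for this example (not SemRep output).
toy = [
    ("aspirin", "TREATS", "headache"),
    ("aspirin", "INTERACTS_WITH", "warfarin"),
    ("ibuprofen", "TREATS", "headache"),
]
elemental, shifts, semantic = build_space(toy)
print(decode("aspirin", "TREATS", elemental, shifts, semantic))
print(round(cosine(semantic["aspirin"], semantic["ibuprofen"]), 2))
```

Because the shift is specific to the predicate type, reversing it on a subject's semantic vector recovers an approximation of the object's elemental vector, so the space can be queried by relation type as well as by general similarity. In this sketch only subjects are updated; a fuller version would presumably also update objects using the inverse shift.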
