Text Based Knowledge Discovery with Information Flow Analysis

Information explosion has led to diminishing awareness: disciplines are becoming increasingly specialized, and individuals and groups ever more insular. This paper considers how awareness can be enhanced via text-based knowledge discovery. Knowledge representation is motivated from a socio-cognitive perspective. Concepts are represented as vectors in a high-dimensional semantic space derived automatically from a text corpus. Information flow computation between vectors is proposed as a means of discovering implicit associations between concepts. The potential of information flow analysis in text-based knowledge discovery is demonstrated by two case studies: literature-based scientific discovery, through an attempt to simulate Swanson’s Raynaud-fish oil discovery in medical texts; and automatic category derivation from document titles. The results give some justification for believing that the techniques create awareness of new knowledge.
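As a rough illustration of the two steps named above (deriving concept vectors from text, then scoring a flow between them), the following Python sketch may help. It assumes a sliding-window co-occurrence scheme and an inclusion-style ratio; the function names, the window size, the threshold parameter, and the scoring formula are all illustrative assumptions rather than the paper's own definitions.

```python
from collections import defaultdict


def build_space(tokens, window=3):
    """Build weighted co-occurrence vectors (hypothetical helper).

    Each word becomes a sparse vector over the other words it appears
    near; closer neighbours contribute larger weights.
    """
    space = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):
            if i + d < len(tokens):
                other = tokens[i + d]
                weight = window - d + 1  # assumed distance weighting
                space[word][other] += weight
                space[other][word] += weight
    return space


def flow_degree(space, source, target, threshold=0.0):
    """Assumed flow score in [0, 1].

    Returns the share of the source vector's weight that lies on
    dimensions the target vector also has.
    """
    src = {k: w for k, w in space[source].items() if w > threshold}
    total = sum(src.values())
    if total == 0:
        return 0.0
    shared = sum(w for k, w in src.items() if space[target].get(k, 0) > 0)
    return shared / total


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    text = "fish oil reduces blood viscosity raynaud involves blood viscosity"
    space = build_space(text.split(), window=2)
    print(flow_degree(space, "raynaud", "oil"))
```

The score is asymmetric by design in this sketch: a flow from one concept to another need not equal the flow in the reverse direction, which is one plausible way to model implicit, directed associations.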
