Determining Semantic Valences of Ontology Concepts by Means of Nonnegative Factorization of Tensors of Large Text Corpora

This paper describes a method for automatically detecting semantic relations between concept nodes of a networked ontological knowledge base by analyzing matrices of the semantic-syntactic valences of words. These matrices are obtained by nonnegative factorization of tensors that record the syntactic compatibility of words. The tensors are built by frequency analysis of the syntactic structures of sentences drawn from large text corpora of English Wikipedia and Simple English Wikipedia entries.
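To make the factorization step concrete, the following is a minimal sketch of a nonnegative CP (PARAFAC-style) factorization of a three-way co-occurrence tensor using multiplicative updates. The abstract does not say which factorization algorithm, tensor layout, or rank the paper uses, so everything here is an assumption for illustration: the subject × verb × object layout, the toy counts, the rank, and the function names `khatri_rao`, `unfold`, `nonneg_cp`, and `relative_error` are all hypothetical.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product; row (i, j) is stored at i * b.shape[0] + j."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def unfold(t, mode):
    """Mode-n unfolding: the chosen axis becomes rows, the rest are flattened in C order."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)


def nonneg_cp(t, rank, n_iter=200, eps=1e-9, seed=0):
    """Nonnegative CP factorization of a 3-way tensor by multiplicative updates.

    Generic Lee-Seung-style updates for the Frobenius objective; not
    necessarily the algorithm used in the paper.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive initialization keeps every update well defined.
    factors = [rng.random((dim, rank)) + 0.1 for dim in t.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(t, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            factors[mode] *= numer / denom  # ratio is >= 0, so factors stay nonnegative
    return factors


def relative_error(t, factors):
    """Frobenius reconstruction error relative to the norm of the tensor."""
    a, b, c = factors
    approx = np.einsum("ir,jr,kr->ijk", a, b, c)
    return np.linalg.norm(t - approx) / np.linalg.norm(t)


# Hypothetical toy counts of (subject, verb, object) triples.
subjects = ["dog", "cat", "student", "teacher"]
verbs = ["eat", "read"]
objects = ["meat", "fish", "book", "paper"]
toy = np.zeros((4, 2, 4))
toy[:2, 0, :2] = np.outer([2.0, 1.0], [2.0, 1.0])  # animals eat food
toy[2:, 1, 2:] = np.outer([1.0, 2.0], [3.0, 1.0])  # people read texts

if __name__ == "__main__":
    subj_f, verb_f, obj_f = nonneg_cp(toy, rank=2)
    print("relative error:", relative_error(toy, (subj_f, verb_f, obj_f)))
    for word, row in zip(subjects, subj_f):
        print(word, np.round(row, 2))
```

In this sketch each row of a factor matrix gives a word's nonnegative loading on each latent component for one syntactic position, which is one plausible reading of a "valence matrix"; how the paper maps such loadings onto relations between ontology concepts is not stated in the abstract.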
