Complex-Valued Embedding Models for Knowledge Graphs

The explosion of widely available relational data in the form of knowledge graphs has enabled many applications, including automated personal agents, recommender systems and enhanced web search results. The very large size and notorious incompleteness of these databases call for automatic knowledge graph completion methods to make these applications viable. Knowledge graph completion, also known as link prediction, deals with automatically understanding the structure of large knowledge graphs (labeled directed graphs) in order to predict missing entries (labeled edges). An increasingly popular approach consists in representing knowledge graphs as third-order tensors and using tensor factorization methods to predict their missing entries. State-of-the-art factorization models propose different trade-offs between modeling expressiveness and time and space complexity. We introduce a new model, ComplEx (for Complex Embeddings), to reconcile expressiveness and complexity through the use of complex-valued factorization, and explore its link with unitary diagonalization. We corroborate our approach theoretically and show that all possible knowledge graphs can be exactly decomposed by the proposed model. Our approach based on complex embeddings is arguably simple, as it only involves a complex-valued trilinear product, whereas other methods resort to increasingly complicated composition functions to increase their expressiveness. The proposed ComplEx model is scalable to large data sets, as it remains linear in both space and time, while consistently outperforming alternative approaches on standard link-prediction benchmarks.
We also demonstrate its ability to learn useful vectorial representations for other tasks, by enhancing word embeddings that improve performance on the natural language problem of entailment recognition between pairs of sentences. In the last part of this thesis, we explore the ability of factorization models to learn relational patterns from observed data. Because of their vectorial nature, it is hard not only to interpret why this class of models works so well, but also to understand where they fail and how they might be improved. We conduct an experimental survey of state-of-the-art models, not towards a purely comparative end, but as a means to gain insight into their inductive abilities. To assess the strengths and weaknesses of each model, we create simple tasks that exhibit first atomic properties of knowledge graph relations, and then common inter-relational inference through synthetic genealogies. Based on these experimental results, we propose new research directions to improve on existing models, including ComplEx.
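
As a rough illustration of the complex-valued trilinear product mentioned above, the sketch below scores a (subject, relation, object) triple as the real part of the sum over dimensions of relation × subject × conjugate(object). This form of the score is an assumption here, since the abstract does not spell it out, and the function name `complex_score` and the toy two-dimensional embeddings are hypothetical values chosen only for illustration.

```python
from typing import Sequence


def complex_score(
    subject: Sequence[complex],
    relation: Sequence[complex],
    obj: Sequence[complex],
) -> float:
    """Real part of sum_k relation[k] * subject[k] * conj(obj[k]).

    Assumed form of the trilinear score; cost is linear in the
    embedding dimension.
    """
    if not (len(subject) == len(relation) == len(obj)):
        raise ValueError("all embeddings must share the same dimension")
    total = sum(r * s * o.conjugate() for r, s, o in zip(relation, subject, obj))
    return complex(total).real


# Hypothetical toy embeddings (not taken from any trained model).
alice = [1 + 2j, 0.5 - 1j]
bob = [2 - 1j, 1 + 1j]

# A purely real relation vector gives the same score in both directions.
symmetric_rel = [1 + 0j, 2 + 0j]
# A purely imaginary relation vector flips the sign when the entities swap.
antisymmetric_rel = [0 + 1j, 0 - 2j]

forward_sym = complex_score(alice, symmetric_rel, bob)
backward_sym = complex_score(bob, symmetric_rel, alice)
forward_anti = complex_score(alice, antisymmetric_rel, bob)
backward_anti = complex_score(bob, antisymmetric_rel, alice)
```

Under this assumed form, the conjugate on the object embedding is what lets a single set of entity vectors represent both symmetric and antisymmetric relations, depending on whether the relation vector is real or imaginary, while the score itself stays a single pass over the embedding dimensions.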

[197]  Emmanuel J. Candès,et al.  Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math..