Ontology learning from text: A look back and into the future

Ontologies are often viewed as the answer to the need for interoperable semantics in modern information systems. The explosion of textual information on the Read/Write Web, coupled with the increasing demand for ontologies to power the Semantic Web, has made (semi-)automatic ontology learning from text a very promising research area. This, together with the advanced state of related areas such as natural language processing, has fueled research into ontology learning over the past decade. This survey looks at how far we have come since the turn of the millennium and discusses the remaining challenges that will define the research directions in this area in the near future.
