Faceted Hierarchy: A New Graph Type to Organize Scientific Concepts and a Construction Method

In a scientific concept hierarchy, a parent concept may have several attributes, each of which takes multiple values that form a group of child concepts. We call these attributes facets: for example, classification has facets such as application (e.g., face recognition), model (e.g., SVM, kNN), and metric (e.g., precision). In this work, we aim to build faceted concept hierarchies from scientific literature. Existing hierarchy construction methods rely heavily on hypernym detection. However, faceted relations are direct parent-to-child links, whereas the hypernym relation is a multi-hop (ancestor-to-descendant) link with a single specific facet, “type-of”. We use information extraction techniques to find synonyms, sibling concepts, and ancestor-descendant relations in a data science corpus, and we propose a hierarchy growth algorithm that infers parent-child links from these three types of relationships. The algorithm resolves conflicts by maintaining the acyclic structure of the hierarchy.
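
To make the idea of growing a hierarchy while keeping it acyclic concrete, the following is a minimal Python sketch. It is not the algorithm proposed in this work. It assumes a simple greedy scheme in which synonyms are merged first, scored candidate parent-child links are inserted from highest to lowest score, any link that would close a cycle is rejected, and siblings inherit each other's (parent, facet) slots. The names `Hierarchy`, `grow`, the confidence scores, and the toy inputs are all hypothetical.

```python
from collections import defaultdict


class Hierarchy:
    """Toy faceted hierarchy: parent -> facet -> set of child concepts."""

    def __init__(self):
        self.canon = {}  # concept -> canonical synonym (assumed representation)
        self.children = defaultdict(lambda: defaultdict(set))
        self.parents = defaultdict(set)  # child -> {(parent, facet)}

    def find(self, concept):
        # Follow synonym links until the canonical name is reached.
        while self.canon.get(concept, concept) != concept:
            concept = self.canon[concept]
        return concept

    def add_synonym(self, a, b):
        # Merge b's synonym group into a's (call before adding any links).
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.canon[rb] = ra

    def reaches(self, src, dst):
        # Depth-first search: is dst reachable from src via child links?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            for kids in self.children.get(node, {}).values():
                stack.extend(kids)
        return False

    def add_link(self, parent, facet, child):
        # Insert parent -[facet]-> child unless it would create a cycle.
        p, c = self.find(parent), self.find(child)
        if p == c or self.reaches(c, p):
            return False
        self.children[p][facet].add(c)
        self.parents[c].add((p, facet))
        return True


def grow(synonyms, links, siblings):
    """Greedy growth: synonyms, then scored links, then sibling propagation."""
    h = Hierarchy()
    for a, b in synonyms:
        h.add_synonym(a, b)
    rejected = []
    # Higher-scored candidates go first, so they win any conflict.
    for score, parent, facet, child in sorted(links, reverse=True):
        if not h.add_link(parent, facet, child):
            rejected.append((parent, facet, child))
    # Siblings share a parent and facet: copy each one's slots to the other.
    for a, b in siblings:
        for x, y in ((a, b), (b, a)):
            for parent, facet in list(h.parents.get(h.find(x), ())):
                h.add_link(parent, facet, y)
    return h, rejected


# Hypothetical toy input; scores are made-up confidences.
h, rejected = grow(
    synonyms=[("svm", "support vector machine")],
    links=[
        (0.9, "classification", "model", "svm"),
        (0.8, "classification", "application", "face recognition"),
        (0.3, "support vector machine", "type-of", "classification"),
    ],
    siblings=[("svm", "knn")],
)
print(dict(h.children["classification"]))
print(rejected)
```

In this toy run, the low-scored link from "support vector machine" to "classification" is rejected because its synonym "svm" is already a child of "classification", and "knn" is attached under the model facet only through its sibling relation with "svm". Processing links in score order is one possible conflict-resolution policy among many.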
