From Vision to Content: Construction of Domain-Specific Multi-Modal Knowledge Graph

Knowledge graphs are usually constructed to describe the concepts that exist in the real world and the relationships between them. Many knowledge graphs exist for specific fields, but they usually focus on text or structured data and ignore the visual information in images, so they cannot play an adequate role in emerging visualization applications. To address this issue, we design a method that integrates image visual information and text information derived from Wikimedia Commons to construct a domain-specific multi-modal knowledge graph, taking the metallic materials domain as an example to illustrate the method. The text description of each image is treated as its context semantics and is used to acquire the image's context semantic labels based on DBpedia resources. Furthermore, we adopt a deep neural network model instead of simple visual descriptors to acquire the image's visual semantic labels using concepts from WordNet. To fuse the visual semantic labels and the context semantic labels, we propose a path-based concept extension and fusion strategy based on the conceptual hierarchies of WordNet and DBpedia. The strategy obtains effective extension concepts as well as the links between them, which increases the scale of the knowledge graph and strengthens the correlation between images. The experimental results show that the maximum extension level has a significant impact on the quality of the generated domain knowledge graph, and the best maximum extension level is determined separately for DBpedia and for WordNet. In addition, the results are compared with IMGpedia to further show the effectiveness of the proposed method.
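
The idea of extending labels along a concept hierarchy up to a maximum extension level can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the method from the paper: the two hierarchies are small hand-written dictionaries standing in for WordNet and DBpedia, the function names `extend` and `fuse` are hypothetical, and matching concepts by normalized name is an assumed simplification of the fusion step.

```python
from collections import deque

# Toy child -> parents maps. Illustrative stand-ins only,
# not real WordNet or DBpedia data.
WORDNET_PARENTS = {
    "steel": ["alloy"],
    "alloy": ["metal"],
    "metal": ["material"],
}
DBPEDIA_PARENTS = {
    "Stainless_steel": ["Steel"],
    "Steel": ["Alloy"],
    "Alloy": ["Metal"],
}


def extend(label, parents, max_level):
    """Walk up the hierarchy breadth-first, at most max_level steps.

    Returns (concepts, links): concepts maps each reached concept to its
    extension level (0 for the label itself); links is a set of
    (child, parent) edges found along the way.
    """
    concepts = {label: 0}
    links = set()
    queue = deque([label])
    while queue:
        node = queue.popleft()
        level = concepts[node]
        if level == max_level:
            continue  # do not extend beyond the maximum level
        for parent in parents.get(node, []):
            links.add((node, parent))
            if parent not in concepts:
                concepts[parent] = level + 1
                queue.append(parent)
    return concepts, links


def fuse(visual, context):
    """Pair concepts from both sides whose normalized names match.

    Name matching is an assumption made for this sketch.
    """
    def norm(name):
        return name.replace("_", " ").lower()

    context_by_name = {norm(c): c for c in context}
    return {(v, context_by_name[norm(v)])
            for v in visual if norm(v) in context_by_name}


visual_concepts, visual_links = extend("steel", WORDNET_PARENTS, 2)
context_concepts, context_links = extend("Stainless_steel", DBPEDIA_PARENTS, 2)
shared = fuse(visual_concepts, context_concepts)
print(sorted(shared))
```

In this toy setup, a larger `max_level` reaches more general concepts and produces more links, but the added concepts are also less specific to the domain, which is one way to read the reported sensitivity to the maximum extension level.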
