Semantic HMC: Ontology-Described Hierarchy Maintenance in Big Data Context

One of the biggest challenges in Big Data is extracting Value from large volumes of constantly changing data. Doing so requires extracting knowledge from these Big Data sources. To extract knowledge and value from unstructured text, we propose a Hierarchical Multi-Label Classification process called Semantic HMC, which uses ontologies to describe the predictive model, including the label hierarchy and the classification rules. To avoid overloading the user, the process automatically learns the ontology-described label hierarchy from a very large set of text documents. This paper presents a maintenance process that incrementally updates the relations of the ontology-described label hierarchy as a stream of unstructured text documents arrives in a Big Data context.
