Semantic Data Ingestion for Intelligent, Value-Driven Big Data Analytics

In this position paper we describe a conceptual model for intelligent Big Data analytics based on combinations of semantic and machine learning AI techniques (called AI ensembles). These processes are linked to business outcomes by explicitly modelling data value and by using semantic technologies as the underlying mode of communication between the diverse processes and organisations that create AI ensembles. Furthermore, we show how data governance can direct and enhance these ensembles by providing recommendations and insights that ensure the generated output produces the highest possible value for the organisation.
