Semantics of Data Mining Services in Cloud Computing

In recent years, with the rise of Cloud Computing, many companies that provide cloud services have added new services to their catalogues, such as data mining and data processing, taking advantage of the vast computing resources available to them. Several service definition proposals have been put forward to describe Cloud Computing services comprehensively. Because each provider defines the logic of its services in its own way, and of its data mining services in particular, the ability to describe services flexibly across providers is fundamental to maintaining the usability and portability of this type of Cloud Computing service. Semantic technologies based on Linked Data allow data mining services to be designed and modelled with a high degree of interoperability. This article presents a schema for the definition of data mining services in Cloud Computing that covers the key aspects of a service, such as prices, interfaces, Service Level Agreement, instances and data mining workflow, among others. The new schema is based on Linked Data and reuses other schemata to obtain a better and more complete definition of the services. To validate the completeness of the schema, a series of data mining services has been created in which algorithms such as Random Forest and K-Means are modelled as services. In addition, a dataset has been generated that includes the service definitions of several actual Cloud Computing data mining providers, confirming the effectiveness of the schema.
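As a rough illustration of what a Linked Data description of a data mining service could look like, the following minimal sketch builds a handful of RDF triples and prints them in an N-Triples-like form. It is not the schema presented in the article: the `http://example.org/dm-service#` namespace and the `DataMiningService`, `implementsAlgorithm`, `hasInterface` and `hasPrice` terms are hypothetical names chosen for this example, as are the service name and currency. Only the Dublin Core and GoodRelations namespaces are existing vocabularies, reused here to mirror the idea of building on other schemata.

```python
# Hypothetical namespace, for illustration only; not the article's schema.
EX = "http://example.org/dm-service#"
# Existing vocabularies reused in the sketch.
DCTERMS = "http://purl.org/dc/terms/"
GR = "http://purl.org/goodrelations/v1#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_service(name, algorithm, currency, interface):
    """Return triples describing one service (all EX terms are assumed)."""
    subject = iri(EX + name)
    price = iri(EX + name + "-price")
    return [
        (subject, iri(RDF_TYPE), iri(EX + "DataMiningService")),
        (subject, iri(DCTERMS + "title"), literal(name)),
        (subject, iri(EX + "implementsAlgorithm"), iri(EX + algorithm)),
        (subject, iri(EX + "hasInterface"), literal(interface)),
        (subject, iri(EX + "hasPrice"), price),
        (price, iri(RDF_TYPE), iri(GR + "PriceSpecification")),
        (price, iri(GR + "hasCurrency"), literal(currency)),
    ]


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Illustrative values only; they do not come from any real provider.
triples = describe_service("kmeans-service", "KMeans", "EUR", "REST")
print(to_ntriples(triples))
```

Because every aspect of the service is a triple keyed by an IRI, descriptions from different providers that share the same vocabulary can be merged and queried together, which is the interoperability argument made above.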
