Structure-aware Mashup service Clustering for cloud-based Internet of Things using genetic algorithm based clustering algorithm

Abstract The growing number of physical objects connected to the Internet makes it possible for smart things to access all kinds of cloud services. Mashup technology has become an effective way to develop IoT (Internet of Things) applications rapidly. However, the number of Mashup services (IoT applications) is now so large that discovering the desired IoT applications accurately and efficiently has become a problem. Service clustering can facilitate service discovery, and many different approaches have been proposed. However, many of them use only semantic similarities to guide clustering and require the number of clusters to be configured in advance. Structural similarities are orthogonal to semantic similarities, but they have never been used in service clustering approaches. In this paper, we propose a novel Mashup service clustering approach based on structural similarity and a genetic algorithm based clustering algorithm. First, it uses a two-mode graph to formally describe Mashups, Web APIs, and their relations. Second, it applies the SimRank algorithm to quantify the structural similarity between every pair of Mashup services. Finally, it introduces a genetic algorithm based clustering algorithm that organizes Mashup services into clusters and determines the number of clusters automatically. Empirical results on a real-world Mashup service data set collected from ProgrammableWeb demonstrate that our approach can cluster Mashup services efficiently without any constraint on the number of clusters, and that it performs better than other Mashup service clustering approaches based on semantic metrics.
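To make the first two steps concrete, the following is a minimal sketch of SimRank on a two-mode (bipartite) Mashup/Web API graph. It is not the paper's implementation: the function name `bipartite_simrank`, the single decay factor `decay=0.8`, the fixed iteration count, and the toy data are all assumptions chosen for illustration. The idea it shows is that two Mashups are structurally similar when they invoke similar Web APIs, and two Web APIs are similar when they are invoked by similar Mashups.

```python
from itertools import product


def bipartite_simrank(mashup_apis, decay=0.8, iterations=10):
    """Return SimRank scores for every ordered pair of Mashups.

    mashup_apis: dict mapping a Mashup name to the set of Web APIs it invokes.
    decay and iterations are illustrative defaults, not values from the paper.
    """
    mashups = list(mashup_apis)

    # Build the reverse side of the two-mode graph: Web API -> Mashups using it.
    api_mashups = {}
    for mashup, apis in mashup_apis.items():
        for api in apis:
            api_mashups.setdefault(api, set()).add(mashup)
    apis = list(api_mashups)

    # Initial scores: every node is fully similar to itself, 0 otherwise.
    sim_m = {(a, b): float(a == b) for a, b in product(mashups, repeat=2)}
    sim_a = {(a, b): float(a == b) for a, b in product(apis, repeat=2)}

    def update(nodes, neighbours, other_sim):
        """One SimRank step for one node type, using the other type's scores."""
        new = {}
        for u, v in product(nodes, repeat=2):
            if u == v:
                new[(u, v)] = 1.0
                continue
            nu, nv = neighbours[u], neighbours[v]
            if not nu or not nv:
                new[(u, v)] = 0.0  # a node with no neighbours is similar to nothing
                continue
            total = sum(other_sim[(x, y)] for x in nu for y in nv)
            new[(u, v)] = decay * total / (len(nu) * len(nv))
        return new

    for _ in range(iterations):
        # Both updates read the scores from the previous iteration.
        new_m = update(mashups, mashup_apis, sim_a)
        new_a = update(apis, api_mashups, sim_m)
        sim_m, sim_a = new_m, new_a
    return sim_m


# Hypothetical toy data: two Mashups share the same APIs, a third shares none.
toy = {
    "trip_planner": {"maps", "weather"},
    "city_guide": {"maps", "weather"},
    "checkout": {"payments"},
}
scores = bipartite_simrank(toy)
```

The resulting pairwise scores could then serve as the similarity input to a clustering step, such as the genetic algorithm based clustering described above, which searches over partitions rather than requiring the number of clusters up front.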
