WS-HFS: A Heterogeneous Feature Selection Framework for Web Services Mining

With the development of Service Computing and Big Data research, the heterogeneous data generated in the process of Service Computing is attracting increasing attention. Combining correlated data sources may help improve the performance of a given task. For example, in service recommendation, one can combine (1) user profile data (e.g., gender, age), (2) user log data (e.g., click-through data, service invocation records), (3) QoS data (e.g., response time, cost), (4) service functional descriptions (e.g., service name, WSDL document), and (5) service tagging data (i.e., tags annotated by users) to build a recommendation model. All these data sources provide informative but heterogeneous features. For instance, user profile and QoS data usually have nominal features reflecting users' backgrounds and services' qualities, log data provides term-based features about users' historical behaviors, and service functional descriptions and tagging data have term-based features reflecting services' functionalities and users' collective opinions. Given multiple heterogeneous data sources, one important challenge is to find a unified feature subspace that captures the knowledge from all data sources. To handle this problem, in this paper we propose a Heterogeneous Feature Selection framework, named WS-HFS, in which both the consensus and the weight of the different sources are considered. Moreover, we apply the proposed framework to Web service clustering as a case study and compare it with state-of-the-art approaches. Comprehensive experiments based on real data demonstrate the effectiveness of WS-HFS.
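To make the notion of heterogeneous features concrete, the following minimal sketch encodes one nominal source (QoS attributes) and one term-based source (tags) as binary feature matrices, then ranks features across both sources. It is not the WS-HFS algorithm, whose details are not given here: the toy services, the source weights, and the variance-based score are all assumptions chosen for illustration, and the helper names (`encode_nominal`, `encode_terms`, `select_features`) are hypothetical.

```python
# Illustrative only: a naive weighted, variance-based ranking that stands in
# for a real heterogeneous feature selection method.

# Hypothetical toy services (not from the paper's dataset).
services = [
    {"qos": {"cost": "free", "speed": "fast"}, "tags": ["weather", "forecast"]},
    {"qos": {"cost": "paid", "speed": "fast"}, "tags": ["weather", "map"]},
    {"qos": {"cost": "free", "speed": "fast"}, "tags": ["map", "route"]},
    {"qos": {"cost": "paid", "speed": "fast"}, "tags": ["weather", "route"]},
]


def encode_nominal(records):
    """One-hot encode attribute=value pairs into binary columns."""
    names = sorted({f"{k}={v}" for r in records for k, v in r.items()})
    rows = []
    for r in records:
        present = {f"{k}={v}" for k, v in r.items()}
        rows.append([1 if n in present else 0 for n in names])
    return names, rows


def encode_terms(term_lists):
    """Encode term lists as a binary bag-of-words matrix."""
    names = sorted({t for terms in term_lists for t in terms})
    rows = [[1 if n in set(terms) else 0 for n in names] for terms in term_lists]
    return names, rows


def column_variance(rows, j):
    """Population variance of column j."""
    col = [row[j] for row in rows]
    mean = sum(col) / len(col)
    return sum((x - mean) ** 2 for x in col) / len(col)


def select_features(sources, weights, k):
    """Rank features by (assumed source weight) * (column variance), keep top k."""
    scored = []
    for source, (names, rows) in sources.items():
        for j, name in enumerate(names):
            score = weights[source] * column_variance(rows, j)
            scored.append((-score, source, name))
    scored.sort()  # highest score first; ties broken by source, then name
    return [(source, name) for _, source, name in scored[:k]]


sources = {
    "qos": encode_nominal([s["qos"] for s in services]),
    "tags": encode_terms([s["tags"] for s in services]),
}
weights = {"qos": 0.4, "tags": 0.6}  # assumed weights, not learned

selected = select_features(sources, weights, k=3)
print(selected)
```

In this toy setting the constant feature `speed=fast` has zero variance and is never ranked ahead of an informative one, while the assumed source weights tilt the ranking toward tag features. A real framework would learn the weights and also account for agreement between sources, which this sketch does not attempt.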
