Learning Sparse Functional Factors for Large-Scale Service Clustering

The past decade has witnessed rapid growth of web-based services, making the discovery of user-desired services from a large and diverse service space a fundamental challenge. Service clustering has proven to be a promising solution: it automatically detects functionally similar services so that they can be searched and discovered together, which improves both the efficiency and the accuracy of service discovery. However, the autonomous nature of service providers leads to highly diverse usage of terms in their respective service descriptions. Furthermore, a typical service description contains very few terms because a service offers only a small number of focused functionalities. These characteristics distinguish service descriptions from regular text documents and pose additional challenges when clustering large-scale services. Recent work shows that service clustering can benefit from discovering functionality-related latent factors and using them, rather than a large and diverse set of terms, to represent services. Nonetheless, determining the total number of latent functional factors and sparsely assigning them to each service description remains a central challenge, especially for a large service space where there is no easy way to enumerate the different types of functionality. In this paper, we propose a machine learning method that automatically learns the number of latent functional factors in a service space. The method also enforces a sparsity constraint, so that each service is represented by a small number of latent functional factors. This constraint is in line with the fact that most real-world services provide only limited functionalities. We conduct extensive experiments on two sets of real-world service data to demonstrate the effectiveness of the proposed service clustering approach.
