WS-Finder: A Framework for Similarity Search of Web Services

Most existing Web service search engines employ keyword search over databases, which computes the distance between the query and the Web services over a fixed set of features. This approach often returns incomplete search results. The Earth Mover's Distance (EMD) has been used successfully in multimedia databases because it captures the differences between two distributions. However, calculating EMD is computationally intensive. In this paper, we present a novel framework called WS-Finder, which improves on existing keyword-based search techniques for Web services. In particular, we employ EMD for many-to-many partial matching between the contents of the query and the service attributes. We also develop a generalized minimization lower bound as a new EMD filter for partial matching. This new EMD filter is then combined with a k-NN algorithm to produce complete top-k search results. Finally, we show both theoretically and empirically that WS-Finder produces query answers effectively and efficiently.
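
The filter-and-refine idea described above can be illustrated with a minimal sketch. It is not the WS-Finder algorithm: it assumes equal-mass 1-D histograms with unit bin spacing rather than partial matching, and it substitutes the well-known centroid lower bound for the generalized minimization lower bound developed in the paper. The names `emd_1d`, `centroid_lower_bound`, and `knn_filter_refine` are hypothetical and chosen only for illustration.

```python
import heapq


def emd_1d(p, q):
    """Exact EMD between two equal-mass 1-D histograms (unit bin spacing).

    Assumption: p and q have the same length and the same total mass.
    In this special case EMD is the sum of absolute CDF differences.
    """
    total, carry = 0.0, 0.0
    for pi, qi in zip(p, q):
        carry += pi - qi      # mass that still has to move to the right
        total += abs(carry)
    return total


def centroid_lower_bound(p, q):
    """Cheap lower bound on emd_1d: distance between the two centroids.

    Stand-in for the paper's filter; valid only for equal-mass histograms.
    """
    mean_p = sum(i * w for i, w in enumerate(p))
    mean_q = sum(i * w for i, w in enumerate(q))
    return abs(mean_p - mean_q)


def knn_filter_refine(query, database, k):
    """Return the k nearest histograms and the number of exact EMD calls."""
    # Filter step: rank every candidate by its cheap lower bound.
    candidates = sorted(
        (centroid_lower_bound(query, h), idx) for idx, h in enumerate(database)
    )
    best = []  # max-heap of (-distance, index) holding the current top-k
    exact_calls = 0
    for lower_bound, idx in candidates:
        # No remaining candidate can beat the current k-th best distance.
        if len(best) == k and lower_bound >= -best[0][0]:
            break
        # Refine step: compute the exact distance only when needed.
        d = emd_1d(query, database[idx])
        exact_calls += 1
        if len(best) < k:
            heapq.heappush(best, (-d, idx))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, idx))
    return sorted((-nd, idx) for nd, idx in best), exact_calls


if __name__ == "__main__":
    database = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.5, 0.5, 0.0, 0.0],
    ]
    query = [0.9, 0.1, 0.0, 0.0]
    top, calls = knn_filter_refine(query, database, k=2)
    print(top, "exact EMD computations:", calls)
```

Because the lower bound never exceeds the exact distance, the loop can stop as soon as the smallest remaining bound reaches the current k-th best distance, so the top-k answer is the same as a full scan while fewer exact EMD computations are performed.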
