P-FCM: a proximity-based fuzzy clustering for user-centered web applications

Abstract In recent years, the Internet and the web have evolved in an astonishing way. Standard web search services play an important role as useful tools for the Internet community, even though they suffer from certain difficulties. The web continues to grow, which makes it harder for Internet-based information and retrieval systems to remain reliable. Although the gap between the expected information and the returned information has been analyzed extensively, the task of a web search engine is still very hard. Web searching raises several problems, one of which arises in the query phase. Each engine provides an interface that the user is forced to learn. Often, the search returns a huge list of answers that are irrelevant, unavailable, or outdated. The tedium of querying, which stems from the fact that queries are too weak to capture what the user wants to express, has prompted designers to enrich human-system interaction with new searching metaphors. One of these is the search for “similar” pages, as offered by Google, Yahoo and others. The idea is a good one, since similarity gives an easy and intuitive mechanism for expressing a complex relation. We believe that this approach could become more effective if the user could rely on greater flexibility in expressing similarity dependencies than the currently available options allow. In this paper we introduce a novel method for considering and processing user-driven similarity during web navigation. We define an extension of the fuzzy C-means algorithm, namely proximity fuzzy C-means (P-FCM), which incorporates a measure of similarity or dissimilarity as the user’s feedback on the clusters. We present the theoretical framework of this extension and then observe, through a suite of web-based experiments, how significant the impact of the user’s feedback is while P-FCM is running. These observations suggest that the P-FCM approach can offer a relatively simple way of improving web page classification according to the user’s interaction with the search engine.
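To make the starting point of the extension concrete, the following is a minimal Python sketch of the plain fuzzy C-means baseline that P-FCM builds on. It is not the P-FCM procedure itself, which the abstract does not specify. The function names `fcm` and `induced_proximity`, the parameter defaults, and the toy data are hypothetical. In particular, measuring the proximity of two patterns as the sum of the cluster-wise minima of their memberships is an assumption made for illustration, chosen because it gives a value in [0, 1] that could be compared with a user's similarity hint.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means (no user feedback). Returns (centers, u), u[i][k]."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial partition matrix; memberships of each point sum to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= s
    centers = []
    for _ in range(iters):
        # Prototype update: mean of the points weighted by u^m.
        centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update from squared Euclidean distances.
        change = 0.0
        for k in range(n):
            d2 = [sum((points[k][d] - centers[i][d]) ** 2 for d in range(dim))
                  for i in range(c)]
            if min(d2) == 0.0:
                # Point coincides with a prototype: share membership there.
                hits = [1.0 if x == 0.0 else 0.0 for x in d2]
                new = [h / sum(hits) for h in hits]
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                s = sum(inv)
                new = [x / s for x in inv]
            for i in range(c):
                change = max(change, abs(new[i] - u[i][k]))
                u[i][k] = new[i]
        if change < tol:
            break
    return centers, u


def induced_proximity(u, k1, k2):
    """Assumed proximity of patterns k1 and k2: sum over clusters of the
    smaller of the two memberships (1.0 when k1 == k2)."""
    return sum(min(row[k1], row[k2]) for row in u)


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of 2-D feature vectors.
    pages = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, u = fcm(pages, c=2)
    print(induced_proximity(u, 0, 1), induced_proximity(u, 0, 3))
```

In a feedback-driven variant, a user's statement that two pages are similar or dissimilar could be compared with this induced value, and the partition matrix adjusted to reduce the disagreement. That adjustment step is deliberately left out here, because its form is not given in the abstract.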
