Personalized mining of web documents using link structures and fuzzy concept networks

Personalized search engines are important tools for finding web documents for specific users, because they apply data mining and knowledge discovery methods to locate information on the WWW as accurately as possible. Traditional search engines vary widely in type and features, differing in the functionality they support and the ranking methods they use. Newer search engines that exploit link structures have produced improved results that overcome some limitations of conventional text-based search engines. Going a step further, this paper presents a system that provides users with personalized results derived from a search engine that uses link structures. A fuzzy document retrieval system, constructed from a fuzzy concept network based on the user's profile, personalizes the results returned by the link-based search engine according to the preferences of the specific user. A preliminary experiment with six subjects indicates that the developed system is capable of retrieving web pages that are not only relevant but also personalized to the preferences of the user.
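The abstract does not give the formulas behind the fuzzy concept network, so the following is only a minimal sketch under stated assumptions. It assumes the formulation common in the fuzzy information retrieval literature: concept-to-concept relevance degrees in [0, 1] are stored in a matrix built from the user's profile, implicit relations are inferred by a max-min transitive closure, and each page's concept descriptor is expanded through that closure before the link-based results are reordered. All function names, concepts, pages, and numeric values are hypothetical.

```python
def max_min_compose(a, b):
    """Max-min composition of two fuzzy relation matrices (lists of lists)."""
    return [
        [max(min(a[i][k], b[k][j]) for k in range(len(b)))
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transitive_closure(t):
    """Compose t with itself until no relevance degree increases any further."""
    closure = t
    while True:
        composed = max_min_compose(closure, t)
        # Keep the stronger of the known and the newly inferred degree.
        merged = [[max(x, y) for x, y in zip(row_c, row_n)]
                  for row_c, row_n in zip(closure, composed)]
        if merged == closure:
            return closure
        closure = merged


def rerank(pages, descriptors, closure, concept_index):
    """Order pages by their expanded degree for one concept (highest first)."""
    expanded = max_min_compose(descriptors, closure)
    scored = [(page, row[concept_index]) for page, row in zip(pages, expanded)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical user profile over three concepts (assumed values).
concepts = ["search", "crawler", "ranking"]
profile = [
    [1.0, 0.8, 0.0],  # "search" is strongly related to "crawler"
    [0.0, 1.0, 0.5],  # "crawler" is moderately related to "ranking"
    [0.0, 0.0, 1.0],
]

# Hypothetical pages returned by a link-based search engine, with the
# degree to which each page is described by each concept.
pages = ["page_a", "page_b"]
descriptors = [
    [0.9, 0.0, 0.0],
    [0.0, 0.0, 0.7],
]

closure = transitive_closure(profile)
personalized = rerank(pages, descriptors, closure, concepts.index("ranking"))
print(personalized)
```

In this toy profile, "search" and "ranking" are not directly related, but the closure infers a relation through "crawler", so a page described only by "search" still receives a nonzero degree for "ranking". The max-min closure is one possible inference rule; the system described in the paper may use a different one.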
