Mining linguistic browsing patterns in the world wide web

Abstract: World Wide Web applications have grown rapidly and have had a significant impact on computer systems. Among them, browsing the web for useful information is probably the most common. Because of this heavy use, efficient and effective web retrieval has become an important research topic. Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for a specific purpose. In this paper, we use data-mining techniques to discover relevant browsing behavior from web-server log data, which can then help in formulating rules for retrieving web pages. The time a customer spends browsing each web page is used to analyze retrieval behavior. Since the collected data are numeric, fuzzy concepts are used to convert them into linguistic terms. A web-mining algorithm is then proposed to find relevant browsing behavior from the linguistic data. In the later mining steps, each page uses only the linguistic term with the maximum cardinality, so the number of fuzzy regions to be processed equals the number of pages, which greatly reduces computation time. The mined patterns exhibit the browsing behavior and can be used to provide appropriate suggestions to web-server managers.
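The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the membership functions (three regions, Short, Middle and Long, with breakpoints at 20, 70, 80 and 130 seconds), the toy `sessions` log, the support threshold, and the use of the minimum operator to combine two pages are all assumptions chosen for illustration. The sketch only shows the idea of fuzzifying browsing times, keeping the single maximum-cardinality linguistic term per page, and mining patterns over those terms.

```python
from itertools import combinations

# Hypothetical log: each session maps a page to its browsing time in seconds.
sessions = [
    {"A": 30, "B": 120, "C": 75},
    {"A": 25, "B": 110},
    {"B": 140, "C": 60},
    {"A": 20, "B": 100, "C": 90},
]

def membership(seconds):
    """Fuzzify a browsing time into three assumed linguistic terms."""
    clamp = lambda x: max(0.0, min(1.0, x))
    return {
        "Short": clamp((70 - seconds) / 50),
        "Middle": clamp(min((seconds - 20) / 50, (130 - seconds) / 50)),
        "Long": clamp((seconds - 80) / 50),
    }

def dominant_terms(sessions):
    """For each page, keep only the term with the largest scalar cardinality."""
    counts = {}
    for session in sessions:
        for page, seconds in session.items():
            for term, degree in membership(seconds).items():
                counts[(page, term)] = counts.get((page, term), 0.0) + degree
    best = {}
    for (page, term), count in counts.items():
        if page not in best or count > best[page][1]:
            best[page] = (term, count)
    return best  # page -> (term, cardinality): one fuzzy region per page

def mine_pairs(sessions, best, min_support):
    """Find page pairs whose fuzzy co-occurrence count reaches min_support."""
    large = sorted(p for p, (_, c) in best.items() if c >= min_support)
    result = {}
    for p, q in combinations(large, 2):
        total = 0.0
        for session in sessions:
            if p in session and q in session:
                # Assumed combination rule: minimum of the two memberships.
                total += min(membership(session[p])[best[p][0]],
                             membership(session[q])[best[q][0]])
        if total >= min_support:
            result[(f"{p}.{best[p][0]}", f"{q}.{best[q][0]}")] = total
    return result

best = dominant_terms(sessions)
pairs = mine_pairs(sessions, best, min_support=1.7)
print(best)
print(pairs)
```

With this toy data each page keeps exactly one region (A.Short, B.Long, C.Middle), so only three fuzzy regions enter the pair-mining step instead of nine. That reduction is the source of the computational saving the abstract describes.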
