A Website Mining Model Centered on User Queries

We present a model for mining the user queries recorded in a website's access logs and for relating this information to the website's overall usage, structure, and content. The aim of the model is to discover, in a simple way, information that helps improve the quality of the website, making it more intuitive and better suited to the needs of its users. The model provides a methodology for analyzing and classifying the different types of queries registered in a website's usage logs, such as queries submitted by users to the site's internal search engine and queries on global search engines that lead to documents in the website. These queries provide useful information about the topics that interest the site's visitors, and the navigation patterns associated with these queries indicate whether or not the documents in the site satisfied the user's needs at that moment.
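
As an illustration of the two query types named above, the following minimal sketch separates internal-search queries from queries that arrive through an external search engine's referrer. It is not the model's actual implementation. It assumes the logs use the Apache/NCSA Combined Log Format; the host name `www.example.org`, the search path `/search`, and the query parameter `q` are hypothetical placeholders that would differ from site to site.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: one record per line in Apache/NCSA Combined Log Format.
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

SITE_HOST = "www.example.org"     # hypothetical: the site being analyzed
INTERNAL_SEARCH_PATH = "/search"  # hypothetical: path of the site's own search page
QUERY_PARAM = "q"                 # assumption: parameter that carries the query text


def _query_from_url(url):
    """Return the split URL and its normalized query text (or None)."""
    parts = urlsplit(url)
    values = parse_qs(parts.query).get(QUERY_PARAM)
    return parts, (values[0].strip().lower() if values else None)


def classify(line):
    """Classify one log line as an internal or external query.

    Returns (kind, query, landing_page) or None when the line carries no query.
    landing_page is the document reached from an external search engine.
    """
    m = LOG_RE.match(line)
    if m is None:
        return None  # line does not follow the assumed format

    # Internal query: the request itself targets the site's search page.
    req_parts, req_query = _query_from_url(m["path"])
    if req_parts.path == INTERNAL_SEARCH_PATH and req_query:
        return ("internal", req_query, None)

    # External query: the referrer is another host and carries a query string.
    ref_parts, ref_query = _query_from_url(m["referrer"])
    if ref_query and ref_parts.hostname and ref_parts.hostname != SITE_HOST:
        return ("external", ref_query, req_parts.path)

    return None


if __name__ == "__main__":
    sample = (
        '10.0.0.2 - - [10/Oct/2005:13:56:01 -0700] '
        '"GET /library/hours.html HTTP/1.0" 200 512 '
        '"http://search.example.com/search?q=library+hours" "Mozilla/4.0"'
    )
    print(classify(sample))
```

Pairing each external query with its landing page, as the last tuple element does, is one simple way to start relating queries to the documents users actually reach; the navigation that follows within the session would still have to be reconstructed separately.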
