Combining evidence for automatic Web session identification

Contextual information provides an important basis for identifying and understanding users' information needs. Our previous work on traditional information retrieval systems has shown how contextual information can improve retrieval performance. Given the vast quantity and variety of information available on the Web, and the short queries typical of Web searches, it becomes even more crucial to extract appropriate contextual information to facilitate personalized services. However, finding users' contextual information is not straightforward, especially in the Web search environment, where less is known about individual users. In this paper, we present an approach that has significant potential for studying Web users' search contexts. The approach automatically groups a user's consecutive search activities on the same search topic into one session. It uses Dempster-Shafer theory to combine evidence extracted from two sources, each of which is based on statistical data from Web search logs. Our evaluation demonstrates that the approach achieves a significant improvement over previous methods of session identification.
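To illustrate how evidence from two sources can be combined under Dempster-Shafer theory, the following is a minimal sketch of Dempster's rule of combination. It is not the implementation used in this work: the two-hypothesis frame (`SAME`, `NEW`), the names `m_source_a` and `m_source_b`, and all mass values are assumptions made up for illustration, and the abstract does not specify how the actual masses are derived from the search logs.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the shared hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical frame: does the current activity continue the same
# session, or does it start a new one?
SAME = frozenset({"same_session"})
NEW = frozenset({"new_session"})
EITHER = SAME | NEW  # mass left uncommitted (ignorance)

# Illustrative mass values only; not taken from any search log.
m_source_a = {SAME: 0.6, NEW: 0.1, EITHER: 0.3}
m_source_b = {SAME: 0.5, NEW: 0.2, EITHER: 0.3}

m = combine(m_source_a, m_source_b)
decision = "same session" if m[SAME] > m[NEW] else "new session"
```

With these illustrative values the conflict mass is 0.17, and the combined belief in `SAME` rises to about 0.76, higher than either source assigns on its own. Mass placed on `EITHER` lets a source express uncertainty without being forced to commit to one hypothesis.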
