Paper information - Why we search: visualizing and predicting user behavior

The aggregation and comparison of behavioral patterns on the WWW represent a tremendous opportunity for understanding past behaviors and predicting future behaviors. In this paper, we take a first step toward achieving this goal. We present a large-scale study correlating the behaviors of Internet users on multiple systems, ranging in size from 27 million queries to 14 million blog posts to 20,000 news articles. We formalize a model for events in these time-varying datasets and study their correlation. We have created an interface for analyzing the datasets, which includes a novel visual artifact, the DTWRadar, for summarizing differences between time series. Using our tool, we identify a number of behavioral properties that allow us to understand the predictive power of patterns of use.

[1] Jaime Teevan, et al. History repeats itself: repeat queries in Yahoo's logs, 2006, SIGIR '06.

[2] Ji-Rong Wen, et al. Query clustering using user logs, 2002, TOIS.

[3] Eamonn J. Keogh, et al. Derivative Dynamic Time Warping, 2001, SDM.

[4] Susan T. Dumais, et al. Newsjunkie: providing personalized newsfeeds via analysis of information novelty, 2004, WWW '04.

[5] S. Chiba, et al. Dynamic programming algorithm optimization for spoken word recognition, 1978.

[6] Abdur Chowdhury, et al. A picture of search, 2006, InfoScale '06.

[7] Dimitrios Gunopulos, et al. Identifying similarities, periodicities and bursts for online search queries, 2004, SIGMOD '04.

[8] Konstantina Martzoukou, et al. A review of Web information seeking research: considerations of method and foci of interest, 2005, Inf. Res.

[9] Jon Kleinberg, et al. Traffic-based feedback on the web, 2004, Proceedings of the National Academy of Sciences of the United States of America.

[10] Eamonn J. Keogh, et al. HOT SAX: efficiently finding the most unusual time series subsequence, 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[11] Jon M. Kleinberg, et al. Bursty and Hierarchical Structure in Streams, 2002, Data Mining and Knowledge Discovery.

[12] Lucy T. Nowell, et al. ThemeRiver: Visualizing Thematic Changes in Large Document Collections, 2002, IEEE Trans. Vis. Comput. Graph.

[13] Jimmy Lin, et al. Identification of user sessions with hierarchical agglomerative clustering, 2006, ASIST.

[14] Ramanathan V. Guha, et al. The predictive power of online chatter, 2005, KDD '05.

[15] Marc Alexa, et al. Visualizing time-series on spirals, 2001, IEEE Symposium on Information Visualization, 2001. INFOVIS 2001.

[16] Eamonn J. Keogh, et al. Visualizing and Discovering Non-Trivial Patterns in Large Time Series Databases, 2005, Inf. Vis.

[17] L. R. Rabiner, et al. A comparative study of several dynamic time-warping algorithms for connected-word recognition, 1981, The Bell System Technical Journal.

[18] Yiming Yang, et al. Topic Detection and Tracking Pilot Study Final Report, 1998.

[19] Andrew P. Witkin, et al. Scale-Space Filtering, 1983, IJCAI.

[20] E. Tufte. Beautiful Evidence, 2006.

[21] Steve Chien, et al. Semantic similarity between search engine queries using temporal correlation, 2005, WWW '05.

[22] Jarke J. van Wijk, et al. Cluster and Calendar Based Visualization of Time Series Data, 1999, INFOVIS.

[23] David D. Jensen, et al. Mining of Concurrent Text and Time Series, 2008.