Discovering geographical-specific interests from web click data

As the Internet continues to play an important role in many business applications, companies can sharpen their competitive edge by offering geographically tailored content that reflects the common interests of the region in which their web visitors are located. In this paper, we define the problem of mining geographical-specific interest patterns. We use a quadtree to model the influence distributions of different features and design an algorithm, Flex-iPROBER, to mine geographical-specific interest patterns that are significant in a local region. We further examine how these patterns change over time and develop an algorithm, MineGIC, to discover pattern changes efficiently. Experimental results demonstrate that the proposed algorithms are scalable and efficient. The patterns mined from real-world web click datasets reveal region-specific interests and show how the interests of people in those regions evolve.
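To make the role of the quadtree concrete, the following is a minimal sketch of a point-region quadtree that keeps per-feature click counts in every cell. It is not Flex-iPROBER and does not model influence distributions as the paper does; the class name `QuadNode`, the `capacity` and `max_depth` parameters, the feature labels, and the sample coordinates are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class QuadNode:
    """Hypothetical quadtree cell covering [x0, x1) x [y0, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int = 4      # assumed split threshold
    depth: int = 0
    max_depth: int = 8     # assumed cap to avoid endless splitting
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)
    counts: Counter = field(default_factory=Counter)

    def contains(self, x, y):
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def insert(self, x, y, feature):
        """Add one click; every cell on the path counts the feature."""
        if not self.contains(x, y):
            return False
        self.counts[feature] += 1
        if self.children:
            return any(c.insert(x, y, feature) for c in self.children)
        self.points.append((x, y, feature))
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()
        return True

    def _split(self):
        """Create four equal sub-cells and push stored points down."""
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        boxes = [(self.x0, self.y0, mx, my), (mx, self.y0, self.x1, my),
                 (self.x0, my, mx, self.y1), (mx, my, self.x1, self.y1)]
        self.children = [
            QuadNode(*b, capacity=self.capacity, depth=self.depth + 1,
                     max_depth=self.max_depth)
            for b in boxes
        ]
        pending, self.points = self.points, []
        for px, py, feat in pending:
            any(c.insert(px, py, feat) for c in self.children)

    def leaf_at(self, x, y):
        """Return the finest cell containing (x, y); assumes it is in range."""
        node = self
        while node.children:
            node = next(c for c in node.children if c.contains(x, y))
        return node


# Made-up clicks: "sports" clustered in one corner, "news" spread out.
root = QuadNode(0, 0, 100, 100)
clicks = [(10, 10, "sports"), (11, 12, "sports"), (12, 11, "sports"),
          (13, 13, "sports"), (14, 10, "sports"), (10, 14, "sports"),
          (80, 80, "news"), (60, 20, "news"), (20, 70, "news")]
for x, y, feature in clicks:
    root.insert(x, y, feature)

leaf = root.leaf_at(10, 10)
local_share = leaf.counts["sports"] / sum(leaf.counts.values())
global_share = root.counts["sports"] / sum(root.counts.values())
```

In this toy data the share of "sports" clicks in the leaf cell around (10, 10) is higher than its share over the whole area, which is the kind of local contrast a geographical-specific interest pattern is meant to capture. Deciding whether such a contrast is significant is the job of the mining algorithm and is not attempted here.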
