Web Community Mining and Web Log Mining: Commodity Cluster Based Execution

The emergence of the WWW has opened new frontiers for database research. Web mining has become a hot topic because the rapid expansion and chaotic nature of the WWW pose technical challenges and also yield interesting discoveries. In general, web mining can be classified into web structure mining and web usage mining. Here we introduce two applications of web mining: in the first, we mine the web structure to identify web communities; in the second, we mine the web usage of mobile internet users on a location-aware search engine. These applications require heavy computational power as well as good scalability. A cluster of commodity PCs is a suitable platform for such applications. We also report some approaches for optimal parallel execution of mining algorithms on a PC cluster.
