Toward Distributed Knowledge Discovery on Grid Systems

Massive amounts of data are being collected and stored, not only in science but also in industry and commerce, and efficiently mining and managing the useful information in this data has become both a challenge and a pressing economic need. This has led to the development of distributed data mining techniques for handling huge multi-dimensional datasets distributed among several sites.
