Webservices oriented data mining in knowledge architecture

Given the proliferation of data and the computational effort that mining it demands, efficient data mining requires massive parallelism. The DisDaMin (Distributed Data Mining) project addressed distributed knowledge discovery through the parallelization of data mining tasks. DisDaMin algorithms are deployed on DG-ADAJ (Desktop-Grid Adaptive Application in Java), a middleware platform for desktop grids. On top of DG-ADAJ, SOA-specific components could be employed to provide additional features and to improve current operation. An ESB (Enterprise Service Bus) built on top of DG-ADAJ would improve availability, solve interoperability issues by exposing services through well-established interfaces, and offer a loosely coupled infrastructure. An additional enactment layer would improve the performance of the intelligent fragmentation of data while also providing the support needed to execute data workflows.
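The general idea of fragmenting a database, mining each fragment in parallel, and merging the partial results can be illustrated with a minimal sketch. It is not the DisDaMin algorithm or the DG-ADAJ API: the function names (`fragment`, `local_pair_counts`, `frequent_pairs`) are hypothetical, the round-robin split is a naive stand-in for intelligent fragmentation, and a local thread pool stands in for desktop-grid workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def fragment(transactions, n_fragments):
    """Split the database round-robin (a naive stand-in for intelligent fragmentation)."""
    return [transactions[i::n_fragments] for i in range(n_fragments)]


def local_pair_counts(frag):
    """Count how many transactions of one fragment contain each item pair."""
    counts = Counter()
    for transaction in frag:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(transaction)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(transactions, min_support, n_fragments=2):
    """Mine each fragment in a worker, then merge the partial counts."""
    frags = fragment(transactions, n_fragments)
    # Threads stand in for grid nodes; this is an assumption of the sketch.
    with ThreadPoolExecutor(max_workers=n_fragments) as pool:
        partials = list(pool.map(local_pair_counts, frags))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {pair: n for pair, n in total.items() if n >= min_support}


if __name__ == "__main__":
    db = [["a", "b"], ["a", "b", "c"], ["a", "b"], ["b", "c"]]
    print(frequent_pairs(db, min_support=2))
```

Because the per-fragment counts are simply summed, the result does not depend on how the data is split. In this sketch the choice of fragmentation would therefore affect only load balance and communication cost, not correctness.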
