Developing an open knowledge discovery support system for a network environment

Knowledge discovery in databases (KDD) is a highly complex process in which many data manipulation tools with different characteristics can, and in fact must, be used together in an interactive and iterative fashion to reach the goal of extracting previously unknown, potentially useful information. In this paper we analyze the major sources of this complexity in the context of network organizations, pointing out the need to support the user in many different ways and at very different levels of granularity, from the use of a single tool to the management of whole, distributed KDD projects. Unfortunately, currently available systems fail to support users in at least some of these respects. We then propose a solution based on the service-oriented computing paradigm, arguing that the advantages of this paradigm, namely openness, modularity, reusability, and transparency, as well as ubiquity, can help in the design of an effective support system for knowledge discovery in databases in network environments.
