Knowledge discovery standards

As knowledge discovery (KD) matures and enters the mainstream, the onus is on technology developers to deliver it in a deployable, embeddable form. This transition from a stand-alone technology controlled by a knowledgeable few to one that is widely accessible and usable will require the development of standards. These standards need to address several aspects of KD, ranging from the process of applying the technology in a business environment, so as to make that process more transparent and repeatable, through to the representation of the knowledge generated and support for application developers. The large variety of data and model formats that researchers and practitioners must deal with, together with the lack of procedural support in KD, has prompted a number of standardization efforts in recent years, led by industry and supported by the KD community at large. This paper provides an overview of the most prominent of these standards and uses example applications to highlight how they relate to each other.
