SEWEBAR-CMS: semantic analytical report authoring for data mining results

SEWEBAR-CMS is a set of extensions for the Joomla! Content Management System (CMS) that adds the functionality required for it to serve as a communication platform between the data analyst, the domain expert and the report user. SEWEBAR-CMS integrates with existing data mining software through the Predictive Model Markup Language (PMML). Background knowledge is entered via a web-based elicitation interface and is preserved in documents conforming to the proposed Background Knowledge Exchange Format (BKEF) specification. SEWEBAR-CMS offers web service integration with semantic knowledge bases, in which PMML and BKEF data are stored. By combining domain knowledge and mining model visualizations with the results of queries against the knowledge base, the data analyst conveys the mining results to the end user through a semi-automatically generated textual analytical report. The paper demonstrates the use of SEWEBAR-CMS on a real-world task from the cardiological domain and presents a user study showing that the proposed report authoring support leads to a statistically significant decrease in the time needed to author the analytical report.
