A Rule Management System for Knowledge Based Data Cleaning

In this paper, we propose a knowledge-based rule management system for data cleaning. The system combines features of rule-based systems and rule-based data cleaning frameworks. Its advantages are threefold. First, it proposes a strong, unified rule form based on first-order structure that permits the representation and management of all types of rules, and of their quality, through a set of characteristics. Second, it raises the quality of the rules, which in turn determines the quality of data cleaning. Third, it uses an appropriate knowledge acquisition process, the weakest task in current rule-based and knowledge-based systems. Because several research works have shown that data cleaning is driven by domain knowledge rather than by data, we have identified and analyzed the properties that distinguish knowledge and rules from data in order to better determine most of the components of the proposed system. To illustrate our system, we also present a first experiment, a case study in the health sector, in which we demonstrate how the system helps improve data quality. The autonomy, extensibility, and platform independence of the proposed rule management system facilitate its incorporation into any system concerned with data quality management.
