Information-theoretic fuzzy approach to data reliability and data mining

A novel, information-theoretic fuzzy approach to discovering unreliable data in a relational database is presented. A multilevel information-theoretic connectionist network is constructed to evaluate activation functions of partially reliable database values. The degree of value reliability is defined as a fuzzy measure of the difference between the maximum attribute activation and the actual value activation. Unreliable values can be removed from the database or corrected to the values predicted by the network. The method is applied to a real-world relational database, which is extended to a fuzzy relational database by adding fuzzy attributes that represent the reliability degrees of crisp attributes. The highest connection weights in the network are translated into meaningful if-then rules. This work aims to improve the reliability of data in a relational database by developing a framework for discovering, accessing, and correcting low-reliability data.
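
The reliability definition above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the sigmoid-shaped mapping, the `beta` parameter, the threshold, and all function names below are assumptions made for illustration only. The sketch maps the gap between the maximum activation over an attribute's values and the activation of the stored value to a degree in (0, 1], and replaces a low-reliability value with the most activated one.

```python
import math

def reliability_degree(activations, actual_value, beta=1.0):
    """Fuzzy reliability of a stored value (hypothetical formula).

    activations: dict mapping each candidate attribute value to its
    network activation. The result is 1.0 when the stored value has the
    maximum activation and decreases toward 0 as the gap grows.
    """
    max_activation = max(activations.values())
    gap = max_activation - activations[actual_value]
    return 2.0 / (1.0 + math.exp(beta * gap))

def predicted_value(activations):
    """Value with the highest activation, used here as the correction."""
    return max(activations, key=activations.get)

def correct_if_unreliable(activations, actual_value, threshold=0.8, beta=1.0):
    """Return the stored value, or the predicted one if reliability is low.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    if reliability_degree(activations, actual_value, beta) < threshold:
        return predicted_value(activations)
    return actual_value

# Hypothetical activations for one attribute with three crisp values.
activations = {"low": 0.1, "medium": 0.7, "high": 0.2}
r_medium = reliability_degree(activations, "medium")
r_low = reliability_degree(activations, "low", beta=5.0)
corrected = correct_if_unreliable(activations, "low", beta=5.0)
```

In this sketch a larger `beta` penalizes the same activation gap more strongly, so the choice of `beta` and `threshold` together determines how many stored values are flagged for removal or correction.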
