Towards Data Quality and Data Mining Using Constraints in XML

High-quality data is necessary for data mining techniques and, conversely, data mining techniques can be used to measure data quality. Data mining and data quality have received much attention for relational data. However, as massive amounts of data are now stored and represented on the web in XML, both data quality for mining purposes and the use of data mining techniques for quality measurement are attracting research interest. We address two important interrelated issues: how quality XML data is useful for data mining in XML, and how data mining in XML can be used to measure the quality of XML data. In addressing both issues, we consider XML constraints, because constraints in XML can be used to measure the quality of XML data and also to find important patterns and association rules in XML data mining. We mainly address the theoretical framework for data quality and data mining for XML. Our research is directed towards the broader task of data mining and data quality for XML data integration.
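
As a rough illustration of how a constraint might serve as a quality measure, the following minimal sketch checks a functional-dependency-style constraint over a small XML document. It is not the framework proposed in the paper. The sample document, the element names (`book`, `isbn`, `title`), and the helper functions `fd_violations` and `fd_quality_score` are all hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data; element names are illustrative assumptions.
SAMPLE = """
<library>
  <book><isbn>111</isbn><title>XML Basics</title></book>
  <book><isbn>111</isbn><title>XML Basic</title></book>
  <book><isbn>222</isbn><title>Data Mining</title></book>
</library>
""".strip()


def _group_values(root, record_path, lhs, rhs):
    """Map each lhs value to the set of rhs values it co-occurs with."""
    groups = defaultdict(set)
    for record in root.findall(record_path):
        key = record.findtext(lhs)
        value = record.findtext(rhs)
        # Records missing either child are skipped in this sketch.
        if key is not None and value is not None:
            groups[key.strip()].add(value.strip())
    return groups


def fd_violations(root, record_path, lhs, rhs):
    """Return lhs values that map to more than one rhs value (lhs -> rhs fails)."""
    groups = _group_values(root, record_path, lhs, rhs)
    return {key: values for key, values in groups.items() if len(values) > 1}


def fd_quality_score(root, record_path, lhs, rhs):
    """Fraction of distinct lhs values that satisfy lhs -> rhs (1.0 if none)."""
    groups = _group_values(root, record_path, lhs, rhs)
    if not groups:
        return 1.0
    consistent = sum(1 for values in groups.values() if len(values) == 1)
    return consistent / len(groups)


root = ET.fromstring(SAMPLE)
violations = fd_violations(root, "book", "isbn", "title")
score = fd_quality_score(root, "book", "isbn", "title")
print(violations)
print(score)
```

Here the constraint "within `book`, `isbn` determines `title`" is violated by the two records sharing ISBN `111`, so the score falls below 1. The scoring rule (share of consistent left-hand-side values) is one simple choice among many and is an assumption of this sketch.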
