Data Quality in Genome Databases
[1] Timos K. Sellis,et al. ARKTOS: towards the modeling, design, control and execution of ETL processes , 2001, Inf. Syst..
[2] R. Doolittle. Of urfs and orfs: a primer on how to analyze derived amino acid sequences , 1986.
[3] Dennis Shasha,et al. Declarative Data Cleaning: Language, Model, and Algorithms , 2001, VLDB.
[4] Charles Elkan,et al. An Efficient Domain-Independent Algorithm for Detecting Approximately Duplicate Database Records , 1997, DMKD.
[5] Heiko Mueller,et al. Problems, Methods, and Challenges in Comprehensive Data Cleansing , 2005.
[6] William E. Winkler,et al. Methods for evaluating and creating data quality , 2004, Inf. Syst..
[7] Erhard Rahm,et al. Data Cleaning: Problems and Current Approaches , 2000, IEEE Data Eng. Bull..
[8] Charles Schroeder,et al. DataBryte: A Proposed Data Warehouse Cleansing Framework , 1998, IQ.
[9] Richard Y. Wang,et al. IP-MAP: Representing the Manufacture of an Information Product , 2000, IQ.
[10] R. Giegerich,et al. GenDB--an open source genome annotation system for prokaryote genomes. , 2003, Nucleic acids research.
[11] Joseph M. Hellerstein,et al. Potter's Wheel: An Interactive Framework for Data Transformation and Cleaning , 2001, VLDB.
[12] Vincent Lombard,et al. The EMBL Nucleotide Sequence Database: major new developments , 2003, Nucleic Acids Res..
[13] Surajit Chaudhuri,et al. Eliminating Fuzzy Duplicates in Data Warehouses , 2002, VLDB.
[14] P. Richterich,et al. Estimation of errors in "raw" DNA sequences: a validation study. , 1998, Genome research.
[15] Tok Wang Ling,et al. IntelliClean: a knowledge-based intelligent data cleaner , 2000, KDD '00.
[16] Miguel A. Andrade-Navarro,et al. Evaluation of annotation strategies using an entire genome sequence , 2003, Bioinform..
[17] A. Valencia,et al. Intrinsic errors in genome annotation. , 2001, Trends in genetics : TIG.
[18] Laura M. Haas,et al. DiscoveryLink: A system for integrated access to life sciences data sources , 2001, IBM Syst. J..
[19] Philip Lijnzaad,et al. The Ensembl genome database project , 2002, Nucleic Acids Res..
[20] Peter D Karp,et al. The past, present and future of genome-wide re-annotation , 2002, Genome Biology.
[21] J. R. MacDonald,et al. Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence , 2003, Genome Biology.
[22] A Bairoch,et al. Go hunting in sequence databases but watch out for the traps. , 1996, Trends in genetics : TIG.
[23] Antonio Sassano,et al. Errors Detection and Correction in Large Scale Data Collecting , 2001, IDA.
[24] S. Brenner. Errors in genome annotation. , 1999, Trends in genetics : TIG.
[25] Wei Zhang,et al. A Framework for Corporate Householding , 2002, ICIQ.
[26] Salvatore J. Stolfo,et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem , 1998, Data Mining and Knowledge Discovery.
[27] Frank Piontek,et al. Healthcare Informatics: Data Quality, Warehousing and Mining Applications , 2002, ICIQ.
[28] R. Guigó,et al. An assessment of gene prediction accuracy in large DNA sequences. , 2000, Genome research.