Automated Detection of Records in Biological Sequence Databases that are Inconsistent with the Literature
[1] Paul Pavlidis,et al. Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA) , 2013, BMC Bioinformatics.
[2] W. Bruce Croft,et al. Predicting query performance , 2002, SIGIR '02.
[3] Chris Sander,et al. Removing near-neighbour redundancy from large protein sequence collections , 1998, Bioinform..
[4] David A. Lee,et al. Predicting protein function from sequence and structure , 2007, Nature Reviews Molecular Cell Biology.
[5] Friedhelm Pfeiffer,et al. A Manual Curation Strategy to Improve Genome Annotation: Application to a Set of Haloarchael Genomes , 2015, Life.
[6] Patricia C. Babbitt,et al. Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies , 2009, PLoS Comput. Biol..
[7] Adam Godzik,et al. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences , 2006, Bioinform..
[8] Bas Teusink,et al. Understanding the Adaptive Growth Strategy of Lactobacillus plantarum by In Silico Optimisation , 2009, PLoS Comput. Biol..
[9] Claire O'Donovan,et al. Expert curation in UniProtKB: a case study on dealing with conflicting and erroneous data , 2014, Database J. Biol. Databases Curation.
[11] Neil D. Rawlings,et al. New mini-zincin structures provide a minimal scaffold for members of this metallopeptidase superfamily , 2014, BMC Bioinformatics.
[12] Marcus C. Chibucos,et al. The Confidence Information Ontology: a step towards a standard for asserting confidence in annotations , 2015, Database J. Biol. Databases Curation.
[13] Karin M. Verspoor,et al. Literature consistency of bioinformatics sequence databases is effective for assessing record quality , 2017, bioRxiv.
[14] Stephen E. Robertson,et al. Okapi at TREC-3 , 1994, TREC.
[15] Bryan Kolaczkowski,et al. Functional Annotations of Paralogs: A Blessing and a Curse , 2016, Life.
[16] Vitor R. Carvalho,et al. Reducing long queries using query quality predictors , 2009, SIGIR.
[17] Daniel W. A. Buchan,et al. A large-scale evaluation of computational protein function prediction , 2013, Nature Methods.
[18] Michal Linial,et al. Automatic detection of false annotations via binary property clustering , 2005, BMC Bioinformatics.
[19] M. Facciotti,et al. An Integrated Pipeline for de Novo Assembly of Microbial Genomes , 2012, PloS one.
[20] Roland J. Siezen,et al. Genome (re‐)annotation and open‐source annotation pipelines , 2010, Microbial biotechnology.
[21] Seng Hong Seah,et al. SCORPION, a molecular database of scorpion toxins. , 2002, Toxicon : official journal of the International Society on Toxinology.
[22] Karin M. Verspoor,et al. A close look at protein function prediction evaluation protocols , 2015, GigaScience.
[23] Christos A. Ouzounis,et al. Annotation inconsistencies beyond sequence similarity-based function prediction – phylogeny and genome structure , 2015, Standards in Genomic Sciences.
[24] Phillip W. Lord,et al. Can Inferred Provenance and Its Visualisation Be Used to Detect Erroneous Annotation? A Case Study Using UniProtKB , 2013, PloS one.
[25] Stephen C. Ekker,et al. Mojo Hand, a TALEN design tool for genome editing applications , 2013, BMC Bioinformatics.
[26] S. Brenner. Errors in genome annotation. , 1999, Trends in genetics : TIG.
[27] Iadh Ounis,et al. Inferring Query Performance Using Pre-retrieval Predictors , 2004, SPIRE.
[28] Tin Wee Tan,et al. Large-scale analysis of antigenic diversity of T-cell epitopes in dengue virus , 2006, BMC Bioinformatics.
[29] Hans-Peter Kriegel,et al. LOF: identifying density-based local outliers , 2000, SIGMOD '00.
[30] J. Gogarten,et al. Using comparative genome analysis to identify problems in annotated microbial genomes. , 2010, Microbiology.
[31] Paul T. J. Tan,et al. Duplicate Detection in Biological Data using Association Rule Mining , 2004.
[32] Richard J. Roberts,et al. Objective: biochemical function , 2014, Front. Genet..
[33] S. Brunak,et al. Cleaning the GenBank Arabidopsis thaliana data set. , 1996, Nucleic acids research.
[34] Karin M. Verspoor,et al. Duplicates, redundancies and inconsistencies in the primary nucleotide databases: a descriptive study , 2016, bioRxiv.
[35] E. Myers,et al. Basic local alignment search tool. , 1990, Journal of molecular biology.
[36] Guillaume J. Filion,et al. Starcode: sequence clustering based on all-pairs search , 2015, Bioinform..
[37] R. Guigó,et al. An assessment of gene prediction accuracy in large DNA sequences. , 2000, Genome research.
[38] Ying Xu,et al. Mapping of orthologous genes in the context of biological pathways: An application of integer programming , 2006, Proc. Natl. Acad. Sci. USA.
[39] Miguel A. Andrade-Navarro,et al. Evaluation of annotation strategies using an entire genome sequence , 2003, Bioinform..
[40] John D. Lafferty,et al. A study of smoothing methods for language models applied to Ad Hoc information retrieval , 2001, SIGIR '01.
[41] Carol Harger,et al. Establishing a method of vector contamination identification in database sequences , 1999, Bioinform..
[42] Gregory D. Schuler,et al. Database resources of the National Center for Biotechnology Information: update , 2004, Nucleic acids research.
[43] Min Song,et al. Detecting duplicate biological entities using Markov random field-based edit distance , 2008, 2008 IEEE International Conference on Bioinformatics and Biomedicine.
[44] Walter R. Gilks,et al. Modeling the percolation of annotation errors in a database of protein sequences , 2002, Bioinform..
[45] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[46] Éric Gaussier,et al. Information-based models for ad hoc IR , 2010, SIGIR '10.
[47] Iadh Ounis,et al. Query performance prediction , 2006, Inf. Syst..
[48] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[49] Natalia N. Ivanova,et al. Improving Microbial Genome Annotations in an Integrated Database Context , 2013, PloS one.
[50] S. O’Brien,et al. SmileFinder: a resampling-based approach to evaluate signatures of selection from genome-wide sets of matching allele frequency data in two or more diploid populations , 2015, GigaScience.
[51] Min Song,et al. Detecting duplicate biological entities using Shortest Path Edit Distance , 2010, Int. J. Data Min. Bioinform..
[52] Riccardo Percudani,et al. Ureidoglycolate hydrolase, amidohydrolase, lyase: how errors in biological databases are incorporated in scientific papers and vice versa , 2013, Database J. Biol. Databases Curation.
[53] Michael Y. Galperin,et al. Sequence ― Evolution ― Function: Computational Approaches in Comparative Genomics , 2010.
[54] Seán S. ÓhÉigeartaigh,et al. SearchDOGS Bacteria, Software That Provides Automated Identification of Potentially Missed Genes in Annotated Bacterial Genomes , 2014, Journal of bacteriology.
[55] Yunming Ye,et al. Collective prediction of protein functions from protein-protein interaction networks , 2014, BMC Bioinformatics.
[56] Stephen E. Robertson,et al. Okapi at TREC-2 , 1993, TREC.
[57] Elena Baralis,et al. Data Cleaning and Semantic Improvement in Biological Databases , 2006, J. Integr. Bioinform..
[58] David A. Coil,et al. Swabs to genomes: a comprehensive workflow , 2015, PeerJ.
[59] M. Ashburner,et al. Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.
[60] W. Van Criekinge,et al. PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration , 2014, Nucleic acids research.
[61] K. Osatomi,et al. Complete nucleotide sequence of dengue type 3 virus genome RNA. , 1990, Virology.
[62] Karin M. Verspoor,et al. Evaluation of a Machine Learning Duplicate Detection Method for Bioinformatics Databases , 2015, DTMBIO@CIKM.
[63] Manju Bansal,et al. A novel method for prokaryotic promoter prediction based on DNA stability , 2005, BMC Bioinformatics.
[64] Falk Scholer,et al. Effective Pre-retrieval Query Performance Prediction Using Similarity and Variability Evidence , 2008, ECIR.