Estimating Phred scores of Illumina base calls by logistic regression and sparse modeling
Bo Wang | Lin Wan | Sheng Zhang | Lei M. Li
[1] E. Mardis. Next-generation DNA sequencing methods, 2008, Annual review of genomics and human genetics.
[2] R Core Team, et al. R: A language and environment for statistical computing, 2014.
[3] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.
[4] M. Morgante, et al. An Extensive Evaluation of Read Trimming Effects on Illumina NGS Data Analysis, 2013, PLoS ONE.
[5] Gonçalo R. Abecasis, et al. The Sequence Alignment/Map format and SAMtools, 2009, Bioinform.
[6] Lei M. Li, et al. An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters, 2017, Scientific reports.
[7] P. Green, et al. Base-calling of automated sequencer traces using phred. II. Error probabilities, 1998, Genome research.
[8] J. Ghosh, et al. AIC, BIC and Recent Advances in Model Selection, 2011.
[9] Juliane C. Dohm, et al. Substantial biases in ultra-short read data sets from high-throughput DNA sequencing, 2008, Nucleic acids research.
[10] An Hongzhi, et al. On the selection of regression variables, 1985.
[11] Tjalling J. Ypma, et al. Historical Development of the Newton-Raphson Method, 1995, SIAM Rev.
[12] Nicholas A. Bokulich, et al. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing, 2012, Nature Methods.
[13] D. Hosmer, et al. Applied Logistic Regression, 1991.
[14] Juliane C. Dohm, et al. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems, 2011, Genome Biology.
[15] Genady Grabarnik, et al. Sparse Modeling: Theory, Algorithms, and Applications, 2014.
[16] 柴田 里程. Selection of regression variables, 1981.
[17] Markus Sauer, et al. Nucleobase-specific quenching of fluorescent dyes. 1. Nucleobase one-electron redox potentials and their correlation with static and dynamic quenching efficiencies, 1996.
[18] Chengxi Ye, et al. BlindCall: ultra-fast base-calling of high-throughput sequencing data by blind deconvolution, 2014, Bioinform.
[19] M. DePristo, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data, 2010, Genome research.
[20] P. Green, et al. Base-calling of automated sequencer traces using phred. I. Accuracy assessment, 1998, Genome research.
[21] Lei M. Li, et al. Adjust quality scores from alignment and improve sequencing accuracy, 2004, Nucleic acids research.
[22] T. Fearn. Ridge Regression, 2013.
[23] H. Zou, et al. Regularization and variable selection via the elastic net, 2005.
[24] Chih-Jen Lin, et al. LIBLINEAR: A Library for Large Linear Classification, 2008, J. Mach. Learn. Res.
[25] P. McCullagh, et al. Generalized Linear Models, 1984.
[26] J. Hanley, et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve, 1982, Radiology.
[27] Trevor Hastie, et al. Regularization Paths for Generalized Linear Models via Coordinate Descent, 2010, Journal of statistical software.