Viewpoint Paper: Large Datasets in Biomedicine: A Discussion of Salient Analytic Issues
[1] Vladimir Pestov,et al. On the geometry of similarity search: Dimensionality curse and concentration of measure , 1999, Inf. Process. Lett..
[2] George Hripcsak,et al. A statistical methodology for analyzing co-occurrence data from a large sample , 2007, J. Biomed. Informatics.
[3] Peter J. Huber,et al. Massive Datasets Workshop: Four Years After , 1999 .
[4] D. Rubin,et al. Statistical Analysis with Missing Data. , 1989 .
[5] P. Brown,et al. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. , 2004, Proceedings of the National Academy of Sciences of the United States of America.
[6] Michael Wolf,et al. Control of generalized error rates in multiple testing , 2007, arXiv:0710.2258.
[7] Theodore Johnson,et al. Exploratory Data Mining and Data Cleaning , 2003 .
[8] Pedro Larrañaga,et al. A review of feature selection techniques in bioinformatics , 2007, Bioinform..
[9] Jon R. Kettenring. A Perspective on Cluster Analysis , 2008 .
[10] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..
[11] Megan K. Mulligan,et al. Toward understanding the genetics of alcohol drinking through transcriptome meta-analysis. , 2006, Proceedings of the National Academy of Sciences of the United States of America.
[12] Lev Klebanov,et al. Multivariate search for differentially expressed gene combinations , 2004, BMC Bioinformatics.
[13] Y. Benjamini,et al. The control of the false discovery rate in multiple testing under dependency , 2001 .
[14] Fionn Murtagh,et al. Overcoming the Curse of Dimensionality in Clustering by Means of the Wavelet Transform , 2000, Comput. J..
[15] Theodore Johnson,et al. Hunting of the Snark: Finding Data Glitches using Data Mining Methods , 1999, IQ.
[16] Shailesh V. Date,et al. A Probabilistic Functional Network of Yeast Genes , 2004, Science.
[17] I. Jolliffe. Principal Component Analysis , 2002 .
[18] George Hripcsak,et al. Inter-patient distance metrics using SNOMED CT defining relationships , 2006, J. Biomed. Informatics.
[19] E. Shortliffe. Computer-based medical consultations: MYCIN (Elsevier North Holland) , 1976 .
[20] David B. Skalak,et al. Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms , 1994, ICML.
[21] Olga G. Troyanskaya,et al. A scalable method for integration and functional analysis of multiple microarray datasets , 2006, Bioinform..
[22] Jon R. Kettenring,et al. The Practice of Cluster Analysis , 2006, J. Classif..
[23] Daphne Koller,et al. Toward Optimal Feature Selection , 1996, ICML.
[24] Joseph Beyene,et al. Integrative analysis of multiple gene expression profiles with quality-adjusted effect size models , 2005, BMC Bioinformatics.
[25] Sayan Mukherjee,et al. Feature Selection for SVMs , 2000, NIPS.
[26] Richard Simon,et al. A random variance model for detection of differential gene expression in small microarray experiments , 2003, Bioinform..
[27] A. Dempster. A high dimensional two sample significance test , 1958 .
[28] Chi Hau Chen,et al. Pattern recognition and signal processing , 1978 .
[29] José Martínez Sotoca,et al. A review of data complexity measures and their applicability to pattern classification problems , 2005 .
[30] E. Gehan,et al. The properties of high-dimensional data spaces: implications for exploring gene and protein expression data , 2008, Nature Reviews Cancer.
[31] Y. Benjamini,et al. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics , 1999 .
[32] Y. Benjamini,et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .
[33] So Young Sohn,et al. Meta Analysis of Classification Algorithms for Pattern Recognition , 1999, IEEE Trans. Pattern Anal. Mach. Intell..
[34] A. Farcomeni. Some Results on the Control of the False Discovery Rate under Dependence , 2007 .
[35] R. A. Fisher,et al. Theory of Statistical Estimation , 1925, Mathematical Proceedings of the Cambridge Philosophical Society.
[36] Lawrence M. Fagan,et al. Medical informatics: computer applications in health care and biomedicine (Health informatics) , 2003 .
[37] Michael Y. Galperin. The Molecular Biology Database Collection: 2005 update , 2004, Nucleic Acids Res..
[38] C. E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.
[39] Hanlee P. Ji,et al. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. , 2006, Nature biotechnology.
[40] Kathleen N. Lohr,et al. Effectiveness and Outcomes in Health Care , 1990 .
[41] W. Wu,et al. On false discovery control under dependence , 2008, arXiv:0803.1971.
[42] Clayton A. Wiley,et al. Reflections on a Workshop , 1997 .
[43] R. Fisher,et al. On the Mathematical Foundations of Theoretical Statistics , 1922 .
[44] L. Wasserman,et al. Operating characteristics and extensions of the false discovery rate procedure , 2002 .
[45] Olga Brazhnik,et al. Anatomy of data integration , 2007, J. Biomed. Informatics.
[46] Xing Qiu,et al. The effects of normalization on the correlation structure of microarray data , 2005, BMC Bioinformatics.
[47] Aniko Szabo,et al. Multivariate exploratory tools for microarray data analysis. , 2003, Biostatistics.
[48] Edward H. Shortliffe,et al. Computer-based medical consultations, MYCIN , 1976 .
[49] Michael Y. Galperin. The Molecular Biology Database Collection: 2007 update , 2006, Nucleic Acids Res..
[50] J. Friedman,et al. Projection Pursuit Regression , 1981 .
[51] Andrei Yakovlev,et al. Diverse correlation structures in gene expression data and their utility in improving statistical inference , 2007, arXiv:0712.2130.
[52] John D. Storey. The positive false discovery rate: a Bayesian interpretation and the q-value , 2003 .
[53] Sriram V. Pemmaraju,et al. Error-detecting codes and fault-containing self-stabilization , 2000, Inf. Process. Lett..
[54] Alon Y. Halevy,et al. Data integration and genomic medicine , 2007, J. Biomed. Informatics.
[55] R. A. Leibler,et al. On Information and Sufficiency , 1951 .
[56] P. Diggle,et al. Analysis of Longitudinal Data , 2003 .
[57] B. Efron. Large-Scale Simultaneous Hypothesis Testing , 2004 .
[58] Solomon Kullback,et al. Information Theory and Statistics , 1960 .
[59] George Hripcsak,et al. Considering clustering: a methodological review of clinical decision support system studies , 2000, AMIA.
[60] A. Owen,et al. A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae) , 2003, Proceedings of the National Academy of Sciences of the United States of America.
[61] John D. Storey,et al. Empirical Bayes Analysis of a Microarray Experiment , 2001 .
[62] John D. Storey. A direct approach to false discovery rates , 2002 .
[63] Andrei Yakovlev,et al. expression data: do they matter for correlation analysis? , 2007 .
[64] Sangsoo Kim,et al. Combining multiple microarray studies and modeling interstudy variation , 2003, ISMB.
[65] E. Shortliffe,et al. Readings in medical artificial intelligence: the first decade , 1984 .
[66] Amanda Clare,et al. Predicting gene function in Saccharomyces cerevisiae , 2003, ECCB.
[67] Jason Weston,et al. Learning Gene Functional Classifications from Multiple Data Types , 2002, J. Comput. Biol..
[68] B. Efron. Correlation and Large-Scale Simultaneous Significance Testing , 2007 .