Distributed, Numerically Stable Distance and Covariance Computation with MPI for Extremely Large Datasets
[1] Isabelle Guyon, et al. An Introduction to Variable and Feature Selection, 2003, J. Mach. Learn. Res.
[2] Arnab Nandi, et al. A Closer Look at Variance Implementations In Modern Database Systems, 2015, SIGMOD Rec.
[3] G. Golub, et al. Updating formulae and a pairwise algorithm for computing sample variances, 1979.
[4] Gene H. Golub, et al. Algorithms for Computing the Sample Variance: Analysis and Recommendations, 1983.
[5] Michael J. Franklin, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, 2012, NSDI.
[6] Howard M. Shapiro, et al. Practical Flow Cytometry, 1985.
[7] Srikumar Venugopal, et al. Dynamic Model Evaluation to Accelerate Distributed Machine Learning, 2018, 2018 IEEE International Congress on Big Data (BigData Congress).
[8] Edward A. Youngs, et al. Some Results Relevant to Choice of Sum and Sum-of-Product Algorithms, 1971.
[9] Francisco Herrera, et al. Minutiae-based fingerprint matching decomposition: Methodology for big data frameworks, 2017, Inf. Sci.
[10] Yvan Saeys, et al. Computational approaches for high‐throughput single‐cell data analysis, 2018, The FEBS journal.
[11] Robert Tibshirani, et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition, 2001, Springer Series in Statistics.
[12] Cliburn Chan, et al. Hierarchical Modeling for Rare Event Detection and Cell Subset Alignment across Flow Cytometry Samples, 2013, PLoS Comput. Biol.
[13] Ruben H. Zamar, et al. Scalable robust covariance and correlation estimates for data mining, 2002, KDD.
[14] Shirish Tatikonda, et al. Scalable and Numerically Stable Descriptive Statistics in SystemML, 2012, 2012 IEEE 28th International Conference on Data Engineering.
[15] Norman Matloff, et al. Fast, General Parallel Computation for Machine Learning, 2018, ICPP Workshops.
[16] David C. Thompson, et al. Design and Performance of a Scalable, Parallel Statistics Toolkit, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[17] D. L. Taylor, et al. High content screening applied to large-scale cell biology, 2004, Trends in biotechnology.
[18] Ray W. Grout, et al. Numerically stable, single-pass, parallel statistics algorithms, 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[19] C. A. Murthy, et al. Unsupervised Feature Selection Using Feature Similarity, 2002, IEEE Trans. Pattern Anal. Mach. Intell.
[20] Ameet Talwalkar, et al. MLlib: Machine Learning in Apache Spark, 2015, J. Mach. Learn. Res.
[21] Daniel Peralta, et al. Fast fingerprint identification for large databases, 2014, Pattern Recognit.
[22] Anne E Carpenter, et al. CellProfiler: image analysis software for identifying and quantifying cell phenotypes, 2006, Genome Biology.
[23] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[24] Koby Crammer, et al. Exploiting Feature Covariance in High-Dimensional Online Learning, 2010, AISTATS.
[25] Zheguang Zhao, et al. Bridging the Gap between HPC and Big Data frameworks, 2017, Proc. VLDB Endow.
[26] Pavel Pudil, et al. Introduction to Statistical Pattern Recognition, 2006.