Efficient Distributed Data Clustering on Spark
Yiming Zhang | Jia Li | Dongsheng Li
[1] Zheng Zhang, et al. Error-bounded Sampling for Analytics on Big Sparse Data, 2014, Proc. VLDB Endow.
[2] Carlo Zaniolo, et al. Early Accurate Results for Advanced Analytics on MapReduce, 2012, Proc. VLDB Endow.
[3] Michael J. Franklin, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, 2012, NSDI.
[4] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[5] J. MacKinnon, et al. Bootstrap tests: how many bootstraps?, 2000.
[6] Pavel Berkhin, et al. A Survey of Clustering Data Mining Techniques, 2006, Grouping Multidimensional Data.