A Two-Stage Data Processing Algorithm to Generate Random Sample Partitions for Big Data Analysis
[1] David García, et al. Estimating the expected value of fuzzy random variables in the stratified random sampling from finite populations, 2001, Inf. Sci.
[2] Wenbo Zhang, et al. Improved K-Means cluster algorithm in telecommunications enterprises customer segmentation, 2010, 2010 IEEE International Conference on Information Theory and Information Security.
[3] Michaël Boireau, et al. Uncovering Online Political Communities of Belgian MPs through Social Network Clustering Analysis, 2015, EGOSE.
[4] Graham Cormode, et al. Sampling for big data: a tutorial, 2014, KDD.
[5] Gianluigi Zanetti, et al. Pydoop: a Python MapReduce and HDFS API for Hadoop, 2010, HPDC '10.
[6] Hairong Kuang, et al. The Hadoop Distributed File System, 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).
[7] Wu-chun Feng, et al. Enhancing MapReduce via Asynchronous Data Processing, 2010, 2010 IEEE 16th International Conference on Parallel and Distributed Systems.
[8] Awais Ahmad, et al. An efficient divide-and-conquer approach for big data analytics in machine-to-machine communication, 2016, Neurocomputing.
[9] Yu-Lin He, et al. Empirical Analysis of Asymptotic Ensemble Learning for Big Data, 2016, 2016 IEEE/ACM 3rd International Conference on Big Data Computing Applications and Technologies (BDCAT).
[10] Joshua Zhexue Huang, et al. Big data analytics on Apache Spark, 2016, International Journal of Data Science and Analytics.
[11] M. C. Jones, et al. A reliable data-based bandwidth selection method for kernel density estimation, 1991.
[12] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[13] Han Liu, et al. Challenges of Big Data Analysis, 2013, National Science Review.