ImRP: A Predictive Partition Method for Data Skew Alleviation in Spark Streaming Environment
Kenli Li | Keqin Li | Zhongming Fu | Zhuo Tang | Li Yang | Kuan-Ching Li
[1] Changjun Jiang,et al. Adaptive Scheduling Parallel Jobs with Dynamic Batching in Spark Streaming , 2018, IEEE Transactions on Parallel and Distributed Systems.
[2] Chen Chen,et al. Cost-effective Resource Provisioning for Spark Workloads , 2019, CIKM.
[3] Wenxin Li,et al. Wide-Area Spark Streaming: Automated Routing and Batch Sizing , 2017, 2017 IEEE International Conference on Autonomic Computing (ICAC).
[4] Jeffrey F. Naughton,et al. Adaptive parallel aggregation algorithms , 1995, SIGMOD '95.
[5] G. G. Stokes. "J." , 1890, The New Yale Book of Quotations.
[6] Sanjay Ghemawat,et al. MapReduce: simplified data processing on large clusters , 2008, CACM.
[7] Zhang De-xin. Big Data Research , 2013 .
[8] Weiwei Xing,et al. MRSIM: Mitigating Reducer Skew In MapReduce , 2017, 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA).
[9] Fei Hu,et al. SASM: Improving spark performance with Adaptive Skew Mitigation , 2015, 2015 IEEE International Conference on Progress in Informatics and Computing (PIC).
[10] Xiaomin Zhu,et al. SP-Partitioner: A novel partition method to handle intermediate data skew in spark streaming , 2017, Future Gener. Comput. Syst..
[11] Norbert Ritter,et al. Real-time stream processing for Big Data , 2016, it Inf. Technol..
[12] Keqin Li,et al. A Data Skew Oriented Reduce Placement Algorithm Based on Sampling , 2020, IEEE Transactions on Cloud Computing.
[13] Zhen Xiao,et al. Improving MapReduce Performance Using Smart Speculative Execution Strategy , 2014, IEEE Transactions on Computers.
[14] Nikolaus Augsten,et al. Handling Data Skew in MapReduce , 2011, CLOSER.
[15] A. Kivity,et al. kvm: the Linux Virtual Machine Monitor , 2007 .
[16] Funda Ergün,et al. Online load balancing for MapReduce with skewed data input , 2014, IEEE INFOCOM 2014 - IEEE Conference on Computer Communications.
[17] Zhiyang Li,et al. Balancing reducer workload for skewed data using sampling-based partitioning , 2014, Comput. Electr. Eng..
[18] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[19] Magdalena Balazinska,et al. SkewTune: mitigating skew in mapreduce applications , 2012, SIGMOD Conference.
[20] Patrick Valduriez,et al. A survey of scheduling frameworks in big data systems , 2018, Int. J. Cloud Comput..
[21] Kenli Li,et al. An intermediate data placement algorithm for load balancing in Spark computing environment , 2018, Future Gener. Comput. Syst..
[22] Jimmy J. Lin,et al. The Curse of Zipf and Limits to Parallelization: A Look at the Stragglers Problem in MapReduce , 2009, LSDS-IR@SIGIR.
[23] Zhuo Tang,et al. Optimizing Speculative Execution in Spark Heterogeneous Environments , 2019 .
[24] Viswanath Poosala,et al. Congressional samples for approximate answering of group-by queries , 2000, SIGMOD '00.
[25] P H Ellaway,et al. Cumulative sum technique and its application to the analysis of peristimulus time histograms. , 1978, Electroencephalography and clinical neurophysiology.
[26] Keqiu Li,et al. Sampling-Based Partitioning in MapReduce for Skewed Data , 2012, 2012 Seventh ChinaGrid Annual Conference.
[27] Kenli Li,et al. An Intermediate Data Partition Algorithm for Skew Mitigation in Spark Computing Environment , 2018, IEEE Transactions on Cloud Computing.
[28] Joanna Berlinska,et al. Comparing load-balancing algorithms for MapReduce under Zipfian data skews , 2018, Parallel Comput..
[30] Yu Xu,et al. A new algorithm for small-large table outer joins in parallel DBMS , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).
[31] Xin Huang,et al. Novel heuristic speculative execution strategies in heterogeneous distributed environments , 2016, Comput. Electr. Eng..
[32] Jordi Torres,et al. A Methodology for Spark Parameter Tuning , 2017, Big Data Res..
[33] Frederick Reiss,et al. TelegraphCQ: continuous dataflow processing , 2003, SIGMOD '03.
[34] Matei A. Zaharia,et al. An Architecture for Fast and General Data Processing on Large Clusters , 2016 .
[35] Zhen Xiao,et al. LIBRA: Lightweight Data Skew Mitigation in MapReduce , 2015, IEEE Transactions on Parallel and Distributed Systems.
[36] Hai Jin,et al. Handling partitioning skew in MapReduce using LEEN , 2013, Peer Peer Netw. Appl..