A data locality optimization algorithm for large-scale data processing in Hadoop
Jun Li | Dan Meng | Weiping Wang | Xiufeng Yang | Gang Guan | Shubin Zhang | Yanrong Zhao
[1] Tom White, et al. Hadoop: The Definitive Guide, 2009.
[2] Kenneth A. Ross, et al. Cache Conscious Indexing for Decision-Support in Main Memory, 1999, VLDB.
[3] Douglas Stott Parker, et al. Map-reduce-merge: simplified relational data processing on large clusters, 2007, SIGMOD '07.
[4] David R. Karger, et al. Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web, 1997, STOC '97.
[5] Yuan Yu, et al. Dryad: distributed data-parallel programs from sequential building blocks, 2007, EuroSys '07.
[6] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[7] Martin L. Kersten, et al. Database Architecture Optimized for the New Bottleneck: Memory Access, 1999, VLDB.
[8] Pete Wyckoff, et al. Hive - A Warehousing Solution Over a Map-Reduce Framework, 2009, Proc. VLDB Endow.
[9] Ravi Kumar, et al. Pig latin: a not-so-foreign language for data processing, 2008, SIGMOD Conference.
[10] Michael Isard, et al. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language, 2008, OSDI.
[11] Jeffrey F. Naughton, et al. Cache Conscious Algorithms for Relational Query Processing, 1994, VLDB.
[12] Werner Vogels, et al. Dynamo: amazon's highly available key-value store, 2007, SOSP.
[13] David J. DeWitt, et al. Parallel database systems: the future of high performance database systems, 1992, CACM.
[14] Jingren Zhou, et al. SCOPE: easy and efficient parallel processing of massive data sets, 2008, Proc. VLDB Endow.
[15] David J. DeWitt, et al. A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment, 1989, SIGMOD '89.
[16] Elke A. Rundensteiner, et al. Revisiting the Role of Pipelined Parallelism in Multi-Join Query Processing, 2005.
[17] Scott Shenker, et al. Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling, 2010, EuroSys '10.
[18] Zheng Shao, et al. Hive - a petabyte scale data warehouse using Hadoop, 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).