Parallel data processing with MapReduce: a survey
Yon Dohn Chung | Kyong-Ha Lee | Bongki Moon | Yoon-Joon Lee | Hyunsik Choi
[1] Peter J. Haas,et al. Ricardo: integrating R and Hadoop , 2010, SIGMOD Conference.
[2] David J. DeWitt,et al. Clustera: an integrated computation and data management system , 2008, Proc. VLDB Endow..
[3] Anthony K. H. Tung,et al. MAP-JOIN-REDUCE: Toward Scalable and Efficient Data Analysis on Large Clusters , 2011, IEEE Transactions on Knowledge and Data Engineering.
[4] David J. DeWitt,et al. Weaving Relations for Cache Performance , 2001, VLDB.
[5] Jignesh M. Patel,et al. Energy management for MapReduce clusters , 2010, Proc. VLDB Endow..
[6] Roy H. Campbell,et al. MITHRA: Multiple data independent tasks on a heterogeneous resource architecture , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[7] Odej Kao,et al. Nephele: efficient parallel data processing in the cloud , 2009, MTAGS '09.
[8] Michael C. Schatz,et al. CloudBurst: highly sensitive read mapping with MapReduce , 2009, Bioinform..
[9] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[10] Naga K. Govindaraju,et al. Mars: A MapReduce Framework on graphics processors , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[11] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[12] Geoffrey C. Fox,et al. Twister: a runtime for iterative MapReduce , 2010, HPDC '10.
[13] Jignesh M. Patel,et al. A comparison of join algorithms for log processing in MapReduce , 2010, SIGMOD Conference.
[14] Zhiwei Xu,et al. RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[15] Andrey Gubarev,et al. Dremel: Interactive Analysis of Web-Scale Datasets , 2011 .
[16] Pete Wyckoff,et al. Hive - A Warehousing Solution Over a Map-Reduce Framework , 2009, Proc. VLDB Endow..
[17] Wilson C. Hsieh,et al. Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.
[18] Sanjay Ghemawat,et al. MapReduce: a flexible data processing tool , 2010, CACM.
[19] Douglas Stott Parker,et al. Map-reduce-merge: simplified relational data processing on large clusters , 2007, SIGMOD '07.
[20] John Cieslewicz,et al. SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions , 2009, Proc. VLDB Endow..
[21] David A. Patterson,et al. Technical perspective: the data center is the computer , 2008, CACM.
[22] Michael D. Ernst,et al. HaLoop , 2010, Proc. VLDB Endow..
[23] José A. B. Fortes,et al. CloudBLAST: Combining MapReduce and Virtualization on Distributed Resources for Bioinformatics Applications , 2008, 2008 IEEE Fourth International Conference on eScience.
[24] Il-Yeol Song,et al. Relational versus non-relational database systems for data warehousing , 2010, DOLAP '10.
[25] Michael Isard,et al. Distributed data-parallel computing using a high-level programming language , 2009, SIGMOD Conference.
[26] Chen Li,et al. Efficient parallel set-similarity joins using MapReduce , 2010, SIGMOD Conference.
[27] Ravi Kumar,et al. Pig latin: a not-so-foreign language for data processing , 2008, SIGMOD Conference.
[28] Benjamin Reed,et al. Building a high-level dataflow system on top of Map-Reduce , 2009, VLDB 2009.
[29] Rob Pike,et al. Interpreting the data: Parallel analysis with Sawzall , 2005, Sci. Program..
[30] Sanjay Ghemawat,et al. The Google file system , 2003 .
[31] Magdalena Balazinska,et al. ParaTimer: a progress indicator for MapReduce DAGs , 2010, SIGMOD Conference.
[32] Geoffrey C. Fox,et al. MapReduce for Data Intensive Scientific Analyses , 2008, 2008 IEEE Fourth International Conference on eScience.
[33] Samuel Madden,et al. Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).
[34] Wei Jiang,et al. A Map-Reduce System with an Alternate API for Multi-core Environments , 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.
[35] Aaron Tsai,et al. Design and microarchitecture of the IBM system z10 microprocessor , 2009 .
[36] Christoforos E. Kozyrakis,et al. On the energy (in)efficiency of Hadoop clusters , 2010, OPSR.
[37] Jeffrey Dean,et al. Designs, Lessons and Advice from Building Large Distributed Systems , 2009 .
[38] Beng Chin Ooi,et al. Llama: leveraging columnar storage for scalable join processing in the MapReduce framework , 2011, SIGMOD '11.
[39] Magdalena Balazinska,et al. Estimating the progress of MapReduce pipelines , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).
[40] Henrik Loeser,et al. "One Size Fits All": An Idea Whose Time Has Come and Gone? , 2011, BTW.
[41] Shivnath Babu,et al. Towards automatic optimization of MapReduce programs , 2010, SoCC '10.
[42] Dominic Battré,et al. Nephele/PACTs: a programming model and execution framework for web-scale analytical processing , 2010, SoCC '10.
[43] Abraham Silberschatz,et al. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads , 2009, Proc. VLDB Endow..
[44] Prashant J. Shenoy,et al. A platform for scalable one-pass analytics using MapReduce , 2011, SIGMOD '11.
[45] Songting Chen,et al. Cheetah , 2010, Proc. VLDB Endow..
[46] Jeffrey D. Ullman,et al. Optimizing joins in a map-reduce environment , 2010, EDBT '10.
[47] D. DeWitt. MapReduce: A major step backwards | The Database Column , 2011 .
[48] Kurt Keutzer,et al. A map reduce framework for programming graphics processors , 2010 .
[49] Christopher Ré,et al. Automatic Optimization for MapReduce Programs , 2011, Proc. VLDB Endow..
[50] Randy H. Katz,et al. Improving MapReduce Performance in Heterogeneous Environments , 2008, OSDI.
[51] Eric Anderson,et al. Efficiency matters! , 2010, OPSR.
[52] Michael Isard,et al. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language , 2008, OSDI.
[53] Yu Xu,et al. Integrating Hadoop and parallel DBMS , 2010, SIGMOD Conference.
[54] Mirek Riedewald,et al. Processing theta-joins using MapReduce , 2011, SIGMOD '11.
[55] Christoforos E. Kozyrakis,et al. Evaluating MapReduce for Multi-core and Multiprocessor Systems , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[56] Magdalena Balazinska,et al. Analyzing massive astrophysical datasets: Can Pig/Hadoop or a relational DBMS help? , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[57] Jingren Zhou,et al. SCOPE: easy and efficient parallel processing of massive data sets , 2008, Proc. VLDB Endow..
[58] Archana Ganapathi,et al. To compress or not to compress - compute vs. IO tradeoffs for mapreduce energy efficiency , 2010, Green Networking '10.
[59] Rajeev Gandhi,et al. An Analysis of Traces from a Production MapReduce Cluster , 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.
[60] Jignesh M. Patel,et al. Column-Oriented Storage Techniques for MapReduce , 2011, Proc. VLDB Endow..
[61] Ronald C. Taylor. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics , 2010, BMC Bioinformatics.
[62] Joseph M. Hellerstein,et al. MapReduce Online , 2010, NSDI.
[63] Zheng Shao,et al. Hive - a petabyte scale data warehouse using Hadoop , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).
[64] Vinay Setty,et al. Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) , 2010, Proc. VLDB Endow..
[65] Daniela Florescu,et al. Rethinking cost and performance of database systems , 2009, SGMD.
[66] Beng Chin Ooi,et al. The performance of MapReduce , 2010, Proc. VLDB Endow..
[67] Ken Yocum,et al. Ad-hoc data processing in the cloud , 2008, Proc. VLDB Endow..
[68] George Kollios,et al. MRShare , 2010, Proc. VLDB Endow..
[69] Yuan Yu,et al. Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.
[70] Michael Stonebraker,et al. A comparison of approaches to large-scale data analysis , 2009, SIGMOD Conference.
[71] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[72] Michael Stonebraker,et al. MapReduce: A major step backwards , 2014 .
[73] Jimmy J. Lin,et al. Book Reviews: Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer , 2010, CL.
[74] Michael Stonebraker,et al. One Size Fits All? - Part 2: Benchmarking Results , 2007 .
[75] Karthikeyan Sankaralingam,et al. MapReduce for the Cell Broadband Engine Architecture , 2009, IBM J. Res. Dev..
[76] Michael Stonebraker,et al. MapReduce and parallel DBMSs: friends or foes? , 2010, CACM.