Upgrading a high performance computing environment for massive data processing