The RePhrase Extended Pattern Set for Data Intensive Parallel Computing

We discuss the extended parallel pattern set identified within the EU-funded project RePhrase as a candidate pattern set to support data intensive applications targeting heterogeneous architectures. The set has been designed to include three classes of patterns: (1) core patterns, which model common parallelism exploitation patterns that are not necessarily data intensive and are usually used in composition; (2) high level patterns, which model common, complex and complete parallelism exploitation patterns; and (3) building block patterns, which model the single components of data intensive applications and are suitable for use, in composition, to implement patterns not covered by the core and high level patterns. We discuss the expressive power of the RePhrase extended pattern set and present results illustrating the performance that may be achieved with the FastFlow implementation of the high level patterns.
