Contributions of hybrid architectures to depth imaging: a CPU, APU and GPU comparative study. (Apports des architectures hybrides à l'imagerie profondeur : étude comparative entre CPU, APU et GPU)
[1] Pradeep Dubey,et al. 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[2] Gerhard Wellein,et al. Parallel Sparse Matrix-Vector Multiplication as a Test Case for Hybrid MPI+OpenMP Programming , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[3] Marc Tchiboukdjian,et al. Design and Performance of an Intel Xeon Phi based Cluster for Reverse Time Migration , 2014, HiPC 2014.
[4] William Jalby,et al. Quantifying performance bottleneck cost through differential analysis , 2013, ICS '13.
[5] D. Komatitsch,et al. An unsplit convolutional perfectly matched layer improved at grazing incidence for the seismic wave equation , 2007 .
[6] Gabriella Cabitza,et al. Migration of seismic data , 1994 .
[7] Vidar Slåtten,et al. Performance Optimizations for TTI RTM on GPU based Hybrid Architectures , 2013 .
[8] D. Yingst,et al. Full waveform inversion – the state of the art , 2013 .
[9] Rudolf Eigenmann,et al. OpenMPC: Extended OpenMP Programming and Tuning for GPUs , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[10] Wu-chun Feng,et al. Towards efficient supercomputing: a quest for the right metric , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[11] Gerhard Wellein,et al. Asynchronous MPI for the Masses , 2013, ArXiv.
[12] Zhiming Li,et al. A Multi-Step Approach For Efficient Reverse-Time Migration , 2008 .
[13] Yanfei Wang,et al. Determining finite difference weights for the acoustic wave equation by a new dispersion‐relationship‐preserving method , 2015 .
[14] Zhaohui S. Qin,et al. GPUmotif: An Ultra-Fast and Energy-Efficient Motif Analysis Program Using Graphics Processing Units , 2012, PloS one.
[15] T. Okamoto,et al. Accelerating large-scale simulation of seismic wave propagation by multi-GPUs and three-dimensional domain decomposition , 2010 .
[16] Satoshi Matsuoka,et al. A Multi-Level Optimization Method for Stencil Computation on the Domain that is Bigger than Memory Capacity of GPU , 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[17] Lurng-Kuo Liu,et al. High Performance RTM Using Massive Domain Partitioning , 2011 .
[18] Gerhard Wellein,et al. Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory , 2009, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[19] Pavan Balaji,et al. MT-MPI: multithreaded MPI for many-core environments , 2014, ICS '14.
[20] Weiqiang Wang,et al. A Multilevel Parallelization Framework for High-Order Stencil Computations , 2009, Euro-Par.
[21] John A. Scales,et al. Distributed three-dimensional finite-difference modeling of wave propagation in acoustic media , 1997 .
[22] R. Courant,et al. On the Partial Difference Equations of Mathematical Physics , 2015 .
[23] Etienne Robein. Seismic Imaging: A Review of the Techniques, their Principles, Merits and Limitations (EET 4) , 2010 .
[24] Haohuan Fu,et al. Selecting the right hardware for reverse time migration , 2010 .
[25] Satoshi Matsuoka,et al. Physis: An implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[26] Jean Virieux,et al. An overview of full-waveform inversion in exploration geophysics , 2009 .
[27] Scott Hauck,et al. Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation , 2007 .
[28] Aditya Konduri,et al. Asynchronous finite-difference schemes for partial differential equations , 2014, J. Comput. Phys..
[29] G. Schuster. Basics of Seismic Imaging , 2010 .
[30] Hongwei Liu,et al. Wavefield reconstruction methods for reverse time migration , 2013 .
[31] Z. Alterman,et al. Propagation of elastic waves in layered media by finite difference methods , 1968 .
[32] William W. Symes,et al. Computational Strategies For Reverse-time Migration , 2008 .
[33] Song Huang,et al. On the energy efficiency of graphics processing units for scientific computing , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[34] Samuel Williams,et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[35] Scott B. Baden,et al. Overlapping communication and computation with OpenMP and MPI , 2001, Sci. Program..
[36] James Demmel,et al. Benchmarking GPUs to tune dense linear algebra , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[37] Robert G. Clapp. Reverse time migration : Saving the boundaries , 2009 .
[38] Moujahed Al-Husseini,et al. The debate over Hubbert’s Peak: a review , 2006, GeoArabia.
[39] Reiji Suda,et al. Accurate Measurements and Precise Modeling of Power Dissipation of CUDA Kernels toward Power Optimized High Performance CPU-GPU Computing , 2009, 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies.
[40] Li-Yun Fu,et al. Two effective approaches to reduce data storage in reverse time migration , 2013, Comput. Geosci..
[42] W. A. Mulder,et al. A comparison between one-way and two-way wave-equation migration , 2004 .
[43] Brian Hamilton,et al. ROOM ACOUSTICS MODELLING USING GPU-ACCELERATED FINITE DIFFERENCE AND FINITE VOLUME METHODS ON A FACE-CENTERED CUBIC GRID , 2013 .
[44] Hiroyuki Takizawa,et al. A Comparison of Performance Tunabilities between OpenCL and OpenACC , 2013, 2013 IEEE 7th International Symposium on Embedded Multicore Socs.
[45] Eduard Ayguadé,et al. Exploiting memory customization in FPGA for 3D stencil computations , 2009, 2009 International Conference on Field-Programmable Technology.
[46] P. Moczo,et al. The finite-difference time-domain method for modeling of seismic wave propagation , 2007 .
[47] Stephen D. Gedney,et al. Convolution PML (CPML): An efficient FDTD implementation of the CFS–PML for arbitrary media , 2000 .
[48] Volker Strumpen,et al. Cache oblivious stencil computations , 2005, ICS '05.
[49] Changsoo Shin,et al. Acceleration of stable TTI P-wave reverse-time migration with GPUs , 2013, Comput. Geosci..
[50] R. Clapp. Reverse time migration with random boundaries , 2009 .
[51] Samuel Williams,et al. Auto-Tuning the 27-point Stencil for Multicore , 2009 .
[52] Inanc Senocak,et al. An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters , 2010 .
[53] Dheeraj Bhardwaj,et al. 3D Seismic Modeling in a Message Passing Environment , 2000 .
[54] G. Keller. An Introduction to Geophysical Exploration , 1986 .
[55] Robert A. van de Geijn,et al. Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures , 2012, IEEE Transactions on Computers.
[56] Wayne Luk,et al. A mixed precision Monte Carlo methodology for reconfigurable accelerator systems , 2012, FPGA '12.
[57] H. Peter Hofstee,et al. Introduction to the Cell multiprocessor , 2005, IBM J. Res. Dev..
[58] Hans-Peter Seidel,et al. Cache Accurate Time Skewing in Iterative Stencil Computations , 2011, 2011 International Conference on Parallel Processing.
[59] Andreas Lemmer,et al. Parallel domain decomposition method with non-blocking communication for flow through porous media , 2015, J. Comput. Phys..
[60] Francesc Alted,et al. Why Modern CPUs Are Starving and What Can Be Done about It , 2010, Computing in Science & Engineering.
[61] Paulius Micikevicius,et al. 3D finite difference computation on GPUs using CUDA , 2009, GPGPU-2.
[62] Wu-chun Feng,et al. On the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing , 2011, 2011 Symposium on Application Accelerators in High-Performance Computing.
[63] Murray Cole,et al. PARTANS: An autotuning framework for stencil computation on multi-GPU systems , 2013, TACO.
[64] Hong Liu,et al. The Algorithm of High Order Finite Difference Pre‐Stack Reverse Time Migration and GPU Implementation , 2010 .
[65] Masakazu Sekijima,et al. The Power Efficiency of GPUs in Multi Nodes Environment with Molecular Dynamics , 2011, 2011 40th International Conference on Parallel Processing Workshops.
[66] Henri Calandra,et al. Performance of CPU/GPU compiler directives on ISO/TTI kernels , 2013, Computing.
[67] Mauricio Hanzich,et al. High-Performance Seismic Acoustic Imaging by Reverse-Time Migration on the Cell/B.E. Architecture , 2008 .
[68] J. Claerbout. Toward a unified theory of reflector mapping , 1971 .
[69] Rajeev Thakur,et al. Test suite for evaluating performance of multithreaded MPI communication , 2009, Parallel Comput..
[70] Douglas N. Arnold. Stability, consistency, and convergence of numerical discretizations , 2015 .
[71] Asma Farjallah,et al. Preparing depth imaging applications for Exascale challenges and impacts. (Etude de l'adéquation des machines Exascale pour les algorithmes implémentant la méthode du Reverse Time Migration) , 2014 .
[72] Jyothish Soman,et al. Maximizing TTI RTM Throughput for CPU+GPU , 2013 .
[73] Paul L. Stoffa,et al. 3D Seismic Modeling And Reverse-Time Migration With the Parallel Fourier Method Using Non-blocking Collective Communications , 2009 .
[74] William A. Schneider,et al. INTEGRAL FORMULATION FOR MIGRATION IN TWO AND THREE DIMENSIONS , 1978 .
[75] Jean Virieux,et al. Finite-difference frequency-domain modeling of viscoacoustic wave propagation in 2D tilted transversely isotropic (TTI) media , 2009 .
[76] Message Passing Interface Forum. MPI: A message-passing interface standard , 1994 .
[77] Dennis W. Prather,et al. FPGA-based acceleration of the 3D finite-difference time-domain method , 2004, 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[78] Samuel Williams,et al. Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors , 2007, SIAM Rev..
[79] Stephen W. Poole,et al. Power measurement for high performance computing: State of the art , 2011, 2011 International Green Computing Conference and Workshops.
[80] Sayantan Sur,et al. RDMA read based rendezvous protocol for MPI over InfiniBand: design alternatives and benefits , 2006, PPoPP '06.
[81] Larry Lines,et al. A recipe for stability of finite-difference wave-equation computations , 1999 .
[82] Paul L. Stoffa,et al. Implicit finite-difference simulations of seismic wave propagation , 2012 .
[83] Bo Li,et al. The issues of prestack reverse time migration and solutions with Graphic Processing Unit implementation , 2012 .
[84] G. McMechan. MIGRATION BY EXTRAPOLATION OF TIME‐DEPENDENT BOUNDARY VALUES , 1983 .
[85] Samuel Williams,et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[86] J. Gazdag,et al. Migration of seismic data , 1984, Proceedings of the IEEE.
[87] J. Carcione,et al. Seismic modeling , 1942 .
[88] Rached Abdelkhalek. Accélération matérielle pour l'imagerie sismique : modélisation, migration et interprétation , 2013 .
[89] R. Pratt. Seismic waveform inversion in the frequency domain; Part 1, Theory and verification in a physical scale model , 1999 .
[90] Haohuan Fu,et al. Eliminating the memory bottleneck: an FPGA-based solution for 3D reverse time migration , 2011, FPGA '11.
[91] Kevin Field. The A List , 2016 .
[92] Gerhard Wellein,et al. Prospects for truly asynchronous communication with pure MPI and hybrid MPI/OpenMP on current supercomputing platforms , 2011 .
[93] Henri Calandra,et al. A review of the spectral, pseudo‐spectral, finite‐difference and finite‐element modelling techniques for geophysical imaging , 2011 .
[94] John Jossey,et al. EQUIVALENCE THEOREMS IN NUMERICAL ANALYSIS : INTEGRATION, DIFFERENTIATION AND INTERPOLATION , 2007, 0709.4046.
[95] Weiqiang Wang,et al. In-Core Optimization of High-Order Stencil Computations , 2009, PDPTA.
[96] D. Hale. Migration by the Kirchhoff, slant stack, and Gaussian beam methods , 1992 .
[97] Larry Lines,et al. Seismic Modeling and Imaging With the Complete Wave Equation , 1997 .
[98] John C. Bancroft,et al. Overcoming computational cost problems of reverse-time migration , 2010 .
[99] Mi Lu,et al. Time domain numerical simulation for transient waves on reconfigurable coprocessor platform , 2005, 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'05).
[100] Caroline Baldassari. Modélisation et simulation numérique pour la migration terrestre par équation d'ondes. (Modelling and numerical simulation for land migration by wave equation) , 2009 .
[101] R. Kosloff,et al. Absorbing boundaries for wave propagation problems , 1986 .
[102] A. Chorin. Numerical solution of the Navier-Stokes equations , 1968 .
[103] Ewing L. Lusk,et al. Early Experiments with the OpenMP/MPI Hybrid Programming Model , 2008, IWOMP.
[104] Lijian Tan,et al. Time-Reversal Methods For RTM And FWI , 2011 .
[105] Henri Calandra,et al. Fast seismic modeling and reverse time migration on a graphics processing unit cluster , 2012, Concurr. Comput. Pract. Exp..
[106] Andreas Griewank,et al. Algorithm 799: revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation , 2000, TOMS.
[107] Jenö Gazdag,et al. Wave equation migration with the phase-shift method , 1978 .
[108] Thorne Lay,et al. Quantitative Seismology, Second Edition , 2003 .
[109] Antoine Guitton,et al. Shot-profile Migration of Multiple Reflections , 2002 .
[110] Erich M. Nahum,et al. Evaluating the impact of simultaneous multithreading on network servers using real hardware , 2005, SIGMETRICS '05.
[111] Andreas Griewank,et al. Achieving logarithmic growth of temporal and spatial complexity in reverse automatic differentiation , 1992 .
[112] Georg Hager,et al. Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model , 2014, Parallel Process. Lett..
[113] B. Chapman,et al. Energy Analysis of Parallel Scientific Kernels on Multiple GPUs , 2012, 2012 Symposium on Application Accelerators in High Performance Computing.
[114] J. Coffeen,et al. Seismic Exploration Fundamentals , 1978 .
[115] Christian Märtin,et al. Post-Dennard Scaling and the final Years of Moore’s Law Consequences for the Evolution of Multicore-Architectures , 2014 .
[116] Jairo Panetta,et al. Accelerating Kirchhoff Migration by CPU and GPU Cooperation , 2009, 2009 21st International Symposium on Computer Architecture and High Performance Computing.
[117] Jean-Pierre Berenger,et al. A perfectly matched layer for the absorption of electromagnetic waves , 1994 .
[118] David A. Patterson,et al. Computer Architecture, Fifth Edition: A Quantitative Approach , 2011 .
[119] Spencer,et al. 3D Seismic Survey Design , 1995 .
[120] Wang Chen,et al. An FPGA implementation of the two-dimensional finite-difference time-domain (FDTD) algorithm , 2004, FPGA '04.
[121] William J. Dally,et al. Scaling the Power Wall: A Path to Exascale , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[122] Tarek S. Abdelrahman,et al. hiCUDA: High-Level GPGPU Programming , 2011, IEEE Transactions on Parallel and Distributed Systems.
[123] Antonio Cisternino,et al. Device specialization in heterogeneous multi-GPU environments , 2012, ICCSW.
[124] N. Whitmore. Iterative Depth Migration By Backward Time Propagation , 1983 .
[125] J. Xu. OpenCL – The Open Standard for Parallel Programming of Heterogeneous Systems , 2009 .
[126] William W. Symes,et al. Reverse time migration with optimal checkpointing , 2007 .
[127] Peyman P. Moghaddam,et al. Industrial-Scale Reverse Time Migration On GPU Hardware , 2009 .
[128] David E. Keyes,et al. Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates , 2014, SIAM J. Sci. Comput..
[129] Lijian Tan,et al. Time-reversal checkpointing methods for RTM and FWI , 2012 .
[130] Junichiro Makino,et al. Optimal Temporal Blocking for Stencil Computation , 2015, ICCS.
[131] Satoshi Matsuoka,et al. Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[132] Michael Commer,et al. A parallel finite-difference approach for 3D transient electromagnetic modeling with galvanic sources , 2004 .
[133] Chau-Wen Tseng,et al. Tiling Optimizations for 3D Scientific Computations , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[134] Ligang Lu,et al. Multi-level parallel computing of reverse time migration for seismic imaging on Blue Gene/Q , 2013, PPoPP '13.
[135] J. Etgen,et al. Seismic migration problems and solutions , 2001 .
[136] Gerhard Wellein,et al. Introduction to High Performance Computing for Scientists and Engineers , 2010, Chapman and Hall / CRC computational science series.
[137] R. Plessix. A review of the adjoint-state method for computing the gradient of a functional with geophysical applications , 2006 .
[138] Peter Messmer,et al. Accelerating Stencil-Based Computations by Increased Temporal Locality on Modern Multi- and Many-Core Architectures , 2008 .
[139] Mauricio Hanzich,et al. Assessing Accelerator-Based HPC Reverse Time Migration , 2011, IEEE Transactions on Parallel and Distributed Systems.
[141] Margaret H. Wright,et al. The opportunities and challenges of exascale computing , 2010 .
[142] Gerhard Wellein,et al. Quantifying Performance Bottlenecks of Stencil Computations Using the Execution-Cache-Memory Model , 2014, ICS.
[143] Don C. Lawton,et al. An acquisition polarity standard for multicomponent seismic data , 2000 .
[144] Robin P. Fletcher,et al. Time-varying boundary conditions in simulation of seismic wave propagation , 2011 .
[145] Tarek S. Abdelrahman,et al. Parallel Radix Sort on the AMD Fusion Accelerated Processing Unit , 2013, 2013 42nd International Conference on Parallel Processing.
[146] Edip Baysal,et al. Forward modeling by a Fourier method , 1982 .
[147] K. Yee. Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media , 1966 .
[148] Kirannmayi M. Sirasala,et al. Experience of Porting and Optimization of Seismic Modelling on Multi and Many Cores of Hybrid Computing Cluster , 2015 .
[149] P. Schultz,et al. Fundamentals of geophysical data processing , 1979 .
[150] Biondo L. Biondi,et al. 3D Seismic Imaging , 2006 .
[151] Jack J. Dongarra,et al. Accelerating GPU Kernels for Dense Linear Algebra , 2010, VECPAR.
[152] Niels Kuster,et al. Comparison of CPML Implementations for the GPU-Accelerated FDTD Solver , 2011 .
[153] R. Stolt. MIGRATION BY FOURIER TRANSFORM , 1978 .
[154] Kristel C. Meza-Fajardo,et al. A Nonconvolutional, Split-Field, Perfectly Matched Layer for Wave Propagation in Isotropic and Anisotropic Elastic Media: Stability Analysis , 2008 .
[155] Ray L. Sengbush. Seismic Exploration Methods , 1983 .
[156] Yue Wang,et al. REVERSE-TIME MIGRATION , 1999 .
[157] K. R. Kelly,et al. SYNTHETIC SEISMOGRAMS: A FINITE‐DIFFERENCE APPROACH , 1976 .
[158] Carlos Couder-Castañeda,et al. TESLA GPUs versus MPI with OpenMP for the Forward Modeling of Gravity and Gravity Gradient of Large Prisms Ensemble , 2013, J. Appl. Math..
[159] Sergei Gorlatch,et al. Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems , 2014 .