Accelerating incoherent dedispersion

Incoherent dedispersion is a computationally intensive problem that appears frequently in pulsar and transient astronomy. For current and future transient pipelines, dedispersion can dominate the total execution time, meaning its computational speed acts as a constraint on the quality and quantity of science results. It is thus critical that the algorithm be able to take advantage of trends in commodity computing hardware. With this goal in mind, we present an analysis of the ‘direct’, ‘tree’ and ‘sub-band’ dedispersion algorithms with respect to their potential for efficient execution on modern graphics processing units (GPUs). We find all three to be excellent candidates, and proceed to describe implementations in C for CUDA using insight gained from the analysis. Using recent CPU and GPU hardware, the transition to the GPU provides a ninefold speed-up for the direct algorithm when compared to an optimized quad-core CPU code. For realistic recent survey parameters, these speeds are high enough that further optimization is unnecessary to achieve real-time processing. Where further speed-ups are desirable, we find that the tree and sub-band algorithms are able to provide three to seven times better performance at the cost of certain smearing, memory consumption and development time trade-offs. We finish with a discussion of the implications of these results for future transient surveys. Our GPU dedispersion code is publicly available as a C library at http://dedisp.googlecode.com/.
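For readers unfamiliar with the ‘direct’ algorithm named above, the following is a minimal CPU-side sketch of the idea, not the authors' GPU implementation and not the API of their library. It assumes the standard cold-plasma dispersion relation, in which the delay at frequency ν relative to a reference frequency scales as DM × (ν⁻² − ν_ref⁻²) with a constant of about 4.148808 × 10³ s MHz² pc⁻¹ cm³; the abstract itself does not state this formula. The names `channel_delays` and `dedisperse_direct`, and the `data[channel][sample]` layout, are hypothetical choices made for illustration.

```python
# Dispersion constant in s MHz^2 pc^-1 cm^3 (standard cold-plasma value;
# an assumption of this sketch, not taken from the abstract).
K_DM = 4.148808e3


def channel_delays(dm, freqs_mhz, dt):
    """Delay of each channel in whole samples, relative to the highest frequency.

    dm        -- trial dispersion measure in pc cm^-3
    freqs_mhz -- channel centre frequencies in MHz
    dt        -- sampling interval in seconds
    """
    f_ref = max(freqs_mhz)
    return [round(K_DM * dm * (f ** -2 - f_ref ** -2) / dt) for f in freqs_mhz]


def dedisperse_direct(data, freqs_mhz, dt, dm_trials):
    """Brute-force ('direct') incoherent dedispersion.

    data[c][t] is the detected power in channel c at time sample t.
    Returns one dedispersed time series per trial DM; each series is
    shortened by the largest channel delay for that trial.
    """
    nchans = len(data)
    nsamps = len(data[0])
    results = []
    for dm in dm_trials:
        delays = channel_delays(dm, freqs_mhz, dt)
        nout = nsamps - max(delays)
        # Sum across channels along the dispersion curve for this trial DM.
        series = [
            sum(data[c][t + delays[c]] for c in range(nchans))
            for t in range(nout)
        ]
        results.append(series)
    return results
```

The triple loop over trial DMs, output samples and channels is what makes the direct method expensive, and because each output sample is computed independently of the others, the work is naturally data-parallel.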
