Paper information - Implementing the Palomar Transient Factory Real-Time Detection Pipeline in GLADE: Results and Observations

The Palomar Transient Factory is a comprehensive detection system for identifying and classifying transient astrophysical objects. The central piece of the identification pipeline is an automated classifier that distinguishes between real and bogus objects with high accuracy. Because the classifier has to identify the most significant transients among a large number of candidates in near real time, its response time is critically important. In this paper, we present an experimental study that evaluates a novel implementation of the classifier in GLADE, a parallel data processing system that combines the efficiency of a database with the extensibility of Map-Reduce. We show how each stage of the classifier (candidate identification, pruning, and contextual realbogus) maps optimally onto GLADE tasks by taking advantage of the unique features of the system: range-based data partitioning, columnar storage, multi-query execution, and in-database support for complex aggregate computation. The result is an efficient classifier implementation capable of processing a new set of acquired images in a matter of minutes, even on a low-end server. For comparison, an optimized PostgreSQL implementation of the classifier takes hours on the same machine.

Kesheng Wu | Peter Nugent | Florin Rusu
