Paper Information - Cloud Computing for Satellite Data Processing on High End Compute Clusters

Cloud Computing for Satellite Data Processing on High End Compute Clusters

Hadoop is a distributed filesystem and MapReduce framework, based on designs originally developed for search applications by Google and maintained by the Apache Software Foundation as an open source system. We propose that this parallel computing framework is well suited to a variety of service-oriented science applications and, in particular, to the processing of satellite data from remote sensing systems. We show that, by installing Hadoop on a cluster of IBM PowerPC blade servers, we can efficiently process multiyear remote sensing data, expect speed improvements over conventional multiprocessor methodologies, and obtain a more memory-efficient implementation that allows finer grid resolutions. Moreover, these improvements can be achieved without significant changes in coding structure.

Milton Halem | Navid Golpayegani

[1] Yelena Yesha, et al. Service-Oriented Atmospheric Radiances (SOAR): Gridding and Analysis Services for Multisensor Aqua IR Radiance Data for Climate Studies, 2009, IEEE Transactions on Geoscience and Remote Sensing.

[2] William L. Smith, et al. AIRS: Improving Weather Forecasting and Providing New Data on Greenhouse Gases, 2006.

[3] William L. Smith, et al. AIRS/AMSU/HSB on the Aqua mission: design, science objectives, data products, and processing systems, 2003, IEEE Transactions on Geoscience and Remote Sensing.

[4] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.