Hadoop is an open source distributed filesystem and MapReduce framework, modeled on systems that Google originally developed for search applications and maintained by the Apache Software Foundation. We propose that this parallel computing framework is well suited to a variety of service-oriented science applications and, in particular, to processing satellite data from remote sensing systems. We show that, by installing Hadoop on a cluster of IBM PowerPC blades, we can efficiently process multiyear remote sensing data. We expect speed improvements over conventional multiprocessor methods, as well as a more memory-efficient implementation that allows finer grid resolutions. Moreover, these improvements can be achieved without significant changes to the structure of the existing code.
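As an illustration of why gridding fits the MapReduce model, the following is a minimal sketch, not the implementation described here. It assumes that gridding means averaging observations that fall in the same latitude/longitude cell, and it runs the map, group, and reduce steps in a single process rather than on Hadoop. The function names, the `cell_deg` parameter, and the `(lat, lon, radiance)` record layout are all hypothetical.

```python
from collections import defaultdict
from math import floor


def map_observation(lat, lon, radiance, cell_deg=1.0):
    """Map step: key an observation by the grid cell that contains it."""
    row = floor((lat + 90.0) / cell_deg)
    col = floor((lon + 180.0) / cell_deg)
    # Emit (sum, count) so partial results could be combined before the reduce.
    return (row, col), (radiance, 1)


def reduce_cell(values):
    """Reduce step: average all radiances that fell in one cell."""
    total = sum(value for value, _ in values)
    count = sum(n for _, n in values)
    return total / count


def grid_mean(observations, cell_deg=1.0):
    """Run map, group-by-key (the shuffle), and reduce in one process."""
    groups = defaultdict(list)
    for lat, lon, radiance in observations:
        key, value = map_observation(lat, lon, radiance, cell_deg)
        groups[key].append(value)
    return {key: reduce_cell(values) for key, values in groups.items()}


# Hypothetical sample records: (latitude, longitude, radiance).
sample = [(10.2, 20.7, 100.0), (10.8, 20.1, 200.0), (-45.5, 100.5, 50.0)]
gridded = grid_mean(sample)
```

Because each observation is keyed independently and each cell is reduced independently, only one cell's values need to be held at a time. Under this assumed design, that is what would let a finer grid be used without holding the whole grid in memory.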