The paradigm for processing huge datasets has shifted from centralized to distributed architectures. As enterprises accumulated ever-larger volumes of data, they found that the data could not be processed by any existing centralized solution. Beyond time constraints, centralized data processing also suffered from poor efficiency, poor performance, and elevated infrastructure costs. Distributed architectures enabled these large organizations to overcome the problem of extracting relevant information from a huge data dump. One of the most widely used open source tools for harnessing a distributed architecture to solve such data processing problems is Apache Hadoop. Using Apache Hadoop's components, such as data clusters, the MapReduce programming model, and distributed processing, we resolve various location-based complex data problems and feed the relevant information back into the system, thereby improving the user experience.
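As a rough illustration of the MapReduce approach mentioned above, the following minimal sketch shows a Hadoop job that counts records per location. This is not the paper's actual implementation: the CSV record layout, the field index, and the LocationCount, LocationMapper, and LocationReducer names are assumptions made here for illustration only.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LocationCount {

    // Mapper: extracts the location field from each record and emits (location, 1).
    public static class LocationMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text location = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed record layout (illustrative only): id,timestamp,location,...
            String[] fields = value.toString().split(",");
            if (fields.length > 2) {
                location.set(fields[2].trim());
                context.write(location, ONE);
            }
        }
    }

    // Reducer: sums the per-location counts produced by the mappers.
    public static class LocationReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "location count");
        job.setJarByClass(LocationCount.class);
        job.setMapperClass(LocationMapper.class);
        // The reducer doubles as a combiner, since summation is associative.
        job.setCombinerClass(LocationReducer.class);
        job.setReducerClass(LocationReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

A job like this would typically be packaged as a JAR and submitted to the cluster with a command such as hadoop jar locationcount.jar LocationCount /input /output, after which the framework distributes the map and reduce tasks across the cluster's nodes.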