A novel approach to implementing a shop bot on a distributed web crawler

A shopping agent is a Web application that, when queried by a customer, returns a consolidated list of information about all retail products matching the query, gathered from various e-commerce sites and resources. This helps customers decide which site offers the nearest, cheapest, and most reliable source for the product they want to buy. This paper aims to develop a distributed crawler that helps online shoppers compare the prices of requested products across different vendors and find the best deal in one place. In a real-world scenario, crawling usually consumes a large amount of computing resources to process the vast amount of data held on large e-commerce servers. An alternative is to use the MapReduce paradigm to process large amounts of data on a Hadoop cluster built from cheap commodity hardware. This paper therefore describes the implementation of a shopping agent on a distributed web crawler that uses the MapReduce paradigm to crawl web pages.
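
To make the price-comparison idea concrete, the following is a minimal, single-process sketch of how a map step and a reduce step could be combined to find the cheapest vendor for each product. It is an illustration only, not the implementation described in this paper: it runs in memory rather than on a Hadoop cluster, and the record format (`vendor,product,price`) and the names `Offer`, `map_offer`, `shuffle`, `reduce_cheapest`, and `cheapest_offers` are assumptions introduced for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Offer(NamedTuple):
    """One crawled price quote (hypothetical record layout)."""
    vendor: str
    product: str
    price: float


def map_offer(record: str) -> Iterator[Tuple[str, Offer]]:
    """Map step: parse a 'vendor,product,price' line into (key, offer).

    The key is the lower-cased product name, so quotes for the same
    product from different vendors end up in the same group.
    """
    vendor, product, price = (field.strip() for field in record.split(","))
    yield product.lower(), Offer(vendor, product, float(price))


def shuffle(pairs: Iterable[Tuple[str, Offer]]) -> Dict[str, List[Offer]]:
    """Group mapped values by key, standing in for the framework's shuffle phase."""
    groups: Dict[str, List[Offer]] = defaultdict(list)
    for key, offer in pairs:
        groups[key].append(offer)
    return dict(groups)


def reduce_cheapest(offers: List[Offer]) -> Offer:
    """Reduce step: keep the lowest-priced offer for one product."""
    return min(offers, key=lambda offer: offer.price)


def cheapest_offers(records: Iterable[str]) -> Dict[str, Offer]:
    """Run map, shuffle, and reduce over crawled records in memory."""
    pairs = (pair for record in records for pair in map_offer(record))
    return {key: reduce_cheapest(group) for key, group in shuffle(pairs).items()}


if __name__ == "__main__":
    crawled = [
        "shopA, phone, 199.0",
        "shopB, Phone, 189.5",
        "shopA, laptop, 900",
    ]
    for product, offer in cheapest_offers(crawled).items():
        print(product, offer.vendor, offer.price)
```

In a distributed setting, the map and reduce functions would run on separate cluster nodes and the grouping would be handled by the framework; the sketch only shows how the work divides between the two steps.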