Paper information - A dive into Web Scraper world

A dive into Web Scraper world

This paper surveys the world of web scrapers. Web scraping is related to web indexing, in which a bot or web crawler indexes information on the web. The legal aspects of scraping, both positive and negative, are considered, along with some cases involving legal issues. The design principles and methods of web scrapers are contrasted to show how a working scraper is designed. The implementation is divided into three parts: a web crawler that fetches the desired links, a data extractor that fetches the data from those links, and a step that stores the data in a CSV file. Python is used for the implementation. Combining these parts with a good knowledge of libraries and working experience yields a fully fledged scraper. Because of Python's large community, extensive library support, and clean coding style, the authors consider it the most suitable language for scraping data from websites.
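The three-part pipeline described in the abstract (crawler, data extractor, CSV storage) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the libraries used, so the sketch assumes only the Python standard library. The names `LinkCollector`, `TitleExtractor`, `crawl_links`, `extract_title`, `write_csv`, `scrape`, and the in-memory `PAGES` dictionary are all hypothetical, and the page title stands in for whatever data a real scraper would extract.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Crawler step: collect the absolute URL of every <a href> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, href))


class TitleExtractor(HTMLParser):
    """Extractor step: pull the text of the <title> element (assumed target)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def extract_title(html):
    parser = TitleExtractor()
    parser.feed(html)
    return parser.title.strip()


def write_csv(rows, out):
    """Storage step: write (url, title) rows to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["url", "title"])
    writer.writerows(rows)


# Hypothetical in-memory "site" standing in for real HTTP responses.
PAGES = {
    "https://example.com/": (
        '<html><title>Home</title>'
        '<a href="/a">A</a><a href="/b">B</a></html>'
    ),
    "https://example.com/a": "<html><title>Page A</title></html>",
    "https://example.com/b": "<html><title>Page B</title></html>",
}


def scrape(start_url, pages):
    """Run crawler then extractor over every link found on the start page."""
    links = crawl_links(pages[start_url], start_url)
    return [(url, extract_title(pages[url])) for url in links if url in pages]


if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(scrape("https://example.com/", PAGES), buffer)
    print(buffer.getvalue())
```

In a real scraper the `PAGES` lookup would be replaced by an HTTP fetch (for example with `urllib.request.urlopen`), and `write_csv` would be given a file opened with `newline=""` instead of a `StringIO` buffer.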

Lisha Singh | Deepak Kumar Mahto

[1] Davar Pishva, et al. Application of Web Scraping and Google API service to optimize convenience stores' distribution, 2015, 2015 17th International Conference on Advanced Communication Technology (ICACT).

[2] Sanjay Kumar Malik, et al. Information Extraction Using Web Usage Mining, Web Scrapping and Semantic Annotation, 2011, 2011 International Conference on Computational Intelligence and Communication Networks.

[3] Simon Fong, et al. Data Reconstruction of Abandoned Websites, 2014, 2014 2nd International Symposium on Computational and Business Intelligence.

[4] A. Joshi, et al. Web mining: research and practice, 2004, Computing in Science & Engineering.