Wextractor: Follow-up of the evolution of prices in web pages

In e-commerce, tracking prices on product detail pages is of great interest, for example to buy a product when its price falls below a given threshold. Instead of bookmarking pages and revisiting them manually, in this paper we propose a novel web data extraction system called Wextractor. It consists of an extraction method and a web app that lists the retrieved prices. For the end user, the main feature of Wextractor is usability: the user only has to flag the pages of interest, and the system automatically extracts the price from each page.
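
The abstract does not describe the extraction method itself, so the following is only a minimal sketch of the use case it motivates: pulling a price out of a page and checking it against a threshold. The regular expression, the crude tag stripping, and the names `extract_price` and `below_threshold` are all illustrative assumptions and should not be read as Wextractor's actual method.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed pattern (not from the paper): a currency symbol followed by an
# amount such as 1,299.99 or 1299.99.
PRICE_RE = re.compile(r"[$€£]\s*(\d+(?:,\d{3})*(?:\.\d{1,2})?)")


def extract_price(html: str) -> Optional[Decimal]:
    """Return the first currency amount found in the page text, or None."""
    # Naive tag stripping, for illustration only; a real system would parse the DOM.
    text = re.sub(r"<[^>]+>", " ", html)
    match = PRICE_RE.search(text)
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", ""))


def below_threshold(price: Optional[Decimal], threshold: Decimal) -> bool:
    """True when a price was found and it is strictly below the threshold."""
    return price is not None and price < threshold


if __name__ == "__main__":
    page = '<div class="price">$1,299.99</div>'
    price = extract_price(page)
    print(price, below_threshold(price, Decimal("1300")))
```

Amounts are kept as `Decimal` rather than `float` so that threshold comparisons are not affected by binary rounding.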
