Social Media Networks (SMN) an Eye: To Envision and Extract Information

This paper outlines the research problem of how to extract valuable knowledge from social media networks (SMN), a problem for which many technologies, methods, and procedures have been developed. First, we discuss and profile the current online data extraction tools, referred to as Social Media Networks Extraction Systems (SMMES), because they focus on the social-spectrum applications where data extraction tools can be applied. Second, we cover the techniques underlying the operation of the various tools used to generate data from SMN. Special focus is given to the difficulties of obtaining knowledge from online web sources, in particular from social media networks. Third, we categorize the fields of application where web information extraction techniques can be applied, concentrating especially on social and enterprise applications.
