Weblog search engine based on quality criteria

An increasing amount of human knowledge is stored in computerized repositories such as the World Wide Web. This raises the problem of how to locate specific pieces of information in these often quite unstructured repositories, and search engines are currently the best available solution. Some studies show that almost half of the traffic to a blog server comes from search engines. The more outgoing and informal social nature of the blogosphere opens the opportunity to exploit more socially oriented features. Blogs are usually personal and informal, change dynamically, and are built on new kinds of relational links, so blog search engines require new quality measures. Link analysis algorithms that exploit the Web graph may not work well in the blogosphere in general. Goncalves et al. (2010) reported that most of the popular blogs in their dataset (70%) have a PageRank value of -1, which makes them almost invisible to the search engine. We expect that search engines incorporating blog-specific quality criteria would retrieve more desirable results.

[1] Soo Young Rieh. Judgment of information quality and cognitive authority in the Web, 2002, J. Assoc. Inf. Sci. Technol.

[2] Richi Nayak, et al. Mining world knowledge for analysis of search engine content, 2007, Web Intell. Agent Syst.

[3] Felix Naumann, et al. Assessment Methods for Information Quality Criteria, 2000, IQ.

[4] Inna Kouper, et al. Conversations in the Blogosphere: An Analysis "From the Bottom Up", 2005, Proceedings of the 38th Annual Hawaii International Conference on System Sciences.

[5] Allan Borodin, et al. Link analysis ranking: algorithms, theory, and experiments, 2005, TOIT.

[6] Wenfei Fan, et al. Keys for XML, 2001, WWW '01.

[7] Euripides G. M. Petrakis, et al. Improving the performance of focused web crawlers, 2009, Data Knowl. Eng.

[8] Mohammad Javad Kargar, et al. Formulating Priory of Information Quality Criteria on the Blog, 2008.

[9] Susan Gauch, et al. Incorporating quality metrics in centralized/distributed information retrieval on the World Wide Web, 2000, SIGIR '00.

[10] Ji-Rong Wen, et al. Clustering user queries of a search engine, 2001, WWW '01.

[11] Sourav S. Bhowmick, et al. A survey of Web metrics, 2002, CSUR.

[12] Stuart E. Madnick, et al. Overview and Framework for Data and Information Quality Research, 2009, JDIQ.

[13] Evangelos E. Milios, et al. Using HMM to learn user browsing patterns for focused Web crawling, 2006, Data & Knowledge Engineering.

[14] Patricia Bouyer, et al. Improved undecidability results on weighted timed automata, 2006, Inf. Process. Lett.

[15] M. Schreurs. From the Bottom Up, 2008.

[16] Edith Cohen, et al. A short walk in the Blogistan, 2006, Comput. Networks.

[17] Kathy E. Gill. How can we measure the influence of the blogosphere?, 2004.

[18] Virgílio A. F. Almeida, et al. On Popularity in the Blogosphere, 2010, IEEE Internet Computing.

[19] Sergei Silvestrov, et al. The Mathematics of Internet Search Engines, 2008.

[20] Peter J. Nürnberg, et al. Proceedings of the seventeenth conference on Hypertext and hypermedia, 2006.

[21] Dmitri Loguinov, et al. IRLbot: Scaling to 6 billion pages and beyond, 2009, TWEB.

[22] Ying Zhou, et al. Analysis of Weblog Link Structure - A Community Perspective, 2006, WEBIST.

[23] Soo Young Rieh. Judgement of information quality and cognitive authority in the Web, 2002.

[24] Iraklis Varlamis, et al. BlogRank: ranking weblogs based on connectivity and similarity features, 2006, AAA-IDEA '06.

[25] Giovanni Pacifici. Proceedings of the 2nd international workshop on Advanced architectures and algorithms for internet delivery and applications, 2006.

[26] Matthew Hurst, et al. Social Streams Blog Crawler, 2009, 2009 IEEE 25th International Conference on Data Engineering.

[27] Thomas Mandl, et al. Implementation and evaluation of a quality-based search engine, 2006, HYPERTEXT '06.

[28] Rebecca Blood, et al. How blogging software reshapes the online community, 2004, CACM.

[29] Daniel E. Rose, et al. Understanding user goals in web search, 2004, WWW '04.