Comparison of ranking algorithms with dataspace

With increasing digitization, the amount of homogeneous or heterogeneous data (unstructured, semi-structured, and structured) being created and stored is exploding; this data is collectively called a “Dataspace”. Data is generated from various heterogeneous sources, such as digital images, audio, video, online transactions, online social media, sensor nodes, and click streams, across domains including retail, medicine, healthcare, energy, and day-to-day utilities. In businesses, industries, institutions, and organizations, individuals add to the data volume with technical reports, seminar reports, research papers, dissertations, theses, etc. For instance, 30 billion web pages are accessible on the World Wide Web. With the enormous number of pages that exist today, search engines play a significant role in the current Internet of Things (IoT). With billions of web pages accessible on the web, a user query entered in a search engine may return thousands of pages, so it becomes extremely important to rank these results in such a way that the most “related”, “important”, or “authorized” pages are displayed first. This job of prioritizing the results is performed by ranking algorithms, and different search engines use different schemes for ranking the results. Ranking can also be applied to heterogeneous data to retrieve information from the Dataspace. The aim of this paper is to describe Dataspace and present a survey of ranking algorithms and their comparison. The comparison is based on parameters such as the main technique used, methodology, input parameters, relevancy, quality of results, importance and limitations, search engines, and time complexity of the algorithms. We also explain how ranking can be used in Dataspace, along with the challenges of information retrieval from heterogeneous data or from Dataspace.
