Search Ranking for Heterogeneous Data over Dataspace

Traditional relational database systems run queries over structured data, whereas information retrieval systems are designed for more versatile and flexible ranked keyword queries over unstructured data, semi-structured data, streamed data, social networking data, and data without any format, collectively known as heterogeneous data. However, several new and emerging applications need data management capabilities that combine the advantages of both approaches. In this paper, we propose and initiate steps to combine heterogeneous data and information retrieval systems over a Dataspace, which is a collection of heterogeneous data drawn from various sources and in different formats. In many enterprises, handling the heterogeneity among information at different levels has become a difficult job. Within an organization, data exists in structured, semi-structured, or unstructured format, or in a combination of all of these. Existing heterogeneous data management systems fail to deal with such information efficiently. The Dataspace approach offers a solution to the problem of heterogeneity in information and to a variety of drawbacks of the existing systems. The main aim of this paper is to explain the search ranking mechanism in a Dataspace. We also investigate how structured, semi-structured, or unstructured data can be exploited for ranking search results on the Web and in Dataspaces, along with the associated research challenges.
