Enhancing information source selection using a genetic algorithm and social tagging

Abstract The selection of information sources in a distributed information retrieval environment remains a critical issue. A distributed information retrieval system typically comprises a very large number of sources, so retrieval effectiveness depends on searching only those sources that are likely to contain information relevant to a query. Many heuristics exist for this problem, among them the genetic algorithm. The proposed genetic algorithm searches a large space of potential solutions for the best selection, where a solution is represented as a combination of sources. Selection accuracy is improved by exploiting the user’s traces of source usage: each source description is enriched with tags drawn from the tagging history.
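
The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding, the fitness function, or the genetic operators, so everything below is an assumption made for illustration: a solution is encoded as a bit vector over the sources, the hypothetical `enrich` helper adds tags to a source's term set, the hypothetical `fitness` rewards query-term overlap and charges a fixed cost per selected source, and `select_sources` uses tournament selection, one-point crossover, bit-flip mutation, and elitism. It is not the authors' implementation.

```python
import random


def enrich(description, tags):
    """Extend a source's term set with tags from its tagging history (assumed model)."""
    return set(description) | set(tags)


def fitness(selection, sources, query, cost=0.1):
    """Assumed fitness: for each selected source, the fraction of query terms
    it covers minus a fixed per-source cost."""
    score = 0.0
    for chosen, terms in zip(selection, sources):
        if chosen:
            score += len(query & terms) / len(query) - cost
    return score


def select_sources(sources, query, pop_size=20, generations=40,
                   mutation_rate=0.05, seed=0):
    """Search bit-vector selections with a simple genetic algorithm.
    Requires at least two sources so that a crossover point exists."""
    rng = random.Random(seed)
    n = len(sources)

    def score(individual):
        return fitness(individual, sources, query)

    # Random initial population: one bit per source (1 = source is searched).
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: keep the two best selections
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)


if __name__ == "__main__":
    query = {"genetic", "tagging"}
    sources = [
        enrich({"genetic"}, {"tagging"}),  # tags add a missing query term
        enrich({"genetic"}, set()),
        {"biology"},
        {"music"},
    ]
    print(select_sources(sources, query))
```

In this toy setting, the first source matches the whole query only because its tags contribute the term "tagging", which is the kind of effect the abstract attributes to enriching source descriptions with the tagging history.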
