A new fuzzy logic-based query expansion model for efficient information retrieval using relevance feedback approach

Abstract Efficient methods for selecting query expansion (QE) terms are important for improving the accuracy and efficiency of a retrieval system, because they remove irrelevant and redundant terms from the corpus of top-retrieved feedback documents for a user query. Each individual QE term selection method has its own strengths and weaknesses. To exploit the strengths and offset the weaknesses of the individual methods, we use multiple term selection methods together. In this paper, we present a new QE method based on fuzzy logic that treats the top-retrieved documents as relevance feedback documents for mining additional QE terms. Each term selection method calculates the degree of importance of every unique term in the collection of top-retrieved documents, so the methods give different relevance scores for the same term. The proposed method combines these different weights for each term using fuzzy rules to infer the weights of the additional query terms. The weights of the additional query terms and the weights of the original query terms are then used to form a new query vector, which is used to retrieve documents. All experiments are performed on the TREC and FIRE benchmark datasets. The proposed QE method increases both the precision and the recall of information retrieval systems for document retrieval. It achieves a significantly higher average recall, average precision, and F-measure on both datasets.
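
The pipeline described in the abstract (several term selection scores per candidate term, fuzzy rules that fuse them into one weight, and a new query vector built from original and additional terms) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the abstract does not give these details: the use of exactly two scores normalized to [0, 1], the triangular membership functions, the rule table, the weighted-average (zero-order Sugeno-style) defuzzification, and the weight of 1.0 for original query terms. The names `tri`, `infer_weight`, and `expand_query` are hypothetical.

```python
# Hypothetical sketch: fuse two term-selection scores with fuzzy rules.
# Membership functions, rules, and defuzzifier are illustrative assumptions,
# not the ones used in the paper.

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b.
    When a == b or b == c the function acts as a left or right shoulder."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Assumed linguistic levels for input scores in [0, 1].
LEVELS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

# Assumed singleton output value for each output level.
OUT = {"low": 0.0, "medium": 0.5, "high": 1.0}

# Assumed rule table: (level of score 1, level of score 2) -> output level.
RULES = {
    ("low", "low"): "low",
    ("low", "medium"): "low",
    ("low", "high"): "medium",
    ("medium", "low"): "low",
    ("medium", "medium"): "medium",
    ("medium", "high"): "high",
    ("high", "low"): "medium",
    ("high", "medium"): "high",
    ("high", "high"): "high",
}

def infer_weight(s1, s2):
    """Combine two scores into one term weight.
    Rule strength is the min of the two memberships (fuzzy AND);
    the result is the strength-weighted average of the rule outputs."""
    num = den = 0.0
    for (l1, l2), out in RULES.items():
        strength = min(tri(s1, *LEVELS[l1]), tri(s2, *LEVELS[l2]))
        num += strength * OUT[out]
        den += strength
    return num / den if den else 0.0

def expand_query(original_terms, candidate_scores, k):
    """Return a term -> weight mapping for the expanded query.
    Original terms get weight 1.0 (an assumption); the k candidates
    with the highest inferred weights are added as expansion terms."""
    original = set(original_terms)
    weights = {
        term: infer_weight(s1, s2)
        for term, (s1, s2) in candidate_scores.items()
        if term not in original
    }
    top = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]
    query = {term: 1.0 for term in original_terms}
    query.update(top)
    return query
```

For example, `expand_query(["fuzzy", "retrieval"], {"logic": (0.9, 0.8), "the": (0.2, 0.3), "ranking": (0.5, 0.5)}, k=2)` keeps both original terms and adds the two candidates with the highest inferred weights. A full Mamdani-style system would aggregate clipped output fuzzy sets and take a centroid instead of averaging singletons; the weighted average is used here only to keep the sketch short.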
