Genetic algorithms in relevance feedback: a second test and new contributions

This work continues an earlier study that reviewed the literature on genetic techniques for relevance feedback based on the vector space model (the model most commonly used in this type of application) and implemented them so that they could be compared with each other and with one of the best traditional relevance feedback methods, the Ide dec-hi method. Here we carry out the comparisons on more test collections (Cranfield, CISI, Medline, and NPL) and evaluate them with the residual collection method, as is recommended for this type of technique. We also add some fitness functions of our own design.
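
As a rough illustration of the baseline and the evaluation protocol named above, the following minimal sketch shows the textbook form of Ide dec-hi (add every relevant document vector judged by the user, subtract only the highest-ranked non-relevant one) and the residual collection idea (drop the documents the user has already judged before measuring the second retrieval). The function names, the plain-list vector representation, and the toy data are assumptions made for this illustration; they are not taken from the study, whose implementation may differ in details such as how negative weights are handled.

```python
from typing import Iterable, List, Optional, Sequence, Set

Vector = List[float]


def ide_dec_hi(query: Sequence[float],
               relevant: Iterable[Sequence[float]],
               top_nonrelevant: Optional[Sequence[float]] = None) -> Vector:
    """Textbook Ide dec-hi: Q' = Q + sum(relevant docs) - top non-relevant doc.

    Negative weights are left as they are here; clipping them to zero is a
    design choice this sketch does not make.
    """
    new_query = [float(w) for w in query]
    for doc in relevant:
        for i, weight in enumerate(doc):
            new_query[i] += weight
    if top_nonrelevant is not None:
        for i, weight in enumerate(top_nonrelevant):
            new_query[i] -= weight
    return new_query


def residual_ranking(ranking: Sequence[str], judged: Set[str]) -> List[str]:
    """Residual collection: remove already-judged documents from a ranking,
    preserving the order of the remaining ones."""
    return [doc_id for doc_id in ranking if doc_id not in judged]


if __name__ == "__main__":
    # Toy three-term vocabulary (hypothetical data).
    q = [1.0, 0.0, 1.0]
    rel = [[0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
    nonrel = [0.0, 0.0, 1.0]
    print(ide_dec_hi(q, rel, nonrel))  # [2.0, 2.0, 1.0]

    # Documents d1 and d3 were judged in the first pass, so they are excluded.
    print(residual_ranking(["d3", "d1", "d2", "d4"], {"d1", "d3"}))  # ['d2', 'd4']
```

Evaluating on the residual ranking prevents the modified query from being credited for re-retrieving documents whose relevance was already known when the feedback was given.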
