Structure of morphologically expanded queries: A genetic algorithm approach

This paper addresses two issues. First, we discuss the negative effects of term correlation in query expansion algorithms and propose a simple, novel representation for expanded queries, called query clauses, that may alleviate some of these effects. Second, we describe how genetic algorithms can be used to optimize local query expansion methods, and we apply this approach to improve stemming. We evaluate it in combination with the query clause representation and show significant improvements on the stemming optimization problem.
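The abstract does not spell out how query clauses or the genetic algorithm are implemented, so the following is only a minimal sketch under stated assumptions, not the paper's method. It assumes that a query clause groups the morphological variants of one original term and contributes the score of its best-matching variant only (so correlated variants are not counted repeatedly), and that the genetic algorithm evolves a bit mask selecting which variants to keep, with average precision on judged documents as fitness. The toy collection, relevance judgments, variant lists, and all function names are hypothetical.

```python
import random

# Hypothetical toy data: each original query term with its candidate variants.
CLAUSES = {
    "run": ["run", "runs", "running", "runner"],
    "bank": ["bank", "banks", "banking"],
}
# Invented documents (token lists) and invented relevance judgments.
DOCS = {
    0: ["running", "along", "river", "banks"],
    1: ["runner", "bank", "holiday"],
    2: ["banking", "crisis", "runs"],
    3: ["run", "bank", "run"],
    4: ["runner", "runner", "runner"],
}
RELEVANT = {0, 3}

# Flat list of (term, variant) pairs; one bit of the chromosome per pair.
VARIANTS = [(t, v) for t, vs in CLAUSES.items() for v in vs]


def clause_score(doc, mask):
    """Assumed clause scoring: each clause contributes only its best selected variant."""
    total = 0
    for term in CLAUSES:
        counts = [doc.count(v) for (t, v), bit in zip(VARIANTS, mask) if bit and t == term]
        total += max(counts, default=0)
    return total


def average_precision(mask):
    """Fitness: average precision of the ranking induced by the mask (ties broken by doc id)."""
    ranked = sorted(DOCS, key=lambda d: (-clause_score(DOCS[d], mask), d))
    hits, ap = 0, 0.0
    for rank, d in enumerate(ranked, start=1):
        if d in RELEVANT:
            hits += 1
            ap += hits / rank
    return ap / len(RELEVANT)


def evolve(generations=30, pop_size=12, p_mut=0.1, seed=0):
    """Simple generational GA: elitism, truncation selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(VARIANTS)
    # Seed the population with the "keep every variant" query plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=average_precision, reverse=True)
        nxt = [pop[0][:]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[: pop_size // 2], 2)  # parents from the better half
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=average_precision)


if __name__ == "__main__":
    best = evolve()
    kept = [v for (_, v), bit in zip(VARIANTS, best) if bit]
    print("kept variants:", kept)
    print("average precision:", average_precision(best))
```

Because the initial population contains the mask that keeps every variant and the best individual is always carried over, the evolved query in this sketch can never score worse than the fully expanded one on the training judgments.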
