Enhanced Search Scheme Precision and Performance using a GA Approach with Application to Arabic Content

A review of the literature shows that search engines for Arabic are few compared with those available for English and other languages. Search engines also face many problems when built for the Arabic language, including difficulty and uncertainty. Employing a Genetic Algorithm (GA) within the search scheme is one way to improve performance and precision and to address the inaccuracy of search systems that operate on Arabic content. This paper presents an enhanced search scheme that applies the GA technique to Arabic content to improve precision and performance. Starting from the pages selected by the user, the system uses its dynamic characteristics to search for related pages on the Web. A series of experiments was conducted to test the quality and effectiveness of the proposed system using the well-known test collections CISI, CACM, and NPL, together with 242 Arabic-content sites. Overall, the results showed that, among high-performance information retrieval systems that use the Genetic Algorithm, the proposed system retrieved the largest number of relevant documents and the fewest non-relevant documents with respect to user requests.
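The evaluation criterion described above (many relevant documents, few non-relevant ones) corresponds to the standard information retrieval measures of recall and precision. The following is a minimal sketch of how these two measures are computed for a single query. The function name `precision_recall` and the document identifiers are hypothetical and chosen for illustration only; they are not taken from the paper or from its experimental data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids returned by the search system
    relevant:  document ids judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example data, not results from the paper.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d1", "d3", "d5"]
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")
```

In a GA-based search scheme, a score of this kind (or a similarity measure standing in for it when relevance judgments are unavailable) could plausibly serve as the fitness signal; whether the proposed system does so is not stated in the abstract.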
