Identification of Promoter Region in Genomic DNA Using Cellular Automata Based Text Clustering

Identifying promoter regions plays a vital role in understanding human genes. This paper presents a new cellular automata based text clustering algorithm for identifying promoter regions in genomic DNA. Experimental results confirm the applicability of the algorithm to this task. We also observe a 12 percent increase in the accuracy of finding promoter regions for shorter DNA sequences. The algorithm was also trained to identify promoter regions in mixed and overlapping DNA sequences. However, the algorithm fails to identify promoter regions of length greater than 54. The algorithm will also be used to predict RNA structure.