Alu repeat analysis in the complete human genome: trends and variations with respect to genomic composition

MOTIVATION: Transposon-derived Alu repeats are found exclusively in primate genomes. They have gained considerable importance in recent times with evidence of their involvement in various aspects of gene regulation, e.g. alternative splicing, nucleosome positioning, CpG methylation, and binding sites for transcription factors and hormone receptors. The objective of this study is to investigate the factors that influence the distribution of Alu repeat elements in the human genome. Such analysis is expected to yield insights into various aspects of gene regulation in primates.

RESULTS: Analysis of Alu repeat distribution in human genome build 32 (released in January 2003) reveals that Alus occupy nearly one-tenth of the sequenced regions. Large variations in Alu frequency were seen across the genome, with chromosome 19 being the most Alu-dense chromosome and chromosome Y the least. The highlights of the analysis are as follows: (1) Three-fourths of the total genes in the genome are associated with Alus. (2) Alu density is higher in genes than in intergenic regions in all chromosomes except 19 and 22. (3) Alu density in the human genome is highly correlated with GC content, gene density and intron density, with GC content being the major determining factor compared with the other two. (4) Alu density was correlated more strongly with gene density than with intron density, indicating the insertion of Alus in untranslated regions of exons.
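The two quantities these results rest on, Alu density and its correlation with a compositional feature such as GC content, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `alu_density` and `pearson` are hypothetical, density is assumed here to mean the fraction of sequenced bases covered by Alus, and Pearson's coefficient is assumed as the correlation measure, since the abstract does not name one.

```python
from math import sqrt


def alu_density(alu_bases, sequenced_bases):
    """Fraction of sequenced bases covered by Alu repeats (assumed definition)."""
    if sequenced_bases <= 0:
        raise ValueError("sequenced_bases must be positive")
    return alu_bases / sequenced_bases


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

In use, one would compute `alu_density` per chromosome (or per fixed-size window) and pass the resulting list, together with the matching GC content, gene density or intron density values, to `pearson`, then compare the three coefficients.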
