Neural Network Based Determination of Splice Junctions by ROC Analysis

A gene, the principal unit of inheritance, is an ordered sequence of nucleotides. The genes of eukaryotic organisms consist of alternating exon and intron segments. A region of deoxyribonucleic acid (DNA) within a gene that contains instructions for coding a protein is called an exon. Introns, by contrast, are non-coding regions that regulate gene expression and are removed from the messenger ribonucleic acid (RNA) during splicing. This paper proposes a method for determining splice junctions, the exon-intron boundaries, by analyzing DNA sequences. A splice junction can be either exon-intron (EI) or intron-exon (IE). Because artificial neural networks (ANNs) are popular and well suited to problems in genetics, several ANN models are applied in this research: Multi-Layer Perceptron (MLP), Radial Basis Function (RBF), and Generalized Regression Neural Network (GRNN) models are used to analyze and detect the splice junctions of gene sequences. Ten-fold cross-validation is used to estimate the accuracy of the networks, and Receiver Operating Characteristic (ROC) analysis is applied to assess their real performance.

Keywords: Gene, neural networks, ROC analysis, splice
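The evaluation protocol named in the abstract, 10-fold cross-validation followed by ROC analysis, can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything here is an assumption: the function names `k_fold_indices`, `roc_curve`, and `auc` are hypothetical, the fold assignment is a simple interleaved split, and the classifier scores are taken to be real numbers where higher means "more likely a splice junction".

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; each sample is tested exactly once."""
    # Interleaved assignment: sample i goes to fold i % k (an assumed scheme).
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def roc_curve(labels: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """Return (false positive rate, true positive rate) points for a threshold sweep."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC needs at least one positive and one negative sample")
    # Rank samples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, y) in enumerate(ranked):
        if y == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

In such a setup, each network would be trained on the `train` indices of every split and scored on the matching `test` indices; pooling the held-out scores and passing them to `roc_curve` and `auc` gives a threshold-independent summary to compare alongside plain accuracy.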
