Definition and Usage of Texture Feature for Biological Sequence

In recent years, sequencing technology has developed rapidly, producing large volumes of biological sequence data. Because of their importance, biological sequences have been studied extensively. However, an effective quantitative method for defining and calculating texture features of biological sequences is still lacking. Texture is an important visual feature that is generally used to describe the spatial arrangement of intensities in an image. Here, by combining the digital coding of biological sequences with the calculation methods used for image texture features, we define texture features for biological sequences and design a method for calculating them. We applied this method to the quantification and analysis of DNA sequence features. Using the quantified features, we can compute a similarity distance matrix for DNA sequences and construct phylogenetic relationships by clustering on those features. Because any biological sequence can be digitally coded, the method can be applied to any biological sequence. This is a novel study of biological sequence texture features, and we expect it to advance the quantitative and mathematical calculation of biological sequence features.
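The pipeline described above (digital coding, texture calculation, distance matrix) can be illustrated with a minimal sketch. It is not the authors' implementation: the numeric coding `A=0, C=1, G=2, T=3`, the one-dimensional co-occurrence matrix with a single `offset`, the four Haralick-style statistics, and the Euclidean distance are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Hypothetical numeric coding; the abstract does not specify one.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def cooccurrence(seq, offset=1):
    """Normalized co-occurrence matrix of coded symbols `offset` positions apart.

    Assumes the sequence contains only A, C, G, T (other symbols raise KeyError).
    """
    levels = len(CODE)
    counts = [[0] * levels for _ in range(levels)]
    coded = [CODE[ch] for ch in seq.upper()]
    # Count each ordered pair (symbol at k, symbol at k + offset).
    for i, j in zip(coded, coded[offset:]):
        counts[i][j] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("sequence is too short for the chosen offset")
    return [[c / total for c in row] for row in counts]


def texture_features(seq, offset=1):
    """Return [contrast, energy, homogeneity, entropy] for one sequence."""
    p = cooccurrence(seq, offset)
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    return [contrast, energy, homogeneity, entropy]


def distance_matrix(seqs, offset=1):
    """Pairwise Euclidean distances between texture feature vectors."""
    feats = [texture_features(s, offset) for s in seqs]
    return [[math.dist(a, b) for b in feats] for a in feats]


if __name__ == "__main__":
    # Toy sequences, made up for this example only.
    sequences = ["ACGTACGTAC", "AAAAACCCCC", "ACGTTGCAAC"]
    for row in distance_matrix(sequences):
        print([round(d, 3) for d in row])
```

The resulting matrix could then be passed to any standard hierarchical clustering routine to build a tree. In practice the features would likely need rescaling before distances are computed, since they lie on different numeric ranges.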
