Application of improved SOM network in gene data cluster analysis

Abstract Cluster analysis has become an effective way to extract biological information from gene expression data. In recent years, many researchers have applied both traditional and newer clustering algorithms to mine such data. This article first introduces the preprocessing of gene expression data. Principal component analysis (PCA) is then applied to the gene data to extract a small number of characteristic variables as new indicators, and these indicators are evaluated, which reduces the dimensionality of the data. The reduced indicators are fed into a dynamic self-organizing map (DSOM) neural network, in which the winning neuron for each input is the one at the minimum Euclidean distance. The DSOM network clusters the gene data by their characteristics, so that gene types with similar characteristics are assigned to the same region. The results show that combining PCA with the DSOM network clusters gene data with high accuracy and produces clearly separated cluster boundaries.
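The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy, computes PCA through a singular value decomposition, and substitutes a plain fixed-size one-dimensional SOM for the dynamic (growing) DSOM, whose growth rule the abstract does not specify. The function names `pca_reduce`, `winner`, and `train_som` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return Xc @ Vt[:n_components].T, explained[:n_components]


def winner(weights, x):
    """Index of the neuron whose weight vector has minimum Euclidean distance to x."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))


def train_som(Z, n_neurons=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a fixed-size 1-D SOM (an assumption; the paper uses a dynamic SOM)."""
    rng = np.random.default_rng(seed)
    # Initialise the weights from randomly chosen samples.
    weights = Z[rng.choice(len(Z), n_neurons, replace=False)].copy()
    positions = np.arange(n_neurons)             # neuron coordinates on the map
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # linearly decaying learning rate
        sigma = sigma0 * (1 - frac) + 1e-3       # shrinking neighbourhood width
        for x in Z[rng.permutation(len(Z))]:
            w = winner(weights, x)
            # Gaussian neighbourhood centred on the winning neuron.
            h = np.exp(-((positions - w) ** 2) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights
```

A possible usage, given an expression matrix `X` with one row per gene: call `Z, ratio = pca_reduce(X, 2)`, train with `W = train_som(Z)`, and assign each gene to a cluster with `winner(W, z)` for each row `z` of `Z`. The variance ratios in `ratio` give one simple way to judge how much information the retained components keep.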
