Modified LDA Approach For Cluster Based Gene Classification Using K-Mean Method

Abstract: The role of gene expression in cancer and other cellular processes is a complex problem that continues to challenge researchers. The sheer number of genes and interrelated biological processes makes identifying the relevant genes even harder. Gene classification and analysis is a difficult task for data scientists because it requires combining many kinds of data and information to characterize different gene expression patterns. Latent Dirichlet Allocation (LDA) has proven suitable for analyzing gene expression in healthy and cancer tissues: it is used to group genomic data, and the resulting groups are then used for further analysis. This work proposes a Modified Latent Dirichlet Allocation (MLDA) for gene classification and gene expression analysis. The proposed MLDA identifies and groups genes that are differentially expressed between healthy and cancer tissues of various types. Experimental results show that MLDA performs better than current state-of-the-art methods on breast and lung cancer data sets.
