A novel dimensionality reduction algorithm based on Laplace matrix for microbiome data analysis

Visualization is an important tool in microbiome data analysis, and dimensionality reduction is a necessary step to achieve it. Multidimensional Scaling (MDS) is a popular method, but it requires computing the full pairwise distance matrix. The UniFrac distance is a reasonable and biologically meaningful choice for microbiome data. However, because of the complexity of the phylogenetic tree and the high dimensionality of the data, MDS requires a large amount of computation to determine the distances between all pairs of samples. In this paper, we propose a novel dimensionality reduction algorithm based on the Laplace matrix (DRLM) for the analysis of microbiome data. Experimental results on both synthesized and microbiome data indicate that DRLM not only clusters the data more clearly but also significantly reduces the computational cost.
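To make the cost of the MDS baseline concrete, the following is a minimal sketch of classical (Torgerson) MDS, which starts from a full n × n distance matrix. It is not the DRLM algorithm, whose details are not given in this abstract. The Euclidean distance is used here only as a stand-in for UniFrac, which would additionally need a phylogenetic tree; the helper names `pairwise_euclidean` and `classical_mds` and the random toy data are assumptions made for illustration.

```python
import numpy as np


def pairwise_euclidean(X):
    """Full n x n distance matrix (stand-in for UniFrac; assumption)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS from a full n x n distance matrix."""
    n = dist.shape[0]
    # Double-center the squared distances: B = -1/2 * J D^2 J
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (dist ** 2) @ J
    # B is symmetric, so eigh applies; keep the largest eigenvalues
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by round-off
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy data (hypothetical): 30 samples with 5 features each
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))

D = pairwise_euclidean(X)   # every pair of samples must be compared first
Y = classical_mds(D, n_components=2)
print(Y.shape)
```

The distance matrix has n(n − 1)/2 distinct entries, so every pair of samples has to be compared before the embedding can be computed. When each comparison is a tree-based UniFrac computation, this step tends to dominate the running time, which is the cost the abstract refers to.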
