Latent Dirichlet Allocation Uncovers Spectral Characteristics of Drought Stressed Plants

Understanding how plants adapt to drought stress is essential for improving management practices and breeding strategies, and for engineering viable crops for sustainable agriculture in the coming decades. Hyper-spectral imaging is a particularly promising route to such understanding because it allows the non-destructive discovery of spectral characteristics of plants, which are governed primarily by the scattering and absorption properties of the leaf's internal structure and biochemical constituents. Several drought stress indices have been derived using hyper-spectral imaging. However, they are typically based on only a few hyper-spectral images, rely on expert interpretation, and consider only a few wavelengths. In this study, we present the first data-driven approach to discovering spectral drought stress indices, treating it as an unsupervised labeling problem at massive scale. To exploit short-range dependencies between spectral wavelengths, we develop an online variational Bayes algorithm for latent Dirichlet allocation (LDA) with a convolved Dirichlet regularizer. This approach scales to massive datasets and hence provides a more objective complement to plant physiological practices. The spectral topics found conform to plant physiological knowledge and can be computed in a fraction of the time required by existing LDA approaches.
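
The idea of short-range dependencies between wavelengths can be illustrated with a small sketch. It is not the convolved Dirichlet regularizer itself, whose exact form is not given here. It only shows the underlying intuition: if a topic is a distribution over wavelength bands, convolving it with a small kernel spreads weight to neighboring bands so that adjacent wavelengths receive similar mass. The function name `smooth_topic`, the three-tap kernel, and the edge clamping are all assumptions made for illustration.

```python
def smooth_topic(weights, kernel=(0.25, 0.5, 0.25)):
    """Convolve per-wavelength topic weights with a small kernel, then renormalize.

    Hypothetical helper: `weights` is assumed to be a list of non-negative
    values, one per wavelength band, ordered by wavelength.
    """
    if sum(weights) <= 0:
        raise ValueError("weights must have a positive sum")
    half = len(kernel) // 2
    n = len(weights)
    smoothed = []
    for i in range(n):
        total = 0.0
        for k, kernel_value in enumerate(kernel):
            # Clamp the neighbor index at the two ends of the spectrum.
            j = min(max(i + k - half, 0), n - 1)
            total += kernel_value * weights[j]
        smoothed.append(total)
    # Renormalize so the result is again a distribution over wavelengths.
    norm = sum(smoothed)
    return [s / norm for s in smoothed]


# A topic concentrated on a single band spreads to its two neighbors.
topic = [0.0, 0.0, 1.0, 0.0, 0.0]
smoothed = smooth_topic(topic)
```

In this toy case the mass that sat entirely on the middle band is shared symmetrically with the two adjacent bands, while the total stays at one. A wider kernel would encode longer-range dependencies.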
