Bayesian versus Neural Network Analysis of Algae Data Population - A New Method to Predict and Analyse Cause and Effect

In biology, advanced modelling techniques are needed because data on environmental and biological relationships mix qualitative, linguistic and numerical information. Experiments and data collection are also expensive and time-consuming, so determining which variables are relevant and using inference models that demand less data are highly desirable. In this work, from a set of 200 multivariate data samples of algae population and environmental variables, we propose a Bayesian method to predict the compositional population distribution. This is a good application example, since measuring environmental variables is easier to automate, faster and less expensive than population counting, which usually requires a large amount of specialized human effort. An additive log-ratio transformation and a regression model were applied to the data, and 255,000 Gibbs samples were simulated using the OpenBUGS software. An Artificial Neural Network (ANN) was also designed in MATLAB to predict the distribution for benchmarking purposes. Both models showed similar prediction performance, but the Bayesian model also allows an analysis of the credible intervals of the regression parameters associated with each variable, which shows that most of the variables in this study are relevant, consistent with the expected results in this case.
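
For readers unfamiliar with the additive log-ratio (alr) transformation, the following is a minimal illustrative sketch, not the code used in this work. It assumes the last component of the composition is taken as the reference part and that all proportions are strictly positive; the function names and the example proportions are hypothetical. The transformation maps proportions that sum to one onto unconstrained real coordinates, where a regression model can be fitted, and the inverse maps predictions back to a valid composition.

```python
import math


def alr(composition):
    """Additive log-ratio: log of each part divided by the reference (last) part.

    Assumes every part is strictly positive (zeros would need separate handling).
    """
    reference = composition[-1]
    return [math.log(part / reference) for part in composition[:-1]]


def alr_inverse(coords):
    """Map alr coordinates back to proportions that sum to one."""
    # The reference part has log-ratio 0, hence exp(0) = 1.0 appended at the end.
    exps = [math.exp(c) for c in coords] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical relative abundances of three algae groups (illustrative only).
proportions = [0.5, 0.3, 0.2]
coords = alr(proportions)        # two unconstrained real coordinates
recovered = alr_inverse(coords)  # back on the simplex
```

A composition with D parts yields D - 1 alr coordinates, so a regression on the transformed data predicts one fewer response than there are population classes.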