Predicting bacterial functional traits from whole genome sequences using random forest

Microbes are the most abundant and diverse biota on Earth. Despite their small size, they have a huge impact on many essential ecosystem services and on overall global health. However, because microbial communities are complex and most of their members cannot be cultured, the molecular and ecological details of these processes, as well as the factors that influence them, are still poorly understood [1]. An important question in ecological biology is how biodiversity influences ecosystem functioning [2]. In general, biodiversity is thought to maximize potential through a greater chance of containing highly successful individuals, through poorly understood processes that benefit communities, or both [3]-[5]. Underlying this is the relationship between the presence of individual organisms and specific functional traits.