Predicting the Wild Salmon Production Using Bayesian Networks

From a management point of view, the production of wild smolts is the most important indicator of the status of a river’s salmon population. We present a methodology for predicting the number of wild smolts in a river in a consistent and well-defined fashion. Our framework is probabilistic and our approach is Bayesian. Our models are Bayesian networks, which have a simple graphical representation that allows the acquired knowledge to be visualized. They are also state-of-the-art classifiers in many domains and thus offer good predictive power. We emphasize empirical modeling, studying what can be learned from the existing real-world data for two Gulf of Bothnia rivers, Simo and Tornio (the Finnish side). To ensure that our models generalize well, we employ strict validation procedures, taking care to prevent information from leaking from the validation set into the training set. Furthermore, with the needs of fisheries management in mind, we highlight the role of the loss function in modeling, and also evaluate our models in a setting where it is a greater error to overestimate than to underestimate the size of a population.
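
To illustrate how an asymmetric loss function can change a prediction, the following minimal sketch picks the point estimate that minimizes expected loss under a discrete predictive distribution. Everything in it is an assumption made for illustration: the piecewise-linear loss, the weight `over_weight`, the function names, and the smolt counts and probabilities, none of which are taken from the study.

```python
# Hypothetical sketch: choosing a point prediction under an asymmetric loss.
# The loss shape, the weight, and all numbers are illustrative assumptions.

def asymmetric_loss(prediction, actual, over_weight=3.0):
    """Linear loss that penalizes overestimation over_weight times more."""
    error = prediction - actual
    if error > 0:
        return over_weight * error  # overestimate: the costlier mistake
    return -error                   # underestimate (or exact): plain absolute error


def expected_loss(prediction, distribution, over_weight=3.0):
    """Expected loss of a prediction under a {value: probability} distribution."""
    return sum(
        prob * asymmetric_loss(prediction, value, over_weight)
        for value, prob in distribution.items()
    )


def best_prediction(distribution, over_weight=3.0):
    """Return the candidate value with the smallest expected loss."""
    return min(
        distribution,
        key=lambda candidate: expected_loss(candidate, distribution, over_weight),
    )


# Made-up predictive distribution over the number of wild smolts.
predictive = {10_000: 0.3, 20_000: 0.4, 30_000: 0.3}

symmetric_choice = best_prediction(predictive, over_weight=1.0)
cautious_choice = best_prediction(predictive, over_weight=3.0)

print("symmetric loss  ->", symmetric_choice)
print("asymmetric loss ->", cautious_choice)
```

With a symmetric loss the sketch selects the middle value, whereas tripling the cost of overestimation shifts the choice to the lowest value. The same predictive distribution can therefore lead to a more cautious estimate once the loss function reflects management priorities.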