Mining of Knowledge Related to Factors Involved in the Aberrant Growth of Plankton

We aim to obtain knowledge about the causes of the aberrant growth of plankton thought to cause problems such as shellfish poisoning, using data acquired by measuring the populations of more than 1000 plankton species in specific sea areas with a next-generation sequencer. Previously proposed techniques for predicting future time series data from past time series data are difficult to apply because the number of measurements is small. On the other hand, association rule mining, one of the classical data mining techniques, is insufficient for obtaining knowledge about indirect causes, such as “if species B increases, species A increases, and as a result the target species exhibits a characteristic increase.” Therefore, we propose a method for finding association rules relating to increases and decreases of species other than the target species, and a new model for aggregating those rules, named the “time series association graph”. We perform knowledge mining using a time series association graph and clustering (community discovery) on the graph to discover knowledge about the causes of the aberrant growth of a specified species. We also describe the code used, written in the programming language R.
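
To make the idea of chained increase/decrease rules concrete, the following is a minimal sketch in base R and not the code described in this paper. It assumes a one-step time lag between cause and effect, made-up abundance values, and arbitrary support and confidence thresholds; the names abund, events, rules, edges, direct and indirect are all hypothetical. Rules that pass the thresholds are kept as directed edges, which is one simple way to read the time series association graph. The community discovery step is omitted.

```r
# Toy abundance table: rows are sampling times, columns are species.
# All values are invented for illustration only.
abund <- data.frame(
  B      = c(1, 2, 3, 2, 3, 4, 3, 4),
  A      = c(5, 5, 6, 7, 6, 7, 8, 7),
  target = c(2, 2, 2, 3, 4, 3, 4, 5)
)

# Step 1: turn each series into increase ("+") and decrease ("-") events
# between consecutive samples.
delta  <- sapply(abund, diff)
events <- cbind(delta > 0, delta < 0)
colnames(events) <- c(paste0(colnames(delta), "+"),
                      paste0(colnames(delta), "-"))

# Step 2: score lagged rules "X changes at step t => Y changes at step t+1"
# (the one-step lag is an assumption of this sketch).
n    <- nrow(events)
ante <- events[-n, , drop = FALSE]  # antecedent events, steps 1..n-1
cons <- events[-1, , drop = FALSE]  # consequent events, steps 2..n

rules <- expand.grid(from = colnames(events), to = colnames(events),
                     stringsAsFactors = FALSE)
# Drop rules that relate a species to itself.
rules <- rules[sub("[+-]$", "", rules$from) != sub("[+-]$", "", rules$to), ]

rules$support <- mapply(
  function(f, g) mean(ante[, f] & cons[, g]),
  rules$from, rules$to, USE.NAMES = FALSE
)
rules$confidence <- mapply(
  function(f, g) {
    if (any(ante[, f])) sum(ante[, f] & cons[, g]) / sum(ante[, f]) else NA_real_
  },
  rules$from, rules$to, USE.NAMES = FALSE
)

# Step 3: rules above the (arbitrary) thresholds become directed edges.
edges <- rules[!is.na(rules$confidence) &
                 rules$support >= 0.3 & rules$confidence >= 0.7, ]

# Step 4: follow edges backwards from the target to find indirect candidates.
direct   <- edges$from[edges$to == "target+"]
indirect <- unique(edges$from[edges$to %in% direct])

print(edges)
print(indirect)
```

With so few samples, some rules in such a toy table pass the thresholds by chance, which is why aggregating rules into a graph and examining its clusters, rather than reading single rules, is the approach taken here.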