Collaborative Learning of Estimation of Distribution Algorithm for RNA Secondary Structure Prediction

Estimation of distribution algorithms (EDAs) have been successfully applied in bioinformatics to tasks such as gene structure analysis, protein structure prediction, and RNA secondary structure prediction. This paper proposes a new method, collaborative learning of estimation of distribution algorithms (Co-EDAs), for predicting RNA secondary structure from a single RNA sequence. The proposed method consists of two EDAs with a minimum free energy objective. Co-EDAs uses both good and poor solutions to improve the algorithm's ability to search the solution space: information from poor solutions indicates which regions are unpromising to explore, which helps when searching high-dimensional data. The method was tested on 750 known RNA structures from RNA STRAND v2.0, a database that covers more than 14 RNA types. It was compared with three prediction programs based on dynamic programming (Mfold, RNAfold, and RNAstructure), all of which are available as web server services. On average, Co-EDAs yields approximately 6% better accuracy than these competitors on all metrics.
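
The idea of learning from both good and poor solutions can be illustrated with a minimal toy sketch. It is not the Co-EDAs method itself: the abstract does not specify how the two EDAs interact, so everything below is an assumption made for illustration. The sketch uses a univariate model over bit strings, a hypothetical `toy_energy` function standing in for a real free-energy model, and a hypothetical `push` parameter that controls how strongly sampling is steered away from the poor-solution model.

```python
import random


def toy_energy(bits):
    # Hypothetical stand-in for a free-energy function (lower is better).
    # It is NOT an RNA energy model: it simply counts the zeros.
    return sum(1 for b in bits if b == 0)


def marginals(population):
    # Per-position frequency of 1s, with Laplace smoothing so that
    # no probability collapses to exactly 0 or 1.
    n = len(population)
    length = len(population[0])
    return [(sum(ind[i] for ind in population) + 1) / (n + 2)
            for i in range(length)]


def co_eda_sketch(length=20, pop_size=60, generations=40,
                  keep=15, push=0.5, seed=0):
    # Assumed scheme: one model is fitted to the best individuals and one
    # to the worst; sampling moves toward the former, away from the latter.
    rng = random.Random(seed)
    probs = [0.5] * length
    best = None
    for _ in range(generations):
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        population.sort(key=toy_energy)  # lowest energy first
        if best is None or toy_energy(population[0]) < toy_energy(best):
            best = population[0]
        good = marginals(population[:keep])   # model of good solutions
        poor = marginals(population[-keep:])  # model of poor solutions
        # Shift toward the good model and away from the poor one,
        # then clamp to keep some diversity in later generations.
        probs = [min(0.95, max(0.05, g + push * (g - p)))
                 for g, p in zip(good, poor)]
    return best, toy_energy(best)


if __name__ == "__main__":
    solution, energy = co_eda_sketch()
    print(solution, energy)
```

In this sketch the poor-solution model acts only as a repulsive term in the update of the sampling probabilities. A real predictor would replace the bit string with an encoding of candidate base pairs and `toy_energy` with a thermodynamic free-energy calculation.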