A Hidden Markov Model approach to predicting yeast gene function from sequential gene expression data

Existing data mining tools achieve only about 40% precision when predicting the function of unannotated genes. We developed a gene function prediction tool based on profile Hidden Markov Models (HMMs). Each function class was modelled by a distinct HMM whose parameters were trained on yeast time-series gene expression profiles. Two structural variants of HMMs were designed, and each was tested on 40 function classes. The highest overall prediction precision was 67%, achieved by the double-split HMM under leave-one-out cross-validation. We also attempted to generalise HMMs to dynamic Bayesian networks for gene function prediction from heterogeneous data sets.
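
The one-HMM-per-class idea can be illustrated with a minimal sketch: score an expression profile against each class model with the forward algorithm and assign the class whose model gives the highest likelihood. Everything specific here is an assumption made for illustration and is not taken from the work itself: the discretisation of expression changes into three symbols (0 = down, 1 = flat, 2 = up), the class names `class_A` and `class_B`, the two-state topologies, and the hand-set probabilities, which stand in for trained parameters. It does not reproduce the profile HMM structure or the double-split variant.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) using the forward algorithm in log space.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][k]  : probability that state s emits symbol k
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + safe_log(emit[s][symbol])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def predict_class(obs, models):
    """Score obs under every class HMM and return (best_class, all_scores)."""
    scores = {
        name: forward_log_likelihood(obs, *params)
        for name, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical class models (hand-set, not trained).
# Symbols: 0 = expression down, 1 = flat, 2 = up.
models = {
    # Left-to-right model: an "up" phase followed by a "down" phase.
    "class_A": (
        [1.0, 0.0],
        [[0.7, 0.3], [0.0, 1.0]],
        [[0.1, 0.2, 0.7], [0.7, 0.2, 0.1]],
    ),
    # Model for genes whose expression stays mostly flat.
    "class_B": (
        [0.5, 0.5],
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1]],
    ),
}

rise_then_fall = [2, 2, 2, 0, 0, 0]
mostly_flat = [1, 1, 1, 1, 1, 1]

label_a, _ = predict_class(rise_then_fall, models)
label_b, _ = predict_class(mostly_flat, models)
print(label_a, label_b)
```

In a leave-one-out setting, each annotated gene would be held out in turn, the class models re-estimated without it, and the held-out profile classified as above; the training step is omitted from this sketch.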