SAMIE: Statistical Algorithm for Modeling Interaction Energies

We are investigating the rules that govern protein-DNA interactions using a formalism based on statistical mechanics and related to the Boltzmann machine of the neural-network literature. Our approach is data-driven: probabilistic algorithms model protein-DNA interactions from SELEX and/or phage data. In this report, we trained the network on SELEX data under the "one-to-one" model of interactions (i.e., one amino acid contacts one base). The trained network successfully identified the wild-type binding sites of the EGR and MIG protein families. Its predictions are as good as or better than those of existing methods in the literature. In addition, our methodology has the potential to capitalise on quantitative detail and, given sufficient data, to explore more general models of interaction.
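
As an illustration of the "one-to-one" idea only, the sketch below scores a DNA site by summing one pairwise energy per amino acid-base contact and converts the scores into Boltzmann probabilities over all candidate sites. It is not the SAMIE implementation: the function names, the additive energy form, the temperature parameter kT, and the toy energy values are all assumptions made for this example, and no training step is shown.

```python
import itertools
import math

BASES = "ACGT"


def binding_energy(residues, site, pair_energy):
    """Additive energy under a one-to-one model: residue k contacts base k.

    pair_energy maps (amino_acid, base) -> energy; missing pairs count as 0.
    """
    if len(residues) != len(site):
        raise ValueError("one-to-one model needs equal numbers of residues and bases")
    return sum(pair_energy.get((aa, base), 0.0) for aa, base in zip(residues, site))


def site_probabilities(residues, pair_energy, kT=1.0):
    """Boltzmann distribution over every DNA site of length len(residues)."""
    sites = ["".join(s) for s in itertools.product(BASES, repeat=len(residues))]
    weights = [math.exp(-binding_energy(residues, s, pair_energy) / kT) for s in sites]
    partition = sum(weights)  # normalising constant Z
    return {s: w / partition for s, w in zip(sites, weights)}


# Hypothetical pair energies in arbitrary units (invented for illustration,
# not fitted to SELEX data).
toy_energy = {("R", "G"): -2.0, ("E", "C"): -1.0}

probs = site_probabilities("RER", toy_energy)
best_site = max(probs, key=probs.get)
print(best_site, round(probs[best_site], 3))
```

In a trained model the pair energies would be estimated from binding data rather than set by hand; lower energy corresponds to a higher predicted binding probability.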
