Paper information - The perils of interaction prediction

The perils of interaction prediction

The availability of genome-wide maps of enhancer-promoter interactions (EPIs) has made it possible to use machine learning approaches to extract and interpret the features that determine these interactions in different biological contexts. Multiple methods have claimed to predict enhancer-promoter interactions from the corresponding genomic features, but this problem is still far from solved. In our analysis, we show that individual enhancer and promoter regions have widely different marginal interaction probabilities, i.e., propensities, which can lead to overfitting and memorization when random cross-validation is employed. Furthermore, even when a proper cross-validation scheme is adopted, a simple propensity-based model can still achieve competitive performance without capturing any information about the EPI mechanism.
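The leakage described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' code or data. It assumes synthetic pairs whose labels depend only on a per-enhancer and a per-promoter propensity, and every name in it (`simulate_pairs`, `fit_propensity`, the toy "chromosome" index) is hypothetical. A baseline that only memorizes per-element positive rates from the training pairs is scored under a random split and under a split that holds out one toy chromosome. It illustrates only the first point of the abstract (memorization under random splitting); the propensity-based model the abstract evaluates under proper cross-validation is not reproduced here.

```python
import random
from collections import defaultdict


def simulate_pairs(n_enh=120, n_prom=120, n_chrom=4, seed=0):
    """Synthetic candidates (enhancer, promoter, chromosome, label).

    Assumption: the label depends only on the two element propensities,
    so there is no pair-specific "mechanism" to learn.
    """
    rng = random.Random(seed)
    enh_p = [rng.random() for _ in range(n_enh)]
    prom_p = [rng.random() for _ in range(n_prom)]
    pairs = []
    for e in range(n_enh):
        for p in range(n_prom):
            if e % n_chrom != p % n_chrom:  # pair only within a toy chromosome
                continue
            label = int(rng.random() < enh_p[e] * prom_p[p])
            pairs.append((e, p, e % n_chrom, label))
    return pairs


def _rates(groups):
    return {k: sum(v) / len(v) for k, v in groups.items()}


def fit_propensity(train):
    """Per-element positive rates in the training pairs, plus the global rate."""
    enh, prom = defaultdict(list), defaultdict(list)
    for e, p, _, y in train:
        enh[e].append(y)
        prom[p].append(y)
    global_rate = sum(t[3] for t in train) / len(train)
    return _rates(enh), _rates(prom), global_rate


def score(model, pair):
    """Product of memorized rates; unseen elements fall back to the global rate."""
    enh_rate, prom_rate, g = model
    return enh_rate.get(pair[0], g) * prom_rate.get(pair[1], g)


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U), using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def evaluate(train, test):
    model = fit_propensity(train)
    return auc([score(model, t) for t in test], [t[3] for t in test])


pairs = simulate_pairs()

# Random split: test pairs share enhancers and promoters with training pairs.
shuffled = pairs[:]
random.Random(1).shuffle(shuffled)
cut = len(shuffled) // 5
auc_random = evaluate(shuffled[cut:], shuffled[:cut])

# Held-out split: no element of toy chromosome 0 appears in training.
auc_chrom = evaluate([t for t in pairs if t[2] != 0],
                     [t for t in pairs if t[2] == 0])

print("random split AUC:", round(auc_random, 3))
print("held-out chromosome AUC:", round(auc_chrom, 3))
```

In this toy setting the memorizing baseline scores well above chance under the random split even though it uses no genomic features. Under the held-out split it has never seen the test elements, so every pair receives the same score and the AUC is 0.5.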

Dennis Kostka | Weiguang Mao | Maria Chikina

[1] B. Póczos, et al. Predicting Enhancer-Promoter Interaction from Genomic Sequence with Deep Neural Networks, 2016, bioRxiv.

[2] Ruochi Zhang, et al. Exploiting sequence-based features for predicting enhancer–promoter interactions, 2017, Bioinform.

[3] K. Pollard, et al. Enhancer–promoter interactions are encoded by complex genomic signatures on looping chromatin, 2016, Nature Genetics.