SUMMARY
Predicting genomic annotations from DNA sequences with deep learning has become a flourishing field with many applications. Nevertheless, it remains difficult to handle data conveniently when building and training models dedicated to specific end-user tasks. keras_dna is designed to ease the implementation of Keras models (the TensorFlow high-level API) for genomics. It accepts standard bioinformatics file formats as inputs, such as bigWig, GFF, BED, WIG, bedGraph, or FASTA, and returns standardized inputs for model training. keras_dna is designed to implement existing models and also to facilitate the development of new models that can have single or multiple targets or inputs.
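To make "standardized inputs for model training" concrete, the following is a minimal sketch of the kind of conversion involved: a DNA string is one-hot encoded and cut into fixed-length windows, each paired with a target derived from a coverage track. It does not use or reproduce the keras_dna API. The helper names (`one_hot_encode`, `make_batch`), the toy sequence, the toy coverage values, and the choice of mean coverage as the target are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical helpers for illustration only; these are NOT keras_dna functions.
ALPHABET = "ACGT"


def one_hot_encode(sequence):
    """Return an array of shape (len(sequence), 4).

    Bases outside ACGT (for example N) are left as all-zero rows.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(sequence), len(ALPHABET)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        if base in index:
            encoded[position, index[base]] = 1.0
    return encoded


def make_batch(sequence, coverage, window, starts):
    """Cut fixed-length windows and pair each with the mean coverage over it."""
    x = np.stack([one_hot_encode(sequence[s:s + window]) for s in starts])
    y = np.array(
        [np.mean(coverage[s:s + window]) for s in starts], dtype=np.float32
    )
    return x, y


# Toy data standing in for a fasta sequence and a bigWig-like coverage track.
sequence = "ACGTNACGTA"
coverage = np.arange(10, dtype=np.float32)

x, y = make_batch(sequence, coverage, window=4, starts=[0, 5])
print(x.shape, y)
```

Arrays shaped like `x` (batch, window length, 4) and `y` (batch,) are the usual form in which sequence data and targets are passed to a Keras model's `fit` method.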
AVAILABILITY
Freely available under an MIT License, either via pip install keras_dna or by cloning the GitHub repository at https://github.com/etirouthier/keras_dna.git.
CONTACT
julien.mozziconacci@mnhn.fr and etienne.routhier@upmc.fr.
SUPPLEMENTARY INFORMATION
Extensive documentation is available online at https://keras-dna.readthedocs.io/en/latest/.