BpForms: a toolkit for concretely describing modified DNA, RNA and proteins

Summary: Non-canonical nucleic and amino acid monomers are essential to enhance the information capacity, functional capabilities, and stability of DNA, RNA, and protein biopolymers. However, there are few tools for describing the primary structure of biopolymers that include non-canonical monomers. We developed BpForms, the first toolkit for concretely and compactly describing the primary structures of non-canonical 1-dimensional biopolymers. BpForms includes the first alphabets of non-canonical DNA, RNA, and protein monomers; a FASTA-like notation for describing biopolymers; and a website, a command line program, a REST API, and a Python package for calculating properties of biopolymers. We anticipate BpForms will be a valuable tool for communicating data about modified DNA, RNA, and proteins, as well as integrating data about epigenetic, post-transcriptional, and post-translational modification. BpForms will also be valuable for whole-cell modeling and cell engineering.

Availability and implementation: BpForms is freely available and open-source at https://bpforms.org.

arXiv:1903.10042v1 [q-bio.BM] 24 Mar 2019