Extension to Distributed Annotation System: Summary command

Obtaining data for large regions of interest from remote Distributed Annotation System (DAS) servers with the current version of the protocol can consume considerable time and resources. It would therefore be useful to know the number and types of features that exist in a region of interest before the DAS request is made. In the current DAS protocol (1.6), the types of data held by a DAS source can be obtained with the types command before the complete set of features is requested for a specific region with the features command. Depending on how the DAS server implements the types command, the number of features across the segment can also be obtained. However, counting features on every user request is computationally intensive, so most DAS servers do not include it in their implementation. To obtain feature counts from these servers, the user must retrieve the complete set of features and re-analyse it. In addition, the current DAS protocol has no parameter for including or excluding the per-type feature count on demand. In this paper, an addition to the DAS protocol is proposed and implemented that allows a summary of the features within a region of interest to be requested when needed. The summary command was implemented to broaden the functionality and extend the flexibility of the current DAS protocol. It can save time and resources, especially for large regions of interest.
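To illustrate the re-analysis step that a client has to perform when the server does not report counts, the following minimal Python sketch tallies features per type from a features response. It is an illustration only, not the implementation described in the paper: the sample document is hand-written and hypothetical, it assumes the usual DASGFF layout of FEATURE elements each containing a TYPE element with an id attribute, and the function name count_features_by_type is made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, hand-written response in the assumed DASGFF layout.
# A real response would come from a DAS server's features command.
SAMPLE_RESPONSE = """<DASGFF>
  <GFF href="http://example.org/das/demo/features">
    <SEGMENT id="1" start="1" stop="1000">
      <FEATURE id="f1">
        <TYPE id="exon">exon</TYPE><START>10</START><END>90</END>
      </FEATURE>
      <FEATURE id="f2">
        <TYPE id="exon">exon</TYPE><START>200</START><END>310</END>
      </FEATURE>
      <FEATURE id="f3">
        <TYPE id="SNP">SNP</TYPE><START>250</START><END>250</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def count_features_by_type(xml_text):
    """Count FEATURE elements per TYPE id in a DASGFF document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for feature in root.iter("FEATURE"):
        type_el = feature.find("TYPE")
        # Features without a TYPE element are grouped under "unknown".
        type_id = type_el.get("id") if type_el is not None else "unknown"
        counts[type_id] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_features_by_type(SAMPLE_RESPONSE))
```

The whole response has to be downloaded and parsed before any count is available, which is the cost that a server-side summary is intended to avoid for large regions.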
