SPDE: A Multi-functional Software for Sequence Processing and Data Extraction

Efficiently extracting information from large biological datasets can be a major challenge, especially for researchers who lack programming skills. We developed Sequence Processing and Data Extraction (SPDE) as an integrated tool for sequence processing and data extraction in gene family and omics analyses. SPDE currently has seven modules comprising 100 basic functions that range from single-gene processing (e.g., translation, reverse complement, and primer design) to genome information extraction. All SPDE functions can be used without programming or the command line, and the interface provides enough prompt information to guide users through each function. In addition to its own functions, SPDE incorporates publicly available analysis tools (such as NCBI BLAST, HMMER, Primer3, and SAMtools), making it a comprehensive bioinformatics platform for the analysis of large biological datasets.

Availability: SPDE was built using Python and runs on 32-bit and 64-bit Windows and on macOS. It is open-source software that can be downloaded from https://github.com/simon19891216/SPDEv1.2.git.

Contact: xudongzhuanyong@163.com
