The application of Hadoop in structural bioinformatics