MR-Tandem: parallel X!Tandem using Hadoop MapReduce on Amazon Web Services
暂无分享,去创建一个
SUMMARY
MR-Tandem adapts the popular X!Tandem peptide search engine to work with Hadoop MapReduce for reliable parallel execution of large searches. MR-Tandem runs on any Hadoop cluster but offers special support for Amazon Web Services for creating inexpensive on-demand Hadoop clusters, enabling search volumes that might not otherwise be feasible with the compute resources a researcher has at hand. MR-Tandem is designed to drop in wherever X!Tandem is already in use and requires no modification to existing X!Tandem parameter files, and only minimal modification to X!Tandem-based workflows.
AVAILABILITY AND IMPLEMENTATION
MR-Tandem is implemented as a lightly modified X!Tandem C++ executable and a Python script that drives Hadoop clusters including Amazon Web Services (AWS) Elastic Map Reduce (EMR), using the modified X!Tandem program as a Hadoop Streaming mapper and reducer. The modified X!Tandem C++ source code is Artistic licensed, supports pluggable scoring, and is available as part of the Sashimi project at http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteomic_pipeline/extern/xtandem/. The MR-Tandem Python script is Apache licensed and available as part of the Insilicos Cloud Army project at http://ica.svn.sourceforge.net/viewvc/ica/trunk/mr-tandem/. Full documentation and a windows installer that configures MR-Tandem, Python and all necessary packages are available at this same URL.
CONTACT
brian.pratt@insilicos.com
[1] Andrew J Link,et al. Parallel tandem: a program for parallel processing of tandem mass spectra using PVM or MPI and X!Tandem. , 2005, Journal of proteome research.
[2] Robertson Craig,et al. TANDEM: matching proteins with tandem mass spectra. , 2004, Bioinformatics.
[3] Kei-Hoi Cheung,et al. X!!Tandem, an improved method for running X!tandem in parallel on collections of commodity computers. , 2008, Journal of proteome research.