Locality and Parallelism Optimization for Dynamic Programming Algorithm in Bioinformatics
[1] Tao Li, et al. Workload characterization of bioinformatics applications, 2005, 13th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems.
[2] Donald Yeung, et al. BioBench: A Benchmark Suite of Bioinformatics Applications, 2005, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2005).
[3] Katherine Yelick, et al. UPC Language Specifications V1.1.1, 2003.
[4] Sally A. McKee, et al. Hitting the memory wall: implications of the obvious, 1995, CARN.
[5] Greg J. Regnier, et al. The Virtual Interface Architecture, 2002, IEEE Micro.
[6] Charles E. Leiserson, et al. Cache-Oblivious Algorithms, 2003, CIAC.
[7] Sartaj Sahni, et al. A blocked all-pairs shortest-paths algorithm, 2003, ACM J. Exp. Algorithmics.
[8] Kathryn S. McKinley, et al. Tile size selection using cache organization and data layout, 1995, PLDI '95.
[9] Katherine A. Yelick, et al. Titanium: A High-performance Java Dialect, 1998, Concurr. Pract. Exp.
[10] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.
[11] Victor Eijkhout, et al. Self-Adapting Linear Algebra Algorithms and Software, 2005, Proceedings of the IEEE.
[12] Jingling Xue. Communication-Minimal Tiling of Uniform Dependence Loops, 1997, J. Parallel Distributed Comput.
[13] Steven G. Johnson, et al. The Design and Implementation of FFTW3, 2005, Proceedings of the IEEE.
[14] David F. Heidel, et al. An Overview of the BlueGene/L Supercomputer, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[15] M. S. Waterman, et al. Identification of common molecular subsequences, 1981, Journal of Molecular Biology.
[16] Viktor K. Prasanna, et al. Optimizing graph algorithms for improved cache performance, 2004, Proceedings 16th International Parallel and Distributed Processing Symposium.
[17] Alan George, et al. Dynamic Programming on a Shared-Memory Multiprocessor, 1993, Parallel Comput.
[18] Bruce A. Shapiro, et al. Optimization of an RNA Folding Algorithm for Parallel Architectures, 1998, Parallel Comput.
[19] Wu-chun Feng, et al. The Quadrics Network: High-Performance Clustering Technology, 2002, IEEE Micro.
[20] Charles L. Seitz, et al. Myrinet: A Gigabit-per-Second Local Area Network, 1995, IEEE Micro.
[21] Robert W. Numrich, et al. Co-array Fortran for parallel programming, 1998, FORF.
[22] Francisco Almeida, et al. Optimal tiling for the RNA base pairing problem, 2002, SPAA '02.
[23] Sanjay V. Rajopadhye, et al. Optimal Orthogonal Tiling of 2-D Iterations, 1997, J. Parallel Distributed Comput.
[24] Charles L. Seitz, et al. Myrinet: A Gigabit-per-Second Local, 1995.
[25] Matteo Frigo, et al. Cache-oblivious algorithms, 1999, 40th Annual Symposium on Foundations of Computer Science.
[26] Jingling Xue, et al. Reuse-Driven Tiling for Improving Data Locality, 1998, International Journal of Parallel Programming.
[27] Michael Wolfe, et al. Iteration Space Tiling for Memory Hierarchies, 1987, PPSC.
[28] David A. Bader, et al. BioPerf: a benchmark suite to evaluate high-performance computer architecture on bioinformatics applications, 2005, Proceedings of the IEEE International Workload Characterization Symposium.
[29] Christian N. S. Pedersen, et al. Fast evaluation of internal loops in RNA secondary structure prediction, 1999, Bioinform.
[30] Franz Franchetti, et al. SPIRAL: Code Generation for DSP Transforms, 2005, Proceedings of the IEEE.
[31] Hiroshi Tezuka, et al. The design and implementation of zero copy MPI using commodity hardware with a high performance network, 1998, ICS '98.
[32] Jingling Xue, et al. On Tiling as a Loop Transformation, 1997, Parallel Process. Lett.
[33] Monica S. Lam, et al. Maximizing Multiprocessor Performance with the SUIF Compiler, 1996, Digit. Tech. J.
[34] Zvi Galil, et al. Parallel Algorithms for Dynamic Programming Recurrences with More than O(1) Dependency, 1994, J. Parallel Distributed Comput.
[35] Sanjay V. Rajopadhye, et al. Optimal semi-oblique tiling, 2001, SPAA '01.
[36] Dhabaleswar K. Panda, et al. High performance RDMA-based MPI implementation over InfiniBand, 2003, ICS.
[37] D. Martin Swany, et al. Transformations to Parallel Codes for Communication-Computation Overlap, 2005, ACM/IEEE SC 2005 Conference (SC'05).
[38] H. T. Kung, et al. Direct VLSI Implementation of Combinatorial Algorithms, 1979.
[39] Lin Xu, et al. An experimental study of optimizing bioinformatics applications, 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[40] Jingling Xue, et al. Unimodular Transformations of Non-Perfectly Nested Loops, 1997, Parallel Comput.
[41] J. Ramanujam, et al. Tiling multidimensional iteration spaces for nonshared memory machines, 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).