Architecture Exploration for Efficient Data Transfer and Storage in Data-Parallel Applications
[1] Scott A. Mahlke, et al. PICO-NPA: High-Level Synthesis of Nonprogrammable Hardware Accelerators, 2002, J. VLSI Signal Process.
[2] Sarvapali D. Ramchurn, et al. An Anytime Algorithm for Optimal Coalition Structure Generation, 2014, J. Artif. Intell. Res.
[3] Pierre Boulet, et al. High Level Loop Transformations for Systematic Signal Processing Embedded Applications, 2008, SAMOS.
[4] Ken Kennedy, et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution, 1993, LCPC.
[5] Erik Brockmeyer, et al. Data Access and Storage Management for Embedded Programmable Processors, 2002, Springer US.
[6] Stamatis Vassiliadis, et al. Embedded Computer Systems: Architectures, Modeling, and Simulation 5th International Workshop, SAMOS 2005, Samos, Greece, July 18-20, 2005, Proceedings, 2005, International Conference / Workshop on Embedded Computer Systems: Architectures, Modeling and Simulation.
[7] Jingling Xue, et al. Loop Tiling for Parallelism, 2000, Kluwer International Series in Engineering and Computer Science.
[8] H. T. Kung. Why systolic architectures?, 1982, Computer.
[9] Erik Brockmeyer, et al. Data and memory optimization techniques for embedded systems, 2001, TODE.
[10] Jeanny Hérault, et al. Modeling Visual Perception for Image Processing, 2007, IWANN.
[11] Rosilde Corvino. Design Space Exploration for data-dominated image applications with non-affine array references, 2009.
[12] Hiroshi Nakamura, et al. Augmenting Loop Tiling with Data Alignment for Improved Cache Performance, 1999, IEEE Trans. Computers.
[13] Keshav Pingali, et al. Synthesizing transformations for locality enhancement of imperfectly-nested loop nests, 2000.
[14] Pierre Boulet, et al. Array-OL with delays, a domain specific specification language for multidimensional intensive signal processing, 2010, Multidimens. Syst. Signal Process.
[15] Pierre Boulet, et al. Projection of the Array-OL specification language onto the Kahn process network computation model, 2005, 8th International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN'05).
[16] Francky Catthoor, et al. Incremental hierarchical memory size estimation for steering of loop transformations, 2007, TODE.
[17] Surendra Byna, et al. Hiding I/O latency with pre-execution prefetching for parallel applications, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[18] Jürgen Teich, et al. Parallelization Approaches for Hardware Accelerators - Loop Unrolling Versus Loop Partitioning, 2009, ARCS.
[19] Francky Catthoor, et al. Storage Estimation and Design Space Exploration Methodologies for the Memory Management of Signal Processing Applications, 2008, J. Signal Process. Syst.
[20] David B. Whalley, et al. Fast, accurate design space exploration of embedded systems memory configurations, 2007, SAC '07.
[21] Yongmin Kim, et al. Data Cache and Direct Memory Access in Programming Mediaprocessors, 2001, IEEE Micro.
[22] Jean-Luc Dekeyser, et al. A Model-Driven Design Framework for Massively Parallel Embedded Systems, 2011, TECS.
[23] Shambhu J. Upadhyaya, et al. Defect Analysis and Defect Tolerant Design of Multi-port SRAMs, 2008, J. Electron. Test.
[24] Jeanny Hérault, et al. Efficient Demosaicing Through Recursive Filtering, 2007, 2007 IEEE International Conference on Image Processing.
[25] Vincenzo Catania, et al. Efficient design space exploration for application specific systems-on-a-chip, 2007, J. Syst. Archit.
[26] Alberto Prieto, et al. Computational and ambient intelligence, 2009, Neurocomputing.