A Novel DSP Architecture for Scientific Computing and Deep Learning
Chao Yang | Shuming Chen | Jian Zhang | Zhi Wang | Zhao Lv
[1] Xiaohui Liu, et al. A Composite Model of Wound Segmentation Based on Traditional Methods and Deep Neural Networks, 2018, Comput. Intell. Neurosci.
[2] W. Brown. Synthetic Aperture Radar, 1967, IEEE Transactions on Aerospace and Electronic Systems.
[3] Zenghui Wang, et al. Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review, 2017, Neural Computation.
[4] Jianping Yin, et al. A fast and accurate method for detecting fingerprint reference point, 2016, Neural Computing and Applications.
[5] Miriam Leeser, et al. Division and square root: choosing the right implementation, 1997, IEEE Micro.
[6] S. Walther. A unified algorithm for elementary functions, 1971.
[7] Farid Melgani, et al. Convolutional SVM Networks for Object Detection in UAV Imagery, 2018, IEEE Transactions on Geoscience and Remote Sensing.
[8] Peter Xiaoping Liu, et al. Robust Fuzzy Adaptive Tracking Control for Nonaffine Stochastic Nonlinear Switching Systems, 2018, IEEE Transactions on Cybernetics.
[9] Kevin Barraclough, et al. I and i, 2001, BMJ : British Medical Journal.
[10] Scott A. Mahlke, et al. D2MA: Accelerating coarse-grained data transfer for GPUs, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[11] Pradeep Dubey, et al. Design and Implementation of the Linpack Benchmark for Single and Multi-node Systems Based on Intel® Xeon Phi Coprocessor, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[12] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] Robert A. van de Geijn, et al. Anatomy of high-performance matrix multiplication, 2008, TOMS.
[14] Tianzhou Chen, et al. Less reused filter: improving l2 cache performance via filtering less reused lines, 2009, ICS '09.
[15] Robert A. van de Geijn, et al. Unleashing the high-performance and low-power of multi-core DSPs for general-purpose HPC, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[16] Per Stenström, et al. An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors, 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[17] Shuming Chen, et al. Accelerating the data shuffle operations for FFT algorithms on SIMD DSPs, 2011, 2011 9th IEEE International Conference on ASIC.
[18] Sandip Parikh, et al. High performance DSP for vision, imaging and neural networks, 2016, 2016 IEEE Hot Chips 28 Symposium (HCS).
[19] Randi Thomas. An Architectural Performance Study of the Fast Fourier Transform on Vector IRAM, 2000.
[20] Peter Xiaoping Liu, et al. Adaptive Neural Output-Feedback Control for a Class of Nonlower Triangular Nonlinear Systems With Unmodeled Dynamics, 2018, IEEE Transactions on Neural Networks and Learning Systems.
[21] Yannis Smaragdakis, et al. Adaptive Caches: Effective Shaping of Cache Behavior to Workloads, 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[22] Jungwon Kim, et al. Accelerating LINPACK with MPI-OpenCL on Clusters of Multi-GPU Nodes, 2015, IEEE Transactions on Parallel and Distributed Systems.
[23] Aaas News, et al. Book Reviews, 1893, Buffalo Medical and Surgical Journal.
[24] Shuming Chen, et al. FT-Matrix: A Coordination-Aware Architecture for Signal Processing, 2014, IEEE Micro.
[25] Fabrizio Petrini, et al. Cell Multiprocessor Communication Network: Built for Speed, 2006, IEEE Micro.
[26] Peng Shi, et al. Fuzzy Adaptive Control Design and Discretization for a Class of Nonlinear Uncertain Systems, 2016, IEEE Transactions on Cybernetics.
[27] Javier D. Bruguera, et al. Floating-point multiply-add-fused with reduced latency, 2004, IEEE Transactions on Computers.
[28] Jong Won Park. Multiaccess Memory System for Attached SIMD Computer, 2004, IEEE Transactions on Computers.
[29] Yongmin Kim, et al. Efficient 2D FFT implementation on mediaprocessors, 2003, Parallel Comput.
[30] William J. Dally, et al. Memory access scheduling, 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[31] Hongmin Li, et al. Fuzzy-Approximation-Based Adaptive Output-Feedback Control for Uncertain Nonsmooth Nonlinear Systems, 2018, IEEE Transactions on Fuzzy Systems.