SPICE²: A Spatial, Parallel Architecture for Accelerating the Spice Circuit Simulator
[1] Nachiket Kapre,et al. Packet Switched vs. Time Multiplexed FPGA Overlay Networks , 2006, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[2] Martin Langhammer. Floating point datapath synthesis for FPGAs , 2008, 2008 International Conference on Field Programmable Logic and Applications.
[3] David M. Lewis,et al. A compiled-code hardware accelerator for circuit simulation , 1992, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[4] George Ho,et al. PAPI: A Portable Interface to Hardware Performance Counters , 1999 .
[5] R.W. Dutton,et al. Impact of Scaling on Analog Performance and Associated Modeling Needs , 2006, IEEE Transactions on Electron Devices.
[6] Monica S. Lam,et al. Retrospective: Software Pipelining: An Effective Scheduling Technique for VLIW Machines , 1998 .
[7] Jennifer A. Scott,et al. Stabilized bordered block diagonal forms for parallel sparse solvers , 2005, Parallel Comput..
[8] B. Ramakrishna Rau,et al. Iterative modulo scheduling: an algorithm for software pipelining loops , 1994, MICRO 27.
[9] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley , 2006 .
[10] Florent de Dinechin,et al. When FPGAs are better at floating-point than microprocessors , 2008, FPGA '08.
[11] M. C. Jeng,et al. A robust physical and predictive model for deep-submicrometer MOS circuit simulation , 1993, Proceedings of IEEE Custom Integrated Circuits Conference - CICC '93.
[12] James Demmel,et al. the Parallel Computing Landscape , 2022 .
[13] John Wawrzynek,et al. Stochastic, spatial routing for hypergraphs, trees, and meshes , 2003, FPGA '03.
[14] P. Sadayappan,et al. Parallelization and performance evaluation of circuit simulation on a shared-memory multiprocessor , 1988, ICS '88.
[15] L. Peterson,et al. The design and implementation of a concurrent circuit simulation program for multicomputers , 1993, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[16] Katherine Yelick,et al. Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply , 2004 .
[17] Bradford Nichols,et al. Pthreads programming - a POSIX standard for better multiprocessing , 1996 .
[18] Wei Dong,et al. WavePipe: Parallel transient simulation of analog and digital circuits on multi-core shared-memory machines , 2008, 2008 45th ACM/IEEE Design Automation Conference.
[19] Alex Pothen,et al. Computing the block triangular form of a sparse matrix , 1990, TOMS.
[20] Pat Hanrahan,et al. Brook for GPUs: stream computing on graphics hardware , 2004, SIGGRAPH 2004.
[21] Rajit Manohar,et al. Dataflow Networks for Event Stream Processing , 2004 .
[22] R. M. Tomasulo,et al. An efficient algorithm for exploiting multiple arithmetic units , 1995 .
[23] Timothy A. Davis,et al. The university of Florida sparse matrix collection , 2011, TOMS.
[24] Teresa H. Y. Meng,et al. Towards program optimization through automated analysis of numerical precision , 2010, CGO '10.
[25] Timothy A. Davis,et al. A column approximate minimum degree ordering algorithm , 2000, TOMS.
[26] B. Ramakrishna Rau,et al. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing , 1981, MICRO 14.
[27] John R. Ellis,et al. Bulldog: A Compiler for VLIW Architectures , 1986 .
[28] Chung-Kuan Cheng,et al. Parallel transistor level circuit simulation using domain decomposition methods , 2009, 2009 Asia and South Pacific Design Automation Conference.
[29] Robert A. van de Geijn,et al. SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks , 2008, PPoPP.
[30] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming , 1998 .
[31] André DeHon,et al. The Density Advantage of Configurable Computing , 2000, Computer.
[32] A. Richard Newton,et al. Analysis of performance and convergence issues for circuit simulation , 1989 .
[33] Richard F. Barrett,et al. Matrix Market: a web resource for test matrix collections , 1996, Quality of Numerical Software.
[34] Gerhard Wellein,et al. Have the Vectors the Continuing Ability to Parry the Attack of the Killer Micros , 2006 .
[36] Eric R. Keiter,et al. The Xyce Parallel Electronic Simulator - An Overview , 2000 .
[37] John Wawrzynek,et al. Research accelerator for multiple processors , 2006, 2006 IEEE Hot Chips 18 Symposium (HCS).
[38] Joseph A. Fisher. The VLIW Machine: A Multiprocessor for Compiling Scientific Code , 1984, Computer.
[39] Sunil P. Khatri,et al. Fast circuit simulation on graphics processing units , 2009, 2009 Asia and South Pacific Design Automation Conference.
[40] George A. Constantinides,et al. Automated Precision Analysis: A Polynomial Algebraic Approach , 2010, 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines.
[41] L. Lemaitre,et al. Extensions to Verilog-A to support compact device modeling , 2003, Proceedings of the 2003 IEEE International Workshop on Behavioral Modeling and Simulation.
[42] Prawat Nagvajara,et al. Sparse LU Decomposition using FPGA , 2008 .
[43] Zhao Li,et al. An efficiently preconditioned GMRES method for fast parasitic-sensitive deep-submicron VLSI circuit simulation , 2005, Design, Automation and Test in Europe.
[44] Guy Lemieux,et al. Towards reliable 5Gbps wave-pipelined and 3Gbps surfing interconnect in 65nm FPGAs , 2009, FPGA '09.
[45] John Wawrzynek,et al. Design automation for streaming systems , 2005 .
[46] Yoshitaka Maekawa,et al. Near Fine Grain Parallel Processing of Circuit Simulation Using Direct Method , 1994 .
[47] Reiji Suda,et al. Implementation of sparta, a highly parallel circuit simulator by the preconditioned Jacobi method, on a distributed memory machine , 1995, ICS '95.
[48] David A. Patterson,et al. Computer Architecture - A Quantitative Approach, 5th Edition , 1996 .
[49] Ralph Wittig,et al. Performance and power of cache-based reconfigurable computing , 2009, FPGA '09.
[50] David E. Culler,et al. Monsoon: an explicit token-store architecture , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[51] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[52] Gung-Chung Yang. PARASPICE: a parallel circuit simulator for shared-memory multiprocessors , 1991, DAC '90.
[53] Albert E. Ruehli,et al. The modified nodal approach to network analysis , 1975 .
[54] Barbara M. Chapman,et al. OpenMP Implementation of SPICE3 Circuit Simulator , 2007, International Journal of Parallel Programming.
[55] Yasser Y. Hanafy,et al. Massive parallelization of SPICE device model evaluation on GPU-based SIMD architectures , 2008, IFMT '08.
[56] Bo Wan,et al. MCAST: an abstract-syntax-tree based model compiler for circuit simulation , 2003, Proceedings of the IEEE 2003 Custom Integrated Circuits Conference, 2003.
[57] William Gropp,et al. Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries , 1997, SciTools.
[58] L. Higbie. Optimal Parallel Triangulation of a Sparse Matrix , 1979 .
[59] Michael Garland,et al. Efficient Sparse Matrix-Vector Multiplication on CUDA , 2008 .
[60] Nachiket Kapre,et al. GraphStep: A System Architecture for Sparse-Graph Algorithms , 2006, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[61] Robert W. Floyd. The paradigms of programming , 2007 .
[62] David Bryan,et al. Combinational profiles of sequential benchmark circuits , 1989, IEEE International Symposium on Circuits and Systems.
[63] J. Gilbert,et al. Sparse Partial Pivoting in Time Proportional to Arithmetic Operations , 1986 .
[64] Martin Langhammer,et al. FPGA Floating Point Datapath Compiler , 2009, 2009 17th IEEE Symposium on Field Programmable Custom Computing Machines.
[65] Nachiket Kapre,et al. Accelerating SPICE Model-Evaluation using FPGAs , 2009, 2009 17th IEEE Symposium on Field Programmable Custom Computing Machines.
[66] G. Gildenblat,et al. PSP: An Advanced Surface-Potential-Based MOSFET Model for Circuit Simulation , 2006, IEEE Transactions on Electron Devices.
[67] André DeHon,et al. Compact, multilayer layout for butterfly fat-tree , 2000, SPAA '00.
[68] C. A. R. Hoare,et al. Communicating sequential processes , 1978, CACM.
[69] Ausif Mahmood,et al. Parallel SOLVE for direct circuit simulation on a transputer array , 1996, Proceedings of 3rd International Conference on High Performance Computing (HiPC).
[70] Prathima Agrawal,et al. PACE: A Multiprocessor System for VLSI Circuit Simulation , 1993, PPSC.
[71] David J. Frank,et al. Power-constrained CMOS scaling limits , 2002, IBM J. Res. Dev..
[72] Resve A. Saleh,et al. Parallel waveform-Newton algorithms for circuit simulation , 1992, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[73] Henk A. van der Vorst,et al. A parallel linear system solver for circuit simulation problems , 2000, Numer. Linear Algebra Appl..
[74] Heather M. Quinn,et al. Vision for cross-layer optimization to address the dual challenges of energy and reliability , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[75] 吉野 智興,et al. Programmer's guide , 1993 .
[76] Fujio Yamamoto,et al. Vectorized LU Decomposition Algorithms for Large-Scale Circuit Simulation , 1985, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[77] Andrew B. Kahng,et al. Improved algorithms for hypergraph bipartitioning , 2000, ASP-DAC '00.
[78] Jeremy Johnson,et al. Power flow computation using field programmable gate arrays , 2007 .
[79] Marcus Van Ierssel. Circuit Simulation on a Field Programmable Accelerator , 1995 .
[80] Qiang Wang,et al. Automated field-programmable compute accelerator design using partial evaluation , 1997, Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.97TB100186).
[81] Nachiket Kapre,et al. Optimistic Parallelization of Floating-Point Accumulation , 2007, 18th IEEE Symposium on Computer Arithmetic (ARITH '07).
[82] Sotirios G. Ziavras,et al. Exploiting mixed-mode parallelism for matrix operations on the HERA architecture through reconfiguration , 2006 .
[83] Youn-Long Lin,et al. Recent developments in high-level synthesis , 1997, TODE.
[84] Daniel D. Gajski,et al. High-Level Synthesis: Introduction to Chip and System Design , 1992 .
[85] Sudhakar Yalamanchili,et al. Interconnection Networks: An Engineering Approach , 2002 .
[86] Sotirios G. Ziavras,et al. Parallel LU factorization of sparse matrices on FPGA-based configurable computing engines: Research Articles , 2004 .
[87] Mansun Chan,et al. The engineering of BSIM for the nano-technology era and beyond , 2002 .
[88] Ekanathan Palamadai Natarajan,et al. KLU - A High Performance Sparse Linear Solver for Circuit Simulation Problems , 2005 .
[89] André DeHon,et al. Floating-point sparse matrix-vector multiply for FPGAs , 2005, FPGA '05.
[90] Paul Chow,et al. Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA 2000, Monterey, CA, USA, February 10-11, 2000 , 2000, FPGA.
[91] Srinivas Devadas,et al. Algorithms for hardware allocation in data path synthesis , 1989, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[92] Goichi Yokomizo,et al. A parallel and accelerated circuit simulator with precise accuracy , 2002, Proceedings of ASP-DAC/VLSI Design 2002. 7th Asia and South Pacific Design Automation Conference and 15th International Conference on VLSI Design.
[93] David A. Patterson,et al. Computer Architecture - A Quantitative Approach (4. ed.) , 2007 .
[94] Sani R. Nassif,et al. MAPS: multi-algorithm parallel circuit simulation , 2008, ICCAD 2008.
[95] Zhao Li,et al. SILCA: SPICE-accurate iterative linear-centric analysis for efficient time-domain Simulation of VLSI circuits with strong parasitic couplings , 2006, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[96] André DeHon,et al. Hardware-assisted simulated annealing with application for fast FPGA placement , 2003, FPGA '03.
[97] Timothy A. Davis,et al. Algorithm 907 , 2010 .
[98] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[99] Kenneth B. Kent,et al. VPR 5.0: FPGA CAD and architecture exploration tools with single-driver routing, heterogeneity and process scaling , 2011, TRETS.
[100] Anant Agarwal,et al. Logic emulation with virtual wires , 1997, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[101] Roy L. Russo,et al. On a Pin Versus Block Relationship For Partitions of Logic Graphs , 1971, IEEE Transactions on Computers.
[102] Gi-Joon Nam,et al. ISPD 2009 clock network synthesis contest , 2009, ISPD '09.
[103] David M. Lewis. A programmable hardware accelerator for compiled electrical simulation , 1988, 25th ACM/IEEE Design Automation Conference, Proceedings 1988.
[104] Philipp Birken,et al. Numerical Linear Algebra , 2011, Encyclopedia of Parallel Computing.
[105] Jennifer A. Scott,et al. A parallel direct solver for large sparse highly unsymmetric linear systems , 2004, TOMS.
[106] Nikil Mehta,et al. Time-Multiplexed FPGA Overlay Networks on Chip , 2006 .
[107] Saurabh Dighe,et al. The 48-core SCC Processor: the Programmer's View , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[108] H. Diab,et al. An FPGA-based MOS circuit simulator , 2005, 48th Midwest Symposium on Circuits and Systems, 2005.
[109] Andrei Vladimirescu,et al. A Vector Hardware Accelerator with Circuit Simulation Emphasis , 1987, 24th ACM/IEEE Design Automation Conference.
[110] Christoforos E. Kozyrakis,et al. RAMP: Research Accelerator for Multiple Processors , 2007, IEEE Micro.