Efficient aerial image simulation on multi-core SIMD CPU

Aerial image simulation is a fundamental problem in advanced lithography for chip fabrication. Because it requires a huge number of mathematical computations, an efficient yet accurate implementation is a necessity. In the literature, GPUs and FPGAs have demonstrated their potential for accelerating aerial image simulation. However, these accelerators were not compared thoroughly against CPUs. In particular, previous works did not carefully tune the CPU-based methods, even though recent CPU architectures have been significantly modified to improve their high-performance computing capabilities. In this paper, we present and discuss several algorithms for aerial image simulation on multi-core SIMD CPUs. Our fastest method achieves up to 73X speedup over the baseline serial approach and, on a single hex-core SIMD CPU, outperforms the state-of-the-art GPU-based approach by up to 2X. We show that the performance of multi-core SIMD CPUs is promising, and that careful CPU tuning is necessary to exploit their computing capabilities.
