Financial Modeling on the Cell Broadband Engine

High performance computing is critical for financial markets, where analysts seek to accelerate complex computations such as pricing engines to maintain a competitive edge. In this paper we investigate the performance of financial workloads on the Sony-Toshiba-IBM Cell Broadband Engine (Cell/B.E.), a heterogeneous multicore chip architected for intensive gaming applications and high performance computing. We analyze the use of Monte Carlo techniques for financial workloads and design efficient parallel implementations of several high performance pseudo- and quasi-random number generators, as well as normalization techniques. Our implementation of the Mersenne Twister pseudo-random number generator outperforms current Intel and AMD architectures by over an order of magnitude. Using these new routines, we optimize European option (EO) and collateralized debt obligation (CDO) pricing algorithms. Our Cell-optimized EO pricing achieves a speedup of over 2 compared with an implementation using the RapidMind SDK for Cell. Compared with GPU implementations on the NVIDIA GeForce 8800, it achieves a speedup of 1.26 over the RapidMind SDK for GPU and a speedup of 1.51 over CUDA. Our detailed analyses and performance results demonstrate that the Cell/B.E. processor is well suited for financial workloads and Monte Carlo simulation.
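To make the pipeline named in the abstract concrete (uniform random numbers, normalization, then Monte Carlo pricing of a European option), the following is a minimal scalar sketch in Python. It is not the authors' Cell/B.E. implementation and makes no attempt at SIMD or multicore parallelism. It assumes a European call under geometric Brownian motion, uses Python's built-in `random.Random` (which is based on the Mersenne Twister) as the uniform generator, and uses the Box-Muller transform as one possible normalization technique. The function names, parameter names, and sample parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def box_muller(rng):
    """Turn two uniform deviates into one standard normal deviate (Box-Muller)."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so that log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def mc_european_call(s0, k, r, sigma, t, n_paths, seed=12345):
    """Monte Carlo estimate of a European call price (illustrative sketch).

    Assumes the underlying follows geometric Brownian motion with constant
    risk-free rate r and volatility sigma; all names are hypothetical.
    """
    rng = random.Random(seed)  # CPython's generator is a Mersenne Twister
    drift = (r - 0.5 * sigma * sigma) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        z = box_muller(rng)
        s_t = s0 * math.exp(drift + vol * z)  # simulated terminal price
        total += max(s_t - k, 0.0)            # call payoff at expiry
    return math.exp(-r * t) * total / n_paths  # discounted mean payoff


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes call price, used only as a sanity check."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma * sigma) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)


if __name__ == "__main__":
    # Sample parameters are assumptions for illustration only.
    estimate = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=100_000)
    reference = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
    print(f"Monte Carlo: {estimate:.4f}  Black-Scholes: {reference:.4f}")
```

Each path is independent of the others, which is the property that makes this kind of simulation a natural fit for parallel hardware: the loop body can be distributed across cores or vector lanes as long as each worker draws from its own random number stream.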
