An Efficient Parallelization Technique for High Throughput FFT-ASIPs

Fast Fourier Transformation (FFT) and its inverse (IFFT) are used in Orthogonal Frequency Division Multiplexing (OFDM) systems for data (de)modulation. The transformations are the kernel tasks in an OFDM implementation, and are the most processing-intensive ones. Recent trends in the electronic consumer market require OFDM implementations to be flexible, making a trade-off between area, energy-efficiency, flexibility and timing a necessity. This has spurred the development of Application-Specific Instruction-Set Processors (ASIPs) for FFT processing. Parallelization is an architectural parameter that significantly influences design goals. This paper presents an analysis of the efficiency of parallelization techniques for an FFT-ASIP. It is shown that existing techniques are inefficient for high-throughput applications such as Ultra Wideband (UWB), because of memory bottlenecks. Therefore, an interleaved execution technique which exploits temporal parallelism is proposed. With this technique, it is possible to meet the throughput requirement of UWB (409.6 Msamples/s) with only 4 non-trivial butterfly units for an ASIP that runs at 400 MHz.

There is a substantial number of publications on FFT parallelization; however, the focus has been on the general-purpose, parallel and distributed computing domains. The recent emergence of methodologies and tools for architecture exploration and automatic implementation, such as LISA-based tools (5), enables this kind of analysis to be conducted for FFT-ASIPs as well. The analysis shows that existing parallelization techniques at the instruction and data level cannot efficiently meet high-throughput requirements such as 409.6 Msamples/s for UWB. This is caused by memory bottlenecks. Therefore, we propose the use of an interleaved execution technique which is capable of hiding some of the stage execution cycles by exploiting temporal parallelism. In this analysis, UWB was selected to demonstrate that it is feasible to implement instruction-set processors which can efficiently support, among other things, the computation of high-rate FFTs.
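To give a feel for how tight the UWB requirement is, the following back-of-the-envelope sketch estimates the cycle budget per FFT. It is not taken from the paper's analysis. It assumes a 128-point FFT and a 312.5 ns OFDM symbol period (the multiband OFDM UWB parameters, which this section does not state) and a plain radix-2 decomposition. The function names are hypothetical, and the butterfly count is an idealized bound that ignores memory accesses and pipeline overhead.

```python
# Rough cycle-budget estimate for a high-rate FFT on a clocked processor.
# ASSUMPTIONS (not stated in this section): 128-point FFT, 312.5 ns symbol
# period, radix-2 decomposition, one butterfly per unit per cycle.


def throughput_msamples(fft_size: int, symbol_period_ns: float) -> float:
    """Required sample throughput in Msamples/s."""
    return fft_size / symbol_period_ns * 1000.0


def cycle_budget(symbol_period_ns: float, clock_mhz: float) -> float:
    """Clock cycles available to complete one FFT per symbol."""
    return symbol_period_ns * clock_mhz / 1000.0


def radix2_butterflies(fft_size: int) -> int:
    """Total radix-2 butterflies: log2(N) stages of N/2 butterflies each."""
    stages = fft_size.bit_length() - 1  # exact log2 for powers of two
    return stages * (fft_size // 2)


def ideal_cycles(fft_size: int, butterfly_units: int) -> int:
    """Idealized compute cycles, rounded up, with no memory stalls."""
    total = radix2_butterflies(fft_size)
    return -(-total // butterfly_units)  # ceiling division


if __name__ == "__main__":
    n, period_ns, f_mhz, units = 128, 312.5, 400.0, 4
    print("required throughput (Msamples/s):", throughput_msamples(n, period_ns))
    print("cycle budget per FFT:", cycle_budget(period_ns, f_mhz))
    print("ideal butterfly cycles:", ideal_cycles(n, units))
```

Under these assumptions, the budget at 400 MHz is 125 cycles per transform, while four butterfly units need at least 112 cycles for the butterflies alone. The small remaining margin suggests why cycles lost to memory accesses matter so much at this throughput.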