Cost Minimization with HPDFG and Data Mining for Heterogeneous DSP

Cost minimization and execution-time reduction have become central concerns in today’s real-time embedded systems. For DSP (Digital Signal Processing) applications running on embedded systems, loops are the most critical part for performance optimization. To optimize loop iteration patterns, we need to schedule the order in which the tasks of a loop execute. Because task execution times are uncertain, we model them as random variables and propose a novel graph model, HPDFG (Heterogeneous Probabilistic Data-Flow Graph), to represent DSP applications on embedded systems. We also propose a novel algorithm, LSHAPE, that minimizes cost while satisfying timing constraints. First, we use data mining methods to estimate the probability distributions of the execution-time variables. Second, we rotate the loops in the application to explore different possible execution patterns. Finally, we combine list scheduling and dynamic programming to generate a near-optimal task allocation and core-mode assignment. Experimental results demonstrate the effectiveness of our algorithm and show that it handles loops efficiently.
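
To make the problem setting concrete, the following is a minimal Python sketch of the underlying idea, not the LSHAPE algorithm itself. It estimates a discrete execution-time distribution from observed samples (a simple frequency count standing in for the data mining step), then picks the cheapest core-mode assignment whose total execution time meets a deadline with a required probability. Everything in it is an assumption made for illustration: the function names, the sample data, the costs, the restriction to a simple chain of independent tasks, and the use of exhaustive search in place of the list scheduling and dynamic programming described above. Loop rotation is not modelled.

```python
from collections import Counter
from itertools import product


def empirical_pmf(samples):
    """Estimate a discrete execution-time distribution from observed samples."""
    counts = Counter(samples)
    n = len(samples)
    return {t: c / n for t, c in counts.items()}


def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for ta, pa in pmf_a.items():
        for tb, pb in pmf_b.items():
            out[ta + tb] = out.get(ta + tb, 0.0) + pa * pb
    return out


def prob_within(pmf, deadline):
    """Probability that the total execution time does not exceed the deadline."""
    return sum(p for t, p in pmf.items() if t <= deadline)


def cheapest_assignment(tasks, deadline, confidence):
    """Exhaustively search core-mode assignments for a chain of tasks.

    tasks: one list per task, holding (mode_name, pmf, cost) candidates.
    Returns (total_cost, [mode_name, ...]) or None if nothing is feasible.
    """
    best = None
    for choice in product(*tasks):
        total = {0: 1.0}  # distribution of the empty sum
        for _, pmf, _ in choice:
            total = convolve(total, pmf)
        if prob_within(total, deadline) >= confidence:
            cost = sum(c for _, _, c in choice)
            if best is None or cost < best[0]:
                best = (cost, [name for name, _, _ in choice])
    return best


# Hypothetical profiling samples (time units) and per-mode costs.
task_a = [
    ("fast", empirical_pmf([1, 1, 1, 2]), 10),
    ("slow", empirical_pmf([2, 2, 3, 3]), 4),
]
task_b = [
    ("fast", empirical_pmf([1, 2, 2, 2]), 8),
    ("slow", empirical_pmf([3, 3, 3, 4]), 3),
]

result = cheapest_assignment([task_a, task_b], deadline=5, confidence=0.9)
print(result)
```

With these made-up numbers, running both tasks in the slow mode is cheapest but meets the deadline with probability of only 0.375, so the search settles on a mixed assignment. Exhaustive search grows exponentially with the number of tasks, which is why a practical method needs something like dynamic programming.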