Conclusion and Future Outlook

This book targets multimedia systems that operate under high-throughput, resource, and power constraints. It discusses efficient software- and application-level techniques together with hardware- and architecture-level designs for multimedia (specifically video) systems. The main aim of these techniques is to maximize the system's throughput per watt while accounting for modern design challenges and methodologies. The challenges addressed include parallelizing multimedia applications on possibly heterogeneous systems, balancing load across many-core and customized nodes, budgeting resources (number of cores and power), and efficiently designing the memory architecture of the multimedia system. In a broader perspective, these problems collectively represent the power wall, or dark silicon, challenge for next-generation video processing systems.
