Perception-Oriented 3D Rendering Approximation for Modern Graphics Processors

Anisotropic filtering on modern rasterization-based GPUs gives users an extremely authentic visual experience, but its large texture data requirement significantly limits the performance and energy efficiency of the 3D rendering process. To improve 3D rendering efficiency, we build a bridge between the anisotropic filtering process and the human visual system by analyzing users' perception of image quality. We discover that anisotropic filtering does not affect user-perceived image quality at every pixel. This motivates us to approximate the anisotropic filtering process for pixels where its effect is not perceivable, improving overall 3D rendering performance without damaging user experience. To achieve this goal, we propose a perception-oriented runtime approximation model for 3D rendering that leverages the inner relationship between anisotropic and isotropic filtering. We also provide a low-cost texture unit design that enables this approximation. Extensive evaluation on modern 3D games demonstrates that, under a conservative tuning point, our design achieves an average speedup of 17% for overall 3D rendering along with an 11% reduction in total GPU energy, without image quality loss visible to users. It also reduces texture filtering latency by an average of 29%. Additionally, it creates a unique perception-based tuning space for performance-quality tradeoffs on graphics processors.
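
The idea of falling back from anisotropic to isotropic filtering on a per-pixel basis can be illustrated with a minimal sketch. It is not the paper's model: the function name `filter_plan` and the `ratio_threshold` parameter are hypothetical, and a fixed threshold on the footprint's anisotropy ratio stands in for whatever perception-based predictor the authors actually use. The sample count and level-of-detail formulas follow the common formulation of anisotropic filtering, in which the number of probes grows with the ratio of the major to the minor axis of the pixel footprint in texture space.

```python
import math

def filter_plan(dudx, dvdx, dudy, dvdy, max_aniso=16, ratio_threshold=2.0):
    """Choose a texture filtering mode for one pixel (illustrative only).

    The arguments are the screen-space derivatives of the texel coordinates.
    ratio_threshold is a hypothetical tuning knob, not a value from the paper:
    pixels whose footprint is at most this elongated use isotropic filtering.
    """
    # Lengths of the pixel footprint along the two screen axes, in texels.
    px = math.hypot(dudx, dvdx)
    py = math.hypot(dudy, dvdy)
    p_max = max(px, py, 1e-12)        # guard against a degenerate footprint
    p_min = max(min(px, py), 1e-12)
    ratio = p_max / p_min

    if ratio <= ratio_threshold:
        # Approximate: a single isotropic (trilinear) lookup at the coarser LOD.
        return {"mode": "isotropic", "samples": 1, "lod": math.log2(p_max)}

    # Full anisotropic filtering: several probes along the major axis,
    # each taken from a finer mip level.
    n = min(math.ceil(ratio), max_aniso)
    return {"mode": "anisotropic", "samples": n, "lod": math.log2(p_max / n)}
```

Under these assumptions, a nearly square footprint such as `filter_plan(1.0, 0.0, 0.0, 1.0)` takes the single-sample isotropic path, while a strongly elongated one such as `filter_plan(8.0, 0.0, 0.0, 1.0)` keeps anisotropic filtering with eight probes. Raising `ratio_threshold` moves more pixels onto the cheap path, which is the kind of performance-quality tuning space the abstract describes.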
