Enabling Approximate Storage through Lossy Media Data Compression

As compute capabilities continue to scale, memory capacity and bandwidth continue to lag behind. Data compression is an effective approach to improving memory capacity and bandwidth; but prior works have focused primarily on lossless compression and retrieval for data-critical applications. While data integrity is critical to many applications, a growing number of memory-bound applications contain non-critical program data; this property can be exploited to improve memory performance and capacity. In this work, we enable approximate storage to exploit the inherent error resilience of multimedia and other applications, through lossy compression, to improve storage without adversely affecting application functionality. We realize an incidence of approximate storage through modification of the Base-Delta-Immediate algorithm to create a lossy memory compression algorithm. We conclude that approximate storage is an effective and promising technique for improving effective memory capacity and decreasing the energy cost of memory access; results show that memory capacity can be increased by 15% with only 3% loss in image recognition application quality.

[1]  Nam Sung Kim,et al.  Decoupled Control and Data Processing for Approximate Near-Threshold Voltage Computing , 2015, IEEE Micro.

[2]  Nikil D. Dutt,et al.  Exploiting Partially-Forgetful Memories for Approximate Computing , 2015, IEEE Embedded Systems Letters.

[3]  Onur Mutlu,et al.  Base-delta-immediate compression: Practical data compression for on-chip caches , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[4]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[5]  Mircea R. Stan,et al.  Advances and Future Prospects of Spin-Transfer Torque Random Access Memory , 2010, IEEE Transactions on Magnetics.

[6]  P.A. Ruetz,et al.  Video compression makes big gains , 1991, IEEE Spectrum.

[7]  Mario Badr,et al.  Load Value Approximation , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[8]  M. Valero,et al.  Fuzzy memoization for floating-point multimedia applications , 2005, IEEE Transactions on Computers.

[9]  Natalie D. Enright Jerger,et al.  Doppelgänger: A cache for approximate computing , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[10]  Subhasish Mitra,et al.  ERSA: Error Resilient System Architecture for probabilistic applications , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).

[11]  Aaas News,et al.  Book Reviews , 1893, Buffalo Medical and Surgical Journal.

[12]  Robert H. Dennard,et al.  Design of ion-implanted MOSFET's with very small physical dimensions , 2007 .

[13]  Pradeep Dubey,et al.  Convergence of Recognition, Mining, and Synthesis Workloads and Its Implications , 2008, Proceedings of the IEEE.

[14]  Richard Veras,et al.  RAIDR: Retention-aware intelligent DRAM refresh , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[15]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[16]  David A. Wood,et al.  Frequent Pattern Compression: A Significance-Based Compression Scheme for L2 Caches , 2004 .

[17]  Qiang Xu,et al.  ApproxLUT: A novel approximate lookup table-based accelerator , 2017, 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).

[18]  D. Geer,et al.  Chip makers turn to multicore processors , 2005, Computer.

[19]  Kaushik Roy,et al.  Analysis and characterization of inherent application resilience for approximate computing , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).

[20]  Thierry Moreau,et al.  A Taxonomy of General Purpose Approximate Computing Techniques , 2018, IEEE Embedded Systems Letters.

[21]  Neil Genzlinger A. and Q , 2006 .

[22]  Song Liu,et al.  Flikker: saving DRAM refresh-power through critical data partitioning , 2011, ASPLOS XVI.

[23]  Sparsh Mittal,et al.  A Survey of Techniques for Approximate Computing , 2016, ACM Comput. Surv..

[24]  Arnab Raha,et al.  Synergistic Approximation of Computation and Memory Subsystems for Error-Resilient Applications , 2017, IEEE Embedded Systems Letters.

[25]  Anand Raghunathan,et al.  Approximate memory compression for energy-efficiency , 2017, 2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).

[26]  David Blaauw,et al.  Approximate SRAMs With Dynamic Energy-Quality Management , 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[27]  Jason Schlessman,et al.  Reconfigurable SRAM Architecture With Spatial Voltage Scaling for Low Power Mobile Multimedia Applications , 2011, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[28]  Kaushik Roy,et al.  Dynamic Bit-Width Adaptation in DCT: An Approach to Trade Off Image Quality and Computation Energy , 2010, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[29]  Thierry Moreau,et al.  QAPPA : A Framework for Navigating Quality-Energy Tradeoffs with Arbitrary Quantization , 2017 .

[30]  P. K. Dubey,et al.  Recognition, Mining and Synthesis Moves Comp uters to the Era of Tera , 2005 .

[31]  M. Ekman,et al.  A robust main-memory compression scheme , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[32]  Kaushik Roy,et al.  Quality programmable vector processors for approximate computing , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[33]  Lotfi A. Zadeh,et al.  Fuzzy logic, neural networks, and soft computing , 1993, CACM.

[34]  Shekhar Borkar How to stop interconnects from hindering the future of computing! , 2013, 2013 Optical Interconnects Conference.

[35]  Jack J. Dongarra,et al.  Exascale computing and big data , 2015, Commun. ACM.

[36]  Melvin A. Breuer,et al.  Hardware that produces bounded rather than exact results , 2010, Design Automation Conference.

[37]  Qiang Xu,et al.  ApproxMA: Approximate Memory Access for Dynamic Precision Scaling , 2015, ACM Great Lakes Symposium on VLSI.

[38]  W. Marsden I and J , 2012 .

[39]  Brian Rogers,et al.  Scaling the bandwidth wall: challenges in and avenues for CMP scaling , 2009, ISCA '09.

[40]  Ping Tak Peter Tang,et al.  Table-lookup algorithms for elementary functions and their error analysis , 1991, [1991] Proceedings 10th IEEE Symposium on Computer Arithmetic.

[41]  Hadi Esmaeilzadeh,et al.  AxBench: A Multiplatform Benchmark Suite for Approximate Computing , 2017, IEEE Design & Test.

[42]  Massimo Alioto,et al.  Energy-quality scalable adaptive VLSI circuits and systems beyond approximate computing , 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.

[43]  Dave Evans,et al.  How the Next Evolution of the Internet Is Changing Everything , 2011 .

[44]  Natalie D. Enright Jerger,et al.  The Bunker Cache for spatio-value approximation , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[45]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[46]  Onur Mutlu,et al.  Linearly compressed pages: A low-complexity, low-latency main memory compression framework , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[47]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[48]  David A. Wood,et al.  Adaptive cache compression for high-performance processors , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[49]  Xin Xu,et al.  Exploring Data-Level Error Tolerance in High-Performance Solid-State Drives , 2015, IEEE Transactions on Reliability.

[50]  Xi Chen,et al.  C-Pack: A High-Performance Microprocessor Cache Compression Algorithm , 2010, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[51]  Donald Yeung,et al.  Application-Level Correctness and its Impact on Fault Tolerance , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.

[52]  Lingamneni Avinash,et al.  Sustaining moore's law in embedded computing through probabilistic and approximate design: retrospects and prospects , 2009, CASES '09.

[53]  Dan Grossman,et al.  EnerJ: approximate data types for safe and general low-power computation , 2011, PLDI '11.

[54]  Luis Ceze,et al.  Architecture support for disciplined approximate programming , 2012, ASPLOS XVII.

[55]  Onur Mutlu,et al.  Mitigating the Memory Bottleneck With Approximate Load Value Prediction , 2016, IEEE Design & Test.

[56]  Jeffrey S. Vetter,et al.  A Survey Of Architectural Approaches for Data Compression in Cache and Main Memory Systems , 2016 .

[57]  Tsuyoshi Murata,et al.  {m , 1934, ACML.

[58]  Stephen Richardson,et al.  Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era , 2016, IEEE Design & Test.

[59]  Jacob Nelson,et al.  Approximate storage in solid-state memories , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[60]  Vivienne Sze,et al.  Designing Hardware for Machine Learning: The Important Role Played by Circuit Designers , 2017, IEEE Solid-State Circuits Magazine.

[61]  Onur Mutlu,et al.  Rollback-free value prediction with approximate loads , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).

[62]  Henry Hoffmann,et al.  Quality of service profiling , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[63]  Nam Sung Kim,et al.  Lossless and lossy memory I/O link compression for improving performance of GPGPU workloads , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[64]  Henry Hoffmann,et al.  Managing performance vs. accuracy trade-offs with loop perforation , 2011, ESEC/FSE '11.

[65]  Woongki Baek,et al.  Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.

[66]  Kaushik Roy,et al.  Approximate storage for energy efficient spintronic memories , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).