Area-Aware Cache Update Trackers for Postsilicon Validation

The internal state of complex modern processors often needs to be dumped frequently during postsilicon validation. Since the caches hold most of this state, the volume of data dumped and the transfer time are dominated by the large caches in the architecture. Because the bandwidth available for transferring data off-chip is limited, dumping the cache contents stalls the processor for long durations. To alleviate this, we propose to transfer only those cache lines that were updated since the previous dump. Since maintaining a bit-vector with a separate bit to track the status of each cache line is expensive, we propose two methods: 1) a bit-vector in which each bit tracks multiple cache lines and 2) an Interval Table, which stores only the starting and ending addresses of contiguous runs of updated cache lines. Both methods require significantly less space than a per-line bit-vector and allow the designer to choose how much space to allocate to this design-for-debug feature. The cost of reducing the storage space is that some nonupdated cache lines are dumped as well; we attempt to minimize this overhead. For physically distributed caches, we also propose a scheme to share the cache update tracking hardware (the Update Trackers) across multiple caches, so that it is replicated fewer times and the area overhead stays limited. We show that the proposed Update Trackers occupy less than 1% of the cache area for both shared and distributed caches.
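The two tracking methods can be illustrated with a minimal software sketch. It is not the hardware design from the paper: the class names, the `lines_per_bit` and `max_entries` parameters, and in particular the policy of merging the two closest intervals when the table overflows are assumptions made here for illustration only.

```python
class CoarseBitTracker:
    """Bit-vector in which one bit covers `lines_per_bit` consecutive cache lines."""

    def __init__(self, num_lines, lines_per_bit):
        self.num_lines = num_lines
        self.lines_per_bit = lines_per_bit
        num_bits = (num_lines + lines_per_bit - 1) // lines_per_bit
        self.bits = [False] * num_bits

    def record_update(self, line):
        # Mark the whole group that contains the updated line.
        self.bits[line // self.lines_per_bit] = True

    def lines_to_dump(self):
        # Every line of a marked group is dumped, updated or not.
        out = []
        for i, bit in enumerate(self.bits):
            if bit:
                start = i * self.lines_per_bit
                stop = min(start + self.lines_per_bit, self.num_lines)
                out.extend(range(start, stop))
        return out


class IntervalTable:
    """Stores at most `max_entries` (start, end) runs of updated lines.

    Assumes max_entries >= 1. The overflow policy (merge the two
    neighbouring intervals separated by the smallest gap) is a
    hypothetical choice, not necessarily the one used in the paper.
    """

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.intervals = []  # sorted, disjoint, inclusive (start, end) pairs

    def record_update(self, line):
        start = end = line
        kept = []
        for s, e in self.intervals:
            if e + 1 < start or s > end + 1:
                kept.append((s, e))          # neither overlapping nor adjacent
            else:
                start, end = min(s, start), max(e, end)  # absorb into the run
        kept.append((start, end))
        kept.sort()
        self.intervals = kept
        if len(self.intervals) > self.max_entries:
            self._merge_closest()

    def _merge_closest(self):
        # Merging across the smallest gap adds the fewest nonupdated lines.
        gaps = [self.intervals[i + 1][0] - self.intervals[i][1] - 1
                for i in range(len(self.intervals) - 1)]
        i = gaps.index(min(gaps))
        merged = (self.intervals[i][0], self.intervals[i + 1][1])
        self.intervals[i:i + 2] = [merged]

    def lines_to_dump(self):
        return [l for s, e in self.intervals for l in range(s, e + 1)]
```

In this sketch, both trackers always cover every updated line; shrinking the storage (larger `lines_per_bit`, smaller `max_entries`) only increases the number of nonupdated lines that are dumped alongside them, which is the trade-off described above.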
