Debugging post-silicon fails in the IBM POWER8 bring-up lab

Debugging post-silicon fails remains a difficult problem, and it is becoming more challenging as chips integrate more functionality and implement increasingly complicated functions. The complexity of hardware systems, coupled with the difficulty of observing the system state that led to a failure, makes the debugging effort a unique challenge. In this paper, we review the techniques and mechanisms used to facilitate effective debugging in the POWER8™ processor post-silicon validation phase. We then describe several functional bugs and the debugging process that led to the identification of their root causes.
