Automatic Microprocessor Performance Bug Detection

Processor design validation and debug is a difficult and complex task that consumes the lion's share of the design process. Design bugs that affect processor performance rather than functionality are especially difficult to catch, particularly in new microarchitectures. This is because, unlike functional correctness, the correct performance of a new microarchitecture on complex, long-running benchmarks is typically not deterministically known. Thus, when benchmarking a new microarchitecture, performance teams may assume that the design is correct whenever its performance exceeds that of the previous generation, even though significant performance regressions exist in the design. In this work, we present a two-stage, machine learning-based methodology that detects the existence of performance bugs in microprocessors. Our results show that our best technique detects 91.5% of microprocessor core performance bugs whose average IPC impact across the studied applications is greater than 1% relative to a bug-free design, with zero false positives. When evaluated on memory system bugs, our technique achieves 100% detection with zero false positives. Moreover, the detection is automatic and requires very little performance engineer time.
