Exploring Deep Neural Networks for Branch Prediction

Deep neural networks (DNNs) have advanced significantly in recent years and have shown superior performance in audio and image processing. In this paper, we explore DNNs to push the limit of branch prediction. We treat branch prediction as a classification problem and explore both deep convolutional neural networks (CNNs) and deep belief networks (DBNs). We analyze how the lengths of the classifier inputs, namely the hashed program counter (PC), the local history register (LHR), the global history register (GHR), and the branch global addresses (GA), affect the misprediction rate. We compare the DNNs with state-of-the-art branch predictors, including the perceptron, Multi-poTAGE+SC, and MTAGE+SC predictors; the last two are the most recent winners of the Championship Branch Prediction (CBP) contests in the unlimited-resources category. Two observations emerge from our study. First, both DBNs and CNNs outperform the perceptron predictor, but only the deeper CNN models outperform Multi-poTAGE+SC and MTAGE+SC. Second, the depth of a CNN (i.e., the number of convolutional and pooling layers) affects the misprediction rate: deeper CNN structures lead to lower misprediction rates.

Keywords: branch predictor, DBN, CNN, branch misprediction rate, deep neural networks.
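To make the classification framing concrete, the following minimal sketch encodes a hashed PC, an LHR, and a GHR as one bit vector and feeds it to a simple perceptron classifier, in the spirit of the perceptron baseline mentioned above. It is not the CNN or DBN models studied in the paper. The bit widths, the folding hash, the training threshold, the names `encode`, `PerceptronClassifier`, and `run_toy_trace`, and the toy branch trace are all assumptions made for illustration; the GA input is omitted for brevity.

```python
# Illustrative sketch only: widths, hash, threshold, and trace are assumptions,
# not values taken from the paper.

def to_bits(value, width):
    """Return the low `width` bits of `value` as a list of 0/1, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def encode(pc, lhr, ghr, pc_bits=8, lhr_bits=8, ghr_bits=16):
    """Concatenate hashed PC, local history, and global history into one input."""
    # Simple XOR-fold hash of the PC (an assumed hash, chosen for brevity).
    hashed_pc = (pc ^ (pc >> pc_bits)) & ((1 << pc_bits) - 1)
    return (to_bits(hashed_pc, pc_bits)
            + to_bits(lhr, lhr_bits)
            + to_bits(ghr, ghr_bits))


class PerceptronClassifier:
    """Binary classifier: output >= 0 means 'predict taken'."""

    def __init__(self, n_inputs, threshold=16):
        self.w = [0] * (n_inputs + 1)   # w[0] is the bias weight
        self.threshold = threshold      # train while confidence is low

    def output(self, x):
        # Map each input bit to +1 / -1 before taking the dot product.
        return self.w[0] + sum(w * (1 if b else -1)
                               for w, b in zip(self.w[1:], x))

    def predict(self, x):
        return self.output(x) >= 0

    def train(self, x, taken):
        y = self.output(x)
        t = 1 if taken else -1
        # Update on a misprediction or when the output magnitude is small.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            self.w[0] += t
            for i, b in enumerate(x):
                self.w[i + 1] += t * (1 if b else -1)


def run_toy_trace(steps=2000, tail=300):
    """Train online on one branch with a repeating taken, taken, not-taken
    pattern; return how many of the last `tail` predictions were correct."""
    pattern = [True, True, False]
    pc, lhr, ghr = 0x400123, 0, 0       # hypothetical branch address
    model = PerceptronClassifier(n_inputs=32)
    correct = 0
    for step in range(steps):
        taken = pattern[step % 3]
        x = encode(pc, lhr, ghr)
        if step >= steps - tail:
            correct += int(model.predict(x) == taken)
        model.train(x, taken)
        # Shift the newest outcome into both history registers.
        lhr = ((lhr << 1) | int(taken)) & 0xFF
        ghr = ((ghr << 1) | int(taken)) & 0xFFFF
    return correct


if __name__ == "__main__":
    print(run_toy_trace(), "of the last 300 predictions were correct")
```

A deeper classifier would consume the same kind of input vector; only the model behind `predict` and `train` would change.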
