Student Projects

An Improved Temporal Stream Branch Predictor in Gem5

Team Members

顾煜程 Yucheng Gu, 潘其平 Qiping Pan, 李世骐 Shiqi LI, 武昊辰 Haochen Wu, 强履冰 Lvbing Qiang, 杨毅文 Yiwen Yang


Xinfei Guo

Project Description

  • Problem

    Gains in computational efficiency now depend to a great extent on superscalar out-of-order processors. Today’s deeply pipelined superscalar out-of-order processors place high demands on the accuracy and efficiency of their branch predictors.

    In this project, we apply and optimize a temporal stream model on top of the Bi-mode branch predictor to improve on the performance of existing branch predictors in out-of-order CPUs.

  • Concept Generation

    There are a number of proven branch predictors (BPs), such as TAGE, Bi-mode, and Gshare. Each focuses on optimizing a different part of a basic scheme or model: addressing the drawbacks of global or per-address history schemes, reducing interference, and so on.

    In this project, we decided to develop and implement a more general error-correcting model that can be applied to existing base BPs to increase their accuracy. It is based on the Temporal Stream Branch Predictor design proposed by Shen et al. [2].

  • Design Description

    Basic branch predictors often repeat their mistakes. In this approach, temporal streaming (TS) is introduced to reduce such repeated mispredictions. The model records the predictions in a circular buffer. When a wrong prediction takes place, TS looks up the record to find a similar earlier case and then reverses the recorded mistaken prediction, reducing the time wasted on the repeated mistake.
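
    The mechanism described above can be illustrated with a small sketch. This is a simplified toy model, not the project's gem5 implementation or the exact design in [2]: the class name `TemporalStreamSketch`, the lookup key, the head table, and the always-taken stand-in base predictor are all assumptions made for illustration. The buffer stores whether the base predictor was right or wrong for each branch, and after a base misprediction the sketch replays the stream that followed the previous misprediction with the same key, inverting the base prediction wherever the record says the base was wrong.

```python
class TemporalStreamSketch:
    """Toy temporal-stream corrector wrapped around any base predictor (hypothetical)."""

    def __init__(self, buffer_size=16):
        self.size = buffer_size
        # Circular record: True = base was right, False = base was wrong, None = empty.
        self.buffer = [None] * buffer_size
        self.write = 0       # next slot to fill
        self.head = {}       # key -> slot following the last base miss seen with that key
        self.replay = None   # slot currently being replayed, or None when idle

    def predict(self, base_pred):
        # While replaying, invert the base prediction where the record shows a base miss.
        if self.replay is not None and self.buffer[self.replay] is False:
            return not base_pred
        return base_pred

    def update(self, key, base_pred, taken):
        base_correct = (base_pred == taken)
        # Advance the replay pointer while the recorded stream keeps matching; else stop.
        if self.replay is not None:
            if self.buffer[self.replay] == base_correct:
                self.replay = (self.replay + 1) % self.size
            else:
                self.replay = None
        # Record the base predictor's correctness in the circular buffer.
        self.buffer[self.write] = base_correct
        self.write = (self.write + 1) % self.size
        # On a base miss, remember where the following stream starts and, if idle,
        # start replaying the stream that followed the previous similar miss.
        if not base_correct:
            previous = self.head.get(key)
            self.head[key] = self.write
            if self.replay is None and previous is not None:
                self.replay = previous


def count_mispredictions(outcomes, use_ts, key="pc0"):
    """Count misses of an always-taken base predictor, with or without the sketch."""
    ts = TemporalStreamSketch()
    misses = 0
    for taken in outcomes:
        base_pred = True  # stand-in base predictor (assumption): always predicts taken
        pred = ts.predict(base_pred) if use_ts else base_pred
        misses += (pred != taken)
        ts.update(key, base_pred, taken)
    return misses


# Synthetic repeating pattern: taken, taken, not taken (10 repetitions).
outcomes = [True, True, False] * 10
print(count_mispredictions(outcomes, use_ts=False))
print(count_mispredictions(outcomes, use_ts=True))
```

    On this synthetic pattern the always-taken base predictor misses 10 of the 30 branches, while the sketch misses only the first two not-taken branches and corrects the rest. A real design would typically key the lookup on the branch address and history rather than a constant, and the numbers here say nothing about the measured results.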

  • Validation

    We use PolyBench 4.2.1 as the main benchmark suite for our gem5 CPU with the temporal stream Bi-mode BP. We applied TS to several BPs, including TAGE and Bi-mode, and tested the performance with different kernels.

    We use MPKI (mispredictions per kilo-instructions) as the main indicator of branch predictor performance. In the general case, MPKI decreases by 10% when temporal streaming is applied to the Bi-mode predictor, and there is a smaller improvement when it is applied to L-TAGE. Increasing the buffer size improves performance further.

    The results show that our model performs better on stencil computations (which have more repetitive instructions) and worse on linear algebra kernels and solvers. We are continuing to work on new mechanisms to push the results further.
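
    For readers unfamiliar with the metric, the following sketch shows how MPKI and a relative change in MPKI can be computed from raw counts. The function names and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the project's measurements.

```python
def mpki(mispredictions, instructions):
    """Mispredictions per kilo-instructions (per 1000 committed instructions)."""
    return mispredictions * 1000 / instructions


def relative_change(baseline, candidate):
    """Fractional change of candidate relative to baseline (negative = fewer misses)."""
    return (candidate - baseline) / baseline


# Hypothetical counts for illustration only.
base_mpki = mpki(mispredictions=5_000, instructions=1_000_000)
ts_mpki = mpki(mispredictions=4_500, instructions=1_000_000)

print(base_mpki, ts_mpki)
print(f"{relative_change(base_mpki, ts_mpki):.0%}")
```

    With these made-up counts, the base MPKI is 5.0, the MPKI with TS is 4.5, and the relative change is -10%. Lower MPKI is better, so a negative change is an improvement.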

  • Modeling and Analysis

    We implemented the design in gem5, a commonly used platform for simulating computer architecture designs. It is widely used in academia, industry, and teaching.

  • Conclusion

    The Temporal Stream model is a more general error-correcting model that can be applied to existing base branch predictors to increase their accuracy, and it performs better on repetitive work.

  • Acknowledgement

    Faculty Advisor: Xinfei Guo from UM-SJTU Joint Institute

    Haoyang Zhang from UM-SJTU Joint Institute

  • Reference

    [1] Lee et al. The bi-mode branch predictor

    [2] Shen et al. Temporal Stream BP