Student Projects

Design of a Superscalar Out-of-Order Pipeline Implementing RV32IM Based on Clash Haskell

Team Members

Team Members:

姜宇辰 Yuchen Jiang, 秦彬皓 Binhao Qin, 徐文涛 Wentao Xu, 谢奕宁 Yining Xie, 张亦林 Yilin Zhang, 李欧盟 Oumeng Li

Instructors:

Xinfei Guo

Project Description

  • Problem

    Our design is a 32-bit, four-way, out-of-order superscalar pipeline architecture. We use Clash Haskell to generate SystemVerilog code. As a functional hardware description language, Clash offers hardware designers a number of useful features, including but not limited to polymorphism and type derivation [1]. These features greatly improve the scalability and portability of a hardware project. Clash was first proposed in two master's theses at the University of Twente in the Netherlands in 2009 [2][3]. To learn more about Clash, see the publications listed at https://clash-lang.org/publications/



  • Concept Generation

    Clash Haskell enables an additional layer of abstraction for both testing and design. We focus on the design aspect here. For example, we can easily apply the same function to a series of inputs of the same type while threading through a state, called "acc", which may be of a different type. This is achieved with mapAccumL, whose function block diagram is attached below.
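
    As a minimal sketch of the mapAccumL pattern, the example below uses Data.List.mapAccumL on ordinary lists so that it runs in plain Haskell; Clash provides a function of the same name for fixed-size vectors. The function allocate and its slot-numbering task are hypothetical illustrations, not taken from the project.

```haskell
import Data.List (mapAccumL)

-- Hypothetical example: hand out consecutive slot numbers to the valid
-- entries of an instruction group. The accumulator "acc" is the next free
-- slot number; it has a different type than the Bool inputs.
allocate :: Int -> [Bool] -> (Int, [Maybe Int])
allocate = mapAccumL step
  where
    step acc valid
      | valid     = (acc + 1, Just acc)  -- consume one slot
      | otherwise = (acc, Nothing)       -- pass the state through unchanged
```

    For instance, allocate 5 [True, False, True, True] evaluates to (8, [Just 5, Nothing, Just 6, Just 7]): each element sees the state left behind by its predecessor, and the final state is returned alongside the outputs.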

    We used type derivation and polymorphism to design a multiport queue. In the first half of a cycle, the queue pops a vector, the efflux, and outputs how many more elements can be pushed into it. In the second half of the cycle, it pushes a vector, the influx, into its storage and reads a size input to learn how many elements of the previously output efflux were accepted. Furthermore, the queue can take reload contents that replace everything it stores.
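
    The queue behaviour described above can be sketched as a small software model. This is a list-based approximation in plain Haskell, not the synthesizable Clash design; the names Queue, present and update, and the choice to report free space before the accepted elements are removed, are assumptions made for illustration.

```haskell
-- A list-based software model of the multiport queue (hypothetical names).
data Queue a = Queue
  { capacity :: Int  -- maximum number of stored elements
  , ports    :: Int  -- maximum number of elements offered per cycle
  , items    :: [a]  -- stored elements, oldest first
  } deriving (Eq, Show)

-- First half of the cycle: offer the efflux and report the free slots.
present :: Queue a -> ([a], Int)
present q = (take (ports q) (items q), capacity q - length (items q))

-- Second half of the cycle: remove the accepted part of the efflux and
-- append the influx, or replace the whole contents on a reload.
update :: Int -> [a] -> Maybe [a] -> Queue a -> Queue a
update _ _ (Just reload) q = q { items = take (capacity q) reload }
update accepted influx Nothing q =
  q { items = take (capacity q) (kept ++ influx) }
  where
    popped = max 0 (min accepted (ports q))  -- clamp to what was offered
    kept   = drop popped (items q)
```

    With q = Queue 4 2 "abc", present q yields ("ab", 1). If the consumer accepts one element and the producer pushes "d", update 1 "d" Nothing q leaves "bcd" in the queue. Because every stage only looks at the sizes exchanged with its neighbours, a model like this needs no separate stall signal.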


  • Design Description

    We design the architecture of the 4-way superscalar out-of-order RV32IM processor as follows. Its most distinctive feature is that we use the multiport queue as the pipeline registers, which removes the need for stall detection. Thanks to the modularity of the Clash language, we are able to build flexible ports for each unit. The fetch unit keeps the processor continuously supplied with instructions. The decode unit parses each raw 32-bit instruction and wraps it into a data structure. After renaming, the wrapped instructions are dispatched to multiple reservation stations, where they wait for the functional units. The functional units receive their operands from the queues and send their outputs to the common data bus (CDB). The CDB broadcasts values to the reservation stations and the reorder buffer (ROB) to avoid hazards. The ROB puts the results of out-of-order execution back into program order. These in-order results are then sent to the memory interface, which ensures that memory is updated correctly.
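
    To make the roles of the CDB and the ROB concrete, here is a simplified software sketch in plain Haskell. The types Operand, RsEntry and RobEntry, and the functions broadcastRs, broadcastRob and commit, are hypothetical and much smaller than a real design; they only illustrate tag-based wake-up and in-order commit.

```haskell
import Data.Maybe (isJust)

type Tag = Int

-- An operand is either a ready value or the tag of the ROB entry
-- that will eventually produce it.
data Operand = Ready Int | Waiting Tag deriving (Eq, Show)

data RsEntry = RsEntry { src1 :: Operand, src2 :: Operand }
  deriving (Eq, Show)

data RobEntry = RobEntry { robTag :: Tag, result :: Maybe Int }
  deriving (Eq, Show)

-- Replace a waiting operand when its tag appears on the CDB.
wake :: (Tag, Int) -> Operand -> Operand
wake (t, v) (Waiting t') | t == t' = Ready v
wake _ op = op

-- CDB broadcast to every reservation station entry.
broadcastRs :: (Tag, Int) -> [RsEntry] -> [RsEntry]
broadcastRs cdb = map (\e -> e { src1 = wake cdb (src1 e), src2 = wake cdb (src2 e) })

-- CDB broadcast to the ROB: record the result in the matching entry.
broadcastRob :: (Tag, Int) -> [RobEntry] -> [RobEntry]
broadcastRob (t, v) = map fill
  where
    fill e
      | robTag e == t = e { result = Just v }
      | otherwise     = e

-- Commit at most n of the oldest entries, stopping at the first entry
-- whose result is not ready. Returns (committed, remaining).
commit :: Int -> [RobEntry] -> ([RobEntry], [RobEntry])
commit n rob = (done, drop (length done) rob)
  where
    done = takeWhile (isJust . result) (take n rob)
```

    In this sketch a younger instruction that finishes early stays in the ROB until every older entry has a result, which is how out-of-order results are returned in program order.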



  • Validation

    Clash Haskell's test and verification tools can be divided into three groups. First, we can test individual units against given inputs and expected outputs. Second, in a more sophisticated approach, we can write mathematical functions that describe properties of the outputs and check them against randomly generated inputs. Third, designers can use the RISC-V formal verification toolchain with Clash to formally verify that the processor functions correctly. We use formal verification to run a thorough check of our processor.
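
    The second group, property checks on randomly generated inputs, can be sketched as follows. To stay self-contained, this example generates inputs with a small linear congruential generator instead of a property-testing library such as QuickCheck or Hedgehog, which a real project would more likely use. The alu function is a hypothetical stand-in for a unit under test, not the project's ALU.

```haskell
import Data.Word (Word32)

data Op = Add | Sub | Mul deriving (Eq, Show)

-- Hypothetical unit under test: 32-bit wrapping arithmetic, as used by
-- the RV32IM ADD, SUB and MUL instructions.
alu :: Op -> Word32 -> Word32 -> Word32
alu Add a b = a + b
alu Sub a b = a - b
alu Mul a b = a * b

-- Reference model: compute in unbounded Integer, then reduce modulo 2^32.
reference :: Op -> Word32 -> Word32 -> Word32
reference op a b =
  fromInteger (f (toInteger a) (toInteger b) `mod` 2 ^ (32 :: Int))
  where
    f = case op of
      Add -> (+)
      Sub -> (-)
      Mul -> (*)

-- Pseudo-random 32-bit inputs from a linear congruential generator.
inputs :: Word32 -> [Word32]
inputs seed = iterate next (next seed)
  where
    next s = s * 1664525 + 1013904223

-- Group a stream of values into operand pairs.
pairs :: [a] -> [(a, a)]
pairs (a : b : rest) = (a, b) : pairs rest
pairs _              = []

-- Property: the unit agrees with the reference on n random operand pairs.
holdsFor :: Int -> Op -> Bool
holdsFor n op = all agrees (take n (pairs (inputs 42)))
  where
    agrees (a, b) = alu op a b == reference op a b
```

    A call such as all (holdsFor 1000) [Add, Sub, Mul] then exercises each operation on a thousand operand pairs. Unlike a table of fixed test vectors, the property states what the correct output is for any input.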


  • Conclusion

    This work presents the design of a superscalar out-of-order pipeline processor implementing the RV32IM instruction set. The processor is implemented using a hardware description language and code generation tool, Clash Haskell, and is being formally verified using the RISC-V formal verification toolchain. The project environment and workflow described in this work could be adopted in projects involving the design, implementation and verification of digital integrated circuits. The work will be valuable to those who want to build advanced processors using Clash.


  • Acknowledgement

    We would like to express our gratitude and respect to our instructor, Prof. Xinfei Guo, and the teaching assistants, Runxi Wang and Yichen Cai. We also want to thank the Clash developers for helping us solve some of our problems, especially those concerning compilation efficiency.

  • References

    [1] R. Wester, C. Baaij, and J. Kuper, "A two step hardware design method using CλaSH," 22nd International Conference on Field Programmable Logic and Applications (FPL), Oslo, Norway, 2012, pp. 181-188, doi: 10.1109/FPL.2012.6339258.

    [2] C. P. R. Baaij, "CλasH: from Haskell to hardware," MSc thesis, University of Twente, Enschede, The Netherlands, Dec. 2009.

    [3] M. Kooijman, "Haskell as a higher order structural hardware description language," MSc thesis, University of Twente, Enschede, The Netherlands, Dec. 2009.