.. role:: html(raw)
   :format: html

.. role:: py(code)
   :language: py
   :class: highlight

.. urlinclude::
   :branch: a7d98ec
   :github: bogdanvuk/pygears_riscv

My First Instruction
====================

.. post:: October 30, 2018
   :author: Bogdan
   :category: RISC-V

.. _RISC-V ISA Specification: https://content.riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf

.. verbosity_slider:: 3

:v:`2` The first instruction is probably going to be unlike any other in the amount of work that I'll need to put into implementing it, so it deserves a post of its own. :v:`1` Let's start from the RV32I description in the (currently) latest version of the `RISC-V ISA Specification`_, which is given in the `Chapter 2: RV32I Base Integer Instruction Set `_. The specification first describes the `Integer Computational Instructions (Chapter 2.4) `_, of which the ``addi`` instruction is explained first, so let's start with that one.

Relevant pygears_riscv git commit: `pygears_riscv@a7d98ec `_

.. verbosity:: 2

All RV32I instructions are encoded with 32 bits using several formats (there are also the `Compressed Instruction Formats (Chapter 12.2) `_, but I'll leave those for later). All the information needed for the instruction execution has to be encoded in 32 bits, and these formats specify where exactly each piece of information is located within these 32 bits. An instruction usually needs to specify which operation to perform (``opcode`` and ``funct`` fields) and which registers are involved (``rs`` - register source or ``rd`` - register destination), and it often provides some immediate values as arguments (``imm`` fields). :v:`3` One of the key advantages of the RISC-V ISA is that pieces of information of the same type (like the ``rd`` field) are usually located at the same position within the 32-bit encoding across the different formats, which simplifies the hardware implementation.
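
To make the idea of fixed field positions concrete, here is a minimal plain-Python sketch (not PyGears code) that slices a few fields out of a 32-bit instruction word. The helper names ``bits`` and ``common_fields`` are hypothetical and introduced only for this illustration; the bit positions are the ones given in the `RISC-V ISA Specification`_ for the formats that contain these fields.

.. code-block:: python

    def bits(word, hi, lo):
        """Return bits hi..lo (inclusive) of an instruction word."""
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


    def common_fields(word):
        # Hypothetical helper: these positions are shared by every RV32I
        # format that has the field at all.
        return {
            'opcode': bits(word, 6, 0),   # present in all formats
            'rd': bits(word, 11, 7),      # R, I, U and J formats
            'rs1': bits(word, 19, 15),    # R, I, S and B formats
        }


    word = 0x00510093  # encoding of "addi x1, x2, 5"
    fields = common_fields(word)

Because the slices do not depend on the format, a decoder can extract the register IDs before it even knows which instruction it is looking at.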
For RV32I, a set of 32 registers is needed, named ``x0`` - ``x31``, where ``x0`` is different from the others in that it has a fixed value of 0, i.e. its value cannot be changed. The ISA specification defines the ``XLEN`` parameter to represent the width of the registers in number of bits: either 32 or 64. :v:`3` I'll try to keep ``XLEN`` a design parameter of the processor implementation, but I'll first focus on a version with ``XLEN=32``, i.e. the processor version with 32-bit-wide registers.

.. verbosity:: 1

Instruction format
------------------

The ``addi`` instruction has an "Integer Register-Immediate" format, aka the "I-type" format shown below. :v:`2` The instruction is executed by adding the sign-extended value of the 12-bit immediate field ``imm`` to the value read from the register specified by the ``rs1`` field. The result is then truncated to ``XLEN`` bits and stored into the register specified by the ``rd`` field.

.. figure:: images/integer-register-immediate-instruction.png
    :align: center

    "Integer Register-Immediate" instruction format, aka the "I-type" format, from the `RISC-V ISA Specification`_

Since the instruction encodings have fields that serve different purposes from one another, I'll represent the instruction with the :any:`typing/tuple` PyGears type. :v:`2` The :any:`typing/tuple` type represents a generic heterogeneous container type akin to records and structs in other HDLs, and I can specify the names and types of the fields by providing a Python dict in square brackets which maps field names to field types. :v:`1` For the "I-type" instructions, I ended up with the following definition in PyGears, given in :giturl:`pygears_riscv/riscv/riscv.py`:

.. data:: TInstructionI

    .. code-block:: python

        TInstructionI = Tuple[{
            'opcode': Uint[7],
            'rd'    : Uint[5],
            'funct3': Uint[3],
            'rs1'   : Uint[5],
            'imm'   : Int[12]
        }]

.. verbosity:: 2

The ``opcode`` and ``funct3`` fields determine the function to be executed, and the ``rd``, ``rs1`` and ``imm`` fields carry the function arguments. The ``opcode`` and ``funct3`` fields store the ID of the function, so I can represent them with an unsigned number, i.e. the :any:`typing/uint` PyGears type. :v:`3` An enumerated type might constrain these fields better, since not all function IDs might be available in a specific processor implementation (after this blog post I will have implemented only one function - ``addi``). However, PyGears doesn't yet have enumerated types, so I'll use the :any:`typing/uint` type as the next best thing.

The ``rs1`` and ``rd`` fields contain the IDs of the registers involved, so they are 5 bits wide in order to encode all 32 register IDs, and they are represented by the :any:`Uint[5] ` type. The ISA specifies ``addi`` as a signed operation, and the values in the ``imm`` field are encoded as signed integers, so I'll use the :any:`Int[12] ` type here. Now any gear that operates on the ``imm`` field can, if needed, automatically adjust its operation to handle the signed numbers correctly, and I don't have to worry about it for every gear explicitly. :v:`3` This is a major advantage of the typing system, since I can express my intent using the type (like with :any:`Int ` here) in a single place in the code, and this intent will propagate automatically throughout the design. Traditional HDLs offer only rudimentary typing support, so you need to follow your signals around and handle their types explicitly. However, just specifying the type is only half of the story. The other half lies in providing the `polymorphic `__ behavior for the modules, so that they automatically accommodate different data types.

.. verbosity:: 1

:v:`2` OK, so now we have the :py:data:`TInstructionI` type, which describes the general format for the "I-type" instructions, and my ``addi`` instruction will be an instance of this type.
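
As a side note on why the signedness of ``imm`` matters, the same 12 bits give very different values depending on whether they are read as unsigned or as two's complement. The following is a plain-Python sketch of that difference; the ``as_signed`` helper is hypothetical and is not part of PyGears, where the ``Int`` type carries this interpretation for you.

.. code-block:: python

    def as_signed(raw, width):
        """Interpret the low `width` bits of `raw` as a two's complement number."""
        raw &= (1 << width) - 1
        sign_bit = 1 << (width - 1)
        # If the sign bit is set, the value wraps around into the negatives.
        return raw - (1 << width) if raw & sign_bit else raw


    raw_imm = 0xFFF                        # all twelve immediate bits set
    unsigned_view = raw_imm                # what a Uint[12]-like reading gives
    signed_view = as_signed(raw_imm, 12)   # what an Int[12]-like reading gives

Read as unsigned, ``0xFFF`` is 4095; read as signed, it is -1, which is what ``addi`` needs in order to be able to subtract.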
:v:`1` As I said, ``opcode`` and ``funct3`` have unique, specific values for the ``addi`` instruction, which are specified by the ISA. I had to consult `Chapter 19: RV32/64G Instruction Set Listings `_ in order to get the correct values for the function ID fields: :py:`opcode=0x13` and ``funct3=0x0``.

.. figure:: images/addi-instruction-field-value.png
    :align: center

    ``addi`` instruction format, from `RISC-V ISA Specification`_

The other instruction fields, ``rd``, ``rs1`` and ``imm``, can take arbitrary values, so I can't fix those in advance. This gives me the following template for the ``addi`` instruction:

.. py:data:: OPCODE_IMM

    :py:`OPCODE_IMM = 0x13`

.. py:data:: FUNCT3_ADDI

    :py:`FUNCT3_ADDI = 0x0`

.. py:data:: ADDI

    .. code-block:: python

        ADDI = TInstructionI({
            'opcode': OPCODE_IMM,
            'rd'    : 0,
            'funct3': FUNCT3_ADDI,
            'rs1'   : 0,
            'imm'   : 0
        })

:v:`2` Since PyGears doesn't have templates for type instances, all I can do is assign some default values to the fields whose values can change. :v:`3` Maybe it's worth considering whether true generic templates (with generic parameters) for the type instances would add anything of value (or researching whether there are languages that support these). In that case, instead of the zeros above, the fields would be assigned some template placeholder names, which would need to be assigned values later. Maybe Prolog does something like that?

Processor implementation
------------------------

:v:`2` Since the idea of this blog series is to show how one can evolve a complex hardware design using PyGears without wasted effort, by implementing one feature at a time, I will for now turn a blind eye to the fact that a RISC-V processor needs to support multiple instructions. I will exclude the PC manipulation functionality, which becomes important once jump instructions come into play, and the interface to the data memory, which becomes important once load and store instructions come into play.
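
Before looking at the hardware description, it may help to pin down the behavior expected from a processor that knows only ``addi``. The following is a plain-Python reference sketch based on the instruction description above, not the PyGears implementation; the function name ``addi`` and the list-based register file are assumptions made for this illustration.

.. code-block:: python

    XLEN = 32


    def addi(regs, rd, rs1, imm, xlen=XLEN):
        """Model of ``addi rd, rs1, imm`` on a list-based register file.

        `regs` holds 32 unsigned xlen-bit values and `imm` is a signed
        Python int in the range -2048..2047.
        """
        # Overflow is ignored: only the low xlen bits of the sum are kept.
        result = (regs[rs1] + imm) & ((1 << xlen) - 1)
        # x0 is hardwired to zero, so writes to it are dropped.
        if rd != 0:
            regs[rd] = result
        return result


    regs = [0] * 32
    regs[2] = 10
    addi(regs, rd=1, rs1=2, imm=5)     # x1 = x2 + 5
    addi(regs, rd=3, rs1=0, imm=-1)    # x3 = 0 - 1, wraps to all ones
    addi(regs, rd=0, rs1=2, imm=7)     # write to x0 has no effect

A model like this is also a convenient reference to compare simulation results against.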
For now I will move the :giturl:`register file ` outside the processor into a separate module and implement it in pure Python, to make reading and writing it easier for verification purposes. :v:`3` Later, I'll provide an RTL implementation of the register file, but it is a simple module and designing it should be straightforward, so I don't feel like I'm cheating by postponing it. :v:`2` Important concepts for describing gears are sketched out in this :ref:`Quick Introduction ` documentation page. :v:`1` Without further ado, this single-instruction capable RISC-V processor written in PyGears looks like this:

.. literalinclude:: pygears_riscv/riscv/riscv.py
    :github: bogdanvuk/pygears_riscv
    :pyobject: riscv

.. verbosity:: 2

Let's dig deeper into those 6 lines of code. The :py:`@gear` statement is called a decorator in Python terminology. Placed in front of a function definition, it wraps the function with some additional code. The :py:`@gear` decorator is where most of the magic happens in PyGears. It makes a function composable via the '|' (pipe) operator, performs type checking and matching, instantiates a new hardware module each time the function is called, takes care of the module hierarchy, etc.

Next, the `function prototype `__ declares the types of the input interfaces that the ``riscv`` gear accepts, namely: :py:`instruction: TInstructionI` and :py:`reg_data: Uint['xlen']`. So on the first interface ``riscv`` expects to see a flow of instructions of the "I-type" format, and on the second, the operation argument read from the register determined by the ``rs1`` field (the ``riscv`` gear will issue these read requests, as we'll see in a moment). For the details on how PyGears implements interfaces in HDL, check out the PyGears documentation section :ref:`One Interface `. The ``riscv`` gear is implemented via gear composition, so I needn't specify the output interface types, since they will be determined by the interfaces returned from the ``riscv()`` function.
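
For readers new to decorators, the way a decorator can make a function usable with the '|' operator can be shown with a toy example in plain Python. This is only a sketch of the underlying Python mechanism (operator overloading via ``__ror__``) and makes no claim about how :py:`@gear` is actually implemented; the names ``pipeable``, ``add_one`` and ``double`` are made up for the illustration.

.. code-block:: python

    import functools


    class Pipeable:
        """Wraps a one-argument function so that ``value | wrapped`` calls it."""

        def __init__(self, func):
            functools.update_wrapper(self, func)
            self.func = func

        def __call__(self, *args, **kwargs):
            return self.func(*args, **kwargs)

        def __ror__(self, left):
            # Python tries this when the left operand cannot handle '|' itself.
            return self.func(left)


    def pipeable(func):
        """Toy decorator: returns the function wrapped in a Pipeable object."""
        return Pipeable(func)


    @pipeable
    def add_one(x):
        return x + 1


    @pipeable
    def double(x):
        return x * 2


    result = 3 | add_one | double   # same as double(add_one(3))

The decorated functions remain ordinary callables, so ``double(add_one(3))`` and the piped form give the same result.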
In order to instantiate the ``riscv`` gear, all the input interfaces need to be specified as arguments to the ``riscv`` gear function. Inside the ``gear`` function, ``instruction`` and ``reg_data`` become local variables that bring the interface objects in from the outside and distribute them to the internal gears. :v:`1` The image below shows the resulting processor structure and its connection with its environment. :v:`2` The graph was auto-generated with the :giturl:`riscv_graph.py script