The objectives of this tutorial are:
- Learn how the generated ISA and binary decoders fit together.
- Write the necessary C++ code to create a full instruction decoder for RiscV RV32I that combines the ISA and binary decoders.
Understand the instruction decoder
Given an instruction address, the instruction decoder is responsible for reading
the instruction word from memory and returning a fully initialized instance of the
Instruction
that represents that instruction.
The top level decoder implements the generic::DecoderInterface
shown below:
// This is the simulator's interface to the instruction decoder.
class DecoderInterface {
public:
// Return a decoded instruction for the given address. If there are errors
// in the instruction decoding, the decoder should still produce an
// instruction that can be executed, but its semantic action function should
// set an error condition in the simulation when executed.
virtual Instruction *DecodeInstruction(uint64_t address) = 0;
virtual ~DecoderInterface() = default;
};
As you can see, there is only one method that has to be implemented:
virtual Instruction *DecodeInstruction(uint64_t address);
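The interface comment asks that a failed decode still yield an executable instruction whose semantic action reports the error. A minimal sketch of that idea follows. ToyState, ToyInstruction, and ToyDecode are hypothetical stand-ins, not MPACT-Sim classes, and treating the all-zero word as the only undecodable word is an assumption made for illustration.

```cpp
#include <cstdint>
#include <functional>
#include <memory>

// Hypothetical stand-ins for the simulator types; not the MPACT-Sim classes.
struct ToyState {
  bool decode_error = false;
};

struct ToyInstruction {
  uint64_t address = 0;
  std::function<void(ToyState &)> semantic_fn;  // The semantic action.
  void Execute(ToyState &state) const { semantic_fn(state); }
};

// Always returns an executable instruction. For an unrecognized word the
// semantic action records the error only when the instruction is executed.
std::unique_ptr<ToyInstruction> ToyDecode(uint64_t address, uint32_t word) {
  auto inst = std::make_unique<ToyInstruction>();
  inst->address = address;
  if (word == 0) {  // Assumption: treat the all-zero word as undecodable.
    inst->semantic_fn = [](ToyState &state) { state.decode_error = true; };
  } else {
    inst->semantic_fn = [](ToyState &) { /* Normal semantics go here. */ };
  }
  return inst;
}
```

Deferring the error to execution time means a bad word that is never reached on the executed path does not stop the simulation.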
Now let's look at what is provided and what is needed by the generated code.
First, consider the top level class RiscV32IInstructionSet
in the file
riscv32i_decoder.h
, which was generated at the end of the tutorial on the
ISA decoder. To see the contents anew, navigate to the solution directory of
that tutorial and rebuild all.
$ cd riscv_isa_decoder/solution
$ bazel build :all
...<snip>...
Now change your directory back to the repository root, then let's take a look
at the sources that were generated. For that, change directory to
bazel-out/k8-fastbuild/bin/riscv_isa_decoder
(assuming you are on an x86
host; on other hosts, k8-fastbuild will be a different string).
$ cd ../..
$ cd bazel-out/k8-fastbuild/bin/riscv_isa_decoder
You will see the four source files that contain the generated C++ code listed:
riscv32i_decoder.h
riscv32i_decoder.cc
riscv32i_enums.h
riscv32i_enums.cc
Open up the first file riscv32i_decoder.h
. There are three classes that we
need to take a look at:
RiscV32IEncodingBase
RiscV32IInstructionSetFactory
RiscV32IInstructionSet
Note the naming of the classes. All the classes are named based on the
Pascal-case version of the name given in the "isa" declaration in that file:
isa RiscV32I { ... }
Let's start with the RiscV32IInstructionSet
class first. It is shown below:
class RiscV32IInstructionSet {
public:
RiscV32IInstructionSet(ArchState *arch_state,
RiscV32IInstructionSetFactory *factory);
Instruction *Decode(uint64 address, RiscV32IEncodingBase *encoding);
private:
std::unique_ptr<Riscv32Slot> riscv32_decoder_;
ArchState *arch_state_;
};
There are no virtual methods in this class, so this is a stand-alone class, but
notice two things. First, the constructor takes a pointer to an instance of the
RiscV32IInstructionSetFactory
class. This is a class that the generated
decoder uses to create an instance of the Riscv32Slot
class, which is used to
decode all the instructions defined for the slot riscv32
as defined in the
riscv32i.isa
file. Second, the Decode
method takes an additional parameter
of type pointer to RiscV32IEncodingBase
. This class provides the
interface between the ISA decoder generated in the first tutorial and the binary
decoder generated in the second tutorial.
The class RiscV32IInstructionSetFactory
is an abstract class from which we
have to derive our own implementation for the full decoder. In most cases this
class is trivial: just provide a method for calling the constructor for each
slot class defined in our .isa
file. In our case, it's very simple as there
is only a single such class: Riscv32Slot
(Pascal-case of the name riscv32
concatenated with Slot
). The method is not generated for you as there are
some advanced use cases where there might be utility in deriving a subclass
from the slot, and calling its constructor instead.
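To illustrate why the factory method is left to the user, here is a small sketch of both cases: a factory that constructs the slot class directly, and one that constructs a subclass of it. All names (ToySlot, ToySlotFactory, PlainFactory, TracingSlot, TracingFactory) are hypothetical and only mirror the shape of the generated classes.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for the generated slot and factory classes.
class ToySlot {
 public:
  virtual ~ToySlot() = default;
  virtual std::string Name() const { return "ToySlot"; }
};

class ToySlotFactory {
 public:
  virtual ~ToySlotFactory() = default;
  virtual std::unique_ptr<ToySlot> CreateToySlot() = 0;
};

// The common case: construct the slot class directly.
class PlainFactory : public ToySlotFactory {
 public:
  std::unique_ptr<ToySlot> CreateToySlot() override {
    return std::make_unique<ToySlot>();
  }
};

// The advanced case: a subclass of the slot, created by its own factory.
class TracingSlot : public ToySlot {
 public:
  std::string Name() const override { return "TracingSlot"; }
};

class TracingFactory : public ToySlotFactory {
 public:
  std::unique_ptr<ToySlot> CreateToySlot() override {
    return std::make_unique<TracingSlot>();
  }
};
```

Code that holds only a ToySlotFactory pointer never needs to know which concrete slot type it receives.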
We will go through the final class RiscV32IEncodingBase
later in this
tutorial, as that is the subject of another exercise.
Define top level instruction decoder
Define the factory class
If you rebuilt the project for the first tutorial, make sure you change back to
the riscv_full_decoder
directory.
Open up the file riscv32_decoder.h
. All the necessary include files have
already been added and the namespaces have been set up.
After the comment marked //Exercise 1 - step 1
define the class
RiscV32IsaFactory
inheriting from RiscV32IInstructionSetFactory
.
class RiscV32IsaFactory : public RiscV32IInstructionSetFactory {};
Next, define the override for CreateRiscv32Slot
. Since we don't use any
derived classes of Riscv32Slot
, we simply allocate a new instance using
std::make_unique
.
std::unique_ptr<Riscv32Slot> CreateRiscv32Slot(ArchState *state) override {
return std::make_unique<Riscv32Slot>(state);
}
If you need help (or want to check your work), the full answer is here.
Define the decoder class
Constructors, destructor, and method declarations
Next it's time to define the decoder class. In the same file as above, go to the
declaration of RiscV32Decoder
. Expand the declaration into a class definition
where RiscV32Decoder
inherits from generic::DecoderInterface
.
class RiscV32Decoder : public generic::DecoderInterface {
public:
};
Next, before we write the constructor, let's take a quick look at the code
generated in our second tutorial on the binary decoder. In addition to all the
Extract
functions, there is the function DecodeRiscVInst32
:
OpcodeEnum DecodeRiscVInst32(uint32_t inst_word);
This function takes the instruction word that needs to be decoded, and returns
the opcode that matches that instruction. On the other hand, the
DecoderInterface
class that RiscV32Decoder
implements only passes in an
address. Thus, the RiscV32Decoder
class has to be able to access memory to
read the instruction word to pass to DecodeRiscVInst32()
. In this project
the way to access memory is through a simple memory interface defined in
.../mpact/sim/util/memory
aptly named util::MemoryInterface
, seen below:
// Load data from address into the DataBuffer, then schedule the Instruction
// inst (if not nullptr) to be executed (using the function delay line) with
// context. The size of the data access is based on size of the data buffer.
virtual void Load(uint64_t address, DataBuffer *db, Instruction *inst,
ReferenceCount *context) = 0;
In addition we need to be able to pass a state
class instance to the
constructors of the other decoder classes. The appropriate state class is the
riscv::RiscVState
class, which derives from generic::ArchState
, with added
functionality for RiscV. This means we must declare the constructor so that it
can take a pointer to the state
and the memory
:
RiscV32Decoder(riscv::RiscVState *state, util::MemoryInterface *memory);
Delete the default constructor and override the destructor:
RiscV32Decoder() = delete;
~RiscV32Decoder() override;
Next declare the DecodeInstruction
method we need to override from
generic::DecoderInterface
.
generic::Instruction *DecodeInstruction(uint64_t address) override;
If you need help (or want to check your work), the full answer is here.
Data Member Definitions
The RiscV32Decoder
class will need private data members to store the
constructor parameters and a pointer to the factory class.
private:
riscv::RiscVState *state_;
util::MemoryInterface *memory_;
It also needs a pointer to the encoding class that is derived from
RiscV32IEncodingBase
. Let's call it RiscV32IEncoding
(we will implement
this in exercise 2). Additionally it needs a pointer to an instance of
RiscV32IInstructionSet
, so add:
RiscV32IsaFactory *riscv_isa_factory_;
RiscV32IEncoding *riscv_encoding_;
RiscV32IInstructionSet *riscv_isa_;
Finally, we need to define a data member for use with our memory interface:
generic::DataBuffer *inst_db_;
If you need help (or want to check your work), the full answer is here.
Define the Decoder Class Methods
Next, it's time to implement the constructor, destructor, and the
DecodeInstruction
method. Open up the file riscv32_decoder.cc
. The empty
methods are already in the file as well as namespace declarations and a couple
of using
declarations.
Constructor Definition
The constructor only needs to initialize the data members. First initialize the
state_
and memory_
:
RiscV32Decoder::RiscV32Decoder(riscv::RiscVState *state,
util::MemoryInterface *memory)
: state_(state), memory_(memory) {
Next allocate instances of each of the decoder related classes, passing in the appropriate parameters.
// Allocate the isa factory class, the top level isa decoder instance, and
// the encoding parser.
riscv_isa_factory_ = new RiscV32IsaFactory();
riscv_isa_ = new RiscV32IInstructionSet(state, riscv_isa_factory_);
riscv_encoding_ = new RiscV32IEncoding(state);
Finally, allocate the DataBuffer
instance. It is allocated using a factory
accessible through the state_
member. We allocate a data buffer sized to store
a single uint32_t
, as that is the size of the instruction word.
inst_db_ = state_->db_factory()->Allocate<uint32_t>(1);
Destructor Definition
The destructor is simple: free the objects we allocated in the constructor,
with one twist. The data buffer instance is reference counted, so instead
of calling delete
on that pointer, we DecRef()
the object:
RiscV32Decoder::~RiscV32Decoder() {
inst_db_->DecRef();
delete riscv_isa_;
delete riscv_isa_factory_;
delete riscv_encoding_;
}
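The reason for calling DecRef() rather than delete is easiest to see with a toy reference-counted object. The sketch below is a hypothetical ToyBuffer, not the MPACT-Sim DataBuffer: the object frees itself when its last reference is dropped, and the destructor is private so that callers cannot delete it directly. The live_count pointer exists only so the behavior can be observed.

```cpp
// Hypothetical reference-counted buffer; not the MPACT-Sim DataBuffer.
class ToyBuffer {
 public:
  explicit ToyBuffer(int *live_count) : live_count_(live_count) {
    ++*live_count_;
  }
  void IncRef() { ++ref_count_; }
  // Drops one reference and frees the object when none remain.
  void DecRef() {
    if (--ref_count_ == 0) delete this;
  }

 private:
  ~ToyBuffer() { --*live_count_; }  // Private: callers must use DecRef().
  int ref_count_ = 1;  // The allocating owner holds the first reference.
  int *live_count_;
};
```

If another component had taken a reference with IncRef(), the owner's DecRef() would leave the buffer alive until that component released it too.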
Method definition
In our case, the implementation of this method is pretty simple. We will assume that the address is properly aligned and no additional error checking is required.
First, the instruction word has to be fetched from memory using the memory
interface and the DataBuffer
instance.
memory_->Load(address, inst_db_, nullptr, nullptr);
uint32_t iword = inst_db_->Get<uint32_t>(0);
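For intuition about what the fetch produces, here is a minimal sketch that reads a 32-bit little-endian instruction word from a flat byte vector. The function name ToyFetchWord and the flat-memory model are assumptions for illustration and are unrelated to util::MemoryInterface. Unlike the tutorial code, it checks alignment and bounds explicitly.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical flat byte-addressed memory; not util::MemoryInterface.
// Stores the little-endian 32-bit word at address into *word and returns
// true, or returns false if the address is misaligned or out of range.
bool ToyFetchWord(const std::vector<uint8_t> &memory, uint64_t address,
                  uint32_t *word) {
  if (address % 4 != 0) return false;
  if (memory.size() < 4 || address > memory.size() - 4) return false;
  uint32_t value = 0;
  for (int i = 0; i < 4; ++i) {
    value |= static_cast<uint32_t>(memory[address + i]) << (8 * i);
  }
  *word = value;
  return true;
}
```

With the bytes 0x93 0x00 0x10 0x00 at address 0, the fetched word is 0x00100093, which is the encoding of addi x1, x0, 1.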
Next, we call into the RiscV32IEncoding
instance to parse the instruction word,
which has to be done before calling the ISA decoder itself. Recall that the ISA
decoder calls into the RiscV32IEncoding
instance directly to obtain the opcode
and operands specified by the instruction word. We haven't implemented that
class yet, but let's use void ParseInstruction(uint32_t)
as that method.
riscv_encoding_->ParseInstruction(iword);
Finally we call the ISA decoder, passing in the address and the Encoding class.
auto *instruction = riscv_isa_->Decode(address, riscv_encoding_);
return instruction;
If you need help (or want to check your work), the full answer is here.
The encoding class
The encoding class implements an interface that is used by the decoder class to obtain the instruction opcode, its source and destination operands, and resource operands. These objects all depend on information from the binary format decoder, such as the opcode, values of specific fields in the instruction word etc. This is separated from the decoder class to keep it encoding agnostic and enable support for multiple different encoding schemes in the future.
The RiscV32IEncodingBase
is an abstract class. The set of methods we have to
implement in our derived class is shown below.
class RiscV32IEncodingBase {
public:
virtual ~RiscV32IEncodingBase() = default;
virtual OpcodeEnum GetOpcode(SlotEnum slot, int entry) = 0;
virtual ResourceOperandInterface *
GetSimpleResourceOperand(SlotEnum slot, int entry, OpcodeEnum opcode,
SimpleResourceVector &resource_vec, int end) = 0;
virtual ResourceOperandInterface *
GetComplexResourceOperand(SlotEnum slot, int entry, OpcodeEnum opcode,
ComplexResourceEnum resource_op,
int begin, int end) = 0;
virtual PredicateOperandInterface *
GetPredicate(SlotEnum slot, int entry, OpcodeEnum opcode,
PredOpEnum pred_op) = 0;
virtual SourceOperandInterface *
GetSource(SlotEnum slot, int entry, OpcodeEnum opcode,
SourceOpEnum source_op, int source_no) = 0;
virtual DestinationOperandInterface *
GetDestination(SlotEnum slot, int entry, OpcodeEnum opcode,
DestOpEnum dest_op, int dest_no, int latency) = 0;
virtual int GetLatency(SlotEnum slot, int entry, OpcodeEnum opcode,
DestOpEnum dest_op, int dest_no) = 0;
};
At first glance it looks a bit complicated, particularly with the number of parameters, but for a simple architecture like RiscV we actually ignore most of the parameters, as their values will be implied.
Let's go through each of the methods in turn.
OpcodeEnum GetOpcode(SlotEnum slot, int entry);
The GetOpcode
method returns the OpcodeEnum
member for the current
instruction, identifying the instruction opcode. The OpcodeEnum
class is
defined in the generated isa decoder file riscv32i_enums.h
. The method takes
two parameters, both of which can be ignored for our purposes. The first of
these is the slot type (an enum class also defined in riscv32i_enums.h
),
which, since RiscV only has a single slot, has only one possible value:
SlotEnum::kRiscv32
. The second is the instance number of the slot (in case
there are multiple instances of the slot, which may occur in some VLIW
architectures).
ResourceOperandInterface *
GetSimpleResourceOperand(SlotEnum slot, int entry, OpcodeEnum opcode,
SimpleResourceVector &resource_vec, int end);
ResourceOperandInterface *
GetComplexResourceOperand(SlotEnum slot, int entry, OpcodeEnum opcode,
ComplexResourceEnum resource_op,
int begin, int end);
The next two methods are used for modeling hardware resources in the processor
in order to improve cycle accuracy. For our tutorial exercises, we will not use
these, so in the implementation, they will be stubbed out, returning nullptr
.
PredicateOperandInterface *
GetPredicate(SlotEnum slot, int entry, OpcodeEnum opcode,
PredOpEnum pred_op);
SourceOperandInterface *
GetSource(SlotEnum slot, int entry, OpcodeEnum opcode,
SourceOpEnum source_op, int source_no);
DestinationOperandInterface *
GetDestination(SlotEnum slot, int entry, OpcodeEnum opcode,
DestOpEnum dest_op, int dest_no, int latency);
These three methods return pointers to operand objects that are used within
the instruction semantic functions to access the value of any instruction
predicate operand, each of the instruction source operands, and write new
values to the instruction destination operands. Since RiscV does not use
instruction predicates, that method need only return nullptr
.
The pattern of parameters is similar across these functions. First, just like
GetOpcode
the slot and the entry are passed in. Then the opcode for the
instruction for which the operand has to be created. This is only used if the
different opcodes need to return different operand objects for the same operand
types, which is not the case for this RiscV simulator.
Next is the Predicate, Source, and Destination, operand enumeration entry which
identifies the operand that has to be created. These come from the three
OpEnums in the riscv32i_enums.h
as seen below:
enum class PredOpEnum {
kNone = 0,
kPastMaxValue = 1,
};
enum class SourceOpEnum {
kNone = 0,
kBimm12 = 1,
kCsr = 2,
kImm12 = 3,
kJimm20 = 4,
kRs1 = 5,
kRs2 = 6,
kSimm12 = 7,
kUimm20 = 8,
kUimm5 = 9,
kPastMaxValue = 10,
};
enum class DestOpEnum {
kNone = 0,
kCsr = 1,
kNextPc = 2,
kRd = 3,
kPastMaxValue = 4,
};
If you look back at the
riscv32i.isa
file, you'll note that these correspond to the sets of source and destination
operand names used in the declaration of each instruction. By using different
operand names for operands that represent different bitfields and operand
types, it makes writing the encoding class easier as the enum member uniquely
determines the exact operand type to return, and it is not necessary to
consider the values of the slot, entry, or opcode parameters.
Finally, for source and destination operands, the ordinal position of the operand is passed in (again, we can ignore this), and for the destination operand, the latency (in cycles) that elapses between the time the instruction is issued, and the destination result is available to subsequent instructions. In our simulator, this latency will be 0, meaning that the instruction writes the result out immediately to the register.
int GetLatency(SlotEnum slot, int entry, OpcodeEnum opcode,
DestOpEnum dest_op, int dest_no);
The final function is used to get the latency of a particular destination
operand if it has been specified as *
in the .isa
file. This is uncommon,
and is not used for this RiscV simulator, so our implementation of this function
will just return 0.
Define the encoding class
Header file (.h)
Methods
Open up the file riscv32i_encoding.h
. All the necessary include files have
already been added and the namespaces have been set up. All code addition is
done following the comment // Exercise 2.
Let's start by defining a class RiscV32IEncoding
that inherits from the
generated interface.
class RiscV32IEncoding : public RiscV32IEncodingBase {
public:
};
Next, the constructor should take a pointer to the state instance, in this case
a pointer to riscv::RiscVState
. The default destructor should be used.
explicit RiscV32IEncoding(riscv::RiscVState *state);
~RiscV32IEncoding() override = default;
Before we add in all the interface methods, let's add in the method called by
RiscV32Decoder
to parse the instruction:
void ParseInstruction(uint32_t inst_word);
Next, let's add in those methods that have trivial overrides while dropping the names of the parameters that are not used:
// Trivial overrides.
ResourceOperandInterface *GetSimpleResourceOperand(SlotEnum, int, OpcodeEnum,
SimpleResourceVector &,
int) override {
return nullptr;
}
ResourceOperandInterface *GetComplexResourceOperand(SlotEnum, int, OpcodeEnum,
ComplexResourceEnum ,
int, int) override {
return nullptr;
}
PredicateOperandInterface *GetPredicate(SlotEnum, int, OpcodeEnum,
PredOpEnum) override {
return nullptr;
}
int GetLatency(SlotEnum, int, OpcodeEnum, DestOpEnum, int) override { return 0; }
Finally add in the remaining method overrides of the public interface but with the implementations deferred to the .cc file.
OpcodeEnum GetOpcode(SlotEnum, int) override;
SourceOperandInterface *GetSource(SlotEnum , int, OpcodeEnum,
SourceOpEnum source_op, int) override;
DestinationOperandInterface *GetDestination(SlotEnum, int, OpcodeEnum,
DestOpEnum dest_op, int,
int latency) override;
In order to simplify the implementation of each of the operand getter methods
we will create two arrays of callables (function objects) indexed by the
numeric value of the SourceOpEnum
and DestOpEnum
members respectively.
This way the bodies of these two methods reduce to calling the
function object for the enum value that is passed in and returning its return
value.
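The dispatch technique described above can be sketched in isolation. This example uses std::function from the standard library in place of absl::AnyInvocable, and a hypothetical ToyOpEnum and ToyGetters class in place of the generated enums and the encoding class. The getters return strings only to keep the sketch self-contained.

```cpp
#include <functional>
#include <string>

// Hypothetical operand enum in the style of the generated ones.
enum class ToyOpEnum { kNone = 0, kRs1 = 1, kImm12 = 2, kPastMaxValue = 3 };

class ToyGetters {
 public:
  ToyGetters() {
    getters_[Index(ToyOpEnum::kNone)] = [] { return std::string("none"); };
    getters_[Index(ToyOpEnum::kRs1)] = [] { return std::string("register"); };
    getters_[Index(ToyOpEnum::kImm12)] = [] {
      return std::string("immediate");
    };
  }
  // The whole getter body is one array lookup plus one call.
  std::string Get(ToyOpEnum op) const { return getters_[Index(op)](); }

 private:
  static int Index(ToyOpEnum op) { return static_cast<int>(op); }
  // kPastMaxValue doubles as the array size.
  std::function<std::string()>
      getters_[static_cast<int>(ToyOpEnum::kPastMaxValue)];
};
```

Adding a new operand kind then means adding one enum member and one array entry, with no change to Get().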
To organize the initialization of these two arrays we define two private methods that will be called from the constructor as follows:
private:
void InitializeSourceOperandGetters();
void InitializeDestinationOperandGetters();
Data members
The data members required are as follows:
- state_ to hold the riscv::RiscVState * value.
- inst_word_ of type uint32_t, which holds the value of the current instruction word.
- opcode_ to hold the opcode of the current instruction, which is updated by the ParseInstruction method. This has type OpcodeEnum.
- source_op_getters_, an array to store the callables used to obtain source operand objects. The type of the array elements is absl::AnyInvocable<SourceOperandInterface *()>.
- dest_op_getters_, an array to store the callables used to obtain destination operand objects. The type of the array elements is absl::AnyInvocable<DestinationOperandInterface *(int)>.
- xreg_alias_, an array of RiscV integer register ABI names, e.g., "zero" and "ra" instead of "x0" and "x1".
riscv::RiscVState *state_;
uint32_t inst_word_;
OpcodeEnum opcode_;
absl::AnyInvocable<SourceOperandInterface *()>
source_op_getters_[static_cast<int>(SourceOpEnum::kPastMaxValue)];
absl::AnyInvocable<DestinationOperandInterface *(int)>
dest_op_getters_[static_cast<int>(DestOpEnum::kPastMaxValue)];
const std::string xreg_alias_[32] = {
"zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1", "a0",
"a1", "a2", "a3", "a4", "a5", "a6", "a7", "s2", "s3", "s4", "s5",
"s6", "s7", "s8", "s9", "s10", "s11", "t3", "t4", "t5", "t6"};
If you need help (or want to check your work), the full answer is here.
Source file (.cc)
Open up the file riscv32i_encoding.cc
. All the necessary include files have
already been added and the namespaces have been set up. All code addition is
done following the comment // Exercise 2.
Helper functions
We will start by writing a couple of helper functions that we use to create
source and destination register operands. These will be templated on the
register type and will call into the RiscVState
object to get a handle to the
register object, and then call an operand factory method in the register object.
Let's start with the destination operand helpers:
template <typename RegType>
inline DestinationOperandInterface *GetRegisterDestinationOp(
RiscVState *state, const std::string &name, int latency) {
auto *reg = state->GetRegister<RegType>(name).first;
return reg->CreateDestinationOperand(latency);
}
template <typename RegType>
inline DestinationOperandInterface *GetRegisterDestinationOp(
RiscVState *state, const std::string &name, int latency,
const std::string &op_name) {
auto *reg = state->GetRegister<RegType>(name).first;
return reg->CreateDestinationOperand(latency, op_name);
}
As you can see, there are two helper functions. The second takes an additional
parameter op_name
that allows the operand to have a different name, or string
representation, than the underlying register.
Similarly for the source operand helpers:
template <typename RegType>
inline SourceOperandInterface *GetRegisterSourceOp(RiscVState *state,
const std::string &reg_name) {
auto *reg = state->GetRegister<RegType>(reg_name).first;
auto *op = reg->CreateSourceOperand();
return op;
}
template <typename RegType>
inline SourceOperandInterface *GetRegisterSourceOp(RiscVState *state,
const std::string &reg_name,
const std::string &op_name) {
auto *reg = state->GetRegister<RegType>(reg_name).first;
auto *op = reg->CreateSourceOperand(op_name);
return op;
}
Constructor and interface functions
The constructor and the interface functions are very simple. The constructor just calls the two initialize methods to initialize the callables arrays for the operand getters.
RiscV32IEncoding::RiscV32IEncoding(RiscVState *state) : state_(state) {
InitializeSourceOperandGetters();
InitializeDestinationOperandGetters();
}
ParseInstruction
stores the instruction word and then the opcode that it
obtains from calling into the binary decoder generated code.
// Parse the instruction word to determine the opcode.
void RiscV32IEncoding::ParseInstruction(uint32_t inst_word) {
inst_word_ = inst_word;
opcode_ = mpact::sim::codelab::DecodeRiscVInst32(inst_word_);
}
Lastly, each operand getter looks up the getter function in its array using the destination/source operand enum value, calls it, and returns the result.
DestinationOperandInterface *RiscV32IEncoding::GetDestination(
SlotEnum, int, OpcodeEnum, DestOpEnum dest_op, int, int latency) {
return dest_op_getters_[static_cast<int>(dest_op)](latency);
}
SourceOperandInterface *RiscV32IEncoding::GetSource(SlotEnum, int, OpcodeEnum,
SourceOpEnum source_op, int) {
return source_op_getters_[static_cast<int>(source_op)]();
}
Array initialization methods
As you may have guessed, most of the work is in initializing the getter
arrays, but don't worry, it's done using an easy, repeating pattern. Let's
start with InitializeDestinationOperandGetters()
first, since there are only a
couple of destination operands.
Recall the generated DestOpEnum
class from riscv32i_enums.h
:
enum class DestOpEnum {
kNone = 0,
kCsr = 1,
kNextPc = 2,
kRd = 3,
kPastMaxValue = 4,
};
For dest_op_getters_
we need to initialize 4 entries, one each for kNone
,
kCsr
, kNextPc
and kRd
. For convenience, each entry is initialized with a
lambda, though you could use any other form of callable as well. The signature
of the lambda is DestinationOperandInterface *(int latency)
.
Up to now we haven't talked much about the different kinds of destination
operands that are defined in MPACT-Sim. For this exercise we will only use two
types: generic::RegisterDestinationOperand
defined in
register.h
,
and generic::DevNullOperand
defined in
devnull_operand.h
.
The details of these operands aren't really important right now, except that the
former is used to write to registers, and the latter ignores all writes.
The first entry for kNone
is trivial - just return a nullptr and optionally
log an error.
void RiscV32IEncoding::InitializeDestinationOperandGetters() {
// Destination operand getters.
dest_op_getters_[static_cast<int>(DestOpEnum::kNone)] = [](int) {
return nullptr;
};
Next is kCsr
. Here we are going to cheat a little. The "hello world" program
doesn't rely on any actual CSR update, but there is some boilerplate code that
executes CSR instructions. The solution is to stub this out by using a
regular register named "CSR" and channel all such writes to it.
dest_op_getters_[static_cast<int>(DestOpEnum::kCsr)] = [this](int latency) {
return GetRegisterDestinationOp<RV32Register>(state_, "CSR", latency);
};
Next is kNextPc
, which refers to the "pc" register. It is used as the target
for all branch and jump instructions. The name is defined in RiscVState
as
kPcName
.
dest_op_getters_[static_cast<int>(DestOpEnum::kNextPc)] = [this](int latency) {
return GetRegisterDestinationOp<RV32Register>(state_, RiscVState::kPcName, latency);
};
Finally there is the kRd
destination operand. In riscv32i.isa
the operand
rd
is only used to refer to the integer register encoded in the "rd" field
of the instruction word, so there is no ambiguity about what it refers to. There
is only one complication. Register x0
(abi name zero
) is hardwired to 0,
so for that register we use the DevNullOperand
.
So in this getter we first extract the value in the rd
field using the
Extract
method generated from the .bin_fmt file. If the value is 0, we
return a "DevNull" operand, otherwise we return the correct register operand,
taking care to use the appropriate register alias as the operand name.
dest_op_getters_[static_cast<int>(DestOpEnum::kRd)] =
[this](int latency) -> DestinationOperandInterface * {
// First extract register number from rd field.
int num = inst32_format::ExtractRd(inst_word_);
// For register x0, return the DevNull operand.
if (num == 0) return new DevNullOperand<uint32_t>(state_, {1});
// Return the proper register operand, named by its ABI alias.
return GetRegisterDestinationOp<RV32Register>(
state_, absl::StrCat(RiscVState::kXregPrefix, num), latency,
xreg_alias_[num]);
};
}
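For reference, the register fields used by these getters sit at fixed positions in the 32-bit RISC-V base formats: rd in bits 11..7, rs1 in bits 19..15, and rs2 in bits 24..20. The hand-written sketch below shows the shift-and-mask that such an extractor performs. The ToyExtract names are hypothetical, and the generated Extract functions may be implemented differently.

```cpp
#include <cstdint>

// Hand-written equivalents of register field extractors for the 32-bit
// RISC-V base formats. Hypothetical names, not the generated functions.
constexpr uint32_t ToyExtractRd(uint32_t inst) { return (inst >> 7) & 0x1f; }
constexpr uint32_t ToyExtractRs1(uint32_t inst) { return (inst >> 15) & 0x1f; }
constexpr uint32_t ToyExtractRs2(uint32_t inst) { return (inst >> 20) & 0x1f; }
```

For the word 0x002081b3 (add x3, x1, x2) these return 3, 1, and 2 respectively.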
Now onto the InitializeSourceOperandGetters()
method, where the pattern is
much the same, but the details differ slightly.
First let's take a look at the SourceOpEnum
that was generated from
riscv32i.isa
in the first tutorial:
enum class SourceOpEnum {
kNone = 0,
kBimm12 = 1,
kCsr = 2,
kImm12 = 3,
kJimm20 = 4,
kRs1 = 5,
kRs2 = 6,
kSimm12 = 7,
kUimm20 = 8,
kUimm5 = 9,
kPastMaxValue = 10,
};
Examining the members, in addition to kNone
, they fall into two groups. One
is immediate operands: kBimm12
, kImm12
, kJimm20
, kSimm12
, kUimm20
,
and kUimm5
. The other is the register operands: kCsr
, kRs1
, and kRs2
.
The kNone
operand is handled just like for destination operands - return a
nullptr.
void RiscV32IEncoding::InitializeSourceOperandGetters() {
// Source operand getters.
source_op_getters_[static_cast<int>(SourceOpEnum::kNone)] = [] () {
return nullptr;
};
Next, let's work on the register operands. We will handle the kCsr
similarly
to how we handled the corresponding destination operand - just call the
helper function using "CSR" as the register name.
// Register operands.
source_op_getters_[static_cast<int>(SourceOpEnum::kCsr)] = [this]() {
return GetRegisterSourceOp<RV32Register>(state_, "CSR");
};
Operands kRs1
and kRs2
are handled equivalently to kRd
, except that
while we didn't want to update x0
(or zero
), we want to make sure that
we always read 0 from that operand. For that we will use the
generic::IntLiteralOperand<>
class defined in
literal_operand.h
.
This operand is used to store a literal value (as opposed to a simulated
immediate value). Otherwise the pattern is the same: first extract the
rs1/rs2 value from the instruction word, if it is zero return the literal
operand with a 0 template parameter, otherwise return a regular register
source operand using the helper function, using the abi alias as the operand
name.
source_op_getters_[static_cast<int>(SourceOpEnum::kRs1)] =
[this]() -> SourceOperandInterface * {
int num = inst32_format::ExtractRs1(inst_word_);
if (num == 0) return new IntLiteralOperand<0>({1}, xreg_alias_[0]);
return GetRegisterSourceOp<RV32Register>(
state_, absl::StrCat(RiscVState::kXregPrefix, num), xreg_alias_[num]);
};
source_op_getters_[static_cast<int>(SourceOpEnum::kRs2)] =
[this]() -> SourceOperandInterface * {
int num = inst32_format::ExtractRs2(inst_word_);
if (num == 0) return new IntLiteralOperand<0>({1}, xreg_alias_[0]);
return GetRegisterSourceOp<RV32Register>(
state_, absl::StrCat(RiscVState::kXregPrefix, num), xreg_alias_[num]);
};
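The combined effect of the DevNull destination operand and the literal-zero source operand is that x0 discards writes and always reads as 0. A toy register file makes that behavior concrete. ToyRegFile is hypothetical; the simulator achieves the same result through the operand objects chosen at decode time rather than through checks at access time.

```cpp
#include <array>
#include <cstdint>

// Hypothetical register file; the simulator uses operand objects instead.
class ToyRegFile {
 public:
  // Writes to x0 are discarded, like writes through a "DevNull" operand.
  void Write(int num, uint32_t value) {
    if (num != 0) regs_[num] = value;
  }
  // Reads of x0 always yield 0, like reads of a literal-zero operand.
  uint32_t Read(int num) const { return num == 0 ? 0 : regs_[num]; }

 private:
  std::array<uint32_t, 32> regs_{};
};
```

Choosing the operand type once at decode time keeps the x0 check out of every semantic function.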
Finally we handle the different immediate operands. Immediate values are
stored in instances of the class generic::ImmediateOperand<>
defined in
immediate_operand.h
.
The only difference between the different getters for the immediate operands
is which Extractor function is used, and whether the storage type is signed or
unsigned, according to the bitfield.
// Immediates.
source_op_getters_[static_cast<int>(SourceOpEnum::kBimm12)] = [this]() {
return new ImmediateOperand<int32_t>(
inst32_format::ExtractBImm(inst_word_));
};
source_op_getters_[static_cast<int>(SourceOpEnum::kImm12)] = [this]() {
return new ImmediateOperand<int32_t>(
inst32_format::ExtractImm12(inst_word_));
};
source_op_getters_[static_cast<int>(SourceOpEnum::kUimm5)] = [this]() {
return new ImmediateOperand<uint32_t>(
inst32_format::ExtractUimm5(inst_word_));
};
source_op_getters_[static_cast<int>(SourceOpEnum::kJimm20)] = [this]() {
return new ImmediateOperand<int32_t>(
inst32_format::ExtractJImm(inst_word_));
};
source_op_getters_[static_cast<int>(SourceOpEnum::kSimm12)] = [this]() {
return new ImmediateOperand<int32_t>(
inst32_format::ExtractSImm(inst_word_));
};
source_op_getters_[static_cast<int>(SourceOpEnum::kUimm20)] = [this]() {
return new ImmediateOperand<uint32_t>(
inst32_format::ExtractUimm32(inst_word_));
};
}
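The signed versus unsigned storage types matter because some immediate fields must be sign-extended. The sketch below shows this for two fields of the RISC-V base formats: the 12-bit I-type immediate in bits 31..20, which is signed, and the 20-bit U-type field in bits 31..12, treated here as unsigned. The ToyExtract names are hypothetical, and the generated extractors may return values in a different form.

```cpp
#include <cstdint>

// I-type immediate: bits 31..20, a signed 12-bit value.
inline int32_t ToyExtractImm12(uint32_t inst) {
  int32_t value = static_cast<int32_t>((inst >> 20) & 0xfff);
  if (value & 0x800) value -= 0x1000;  // Sign-extend from bit 11.
  return value;
}

// U-type immediate field: bits 31..12, an unsigned 20-bit value.
inline uint32_t ToyExtractUimm20(uint32_t inst) { return inst >> 12; }
```

For example, 0xfff00093 (addi x1, x0, -1) yields -1 from ToyExtractImm12, while 0x123450b7 (lui x1, 0x12345) yields 0x12345 from ToyExtractUimm20.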
If you need help (or want to check your work), the full answer is here.
This concludes this tutorial. We hope it has been useful.