The objectives of this tutorial are:
- Learn how semantic functions are used to implement instruction semantics.
- Learn how semantic functions relate to the ISA decoder description.
- Write instruction semantic functions for RiscV RV32I instructions.
- Test the final simulator by running a small "Hello World" executable.
Overview of semantic functions
A semantic function in MPACT-Sim is a function that implements the operation of an instruction so that its side-effects are visible in the simulated state in the same way the instruction's side-effects are visible when executed in hardware. The simulator's internal representation of each decoded instruction contains a callable that is used to call the semantic function for that instruction.
A semantic function has the signature void(Instruction *), that is, a function that takes a pointer to an instance of the Instruction class and returns void.
The Instruction class is defined in instruction.h.
For the purposes of writing semantic functions we are particularly interested in
the source and destination operand interface vectors accessed using the
Source(int i)
and Destination(int i)
method calls.
The source and destination operand interfaces are shown below:
// The source operand interface provides an interface to access input values
// to instructions in a way that is agnostic about the underlying implementation
// of those values (eg., register, fifo, immediate, predicate, etc).
class SourceOperandInterface {
public:
// Methods for accessing the nth value element.
virtual bool AsBool(int index) = 0;
virtual int8_t AsInt8(int index) = 0;
virtual uint8_t AsUint8(int index) = 0;
virtual int16_t AsInt16(int index) = 0;
virtual uint16_t AsUint16(int index) = 0;
virtual int32_t AsInt32(int index) = 0;
virtual uint32_t AsUint32(int index) = 0;
virtual int64_t AsInt64(int index) = 0;
virtual uint64_t AsUint64(int index) = 0;
// Return a pointer to the object instance that implements the state in
// question (or nullptr if no such object "makes sense"). This is used if
// the object requires additional manipulation - such as a fifo that needs
// to be pop'ed. If no such manipulation is required, nullptr should be
// returned.
virtual std::any GetObject() const = 0;
// Return the shape of the operand (the number of elements in each dimension).
// For instance {1} indicates a scalar quantity, whereas {128} indicates a
// 128 element vector quantity.
virtual std::vector<int> shape() const = 0;
// Return a string representation of the operand suitable for display in
// disassembly.
virtual std::string AsString() const = 0;
virtual ~SourceOperandInterface() = default;
};
// The destination operand interface is used by instruction semantic functions
// to get a writable DataBuffer associated with a piece of simulated state to
// which the new value can be written, and then used to update the value of
// the piece of state with a given latency.
class DestinationOperandInterface {
public:
virtual ~DestinationOperandInterface() = default;
// Allocates a data buffer with ownership, latency and delay line set up.
virtual DataBuffer *AllocateDataBuffer() = 0;
// Takes an existing data buffer, and initializes it for the destination
// as if AllocateDataBuffer had been called.
virtual void InitializeDataBuffer(DataBuffer *db) = 0;
// Allocates and initializes data buffer as if AllocateDataBuffer had been
// called, but also copies in the value from the current value of the
// destination.
virtual DataBuffer *CopyDataBuffer() = 0;
// Returns the latency associated with the destination operand.
virtual int latency() const = 0;
// Return a pointer to the object instance that implements the state in
// question (or nullptr if no such object "makes sense").
virtual std::any GetObject() const = 0;
// Returns the order of the destination operand (size in each dimension).
virtual std::vector<int> shape() const = 0;
// Return a string representation of the operand suitable for display in
// disassembly.
virtual std::string AsString() const = 0;
};
The basic way of writing a semantic function for a normal 3 operand instruction, such as a 32-bit add, is as follows:
void MyAddFunction(Instruction *inst) {
uint32_t a = inst->Source(0)->AsUint32(0);
uint32_t b = inst->Source(1)->AsUint32(0);
uint32_t c = a + b;
DataBuffer *db = inst->Destination(0)->AllocateDataBuffer();
db->Set<uint32_t>(0, c);
db->Submit();
}
Let's break down the pieces of this function. The first two lines of the function body read from source operands 0 and 1. The AsUint32(0) call interprets the underlying data as a uint32_t array and fetches the 0th element. This is true regardless of whether the underlying register or value is array valued or not. The size (in elements) of the source operand can be obtained from the source operand method shape(), which returns a vector containing the number of elements in each dimension. That method returns {1} for a scalar, {16} for a 16 element vector, and {4, 4} for a 4x4 array.
uint32_t a = inst->Source(0)->AsUint32(0);
uint32_t b = inst->Source(1)->AsUint32(0);
Then a uint32_t temporary named c is assigned the value a + b.
The next line may require a little more explanation:
DataBuffer *db = inst->Destination(0)->AllocateDataBuffer();
A DataBuffer is a reference counted object that is used to store values in
simulated state such as registers. It is relatively untyped, though it has a
size based on the object from which it is allocated. In this case, that size is
sizeof(uint32_t)
. This statement allocates a new data buffer sized for the
destination that is the target of this destination operand - in this case a
32-bit integer register. The DataBuffer is also initialized with the
architectural latency for the instruction. This is specified during instruction
decode.
The next line treats the data buffer instance as an array of uint32_t and writes the value stored in c to the 0th element.
db->Set<uint32_t>(0, c);
Finally, the last statement submits the data buffer to the simulator to be used as the new value of the target machine state (in this case a register) after the latency of the instruction that was set when the instruction was decoded and the destination operand vector populated.
While this is a reasonably brief function, it does have a bit of boilerplate code that becomes repetitive when implementing instruction after instruction. Additionally, it can obscure the actual semantics of the instruction. In order to further simplify writing the semantic functions for most instructions, there are a number of templated helper functions defined in instruction_helpers.h. These helpers hide the boilerplate code for instructions with one, two or three source operands, and a single destination operand. Let's take a look at the two operand helper functions:
// This is a templated helper function used to factor out common code in
// two operand instruction semantic functions. It reads two source operands
// and applies the function argument to them, storing the result to the
// destination operand. This version supports different types for the result and
// each of the two source operands.
template <typename Result, typename Argument1, typename Argument2>
inline void BinaryOp(Instruction *instruction,
std::function<Result(Argument1, Argument2)> operation) {
Argument1 lhs = generic::GetInstructionSource<Argument1>(instruction, 0);
Argument2 rhs = generic::GetInstructionSource<Argument2>(instruction, 1);
Result dest_value = operation(lhs, rhs);
auto *db = instruction->Destination(0)->AllocateDataBuffer();
db->SetSubmit<Result>(0, dest_value);
}
// This is a templated helper function used to factor out common code in
// two operand instruction semantic functions. It reads two source operands
// and applies the function argument to them, storing the result to the
// destination operand. This version supports different types for the result
// and the operands, but the two source operands must have the same type.
template <typename Result, typename Argument>
inline void BinaryOp(Instruction *instruction,
std::function<Result(Argument, Argument)> operation) {
Argument lhs = generic::GetInstructionSource<Argument>(instruction, 0);
Argument rhs = generic::GetInstructionSource<Argument>(instruction, 1);
Result dest_value = operation(lhs, rhs);
auto *db = instruction->Destination(0)->AllocateDataBuffer();
db->SetSubmit<Result>(0, dest_value);
}
// This is a templated helper function used to factor out common code in
// two operand instruction semantic functions. It reads two source operands
// and applies the function argument to them, storing the result to the
// destination operand. This version requires both result and source operands
// to have the same type.
template <typename Result>
inline void BinaryOp(Instruction *instruction,
std::function<Result(Result, Result)> operation) {
Result lhs = generic::GetInstructionSource<Result>(instruction, 0);
Result rhs = generic::GetInstructionSource<Result>(instruction, 1);
Result dest_value = operation(lhs, rhs);
auto *db = instruction->Destination(0)->AllocateDataBuffer();
db->SetSubmit<Result>(0, dest_value);
}
You will notice that instead of using a statement like:
uint32_t a = inst->Source(0)->AsUint32(0);
the helper functions use:
generic::GetInstructionSource<Argument>(instruction, 0);
GetInstructionSource is a family of templated helper functions that provide templated access to the instruction source operands. Without them, each of the instruction helper functions would have to be specialized for each type to access the source operand with the correct As<int type>() function. You can see the definitions of these template functions in instruction.h.
As you can see, there are three implementations, depending on whether all the operand types are the same, whether the destination type differs from a common source type, or whether all three types differ. Each version of the function takes a pointer to the instruction instance as well as a callable (which includes lambda functions). This means that we can now rewrite the add semantic function above as follows:
void MyAddFunction(Instruction *inst) {
generic::BinaryOp<uint32_t>(inst,
[](uint32_t a, uint32_t b) { return a + b; });
}
When compiled with bazel build -c opt
and copts = ["-O3"]
in the build
file, this should inline completely with no overhead, giving us notational
succinctness without any performance penalties.
As mentioned there are helper functions for unary, binary and ternary scalar instructions as well as vector equivalents. They also serve as useful templates for creating your own helpers for instructions that don't fit the general mold.
Initial build
If you haven't changed directory to riscv_semantic_functions, do so now. Then build the project as follows - this build should succeed.
$ bazel build :riscv32i
...<snip>...
There are no files that are generated, so this is really just a dry run to make sure all is in order.
Add three operand ALU instructions
Now let's add the semantic functions for some generic, 3-operand ALU instructions. Open up the file rv32i_instructions.cc, and make sure that any missing definitions get added to the file rv32i_instructions.h as we go along.
The instructions we will add are:
- add - 32-bit integer add.
- and - 32-bit bitwise and.
- or - 32-bit bitwise or.
- sll - 32-bit logical shift left.
- sltu - 32-bit unsigned set less-than.
- sra - 32-bit arithmetic right shift.
- srl - 32-bit logical right shift.
- sub - 32-bit integer subtract.
- xor - 32-bit bitwise xor.
If you have done the previous tutorials, you may recall that we distinguished between register-register instructions and register-immediate instructions in the decoder. When it comes to semantic functions, we no longer need to do that. The operand interfaces will read the operand value regardless of whether the underlying operand is a register or an immediate, so the semantic function is completely agnostic to what the source operand really is.
Except for sra, all of the instructions above can be treated as operating on 32-bit unsigned values, so for these we can use the BinaryOp template function we looked at earlier with only the single template type argument. Fill in the function bodies in rv32i_instructions.cc accordingly. Note that only the low 5 bits of the second operand to the shift instructions are used for the shift amount. Otherwise, all the operations are of the form src0 op src1 (a sketch of a couple of these functions follows the list):
- add: a + b
- and: a & b
- or: a | b
- sll: a << (b & 0x1f)
- sltu: (a < b) ? 1 : 0
- srl: a >> (b & 0x1f)
- sub: a - b
- xor: a ^ b
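For example, minimal sketches of the add and sll semantic functions might look like the following. The function names here are illustrative; use the declarations from rv32i_instructions.h.
void RV32IAdd(Instruction *instruction) {
  generic::BinaryOp<uint32_t>(
      instruction, [](uint32_t a, uint32_t b) { return a + b; });
}
void RV32ISll(Instruction *instruction) {
  generic::BinaryOp<uint32_t>(
      instruction, [](uint32_t a, uint32_t b) { return a << (b & 0x1f); });
}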
For sra we will use the three argument BinaryOp template. Looking at the template, the first type argument is the result type uint32_t. The second is the type of source operand 0, in this case int32_t, and the last is the type of source operand 1, in this case uint32_t. That makes the body of the sra semantic function:
generic::BinaryOp<uint32_t, int32_t, uint32_t>(
instruction, [](int32_t a, uint32_t b) { return a >> (b & 0x1f); });
Go ahead and make the changes and build. You can check your work against rv32i_instructions.cc.
Add two operand ALU instructions
There are only two 2-operand ALU instructions: lui and auipc. The former copies the pre-shifted source operand directly to the destination. The latter adds the instruction address to the immediate before writing it to the destination. The instruction address is accessible from the address() method of the Instruction object.
Since there is only a single source operand, we can't use BinaryOp; instead we need to use UnaryOp. Since we can treat both the source and the destination operands as uint32_t, we can use the single argument template version.
// This is a templated helper function used to factor out common code in
// single operand instruction semantic functions. It reads one source operand
// and applies the function argument to it, storing the result to the
// destination operand. This version supports the result and argument having
// different types.
template <typename Result, typename Argument>
inline void UnaryOp(Instruction *instruction,
std::function<Result(Argument)> operation) {
Argument lhs = generic::GetInstructionSource<Argument>(instruction, 0);
Result dest_value = operation(lhs);
auto *db = instruction->Destination(0)->AllocateDataBuffer();
db->SetSubmit<Result>(0, dest_value);
}
// This is a templated helper function used to factor out common code in
// single operand instruction semantic functions. It reads one source operand
// and applies the function argument to it, storing the result to the
// destination operand. This version requires that the result and argument have
// the same type.
template <typename Result>
inline void UnaryOp(Instruction *instruction,
std::function<Result(Result)> operation) {
Result lhs = generic::GetInstructionSource<Result>(instruction, 0);
Result dest_value = operation(lhs);
auto *db = instruction->Destination(0)->AllocateDataBuffer();
db->SetSubmit<Result>(0, dest_value);
}
The body of the semantic function for lui is about as trivial as it can be: just return the source. The semantic function for auipc introduces a minor issue, since you need to access the address() method in the Instruction instance. The answer is to add instruction to the lambda capture, making it available to use in the lambda function body. Instead of [](uint32_t a) { ... } as before, the lambda should be written [instruction](uint32_t a) { ... }. Now instruction can be used in the lambda body.
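As a hint, minimal sketches of the two semantic functions might look like this (the names RV32ILui and RV32IAuipc are illustrative; use the declarations from rv32i_instructions.h). Note the [instruction] capture in the auipc lambda:
void RV32ILui(Instruction *instruction) {
  // lui: the pre-shifted immediate is copied straight to the destination.
  generic::UnaryOp<uint32_t>(instruction, [](uint32_t a) { return a; });
}
void RV32IAuipc(Instruction *instruction) {
  // auipc: add the instruction address to the immediate.
  generic::UnaryOp<uint32_t>(instruction, [instruction](uint32_t a) {
    return a + static_cast<uint32_t>(instruction->address());
  });
}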
Go ahead and make the changes and build. You can check your work against rv32i_instructions.cc.
Add control flow change instructions
The control flow change instructions that you need to implement are divided into conditional branch instructions (shorter branches that are performed if a comparison holds true), and jump-and-link instructions, which are used to implement function calls (the -and-link is removed by setting the link register to zero, making those writes no-ops).
Add conditional branch instructions
There is no helper function for branch instructions, so there are two options: write the semantic functions from scratch, or write a local helper function. Since we need to implement six branch instructions, the latter seems worth the effort. Before we do that, let's look at the implementation of a branch instruction semantic function written from scratch.
void MyConditionalBranchGreaterEqual(Instruction *instruction) {
int32_t a = generic::GetInstructionSource<int32_t>(instruction, 0);
int32_t b = generic::GetInstructionSource<int32_t>(instruction, 1);
if (a >= b) {
uint32_t offset = generic::GetInstructionSource<uint32_t>(instruction, 2);
uint32_t target = offset + instruction->address();
DataBuffer *db = instruction->Destination(0)->AllocateDataBuffer();
db->Set<uint32_t>(0, target);
db->Submit();
}
}
The only things that vary across the branch instructions are the branch condition and the data type (signed vs unsigned 32-bit int) of the two source operands. That means we need a template parameter for the source operand type. The helper function itself needs to take the Instruction instance and a callable object, such as a std::function that returns bool, as parameters. The helper function would look like:
template <typename OperandType>
static inline void BranchConditional(
Instruction *instruction,
std::function<bool(OperandType, OperandType)> cond) {
OperandType a = generic::GetInstructionSource<OperandType>(instruction, 0);
OperandType b = generic::GetInstructionSource<OperandType>(instruction, 1);
if (cond(a, b)) {
uint32_t offset = generic::GetInstructionSource<uint32_t>(instruction, 2);
uint32_t target = offset + instruction->address();
DataBuffer *db = instruction->Destination(0)->AllocateDataBuffer();
db->Set<uint32_t>(0, target);
db->Submit();
}
}
Now we can write the bge
(signed branch greater or equal) semantic function
as:
void RV32IBge(Instruction *instruction) {
BranchConditional<int32_t>(instruction,
[](int32_t a, int32_t b) { return a >= b; });
}
The remaining branch instructions are as follows:
- Beq - branch equal.
- Bgeu - branch greater or equal (unsigned).
- Blt - branch less than (signed).
- Bltu - branch less than (unsigned).
- Bne - branch not equal.
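For instance, the unsigned less-than branch might be written as sketched below (the function name is illustrative); the remaining branches differ only in the comparison operator and in whether int32_t or uint32_t is used:
void RV32IBltu(Instruction *instruction) {
  BranchConditional<uint32_t>(
      instruction, [](uint32_t a, uint32_t b) { return a < b; });
}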
Go ahead and make the changes to implement these semantic functions, and re-build. You can check your work against rv32i_instructions.cc.
Add jump-and-link instructions
There is no point in writing a helper function for the jump and link instructions, so we will need to write these from scratch. Let's start with looking at their instruction semantics.
The jal
instruction takes an offset from source operand 0 and adds it to the
current pc (instruction address) to compute the jump target. The jump target
gets written to destination operand 0. The return address is the address of the
next sequential instruction. It can be computed by adding the current
instruction's size to its address. The return address gets written to
destination operand 1. Remember to include the instruction object pointer in
the lambda capture.
The jalr
instruction takes a base register as source operand 0 and an offset
as source operand 1, and adds them together to compute the jump target.
Otherwise it is identical to the jal
instruction.
Based on these descriptions of the instruction semantics, write the two semantic functions and build. You can check your work against rv32i_instructions.cc.
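If you get stuck, here is one possible from-scratch sketch of jal based on the description above; the function name is illustrative, and it assumes the Instruction class exposes the instruction's size via a size() method. jalr differs only in that the target is the sum of source operands 0 and 1. (If you instead build these around a lambda-based helper, remember the instruction capture mentioned earlier.)
void RV32IJal(Instruction *instruction) {
  uint32_t offset = generic::GetInstructionSource<uint32_t>(instruction, 0);
  uint32_t target = offset + instruction->address();
  // The return address is the address of the next sequential instruction.
  uint32_t return_address = instruction->address() + instruction->size();
  // Destination 0 receives the jump target, destination 1 the return address.
  DataBuffer *db = instruction->Destination(0)->AllocateDataBuffer();
  db->Set<uint32_t>(0, target);
  db->Submit();
  DataBuffer *link_db = instruction->Destination(1)->AllocateDataBuffer();
  link_db->Set<uint32_t>(0, return_address);
  link_db->Submit();
}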
Add memory store instructions
There are three store instructions that we need to implement: store byte (sb), store halfword (sh), and store word (sw). Store instructions differ from the instructions we have implemented so far in that they don't write to local processor state. Instead they write to a system resource - main memory. MPACT-Sim does not treat memory as an instruction operand, so the memory access has to be performed using another methodology.
The answer is to add memory access methods to the MPACT-Sim ArchState
object,
or more properly, create a new RiscV state object that derives from ArchState
where this can be added. The ArchState
object manages core resources, such as
registers and other state objects. It also manages the delay lines used to
buffer the destination operand data buffers until they can be written back to
the register objects. Most instructions can be implemented without knowledge of this class, but some, like memory operations and other specific system instructions, require functionality to reside in this state object.
Let's take a look at the semantic function for the fence
instruction that is
already implemented in rv32i_instructions.cc
as an example. The fence
instruction holds instruction issue until certain memory operations have
completed. It is used to guarantee memory ordering between instructions
executing before the instruction and those executing after.
// Fence.
void RV32IFence(Instruction *instruction) {
uint32_t bits = instruction->Source(0)->AsUint32(0);
int fm = (bits >> 8) & 0xf;
int predecessor = (bits >> 4) & 0xf;
int successor = bits & 0xf;
auto *state = static_cast<RiscVState *>(instruction->state());
state->Fence(instruction, fm, predecessor, successor);
}
The key part of the fence
instruction's semantic function is the last two
lines. First the state object is fetched using a method in the Instruction class and downcast to the RiscV specific derived class using static_cast<>. Then the Fence
method of the RiscVState
class is called to perform the fence operation.
Store instructions will work similarly. First the effective address of the
memory access is computed from the base and offset instruction source operands,
then the value to be stored is fetched from the next source operand. Next, the
RiscV state object is obtained through the state() method call and static_cast<>, and the appropriate method is called.
The StoreMemory method of the RiscVState object is relatively simple, but it has a couple of implications we need to be aware of:
void StoreMemory(const Instruction *inst, uint64_t address, DataBuffer *db);
As we can see, the method takes three parameters: the pointer to the store instruction itself, the address of the store, and a pointer to a DataBuffer instance that contains the store data. Notice that no size is required; the DataBuffer instance itself contains a size() method. However, there is no
destination operand accessible to the instruction that can be used to
allocate a DataBuffer
instance of the appropriate size. Instead we need to
use a DataBuffer
factory that is obtained from the db_factory()
method in
the Instruction
instance. The factory has a method Allocate(int size)
that returns a DataBuffer
instance of the required size. Here is an example
of how to use this to allocate a DataBuffer
instance for a half-word store
(note the auto
is a C++ feature that deduces the type from the right hand
side of the assignment):
auto *state = down_cast<RiscVState *>(instruction->state());
auto *db = state->db_factory()->Allocate(sizeof(uint16_t));
Once we have the DataBuffer
instance we can write to it as usual:
db->Set<uint16_t>(0, value);
Then pass it to the memory store interface:
state->StoreMemory(instruction, address, db);
We are not quite done yet. The DataBuffer instance is reference counted. This is normally understood and handled by the Submit method, so as to keep the most frequent use-case as simple as possible. However, StoreMemory is not written that way. It will IncRef the DataBuffer instance while it operates on it and then DecRef it when done. But if the semantic function does not DecRef its own reference, the data buffer will never be reclaimed. Thus, the last line has to be:
db->DecRef();
There are three store functions, and the only thing that differs among them is the size of the memory access. This sounds like a great opportunity for another local templated helper function. The only thing that varies across the store functions is the type of the store value, so the template has to have that as an argument. Other than that, only the Instruction instance has to be passed in:
template <typename ValueType>
inline void StoreValue(Instruction *instruction) {
auto base = generic::GetInstructionSource<uint32_t>(instruction, 0);
auto offset = generic::GetInstructionSource<uint32_t>(instruction, 1);
uint32_t address = base + offset;
auto value = generic::GetInstructionSource<ValueType>(instruction, 2);
auto *state = down_cast<RiscVState *>(instruction->state());
auto *db = state->db_factory()->Allocate(sizeof(ValueType));
db->Set<ValueType>(0, value);
state->StoreMemory(instruction, address, db);
db->DecRef();
}
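With this helper in place, each store semantic function reduces to a one-line wrapper; a sketch with illustrative function names (use the declarations from rv32i_instructions.h) might be:
void RV32ISb(Instruction *instruction) { StoreValue<uint8_t>(instruction); }
void RV32ISh(Instruction *instruction) { StoreValue<uint16_t>(instruction); }
void RV32ISw(Instruction *instruction) { StoreValue<uint32_t>(instruction); }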
Go ahead and finish the store semantic functions and build. You can check your work against rv32i_instructions.cc.
Add memory load instructions
The load instructions that need to be implemented are the following:
- lb - load byte, sign-extend into a word.
- lbu - load byte unsigned, zero-extend into a word.
- lh - load half-word, sign-extend into a word.
- lhu - load half-word unsigned, zero-extend into a word.
- lw - load word.
The load instructions are the most complex instructions we have to model in
this tutorial. They are similar to store instructions, in that they need to
access the RiscVState object, but add complexity in that each load instruction is divided into two separate semantic functions. The first is
similar to the store instruction, in that it computes the effective address
and initiates the memory access. The second is executed when the memory
access is complete, and writes the memory data to the register destination
operand.
Let's start by looking at the LoadMemory
method declaration in RiscVState:
void LoadMemory(const Instruction *inst, uint64_t address, DataBuffer *db,
Instruction *child_inst, ReferenceCount *context);
Compared to the StoreMemory
method, LoadMemory
takes two additional
parameters: a pointer to an Instruction
instance and a pointer to a
reference counted context
object. The former is the child instruction that
implements the register write-back (described in the ISA decoder tutorial). It
is accessed using the child()
method in the current Instruction
instance.
The latter is a pointer to an instance of a class that derives from
ReferenceCount
that in this case stores a DataBuffer
instance that will
contain the loaded data. The context object is available through the
context()
method in the Instruction
object (though for most instructions
this is set to nullptr).
The context object for RiscV memory loads is defined as the following struct:
// A simple load context class for convenience.
struct LoadContext : public generic::ReferenceCount {
explicit LoadContext(DataBuffer *vdb) : value_db(vdb) {}
~LoadContext() override {
if (value_db != nullptr) value_db->DecRef();
}
// Override the base class method so that the data buffer can be DecRef'ed
// when the context object is recycled.
void OnRefCountIsZero() override {
if (value_db != nullptr) value_db->DecRef();
value_db = nullptr;
// Call the base class method.
generic::ReferenceCount::OnRefCountIsZero();
}
// Data buffers for the value loaded from memory (byte, half, word, etc.).
DataBuffer *value_db = nullptr;
};
The load instructions are all the same except for the data size (byte, half-word, and word) and whether the loaded value is sign-extended or not. The latter only factors into the child instruction. Let's create a templated helper function for the main load instructions. It will be very similar to the store instruction, except it will not access a source operand to get a value, and it will create a context object.
template <typename ValueType>
inline void LoadValue(Instruction *instruction) {
auto base = generic::GetInstructionSource<uint32_t>(instruction, 0);
auto offset = generic::GetInstructionSource<uint32_t>(instruction, 1);
uint32_t address = base + offset;
auto *state = down_cast<RiscVState *>(instruction->state());
auto *db = state->db_factory()->Allocate(sizeof(ValueType));
db->set_latency(0);
auto *context = new riscv::LoadContext(db);
state->LoadMemory(instruction, address, db, instruction->child(), context);
context->DecRef();
}
As you can see, the main difference is that the allocated DataBuffer instance is both passed to the LoadMemory call as a parameter and stored in the LoadContext object.
The child instruction semantic functions are all very similar. First, the
LoadContext
is obtained by calling the Instruction
method context()
, and down_cast to LoadContext *. Second, the value (according to the data
type) is read from the load-data DataBuffer
instance. Third, a new
DataBuffer
instance is allocated from the destination operand. Finally, the
loaded value is written to the new DataBuffer
instance, and Submit'ed.
Again, a templated helper function is a good idea:
template <typename ValueType>
inline void LoadValueChild(Instruction *instruction) {
auto *context = down_cast<riscv::LoadContext *>(instruction->context());
uint32_t value = static_cast<uint32_t>(context->value_db->Get<ValueType>(0));
auto *db = instruction->Destination(0)->AllocateDataBuffer();
db->Set<uint32_t>(0, value);
db->Submit();
}
Go ahead and implement these last helper functions and semantic functions. Pay attention to the data type you use in the template for each helper function call, and that it corresponds to the size and signed/unsigned nature of the load instruction.
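As a hint, the byte loads might map to the helpers as sketched below (function names illustrative; use the declarations from rv32i_instructions.h). The signed/unsigned choice only matters in the child function, where the cast to uint32_t performs the sign- or zero-extension; the half-word and word loads follow the same pattern.
void RV32ILb(Instruction *instruction) { LoadValue<int8_t>(instruction); }
void RV32ILbChild(Instruction *instruction) { LoadValueChild<int8_t>(instruction); }
void RV32ILbu(Instruction *instruction) { LoadValue<uint8_t>(instruction); }
void RV32ILbuChild(Instruction *instruction) { LoadValueChild<uint8_t>(instruction); }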
You can check your work against rv32i_instructions.cc.
Build and run the final simulator
Now that we have done all the hard work we can build the final simulator. The top level C++ libraries that tie together all the work in these tutorials are located in other/. It is not necessary to look too hard at that code. We will visit that topic in a future advanced tutorial.
Change your working directory to other/
, and build. It should build without
errors.
$ cd ../other
$ bazel build :rv32i_sim
In that directory there is a simple "hello world" program in the file
hello_rv32i.elf. To run the simulator on this file and see the results:
$ bazel run :rv32i_sim -- other/hello_rv32i.elf
You should see something along the lines of:
INFO: Analyzed target //other:rv32i_sim (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //other:rv32i_sim up-to-date:
bazel-bin/other/rv32i_sim
INFO: Elapsed time: 0.203s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/other/rv32i_sim other/hello_rv32i.elf
Starting simulation
Hello World
Simulation done
$
The simulator can also be run in an interactive mode using the command bazel run :rv32i_sim -- -i other/hello_rv32i.elf. This brings up a simple command shell. Type help at the prompt to see the available commands.
$ bazel run :rv32i_sim -- -i other/hello_rv32i.elf
INFO: Analyzed target //other:rv32i_sim (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //other:rv32i_sim up-to-date:
bazel-bin/other/rv32i_sim
INFO: Elapsed time: 0.180s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/other/rv32i_sim -i other/hello_rv32i.elf
_start:
80000000 addi ra, 0, 0
[0] > help
quit - exit command shell.
core [N] - direct subsequent commands to core N
(default: 0).
run - run program from current pc until a
breakpoint or exit. Wait until halted.
run free - run program in background from current pc
until breakpoint or exit.
wait - wait for any free run to complete.
step [N] - step [N] instructions (default: 1).
halt - halt a running program.
reg get NAME [FORMAT] - get the value or register NAME.
reg NAME [FORMAT] - get the value of register NAME.
reg set NAME VALUE - set register NAME to VALUE.
reg set NAME SYMBOL - set register NAME to value of SYMBOL.
mem get VALUE [FORMAT] - get memory from location VALUE according to
format. The format is a letter (o, d, u, x,
or X) followed by width (8, 16, 32, 64).
The default format is x32.
mem get SYMBOL [FORMAT] - get memory from location SYMBOL and format
according to FORMAT (see above).
mem SYMBOL [FORMAT] - get memory from location SYMBOL and format
according to FORMAT (see above).
mem set VALUE [FORMAT] VALUE - set memory at location VALUE(1) to VALUE(2)
according to FORMAT. Default format is x32.
mem set SYMBOL [FORMAT] VALUE - set memory at location SYMBOL to VALUE
according to FORMAT. Default format is x32.
break set VALUE - set breakpoint at address VALUE.
break set SYMBOL - set breakpoint at value of SYMBOL.
break VALUE - set breakpoint at address VALUE.
break SYMBOL - set breakpoint at value of SYMBOL.
break clear VALUE - clear breakpoint at address VALUE.
break clear SYMBOL - clear breakpoint at value of SYMBOL.
break clear all - remove all breakpoints.
help - display this message.
_start:
80000000 addi ra, 0, 0
[0] >
This concludes this tutorial. We hope it has been helpful.