If you are an SoC hardware designer, follow the guidance here to generate the SystemVerilog for Coral NPU and to integrate it into your system design. The Coral NPU core will be an AXI4/TileLink peripheral in the system.
Generate SystemVerilog
A scalar-only Coral NPU configuration is provided that can integrate with an AXI-based system. The SystemVerilog can be generated with this build command:
bazel build //hdl/chisel/src/coralnpu:core_mini_axi_cc_library_emit_verilog
You can build the RISC-V vector (RV32IMF_Zve32x) version of Coral NPU with this command:
bazel build //hdl/chisel/src/coralnpu:rvv_core_mini_axi_cc_library_emit_verilog
Module interfaces

Notes:
- When stall is asserted, the core logic is gated off (i.e. core_clk = 0). The S_AXI interface is still available to access ITCM and DTCM (tightly-coupled memories) and CSRs (registers).
- Before releasing reset, stall should be asserted to prevent Coral NPU from executing until the TCMs are initialized. Then reset can be released and the TCMs initialized. Then remove stall to begin Coral NPU execution.
- halted and fault can be used to set system CSRs and generate a system interrupt.
- debug and slog are not required in silicon; they are typically used for simulation purposes.
AXI bus
The interfaces to Coral NPU are as follows:
| Signal Bundle | Description |
|---|---|
| clk | The clock of the AXI bus / Coral NPU core. |
| reset | The active-low reset signal for the AXI bus / Coral NPU core. |
| s_axi | An AXI4 slave interface that can be used to write TCMs or touch Coral NPU CSRs. |
| m_axi | An AXI4 master interface used by Coral NPU to read/write to memories/CSRs. |
| irqn | Active-low interrupt to the Coral NPU core. Can be triggered by peripherals or another host processor. |
| wfi | Active-high signal from the Coral NPU core, indicating that the core is waiting for an interrupt. While this is active, Coral NPU is clock-gated. |
| debug | Debug interface to monitor Coral NPU instruction execution. This interface is typically only used for simulation. |
| s_log | Debug interface to handle SLOG instruction. This interface is typically only used for simulation. |
| halted | Output interface indicating whether the Coral NPU core is running. Can be ignored. |
| fault | Output interface to determine if the core hit a fault. These signals should be connected to a system control CPU interrupt-line or status register for notification when Coral NPU faults or is halted. |
AXI master signals
AR / AW channel
| Signal | Behavior |
|---|---|
| addr | Address Coral NPU wishes to read/write |
| prot | Always 2 (unprivileged, non-secure, data) |
| id | Always 0 |
| len | (Count of beats in the burst) – 1 |
| size | Bytes-per-beat (1, 2, or 4) |
| burst | Always 1 (INCR) |
| lock | Always 0 (normal access) |
| cache | Always 0 (Device non-bufferable) |
| qos | Always 0 |
| region | Always 0 |
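As a rough illustration of how the len and size fields combine, the sketch below computes the beat count and total byte count of a burst. It assumes the standard AXI4 wire encoding, in which AxSIZE carries log2 of the bytes per beat (so 0, 1, and 2 correspond to the 1, 2, and 4 bytes listed above). The helper names are hypothetical and not part of the Coral NPU sources.

```c
#include <stdint.h>

/* Number of beats in a burst: AxLEN holds (beats - 1). */
static inline uint32_t axi_burst_beats(uint8_t axlen) {
    return (uint32_t)axlen + 1u;
}

/* Bytes per beat, assuming the AXI4 encoding AxSIZE = log2(bytes). */
static inline uint32_t axi_bytes_per_beat(uint8_t axsize) {
    return 1u << axsize;
}

/* Total bytes moved by one burst. */
static inline uint32_t axi_burst_bytes(uint8_t axlen, uint8_t axsize) {
    return axi_burst_beats(axlen) * axi_bytes_per_beat(axsize);
}
```

For example, a burst with len = 3 and 4-byte beats moves 16 bytes.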
R channel
| Signal | Behavior |
|---|---|
| data | Response data from the slave |
| id | Ignored, but should be 0 as Coral NPU only emits transactions with an id of 0 |
| resp | Response code |
| last | Whether the beat is the last in the burst |
W channel
| Signal | Behavior |
|---|---|
| data | Data Coral NPU wishes to write |
| last | Whether the beat is the last in the burst |
| strb | Which bytes in the data are valid |
B channel
| Signal | Behavior |
|---|---|
| id | Ignored, but should be 0 as Coral NPU only emits transactions with an id of 0 (an RTL assertion exists for this) |
| resp | Response code |
AXI slave signals
AR / AW channel
| Signal | Behavior |
|---|---|
| addr | Address the master wishes to read / write to |
| prot | Ignored |
| id | Transaction ID, should be reflected in the response beats |
| len | (Count of beats in the burst) - 1 |
| size | Bytes-per-beat (1, 2, 4, 8, or 16) |
| burst | 0, 1, or 2 (FIXED, INCR, WRAP) |
| lock | Ignored |
| cache | Ignored |
| qos | Ignored |
| region | Ignored |
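Because the slave port accepts FIXED, INCR, and WRAP bursts, a host-side model or testbench helper may need to know which address each beat targets. The sketch below follows the general AXI4 address rules rather than anything specific to the Coral NPU RTL. The function name is hypothetical, and axsize is assumed to be the encoded value (log2 of bytes per beat).

```c
#include <stdint.h>

enum { AXI_BURST_FIXED = 0, AXI_BURST_INCR = 1, AXI_BURST_WRAP = 2 };

/* Address of beat `beat` (0-based) within a burst.
 * axsize is log2(bytes per beat); axlen is (beats - 1).
 * For WRAP, this assumes `start` is aligned to the beat size and the
 * burst has 2, 4, 8, or 16 beats, as AXI4 requires.
 * Unknown burst codes are treated like INCR in this sketch. */
static uint32_t axi_beat_addr(uint32_t start, uint8_t axsize, uint8_t axlen,
                              unsigned burst, uint32_t beat) {
    uint32_t nbytes = 1u << axsize;
    uint32_t total = nbytes * ((uint32_t)axlen + 1u);

    switch (burst) {
    case AXI_BURST_FIXED:
        return start;                         /* same address every beat */
    case AXI_BURST_WRAP: {
        uint32_t lower = start & ~(total - 1u);   /* wrap boundary */
        return lower + ((start - lower + beat * nbytes) % total);
    }
    case AXI_BURST_INCR:
    default:
        if (beat == 0) return start;
        /* Later beats step from the size-aligned start address. */
        return (start & ~(nbytes - 1u)) + beat * nbytes;
    }
}
```

For a 4-beat WRAP burst of 4-byte beats starting at 0x10008, the beats land at 0x10008, 0x1000C, 0x10000, and 0x10004.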
R channel
| Signal | Behavior |
|---|---|
| data | Response data from Coral NPU |
| id | Transaction ID, should match with the id field from AR |
| resp | Response code (0/OKAY or 2/SLVERR) |
| last | Whether the beat is the last in the burst |
W channel
| Signal | Behavior |
|---|---|
| data | Data the master wishes to write to Coral NPU |
| last | Whether the beat is the last in the burst |
| strb | Which bytes in the data are valid |
B channel
| Signal | Behavior |
|---|---|
| id | Transaction ID, should match with the id field from AW |
| resp | Response code (0/OKAY or 2/SLVERR) |
Debug signals
| Signal | Behavior |
|---|---|
| en | 4-bit value, indicating which fetch lanes are active |
| addr | 32-bit values, containing the PC for each fetch lane |
| inst | 32-bit values, containing the instruction for each fetch lane |
| cycles | cycle counter |
| dbus | Information about internal LSU transactions |
| -> valid | Whether the transaction is valid |
| -> bits | addr: The 32-bit address for the transaction; write: If the transaction is a write; wdata: 128-bit write data for the transaction |
| dispatch | Information about instructions which are dispatched for execution |
| -> fire | If an instruction was dispatched in the slot, this cycle |
| -> addr | The 32-bit address of the instruction |
| -> inst | The 32-bit value of the instruction |
| regfile | Information about writes to the integer register file |
| -> writeAddr | Register addresses to which a future write is expected |
| ->-> valid | If an instruction was dispatched in this lane, which will write the regfile |
| ->-> bits | The 5-bit register address to which the write is expected |
| -> writeData | For each port in the register file, information about writes |
| ->-> valid | If a write occurred on this port, this cycle |
| ->-> bits_addr | The 5-bit register address to which the write occurred |
| ->-> bits_data | The 32-bit value which was written to the register |
| float | Information about writes to the floating point register file |
| -> writeAddr | Register addresses to which a future write is expected |
| ->-> valid | If an instruction was dispatched to floating point on this cycle |
| ->-> bits | The address of the register to which a write is expected |
| -> writeData | For each port in the register file, information about writes |
| ->-> valid | If a write occurred on this port, this cycle |
| ->-> bits_addr | The 5-bit register address to which the write occurred |
| ->-> bits_data | The 32-bit value which was written to the register |
Coral NPU memory map
Memory accesses to the Coral NPU are defined as follows:
| Region | Range | Size | Alignment | Description |
|---|---|---|---|---|
| ITCM | 0x0000 – 0x1FFF | 8 kB | 4 bytes | ITCM storage for code executed by Coral NPU. |
| DTCM | 0x10000 – 0x17FFF | 32 kB | 1 byte | DTCM storage for data used by Coral NPU. |
| CSR | 0x30000 – TBD | TBD | 4 bytes | CSR interface used to query/control Coral NPU. |
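The table above can be captured as constants in host software. The following is a minimal sketch, with hypothetical macro and function names, that classifies an offset (relative to the Coral NPU base) into one of the three regions. Because the upper bound of the CSR region is still TBD, the sketch only checks its lower bound.

```c
#include <stdint.h>

/* Offsets relative to the Coral NPU base address, taken from the table. */
#define CORALNPU_ITCM_BASE 0x00000u
#define CORALNPU_ITCM_SIZE 0x02000u  /* 8 kB */
#define CORALNPU_DTCM_BASE 0x10000u
#define CORALNPU_DTCM_SIZE 0x08000u  /* 32 kB */
#define CORALNPU_CSR_BASE  0x30000u  /* upper bound is TBD in the table */

typedef enum {
    CORALNPU_REGION_NONE,
    CORALNPU_REGION_ITCM,
    CORALNPU_REGION_DTCM,
    CORALNPU_REGION_CSR
} coralnpu_region_t;

/* Returns the region containing `offset`, or NONE for unmapped gaps. */
static coralnpu_region_t coralnpu_region_of(uint32_t offset) {
    if (offset < CORALNPU_ITCM_BASE + CORALNPU_ITCM_SIZE)
        return CORALNPU_REGION_ITCM;
    if (offset >= CORALNPU_DTCM_BASE &&
        offset < CORALNPU_DTCM_BASE + CORALNPU_DTCM_SIZE)
        return CORALNPU_REGION_DTCM;
    if (offset >= CORALNPU_CSR_BASE)  /* lower-bound check only */
        return CORALNPU_REGION_CSR;
    return CORALNPU_REGION_NONE;
}
```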
Reset considerations
Coral NPU uses a synchronous reset strategy, so reset only takes effect on a clock edge. To ensure proper reset behavior, make sure the clock runs for at least one cycle with reset active before the clock is gated, whether by the internal clock gate (via CSR) or by external gating.
Booting Coral NPU
Note that in these examples, Coral NPU is located in the overall system memory map at address 0x0000 0000.
A timing diagram for this boot sequence is shown at the end of steps 1) – 5).
See Coral NPU custom CSRs for reference specifications for the register fields.
1) The instruction memory of Coral NPU must be initialized:
Sample code
// Byte-wise copy of the program image into ITCM (at system address 0x0 here).
volatile uint8_t* coralnpu_itcm = (volatile uint8_t*)0x00000000L;
for (int i = 0; i < coralnpu_binary_len; ++i) {
  coralnpu_itcm[i] = coralnpu_binary[i];
}
or
Sample code
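#include <string.h>
volatile uint8_t* coralnpu_itcm = (volatile uint8_t*)0x00000000L;
// memcpy takes a non-volatile void*, so the qualifier is cast away explicitly.
memcpy((void*)coralnpu_itcm, coralnpu_binary, coralnpu_binary_len);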
If something like a DMA engine is present in your system, that is probably a
better option for initializing the ITCM.
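The memory map lists 4-byte alignment for ITCM, so some interconnects may be happier with word-sized stores than with the byte-wise copies above. The sketch below is one possible variant under that assumption: it packs the image into little-endian 32-bit words and zero-pads the final partial word. The function name and the size macro are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CORALNPU_ITCM_BYTES 0x2000u  /* 8 kB, from the memory map */

/* Copies `len` bytes of a little-endian image into `itcm` using only
 * 32-bit stores; a trailing partial word is zero-padded.
 * Returns the number of words written, or -1 if the image does not fit. */
static int coralnpu_load_itcm_words(volatile uint32_t *itcm,
                                    const uint8_t *image, size_t len) {
    if (len > CORALNPU_ITCM_BYTES) return -1;

    size_t words = (len + 3u) / 4u;
    for (size_t w = 0; w < words; ++w) {
        uint32_t value = 0;
        for (size_t b = 0; b < 4u; ++b) {
            size_t i = w * 4u + b;
            if (i < len) value |= (uint32_t)image[i] << (8u * b);
        }
        itcm[w] = value;  /* one aligned 32-bit store per word */
    }
    return (int)words;
}
```

On the target this might be called as `coralnpu_load_itcm_words((volatile uint32_t*)0x00000000L, coralnpu_binary, coralnpu_binary_len)`, reusing the names from the samples above.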
2) Program the start PC value. If your program is linked such that the starting address is 0, you may skip this.
Sample code
volatile uint32_t* coralnpu_pc_csr = (uint32_t*)0x00030004L;
*coralnpu_pc_csr = start_addr;
3) Release clock gate
Sample code
volatile uint32_t* coralnpu_reset_csr = (uint32_t*)0x00030000L;
*coralnpu_reset_csr = 1;
After this, be sure to wait a cycle to allow Coral NPU's reset to occur. If you want to configure something like an interrupt that is connected to Coral NPU's fault or halted outputs, this is a good time.
4) Release reset
Sample code
volatile uint32_t* coralnpu_reset_csr = (uint32_t*)0x00030000L;
*coralnpu_reset_csr = 0;
At this point, Coral NPU will begin executing at the PC value programmed in step 2.
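Steps 2 through 4 can be collected into one helper. The following is a sketch only: it uses the CSR offsets from the samples above (0x0 for the reset/clock-gate CSR, 0x4 for the start PC), while the function name and the `wait_cycle` callback are hypothetical stand-ins for whatever delay mechanism your system provides.

```c
#include <stdint.h>

/* Word indices relative to the CSR base (0x30000 in the samples). */
enum {
    CORALNPU_CSR_RESET = 0,  /* byte offset 0x0 */
    CORALNPU_CSR_PC    = 1   /* byte offset 0x4 */
};

/* Hypothetical hook that waits at least one Coral NPU clock cycle. */
typedef void (*coralnpu_delay_fn)(void);

static void coralnpu_start(volatile uint32_t *csr, uint32_t start_addr,
                           coralnpu_delay_fn wait_cycle) {
    csr[CORALNPU_CSR_PC] = start_addr;  /* step 2: program the start PC */
    csr[CORALNPU_CSR_RESET] = 1;        /* step 3: release the clock gate */
    if (wait_cycle) wait_cycle();       /* allow the reset to occur */
    csr[CORALNPU_CSR_RESET] = 0;        /* step 4: release reset */
}
```

Taking the CSR base as a pointer parameter keeps the helper independent of where Coral NPU sits in the system memory map, and lets it be exercised against a plain array in simulation.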
5) Monitor for io_halted. The status of Coral NPU's execution can be checked by reading the status CSR:
Sample code
volatile uint32_t* coralnpu_status_csr = (uint32_t*)0x00030008L;
uint32_t status = *coralnpu_status_csr;
bool halted = status & 1;
bool fault = status & 2;
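If no interrupt is wired to the halted or fault outputs, the host can poll the status CSR instead. The sketch below uses the bit positions from the sample above (bit 0 for halted, bit 1 for fault) and gives up after a bounded number of reads. The names, the poll limit, and the choice to report a fault ahead of a halt are assumptions for illustration.

```c
#include <stdint.h>

#define CORALNPU_STATUS_HALTED 0x1u
#define CORALNPU_STATUS_FAULT  0x2u

typedef enum {
    CORALNPU_RUNNING,
    CORALNPU_HALTED,
    CORALNPU_FAULTED
} coralnpu_state_t;

/* Decodes one status CSR value; a fault is reported ahead of a halt. */
static coralnpu_state_t coralnpu_decode_status(uint32_t status) {
    if (status & CORALNPU_STATUS_FAULT) return CORALNPU_FAULTED;
    if (status & CORALNPU_STATUS_HALTED) return CORALNPU_HALTED;
    return CORALNPU_RUNNING;
}

/* Polls the status CSR up to `max_polls` times.
 * Returns CORALNPU_RUNNING if the core neither halted nor faulted. */
static coralnpu_state_t coralnpu_wait(volatile const uint32_t *status_csr,
                                      uint32_t max_polls) {
    for (uint32_t i = 0; i < max_polls; ++i) {
        coralnpu_state_t s = coralnpu_decode_status(*status_csr);
        if (s != CORALNPU_RUNNING) return s;
    }
    return CORALNPU_RUNNING;
}
```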
