The scripted examples given here let you compile and run a TensorFlow Lite ML model on Coral NPU's scalar core and then on the vector engine. The model executed is MobileNet V1, a convolutional neural network (CNN) architecture designed for image classification, object detection, and other computer vision tasks.
The process uses reference kernels from TensorFlow Lite Micro and executes the intermediate layers of MobileNet.
The model is run in the CoreMini AXI high-memory simulator, with real inputs and real outputs. The simulator output reports simulation metrics, notably the execution cycle count for performance assessment.
Prerequisites
This example assumes you are working in the Google Coral NPU repository on GitHub.
- Be sure to complete the preliminary steps described in Software prerequisites and system setup.
- Complete the Simple programming tutorial.
- Follow the edge AI tutorial here to learn about running an inference on microcontrollers.
Run MobileNet on the scalar core
To run the script, enter the following command:
bazel run -c opt tests/cocotb/tutorial/tfmicro:cocotb_run_mobilenet_v1
The script performs the following sub-tasks, as illustrated in the figure below:
- Compiles run_mobilenet.cc using coralnpu_v2_binary:
  - coralnpu_v2_binary is a bazel rule that composes the flags and settings needed to compile for the coralnpu_v2 platform.
  - run_mobilenet.cc is a TFMicro inference that uses reference kernels; a minimal sketch of such an inference appears after this list.
- Finds the example rule run_mobilenet_v1_025_partial_binary.
- Uses the high-memory TCM (tightly-coupled memory) of Coral NPU to:
  - Run high-memory programs such as ML model inferencing. See this page for information about high-memory TCM.
  - Add a new data section, .extdata. Memory buffers such as the tensor arena, inputs, and outputs can be placed in this section:
    uint8_t tensor_arena[kTensorArenaSize] __attribute__((section(".extdata"), aligned(16), used, retain));
- Runs the cocotb test suite:
  - When the TFLite inference program is ready, the cocotb_test_suite bazel rule is used to simulate the program with rvv_core_mini_highmem_axi_model.
  - Runs the script cocotb_run_mobilenet_v1.py, which:
    - Loads run_mobilenet_v1_025_partial_binary.elf.
    - Invokes the program and executes it until it halts.
    - Reads the memory buffer inference_status in program memory.
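For reference, the sketch below shows the general shape of a TFMicro inference with reference kernels and a tensor arena placed in the .extdata section. It is a minimal illustration, not the actual contents of run_mobilenet.cc: the model symbol g_mobilenet_model_data, the arena size, and the operator list are assumptions.

#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Hypothetical flatbuffer model data compiled into the binary.
extern const unsigned char g_mobilenet_model_data[];

// Tensor arena placed in the high-memory .extdata section (size is an assumption).
constexpr int kTensorArenaSize = 512 * 1024;
uint8_t tensor_arena[kTensorArenaSize]
    __attribute__((section(".extdata"), aligned(16), used, retain));

int RunInference() {
  const tflite::Model* model = tflite::GetModel(g_mobilenet_model_data);

  // Register only the operators the model needs; the reference kernels are
  // used unless an optimized implementation is linked in.
  tflite::MicroMutableOpResolver<5> resolver;
  resolver.AddConv2D();
  resolver.AddDepthwiseConv2D();
  resolver.AddAveragePool2D();
  resolver.AddReshape();
  resolver.AddSoftmax();

  tflite::MicroInterpreter interpreter(model, resolver, tensor_arena, kTensorArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

  // Fill the quantized input tensor, run the graph, then read the output.
  TfLiteTensor* input = interpreter.input(0);
  // ... copy image data into input->data.int8 ...
  if (interpreter.Invoke() != kTfLiteOk) return -1;
  TfLiteTensor* output = interpreter.output(0);
  (void)output;
  return 0;
}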
cocotb is a coroutine-based, co-simulation test bench environment for verifying VHDL and SystemVerilog RTL using Python. cocotb is free, open source, and hosted on GitHub.
Run MobileNet on the vector execution engine
This example demonstrates how to use Coral NPU’s vector execution engine to optimize a model with parallelization.
Run the example script:
bazel run tests/cocotb/tutorial/tfmicro:cocotb_run_rvv_mobilenet_v1
The script performs the following sub-tasks:
- Parallelizes the first program with a custom kernel:
  - Rewrites the custom vector kernel for Coral NPU's vector execution engine; a sketch of this kind of kernel appears after this list.
  - Compiles this kernel into the TensorFlow Lite RT runtime as a library.
- Runs the model on the Coral NPU simulator with the vector configuration:
  - Runs the actual model with real inputs and real outputs.
- Checks the simulation metrics, notably the number of execution cycles. Performance can be compared to the scalar core example to illustrate the speedup from vectorization.
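To give a flavor of what a custom vector kernel looks like, the sketch below strip-mines a simple int8 operation with RISC-V Vector (RVV) intrinsics. It is only an illustration and assumes the toolchain provides the standard riscv_vector.h intrinsics; the function name and the choice of a ReLU-style clamp are hypothetical and are not the tutorial's actual MobileNet kernel.

#include <riscv_vector.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical element-wise clamp (ReLU at the zero point) over an int8
// activation buffer, strip-mined with RVV intrinsics so each iteration
// processes as many elements as the vector unit allows.
void ClampInt8Rvv(const int8_t* in, int8_t* out, size_t n, int8_t zero_point) {
  size_t i = 0;
  while (i < n) {
    size_t vl = __riscv_vsetvl_e8m8(n - i);          // elements handled this pass
    vint8m8_t v = __riscv_vle8_v_i8m8(in + i, vl);   // vector load
    v = __riscv_vmax_vx_i8m8(v, zero_point, vl);     // clamp below the zero point
    __riscv_vse8_v_i8m8(out + i, v, vl);             // vector store
    i += vl;
  }
}

In the tutorial flow, a kernel along these lines replaces the corresponding reference kernel when the library is linked into the TensorFlow Lite RT runtime, which is where the reduction in execution cycles comes from.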