Relative priorities assigned by the GPU delegate to different client needs. Ordered priorities give finer control over the desired semantics: priority(n) is more important than priority(n+1), so whenever the inference engine needs to make a decision, it consults the priorities in order.
For example, GPU_PRIORITY_MAX_PRECISION at priority(1) would not allow the engine to decrease precision, but moving it to priority(2) or priority(3) would permit F16 calculation.
GPU_PRIORITY_AUTO can only be used when all higher priorities are fully specified.
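The ordered-priority semantics above can be sketched as follows. This is an illustrative model, not the Play Services API: the class, method name, and decision logic are assumptions introduced for the example; only the four priority names come from this page. It shows how an engine scanning priorities in order would treat GPU_PRIORITY_MAX_PRECISION at priority(1) differently from the same value at a lower slot.

```java
import java.util.List;

// Illustrative sketch (hypothetical helper, not the real delegate code):
// an engine walks the priorities in order and lets the first decisive
// one determine whether reduced (F16) precision is allowed.
public class PriorityDemo {
    enum Priority { AUTO, MAX_PRECISION, MIN_LATENCY, MIN_MEMORY_USAGE }

    // F16 is forbidden only if MAX_PRECISION outranks every
    // latency/memory goal in the ordered list.
    static boolean allowsF16(List<Priority> ordered) {
        for (Priority p : ordered) {
            if (p == Priority.MAX_PRECISION) {
                return false; // precision outranks everything below it
            }
            if (p == Priority.MIN_LATENCY || p == Priority.MIN_MEMORY_USAGE) {
                return true;  // a speed/memory goal won first
            }
            // AUTO: no preference at this rank, fall through to the next
        }
        return true; // AUTO everywhere: the engine is free to choose F16
    }

    public static void main(String[] args) {
        // MAX_PRECISION at priority(1): F16 not allowed.
        System.out.println(allowsF16(List.of(
                Priority.MAX_PRECISION, Priority.MIN_LATENCY, Priority.AUTO)));
        // MAX_PRECISION demoted to priority(2): F16 allowed.
        System.out.println(allowsF16(List.of(
                Priority.MIN_LATENCY, Priority.MAX_PRECISION, Priority.AUTO)));
    }
}
```

Running this prints `false` then `true`, matching the F16 example in the description: the same priority value produces a different decision depending on its rank.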
Enum Values
public static final GpuAccelerationConfig.GpuInferencePriority GPU_PRIORITY_AUTO
Auto GPU priority.
public static final GpuAccelerationConfig.GpuInferencePriority GPU_PRIORITY_MAX_PRECISION
Maximum precision GPU priority.
public static final GpuAccelerationConfig.GpuInferencePriority GPU_PRIORITY_MIN_LATENCY
Minimum latency GPU priority.
public static final GpuAccelerationConfig.GpuInferencePriority GPU_PRIORITY_MIN_MEMORY_USAGE
Minimum memory usage GPU priority.