GpuAccelerationConfig.GpuInferenceUsage

  • GpuAccelerationConfig.GpuInferenceUsage is an enum defining GPU inference preferences for initialization vs. inference time.

  • GPU_INFERENCE_PREFERENCE_FAST_SINGLE_ANSWER is used when the delegate will be used only once, prioritizing faster initialization.

  • GPU_INFERENCE_PREFERENCE_SUSTAINED_SPEED is used when the same delegate will be used repeatedly, prioritizing throughput.

public static final enum GpuAccelerationConfig.GpuInferenceUsage extends Enum<GpuAccelerationConfig.GpuInferenceUsage>

GPU inference preference for initialization time vs. inference time.

Enum Values

public static final GpuAccelerationConfig.GpuInferenceUsage GPU_INFERENCE_PREFERENCE_FAST_SINGLE_ANSWER

The delegate will be used only once, so bootstrap/initialization time should be taken into account.

public static final GpuAccelerationConfig.GpuInferenceUsage GPU_INFERENCE_PREFERENCE_SUSTAINED_SPEED

Prefer maximizing throughput. The same delegate will be used repeatedly on multiple inputs.
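
The choice between the two values can be sketched as a simple decision on how many inferences a delegate is expected to serve. The following is a minimal, illustrative sketch only: the enum below is a local stand-in that mirrors the two constant names documented above so the example compiles without the library, and InferenceUsageChooser and its expectedInferenceCount parameter are hypothetical names, not part of the API.

```java
// Local stand-in for GpuAccelerationConfig.GpuInferenceUsage so this sketch
// compiles on its own. Only the two constant names come from the reference.
enum GpuInferenceUsage {
    GPU_INFERENCE_PREFERENCE_FAST_SINGLE_ANSWER,
    GPU_INFERENCE_PREFERENCE_SUSTAINED_SPEED
}

// Hypothetical helper (an assumption for illustration, not a library class).
final class InferenceUsageChooser {
    private InferenceUsageChooser() {}

    /**
     * Picks a preference from the number of inferences the caller expects
     * to run on one delegate instance.
     */
    static GpuInferenceUsage choose(int expectedInferenceCount) {
        if (expectedInferenceCount < 1) {
            throw new IllegalArgumentException(
                    "expectedInferenceCount must be at least 1");
        }
        // A single run cannot amortize a long setup, so favor fast
        // initialization; repeated runs favor throughput instead.
        return expectedInferenceCount == 1
                ? GpuInferenceUsage.GPU_INFERENCE_PREFERENCE_FAST_SINGLE_ANSWER
                : GpuInferenceUsage.GPU_INFERENCE_PREFERENCE_SUSTAINED_SPEED;
    }
}
```

In real code the returned value would be the corresponding GpuAccelerationConfig.GpuInferenceUsage constant. The threshold of one inference follows directly from the descriptions above; an application with only a handful of runs may want to measure both settings before committing to one.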