Machine Learning | Google for Developers

Explore the options below.

Dynamic (online) inference means making predictions on demand. That is, in online inference, we put the trained model on a server and issue inference requests as needed. Which of the following are true of dynamic inference?

You can provide predictions for all possible items.

Yes, this is a strength of online inference. Any request that comes in will be given a score. Online inference handles long-tail distributions (those with many rare items), like the space of all possible sentences written in movie reviews.

You can do post-verification of predictions before they are used.

No. In general, it's not possible to post-verify every prediction before it gets used, because predictions are made on demand. You can, however, monitor aggregate prediction quality to provide some level of sanity checking, but such monitoring will sound the fire alarm only after the fire has already spread.
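As a rough illustration of aggregate monitoring, the sketch below tracks the rolling average of recent prediction scores and flags the window when that average leaves an expected range. The class name `AggregatePredictionMonitor`, the window size, and the bounds are all hypothetical choices made for this example, not part of any particular serving framework.

```python
from collections import deque


class AggregatePredictionMonitor:
    """Hypothetical rolling check on the average of recent prediction scores."""

    def __init__(self, window_size, low, high):
        # Only the most recent `window_size` scores are kept.
        self.scores = deque(maxlen=window_size)
        self.low = low
        self.high = high

    def record(self, score):
        """Call this after a prediction has already been served."""
        self.scores.append(score)

    def is_healthy(self):
        """Return False once a full window's average falls outside [low, high]."""
        if len(self.scores) < self.scores.maxlen:
            return True  # not enough data yet to judge
        average = sum(self.scores) / len(self.scores)
        return self.low <= average <= self.high


monitor = AggregatePredictionMonitor(window_size=3, low=0.2, high=0.6)
for score in (0.3, 0.4, 0.5):
    monitor.record(score)
healthy_before = monitor.is_healthy()

# A burst of unusually high scores has already been served by the time
# the rolling average crosses the upper bound.
for score in (0.9, 0.95, 0.99):
    monitor.record(score)
healthy_after = monitor.is_healthy()
```

Note that `record` runs after each prediction has been returned to the caller, so the check can only react to a problem, not prevent it.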

You must carefully monitor input signals.

Yes. Signals could change suddenly due to upstream issues, harming our predictions.
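One simple way to watch input signals is to compare the recent mean of each feature against statistics gathered at training time. The following is a minimal sketch under that assumption; the function name `input_drift_alerts`, the feature names, and the z-score threshold of 3.0 are illustrative choices, and a production system would likely use richer checks.

```python
from statistics import mean, stdev


def input_drift_alerts(baseline, recent, z_threshold=3.0):
    """Return names of features whose recent mean drifts from the baseline.

    baseline: dict mapping feature name -> list of training-time values
              (at least two values per feature)
    recent:   dict mapping feature name -> list of values from recent requests
    """
    alerts = []
    for name, base_values in baseline.items():
        recent_values = recent.get(name, [])
        if not recent_values:
            alerts.append(name)  # the feature stopped arriving entirely
            continue
        mu, sigma = mean(base_values), stdev(base_values)
        # Standard error of the recent mean if inputs still follow the baseline.
        stderr = sigma / len(recent_values) ** 0.5
        if stderr == 0:
            # Constant baseline: any change in the mean counts as drift.
            if mean(recent_values) != mu:
                alerts.append(name)
            continue
        z_score = abs(mean(recent_values) - mu) / stderr
        if z_score > z_threshold:
            alerts.append(name)
    return alerts


baseline = {"age": [30, 32, 28, 31, 29], "clicks": [1, 2, 3, 2, 1]}
recent = {"age": [30, 31, 29, 30], "clicks": [50, 60, 55, 58]}
alerts = input_drift_alerts(baseline, recent)  # only "clicks" has shifted
```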

When performing online inference, you do not need to worry about prediction latency (the lag time for returning predictions) as much as when performing offline inference.

No. Prediction latency is often a real concern in online inference, because each prediction is computed while the requester waits. Unfortunately, you can't necessarily fix prediction latency issues by adding more inference servers.
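To make latency concrete, the sketch below times individual prediction calls and summarizes the measurements with nearest-rank percentiles. The helpers `timed_predict` and `percentile` and the sample latency values are hypothetical; `predict_fn` stands in for whatever model call a server would make. Extra servers mainly raise the number of requests handled in parallel, so the time a single call spends inside the model can remain unchanged.

```python
import math
import time


def timed_predict(predict_fn, features):
    """Run one prediction and return (prediction, latency in milliseconds)."""
    start = time.perf_counter()
    prediction = predict_fn(features)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return prediction, latency_ms


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, for q in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(q / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Stand-in model: doubles its input. A real model call would go here.
prediction, latency_ms = timed_predict(lambda x: x * 2, 3)

# Illustrative latencies (ms): the median looks fine, the tail does not.
sample_latencies = [10, 20, 30, 40, 100]
p50 = percentile(sample_latencies, 50)
p99 = percentile(sample_latencies, 99)
```

Tracking a tail percentile alongside the median is a common design choice, since a small share of slow requests can dominate the experience of users who hit them.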