Static vs. Dynamic Training: Check Your Understanding

Dynamic (Online) Training


Which one of the following statements is true of dynamic (online) training?
The model stays up to date as new data arrives.
This is the primary benefit of online training: by allowing the model to train on new data as it comes in, we can avoid many staleness issues.
Very little monitoring of training jobs needs to be done.
Actually, you must continuously monitor training jobs to ensure that they are healthy and working as intended. You'll also need supporting infrastructure like the ability to roll a model back to a previous snapshot in case something goes wrong in training, such as a buggy job or corruption in input data.
Very little monitoring of input data needs to be done at inference time.
Just like a static, offline model, it is also important to monitor the inputs to the dynamically updated model. Because it trains continuously, it is less exposed to large seasonality effects, but sudden, large changes to inputs (such as an upstream data source going down) can still cause unreliable predictions.
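One way to catch that kind of sudden input failure is to compare recent serving values of a numeric feature against a training-time baseline and alert when they diverge sharply. The following is a minimal sketch, not part of the course; the function name, feature values, and z-score threshold are illustrative assumptions.

```python
import statistics

def check_input_drift(baseline_values, recent_values, z_threshold=4.0):
    """Flag a sudden, large shift in a numeric input feature by comparing
    the mean of recent serving values to a training-time baseline.
    (Illustrative helper; threshold of 4 standard deviations is arbitrary.)"""
    mean = statistics.mean(baseline_values)
    stdev = statistics.pstdev(baseline_values) or 1e-9  # avoid division by zero
    z = abs(statistics.mean(recent_values) - mean) / stdev
    return z > z_threshold

baseline = [10.2, 9.8, 10.1, 10.0, 9.9, 10.3]  # values seen during training
healthy = [10.0, 10.1, 9.9]                    # normal serving traffic
broken = [0.0, 0.0, 0.0]                       # e.g., an upstream source going down
print(check_input_drift(baseline, healthy))  # False
print(check_input_drift(baseline, broken))   # True
```

In production you would run a check like this per feature on a schedule, and wire a failure to the same alerting and rollback machinery used for training jobs.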

Static (Offline) Training


Which of the following statements are true about static (offline) training?
The model stays up to date as new data arrives.
Actually, if we train offline, then the model has no way to incorporate new data as it arrives. This can lead to model staleness if the distribution we are trying to learn from changes over time.
You can verify the model before applying it in production.
Yes, offline training gives ample opportunity to verify model performance before introducing the model in production.
Offline training requires less monitoring of training jobs than online training.
In general, monitoring requirements at training time are more modest for offline training, which insulates us from many production considerations. However, the more frequently you train your model, the higher the investment you'll need to make in monitoring. You'll also want to validate regularly to ensure that changes to your code (and its dependencies) don't adversely affect model quality.
Very little monitoring of input data needs to be done at inference time.
Counterintuitively, you do need to monitor input data at serving time. If the input distributions change, then our model's predictions may become unreliable. Imagine, for example, a model trained only on summertime clothing data suddenly being used to predict clothing buying behavior in wintertime.
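The summer/winter example above can be made concrete by measuring how far the serving distribution of a categorical feature has drifted from its training distribution. This is a minimal sketch, not from the course; the function, feature values, and use of total variation distance are illustrative assumptions.

```python
from collections import Counter

def distribution_shift(train_values, serving_values):
    """Total variation distance between the training and serving
    distributions of a categorical feature: 0.0 means identical
    distributions, 1.0 means completely disjoint."""
    train_n, serve_n = len(train_values), len(serving_values)
    train_freq = Counter(train_values)
    serve_freq = Counter(serving_values)
    categories = set(train_freq) | set(serve_freq)
    return 0.5 * sum(
        abs(train_freq[c] / train_n - serve_freq[c] / serve_n)
        for c in categories
    )

# A model trained only on summertime clothing data, now serving winter traffic:
train = ["swimsuit"] * 80 + ["sandals"] * 20
serving = ["parka"] * 70 + ["boots"] * 30
print(distribution_shift(train, serving))  # 1.0 -- completely disjoint
```

A shift score near 1.0 on a key feature is a strong signal that the static model's predictions should no longer be trusted and that retraining on fresher data is needed.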