Check Your Understanding: ML in Production

The pipeline testing guidelines can't be demonstrated in a Colab. Instead, the following exercises let you practice applying the guidelines. The next page describes resources for implementing them.

For each question below, choose an answer and then check it against the feedback that follows each option.

After launching your unicorn appearance predictor, you must keep it fresh by retraining on new data. Because you're gathering more new data than you can train on, you decide to limit the training data by sampling the new data over a window of time. You also need to account for daily and annual patterns in unicorn appearances, and the fastest you can launch new model versions is every three months. What window of time do you choose?
One day, because a larger window would result in lots of data and your model would take too long to train.
Incorrect. You can adjust the data sampling rate to limit the size of the dataset instead of shrinking the window. A single day of data won't capture annual patterns, and because you can only update your model every three months, the model will gradually become stale.
One week, so that your dataset is not too large but you can still smooth out patterns.
Incorrect. You can adjust the data sampling rate to limit the size of the dataset instead of shrinking the window. A week of data won't capture annual patterns, and because you can only update your model every three months, the model will gradually become stale.
One year, to ensure that your model is not biased by daily or yearly patterns.
Correct! A full year of data captures both daily and annual patterns, giving your model a representative dataset so it learns to predict across all scenarios. Adjust the sampling rate to keep the dataset a manageable size (see the sketch after this question).
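To see how a sampling rate keeps a one-year window manageable, here is a minimal sketch, assuming the appearance records live in a pandas DataFrame with a `timestamp` column; the column name, sampling fraction, and function name are illustrative, not part of the course.

```python
import pandas as pd

def sample_training_window(df: pd.DataFrame,
                           window_end: pd.Timestamp,
                           sample_fraction: float = 0.1,
                           seed: int = 42) -> pd.DataFrame:
    """Keeps a full year of data but downsamples it to limit dataset size."""
    window_start = window_end - pd.DateOffset(years=1)
    in_window = df[(df["timestamp"] >= window_start) &
                   (df["timestamp"] < window_end)]
    # Sampling uniformly across the whole year preserves daily and annual
    # patterns while keeping training time manageable.
    return in_window.sample(frac=sample_fraction, random_state=seed)
```

Lowering `sample_fraction` shrinks the dataset further without shortening the window, so the model still sees every season.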
You launch your unicorn appearance predictor. It's working well! You go on vacation and return after three weeks to find that your model quality has dropped sharply. Assume that unicorn behavior is unlikely to change significantly in three weeks. What is the most likely explanation for the decrease in quality?
Training-serving skew.
Correct. While unicorn behavior probably didn't change, the underlying data reporting or data formatting may have changed in the serving data after the training data was collected. Detect potential training-serving skew by checking the serving data against the data schema of the training data (see the sketch after this question).
You forgot to test model quality against a fixed threshold.
Incorrect. Testing model quality would help catch a decrease in quality, but would not explain why that decrease occurred.
Your model is stale.
Incorrect. Assuming your training data covers a full cycle of unicorn behavior, as discussed in the previous question, your model shouldn't become noticeably stale in only three weeks.
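One lightweight way to check serving data against the training data's schema is sketched below, using plain pandas; the feature names, expected ranges, and helper name are assumptions for illustration (a library such as TensorFlow Data Validation can infer and check schemas more thoroughly).

```python
import pandas as pd

# Hypothetical schema captured from the training data: expected dtype and
# value range for each feature.
TRAINING_SCHEMA = {
    "latitude": {"dtype": "float64", "min": -90.0, "max": 90.0},
    "longitude": {"dtype": "float64", "min": -180.0, "max": 180.0},
    "hour_of_day": {"dtype": "int64", "min": 0, "max": 23},
}

def check_serving_data(serving_df: pd.DataFrame) -> list[str]:
    """Returns a list of schema violations found in the serving data."""
    problems = []
    for column, spec in TRAINING_SCHEMA.items():
        if column not in serving_df.columns:
            problems.append(f"missing column: {column}")
            continue
        if str(serving_df[column].dtype) != spec["dtype"]:
            problems.append(f"{column}: dtype is {serving_df[column].dtype}, "
                            f"expected {spec['dtype']}")
            continue  # Skip range checks when the type itself changed.
        out_of_range = serving_df[(serving_df[column] < spec["min"]) |
                                  (serving_df[column] > spec["max"])]
        if not out_of_range.empty:
            problems.append(f"{column}: {len(out_of_range)} values out of range")
    return problems
```

Any non-empty result is a signal to investigate the serving pipeline before the skewed data degrades your predictions further.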
You wisely decide to monitor predictions for Antarctica because you lack sufficient training data there. Your prediction quality mysteriously drops for a few days at a time, especially in winter. What could be the cause?
An environmental factor.
Correct. You discover that storms in Antarctica correlate with decreases in your prediction quality. During these storms, unicorn behavior changes. Furthermore, collecting data during storms in Antarctica is impossible, meaning your model cannot train for such conditions. (Per-region monitoring, as sketched after this question, is what surfaces drops like this.)
Your model becomes stale.
Incorrect. If this cause were correct, then quality would drop continuously as your model became stale, instead of dropping for only a few days.
No cause necessary. ML models have inherent randomness.
Incorrect. If your model quality fluctuates, you should investigate the cause. Try to eliminate randomness in your model training to increase reproducibility.
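The drop is noticed at all only because quality is monitored per region. A minimal sketch of that kind of slice monitoring follows, assuming you log per-example predictions, labels, and a region tag; the threshold value and names are illustrative.

```python
import logging

QUALITY_THRESHOLD = 0.8  # Illustrative; derive it from your validation baseline.

def monitor_slice_quality(predictions, labels, regions, slice_name="antarctica"):
    """Computes accuracy on one slice and warns when it drops below a threshold."""
    slice_pairs = [(p, y) for p, y, r in zip(predictions, labels, regions)
                   if r == slice_name]
    if not slice_pairs:
        logging.warning("No examples for slice %s", slice_name)
        return None
    accuracy = sum(p == y for p, y in slice_pairs) / len(slice_pairs)
    if accuracy < QUALITY_THRESHOLD:
        # A drop lasting only a few days points at an external factor such as
        # a storm; a steady decline would instead suggest a stale model.
        logging.warning("Quality on slice %s dropped to %.2f", slice_name, accuracy)
    return accuracy
```

Running this check on each day's logged predictions makes the difference between a temporary, environmental drop and a steady decline easy to see.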
Your unicorn appearance predictor has operated for a year. You've fixed many problems, and quality is now high. However, you notice a small but persistent issue: your model quality has drifted slightly lower in urban areas. What might be the cause?
The high quality of your predictions leads users to find unicorns easily, affecting unicorn appearance behavior itself.
Correct. Unicorns responded to the increased attention by changing their behavior in urban areas. As your model's predictions adapt to the changing behavior, unicorns change their behavior again in response. A situation like this, where your model's predictions influence its own future training data, is called a feedback loop. Try modifying your training-serving skew detection to catch changes in serving data that correspond to changes in unicorn behavior (see the sketch after this question).
Unicorn appearances are reported multiple times in heavily populated areas, skewing your training data.
Incorrect. This is probably not the cause, because such skew would have lowered your quality from launch rather than making it drift lower over time.
Urban areas are difficult to model.
Incorrect. If your model was having trouble predicting in urban areas, the quality would be low from the start, instead of drifting lower after launch.
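To catch a feedback loop like this, you can extend skew detection to watch for gradual drift in the serving data for a slice such as urban areas. Below is a minimal sketch using a simple mean-shift test; the z-score threshold and function name are assumptions, and a real pipeline might use a statistical distance measure instead.

```python
import numpy as np

def serving_feature_drifted(train_values: np.ndarray,
                            serving_values: np.ndarray,
                            z_threshold: float = 3.0) -> bool:
    """Flags drift when the serving mean moves far from the training mean."""
    train_mean = train_values.mean()
    train_std = train_values.std()
    if train_std == 0:
        return serving_values.mean() != train_mean
    z_score = abs(serving_values.mean() - train_mean) / train_std
    # A slow, persistent shift in features for urban areas (for example,
    # appearances per neighborhood) is one signature of a feedback loop.
    return z_score > z_threshold
```

Running a check like this on recent serving windows, separately per slice, turns a slow drift into an explicit alert instead of a mystery you notice a year later.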