Explore the options below.
We looked at a process that uses a training set and a test set to drive iterations of model development. On each iteration, we'd train on the training data and evaluate on the test data, then use the test results to guide choices of, and changes to, model hyperparameters such as learning rate and features. Is there anything wrong with this approach? (Pick only one answer.)
Totally fine, we're training on training data and evaluating on separate, held-out test data.
Actually, there's a subtle issue here. Think about what might happen if we did many, many iterations of this form.
Doing many rounds of this procedure might cause us to implicitly fit to the peculiarities of our specific test set.
Yes indeed! The more often we evaluate on a given test set, the greater the risk of implicitly overfitting to that one test set. We'll look at a better protocol next.
This is computationally inefficient. We should just pick a default set of hyperparameters and live with them to save resources.
Although these sorts of iterations are expensive, they are a critical part of model development. Hyperparameter settings can make an enormous difference in model quality, and we should always budget some amount of time and computational resources to ensure we're getting the best quality we can.
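To make the risk of implicitly fitting to one test set concrete, here is a small simulation sketch. It is an illustration under stated assumptions, not part of the original exercise: the labels are coin flips, every "hyperparameter setting" produces a model that only guesses at random (so its true accuracy is 50%), and the names `simulate`, `accuracy`, `num_iterations`, `test_size`, and `fresh_size` are hypothetical. Picking the setting with the best test score still yields a test accuracy that looks better than chance, while the same model scores close to 50% on fresh data.

```python
import random


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def simulate(num_iterations=200, test_size=100, fresh_size=2000, seed=0):
    """Select the best of many random-guess models using one fixed test set.

    Assumption for illustration: each iteration stands in for one
    hyperparameter setting, and every resulting model guesses at random.
    Returns (test accuracy of the selected model, its accuracy on fresh data).
    """
    rng = random.Random(seed)
    test_labels = [rng.randint(0, 1) for _ in range(test_size)]
    fresh_labels = [rng.randint(0, 1) for _ in range(fresh_size)]

    best_test_acc = -1.0
    best_model_seed = None
    for _ in range(num_iterations):
        # One "model" per iteration, identified by its own seed.
        model_seed = rng.randrange(2**32)
        model = random.Random(model_seed)
        test_preds = [model.randint(0, 1) for _ in range(test_size)]
        acc = accuracy(test_preds, test_labels)
        # Keep whichever setting scored best on the reused test set.
        if acc > best_test_acc:
            best_test_acc, best_model_seed = acc, model_seed

    # Rebuild the selected model and score it on data it was never selected on.
    model = random.Random(best_model_seed)
    _ = [model.randint(0, 1) for _ in range(test_size)]  # replay test predictions
    fresh_preds = [model.randint(0, 1) for _ in range(fresh_size)]
    return best_test_acc, accuracy(fresh_preds, fresh_labels)


if __name__ == "__main__":
    test_acc, fresh_acc = simulate()
    print(f"test accuracy of selected model: {test_acc:.2f}")
    print(f"accuracy on fresh data:          {fresh_acc:.2f}")
```

The gap between the two numbers comes entirely from selecting on the test set, since none of the models learned anything. Raising `num_iterations` in this sketch tends to widen the gap, which matches the intuition that more rounds of evaluation on one test set mean more risk.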