Validation

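Partitioning a data set into a training set and a test set lets you judge whether a given model will generalize well to new data. However, two partitions may be insufficient when you do many rounds of hyperparameter tuning: every round that adjusts the model based on test-set results lets information about the test set leak into the model, so the test set gradually stops being a fair stand-in for new data.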

A Possible Workflow?

Figure 1. A workflow diagram with three stages: (1) train the model on the training set; (2) evaluate the model on the test set; (3) tweak the model according to the results on the test set. Iterate on stages 1, 2, and 3, ultimately picking the model that does best on the test set.

Partitioning Data Sets

A horizontal bar divided into three pieces: 70% is the training set, 15% is the validation set, and 15% is the test set.
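
As a rough illustration of the 70/15/15 partition described above, the following Python sketch shuffles a data set and slices it into three subsets. The function name `split_dataset`, its parameters, and the fixed seed are assumptions made for this example; they are not part of the course material.

```python
import random


def split_dataset(examples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle examples, then slice them into training, validation, and test lists.

    Hypothetical helper for illustration: whatever remains after the
    training and validation slices becomes the test set.
    """
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")

    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# 100 examples split 70/15/15.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling before slicing matters when the examples are stored in a meaningful order. The three subsets do not overlap, so no example used for training is also used for validation or testing.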

Better Workflow: Use a Validation Set

A workflow similar to Figure 1, except that it evaluates the model against the validation set instead of the test set. Once the results on the training set and the validation set more or less agree, confirm the model against the test set.
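
The validation-set workflow can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the synthetic data, the one-parameter ridge model, the helper names `fit_slope` and `mse`, and the candidate regularization values are all invented for illustration. The point is only the order of operations: tune against the validation set, then touch the test set once at the end.

```python
import random


def fit_slope(points, l2):
    """Closed-form 1-D ridge regression through the origin: w = sum(xy) / (sum(x^2) + l2)."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / (sxx + l2)


def mse(w, points):
    """Mean squared error of the prediction w * x on the given points."""
    return sum((y - w * x) ** 2 for x, y in points) / len(points)


# Synthetic data (an assumption for this example): y = 3x + Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(-5.0, 5.0)
    data.append((x, 3.0 * x + rng.gauss(0.0, 1.0)))

# 70% training, 15% validation, 15% test.
train, val, test = data[:140], data[140:170], data[170:]

# Stages 1-3: train, evaluate on the VALIDATION set, tweak the hyperparameter.
candidates = [0.0, 1.0, 10.0, 100.0, 1000.0]
best_l2, best_w, best_val = None, None, float("inf")
for l2 in candidates:
    w = fit_slope(train, l2)
    val_mse = mse(w, val)
    if val_mse < best_val:
        best_l2, best_w, best_val = l2, w, val_mse

# Final step: confirm the chosen model against the test set, exactly once.
test_mse = mse(best_w, test)
print(f"chosen l2={best_l2}, validation MSE={best_val:.3f}, test MSE={test_mse:.3f}")
```

Because the test set plays no part in choosing `best_l2`, the final test error remains an estimate of performance on data the tuning process never saw.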
