Training and Test Sets

A test set is a data set used to evaluate the model developed from a training set.

Training and Test Sets

A horizontal bar divided into two pieces: 80% of which is the training set and 20% the test set.
Two models: one run on training data and the other on test data.  The model is very simple, just a line dividing the orange dots from the blue dots.  The loss on the training data is similar to the loss on the test data.
  • Divide into two sets:
    • training set
    • test set
  • Classic gotcha: do not train on test data
    • Getting surprisingly low loss?
    • Before celebrating, check if you're accidentally training on test data