ML Systems in the Real World: Literature

In this lesson, you'll debug a real-world ML problem* related to 18th century literature.

Real World Example: 18th Century Literature

  • A professor of 18th Century Literature wanted to predict the political affiliation of authors based only on the "mind metaphors" those authors used.
  • A team of researchers built a large labeled data set from many authors' works, sentence by sentence, and split it into training, validation, and test sets.
  • The trained model performed nearly perfectly on the test data, but the researchers felt the results were suspiciously accurate. What might have gone wrong?

[Image: old books]

Why do you think test accuracy was suspiciously high? See if you can figure out the problem before reading on.

  • Data Split A: Researchers put some of each author's examples in the training set, some in the validation set, and some in the test set.
[Diagram: author examples across the training, validation, and test sets. Examples from each of the three authors appear in all three sets.]
  • Data Split B: Researchers put all of each author's examples in a single set. For example, all of Richardson's examples might be in the training set, while all of Swift's examples might be in the validation set.
[Diagram: the training set contains only examples from Swift, the validation set contains only examples from Blake, and the test set contains only examples from Defoe.]
  • Results: The model trained on Data Split A had much higher accuracy than the model trained on Data Split B.

Why? With Data Split A, the same authors appeared in all three sets, so the model likely learned to recognize each author's individual style rather than anything about political affiliation; it could then score well on test examples from authors it had already seen. Data Split B forced the model to generalize to unseen authors, and accuracy fell.

The moral: carefully consider how you split examples.

Know what the data represents.
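
As a concrete illustration of Data Split B, here is a minimal sketch of a leakage-safe, author-grouped split using scikit-learn's GroupShuffleSplit. The sentences, authors, and labels below are hypothetical toy data, not from the actual study:

```python
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical toy data: one sentence per row, with its author and a
# (made-up) political-affiliation label.
sentences = [
    "the mind is a fortress",
    "thought runs like a river",
    "reason is a lamp",
    "the soul is a garden",
    "memory is a ledger",
    "the mind is a workshop",
]
authors = ["Swift", "Swift", "Blake", "Blake", "Defoe", "Defoe"]
labels = [0, 0, 1, 1, 0, 0]

# Hold out roughly a third of the *authors* (not the sentences) for
# testing. Grouping by author guarantees no author's sentences are
# split across sets, mirroring Data Split B.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=42)
train_idx, test_idx = next(splitter.split(sentences, labels, groups=authors))

train_authors = {authors[i] for i in train_idx}
test_authors = {authors[i] for i in test_idx}
assert train_authors.isdisjoint(test_authors)  # no author leakage
print("train authors:", train_authors)
print("test authors:", test_authors)
```

Because each author is treated as an indivisible group, accuracy measured on the test set reflects generalization to unseen authors rather than recognition of familiar ones.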

* We based this module very loosely (making some modifications along the way) on "Meaning and Mining: the Impact of Implicit Assumptions in Data Mining for the Humanities" by Sculley and Pasanek.