Classification: Check Your Understanding (ROC and AUC)

ROC and AUC

Explore the options below.

Which of the following ROC curves produce AUC values greater than 0.5?

An ROC curve with a vertical line running from (0,0) to (0,1), and a horizontal line from (0,1) to (1,1). The TP rate is 1.0 for all FP rates.

This is the best possible ROC curve, as it ranks all positives above all negatives. It has an AUC of 1.0.

In practice, if you have a "perfect" classifier with an AUC of 1.0, you should be suspicious, as it likely indicates a bug in your model. For example, you may have overfit to your training data, or the label data may be replicated in one of your features.
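To make the shape of this curve concrete, here is a minimal sketch in plain Python that sweeps the classification threshold over a tiny, made-up dataset and measures the area under the resulting ROC points with the trapezoid rule. The helper names roc_points and trapezoid_auc and the data are illustrative assumptions, not part of any library, and the sketch assumes no two examples share a score.

```python
def roc_points(labels, scores):
    """Return (FP rate, TP rate) points as the threshold is lowered one example at a time."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Visit examples from highest score to lowest (assumes distinct scores).
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up data: every positive scores higher than every negative.
labels = [1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.1]

points = roc_points(labels, scores)
auc = trapezoid_auc(points)
print(points)  # rises straight up to (0, 1), then runs right to (1, 1)
print(auc)     # 1.0
```

The points climb the left edge before moving along the top edge, which is the vertical-then-horizontal curve described in this option.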

An ROC curve with a horizontal line running from (0,0) to (1,0), and a vertical line from (1,0) to (1,1). The FP rate is 1.0 for all TP rates.

This is the worst possible ROC curve; it ranks all negatives above all positives, and has an AUC of 0.0. If you were to reverse every prediction (flip negatives to positives and positives to negatives), you'd actually have a perfect classifier!
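The effect of reversing predictions can be checked with a small sketch. It assumes scores in the range 0 to 1, so that "reversing" can be written as 1 - score, and it uses a hypothetical helper, pairwise_auc, that counts how often a positive outscores a negative. The data is made up for illustration.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up data: every negative scores higher than every positive.
labels = [1, 1, 0, 0]
scores = [0.1, 0.2, 0.8, 0.9]

# Reverse every prediction by mapping each score s to 1 - s.
flipped = [1 - s for s in scores]

worst_auc = pairwise_auc(labels, scores)
flipped_auc = pairwise_auc(labels, flipped)
print(worst_auc)    # 0.0
print(flipped_auc)  # 1.0
```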

An ROC curve with one diagonal line running from (0,0) to (1,1). TP and FP rates increase linearly at the same rate.

This ROC curve has an AUC of 0.5, meaning it ranks a random positive example higher than a random negative example 50% of the time. As such, the corresponding classification model is basically worthless, as its predictive ability is no better than random guessing.
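As a rough illustration of the "random guessing" reading of AUC, the sketch below assigns uniformly random scores to made-up labels and estimates AUC by comparing every positive with every negative. The helper pairwise_auc, the sample size, and the seed are assumptions chosen for the example.

```python
import random


def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


random.seed(0)  # fixed seed so the run is repeatable

# 500 positives and 500 negatives, each given a score unrelated to its label.
labels = [1] * 500 + [0] * 500
scores = [random.random() for _ in labels]

auc = pairwise_auc(labels, scores)
print(f"AUC of random scores: {auc:.3f}")  # expected to land close to 0.5
```

Because the scores carry no information about the labels, a positive outranks a negative in roughly half of the pairs.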

An ROC curve that arcs up and right from (0,0) to (1,1). TP rate increases at a faster rate than FP rate.

This ROC curve has an AUC between 0.5 and 1.0, meaning it ranks a random positive example higher than a random negative example more than 50% of the time. Real-world binary classification AUC values generally fall into this range.

An ROC curve that arcs right and up from (0,0) to (1,1). FP rate increases at a faster rate than TP rate.

This ROC curve has an AUC between 0 and 0.5, meaning it ranks a random positive example higher than a random negative example less than 50% of the time. The corresponding model actually performs worse than random guessing! If you see an ROC curve like this, it likely indicates there's a bug in your data.

AUC and Scaling Predictions

Explore the options below.

How would multiplying all of the predictions from a given model by 2.0 (for example, if the model predicts 0.4, we multiply by 2.0 to get a prediction of 0.8) change the model's performance as measured by AUC?

No change. AUC only cares about relative prediction scores.

Yes, AUC is based on the relative predictions, so any transformation of the predictions that preserves the relative ranking has no effect on AUC. This is clearly not the case for other metrics such as squared error, log loss, or prediction bias (discussed later).
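A short sketch can confirm this on a toy example. The labels and scores are made up, and pairwise_auc and mean_squared_error are hypothetical helpers written for this illustration rather than functions from a library.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_squared_error(labels, scores):
    """Average squared difference between each label and its predicted score."""
    return sum((y - s) ** 2 for y, s in zip(labels, scores)) / len(labels)


labels = [1, 0, 1, 0]
scores = [0.4, 0.3, 0.2, 0.1]
doubled = [2.0 * s for s in scores]  # e.g. a prediction of 0.4 becomes 0.8

auc_before = pairwise_auc(labels, scores)
auc_after = pairwise_auc(labels, doubled)
mse_before = mean_squared_error(labels, scores)
mse_after = mean_squared_error(labels, doubled)

print(auc_before, auc_after)  # 0.75 0.75: the ranking is unchanged
print(mse_before, mse_after)  # two different values: squared error depends on the actual scores
```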

It would make AUC terrible, since the prediction values are now way off.

Interestingly enough, even though the prediction values are different (and likely farther from the truth), multiplying them all by 2.0 would keep the relative ordering of prediction values the same. Since AUC only cares about relative rankings, it is not affected by scaling the predictions by any positive constant.

It would make AUC better, because the prediction values are all farther apart.

The amount of spread between predictions does not actually impact AUC. Even if the prediction score for a randomly drawn positive is only a tiny epsilon greater than the score for a randomly drawn negative, that pair still counts as a success contributing to the overall AUC score.
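The following sketch compares two made-up sets of scores with the same ordering: one where positives beat negatives by a tiny margin, and one where the scores are far apart. The helper pairwise_auc and the value of eps are assumptions chosen for illustration.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
eps = 1e-9  # a very small margin between positives and negatives

# Positives sit just barely above negatives.
tight = [0.5 + 2 * eps, 0.5 + eps, 0.5, 0.5 - eps]
# Same ordering, but the scores are spread widely apart.
wide = [0.99, 0.90, 0.10, 0.01]

tight_auc = pairwise_auc(labels, tight)
wide_auc = pairwise_auc(labels, wide)
print(tight_auc, wide_auc)  # 1.0 1.0: only the ordering matters, not the gap
```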
