Imagine a linear model with 100 input features:
  • 10 are highly informative.
  • 90 are non-informative.
  • Assume that all features have values between -1 and 1.

Which of the following statements are true?
  • L2 regularization will encourage many of the non-informative weights to be nearly (but not exactly) 0.0.
    True. L2 regularization encourages weights to be near 0.0, but not exactly 0.0.
  • L2 regularization will encourage most of the non-informative weights to be exactly 0.0.
    False. L2 regularization does not tend to force weights to exactly 0.0. It penalizes larger weights more than smaller weights, so as a weight gets close to 0.0, L2 "pushes" less forcefully toward 0.0 (the first sketch after this list demonstrates the effect).
  • L2 regularization may cause the model to learn a moderate weight for some non-informative features.
    True. Surprisingly, this can happen when a non-informative feature happens to be correlated with the label. In this case, the model incorrectly gives such non-informative features some of the "credit" that should have gone to informative features (the second sketch after this list builds exactly this situation).
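The "fading push" follows from the penalty itself: the L2 penalty on a weight w is proportional to w², so its gradient, the force pulling w toward 0.0, is proportional to w and vanishes as w approaches 0.0. The first sketch below illustrates this on the setup described above. Note the specifics are illustrative assumptions, not part of the exercise: it uses scikit-learn's Ridge as the L2-regularized linear model, an arbitrary alpha, and synthetic data.

```python
# Sketch 1: L2 shrinks non-informative weights toward 0.0 but not to 0.0.
# Assumes NumPy and scikit-learn; dataset, alpha, and seed are arbitrary.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_samples, n_informative, n_noise = 1_000, 10, 90

# Features uniform in [-1, 1], matching the setup above.
X = rng.uniform(-1.0, 1.0, size=(n_samples, n_informative + n_noise))

# Only the first 10 features actually drive the label.
true_w = np.zeros(n_informative + n_noise)
true_w[:n_informative] = rng.uniform(1.0, 2.0, size=n_informative)
y = X @ true_w + rng.normal(0.0, 0.1, size=n_samples)

# Ridge = linear regression with an L2 penalty (alpha chosen arbitrarily).
model = Ridge(alpha=10.0).fit(X, y)

noise_w = model.coef_[n_informative:]
print(f"largest |non-informative weight|: {np.abs(noise_w).max():.4f}")
print(f"weights exactly 0.0: {np.count_nonzero(noise_w == 0.0)} of {n_noise}")
# Expected pattern: small but nonzero magnitudes -- near 0.0, never exactly 0.0.
```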
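The second sketch illustrates the "stolen credit" case: it builds a feature that never appears in the label's true formula but is correlated with an informative feature, then checks the weight an L2-regularized model assigns it. As before, Ridge, alpha, and the synthetic data are illustrative assumptions.

```python
# Sketch 2: a correlated but non-informative feature picks up a moderate weight.
# Assumes NumPy and scikit-learn; dataset, alpha, and seed are arbitrary.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 1_000

informative = rng.uniform(-1.0, 1.0, size=n)
# Non-informative by construction (it never appears in y below), but strongly
# correlated with the informative feature and hence with the label.
lookalike = 0.9 * informative + 0.1 * rng.uniform(-1.0, 1.0, size=n)

y = 2.0 * informative + rng.normal(0.0, 0.1, size=n)

X = np.column_stack([informative, lookalike])
model = Ridge(alpha=10.0).fit(X, y)
print(f"informative weight: {model.coef_[0]:.3f}, "
      f"look-alike weight:  {model.coef_[1]:.3f}")
# Typical result: L2 splits the "credit" for y across the correlated pair,
# so the non-informative look-alike ends up with a moderate weight.
```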