Imagine a linear model with 100 input features:
10 are highly informative.
90 are non-informative.
Assume that all features have values between -1 and 1.
Which of the following statements are true?
L2 regularization will encourage many of the non-informative weights to be nearly (but not exactly) 0.0.
Yes, L2 regularization encourages weights to be near 0.0, but not exactly 0.0.
L2 regularization will encourage most of the non-informative weights to be exactly 0.0.
No. L2 regularization does not tend to force weights to exactly 0.0. L2 regularization penalizes larger weights more than smaller weights. As a weight gets close to 0.0, L2 "pushes" less forcefully toward 0.0.
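The behavior above is easy to verify numerically. The following is a minimal sketch of the scenario: a linear model with 100 features in [-1, 1], of which only the first 10 drive the label. The sample size, noise level, and L2 strength (`lam`) are illustrative assumptions, and the model is fit with the ridge-regression closed form rather than gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 100
X = rng.uniform(-1, 1, size=(n, d))           # all features in [-1, 1]
true_w = np.zeros(d)
true_w[:10] = rng.uniform(1, 2, size=10)      # 10 informative features
y = X @ true_w + rng.normal(0, 0.1, size=n)   # the other 90 are noise

lam = 10.0                                    # L2 strength (illustrative)
# Ridge closed form: w = (X^T X + lam * I)^(-1) X^T y
w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Non-informative weights are driven near 0.0, but none lands exactly on 0.0.
print("largest non-informative |weight|:", np.abs(w[10:]).max())
print("weights exactly 0.0:", int((w[10:] == 0).sum()))
```

Rerunning with an L1 penalty instead would drive many of those 90 weights to exactly 0.0; that contrast is the point of this question.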
L2 regularization may cause the model to learn a moderate weight for some non-informative features.
Yes. Surprisingly, this can happen when a non-informative feature happens to be correlated with the label. In this case, the model incorrectly gives such non-informative features some of the "credit" that should have gone to informative features.
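This "stolen credit" effect can also be sketched with a small numpy example. Here only feature 0 actually determines the label, but a second, nominally non-informative feature is constructed to correlate with feature 0 (and therefore with the label); the correlation strength and L2 setting are illustrative assumptions. Ridge regression splits some weight onto the correlated feature, while a genuinely uncorrelated noise feature stays near 0.0.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
informative = rng.uniform(-1, 1, n)
# Non-informative feature that happens to track the informative one.
correlated_noise = np.clip(informative + rng.normal(0, 0.3, n), -1, 1)
pure_noise = rng.uniform(-1, 1, n)
X = np.column_stack([informative, correlated_noise, pure_noise])
y = 2.0 * informative + rng.normal(0, 0.1, n)  # only feature 0 drives the label

lam = 50.0  # L2 strength (illustrative)
w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# w[1] (correlated noise) picks up a moderate weight at w[0]'s expense;
# w[2] (uncorrelated noise) stays near 0.0.
print("weights:", w)
```

With a stronger L2 penalty, even more of feature 0's credit shifts onto the correlated feature, since L2 prefers several moderate weights over one large one.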