Imagine a linear model with 100 input features:
10 are highly informative.
90 are non-informative.
Assume that all features have values between -1 and 1.
Which of the following statements are true?
L2 regularization will encourage many of the
non-informative weights to be nearly (but not exactly) 0.0.
Yes, L2 regularization encourages weights to be
near 0.0, but not exactly 0.0.
L2 regularization will encourage most of the
non-informative weights to be exactly 0.0.
No. L2 regularization does not tend to force weights
to exactly 0.0. The L2 penalty grows with the square of a weight,
so it penalizes larger weights more than smaller weights, and its
pull is proportional to the weight itself. As a weight gets close
to 0.0, L2 "pushes" less forcefully toward 0.0.
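
The shrinking "push" can be sketched in a few lines. This is only an
illustration, not part of the exercise: the function name l2_shrink and
the values chosen for learning_rate and l2_strength are assumptions.
The code applies gradient steps to the penalty term l2_strength * w^2
alone, ignoring any data loss.

```python
def l2_shrink(weight, learning_rate=0.1, l2_strength=0.5, steps=50):
    """Apply gradient steps on the L2 penalty l2_strength * weight**2 alone.

    All parameter values are illustrative assumptions.
    """
    history = [weight]
    for _ in range(steps):
        # Derivative of l2_strength * w^2 with respect to w.
        gradient = 2.0 * l2_strength * weight
        weight = weight - learning_rate * gradient
        history.append(weight)
    return history


history = l2_shrink(1.0)
first_step = abs(history[1] - history[0])
last_step = abs(history[-1] - history[-2])
print("final weight:", history[-1])
print("first step size:", first_step)
print("last step size:", last_step)
```

With these settings each step multiplies the weight by a constant
factor below 1, so the weight keeps getting smaller, the steps keep
getting smaller with it, and the weight does not land on exactly 0.0.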
L2 regularization may cause the model to learn a
moderate weight for some non-informative features.
Yes. Surprisingly, this can happen when a non-informative feature
happens, by chance, to be correlated with the label in the training
data. In this case, the model incorrectly gives such non-informative
features some of the "credit" that should have gone to informative
features.
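
The scenario in this exercise can be tried out with a small synthetic
experiment. This is a minimal sketch under stated assumptions, not the
course's own code: the sample count, the label noise, the true weights
of 1.0, the hyperparameters, and the helper name train_l2 are all
invented for illustration. It trains a linear model with an L2 penalty
by plain gradient descent and then inspects the learned weights.

```python
import random

random.seed(0)
N_SAMPLES, N_INFORMATIVE, N_NOISE = 200, 10, 90  # sample count is an assumption
N_FEATURES = N_INFORMATIVE + N_NOISE

# Every feature lies in [-1, 1]; only the first 10 influence the label.
X = [[random.uniform(-1.0, 1.0) for _ in range(N_FEATURES)]
     for _ in range(N_SAMPLES)]
# Assumed ground truth: weight 1.0 on each informative feature, plus noise.
y = [sum(row[:N_INFORMATIVE]) + random.gauss(0.0, 0.5) for row in X]


def train_l2(X, y, l2_strength=0.01, learning_rate=0.2, steps=150):
    """Minimize mean squared error + l2_strength * sum(w^2) by gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Prediction errors for the current weights.
        errors = [sum(wj * xj for wj, xj in zip(w, row)) - target
                  for row, target in zip(X, y)]
        for j in range(d):
            grad_mse = 2.0 * sum(e * row[j] for e, row in zip(errors, X)) / n
            grad_l2 = 2.0 * l2_strength * w[j]
            w[j] -= learning_rate * (grad_mse + grad_l2)
    return w


weights = train_l2(X, y)
informative = [abs(w) for w in weights[:N_INFORMATIVE]]
noise = [abs(w) for w in weights[N_INFORMATIVE:]]
exact_zeros = sum(1 for w in noise if w == 0.0)
mean_informative = sum(informative) / len(informative)
mean_noise = sum(noise) / len(noise)

print("non-informative weights exactly 0.0:", exact_zeros)
print("mean |w|, informative features:", mean_informative)
print("mean |w|, non-informative features:", mean_noise)
print("largest |w|, non-informative features:", max(noise))
```

Comparing the largest non-informative weight with the mean shows how
much "credit" a feature can pick up from a chance correlation with the
label. The exact numbers depend on the seed and on the assumed settings.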