Suppose an online shoe store wants to create a supervised ML model
that will provide personalized shoe recommendations to users. That is,
the model will recommend certain pairs of shoes to Marty and
different pairs of shoes to Janet. The system will use past user
behavior data to generate training data. Which of the following
statements are true?
"Shoe size" is a useful feature.
"Shoe size" is a quantifiable signal that likely has
a strong impact on whether the user will like the recommended
shoes. For example, if Marty wears size 9, the model shouldn't
recommend size 7 shoes.
"Shoe beauty" is a useful feature.
Good features are concrete and quantifiable.
Beauty is too vague a concept to serve as a useful feature.
Beauty is probably a blend of certain concrete features,
such as style and color. Style and color would each be
better features than beauty.
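As a rough illustration of what "concrete and quantifiable" can mean in practice, the sketch below turns shoe size, style, and color into a numeric feature vector. The vocabularies, field names, and the `encode_shoe` helper are hypothetical and chosen only for this example; a real catalog would define its own.

```python
# Hypothetical vocabularies (assumptions for illustration only).
STYLES = ["sneaker", "boot", "sandal"]
COLORS = ["black", "white", "red"]


def one_hot(value, vocabulary):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def encode_shoe(shoe):
    """Flatten a shoe record into a numeric feature vector.

    Layout: [size, one-hot style..., one-hot color...]
    """
    return (
        [float(shoe["size"])]
        + one_hot(shoe["style"], STYLES)
        + one_hot(shoe["color"], COLORS)
    )


example = {"size": 9, "style": "boot", "color": "black"}
features = encode_shoe(example)
```

There is no comparable encoding for "beauty": nothing in the catalog or the user logs states it directly, which is why the concrete attributes are the better features.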
"The user clicked on the shoe's description" is a useful label.
Users probably want to read more only about shoes that
they like. Clicks by users are, therefore, an observable, quantifiable
metric that could serve as a good training label. Since our training
data derives from past user behavior, our labels need to derive from
objective behaviors like clicks that strongly correlate with user
preferences.
"Shoes that a user adores" is a useful label.
Adoration is not an observable, quantifiable metric. The best we can
do is search for observable proxy metrics for adoration.
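One way to picture an observable proxy label is to derive it from logged behavior. The sketch below assumes a hypothetical impression log in which each record notes whether the user clicked the shoe's description; the field names and the `build_labeled_examples` helper are illustrative, not taken from any real system.

```python
# Hypothetical impression log (assumed schema, for illustration only).
impressions = [
    {"user": "marty", "shoe_id": "A1", "clicked_description": True},
    {"user": "marty", "shoe_id": "B2", "clicked_description": False},
    {"user": "janet", "shoe_id": "A1", "clicked_description": False},
    {"user": "janet", "shoe_id": "C3", "clicked_description": True},
]


def build_labeled_examples(log):
    """Pair each (user, shoe) impression with a binary click label.

    Label 1 means the user clicked the description; 0 means they did not.
    The click is only a proxy for liking the shoe, not a direct measurement.
    """
    return [
        ((event["user"], event["shoe_id"]), 1 if event["clicked_description"] else 0)
        for event in log
    ]


labeled = build_labeled_examples(impressions)
```

Adoration never appears in such a log, so it cannot be labeled directly; a recorded click can.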