Reducing Loss: Optimizing Learning Rate

Exercise 1

Set a learning rate of 0.03 on the slider. Keep hitting the STEP button until the gradient descent algorithm reaches the minimum point of the loss curve. How many steps did it take?
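
If you want to reproduce this kind of step counting outside the widget, the following is a minimal sketch. It assumes a hypothetical loss curve, (w - 5)**2, which is not necessarily the curve shown in the exercise. The function name `steps_to_minimum`, the starting weight, and the stopping tolerance are also illustrative assumptions, so the count it prints will probably differ from what you see on screen.

```python
def steps_to_minimum(learning_rate, start=0.0, tolerance=1e-3, max_steps=10_000):
    """Count gradient descent steps on the assumed loss (w - 5)**2.

    Returns the number of updates applied before the gradient magnitude
    drops below `tolerance`, or None if that never happens within
    `max_steps` updates.
    """
    w = start
    for step in range(max_steps):
        gradient = 2.0 * (w - 5.0)  # derivative of the assumed loss
        if abs(gradient) < tolerance:
            return step
        w -= learning_rate * gradient  # one press of STEP
    return None


print("steps needed:", steps_to_minimum(learning_rate=0.03))
```

Calling the function with a different `learning_rate` gives a rough way to compare settings before trying them on the slider.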

Exercise 2

Can you reach the minimum more quickly with a higher learning rate? Set a learning rate of 0.1, and keep hitting STEP until gradient descent reaches the minimum. How many steps did it take this time?

Exercise 3

How about an even larger learning rate? Reset the graph, set a learning rate of 1, and try to reach the minimum of the loss curve. What happened this time?
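
To inspect what a large learning rate does step by step, it can help to print the weight after every update. The sketch below again assumes a hypothetical quadratic loss, (w - 5)**2, and a starting weight of 0. Neither is taken from the widget, and `trace_weights` is an illustrative name.

```python
def trace_weights(learning_rate, num_steps, start=0.0):
    """Return the weight after each update on the assumed loss (w - 5)**2."""
    w = start
    history = [w]
    for _ in range(num_steps):
        gradient = 2.0 * (w - 5.0)  # derivative of the assumed loss
        w -= learning_rate * gradient
        history.append(w)
    return history


for step, w in enumerate(trace_weights(learning_rate=1.0, num_steps=6)):
    print(f"step {step}: w = {w:.2f}, loss = {(w - 5.0) ** 2:.2f}")
```

Compare the printed weights with the minimum of the assumed curve at w = 5, then check whether the widget behaves in a similar way.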

Optional Challenge

Can you find the "Goldilocks" learning rate for this curve, the one that is neither too small nor too large, where gradient descent reaches the minimum point in the fewest steps? What is the smallest number of steps required to reach the minimum?
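
One way to search for such a rate programmatically is to sweep a handful of candidates and count the steps each one needs. This is a sketch under stated assumptions: the loss (w - 5)**2, the starting weight, the tolerance, and the candidate list are all hypothetical. The best rate depends on the shape of the curve, so the value found here need not match the widget.

```python
def steps_to_minimum(learning_rate, start=0.0, tolerance=1e-3, max_steps=1000):
    """Count updates until the gradient of the assumed loss (w - 5)**2 is tiny.

    Returns None if the tolerance is not reached within `max_steps` updates.
    """
    w = start
    for step in range(max_steps):
        gradient = 2.0 * (w - 5.0)  # derivative of the assumed loss
        if abs(gradient) < tolerance:
            return step
        w -= learning_rate * gradient
    return None


# Hypothetical candidates; extend or refine the list as needed.
candidate_rates = [0.03, 0.1, 0.3, 0.5, 0.7, 1.0]
results = {rate: steps_to_minimum(rate) for rate in candidate_rates}

# Keep only the rates that actually reached the minimum.
converged = {rate: steps for rate, steps in results.items() if steps is not None}
best_rate = min(converged, key=converged.get)

for rate, steps in results.items():
    label = "did not converge" if steps is None else f"{steps} steps"
    print(f"learning rate {rate}: {label}")
print("best candidate:", best_rate)
```

A coarse sweep like this only narrows the range. Trying values around the best candidate on the slider is still the way to answer the challenge for the actual curve.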