Neural Net Initialization
This exercise uses the XOR data again, but examines the repeatability of training neural networks and the importance of initialization.
Task 1: Run the model as given four or five times. Before each trial, hit the Reset the network button to get a new random initialization. (The Reset the network button is the circular reset arrow just to the left of the Play button.) Let each trial run for at least 500 steps to ensure convergence. What shape does each model output converge to? What does this say about the role of initialization in non-convex optimization?
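If you want to poke at the same phenomenon outside the playground, here is a minimal NumPy sketch of an analogous experiment. The helper name train_xor, the architecture, and all hyperparameters are illustrative assumptions, not the playground's actual settings: it trains a small tanh network with a sigmoid output on the four XOR points using full-batch gradient descent, once per random seed, and prints the final loss. Because the loss surface is non-convex, the final losses typically differ from seed to seed.

```python
import numpy as np

# The four XOR inputs and labels.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_xor(layer_sizes, seed, steps=500, lr=0.5):
    """Train a tanh MLP with a sigmoid output on XOR; return the final loss.

    Illustrative hyperparameters; not the playground's actual settings.
    """
    rng = np.random.default_rng(seed)
    # Random initialization -- the quantity this exercise is about.
    Ws = [rng.normal(0.0, 1.0, (m, n))
          for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
    bs = [np.zeros((1, n)) for n in layer_sizes[1:]]

    for _ in range(steps):
        # Forward pass: tanh hidden layers, sigmoid output.
        acts = [X]
        for W, b in zip(Ws[:-1], bs[:-1]):
            acts.append(np.tanh(acts[-1] @ W + b))
        out = sigmoid(acts[-1] @ Ws[-1] + bs[-1])

        # Backward pass for squared loss.
        delta = (out - y) * out * (1.0 - out)  # dL/dz at the output
        for i in range(len(Ws) - 1, -1, -1):
            gW = acts[i].T @ delta
            gb = delta.sum(axis=0, keepdims=True)
            if i > 0:
                # Propagate through the tanh hidden layer below.
                delta = (delta @ Ws[i].T) * (1.0 - acts[i] ** 2)
            Ws[i] -= lr * gW
            bs[i] -= lr * gb

    # Evaluate the trained network.
    out = X
    for W, b in zip(Ws[:-1], bs[:-1]):
        out = np.tanh(out @ W + b)
    out = sigmoid(out @ Ws[-1] + bs[-1])
    return float(np.mean((out - y) ** 2))

# Task 1 analogue: the same small model, five random initializations.
for seed in range(5):
    print(f"seed {seed}: final loss = {train_xor([2, 4, 1], seed):.4f}")
```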
Task 2: Try making the model slightly more complex by adding a layer and a couple of extra nodes. Repeat the trials from Task 1. Does this add any additional stability to the results?
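Task 2 maps onto the same sketch by deepening and widening the layer list passed to train_xor from the snippet above (the sizes here are, again, illustrative). Comparing the spread of the printed losses against the Task 1 run is a rough stand-in for comparing playground runs.

```python
# Task 2 analogue: add a hidden layer and a few extra nodes, same five seeds.
for seed in range(5):
    print(f"seed {seed}: final loss = {train_xor([2, 6, 4, 1], seed):.4f}")
```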
(Answers appear just below the exercise.)
Answer to Task 1
The learned model had a different shape on each run. The converged test loss varied by almost a factor of two from the lowest run to the highest.
Answer to Task 2
Adding the layer and extra nodes produced more repeatable results. On each run, the resulting model looked roughly the same, and the converged test loss varied less across runs.