Neural Net Initialization
This exercise uses the XOR data again, but examines the repeatability
of training neural networks and the importance of initialization.
Task 1: Run the model as given four or five times. Before each trial,
click the Reset the network button to get a new random initialization.
(The Reset the network button is the circular reset arrow just to the
left of the Play button.) Let each trial run for at least 500 steps
to ensure convergence. What shape does each model output converge to?
What does this say about the role of initialization in non-convex
optimization?
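If you want to reproduce this experiment outside the interactive widget, the sketch below shows one way to do it, assuming TensorFlow/Keras. It is not the playground's code: the XOR-style data generation, the layer sizes, and the run_trials helper are all illustrative assumptions. Rebuilding the model on each trial re-draws the initial weights, which plays the role of the Reset the network button.

    # Minimal sketch of Task 1 (assumed setup, not the playground's code):
    # train the same tiny architecture several times, each trial starting
    # from a fresh random initialization, and compare the converged losses.
    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(200, 2)).astype("float32")
    y = (X[:, 0] * X[:, 1] > 0).astype("float32")  # XOR-style labels

    def run_trials(hidden_layers, n_trials=5, epochs=500):
        for trial in range(n_trials):
            # Building a fresh model re-draws the initial weights: the
            # analogue of clicking "Reset the network" before each run.
            layers = [tf.keras.Input(shape=(2,))]
            layers += [tf.keras.layers.Dense(n, activation="tanh")
                       for n in hidden_layers]
            layers.append(tf.keras.layers.Dense(1, activation="sigmoid"))
            model = tf.keras.Sequential(layers)
            model.compile(optimizer="sgd", loss="binary_crossentropy")
            model.fit(X, y, epochs=epochs, verbose=0)
            loss = model.evaluate(X, y, verbose=0)
            print(f"trial {trial}: converged loss = {loss:.4f}")

    run_trials(hidden_layers=[3])  # Task 1: a single small hidden layer

Because nothing but the initialization changes between trials, any spread in the printed losses comes from where each run started in the loss landscape.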
Task 2: Try making the model slightly more complex by adding a layer
and a couple of extra nodes. Repeat the trials from Task 1. Does this
make the results more stable?
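Continuing the hypothetical sketch from Task 1, the more complex model of Task 2 is a one-line change; the layer sizes here are again illustrative, not prescribed by the exercise.

    # Task 2: one extra hidden layer and a couple more nodes per layer.
    # With the same data and trial loop, compare how much the converged
    # losses vary across runs relative to the Task 1 architecture.
    run_trials(hidden_layers=[5, 5])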
(Answers appear just below the exercise.)
Answer to Task 1:
The learned model had a different shape on each run. The converged
test loss varied by almost 2x from the lowest run to the highest.
Answer to Task 2:
Adding the layer and extra nodes produced more repeatable results.
On each run, the resulting model looked roughly the same. Furthermore,
the converged test loss showed less variance between runs.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2024-08-21 UTC."],[[["Neural networks with the same architecture and data can converge to different solutions due to random initialization, highlighting its role in non-convex optimization."],["Increasing the complexity of a neural network by adding layers and nodes can improve the stability and repeatability of training results, leading to more consistent model performance."],["Initialization significantly impacts the final model and the variance in test loss, especially in simpler network structures."],["While simpler networks can exhibit diverse solutions and varying losses, more complex models demonstrate increased stability and repeatable convergence."]]],[]]