
# Training Neural Networks

Backpropagation is the most common training algorithm for neural networks. It makes gradient descent feasible for multi-layer neural networks. TensorFlow handles backpropagation automatically, so you don't need a deep understanding of the algorithm to use it. To get a sense of how it works, walk through the backpropagation algorithm visual explanation. As you scroll through that explanation, note the following:

• How data flows through the graph.
• How dynamic programming lets us avoid computing exponentially many paths through the graph. Here "dynamic programming" just means recording intermediate results on the forward and backward passes.
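To make the dynamic-programming idea concrete, here is a minimal sketch of how TensorFlow's `tf.GradientTape` records intermediate results on the forward pass and reuses them on the backward pass. The tiny two-weight network, input, and target below are invented for illustration, not values from the course:

```python
import tensorflow as tf

# A tiny two-layer network: y = w2 * relu(w1 * x).
w1 = tf.Variable(0.5)
w2 = tf.Variable(2.0)
x = tf.constant(3.0)
target = tf.constant(1.0)

with tf.GradientTape() as tape:
    hidden = tf.nn.relu(w1 * x)   # forward pass: intermediates are recorded
    y = w2 * hidden
    loss = (y - target) ** 2

# Backward pass: gradients reuse the recorded intermediates instead of
# re-deriving every path through the graph.
grads = tape.gradient(loss, [w1, w2])
print([g.numpy() for g in grads])
```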

# Training Neural Nets

• If it's differentiable, we can probably learn on it.
• Gradients can vanish: each additional layer can successively reduce signal vs. noise.
  • ReLUs are useful here.
• Gradients can explode: learning rates are important here.
  • Batch normalization (a useful knob) can help; see the sketch after this list.
• ReLU layers can die.
  • Keep calm and lower your learning rates.
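As a minimal sketch of those two knobs (the layer sizes, feature width, learning rate, and loss below are arbitrary assumptions, not values from the course), batch normalization and a lowered learning rate might look like this in Keras:

```python
import tensorflow as tf

# Illustrative only: batch normalization between ReLU layers helps keep
# activations in a trainable range, and a deliberately low learning rate
# reduces the risk of exploding gradients and dead ReLU units.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),  # assumed feature width
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # lower LR if ReLUs die
    loss="mse",
)
```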
• We'd like our features to have reasonable scales:
  • Roughly zero-centered; the [-1, 1] range often works well.
  • This helps gradient descent converge and avoids the NaN trap.
  • Avoiding outlier values can also help.
• Can use a few standard methods (sketched below):
  • Linear scaling
  • Hard cap (clipping) to a max and min
  • Log scaling
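Here is a minimal NumPy sketch of those three methods; the raw feature values and clipping bounds are made up for illustration:

```python
import numpy as np

values = np.array([2.0, 900.0, 3.5, 1.0, 5000.0])  # hypothetical raw feature

# Linear scaling to [-1, 1].
lo, hi = values.min(), values.max()
linear = 2 * (values - lo) / (hi - lo) - 1

# Hard cap (clipping) to a chosen min and max.
clipped = np.clip(values, a_min=0.0, a_max=100.0)

# Log scaling, useful for long-tailed distributions.
logged = np.log1p(values)
```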
• Dropout: another form of regularization, useful for NNs.
  • Works by randomly "dropping out" units in the network for a single gradient step.
  • There's a connection to ensemble models here.
  • The more you drop out, the stronger the regularization:
    • 0.0 = no dropout regularization.
    • 1.0 = drop everything out; the model learns nothing.
    • Intermediate values are more useful (see the sketch below).
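A minimal sketch of dropout in Keras; the architecture and the 0.2 rate are illustrative assumptions, not recommendations from the course:

```python
import tensorflow as tf

# A dropout rate of 0.2 randomly drops 20% of the previous layer's units
# on each training step; at inference time the layer passes values
# through unchanged (Keras handles the rescaling automatically).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),  # assumed feature width
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(rate=0.2),  # 0.0 = no dropout; 1.0 would learn nothing
    tf.keras.layers.Dense(1),
])
```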
[{ "type": "thumb-down", "id": "missingTheInformationINeed", "label":"Missing the information I need" },{ "type": "thumb-down", "id": "tooComplicatedTooManySteps", "label":"Too complicated / too many steps" },{ "type": "thumb-down", "id": "outOfDate", "label":"Out of date" },{ "type": "thumb-down", "id": "samplesCodeIssue", "label":"Samples/Code issue" },{ "type": "thumb-down", "id": "otherDown", "label":"Other" }]
[{ "type": "thumb-up", "id": "easyToUnderstand", "label":"Easy to understand" },{ "type": "thumb-up", "id": "solvedMyProblem", "label":"Solved my problem" },{ "type": "thumb-up", "id": "otherUp", "label":"Other" }]