In the interactive exercises below, you'll further explore the inner workings of
neural networks. First, you'll see how parameter and hyperparameter changes
affect the network's predictions. Then you'll use what you've learned to train a
neural network to fit nonlinear data.
Exercise 1
The widget below sets up a neural network with the following configuration:
- Input layer with 3 neurons containing the values
0.00, 0.00, and 0.00
- Hidden layer with 4 neurons
- Output layer with 1 neuron
- ReLU activation function applied to
all hidden layer nodes and the output node
Review the initial setup of the network (note: do not click the ▶️ or
>| buttons yet), and then complete the tasks below the widget.
Task 1
The values for the three input features to the neural network model are all
0.00. Click each of the nodes in the network to see all the initialized
values. Before hitting the Play (▶️) button, answer this question:
What kind of output value do you think will be produced: positive, negative,
or 0?
- Positive output value
- Negative output value
- Output value of 0
Now click the Play (▶️) button above the network, and watch all the hidden-layer
and output node values populate. Was your answer above correct?
Explanation:
The exact output value you get will vary based on how the weight
and bias parameters are randomly initialized. However, since each neuron
in the input layer has a value of 0, every weighted input term in the
hidden-layer calculations is multiplied by 0 and drops out. For example,
the first hidden-layer node calculation will be:
y = ReLU(w11 * 0.00 + w21 * 0.00 + w31 * 0.00 + b)
y = ReLU(b)
So each hidden-layer node's value will be equal to the ReLU value of its
bias (b), which is 0 if b is negative and b itself if b is 0 or
positive.
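The value of the output node will then be calculated as follows, where x11
through x41 are the values of the four hidden-layer nodes:
y = ReLU(w11 * x11 + w21 * x21 + w31 * x31 + w41 * x41 + b)
Because ReLU never returns a negative value, the output is either 0 or
positive; which one you see depends on the initialized weights and biases.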
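The calculation in this explanation can be sketched in a few lines of Python. This is a minimal sketch rather than the widget's own code: the `relu` and `dense` helper names, the random seed, and the uniform(-1, 1) initialization range are all assumptions made for illustration.

```python
import random


def relu(z):
    # ReLU returns 0 for negative inputs and the input itself otherwise.
    return max(0.0, z)


def dense(inputs, weights, biases):
    # weights[j][i] connects input i to node j of this layer.
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


rng = random.Random(42)  # arbitrary seed, chosen only for repeatability
inputs = [0.0, 0.0, 0.0]

# Hypothetical initialization: 3 inputs -> 4 hidden nodes -> 1 output node.
hidden_w = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
hidden_b = [rng.uniform(-1, 1) for _ in range(4)]
out_w = [[rng.uniform(-1, 1) for _ in range(4)]]
out_b = [rng.uniform(-1, 1)]

hidden = dense(inputs, hidden_w, hidden_b)
output = dense(hidden, out_w, out_b)

# With all-zero inputs, each hidden value reduces to ReLU(bias).
print(hidden == [relu(b) for b in hidden_b])
# ReLU on the output node rules out a negative result.
print(output[0] >= 0.0)
```

Changing the seed changes the individual values, but both checks hold for any initialization because they follow from the zero inputs and from ReLU itself.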
Task 2
Before modifying the neural network, answer the following question:
If you add another hidden layer to the neural network after the first hidden
layer, and give this new layer 3 nodes, keeping all input and weight/bias
parameters the same, which other nodes' calculations will be affected?
- All the nodes in the network, except the input nodes
- Just the nodes in the first hidden layer
- Just the output node
Now modify the neural network to add a new hidden layer with 3 nodes as follows:
- Click the + button to the left of the text 1 hidden layer to add a new
hidden layer before the output layer.
- Click the + button above the new hidden layer twice to add 2 more nodes
to the layer.
Was your answer above correct?
Explanation:
Only the output node changes. Because inference for this neural network
is "feed-forward" (calculations progress from the input layer to the output
layer), the addition of a new layer to the network will only affect nodes
after the new layer, not those that precede it.
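To make the feed-forward argument concrete, the sketch below runs inference before and after inserting a second hidden layer and compares the first hidden layer's values. It is an illustration under stated assumptions, not the widget's implementation: `forward`, `random_layer`, the seed, and the initialization range are hypothetical, and the output layer is given fresh weights here because its number of inputs changes from 4 to 3.

```python
import random


def relu(z):
    return max(0.0, z)


def forward(inputs, layers):
    """Return the node values of every layer, in order."""
    activations = []
    current = inputs
    for weights, biases in layers:
        current = [
            relu(sum(w * x for w, x in zip(row, current)) + b)
            for row, b in zip(weights, biases)
        ]
        activations.append(current)
    return activations


def random_layer(n_in, n_out, rng):
    # Hypothetical initializer: uniform weights and biases in (-1, 1).
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


rng = random.Random(0)  # arbitrary seed
inputs = [0.0, 0.0, 0.0]

hidden1 = random_layer(3, 4, rng)
output = random_layer(4, 1, rng)
before = forward(inputs, [hidden1, output])

# Insert a 3-node hidden layer between the first hidden layer and the output.
hidden2 = random_layer(4, 3, rng)
new_output = random_layer(3, 1, rng)  # output node now reads 3 values, not 4
after = forward(inputs, [hidden1, hidden2, new_output])

# The first hidden layer sits upstream of the new layer, so it is untouched.
print(before[0] == after[0])
```

The first hidden layer's parameters and inputs are identical in both runs, so its values match exactly; only the calculation feeding the output node has a different set of inputs.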
Task 3
Click the second node (from the top) in the first hidden layer of the network
graph. Before making any changes to the network configuration, answer the
following question:
If you change the value of the weight w12 (displayed below the first input
node, x1), which other nodes' calculations could be affected for some input
values?
- None
- The second node in the first hidden layer, all the nodes in the second
hidden layer, and the output node.
- All the nodes in the first hidden layer, the second hidden layer, and the
output layer.
Now, click in the text field for the weight w12 (displayed below the
first input node, x1), change its value to 5.00, and hit Enter.
Observe the updates to the graph.
Was your answer correct? Be careful when verifying your answer: if a node
value doesn't change, does that mean the underlying calculation didn't change?
Explanation:
The only node affected in the first hidden layer is the second node (the
one you clicked). The value calculations for the other nodes in the first
hidden layer do not contain w12 as a parameter, so they are not
affected. All the nodes in the second hidden layer are affected, as their
calculations depend on the value of the second node in the first
hidden layer. Similarly, the output node value is affected because its
calculations depend on the values of the nodes in the second hidden layer.
Did you think the answer was "none" because none of the node values in the
network changed when you changed the weight value? Note that an underlying
calculation for a node may change without changing the node's value
(e.g., ReLU(0) and ReLU(-5) both produce an output of 0).
Don't make assumptions about how the network was affected just by
looking at the node values; make sure to review the calculations as well.
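The difference between a changed calculation and a changed value can be checked with a short Python sketch. Everything here beyond the network shape and the new weight value of 5.00 is an assumption: the helper names, the seed, the initialization range, and the second test input `[1.0, 0.0, 0.0]` are hypothetical, and `layers[0][0][1][0]` is this sketch's way of addressing w12 (the weight from x1 into the second node of the first hidden layer).

```python
import copy
import random


def relu(z):
    return max(0.0, z)


def forward(inputs, layers):
    """Return the node values of every layer, in order."""
    activations = []
    current = inputs
    for weights, biases in layers:
        current = [
            relu(sum(w * x for w, x in zip(row, current)) + b)
            for row, b in zip(weights, biases)
        ]
        activations.append(current)
    return activations


def random_layer(n_in, n_out, rng):
    # Hypothetical initializer: uniform weights and biases in (-1, 1).
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


rng = random.Random(0)  # arbitrary seed
# 3 inputs -> 4 hidden nodes -> 3 hidden nodes -> 1 output node.
layers = [random_layer(3, 4, rng), random_layer(4, 3, rng), random_layer(3, 1, rng)]

changed = copy.deepcopy(layers)
changed[0][0][1][0] = 5.0  # set w12 to 5.00

# Zero inputs: w12 is multiplied by 0, so no node value moves,
# even though the second node's calculation now contains a different weight.
zeros = [0.0, 0.0, 0.0]
print(forward(zeros, layers) == forward(zeros, changed))

# A nonzero x1 (assumed input) exposes the change in the second node only.
sample = [1.0, 0.0, 0.0]
base = forward(sample, layers)
new = forward(sample, changed)
print([a != b for a, b in zip(base[0], new[0])])
```

Whether the downstream values in the second hidden layer and the output node visibly move depends on the random initialization, since ReLU can clamp a changed pre-activation back to 0.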
Exercise 2
In the Feature cross exercises
in the Categorical data module,
you manually constructed feature crosses to fit nonlinear data.
Now, you'll see if you can build a neural network that can automatically learn
how to fit nonlinear data during training.
Your task: configure a neural network that can separate the orange dots from
the blue dots in the diagram below, achieving a loss of less than 0.2 on both
the training and test data.
Instructions:
In the interactive widget below:
1. Modify the neural network hyperparameters by experimenting with some
   of the following config settings:
   - Add or remove hidden layers by clicking the + and - buttons to the
     left of the HIDDEN LAYERS heading in the network diagram.
   - Add or remove neurons from a hidden layer by clicking the + and -
     buttons above a hidden-layer column.
   - Change the learning rate by choosing a new value from the Learning rate
     drop-down above the diagram.
   - Change the activation function by choosing a new value from the
     Activation drop-down above the diagram.
2. Click the Play (▶️) button above the diagram to train the neural network
   model using the specified parameters.
3. Observe the visualization of the model fitting the data as training
   progresses, as well as the Test loss and Training loss values in
   the Output section.
4. If the model does not achieve loss below 0.2 on the test and training data,
   click reset, and repeat steps 1–3 with a different set of configuration
   settings. Repeat this process until you achieve the preferred results.
Our solution:
We were able to achieve both test and training loss below 0.2 by:
- Adding 1 hidden layer containing 3 neurons.
- Choosing a learning rate of 0.01.
- Choosing an activation function of ReLU.
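For readers who want to see what these settings mean outside the widget, here is a minimal sketch in plain Python. Only the 3-neuron ReLU hidden layer and the 0.01 learning rate come from the solution; the two-feature circle dataset, the sigmoid output node, the log loss, the per-example gradient updates, the epoch count, the seed, and all function names are assumptions made for illustration, so the loss values it prints will not match the widget's.

```python
import copy
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    # Clamp z so math.exp cannot overflow for extreme values.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, rng):
    # Assumed toy data: label 1.0 inside a circle of radius 0.6, else 0.0.
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [((a, b), 1.0 if a * a + b * b < 0.36 else 0.0) for a, b in points]


def init_params(hidden_units, rng):
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(hidden_units)]
    b1 = [0.0] * hidden_units
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden_units)]
    return w1, b1, w2, 0.0


def forward(params, x):
    w1, b1, w2, b2 = params
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return hidden, sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)


def log_loss(params, data):
    eps = 1e-12  # keeps math.log away from 0
    total = 0.0
    for x, y in data:
        _, p = forward(params, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(params, data, learning_rate, epochs):
    w1, b1, w2, b2 = copy.deepcopy(params)
    for _ in range(epochs):
        for x, y in data:
            hidden, p = forward((w1, b1, w2, b2), x)
            d_out = p - y  # gradient of log loss w.r.t. the output pre-activation
            for j in range(len(w2)):
                # ReLU passes gradient back only where the node is active.
                d_hidden = d_out * w2[j] if hidden[j] > 0 else 0.0
                w2[j] -= learning_rate * d_out * hidden[j]
                for i in range(len(x)):
                    w1[j][i] -= learning_rate * d_hidden * x[i]
                b1[j] -= learning_rate * d_hidden
            b2 -= learning_rate * d_out
    return w1, b1, w2, b2


rng = random.Random(0)  # arbitrary seed
train_data = make_data(200, rng)
test_data = make_data(100, rng)

initial = init_params(3, rng)  # 1 hidden layer with 3 neurons
trained = train(initial, train_data, learning_rate=0.01, epochs=100)

initial_train_loss = log_loss(initial, train_data)
final_train_loss = log_loss(trained, train_data)
print("training loss:", initial_train_loss, "->", final_train_loss)
print("test loss:", log_loss(trained, test_data))
```

Comparing training and test loss after each run mirrors the check the exercise asks for, and rerunning with a different `hidden_units` count or `learning_rate` corresponds to the reset-and-retry loop in the instructions.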