In the interactive exercises below, you'll further explore the inner workings of neural networks. First, you'll see how parameter and hyperparameter changes affect the network's predictions. Then you'll use what you've learned to train a neural network to fit nonlinear data.
Exercise 1
The widget below sets up a neural network with the following configuration (a code sketch of this architecture follows the list):
- Input layer with 3 neurons containing the values 0.00, 0.00, and 0.00
- Hidden layer with 4 neurons
- Output layer with 1 neuron
- ReLU activation function applied to all hidden-layer nodes and the output node
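If it helps to see this architecture in code, here is a minimal NumPy sketch of a 3-4-1 network with ReLU throughout. The variable names and random initialization scheme are our own assumptions; the widget's actual initial weights will differ.

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    # ReLU activation: max(0, x), applied elementwise.
    return np.maximum(0, x)

# Randomly initialized parameters, mirroring the widget's setup.
W1 = rng.normal(size=(3, 4))  # input layer (3 neurons) -> hidden layer (4 neurons)
b1 = rng.normal(size=4)
W2 = rng.normal(size=(4, 1))  # hidden layer (4 neurons) -> output layer (1 neuron)
b2 = rng.normal(size=1)

x = np.array([0.00, 0.00, 0.00])  # the three input values

hidden = relu(x @ W1 + b1)       # hidden-layer node values
output = relu(hidden @ W2 + b2)  # output node value
print(hidden, output)
```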
Review the initial setup of the network (note: do not click the ▶️ or >| buttons yet), and then complete the tasks below the widget.
Task 1
The values for the three input features to the neural network model are all 0.00. Click each of the nodes in the network to see all the initialized values. Before hitting the Play (▶️) button, consider: what do you expect the value of the output node to be?
Now click the Play (▶️) button above the network, and watch all the hidden-layer and output node values populate. Was your answer above correct?
Click here for an explanation
The exact output value you get will vary based on how the weight and bias parameters are randomly initialized. However, since each neuron in the input layer has a value of 0, the weights used to calculate the hidden-layer node values will all be zeroed out. For example, the first hidden layer node calculation will be:
y = ReLU(w₁₁ · 0.00 + w₂₁ · 0.00 + w₃₁ · 0.00 + b)
y = ReLU(b)
So each hidden-layer node's value will be equal to the ReLU value of the bias (b), which will be 0 if b is negative and b itself if b is 0 or positive.
The value of the output node will then be calculated as follows, where x₁₁ through x₄₁ are the values of the four hidden-layer nodes:
y = ReLU(w₁₁ · x₁₁ + w₂₁ · x₂₁ + w₃₁ · x₃₁ + w₄₁ · x₄₁ + b)
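You can check this reasoning numerically. Here's a small sketch (with made-up random weights, not the widget's actual values) showing that zero inputs reduce each hidden node to the ReLU of its bias:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))  # weights from the 3 inputs to the 4 hidden nodes
b1 = rng.normal(size=4)       # one bias per hidden node

x = np.zeros(3)  # all three inputs are 0.00

# Every weight term is multiplied by 0, so only the bias survives.
hidden = relu(x @ W1 + b1)
print(np.allclose(hidden, relu(b1)))  # True: each node's value is ReLU(b)
```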
Task 2
Before modifying the neural network, consider: if you add a new hidden layer with 3 nodes between the existing hidden layer and the output layer, which of the existing node values do you expect to change?
Now modify the neural network to add a new hidden layer with 3 nodes as follows:
- Click the + button to the left of the text 1 hidden layer to add a new hidden layer before the output layer.
- Click the + button above the new hidden layer twice to add 2 more nodes to the layer.
Was your answer above correct?
Click here for an explanation
Only the output node changes. Because inference for this neural network is "feed-forward" (calculations progress from start to finish), the addition of a new layer to the network will only affect nodes after the new layer, not those that precede it.
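The same property is easy to demonstrate in code. This sketch (using our own toy parameters, not the widget's) shows that the first hidden layer's values are identical before and after a new layer is inserted:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

rng = np.random.default_rng(1)
x = rng.normal(size=3)  # arbitrary nonzero inputs
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)

# Original network: input -> hidden layer (4 nodes) -> output.
h1 = relu(x @ W1 + b1)

# Insert a new 3-node hidden layer before the output. The first hidden
# layer's calculation doesn't involve the new parameters at all, so
# recomputing it gives exactly the same values.
W_new, b_new = rng.normal(size=(4, 3)), rng.normal(size=3)
h1_after = relu(x @ W1 + b1)  # unchanged: same inputs, same parameters
h2 = relu(h1_after @ W_new + b_new)

print(np.allclose(h1, h1_after))  # True: nodes before the new layer are unaffected
```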
Task 3
Click the second node (from the top) in the first hidden layer of the network graph. Before making any changes to the network configuration, consider: if you change the value of one of this node's input weights, which nodes in the network do you expect to be affected?
Now, click in the text field for the weight w₁₂ (displayed below the first input node, x₁), change its value to 5.00, and hit Enter. Observe the updates to the graph.
Was your answer correct? Be careful when verifying your answer: if a node value doesn't change, does that mean the underlying calculation didn't change?
Click here for an explanation
The only node affected in the first hidden layer is the second node (the one you clicked). The value calculations for the other nodes in the first hidden layer do not contain w₁₂ as a parameter, so they are not affected. All the nodes in the second hidden layer are affected, as their calculations depend on the value of the second node in the first hidden layer. Similarly, the output node value is affected because its calculations depend on the values of the nodes in the second hidden layer.
Did you think the answer was "none" because none of the node values in the network changed when you changed the weight value? Note that an underlying calculation for a node may change without changing the node's value (e.g., ReLU(0) and ReLU(–5) both produce an output of 0). Don't make assumptions about how the network was affected just by looking at the node values; make sure to review the calculations as well.
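Here's a quick illustration of that pitfall (the numbers are hypothetical, not taken from the widget): the node's underlying weighted sum changes, but ReLU clamps both results to 0, so the displayed value doesn't move.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Two different pre-activation sums can produce the same node value.
print(relu(0.0), relu(-5.0))  # 0.0 0.0

# Changing a weight changes the node's weighted sum...
x1, bias = -2.0, 1.0
old_sum = 1.0 * x1 + bias  # w = 1.0 -> sum is -1.0
new_sum = 5.0 * x1 + bias  # w = 5.0 -> sum is -9.0

# ...but the value shown in the graph can stay the same.
print(relu(old_sum), relu(new_sum))  # 0.0 0.0
```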
Exercise 2
In the Feature cross exercises in the Categorical data module, you manually constructed feature crosses to fit nonlinear data. Now, you'll see if you can build a neural network that can automatically learn how to fit nonlinear data during training.
Your task: configure a neural network that can separate the orange dots from the blue dots in the diagram below, achieving a loss of less than 0.2 on both the training and test data.
Instructions:
In the interactive widget below:
1. Modify the neural network hyperparameters by experimenting with some of the following config settings:
   - Add or remove hidden layers by clicking the + and - buttons to the left of the HIDDEN LAYERS heading in the network diagram.
   - Add or remove neurons from a hidden layer by clicking the + and - buttons above a hidden-layer column.
   - Change the learning rate by choosing a new value from the Learning rate drop-down above the diagram.
   - Change the activation function by choosing a new value from the Activation drop-down above the diagram.
2. Click the Play (▶️) button above the diagram to train the neural network model using the specified parameters.
3. Observe the visualization of the model fitting the data as training progresses, as well as the Test loss and Training loss values in the Output section.
4. If the model does not achieve a loss below 0.2 on both the test and training data, click the Reset button and repeat steps 1–3 with a different set of configuration settings. Repeat this process until you achieve the desired results.
Click here for our solution
We were able to achieve both test and training loss below 0.2 by:
- Adding 1 hidden layer containing 3 neurons.
- Choosing a learning rate of 0.01.
- Choosing an activation function of ReLU.
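For comparison, here is a rough Keras analogue of this configuration. The two-feature input, sigmoid output, and loss function are our assumptions (the widget's internals aren't exported), so exact losses will differ.

```python
import tensorflow as tf

# One hidden layer with 3 ReLU neurons, trained with SGD at learning rate 0.01.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),                      # two features, like x1 and x2
    tf.keras.layers.Dense(3, activation="relu"),     # 1 hidden layer, 3 neurons
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output: orange vs. blue
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    loss="binary_crossentropy",
)
# model.fit(features, labels, epochs=100)  # train on your own 2-D dataset
```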