Implement the backpropagation algorithm for neural networks and apply it to the task of handwritten digit recognition.
1. Neural Networks
- implement the backpropagation algorithm to learn the parameters for the neural network.
1.1 Visualizing the data
- 5000 training examples
- each training example is a 20 pixel by 20 pixel grayscale image of a digit
- the 20 by 20 grid of pixels is “unrolled” into a 400-dimensional vector (see the sketch below)
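To make the unrolling concrete, a minimal Octave sketch (the variable names img and x are illustrative):

img = rand(20, 20);        % stand-in for one 20x20 grayscale digit image
x = reshape(img, 1, 400);  % unroll into a 400-dimensional row vector (column-major order)

Stacking all 5000 such row vectors gives a 5000 x 400 matrix X, which the sketches below assume.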
1.2 Model representation
- 3 layers: an input layer (400 units, one per pixel), a hidden layer, and an output layer (10 units, one per digit class)
1.3 Feedforward and cost function
- implement the cost function and gradient for the neural network
- do not regularize the terms that correspond to the bias units
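The feedforward computation for the 3-layer model is the standard one (g is the sigmoid activation; a bias unit 1 is prepended at the input and hidden layers):

a^{(1)} = [1;\, x], \quad z^{(2)} = \Theta^{(1)} a^{(1)}, \quad a^{(2)} = [1;\, g(z^{(2)})], \quad z^{(3)} = \Theta^{(2)} a^{(2)}, \quad h_\Theta(x) = a^{(3)} = g(z^{(3)})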
Cost function with regularization:
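J(\Theta) = \frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \Big[ -y_k^{(i)} \log\big((h_\Theta(x^{(i)}))_k\big) - \big(1 - y_k^{(i)}\big) \log\big(1 - (h_\Theta(x^{(i)}))_k\big) \Big] + \frac{\lambda}{2m} \sum_{l=1}^{2} \sum_{j} \sum_{k \ge 1} \big(\Theta_{j,k}^{(l)}\big)^2

where K = 10 is the number of classes and the k = 0 (bias) columns of \Theta^{(1)}, \Theta^{(2)} are excluded from the regularization sum, per the note above.

A minimal Octave sketch of the feedforward pass and this cost; it assumes X (5000 x 400), y, lambda, num_labels, Theta1, Theta2 exist as described above, and is an illustration rather than the exercise's reference implementation:

sigmoid = @(z) 1 ./ (1 + exp(-z));        % element-wise sigmoid
m = size(X, 1);
a1 = [ones(m, 1) X];                      % add bias column: m x 401
a2 = [ones(m, 1) sigmoid(a1 * Theta1')];  % hidden activations plus bias
h = sigmoid(a2 * Theta2');                % m x num_labels hypotheses
Y = zeros(m, num_labels);                 % one-hot labels (assumes y in 1..num_labels)
for i = 1:m
  Y(i, y(i)) = 1;
end
J = (1/m) * sum(sum(-Y .* log(h) - (1 - Y) .* log(1 - h)));
J = J + (lambda/(2*m)) * ...              % regularize, skipping the bias columns
    (sum(sum(Theta1(:, 2:end).^2)) + sum(sum(Theta2(:, 2:end).^2)));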
2. Backpropagation
- compute the gradient for the neural network cost function
2.1 Sigmoid gradient
Gradient for the sigmoid function:
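g(z) = \frac{1}{1 + e^{-z}}, \qquad g'(z) = \frac{d}{dz} g(z) = g(z)\,\big(1 - g(z)\big)

A short Octave sketch, valid element-wise for vectors and matrices (the function name sigmoidGradient is assumed here):

function g = sigmoidGradient(z)
  s = 1 ./ (1 + exp(-z));  % g(z)
  g = s .* (1 - s);        % g'(z) = g(z) * (1 - g(z))
end

As a quick sanity check, sigmoidGradient(0) should return 0.25, the maximum of g'.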
2.2 Random initialization
- When training neural networks, it is important to randomly initialize the parameters for symmetry breaking.
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
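One way to package this initialization, sketched in Octave (the function name randInitializeWeights and the 25-unit hidden layer are illustrative assumptions, not given above):

function W = randInitializeWeights(L_in, L_out)
  epsilon_init = 0.12;  % weights drawn uniformly from [-0.12, 0.12]
  W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
end

Theta1 = randInitializeWeights(400, 25);  % 25 x 401 (hypothetical hidden size)
Theta2 = randInitializeWeights(25, 10);   % 10 x 26

If every weight started at the same value, all hidden units would compute identical functions and receive identical gradient updates; the small random range breaks that symmetry.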
2.3 Backpropagation
Intuition behind the backpropagation algorithm:
- Given a training example (x^{(t)}, y^{(t)}), first run a “forward pass” to compute all the activations throughout the network
- then, for each node j in layer l, compute an “error term” δ_j^{(l)} that measures how much that node was “responsible” for any errors in the output
Steps 1-4 to implement backpropagation (for each training example, or vectorized over all examples at once):
1. Run the forward pass to compute the activations a^{(1)}, a^{(2)}, a^{(3)} for layers 1-3, including the bias units.
2. For each output unit k in layer 3, set δ_k^{(3)} = a_k^{(3)} - y_k, where y_k ∈ {0, 1} indicates whether the example belongs to class k.
3. For the hidden layer, set δ^{(2)} = (Θ^{(2)})^T δ^{(3)} .* g'(z^{(2)}), dropping the bias component.
4. Accumulate the gradients with Δ^{(l)} := Δ^{(l)} + δ^{(l+1)} (a^{(l)})^T, then divide by m to obtain the (unregularized) gradient D^{(l)} = (1/m) Δ^{(l)}.
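A vectorized Octave sketch of steps 1-4 over all m examples at once (it reuses sigmoid, sigmoidGradient, and the one-hot matrix Y from the sketches above; looping over examples one at a time, as the steps describe, gives the same result):

m = size(X, 1);
a1 = [ones(m, 1) X];                                  % step 1: forward pass
z2 = a1 * Theta1';
a2 = [ones(m, 1) sigmoid(z2)];
a3 = sigmoid(a2 * Theta2');
d3 = a3 - Y;                                          % step 2: output-layer error terms
d2 = (d3 * Theta2(:, 2:end)) .* sigmoidGradient(z2);  % step 3: hidden layer, bias column of Theta2 dropped
Theta1_grad = (d2' * a1) / m;                         % step 4: accumulate and average
Theta2_grad = (d3' * a2) / m;
% add the regularization term, leaving the bias (first) columns untouched
Theta1_grad(:, 2:end) = Theta1_grad(:, 2:end) + (lambda/m) * Theta1(:, 2:end);
Theta2_grad(:, 2:end) = Theta2_grad(:, 2:end) + (lambda/m) * Theta2(:, 2:end);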