Perceptron - Learning Algorithm

Below is an example of a learning algorithm for a single-layer perceptron, a variant of stochastic gradient descent (SGD). For multilayer perceptrons, where a hidden layer exists, more sophisticated algorithms such as backpropagation must be used. Alternatively, if the activation function is nonlinear but differentiable, methods such as the delta rule can be used, although the algorithm below will often work as well.
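
As a concrete illustration, here is a minimal sketch of that rule in Python; the threshold activation, the AND dataset, and names such as train_perceptron are illustrative assumptions, not taken from the original text:

    # A minimal sketch of the single-layer perceptron learning rule.
    def heaviside(z):
        """Threshold activation: 1 if the weighted sum is non-negative, else 0."""
        return 1 if z >= 0 else 0

    def train_perceptron(samples, n, rate=0.1, epochs=100):
        """samples: list of (x, d) pairs; x is an n-dimensional tuple, d is 0 or 1."""
        w = [0.0] * (n + 1)                 # w[0] plays the role of the bias
        for _ in range(epochs):
            errors = 0
            for x, d in samples:
                xa = (1.0,) + tuple(x)      # constant-1 input at index 0 (see below)
                y = heaviside(sum(wi * xi for wi, xi in zip(w, xa)))
                if y != d:
                    errors += 1
                    # Perceptron update: w_i <- w_i + r * (d - y) * x_i
                    w = [wi + rate * (d - y) * xi for wi, xi in zip(w, xa)]
            if errors == 0:                 # all samples classified correctly
                break
        return w

    # Example: learn the linearly separable AND function.
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    print(train_perceptron(data, n=2))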

When multiple perceptrons are combined in an artificial neural network, each output neuron operates independently of all the others; thus, learning each output can be considered in isolation.
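
For example, a layer with two output neurons can be trained one output at a time, each on its own slice of the targets; the sketch below reuses the hypothetical train_perceptron from the example above:

    # A two-output layer decomposes into independent single-output problems:
    # each output neuron k sees the same inputs but its own targets d[k].
    layer_data = [
        ((0, 0), (0, 1)),   # (input, (target for output 0, target for output 1))
        ((0, 1), (0, 1)),
        ((1, 0), (0, 1)),
        ((1, 1), (1, 0)),   # output 0 learns AND, output 1 learns NAND
    ]
    weights_per_output = [
        train_perceptron([(x, d[k]) for x, d in layer_data], n=2)
        for k in range(2)
    ]
    print(weights_per_output)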

We first define some variables:

  • $y = f(\mathbf{z})$ denotes the output from the perceptron for an input vector $\mathbf{z}$.
  • $b$ is the bias term, which in the example below we take to be 0.
  • $D = \{(\mathbf{x}_1, d_1), \dots, (\mathbf{x}_s, d_s)\}$ is the training set of $s$ samples, where:
    • $\mathbf{x}_j$ is the $n$-dimensional input vector.
    • $d_j$ is the desired output value of the perceptron for that input.
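
For instance, a training set for the NAND function, written in the same hypothetical Python notation as the sketch above, might look like this:

    # D = {(x_1, d_1), ..., (x_s, d_s)}: s = 4 samples of the NAND function.
    # Each x_j is a 2-dimensional input vector and d_j its desired output.
    D = [
        ((0, 0), 1),
        ((0, 1), 1),
        ((1, 0), 1),
        ((1, 1), 0),
    ]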

We show the values of the nodes as follows:

  • $x_{j,i}$ is the value of the $i$th node of the $j$th training input vector.
  • $x_{j,0} = 1$.

To represent the weights:

  • $w_i$ is the $i$th value in the weight vector, to be multiplied by the value of the $i$th input node.

An extra dimension, with index 0, can be added to all input vectors, with $x_{j,0} = 1$, in which case $w_0$ replaces the bias term $b$.
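
A small sketch of that trick (illustrative Python, matching the examples above):

    # Absorbing the bias: give every input vector a constant-1 component at
    # index 0; a single dot product w . x then reproduces b + w_1 x_1 + ... ,
    # with w[0] playing the role of b.
    def dot(w, x):
        return sum(wi * xi for wi, xi in zip(w, x))

    w = [0.5, 2.0, -1.0]        # w[0] is the absorbed bias b
    x = (2.0, 4.0)
    augmented = (1.0,) + x      # the extra index-0 dimension
    print(dot(w, augmented))        # 0.5 + 2.0*2.0 + (-1.0)*4.0 = 0.5
    print(w[0] + dot(w[1:], x))     # identical: b + w_1 x_1 + w_2 x_2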

To show the time-dependence of $\mathbf{w}$, we use:

  • $w_i(t)$ is the weight $i$ at time $t$.
  • $r$ is the learning rate, where $0 < r \le 1$.
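
With these definitions in place, the per-sample weight update that the variables above support is the standard perceptron rule (stated here for orientation, since the step-by-step listing is not part of this excerpt): the actual output for the $j$th sample is computed as

    $y_j(t) = f(\mathbf{w}(t) \cdot \mathbf{x}_j) = f(w_0(t)\,x_{j,0} + w_1(t)\,x_{j,1} + \cdots + w_n(t)\,x_{j,n})$

and each weight is then adjusted by

    $w_i(t+1) = w_i(t) + r\,(d_j - y_j(t))\,x_{j,i}$, for all $0 \le i \le n$.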

Too high a learning rate makes the perceptron periodically oscillate around the solution unless additional steps are taken.
