The original perceptron nets consisted of a single layer of perceptron units. The inputs to the net went to each of the units, and the outputs of the units formed the output of the net. There was no communication between the units.

It can be demonstrated that if a perceptron net is capable of representing a function, training is guaranteed to find it, regardless of the initial state of the net prior to training.

Training is performed by presenting a sample input to the net and obtaining an output. An error is calculated from the difference between this output and the desired output for the sample input, and the weights of the net are adjusted to reduce this error. This is repeated until the error is sufficiently low (in a single-layer perceptron net that is able to learn the function, the error will reach zero). Because the desired output for each sample is supplied by a teacher, this is known as supervised learning.
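The training loop described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the step activation, the zero starting weights, the learning rate of 1.0, the AND function used as training data, and all names (`train_perceptron`, `AND_SAMPLES`) are assumptions chosen for the example.

```python
def step(x):
    """Threshold activation: the unit is on (1) when its input sum is non-negative."""
    return 1 if x >= 0 else 0


def train_perceptron(samples, rate=1.0, max_epochs=100):
    """Adjust weights sample by sample until a whole pass produces no errors.

    Returns (weights, bias, epochs_used); epochs_used is None if the cap is hit.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for epoch in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            output = step(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = target - output  # desired output minus actual output
            if error != 0:
                errors += 1
                # Move each weight in the direction that reduces the error.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if errors == 0:
            return weights, bias, epoch + 1
    return weights, bias, None


# Hypothetical training set: the logical AND of two inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias, epochs = train_perceptron(AND_SAMPLES)
print(weights, bias, epochs)
```

The epoch cap is only a safeguard for the sketch; for a function the net can represent, the loop ends on its own once a full pass produces no errors.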

A perceptron net will be able to learn a function only if it is linearly separable. With two inputs, this means that we can draw a straight line through the state space that separates it into two classes; with more inputs, the line becomes a plane or hyperplane.
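To make linear separability concrete, the sketch below searches a coarse grid of candidate lines for one that splits a two-input function into its two classes. The grid, the helper name `find_separating_line` and the two truth tables are assumptions for illustration. A failed grid search is not a proof on its own, but for XOR it agrees with the known result that no such line exists.

```python
from itertools import product


def find_separating_line(samples, grid):
    """Return some (w1, w2, b) from the grid such that w1*x1 + w2*x2 + b > 0
    exactly for the samples whose target is 1, or None if no grid point works."""
    for w1, w2, b in product(grid, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == bool(target)
               for (x1, x2), target in samples):
            return (w1, w2, b)
    return None


GRID = [i / 2 for i in range(-6, 7)]  # candidate values from -3.0 to 3.0

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_line = find_separating_line(AND, GRID)  # a line exists for AND
xor_line = find_separating_line(XOR, GRID)  # no line is found for XOR
print(and_line, xor_line)
```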

When it was realised that the single-layer net was incapable of solving the XOR problem, faith in the ability of the nets was lost. Experiments showed that if a second row of units was placed in front of the net, and the inputs were randomly connected to this pre-processing layer, the net would sometimes learn functions that were not linearly separable.

This showed that the addition of a second layer allowed more complex functions to be learnt, but a method of training these nets (multi-layer perceptrons) took a lot longer to emerge.

The output layer of a multi-layer perceptron is trained as in the single-layer perceptron: the weights are adjusted according to the error in the result. Each earlier layer is then adjusted according to the errors of the units that it feeds its outputs into, each multiplied by the weight of the connecting link. This method is known as back-propagation.
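One training step of this kind can be sketched for a small net with two inputs, two hidden units and one output. The sigmoid activation, the squared-error measure, the learning rate, the starting weights and the dictionary layout of `net` are all assumptions made for the example; the text does not prescribe them.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, inputs):
    """Return the hidden-unit outputs and the single output of the net."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, output


def train_step(net, inputs, target, rate=0.5):
    """Adjust every weight once for one sample; return the error before the step."""
    hidden, output = forward(net, inputs)
    # Output unit: error in the result, scaled by the slope of the sigmoid.
    delta_out = (output - target) * output * (1.0 - output)
    # Hidden units: error of the unit they feed, multiplied by the weight of
    # that link, scaled by their own slope (uses the weights before updating).
    delta_hidden = [delta_out * w * h * (1.0 - h)
                    for w, h in zip(net["w_out"], hidden)]
    net["w_out"] = [w - rate * delta_out * h for w, h in zip(net["w_out"], hidden)]
    net["b_out"] -= rate * delta_out
    for j, d in enumerate(delta_hidden):
        net["w_hidden"][j] = [w - rate * d * x
                              for w, x in zip(net["w_hidden"][j], inputs)]
        net["b_hidden"][j] -= rate * d
    return 0.5 * (output - target) ** 2


# Hypothetical starting weights, chosen by hand for the illustration.
net = {"w_hidden": [[0.5, -0.4], [0.3, 0.8]], "b_hidden": [0.1, -0.2],
       "w_out": [0.7, -0.6], "b_out": 0.05}
sample, target = (1, 0), 1
error_before = train_step(net, sample, target)
error_after = 0.5 * (forward(net, sample)[1] - target) ** 2
print(error_before, error_after)
```

Repeating `train_step` over all samples for many passes gives the full training procedure; a single step is shown here so that the flow of error from the output back to the hidden layer stays visible.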

Multi-layer perceptron nets overcome the linear separability problem, but there is no guarantee that the net will learn a function. This is due to local minima in the training process. The net can reach a point where any small change to the weights will result in increased error, so training stops there even though a lower error exists elsewhere. Changing the initial weights of the net prior to training can therefore give a different result once training is complete.
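The effect of the starting point can be seen in a deliberately simplified stand-in for the error surface: a single weight and a hand-picked curve with two dips. This is an analogy, not a neural net; the function `error`, the learning rate and the two starting values are assumptions for illustration.

```python
def error(w):
    """A made-up one-weight error curve with a shallow dip and a deeper dip."""
    return w ** 4 - 3 * w ** 2 + w


def slope(w):
    """Derivative of error(w)."""
    return 4 * w ** 3 - 6 * w + 1


def descend(w, rate=0.01, steps=2000):
    """Repeatedly move the weight a small step downhill."""
    for _ in range(steps):
        w -= rate * slope(w)
    return w


from_left = descend(-2.0)   # settles in the deeper dip
from_right = descend(2.0)   # settles in the shallower dip and stays there
print(from_left, error(from_left))
print(from_right, error(from_right))
```

Both runs stop where the slope is zero, but only one of them reaches the lower error, which mirrors how two differently initialised nets can finish training with different results.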

Another style of neural net is the Hopfield net. Hopfield nets can be used as content addressable memory (CAM). If you give a damaged or incomplete piece of data to the net, it can retrieve the clean prototype that it conforms to. The nets are based on the theory of the properties of magnetic materials, hence they are easy to study analytically. The net has an energy level at each point in time. Plotting the energy against the state of the net gives the energy surface. Stable states correspond to local minima in the energy surface.
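For reference, the energy of a Hopfield net is commonly written in the form below. The notation is an assumption, since the text does not define any: $s_i$ is the state of unit $i$ (taken here as $+1$ for on and $-1$ for off), $w_{ij}$ is the weight of the link between units $i$ and $j$, the weights are symmetric, and unit thresholds are taken to be zero.

```latex
E = -\frac{1}{2} \sum_{i} \sum_{j \neq i} w_{ij} \, s_i \, s_j
```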

The output of each unit in a Hopfield net is connected to the input of every other unit (but not to itself). The input to the net consists of the initial state (on or off) of each unit. Units are then selected at random, and each selected unit updates its state according to the weighted sum of its inputs. This continues until no unit changes its state any more. This stable state represents the output.
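The update procedure can be sketched as below. Several details are assumptions for the example: states are coded as +1 (on) and -1 (off), a unit turns on when its weighted input sum is non-negative, units are visited in a fresh random order on every sweep rather than drawn one at a time, and the three-unit weight matrix is chosen by hand.

```python
import random


def energy(weights, state):
    """Energy of a state, assuming symmetric weights and zero thresholds."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n) if i != j)


def settle(weights, state, rng, max_sweeps=100):
    """Update units in random order until a full sweep changes nothing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in rng.sample(range(n), n):  # every unit once, in random order
            total = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Hypothetical symmetric weights with no self-connections.
WEIGHTS = [[0, 1, -1],
           [1, 0, -1],
           [-1, -1, 0]]
start = [1, -1, -1]
final = settle(WEIGHTS, start, random.Random(0))
print(final, energy(WEIGHTS, start), energy(WEIGHTS, final))
```

Which stable state is reached can depend on the order in which units are updated, but under these assumptions an update never raises the energy, so the net always ends in a local minimum.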

Training a Hopfield net involves setting the weights so that the stable states represent only those patterns which you wish to store, and no others.
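The text does not say how the weights are set. One common choice, used here purely as an assumption, is a Hebbian rule that strengthens the link between two units whenever they agree in a stored pattern. The sketch below stores two hand-picked patterns (coded as +1 and -1) and then recalls one of them from a damaged copy; the names `store`, `is_stable` and `recall` are illustrative.

```python
def store(patterns):
    """Build symmetric weights from the patterns using a Hebbian rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no unit is connected to itself
                    weights[i][j] += p[i] * p[j] / n
    return weights


def unit_value(weights, state, i):
    """State that unit i takes given the weighted sum of the other units."""
    total = sum(weights[i][j] * state[j] for j in range(len(state)) if j != i)
    return 1 if total >= 0 else -1


def is_stable(weights, state):
    return all(unit_value(weights, state, i) == state[i] for i in range(len(state)))


def recall(weights, state, max_sweeps=50):
    """Update units in a fixed order (for repeatability) until nothing changes."""
    state = list(state)
    for _ in range(max_sweeps):
        if is_stable(weights, state):
            break
        for i in range(len(state)):
            state[i] = unit_value(weights, state, i)
    return state


PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]
weights = store([PATTERN_A, PATTERN_B])

damaged = [-1] + PATTERN_A[1:]      # PATTERN_A with its first unit flipped
recovered = recall(weights, damaged)
print(recovered == PATTERN_A)
```

With this rule the inverse of every stored pattern is also a stable state, so "no others" is an ideal that simple training rules only approximate.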

Hopfield nets can also be used to solve optimisation problems, such as the Travelling Salesman Problem. Each stable state corresponds to a possible solution. The lower the stable state lies on the energy surface, the better the solution.