The perceptron learning rule is used to train a simple (single
layer) perceptron. In order to train a perceptron, you need to have a
set of linearly separable test cases. Linearly separable means that
the inputs requiring an output of 1 can be separated from those
requiring 0 by a single straight line (more generally, a
hyperplane). The inputs defining an XOR function are not linearly
separable, and thus the XOR function cannot be reproduced by a
simple perceptron.
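To see this concretely, here is a small Python sketch (all names are
ours, purely illustrative) that applies the learning rule described
below to the XOR cases. Because they are not linearly separable, no
pass over the cases ever ends with every output correct:

```python
def hardlim(n):
    # Hard-limit transfer function: 1 for a positive net input, else 0
    return 1 if n > 0 else 0

# XOR: the output is 1 exactly when the two inputs differ
xor_cases = [([0, 0], 0), ([1, 0], 1), ([0, 1], 1), ([1, 1], 0)]

w, b = [0, 0], 0
errors = len(xor_cases)
for epoch in range(1000):
    errors = 0
    for p, t in xor_cases:
        e = t - hardlim(w[0] * p[0] + w[1] * p[1] + b)
        if e != 0:
            errors += 1
            w = [w[0] + e * p[0], w[1] + e * p[1]]
            b += e
    if errors == 0:
        break  # this never happens for XOR
# After any number of passes, at least one case is still misclassified.
```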

The perceptron learning rule basically boils down to adding or
subtracting the input vector from the weight vector, and adding or
subtracting 1 from the bias, depending on an error signal
e = t - a, where t is the target output and a is the actual output.
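In code, a single application of the rule might look like the
following Python sketch (hardlim and train_step are illustrative
names; hardlim here returns 1 only for a strictly positive net
input, matching the worked example below):

```python
def hardlim(n):
    # Hard-limit transfer function: 1 for a positive net input, else 0
    return 1 if n > 0 else 0

def train_step(w, b, p, t):
    """One application of the perceptron learning rule.

    w and p are lists of equal length; b is the bias and t the
    target output (0 or 1). Returns the updated weights and bias.
    """
    n = sum(wi * pi for wi, pi in zip(w, p)) + b  # net input
    e = t - hardlim(n)                            # error signal: -1, 0, or 1
    w = [wi + e * pi for wi, pi in zip(w, p)]     # add/subtract input vector
    b = b + e                                     # add/subtract from bias
    return w, b
```

For example, train_step([0, 0], 0, [1, 1], 1) returns ([1, 1], 1),
which is exactly the first update in the worked example below.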

As an example, say you have four test cases, and say that the
weight vector and the bias are both zero:

- p1 = (0 0), t1 = 0
- p2 = (1 0), t2 = 0
- p3 = (0 1), t3 = 0
- p4 = (1 1), t4 = 1

These cases define a boolean AND. Starting with the last case, p4,
since it is the only one that will make a difference at the start
(hardlim returns 1 for a positive net input and 0 otherwise), we
have:

a = hardlim((1 1)(0 0)^{T} + 0) = 0

We now have an error in the output: e = t4 - a = 1 - 0 = 1. We can
correct this error:

Wnew = Wold + e p4 = (0 0) + (1)(1 1) = (1 1)

bnew = bold + e = 0 + 1 = 1

We should now iterate through the cases again, to make sure they
all still work. Starting with p1:

a = hardlim((0 0)(1 1)^{T} + 1) = 1, e = t1 - a = 0 - 1 = -1

Wnew = Wold + e p1 = (1 1) + (-1)(0 0) = (1 1)

bnew = bold + e = 1 + (-1) = 0

... until e = 0 for all cases.

After iterating over the cases until they all produce the correct
output (i.e. e = 0), the network will have learned to classify all
of the inputs correctly.
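The whole procedure can be sketched in Python as a loop that
repeats until a full pass over the cases produces no errors (again,
the names and the hardlim convention are ours, not from any
particular library):

```python
def hardlim(n):
    # 1 for a positive net input, else 0
    return 1 if n > 0 else 0

def train(cases, max_epochs=100):
    """Perceptron learning rule: repeat until e = 0 for every case."""
    w, b = [0] * len(cases[0][0]), 0
    for _ in range(max_epochs):
        errors = 0
        for p, t in cases:
            e = t - hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
            if e != 0:
                errors += 1
                w = [wi + e * pi for wi, pi in zip(w, p)]
                b += e
        if errors == 0:  # a full pass with no corrections: done
            break
    return w, b

# The boolean AND cases from the example above
and_cases = [([0, 0], 0), ([1, 0], 0), ([0, 1], 0), ([1, 1], 1)]
w, b = train(and_cases)
```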

The exact final configuration of the network will depend on the
order in which the test cases are evaluated. There are many possible
configurations that give the correct answers.
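As a quick illustration of that order-dependence, the following
self-contained Python sketch (illustrative names, zero initial
weights and bias) presents the same AND cases in two different
orders; the two runs end in different, but both correct,
configurations:

```python
def hardlim(n):
    # 1 for a positive net input, else 0
    return 1 if n > 0 else 0

def train(cases, max_epochs=100):
    # Perceptron learning rule, starting from zero weights and bias
    w, b = [0] * len(cases[0][0]), 0
    for _ in range(max_epochs):
        errors = 0
        for p, t in cases:
            e = t - hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
            if e != 0:
                errors += 1
                w = [wi + e * pi for wi, pi in zip(w, p)]
                b += e
        if errors == 0:
            break
    return w, b

and_cases = [([0, 0], 0), ([1, 0], 0), ([0, 1], 0), ([1, 1], 1)]

# Same cases, two presentation orders: both converge, but to
# different weight vectors and biases.
forward = train(and_cases)
reverse = train(list(reversed(and_cases)))
```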