(f ∗ g)(t) = ∫_{-∞}^{∞} f(τ) g(t − τ) dτ

Convolution "blends" two functions into each other: it measures the amount of overlap as f is shifted across g. A weighted moving average is a convolution with the weighting function. In the case of a plain moving average, the weighting function is just a constant along the width of the window. In the case of Gaussian blur, a type of image blur, it is a Gaussian curve. Another convolution is found in probability: if two independent random variables X and Y have density functions f and g, then X + Y has the density f ∗ g. A shadow is a convolution of the shape of the object with the shape of the bundle of light rays casting the shadow. Etc.
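The blending is easiest to see in discrete form. A minimal Python sketch (the function and sample data are my own illustration, not from any particular library): a spike convolved with a flat window of weights is smeared into a moving average.

```python
def convolve(f, g):
    """Discrete convolution: (f * g)[n] = sum over k of f[k] * g[n - k]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

# A 3-point moving average is convolution with a constant weighting function.
signal = [0, 0, 3, 0, 0]       # a single spike
weights = [1/3, 1/3, 1/3]      # flat window of total weight 1
smoothed = convolve(signal, weights)
# the spike is smeared evenly across three samples of equal height
```

Swapping the flat window for samples of a Gaussian curve gives exactly the Gaussian blur mentioned above.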

The working definition is: a convolution of two functions (f ∗ g), Fourier or Laplace transformed, becomes a product of the individual transforms (F · G), or:

L{f(t) ∗ g(t)}(s) = F(s) G(s).
The integral form of a convolution isn't commonly met, so this direction of the theorem helps little with transforming functions. More useful is the inverse direction: the inverse of a product of two transforms (F and G) is the convolution of their inverses (f and g):
L⁻¹{F(s) G(s)}(t) = f(t) ∗ g(t).
This allows us to extract the solution from its transform, removing the guesswork common in determining inverses. In other words, multiplication in the Laplace-transformed world corresponds to convolution in the regular world. With the Fourier transform the property also works the other way around (not with Laplace, though): the transform of a product is a convolution, with a factor of 1/2π:
F{f·g} = (1/2π) F ∗ G
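The theorem can be spot-checked numerically in its discrete, circular form, where it is exact: the DFT of a circular convolution equals the pointwise product of the DFTs. A sketch using only the standard library (the sequences are arbitrary illustrative data):

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circular_convolve(f, g):
    """(f * g)[n] = sum over m of f[m] * g[(n - m) mod N]."""
    N = len(f)
    return [sum(f[m] * g[(n - m) % N] for m in range(N)) for n in range(N)]

f = [1.0, 2.0, 0.0, -1.0]
g = [0.5, 0.0, 1.0, 0.0]
lhs = dft(circular_convolve(f, g))             # transform of the convolution
rhs = [F * G for F, G in zip(dft(f), dft(g))]  # product of the transforms
assert all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```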


First, notice that convolution "inherits" some properties from multiplication.

  1. Since F(s)G(s) commutes to G(s)F(s), convolution does, too: f ∗ g = g ∗ f.
  2. The associative property follows from the same argument: f ∗ (g ∗ h) = (f ∗ g) ∗ h.
  3. Because a convolution is an integral, which is a linear transform, convolution distributes: f ∗ (g + h) = f ∗ g + f ∗ h.
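All three inherited properties can be checked numerically with a discrete convolution (the helper and the sample sequences are my own sketch):

```python
def conv(f, g):
    """Discrete convolution of two finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def close(xs, ys):
    return len(xs) == len(ys) and all(abs(x - y) < 1e-9 for x, y in zip(xs, ys))

f = [1.0, -2.0, 3.0]
g = [0.5, 4.0, -1.0]
h = [2.0, 1.0, 1.0]

assert close(conv(f, g), conv(g, f))                    # 1. commutative
assert close(conv(f, conv(g, h)), conv(conv(f, g), h))  # 2. associative
s = [x + y for x, y in zip(g, h)]
assert close(conv(f, s),
             [x + y for x, y in zip(conv(f, g), conv(f, h))])  # 3. distributive
```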

However, convolution corresponds to multiplication but is not multiplication, because it is an integral. For this reason, these properties of products do not generally hold for convolutions:

  1. Multiplication by itself (square) is positive: f · f ≥ 0
  2. Multiplicative identity is number one: f · 1 = f
A good counterexample to both is sin(x), which convolved with itself gives a combination of sine and cosine terms that is not always positive. And instead of the number one, the identity is an impulse of one: the Dirac delta function. The Dirac delta describes a very short but very powerful impulse, like an unlimitedly forceful but unlimitedly short hammerblow carrying an impulse of one unit. (Some food for thought: in the worked example below, notice how the phase shift in the superposition approaches zero as the input approaches zero duration.)
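The identity claim can be illustrated with a sketch (my own, with arbitrary sample data): in the discrete world the identity is the unit impulse (Kronecker delta), and in the continuous world a tall narrow pulse of area one acts approximately like the Dirac delta.

```python
import math

def conv(f, g):
    """Discrete convolution of two finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

# Discrete identity: the unit impulse.
f = [0.0, 1.0, 4.0, 2.0]
delta = [1.0]
assert conv(f, delta) == f

# Continuous approximation: a pulse of width h and area 1 nearly reproduces
# f at the point of evaluation; here f = sin, evaluated at t = 1.2.
h = 1e-3
t = 1.2
steps = 100
# (f * delta_h)(t) = (1/h) * integral of sin(t - tau) over 0 < tau < h
approx = sum(math.sin(t - (i + 0.5) * h / steps) * (h / steps)
             for i in range(steps)) / h
assert abs(approx - math.sin(t)) < 1e-3
```

As h shrinks toward zero, the approximation tightens, which is the sense in which the delta is the identity.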

In chemical engineering, if you inject a marker into a vessel with the concentration-per-time function X(t), and measure the output marker concentration Y(t), then the output is the convolution of the input with the residence time distribution E(t) of the vessel: Y = X ∗ E. E(t) is the distribution of how long individual particles are retained inside the vessel. You can measure the residence time distribution directly by choosing the right function X(t) for the marker feed: a sharp spike approximating the Dirac delta function, so that Y(t) ≈ E(t). The step function (aka Heaviside function) is another option.
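A sketch of that idea, under hypothetical assumptions: an ideal stirred tank with mean residence time T has E(t) = e^(−t/T)/T, and feeding a near-delta spike of marker reproduces E(t) at the outlet. The vessel model and all numbers here are illustrative.

```python
import math

T = 2.0                                  # hypothetical mean residence time
def E(t):                                # ideal stirred tank: E(t) = exp(-t/T)/T
    return math.exp(-t / T) / T

spike_width = 0.01                       # sharp marker injection of unit area
def X(t):
    return 1.0 / spike_width if 0.0 <= t < spike_width else 0.0

def Y(t, dt=1e-4):
    """Outlet concentration Y(t) = (X * E)(t), by Riemann sum over the spike."""
    n = int(spike_width / dt)
    return sum(X(i * dt) * E(t - i * dt) * dt for i in range(n))

# With a near-delta input, the measured outlet curve is (almost) E itself.
assert abs(Y(3.0) - E(3.0)) < 1e-3
```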

The proofs given for the three properties listed above rely on the Laplace transform, but proofs directly from the definition also exist. They are beyond the scope of my writeup, but nevertheless, ariels' comments are included here. First, commutativity may be proven by just switching the parameters τ and t−τ between the functions. To prove associativity, ariels informs us that the Lebesgue integral is the way to go. To formally prove that the Dirac delta is the identity, use this:

C is the space of infinitely differentiable functions (i.e. 1st, 2nd, ... derivatives all exist) that tend in a "nice" way to 0. You use a subspace of that -- the space of all such functions with bounded support (i.e. infinitely differentiable and 0 for |x|>M -- a "small" space of functions) -- to define the dual vector space of distributions. These are linear functionals on such functions. One such functional is f → f(0) (evaluation at 0), and this is what gets called the delta function.

With derivatives and integrals, the behavior is also convenient: a derivative can be moved onto either factor of a convolution (here a subscript denotes a derivative):

(f ∗ g)x = f ∗ gx = fx ∗ g

The area under a convolution is the product of the areas under the convolved functions:

∫_{-∞}^{∞} (f ∗ g)(x) dx = ∫_{-∞}^{∞} ( ∫_{-∞}^{∞} f(τ) g(x − τ) dτ ) dx = ( ∫_{-∞}^{∞} f(τ) dτ ) ( ∫_{-∞}^{∞} g(x) dx )

(truth from MathWorld)
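In discrete form this identity is exact and easy to check: the sum of a convolution equals the product of the sums (a quick sketch with arbitrary sample data):

```python
def conv(f, g):
    """Discrete convolution of two finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f = [1.0, 2.5, -0.5]
g = [3.0, 0.5, 1.0, 2.0]
# sum of the convolution = (sum of f) * (sum of g)
assert abs(sum(conv(f, g)) - sum(f) * sum(g)) < 1e-9
```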

Unabridged worked example

Let us theorize no more. An example of differential equation solving follows. (All capitalized functions are Laplace transforms of the functions in lowercase, that is: L{f} = F. Look up Laplace transform if you are unfamiliar with it.)

There is a heavy weight attached to a light spring, without any damping. Initially, it's at rest. We ask: what is the movement of the weight under a force r(t)? We'll pin down r(t) later, to show that the method is universal. The distance of the weight from rest is called y(t). Thus, the velocity of the weight is y'(t), and its acceleration is y''(t). At first, nothing moves, so y(0) = 0, y'(0) = 0.

Let us assume that the spring is a perfect Hooke's law spring, meaning the further we pull it, the larger the force is, so that:

Fspring = k·y(t)
where k is the spring constant. The other point to note is that whatever force is not used to stretch the spring is used to accelerate the weight:
Finertia = m·a = m·y''(t)
The force r(t) is used up completely by these two mechanisms, so it's their sum:
m·y''(t) + k·y(t) = r(t)

This is a second-order differential equation. Either we solve it by "inherited wisdom" or trial and error, or with a systematic method. The Laplace transform is a fine demonstration of why math is not hard, after all: it turns derivatives, like y''(t), into products like s²·Y(s). Here the hard-to-solve differential equation transforms into an easier, algebraic form:
m·(s²·Y(s) − s·y(0) − y'(0)) + k·Y(s) = R(s)
Both the y(0) and y'(0) terms are zero, because at first, there is no displacement (y) or velocity (y'), which simplifies this further:
m·s²·Y(s) + k·Y(s) = R(s)
From this, solving for Y:
(m·s² + k) Y(s) = R(s)
Y(s) = R(s) · 1/(m·s² + k)
Now, notice it's in the form of a product of transformed functions, so we can write
Y(s) = R(s) Q(s)

           1
Q(s) = --------
       m·s² + k
Because Y = R·Q, correspondingly, the reaction y(t) is the convolution y = r ∗ q. We need to find q(t), that is, the inverse transform L⁻¹{Q(s)}, and convolve it with r(t). This is made easier by ready-made Laplace transform tables.

From the node Laplace transform, we learn this:

                a
L{sin(at)} = -------
             s² + a²

Q(s) can be written in a form similar to this equation. First, factor 1/m out of Q(s), and then set a² = k/m. Finally, multiply and divide by a:

       1     1       1      a
Q(s) = - -------- = --- -------
       m s² + k/m   m·a s² + a²
and then the inverse is written down:
q(t) = L⁻¹{Q(s)} = (1/(m·a)) sin(a·t), where
a = √(k/m).
All we have left to do is to convolute this with r(t).
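As a sanity check (with hypothetical sample values for m and k), the Laplace transform of q(t) = sin(a·t)/(m·a) can be computed numerically from the defining integral ∫_0^∞ e^(−st) q(t) dt and compared against Q(s) = 1/(m·s² + k):

```python
import math

m, k = 2.0, 8.0                      # hypothetical mass and spring constant
a = math.sqrt(k / m)

def q(t):
    """Claimed inverse transform of Q(s)."""
    return math.sin(a * t) / (m * a)

def laplace(f, s, T=60.0, n=60_000):
    """Numerical Laplace transform: midpoint rule on [0, T]."""
    dt = T / n
    return sum(math.exp(-s * (i + 0.5) * dt) * f((i + 0.5) * dt) * dt
               for i in range(n))

s = 1.0
Q = 1.0 / (m * s**2 + k)             # table value at this s
assert abs(laplace(q, s) - Q) < 1e-4
```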

We have to know the force to use this equation. Let's choose something simple, yet with inconspicuous complexities. Say r(t) is just one newton for one second, that is, r(t) = 1 on 0 < t < 1 and 0 elsewhere. The convolution, with the factor 1/(m·a) conveniently moved to the front of the integral, is:

r ∗ q = (1/(m·a)) ∫_{-∞}^{∞} r(τ) sin(a(t − τ)) dτ
There are two phases to be considered separately: first when the force is on, and second when the force has ceased. The integrand is nonzero only where r(τ) is:
r(τ) sin(a(t − τ)) = sin(a(t − τ)), if 0 < τ < 1,
r(τ) sin(a(t − τ)) = 0, elsewhere.

Integrating from 0 to t when 0 < t < 1 gives a picture of the movement while the force is still acting and all of the impulse has not yet been delivered. The integration does not run between the infinities, because only the force that has been realized -- between times 0 and t -- contributes to the integral.

r ∗ q = (1/(m·a)) ∫_0^t sin(a(t − τ)) dτ
= (1/(m·a)) [(1/a) cos a(t − τ)]_{τ=0}^{τ=t}
= (1/(m·a²)) ((cos a(t−t)) − (cos a(t−0)))

y(t) = (1/(m·a²)) (1 − cos a·t) = (1/k) (1 − cos a·t), for 0 < t < 1

So, it's a nicely rising function, meaning the spring is a little resistant to move at first, but then accelerates, and is finally slowed down by the force of the spring. This brings us to the next part.

For the reaction after the force ceases (t > 1), we integrate from 0 to 1, and don't need the whole −∞ to ∞ range, because all the impulse is delivered between 0 and 1. The integral evaluates just as nicely:

r ∗ q = (1/(m·a)) ∫_0^1 sin(a(t − τ)) dτ
= (1/(m·a)) [(1/a) cos a(t − τ)]_{τ=0}^{τ=1}
= (1/(m·a²)) ((cos a(t−1)) − (cos a(t−0)))

y(t) = (1/(m·a²)) (cos a(t−1) − cos a·t) = (1/k) (cos a(t−1) − cos a·t), for t > 1
(Notice the use of the chain rule in finding the antiderivative, namely (cos f(x))x = −f'(x) sin f(x). Be careful with the pluses and minuses, because they alternate for several reasons, like the −τ and the derivative of cosine being −sin.)

In the result, observe that there are two similar functions, which differ only because one has a phase shift of one second. This coincides with the duration of the force. The curve for this function is a nice sinusoidal curve, just what can be expected from an undamped spring.
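The piecewise answer can be double-checked by integrating the equation of motion m·y'' + k·y = r(t) numerically and comparing with the convolution result; the constant in front works out to 1/(m·a²), which is 1/k. The mass and spring constant below are hypothetical sample values, and the integrator is a plain classical Runge-Kutta sketch.

```python
import math

m, k = 1.5, 6.0                       # hypothetical mass and spring constant
a = math.sqrt(k / m)

def r(t):                             # one newton for one second
    return 1.0 if 0.0 <= t < 1.0 else 0.0

def exact(t):
    """Piecewise result of the convolution r * q."""
    if t < 1.0:
        return (1.0 - math.cos(a * t)) / k
    return (math.cos(a * (t - 1.0)) - math.cos(a * t)) / k

def deriv(t, y, v):                   # m*y'' + k*y = r(t)  =>  y'' = (r - k*y)/m
    return v, (r(t) - k * y) / m

# classical fourth-order Runge-Kutta, starting from rest
y, v, t, dt = 0.0, 0.0, 0.0, 1e-4
while t < 2.5:
    k1y, k1v = deriv(t, y, v)
    k2y, k2v = deriv(t + dt/2, y + dt/2*k1y, v + dt/2*k1v)
    k3y, k3v = deriv(t + dt/2, y + dt/2*k2y, v + dt/2*k2v)
    k4y, k4v = deriv(t + dt, y + dt*k3y, v + dt*k3v)
    y += dt/6 * (k1y + 2*k2y + 2*k3y + k4y)
    v += dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
    t += dt

assert abs(y - exact(t)) < 1e-3       # numerics agree with the convolution
```

The two branches of exact() also agree at t = 1, as they must for a continuous motion.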

I see this as the simplest and most elegant example there is: it's generic, and saves a lot of trial-and-error solving. The same method can be used to solve more complicated problems with little added complexity.