The error in an approximation obtained via the Trapezoidal Rule described above is given by:
        (a-b)^3
E(n) = --------- * f''(c)
        12*n^2

where f" is the second derivative of the function f being integrated from a to b, and c is some number such that a <= c <= b, and n is the number of trapezoids used for the approximation.
Why do we need numerical integration?
Ans. Most functions are not analytically integrable. For example e^(x^2), ((1-x^2)*(1-(kx)^2))^(-0.5), etc. In fact, all you need to do is write down a complicated-looking function and the odds are that you will not be able to find its antiderivative in terms of elementary functions.
Unfortunately Nature doesn't know about these elementary functions, and so most of the problems she sets us contain these 'non-analytically integrable' functions. So what do we do when we get one of these? Aha! Here's the place for numerical integration!

Now to the theory. There are a number of ways of attacking numerical integration problems. Probably the simplest is to approximate the function involved by a polynomial, and to integrate this polynomial over the interval instead. Of course there is no need to calculate the intermediate polynomial explicitly. Since the coefficients of this polynomial depend on the value of f at certain discrete points, a general quadrature formula may be worked out using these values. To introduce notation, these points will be referred to as x0,x1,x2... and the function values as f0,f1,f2... These points are generally taken to be equally spaced, though that is largely because it was simpler that way before there were computers. Quadrature formulae may also be worked out for points spaced in a more uneven and complicated fashion; Gaussian quadrature involves this kind of point spacing, and improper integrals may also be tackled with it. I'll consider the equally spaced method first, and then briefly (very briefly) touch on Gaussian quadrature.
Both Simpson's Rule and the Trapezoidal Rule covered above fall into the first class. Deriving these rules and the associated errors is rather simple, so I'll derive both here.

Trapezoidal Rule: Let's say you have 2 points a and b and you wish to evaluate integral(a to b) f(x) dx. The Trapezoidal approximation involves approximating f by a linear polynomial, so we only need the values of f at a and b. To introduce a bit of notation, let's always call the first point at which we evaluate the function x0 (in this case a) and the second point x1 (in this case b).
Let's use Newton's forward difference formula to get this polynomial (though you could just write down the equation of the straight line). The notation used is:
del denotes the forward difference operator: del f0 = f1-f0; del f1 = f2-f1; etc. del^2(f0) = del(f1)-del(f0) = f2-2*f1+f0, and so on for higher powers of del.
p is defined by x = x0 + p*h
h is the point spacing (in this case b-a)
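
To make the del notation concrete, here is a tiny Python sketch (the helper name forward_differences is my own, purely illustrative):

  def forward_differences(fs):
      # return [f0, del f0, del^2 f0, ...] for equally spaced samples fs
      out = [fs[0]]
      while len(fs) > 1:
          fs = [fs[i + 1] - fs[i] for i in range(len(fs) - 1)]
          out.append(fs[0])
      return out

  print(forward_differences([1, 4, 9, 16]))  # [1, 3, 2, 0] for f(x) = x^2 sampled at x = 1,2,3,4

Note how del^3 f0 vanishes for a quadratic, just as you'd expect.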

With this we have
l(x)=f0 + p*del f0
Also use the identity
integral(x0 to x1) l dx = h * integral(0 to 1) l dp
(since x = x0 + p*h means dx = h dp, and p runs from 0 to 1 as x runs from x0 to x1). With this and the fact that del f0 = f1-f0 it's trivial to show that the integral of l over the interval is
(f0+f1)*h/2
The error in the formula can be worked out in two ways. One way is to use the fact that the error in interpolation is
(x-x0)*(x-x1)*f''(n)/2
for some n in (x0,x1). (I'm running out of decent letters, but this n is separate from the n used for the number of points below!) Now let's write the factor (x-x0)*(x-x1) as s(x). Of course n also depends on x, but in an unknown fashion. Since we don't know this dependence we want to pull f'' out of the integral. This can be done using the mean value theorem for integrals (since s keeps the same sign on (x0,x1)), and it's easy to write
e = integral(x0 to x1) f''(n)/2 * s(x) dx = f''(n1)/2 * integral(x0 to x1) s(x) dx = -f''(n1)*h^3/12
where n1 is another unknown point between x0 and x1.
Now if we have an extended interval (a,b) we divide it up into lots of subintervals via the partition (x0,x1,x2,...,xn). Summing the formula above over the subintervals we end up with
integral(a to b) f(x)dx = h/2 * (f0 + 2*(f1+f2+...+f(n-1)) + fn)
The factor of two comes in because terms like f1 appear twice: once while integrating from x0 to x1, and again while integrating from x1 to x2. The magnitude of the error is
h^3/12 * (|f''(n1)| + |f''(n2)| + ... + |f''(nn)|) <= n*h^3/12 * M = M*(b-a)*h^2/12
where M is the upper bound of |f''(x)| on (a,b), and I have used the fact that n*h = b-a.
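
Here's the composite rule and its error bound in a short Python sketch (the test case, sin x on (0,pi), is my own choice; its integral is exactly 2 and |f''| <= 1 there):

  import math

  def composite_trapezoid(f, a, b, n):
      # h/2 * (f0 + 2*(f1+...+f(n-1)) + fn), exactly as derived above
      h = (b - a) / n
      inner = sum(f(a + i * h) for i in range(1, n))
      return h / 2 * (f(a) + 2 * inner + f(b))

  a, b, n = 0.0, math.pi, 50
  h = (b - a) / n
  M = 1.0  # upper bound of |f''| = |sin x| on (0, pi)
  actual = abs(composite_trapezoid(math.sin, a, b, n) - 2.0)
  bound = M * (b - a) * h * h / 12
  print(actual, bound)  # the actual error sits below the bound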

Simpson's Rule: I won't cover this in as much detail. The idea here is the same, except that you approximate f by a quadratic polynomial, which means you use three points: x0 as a, x2 as b, and x1 as the midpoint of a and b. So
l = f0+p*del(f0) + p(p-1)/2 * del^2 (f0).
Now integrate this (again l dx = h l dp, with p running from 0 to 2) to get
integral(x0 to x2) f dx = h/3 * (f0 + 4*f1 + f2)
That's the famous Simpson's Rule. Deriving the error is a little tricky. Here
s = (x-x0)*(x-x1)*(x-x2)
Now s doesn't keep the same sign throughout (x0,x2), so it's not possible to use the mean value theorem directly. However, let's define t(x) as the antiderivative of s with t(x0) = 0; it turns out that t keeps the same sign on (x0,x2) and that t(x0) = t(x2) = 0. The integral involved here is
e = integral(x0 to x2) f'''(n)/6 * t'(x) dx
Integrate this by parts to get
e = t(x2)*f'''(n(x2))/6 - t(x0)*f'''(n(x0))/6 - integral(x0 to x2) f(4)*t(x)/6 dx
where f(4) denotes the fourth derivative. The first two terms vanish because t(x0) = t(x2) = 0. Now the mean value theorem can be used on the remaining integral (as t maintains the same sign throughout) and we get the error to be
-h^5 * f(4)(n) / 90
To sum this over an extended interval, we must now use an odd number of points (x0,x1,...,x2n). We end up with
h/3 * (f0 + 4*(f1+f3+f5+...+f(2n-1)) + 2*(f2+f4+f6+...+f(2n-2)) + f2n)
The error can be bounded in the same fashion:
-(b-a)*h^4*M/180
Here M is the upper bound of |f(4)| on (a,b). We get an extra factor of one half because now b-a = 2*(number of pairs of subintervals)*h.
Look at that error formula for something interesting though. We used a quadratic polynomial, but the error term involves the fourth derivative of f!! This means that Simpson's Rule is exact for cubic polynomials as well. If we had used a cubic polynomial initially we would again have got a similar error term, but a more complicated formula. Incidentally, the formula in this case is called Simpson's 3/8 Rule.
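
A short Python sketch of the composite rule, checking both the fourth-order error and the exactness for cubics (the test functions are my own choices):

  import math

  def composite_simpson(f, a, b, n):
      # h/3 * (f0 + 4*(odd-index f's) + 2*(even-index f's) + fn); n must be even
      assert n % 2 == 0, "Simpson's Rule needs an even number of subintervals"
      h = (b - a) / n
      odd = sum(f(a + i * h) for i in range(1, n, 2))
      even = sum(f(a + i * h) for i in range(2, n, 2))
      return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

  print(composite_simpson(lambda x: x**3, 0.0, 1.0, 2))            # 0.25, exact for a cubic
  print(abs(composite_simpson(math.sin, 0.0, math.pi, 10) - 2.0))  # ~1e-4, fourth order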

Newton-Cotes Rules: Lots of these kinds of formulae can be derived, and in general the entire class is referred to as the Newton-Cotes quadrature rules.

Gaussian Quadrature for Proper Integrals: This is more complicated, and more efficient. The idea here is that we don't need to keep the points equally spaced any longer, and this extra degree of freedom can be used to gain additional precision.
The starting point here is a set of polynomials which are orthogonal over some interval with respect to some weighting function. For example, the Legendre polynomials are orthogonal over (-1,1) with weighting function 1. Each such set of polynomials produces its own type of Gaussian quadrature; using the Legendre polynomials, for example, we get Gauss-Legendre quadrature. Note that now the interval of integration is no longer arbitrary: it must be the interval over which these polynomials are orthogonal. If this is not the original interval of integration, a linear transformation must be used to change it.
The Gauss method of order k evaluates the function at the zeroes of the kth order polynomial in this set. Each value has some weight (just as in Simpson's Rule f1 had a weight of 4/6, and f0 and f2 of 1/6 each). The simplest way to calculate these weights is just to ensure that the quadrature formula is exact for all polynomials of degree up to k-1; that's exactly k conditions for the k weights. Now there is an interesting fact: with the points chosen at these zeroes, the formula is automatically exact for the next k degrees of polynomials as well, i.e. up to degree 2k-1! This is what makes Gaussian quadrature attractive.
A simple example: consider Gauss-Legendre quadrature of order 2. The zeroes of the 2nd Legendre polynomial (1.5*x^2 - 0.5) are 1/sqrt(3) and -1/sqrt(3). The weights can easily be worked out to be 1 for both these points. This gives us:
integral(-1 to 1) f(x)dx = f(1/sqrt(3)) + f(-1/sqrt(3))
which is exact for polynomials up to degree 3.
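
Here's that weight calculation spelled out in a Python sketch (I solve the two moment conditions by Cramer's rule and then try an arbitrary cubic of my own):

  import math

  x0, x1 = -1 / math.sqrt(3), 1 / math.sqrt(3)  # zeroes of 1.5*x^2 - 0.5
  # exactness for f=1: w0 + w1       = integral(-1 to 1) 1 dx = 2
  # exactness for f=x: w0*x0 + w1*x1 = integral(-1 to 1) x dx = 0
  det = x1 - x0
  w0 = 2 * x1 / det
  w1 = -2 * x0 / det
  print(w0, w1)  # 1.0 1.0

  f = lambda x: x**3 + x**2  # arbitrary cubic; its exact integral over (-1,1) is 2/3
  print(w0 * f(x0) + w1 * f(x1))  # 0.666..., exact up to degree 2k-1 = 3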

Gaussian Quadrature for Improper Integrals: The idea here is very similar. We now use a weighting function w(x) (just like the weighting function above) which accounts for the singularity. For example, if our integrand has singularities at -1 and 1 we could use the weighting function 1/sqrt(1-x^2); the singularity is absorbed into w(x). The range of integration could also be infinite (-inf to +inf), in which case the weighting function e^(-x^2) is convenient: the Hermite polynomials are orthogonal over -inf to +inf with this choice of w(x). Now we proceed to derive a quadrature formula in exactly the same way ...
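
For instance, NumPy can generate the Gauss-Hermite nodes and weights directly; a minimal sketch (the test f(x) = x^2 is my own choice, with integral(-inf to +inf) e^(-x^2)*x^2 dx = sqrt(pi)/2):

  import math
  import numpy as np

  nodes, weights = np.polynomial.hermite.hermgauss(5)  # 5-point Gauss-Hermite rule
  f = lambda x: x**2
  print(np.sum(weights * f(nodes)), math.sqrt(math.pi) / 2)
  # both ~0.8862; the weighting function e^(-x^2) is built into the rule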

Okay, end of lecture!

In the strictest sense of the term, "numerical integration" means "quadrature". Quadrature is named after one of our culture's oldest problems, that of squaring the circle. This means, quite literally, producing a square that has the same area as a given circle. Time proved this to be intractable with the Ancient Greek tools of compass-and-unmarked-straightedge geometry, but thanks to calculus we can compute the area under any curve given to us in some useful manner. Sometimes quadrature can be done exactly, but in the general case numerical methods are needed; hence schemes such as the rectangle and trapezoid rules, Monte Carlo and quasi-Monte Carlo methods, and so on.

However, most often "numerical integration" actually means "numerical schemes for differential equations". This is because differential equations are still one of the core tools (arguably the single most important) of mathematical modeling, and because, by the Fundamental Theorem of Calculus, the following equations are equivalent:

(1a) d/dt x(t) = f(x(t),t)
(1b) x(t) = x(0) + Integrate f(x(s),s) ds from 0 to t.

That is: solving a differential equation is the same as calculating an integral. However, the most popular methods for quadrature (very importantly, Monte Carlo) are not commonly used, while there are specific methods for equations (1a,1b) that use differential equation structure that doesn't generalize to arbitrary integrals.

It's true that many of the more elementary numerical schemes for differential equations (particularly in the ODE family; PDEs, SDEs, and RDEs are worlds of pain in their own right) are derived from elementary quadrature methods. The rectangle method for quadrature, for example, becomes Explicit (or Forward) Euler. However, schemes for differential equations are usually derived by methods that take advantage of the ordered nature of time, such as Taylor series and, in the real world, Runge-Kutta schemes.
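
To see the correspondence, here is Explicit Euler as the left-endpoint rectangle rule applied to (1b), one step at a time (a minimal sketch; the test equation x' = -x is my own choice):

  import math

  def explicit_euler(f, x0, t0, t1, n):
      # approximate x(t1) for x'(t) = f(x, t), x(t0) = x0, in n steps
      h = (t1 - t0) / n
      x, t = x0, t0
      for _ in range(n):
          x += h * f(x, t)  # left-endpoint rectangle over [t, t+h]
          t += h
      return x

  print(explicit_euler(lambda x, t: -x, 1.0, 0.0, 1.0, 1000))  # ~0.3679 = e^-1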

Moreover, conceiving of numerical schemes for differential equations as quadratures obscures the nature of the dynamical systems that we hope to model with differential equations. Chiefly: although a differential equation (1a) is given as the problem of discovering a map t → x(t), what we really hope to know is the "law of motion" of the system, that is, the family of maps φ_t : x(0) → x(t) that take states of the system to states of the system. These are technically known as flow maps, and the point of conceiving a numerical scheme is to discover a discrete-time map ψ_n : z(0) → z(nh) such that z(t) behaves, more or less, like x(t). This is what is meant by numerical integration basically anywhere in modern Applied Math research (and practice!).
