Notation: We will choose all vectors to be column vectors, and symbols denoting vectors or matrices will be written in bold.

The Jacobian is defined for a vector function of multiple variables. Consider a function F(X), where X is an n-element vector and F(X) is an m-element vector. We can write it as

$$
\mathbf{F}\left( \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \right)
= \begin{pmatrix} f_1(x_1, \ldots, x_n) \\ f_2(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{pmatrix}
$$
Here we have explicitly written the m-dimensional vector function F as a vector of m real-valued functions, one to provide each component of the vector. We have also split the input vector X into its n components.

Formally, the Jacobian J of F is defined as the m by n matrix whose entry in row i and column j is

$$
J_{ij} = \frac{\partial f_i}{\partial x_j}
$$
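
To make the definition concrete, here is a minimal sketch that fills in the matrix entry by entry using central finite differences. The helper name `numerical_jacobian`, the step size `h`, and the polar-to-Cartesian example function are my own choices for illustration; none of them come from the text above.

```python
import math

def numerical_jacobian(F, X, h=1e-6):
    """Approximate the m-by-n Jacobian of F at X by central differences.

    F takes a list of n floats and returns a list of m floats.
    J[i][j] approximates the partial derivative of f_i with respect to x_j.
    """
    n = len(X)
    m = len(F(X))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input, holding all the others fixed.
        X_plus = list(X)
        X_minus = list(X)
        X_plus[j] += h
        X_minus[j] -= h
        F_plus = F(X_plus)
        F_minus = F(X_minus)
        for i in range(m):
            J[i][j] = (F_plus[i] - F_minus[i]) / (2 * h)
    return J

def polar_to_cartesian(X):
    # Hypothetical example: (r, theta) -> (r cos(theta), r sin(theta)), so n = m = 2.
    r, theta = X
    return [r * math.cos(theta), r * math.sin(theta)]

J = numerical_jacobian(polar_to_cartesian, [2.0, math.pi / 6])
for row in J:
    print([round(v, 4) for v in row])
```

For this example the exact Jacobian has rows (cos θ, -r sin θ) and (sin θ, r cos θ), so the printed values should agree with those to several decimal places.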

The Jacobian can be thought of as a vector and multi-variable extension of the derivative of a real-valued function of a single variable. For example, given an arbitrary real-valued function f(x), we might know the value of the function at one point x = a and want to determine the value of f at other points very close to a (a is constant):

$$
f(a + \Delta x) = f(a) + \Delta f
$$
Single-variable calculus tells us that
$$
\frac{\Delta f}{\Delta x} \approx \frac{df}{dx}
$$
for a small change Δx, which means that
$$
\Delta f \approx \Delta x \frac{df}{dx}
$$
That means that
$$
f(a + \Delta x) \approx f(a) + \Delta x \left. \frac{df}{dx} \right|_{x = a}
$$
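
A small numerical check of this approximation, as a sketch only: the choice of f(x) = sqrt(x), the point a = 4, and the step Δx = 0.1 are all assumptions of mine, picked because the derivative 1/(2 sqrt(x)) is easy to write down.

```python
import math

def linear_estimate(f_a, dfdx_a, dx):
    """Estimate f(a + dx) from f(a) and the derivative evaluated at x = a."""
    return f_a + dx * dfdx_a

a = 4.0
dx = 0.1
f_a = math.sqrt(a)                    # f(a) = 2
dfdx_a = 1.0 / (2.0 * math.sqrt(a))   # df/dx at x = a is 1 / (2 sqrt(a)) = 0.25

estimate = linear_estimate(f_a, dfdx_a, dx)
exact = math.sqrt(a + dx)
print(estimate, exact, abs(estimate - exact))
```

The estimate and the exact value should differ by only a small amount, and the gap shrinks further as Δx is made smaller.
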
Now imagine a different function f(x, y); this is a function of two variables. We are interested in the value of f near (x, y) = (a, b). If y is held constant while x varies, then f is really just a function of a single variable x, so that as before
$$
f(a + \Delta x, b) \approx f(a, b) + \Delta x \left. \frac{\partial f}{\partial x} \right|_{(x, y) = (a, b)}
$$
and
$$
\left. \Delta f \right|_{\Delta y = 0} \approx \Delta x \left. \frac{\partial f}{\partial x} \right|_{(x, y) = (a, b)}
$$
exactly as before. (I wrote ∂f instead of df to indicate that all derivatives are now partial derivatives, which are what you need when you deal with a function of multiple variables. A partial derivative is obtained by differentiating a function of multiple variables with respect to a single variable, while treating all the other variables as constants.)

Similarly, if x is held constant while y varies then

$$
\left. \Delta f \right|_{\Delta x = 0} \approx \Delta y \left. \frac{\partial f}{\partial y} \right|_{(x, y) = (a, b)}
$$
So if we change x, holding y constant, and then we change y, holding x constant (or if we do it the other way around), then as long as the changes were small enough that the partial derivatives stayed roughly constant, we can write the total change in f as
$$
\Delta f \approx \left. \left( \Delta x \frac{\partial f}{\partial x} + \Delta y \frac{\partial f}{\partial y} \right) \right|_{(x, y) = (a, b)}
$$
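
As a quick numerical check of this formula, here is a worked example. The function f(x, y) = x²y, the point (a, b) = (1, 2), and the steps Δx = 0.01, Δy = -0.02 are arbitrary choices of mine, picked only for illustration.

$$
\begin{aligned}
\frac{\partial f}{\partial x} &= 2xy = 4, \qquad \frac{\partial f}{\partial y} = x^2 = 1 \quad \text{at } (x, y) = (1, 2), \\
\Delta f &\approx \Delta x \frac{\partial f}{\partial x} + \Delta y \frac{\partial f}{\partial y} = (0.01)(4) + (-0.02)(1) = 0.02, \\
f(1.01, 1.98) - f(1, 2) &= (1.0201)(1.98) - 2 = 0.019798.
\end{aligned}
$$

The estimate 0.02 differs from the exact change by about 0.0002, which is much smaller than the change itself.
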
This can be extended to a function of arbitrarily many variables. But what does any of this have to do with the Jacobian? Consider our function f. From the definition above, the Jacobian will be a 1x2 matrix, which is almost degenerate but will do for the sake of example. Then
$$
\mathbf{J} = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix}
$$
So that we can use vector notation for everything, let
$$
\Delta \mathbf{X} = \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}
$$
which is 2x1, and let ΔF = [ Δf ], which is 1x1 (like I said, almost degenerate). Then
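$$
\mathbf{J} \, \Delta \mathbf{X}
= \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix}
\begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}
= \Delta x \frac{\partial f}{\partial x} + \Delta y \frac{\partial f}{\partial y}
\approx \Delta \mathbf{F}
$$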
So that means that
$$
\Delta \mathbf{F} \approx \mathbf{J} \, \Delta \mathbf{X}
$$
which is the equivalent of the single-variable relation
$$
\Delta f \approx \frac{df}{dx} \Delta x
$$
with the Jacobian standing in for the single-variable derivative. The simplest possible multi-variable case was shown here, but this generalizes to a vector of functions f1, f2, ..., fm of a vector of variables x1, x2, ..., xn.
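
To see the general case in action, the following sketch compares J ΔX with the actual change ΔF for a function with n = 2 inputs and m = 3 outputs. The particular function `F`, its hand-derived `jacobian`, the helper `mat_vec`, and the chosen point and step are all hypothetical examples of mine, not something taken from the text.

```python
import math

def F(X):
    # Hypothetical example: n = 2 inputs, m = 3 outputs.
    x1, x2 = X
    return [x1 ** 2 * x2, 5 * x1 + math.sin(x2), x2 ** 2]

def jacobian(X):
    # Analytic 3-by-2 Jacobian of F: row i holds the partial derivatives of f_i.
    x1, x2 = X
    return [
        [2 * x1 * x2, x1 ** 2],
        [5.0, math.cos(x2)],
        [0.0, 2 * x2],
    ]

def mat_vec(J, v):
    # Multiply an m-by-n matrix (list of rows) by an n-element column vector.
    return [sum(J_ij * v_j for J_ij, v_j in zip(row, v)) for row in J]

X = [1.0, 2.0]
dX = [0.01, -0.02]

predicted = mat_vec(jacobian(X), dX)                              # J dX
X_new = [x + d for x, d in zip(X, dX)]
actual = [f_new - f_old for f_new, f_old in zip(F(X_new), F(X))]  # dF

for p, a in zip(predicted, actual):
    print(p, a, abs(p - a))
```

Each component of J ΔX should land close to the corresponding component of ΔF, with the mismatch shrinking as the step ΔX is made smaller.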

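Many single-variable results are easily generalized to the multi-variable case simply by replacing the derivative with the Jacobian. For example, Newton's method (also known as the Newton-Raphson method) for solving a non-linear equation generalizes to systems of non-linear equations, with division by the derivative replaced by solving a linear system involving the Jacobian. The chain rule is pretty much exactly what you'd guess: the Jacobian of a composition is the matrix product of the Jacobians.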
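
As an illustration of the first of these, here is a minimal sketch of a Newton iteration for a system of two equations in two unknowns. In one variable the update is x ← x - f(x)/f'(x); here the division becomes solving the linear system J s = -F for the step s. The example system (x² + y² = 4, xy = 1), the starting guess, the tolerances, and the helper names `solve_2x2` and `newton_system` are all assumptions of mine for the sake of the example.

```python
def F(X):
    # Hypothetical system: x^2 + y^2 - 4 = 0 and x*y - 1 = 0.
    x, y = X
    return [x ** 2 + y ** 2 - 4.0, x * y - 1.0]

def jacobian(X):
    # Analytic 2-by-2 Jacobian of F.
    x, y = X
    return [[2 * x, 2 * y],
            [y, x]]

def solve_2x2(A, b):
    """Solve A s = b for a 2-by-2 matrix A using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ZeroDivisionError("Jacobian is singular at this point")
    s0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    s1 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [s0, s1]

def newton_system(F, jacobian, X0, tol=1e-12, max_iter=50):
    X = list(X0)
    for _ in range(max_iter):
        FX = F(X)
        if max(abs(v) for v in FX) < tol:
            break
        # Solve J(X) step = -F(X), the analogue of step = -f(x) / f'(x).
        step = solve_2x2(jacobian(X), [-v for v in FX])
        X = [x + s for x, s in zip(X, step)]
    return X

root = newton_system(F, jacobian, [2.0, 0.5])
print(root, F(root))
```

Cramer's rule is used only to keep the sketch dependency-free; for larger systems one would hand the Jacobian to a general linear solver instead. Like the single-variable method, this can fail to converge from a poor starting guess or where the Jacobian is singular.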

None of this is rigorous.