The idea behind shape from shading is to recover the shape of an object from pictures of it taken under different lighting conditions. Some implementations can get a rough idea of the object from a single picture, but good results generally need more.

A typical application is the analysis of satellite snapshots (for example to obtain a height map of an area, or to acquire the geometry of buildings). Pictures of the same spot are taken at different times of day, and information is extracted by analyzing the shadows cast on the ground. Shape from shading also makes it possible to build cheap 3D scanners. With a digital camera, white paint and a spotlight, it is possible to acquire the shape of objects and then use them in your favorite ray tracing program. I successfully implemented this technique and I explain it below.

A related scheme is shape acquisition from a set of stereoscopic views. Seeing the same object from two slightly different angles, just like your two eyes do, makes the perception of volume possible. This is called stereoscopic vision. Shape from shading, by contrast, is easier if only the lighting conditions vary and the viewer stays in the same spot for all the pictures.

If I were a light beam

The first step in shape from shading is to understand how light is diffused by the objects that surround us, and many models have been developed to do this. They all come down to a function called the bidirectional reflectance distribution function (BRDF) and to an application of the first law of thermodynamics: the power per unit area emitted by a small surface is equal to the sum of all the incident powers per unit area.

Inter-reflections make this calculation very complex. Suppose two objects face each other: they send light back and forth to each other and, as a result, each object is indirectly lit by itself! To simplify this, computer graphics commonly uses an illumination model that divides light into three parts:

  • Ambient: this is a constant component that models inter-reflections. Even if not directly lit by a source, objects receive light. Ambient lighting assumes that the amount of indirect lighting is constant throughout the scene (or part of the scene). This contrasts with radiosity, which computes the inter-reflections explicitly.
  • Diffuse: this is the component of light emitted by the object in every direction. The same power is emitted per solid angle. It represents the "rough" aspect of the surface.
  • Specular: this is the component of emitted light that creates a highlight on the surface. The power emitted per solid angle peaks around the direction of the mirror reflection of the ray. It represents the "shiny" aspect of the surface.
In the beautiful diagram below, you can see a sphere lit from the top right, and you can spot each of the three regions: ambient (X), diffuse (.), and specular (O).
            ....       + light                 
          X...OO..       source              
The Phong illumination model is commonly used to model the specular term. Working with shiny objects is complex because the power emitted isn't uniformly distributed. The easiest way to do shape from shading is to use non-shiny surfaces, or to flag specular regions as "corrupted data".
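
To make the three components concrete, here is a minimal Python sketch of the intensity seen at one point of a surface, with the Phong formula used for the specular term. The function names and the coefficients ka, kd, ks and shininess are made-up values for illustration; they are not taken from any particular renderer.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    """Scalar product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def shade(n, s, v, ka=0.1, kd=0.7, ks=0.2, shininess=20):
    """Intensity at a surface point.

    n: normal to the surface, s: direction toward the light,
    v: direction toward the viewer (all normalized here).
    ka, kd, ks and shininess are hypothetical illustration values.
    """
    n, s, v = normalize(n), normalize(s), normalize(v)
    n_dot_s = dot(n, s)
    ambient = ka  # constant, even without direct light
    if n_dot_s <= 0:
        return ambient  # the point faces away from the light
    diffuse = kd * n_dot_s  # does not depend on the viewer
    # Mirror reflection of the light direction about the normal.
    r = tuple(2 * n_dot_s * nc - sc for nc, sc in zip(n, s))
    specular = ks * max(0.0, dot(r, v)) ** shininess  # Phong highlight
    return ambient + diffuse + specular
```

With ks set to 0, the result no longer depends on v. This is the property that makes non-shiny surfaces so convenient for shape from shading.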

The diffuse term can be calculated with Lambert's law, also called Lambert's cosine law. It has the great advantage that the power emitted per solid angle (called radiance) is the same in every direction and depends only on the amount of light received (the irradiance). This means that the surface appears equally bright whatever angle it forms with a detector (camera or eye); luminosity is indeed the power captured per solid angle by a receptor.

Let's get those normals

The irradiance (power per unit area received by a surface) can be obtained by measuring the radiance (power per unit area emitted per solid angle), and the irradiance depends on the angle between the normal to the surface and the incident ray of light. It is therefore possible to extract the normals from multiple observations.

Let's say the light source is located far enough from the object to consider that all rays are parallel and carry a constant power per unit area P (in W*m^-2). A rough example of such a source is the Sun. If the normal to the surface makes an angle θ with the light ray, the power per unit area received is P*cos(θ).

light ________________/
beam  _______________/
      ______________/ θ
Lambert's law states that the radiance is k*P*cos(θ), and thus the luminosity is I = λ*k*P*cos(θ), where k and λ are constants. Grouping all the constants into a single K, this relation can be written as a scalar product of vectors, where s is the unit vector that points to the light source and n is the unit normal vector to the surface:
I = K * s . n
Say we have three pictures with the light coming from three different directions. This yields three intensity measurements for each point, hence:
I1 = K * s1 . n
I2 = K * s2 . n
I3 = K * s3 . n
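The normal is obtained by solving this 3x3 linear system, which has a unique solution as long as the three light directions are not coplanar. The solution is the vector K*n: its length gives K and its direction gives n. This operation is repeated for each point of the object. With this set of normals and a small assumption about the surface, it is possible to determine the surface completely. In a nutshell, we're almost done!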
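
As an illustration, here is a minimal Python sketch of that step for a single point. The names det3 and solve_normal are hypothetical, and Cramer's rule is used only to keep the sketch short and free of dependencies; any linear solver would do.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_normal(lights, intensities):
    """Solve I_j = K * s_j . n for one point of the object.

    lights: three unit vectors s1, s2, s3 pointing toward the light.
    intensities: the three measured values I1, I2, I3.
    Returns (K, n) where n is a unit vector.
    """
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("the light directions must not be coplanar")
    # Cramer's rule: replace one column at a time with the intensities.
    g = []
    for col in range(3):
        m = [list(row) for row in lights]
        for row in range(3):
            m[row][col] = intensities[row]
        g.append(det3(m) / d)
    # g is the vector K * n, and n has length 1.
    k = math.sqrt(sum(c * c for c in g))
    if k == 0:
        raise ValueError("the point is black in all three pictures")
    return k, tuple(c / k for c in g)


# Intensities generated from K = 2 and n = (0.36, 0.48, 0.8).
lights = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8)]
k, n = solve_normal(lights, [1.6, 1.712, 1.856])
# k is close to 2 and n is close to (0.36, 0.48, 0.8)
```

On a real picture the same computation runs once per pixel, with the same three light directions for every pixel.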

Let's get this surface

If the surface is differentiable (that is, it is continuous, i.e. it has no holes in it, and it has a normal vector everywhere, which is the case for almost everything at our scale), let's express its equation as:

f(x,y) - z = 0
A normal vector to this surface at point (x, y, f(x,y)) is (df/dx(x,y), df/dy(x,y), -1). It should be parallel to the normal n = (nx, ny, nz) computed in the step above, which gives df/dx = -nx/nz and df/dy = -ny/nz. So f is known through its partial derivatives. The method I suggest to obtain f is finite differencing, which yields a huge linear system that can be solved by Gauss-Seidel iteration.
Finite differencing at a glance:
df/dx(x,y) ≈ (f(x+h,y) - f(x,y)) / h
Take h as small as possible (on a picture, the spacing between two pixels).
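
Here is a minimal Python sketch of one possible way to set up and solve that system. It is an assumption, not necessarily the exact formulation used for the results described in this article: every neighbour of a grid point predicts its height through the forward difference, and a Gauss-Seidel sweep replaces each height with the average of those predictions (the least-squares solution). The names gradient_from_normal and integrate_gradient, and the default number of sweeps, are hypothetical.

```python
def gradient_from_normal(n):
    """Return (df/dx, df/dy) from a normal n = (nx, ny, nz) with nz != 0.

    n is parallel to (df/dx, df/dy, -1), hence the two ratios below.
    """
    nx, ny, nz = n
    return -nx / nz, -ny / nz


def integrate_gradient(p, q, h=1.0, sweeps=2000):
    """Recover heights f from p ~ df/dx and q ~ df/dy sampled on a grid.

    p[y][x] approximates (f[y][x + 1] - f[y][x]) / h
    q[y][x] approximates (f[y + 1][x] - f[y][x]) / h
    Heights are only known up to a constant, so f[0][0] is set to 0.
    """
    rows, cols = len(p), len(p[0])
    f = [[0.0] * cols for _ in range(rows)]
    for _ in range(sweeps):
        for y in range(rows):
            for x in range(cols):
                total, count = 0.0, 0
                # Each existing neighbour predicts the height at (x, y).
                if x + 1 < cols:
                    total += f[y][x + 1] - h * p[y][x]
                    count += 1
                if x > 0:
                    total += f[y][x - 1] + h * p[y][x - 1]
                    count += 1
                if y + 1 < rows:
                    total += f[y + 1][x] - h * q[y][x]
                    count += 1
                if y > 0:
                    total += f[y - 1][x] + h * q[y - 1][x]
                    count += 1
                if count:
                    # Gauss-Seidel: updated in place, so the points that
                    # follow already use this new value.
                    f[y][x] = total / count
    offset = f[0][0]
    return [[value - offset for value in row] for row in f]
```

Plain Python loops like these are only practical for small grids. For a full picture the same update would be written with an array library or in a compiled language.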

Strengths and drawbacks

You'd be surprised to see how fast this method is. Normal calculation is an O(n) step, where n is the number of pixels. The linear system obtained by finite differencing is sparse and diagonally dominant, so each Gauss-Seidel sweep also costs O(n), although the number of sweeps needed grows with the size of the picture. All in all, you can do shape from shading in about O(n) operations per sweep.

But the conditions on the surface are strict. It must be perfectly diffuse (i.e. it should not be shiny or have specular highlights) because only diffusion is considered. It must have a uniform color because all intensities are computed on the same basis. And the whole surface must be directly hit by the light because shadows are not supported. So this method isn't well suited to outdoor objects (for example satellite snapshots), but I obtained very interesting results with white wax/Play-Doh models and my digital camera.