The term Augmented Reality is used to refer to systems which combine input from the real world with Virtual Reality elements. Sometimes VR (and by extension, I suppose, AR) is used loosely to refer to any computer-generated world, including the many 3D computer games now on the market - I have even seen it used to include text-based worlds like this. However, I am talking here specifically about real time stereoscopic systems. Usually AR involves either superimposing computer graphics imagery onto a direct view of the real world (for instance, using half-silvered mirrors), or else mixing the computer graphics action with real-world video feeds, from cameras strapped to the participant's head or at remote locations.


Augmented Reality raises most of the same issues as the mixing of computer graphics with live action in special effects, relating to the difficulty of convincingly integrating live action and computer graphics. It also raises a number of problems of its own. Usually there is some practical end for the technology, so that things like the accurate perception of depth are likely to take priority over aesthetic considerations. The problem of registration (aligning real and virtual objects) also becomes far more acute; if this is not achieved in a reasonable approximation of real time then objects will appear to swim around disconcertingly in front of the user. Serious failures of the motion-tracking system are even more disturbing for the user. This has led researchers at the University of North Carolina to employ a hybrid tracking system combining magnetic tracking with visual cues.

The inevitable latency in the tracking and rendering system, which can cause mis-alignment of objects when the camera moves, has been attacked from several angles. In some instances it can be worthwhile to trade computational accuracy for speed, but of course this carries its own set of problems. Another approach is to make some effort to predict head movements. This can help, but people are just too unpredictable for this approach to eliminate the problem entirely. Transforming the rendered image is another useful trick; a simple translation or 2D rotation takes a tiny fraction of the computation it takes to render a whole frame and can often be used to fake the effects of perspective changes, so registration can be improved by applying these at the end of the render using the most recent possible information about head position and orientation. With systems which mix live video with computer images, it is possible to delay the video input so that it matches the latency of the graphics; this makes registration much better, but slows down the user's ability to respond to real stimuli.
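Two of the approaches above, predicting head movement and shifting the finished image, can be sketched in a few lines. This is only an illustration under simple assumptions: head yaw is extrapolated at constant angular velocity, and a small yaw change is approximated by a horizontal shift of the rendered image for a pinhole camera. The function names, the focal length in pixels and the sign convention are all hypothetical, not taken from any particular AR system.

```python
import math

def predict_yaw(yaw_prev, yaw_now, sample_dt, latency):
    """Extrapolate head yaw (radians) 'latency' seconds ahead,
    assuming the angular velocity between the last two samples persists."""
    rate = (yaw_now - yaw_prev) / sample_dt
    return yaw_now + rate * latency

def late_shift_pixels(yaw_rendered, yaw_latest, focal_px):
    """Horizontal offset (pixels, positive = move image content right)
    that approximates the yaw change since the frame was rendered.
    Assumed convention: positive yaw means the head turned to the right,
    so the scene should slide to the left (negative offset)."""
    return -focal_px * math.tan(yaw_latest - yaw_rendered)

# Render using a predicted yaw, then correct with the freshest reading.
yaw_for_render = predict_yaw(0.00, 0.01, sample_dt=0.01, latency=0.05)
offset = late_shift_pixels(yaw_for_render, yaw_latest=0.055, focal_px=800.0)
```

The shift is much cheaper than a fresh render, but it is only a good approximation for small rotations; translation of the head and large turns change perspective in ways a 2D shift cannot reproduce.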

Other Potential Perceptual Pitfalls

Human depth perception makes use of a wide range of depth cues, any or all of which can cause problems for the Augmented Reality user. The most important of all depth cues is binocular parallax: The difference between the views from our two eyes, which is the main source of our sense of distance. Matching the computer graphics to the real view convincingly requires that the computer graphics take into account the distance between the eyes of the user very exactly: An error of just one millimeter can result in distance mismatches of 20% or more. If the display works by combining video and computer graphics, the distances between the pairs of real and virtual cameras can easily be matched up. This ensures that depth judgements are consistent between the two sources, which should make interactions possible, but if the cameras don't match the user's eyes closely, users may be unsettled and find certain tasks difficult. If the display combines computer images with a direct view of the real world, it is absolutely crucial that the virtual cameras should match the user's interocular distance, if users are expected to interpret what they are seeing as a consistent view.
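To see why a small interocular error matters, here is a rough sketch of the geometry. It assumes a simplified model in which both eyes look at a single flat image plane at a fixed distance, and depth is judged purely from the horizontal separation of the left and right images of a point. The function names and the example numbers are illustrative assumptions, not measurements from a real display.

```python
def screen_separation(depth, ipd, screen_dist):
    """Separation on the image plane between the left-eye and right-eye
    images of a point at 'depth', for eyes 'ipd' apart (same units)."""
    return ipd * (1.0 - screen_dist / depth)

def perceived_depth(depth, render_ipd, user_ipd, screen_dist):
    """Depth at which the user's lines of sight cross when the image was
    rendered for 'render_ipd' but viewed with eyes 'user_ipd' apart."""
    s = screen_separation(depth, render_ipd, screen_dist)
    if s >= user_ipd:
        return float("inf")  # lines of sight are parallel or diverge
    return screen_dist * user_ipd / (user_ipd - s)

# Hypothetical case: image plane at 1 m, virtual object meant to be at 10 m,
# rendered for a 65 mm interocular distance, viewed by a user with 64 mm.
intended = 10.0
seen = perceived_depth(intended, render_ipd=0.065, user_ipd=0.064, screen_dist=1.0)
relative_error = (seen - intended) / intended
```

In this model the error grows quickly as the intended depth moves away from the image plane, so the size of the mismatch depends strongly on the display geometry.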

Another important depth cue is accommodation: The way that the eye focuses to view objects at different distances. This is impossible to reproduce electronically with today's technology, and this remains a barrier to both truly immersive VR and AR in general. Things tend to just look a little bit wrong when our binocular vision is out of sync with our accommodation. What is more, our visual system is likely to misinterpret certain kinds of difference between the computer graphics and the real world as being the result of accommodation, leading to inappropriate depth judgements: If one of the sources is at a lower resolution, our brains are liable to interpret that as focus blur.

Another problem associated with display resolution is that it is often too low to allow really detailed depth perception; a difference of just one pixel can translate to a significant difference in depth, and even if anti-aliasing is used (effectively delivering a resolution of as little as one fifth of a pixel) it is hard to get anything but very coarse depth resolution.
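The coarseness of depth resolution can be estimated with the standard parallel-camera stereo relation, depth = focal length x baseline / disparity. The sketch below assumes that model, with the focal length expressed in pixels; the function names and the example figures are illustrative only.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (metres) of a point whose left/right images differ by
    'disparity_px' pixels, for parallel cameras 'baseline_m' apart."""
    return focal_px * baseline_m / disparity_px

def depth_step(depth_m, focal_px, baseline_m, pixel_fraction=1.0):
    """How much farther away a point appears if its disparity shrinks by
    'pixel_fraction' of a pixel (1.0 = one whole pixel)."""
    disparity = focal_px * baseline_m / depth_m
    if disparity <= pixel_fraction:
        return float("inf")  # next step is indistinguishable from infinity
    return depth_from_disparity(disparity - pixel_fraction,
                                focal_px, baseline_m) - depth_m

# Hypothetical display: 500-pixel focal length, 64 mm baseline, object at 2 m.
whole_pixel = depth_step(2.0, 500.0, 0.064)
fifth_pixel = depth_step(2.0, 500.0, 0.064, pixel_fraction=0.2)
```

Because disparity falls off as the inverse of depth, the size of each step grows roughly with the square of the distance, so distant objects are resolved far more coarsely than near ones.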

As with special effects, luminance is also a problem in AR. Aside from the aesthetic desirability of matching up the light levels of the two sources, there is also a risk of depth misperception due to our tendency to view brighter objects as being closer to us. The limitations of head-mounted displays make this a much harder problem for AR than it is for special effects; it is very difficult for a portable display to come close to matching the range of brightness levels we encounter in day-to-day life.

One really serious outstanding problem in AR is occlusion. Although less central to our depth perception than binocular convergence, it provides a stronger depth cue in the sense that people will perceive something as being nearer than something else if it blocks it out of view, whatever their binocular vision is telling them. Glaring disparities between occlusion and other depth cues often result in double vision and other problems. Computer graphics objects in AR will only be occluded if the object which should be hiding them is known to the computer in control. Existing AR systems have had some success in using computer models of the real environment to control occlusion, but these necessarily have trouble keeping up with any changes in that environment and limit interactions to the area known to the computer. In future it might be possible to use a range-finding device to obtain a detailed map of the surrounding world to deal with this, but this is out of the reach of today's technology.
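Where a depth map of the real scene is available, whether from a computer model or some future range-finder, occlusion comes down to a per-pixel comparison. The following is a minimal sketch of that idea, assuming both images and both depth maps are already registered to the same pixel grid; the data layout (nested lists, with None marking pixels where no virtual object is drawn) is a hypothetical choice made for illustration.

```python
def composite(real_rgb, real_depth, virt_rgb, virt_depth):
    """Combine real and virtual images, showing a virtual pixel only
    where it lies nearer to the viewer than the real surface."""
    out = []
    for real_row, rd_row, virt_row, vd_row in zip(
            real_rgb, real_depth, virt_rgb, virt_depth):
        row = []
        for real_px, real_z, virt_px, virt_z in zip(
                real_row, rd_row, virt_row, vd_row):
            if virt_z is not None and virt_z < real_z:
                row.append(virt_px)   # virtual object is in front
            else:
                row.append(real_px)   # real surface hides it, or nothing drawn
        out.append(row)
    return out

# One-row example: the virtual object is at 2 m; the real scene has
# a wall at 3 m, a hand at 1 m, and a wall at 3 m again.
real = [["wall", "hand", "wall"]]
real_z = [[3.0, 1.0, 3.0]]
virt = [["cube", "cube", None]]
virt_z = [[2.0, 2.0, None]]
result = composite(real, real_z, virt, virt_z)
```

The comparison itself is trivial; the difficulty described above lies in obtaining a real-world depth map that is accurate and stays up to date as the environment changes.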

Medical Uses

A number of medical uses for Augmented Reality are currently being tested, and some have already been used in real surgery. MRI and CT scans are often used to help plan surgery, sometimes with the images available to the surgeons alongside the patient. Augmented Reality allows the surgeon to view the scans superimposed onto the patient, so that they can see directly which parts of the scan correspond to which parts of the patient. On a similar theme, researchers at UNC have experimented with ultrasound scans to enable doctors to see images of the fetus superimposed on a woman's abdomen. The result is not unlike the X-ray vision which has for so long been the preserve of science fiction and comic books, and holds great potential for helping doctors to avoid slicing through the wrong parts of someone's body. According to the Electronic Telegraph, surgeons who have used the system for brain surgery say it cuts operating times from eight hours to five.

Military and Related Uses

The military has been making use of computer-enhanced displays in fighter planes for some time now. However, it is only in the last few years that ground-based augmented reality systems have started to be tested. These relay strategic information to soldiers, such as their location and any commands. There is plenty of scope for more detailed information to be included: For instance, targets spotted by another source might be highlighted in the soldier's view, and views from cameras sensitive to parts of the spectrum besides the visible (such as infrared and millimetre waves) might be added in. This sort of information could also be extremely useful to firefighters and police.

Engineering Applications

Another area of promising AR research is in engineering, and in particular maintenance and repair work: A worker faced with a mass of coloured wiring and component parts could find it enormously helpful to have schematic diagrams and the name and function of any part displayed in front of him on demand. Similarly someone doing electrical or plumbing work in a building could find it extremely useful to be able to see what lies hidden behind a particular section of wall, perhaps from a computerised blueprint.


Virtual Heritage

Several groups are working on the possibility of allowing people to walk around computerised reconstructions of archaeological sites as they once were; with augmented reality it will be possible to see this concurrently with viewing the site as it appears today.