3-D Perception

From HoloWiki - A Holography FAQ

Biological Basis of Vision

The Human Eye

Well-informed expositions on the biology and architecture of the human eye are available online, for example in the online Merck Manual, and Wikipedia offers a broad article. A frequently cited reference on the retina from a neural and functional standpoint is Dowling (1987).

The human eye is a direct extension of the brain; much more than a "biological camera," the eye performs pre-computation on observed imagery prior to transmitting it towards the visual cortex. In the words of Churchland and Sejnowski (1994), "[t]he primate retina transforms patterns of light on the 100 million photoreceptors into electrical signals on the mere one million axons in the optic nerve, and the 100:1 compression ratio suggests heavy-duty signal processing and information compression" (p. 148).

One striking illusion that highlights visual precomputation is Mach banding: the illusory exaggeration of brightness differences at the edges between differently shaded fields. Although still incompletely understood, it may be due to lateral inhibition among nearby photoreceptors, which amounts to high-pass filtering in the eye itself.
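The lateral-inhibition hypothesis can be illustrated with a toy one-dimensional model. This is only a sketch under simplifying assumptions, not a model of real retinal circuitry: each "receptor" output is its own input minus a fixed fraction of its two immediate neighbors' inputs. The function name, the inhibition weight of 0.25, and the luminance values are all hypothetical.

```python
def lateral_inhibition(signal, inhibition=0.25):
    """Subtract a fraction of each immediate neighbour's input from every sample.

    Toy model only: the neighbourhood size and weight are illustrative assumptions.
    """
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]        # repeat the border value at the ends
        right = signal[min(i + 1, n - 1)]
        out.append(signal[i] - inhibition * (left + right))
    return out


# A step edge: a dark field (10) next to a bright field (20).
luminance = [10.0] * 6 + [20.0] * 6
response = lateral_inhibition(luminance)

# Uniform regions give flat responses (5.0 and 10.0), but the sample just
# before the edge dips to 2.5 and the sample just after it peaks at 12.5.
print(response)
```

The dip on the dark side and the peak on the bright side are the model's analogue of the dark and light Mach bands: the filter responds more strongly to the change in luminance than to the uniform fields on either side.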

Relevant Neural Regions

In general, imagery from the right visual field (as collected by the left half of each eye's retina) is transmitted via the optic nerve and optic chiasm to the left hemisphere's optic tract; likewise, imagery from the left visual field travels along the right hemisphere's optic tract. Most of this information flows to the thalamus's lateral geniculate nucleus (LGN); another pathway leads to the superior colliculus.

The information then reaches the visual cortex, which is located at the back of the brain. The visual cortex has several regions (V1, V2, and so on) whose proposed functions are beyond the scope of this discussion. However, we note two relevant and generally supported hypotheses:

Retinotopic Mapping in V1

A striking series of experiments showed that regions of the visual cortex are mapped retinotopically to the observed field; that is, "neighboring cells have neighboring receptive fields" (Churchland & Sejnowski, 1994, p. 155). In a seminal experiment performed by Roger Tootell, a primate's brain was examined after fixation on a patterned, bullseye-like target: when the brain was stained as a function of activity, an image of the target was clearly visible on the unfolded cortex (Tootell, Silverman, Switkes & De Valois, 1982). A photograph appears in David Hubel's online vision textbook, hosted on the Harvard website, in the section "The Architecture of the Visual Cortex."

Regions of V1 and V2 Correspond to Varying Degrees of Monocularity and Binocularity

Churchland and Sejnowski (1994) state that: "For the brain to generate stereo vision, there must be means for the brain to compare retinal images relative to varying planes of fixation. Hubel and Wiesel (1963) discovered that striate cortical cells were not uniform in their response to a visual stimulus, but some cells were strongly monocular, and were flanked by other cells responding somewhat to stimuli from both eyes, though preferring one or the other, flanked in turn by cells that were binocular" (p. 197).

See David Hubel's 1981 Nobel Prize lecture.

Learn more about the visual cortex at Wikipedia.

Depth Cues

Humans perceive imagery that falls on their retina(s) as three-dimensional when influenced by one or more depth cues. Monocular depth cues can be experienced with just one eye; binocular depth cues require two.

Monocular Depth Cues

Briefly, monocular depth cues include:

  • Relative size: larger objects are interpreted as being nearer the observer
  • Interposition / Overlapping: close objects tend to occlude far objects
  • Linear perspective (foreshortening): Receding parallel lines appear to meet at the horizon.
  • Aerial perspective (haze / fog): Blurry or foggy objects may be interpreted as distant, since haze usually blurs distant scene elements.
  • Light and shade: So-called "2 1/2-D" rendering uses the interplay of shape and light to suggest the three-dimensionality of objects. Note that people assume that light comes from above when viewing an image; this is the so-called light-from-above prior or light-from-above heuristic.
  • Motion parallax: Horizontal observer movement (egomotion) "makes" near objects appear to move faster than distant objects. Note that this cue can be used to simulate egomotion, that is, in movies, animations, and true 3-D representations, moving foreground elements faster than background elements evokes the sensation of movement.
  • Accommodation (focus): Retinal focus provides information to your brain about the probable distance from your eye to the object you are fixating on. One issue of non-holographic 3-D displays is the so-called "accommodation / vergence conflict," in which the angular swivel of the eyes does not agree with their focus. This happens, for example, when watching a stereoscopic 3-D movie, since there are cases in which your eyes are focused at a distant screen while they are rotated inwards to gaze at a very close scene element.
  • Texture gradient: As in a field of wheat, the elements of a textured region appear smaller and more densely packed with increasing distance, so the change in apparent texture indicates depth.
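The motion parallax cue in the list above can be made concrete with a little geometry. The following is a minimal sketch, assuming an observer translating sideways at constant speed past stationary points; the function name and the sample speed and depths are hypothetical. For a point at depth d and lateral offset x, the bearing is atan(x / d), so its angular speed is v·d / (x² + d²), which reduces to v / d when the point is directly abeam.

```python
import math


def angular_speed(observer_speed, depth, lateral_offset=0.0):
    """Angular speed (rad/s) of a stationary point seen by a sideways-moving observer.

    observer_speed: lateral speed of the observer (m/s)
    depth:          perpendicular distance from the observer's path to the point (m)
    lateral_offset: position of the point along the path relative to the observer (m)
    """
    return observer_speed * depth / (lateral_offset ** 2 + depth ** 2)


speed = 1.5  # m/s, an assumed walking pace
for depth in (1.0, 10.0, 100.0):
    deg_per_s = math.degrees(angular_speed(speed, depth))
    print(f"depth {depth:6.1f} m -> {deg_per_s:7.2f} deg/s")
```

Because angular speed falls off roughly as 1/depth, a rendering that moves foreground layers proportionally faster than background layers reproduces the same cue.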

A variety of optical illusions prey upon the assumptions your mind makes about interpreting monocular depth cues.

Binocular Depth Cues (Stereopsis)

The average interpupillary distance is approximately 6-6.5 cm. In normal circumstances, this leads to each eye observing a different 2-D field. The brain interprets these differences for depth information, such as (De Valois & De Valois, 1990):

  • Vergence: The angular "swivel" of the eyes while gazing at an object provides a strong cue regarding the depth of that object.
  • Positional Disparity: A large-scale illustration of positional disparity is observed by holding out your index finger and watching how it shifts relative to the background when viewed alternately with your left and right eyes. See Wikipedia: Stereopsis.
  • Phase Disparity of Frequency Components: There is evidence suggesting that the brain is sensitive to the phase difference of the frequency components of an image, which differs in magnitude, of course, from the displacement of the sine-wave component itself (De Valois & De Valois, 1990, p. 302).
  • Orientation Disparity: Orientation disparity refers, for example, to the different angle a line makes on each retina when gazing at a line pitched toward or away from the observer.
  • Spatial Frequency Disparity: Whether this is a separable cue may still be debated. Spatial frequency disparity is the difference in spatial frequency seen by the two eyes for scene elements that are, for example, at varying depths from the observer (Halpern et al., 1987). For example, pitching a single-frequency grating at an angle to the observer yields different perceived spatial frequencies in each eye (De Valois & De Valois, 1990, p. 307).

The depth perception that arises from this collection of disparities is called stereopsis.
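The vergence cue, and the accommodation / vergence conflict mentioned under monocular cues, can both be estimated from simple geometry. This is a rough sketch under stated assumptions: symmetric fixation straight ahead, an interpupillary distance of 6.3 cm (a hypothetical value within the range quoted above), and focus demand expressed in diopters (1 / distance in meters). The function names are illustrative only.

```python
import math

IPD_M = 0.063  # assumed interpupillary distance in metres (within 6-6.5 cm)


def vergence_angle_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two lines of sight when fixating a point straight ahead."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def focus_vergence_mismatch_diopters(screen_m, fixation_m):
    """Difference between where the eyes focus (the screen) and where they converge."""
    return abs(1.0 / screen_m - 1.0 / fixation_m)


for d in (0.5, 2.0, 10.0):
    print(f"fixation at {d:4.1f} m -> vergence {vergence_angle_deg(d):5.2f} deg")

# Stereoscopic cinema example: screen 10 m away, scene element rendered 0.5 m away.
print(focus_vergence_mismatch_diopters(screen_m=10.0, fixation_m=0.5), "diopters")
```

The vergence angle changes quickly at near distances and hardly at all beyond a few meters, which is one reason vergence is most informative for nearby objects.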

An Implication of Random-dot Stereograms

Note that the brain does not require local stereopsis to perceive depth; global stereopsis "can occur without monocular contours" (De Valois & De Valois, 1990, p. 314). For example, Julesz's (1971) random-dot stereograms present two views that appear, in a monocular sense, like disorganized spatial noise. However, the brain is able to fuse the two images into a scene containing depth - perhaps via the global low-frequency content in the imagery.
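A random-dot stereogram of the kind described above is simple to construct. The sketch below is an illustrative construction, not a reproduction of Julesz's original procedure: both eyes receive the same random binary dot field, except that a central square is shifted horizontally in the right-eye image and the strip it uncovers is refilled with fresh random dots. The function name, image size, square size, and shift are hypothetical parameters.

```python
import random


def random_dot_stereogram(size=64, square=24, shift=4, seed=0):
    """Return (left, right) binary images as lists of rows.

    The central `square` x `square` region of the left image appears `shift`
    pixels further left in the right image; everything else is identical.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    start = (size - square) // 2
    end = start + square
    for y in range(start, end):
        # Copy the square from the left image, displaced leftward by `shift`.
        for x in range(start, end):
            right[y][x - shift] = left[y][x]
        # The strip uncovered at the square's right edge gets new random dots.
        for x in range(end - shift, end):
            right[y][x] = rng.randint(0, 1)
    return left, right


left, right = random_dot_stereogram()
```

Neither image contains a visible square on its own; the square exists only in the correlation between the two images. Viewed in a stereoscope, the shifted region should appear at a different depth from the surround.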

Guidelines for Effective 3-D Imagery

Rules of Thumb for Particular Display Media

One implication of the preceding discussion is that it is best to match subject matter to the display medium and intended observation environment. Experts in the following media are invited to add their own rules of thumb:

  • Holographic stereograms
  • Cylindrical multiplex holograms
  • Quasi-holographic electro-optical displays

Bandlimiting Can Decrease Inter-view Aliasing

Holographic stereograms and other discrete-view 3-D displays can exhibit motion artifacts due to inter-view aliasing. For example, image points far from the image surface appear to jump between neighboring views during egomotion if they are sampled or reconstructed improperly. Holography researcher Michael Halle (1994) discusses these constraints, which apply in particular to holographic stereograms and non-holographic parallax displays. In short, this aliasing can be mitigated by intentionally blurring scene elements distant from the image surface.
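The size of the jump between neighboring views, and hence a plausible amount of blur, can be estimated with similar triangles. This is a back-of-the-envelope sketch, not Halle's (1994) formulation: it assumes views spaced evenly along a line at a fixed distance in front of the image surface, and it blurs with a simple moving average. All names and numbers are hypothetical.

```python
import math


def view_to_view_jump(view_spacing, viewer_distance, depth):
    """Lateral jump, measured on the image surface, of a scene point when the
    eye steps from one view to the next.

    depth is the point's distance behind the image surface (negative means in
    front of it, and must be greater than -viewer_distance). Units are arbitrary
    but must be consistent.
    """
    return abs(view_spacing * depth / (viewer_distance + depth))


def blur_width_samples(jump, sample_pitch):
    """Blur kernel width (in samples) wide enough to span one view-to-view jump."""
    return max(1, math.ceil(jump / sample_pitch))


def box_blur(samples, width):
    """Moving average over roughly `width` samples, shortened at the borders."""
    half = width // 2
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


# Views 5 mm apart, viewer 500 mm from the image surface, 0.25 mm samples.
for depth_mm in (0.0, 20.0, 100.0):
    jump = view_to_view_jump(5.0, 500.0, depth_mm)
    print(depth_mm, "mm deep ->", blur_width_samples(jump, 0.25), "samples of blur")
```

Points on the image surface do not jump at all and need no blur, while the required blur grows with distance from the surface, matching the rule of thumb stated above.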

Understand Your Medium's Focus Characteristics

Of course, different 3-D display media use different methods to reconstruct 3-D light fields. For example, some holograms are highly astigmatic, putting the horizontal and vertical foci at very different surfaces in or beyond the 3-D scene. The family of horizontal parallax only (HPO) holograms discards some or all vertical parallax information (De Bitetto, 1968; Benton, 1969; De Bitetto, 1969; Benton, 1977). The long-term effects of viewing astigmatic display media, such as HPO holograms, are not widely known in the display community, and references to thoughtful work in the area are appreciated. While not holographic, electronic 3-D display technologies also vary in their focus characteristics. They range from volumetric displays, whose true voxels in (x, y, z) space elicit proper vergence and accommodation cues (Favalora et al, 2005), to experimental "highly-multiview" HPO systems (Favalora, 2005) and lenticular-sheet displays, which are HPO and typically project very discrete, infrequently sampled horizontal parallax information.

Members of the MIT Media Laboratory's former Spatial Imaging Group explored the importance of choosing the correct scene-sampling and reconstruction geometries as a function of factors including the intended observation point, and proposed computational predistortion methods for dealing with these issues (Halle, Benton, Klug, & Underkoffler, 1991).

References


  • Benton, S. A. (1969). Hologram Reconstructions with Extended Light Sources. J. Opt. Soc. Amer., 59, 1545A.
  • Benton, S. A. (1977). White-light transmission/reflection holographic imaging. In E. Marom, A. Friesem, & E. Wiener-Avnear (Eds.), Applications of Holography and Optical Data Processing (pp. 401-409).
  • Churchland, P. & Sejnowski, T. J. (1994). The Computational Brain. Cambridge, MA: The MIT Press. ISBN 0262531208
  • De Bitetto, D. J. (1968, March 1). Bandwidth reduction of hologram transmission systems by elimination of vertical parallax. Applied Physics Letters, 12(5), 176-178.
  • De Bitetto, D. J. (1969, August). Holographic Panoramic Stereograms Synthesized from White Light Recordings. Applied Optics, 8(8), 1740-1741.
  • De Valois, R. L. & De Valois, K. K. (1990). Spatial Vision. Oxford: Oxford University Press. ISBN 0195050193
  • Dowling, J. E. (1987). The Retina: An Approachable Part of the Brain. Cambridge, MA: Belknap Press of Harvard University Press. ISBN 0674766806
  • Favalora, G. E. (2005, August). Volumetric 3D Displays and Application Infrastructure. Computer, 38(8), 37-44.
  • Favalora, G. E., Chun, W., Cossairt, O. S., Dorval, R. K., Halle, M., Napoli, J., & Thomas, M. (2005). Scanning optical devices and systems. U.S. Pat. App. US2005/0285027A1, filed Feb. 15.
  • Halle, M. W., Benton, S. A., Klug, M. A., & Underkoffler, J. S. (1991). The Ultragram: A Generalized Holographic Stereogram. In S. A. Benton (Ed.), Practical Holography V [Proc. SPIE-IS&T Electronic Imaging, SPIE Vol. 1461] (pp. 142-155).
  • Halle, M. (1994). Holographic stereograms as discrete imaging systems. In S. A. Benton (Ed.), Practical Holography VIII [Proc. SPIE Vol. 2176] (pp. 73-84). Bellingham, WA.
  • Halle, M. (1997, May). Autostereoscopic displays and computer graphics. Computer Graphics, ACM SIGGRAPH, 31(2), 58-62.
  • Halpern, D. L. et al. (1987). What causes stereoscopic tilt from spatial frequency disparity. Vision Res., 27(9), 1619-1629.
  • Hubel, D. H. & Wiesel, T. N. (1963). Shape and arrangement of columns in cat's striate cortex. Journal of Physiology, 165, 559-568.
  • Julesz, B. (1971). Foundations of Cyclopean Perception. Chicago: University of Chicago Press.
  • Okoshi, T. (1976). Three-Dimensional Imaging Techniques. Academic Press. ISBN 0-12-525250-1
  • Ratliff, F., Milkman, N., & Rennert, N. (1983). Attenuation of Mach bands by adjacent stimuli. Proc. Natl. Acad. Sci. U.S.A., 80(14), 4554-4558.
  • Shepherd, G. M. (2003). The Synaptic Organization of the Brain. Oxford University Press. ISBN 019515956X
  • Tootell, R. B. H., Silverman, M. S., Switkes, E., & De Valois, R. L. (1982). Deoxyglucose analysis of retinotopic organization in primate striate cortex. Science, 218, 902-904.

External Links