Volume 2, Number 9, Article 2, Pages 597-607
doi:10.1167/2.9.2
http://journalofvision.org/2/9/2/
ISSN 1534-7362
Bi-stability in perceived slant when binocular disparity and monocular perspective specify different slants
Raymond van Ee
Utrecht University, Helmholtz Institute,
Utrecht, The Netherlands
Loes C. J. van Dam
Utrecht University, Helmholtz Institute,
Utrecht, The Netherlands
Casper J. Erkelens
Utrecht University, Helmholtz Institute,
Utrecht, The Netherlands
Abstract
We examined how much depth we perceive when viewing a depiction of a slanted plane in which binocular disparity and monocular perspective provide different slant information. We exposed observers to a grid stimulus in which the monocular- and binocular-specified grid orientations were varied independently across stimulus presentations. The grids were slanted about the vertical axis and observers estimated the slant relative to the frontal plane. We were particularly interested in the metrical aspects of perceived slant for a broad spectrum of possible combinations of disparity- and perspective-specified slants. We found that observers perceived only one grid orientation when the two specified orientations were similar. More interestingly, when the monocular- and binocular-specified orientations were rather different, observers experienced perceptual bi-stability (they were able to select either a perspective- or a disparity-dominated percept).
History
Received August 20, 2002; published December 9, 2002
Citation
van Ee, R., van Dam, L. C. J., & Erkelens, C. J. (2002). Bi-stability in perceived slant when binocular disparity and monocular perspective specify different slants.
Journal of Vision, 2(9):2, 597-607,
http://journalofvision.org/2/9/2/,
doi:10.1167/2.9.2.
Keywords
stereopsis, bi-stability, binocular vision, ambiguous figures, binocular disparity
Each of our eyes views a scene from a slightly
different position. The resulting binocular disparities enable us to reconstruct
the 3-dimensional (3D) lay-out. The processing of disparities is, however, not
essential for the 3D reconstruction because we are often able to perceive depth
solely on the basis of monocular vision. For example, monocular perspective
(including texture, outline, and linear perspective) is a powerful cue for
surface slant ( Clark, Smith, & Rabe,
1955; Cutting & Millard, 1984; Freeman, 1966; Stevens, 1981). How much depth do we
perceive when viewing a depiction of a slanted plane in which binocular
disparity and monocular perspective provide opposite slant information?
Recently, we examined this question in a metrical (quantitative) way and we
found for a range of disparity-perspective cue conflicts that observers
experience bi-stability when viewing such depictions ( van Ee, Hol, & Erkelens, 2001).
Although quite interesting phenomenological aspects of bi-stability in
stereoscopically perceived slant were reported in the early days of stereoscopic
research, little progress seems to have been made since then, and the metrical
aspects have never been investigated systematically.
The literature on perceptual bi-stability is vast.
However, almost all demonstrations of bi-stability are essentially monocular,
even when they are viewed binocularly. Figure
1 shows the well-known Necker cube, which is an example of a stimulus that
evokes perceptual bi-stability. The literature on bi-stability that requires
stereopsis is surprisingly sparse, even though quite a few studies have
addressed conflicts between monocular and binocular specified depth (see
“Discussion”). A survey of the literature reveals interesting
findings. First, as far as we know, only two studies have reported that
bi-stability occurs in slant perception for extreme disparity-perspective cue
conflict situations ( Wheatstone, 1852;
Schriever, 1925). These studies were
phenomenological in nature and did not address metrical aspects of perceived
slant. Wheatstone, in particular, reported bi-stability for a variety of
different 3D stimuli in which perspective and disparity provided opposite
depths. 1 Second, a couple of studies did
examine estimated slant when disparity and perspective provide opposite slant
information but they did not report bi-stability ( Allison & Howard, 2000a; Allison & Howard, 2000b; Gillam & Cook, 2001).
In sum, there seem to be no studies in the literature
that investigated how much depth is perceived (i.e., the metrical aspects) in
stimuli that engender bi-stability. On the phenomenological aspects, however, Wheatstone (1838, 1852; i.e. over 150 years ago) reported a
wealth of information about and insights into bi-stability. Because many of his
findings are relevant for our study, we will use them as a central thread
through this introduction.
Figure 1. Necker cube bi-stability example. A constant
stimulus gives rise to two alternative 3-D interpretations. Although eye
movements play a role, the general consensus is that the bi-stability is
predominantly central. Stereopsis is not required to experience bi-stability in
this stimulus: the bi-stability is essentially monocular.
Wheatstone, using the stereoscope that he constructed,
was one of the first to study stimuli in which binocular disparities and
monocular perspective provided opposite slant information ( Wheatstone, 1838, p. 377):
“A very singular effect is
produced when the drawing originally intended to be seen by the right eye is
placed at the left hand side of the stereoscope, and that designed to be seen by
the left eye is placed on its right hand side. A figure of three dimensions, as
bold in relief as before, is perceived, but it has a different form.” He
called this the “converse figure” (1838) or “conversion of
relief” (1852); nowadays, we call it “reverse perspective”
(reviewed in Howard & Rogers, 2002).
“Those points which are nearest the observer in the proper figure are the
most remote from him in the converse figure” and he continues, “but
it is not an exact inversion, for the near parts appear smaller, and the remote
parts larger than the same parts before inversion ( Wheatstone, 1838, p. 377).” 2 And then he explains that in the case of
simple line drawings, the reverse perspective figure is “as readily
apprehended as the original one, because it is generally a figure of a frequent
occurrence.” He also states that the reversals “seem entirely to
depend on our mental contemplation of the figure intended to be represented, or
of its converse.” In the Bakerian Lecture ( Wheatstone, 1852, p. 14), he is
extraordinarily explicit about the occurrence of bi-stability (which he calls
“the two ideas in the mind”) in binocular vision 3: “I know of nothing more wonderful,
among the phenomena of perception, than the spontaneous successive occurrence of
these two different ideas in the mind, while all external circumstances remain
precisely the same,” and he goes on to state that an object “becomes
converted into another totally dissimilar object uncouth in appearance, and
which gives rise to no agreeable emotions in the mind; yet in both cases all the
sensations that intervene between object reality and ideal conception continue
unchanged.”
Figures 2 and 3 demonstrate the two 3D percepts that observers
are able to distinguish when (monocular) perspective and (binocular) disparity
specify very conflicting slants: one percept in which the grid’s slant is
positive ( Figure 3b) and the other in which
the slant is negative ( Figure 3c). The two
percepts are never present simultaneously.
Figure 2. Demonstration of bi-stability in stereoscopic perception.
In these stereograms, both perspective and binocular disparity specify
surface slant about the vertical axis. Red/green filters are required to view
them. When the red filter is over the left eye, two relatively stable percepts
can be distinguished. In the first percept, the grid recedes in depth with its
left side further away (it is perceived as a slanted rectangle). In the other
percept, the right side of the grid is further away (it is perceived as a
trapezoid with the near edge shorter than the far edge). In fact, the perceived
slant depends on the viewing distance; however, when the red filter is over the
left eye, the signs of the perspective- and disparity-specified slants always
conflict. When the red filter is over the
right eye, perspective and disparity specify similar slants and the observer
perceives a single stable slanted grid with its right side closer. In the lower
stereogram, the conflict between disparity and perspective-specified slant is
relatively small and observers generally perceive one slanted plane (no
bi-stability). More demonstrations can be found on our Web site: http://www.phys.uu.nl/~vanee/
Figure 3.
Experimental procedure. The drawing is schematic. A. The stimulus consisted of a
grid (subtending 15 × 11 deg when perspective specifies a frontal, unslanted
grid). The grid was viewed against a large surrounding reference (92 × 39 deg)
consisting of unslanted squares (1 × 1 deg). The window in the center of the
surround was 19 × 17 deg. Perspective-specified slant and disparity-specified
slant could be either in conflict (B and C) or in agreement (in which case
panels B and C are identical). D. The stimulus was followed by a display in
which subjects matched the perceived slant(s) to the angle(s) between the fixed
horizontal line and the rotatable intersecting line(s).
Most observers with normal stereovision have no
difficulty in focusing their attention on either of the two 3D percepts.
However, during pilot studies and during presentations at conferences, we have
asked at least 60 observers to report their perceptions while viewing ambiguous
slant stimuli; as in many other studies in binocular depth perception, we found
considerable variability between observers (reviewed in Howard & Rogers, 2002). Some of the
observers were able to perceive both the perspective and the disparity-dominated
percept (bi-stability), some observers perceived solely the
perspective-dominated percept, and some solely the disparity-dominated percept
(see also Stevens, Lees, & Brookes,
1991, for the same finding in a comparable study for surface curvature).
Roughly speaking, about 30% of the 60 pilot observers
tested were able to perceive both the perspective-dominated and the
disparity-dominated percept directly. The other 70% of the observers initially
perceived solely the perspective-dominated percept (even if they knew that
bi-stability would be possible). Only after they had been told they were looking
at a stimulus that they could see in reversed perspective were they able to
perceive bi-stability. About 10% to 20% of the 60 observers kept seeing solely
the perspective-dominated percept even after they had been coached in trying to
perceive the disparity-dominated percept. Two observers (very experienced
colleagues in stereo vision research, but not the authors) perceived solely the
disparity-dominated percept, and they were unable to alternate between the
disparity- and the perspective-dominated percept.
Bi-stability in stereoscopic vision is an interesting
phenomenon because it creates the rare opportunity of having two states in
neural processing that are related to the percepts rather than to the stimulus.
To enable future theoretical analyses on how both perspective- and
disparity-specified slant contribute to bi-stable 3D perception, we collected
systematic data on metrical aspects for a broad spectrum of possible
combinations of disparity- and perspective-specified slants. We asked observers
to view ambiguous stereoscopic images in which both disparity and perspective
specified different orientations of a grid in 3D space. Grid rotation was about
the vertical axis, and we manipulated perspective and disparity
independently.
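To illustrate how the two cues can be manipulated independently, the sketch below computes the left- and right-eye screen positions of a single grid point. It is a minimal sketch and not the code used to generate the stimuli: the function name `half_images`, the 6.5 cm interocular distance, and the choice to fix the cyclopean image from the perspective-specified slant while the disparities follow the disparity-specified slant are all assumptions. Only the 114 cm viewing distance and the sign convention (positive slant is right side away) come from the text.

```python
import math

def half_images(u, v, slant_persp_deg, slant_disp_deg,
                distance_cm=114.0, interocular_cm=6.5):
    """Screen coordinates (cm) of one grid point in the left and right half-image.

    (u, v) is the point's position on the flat grid, in cm from the grid center.
    The interocular distance is an assumed typical value, not taken from the text.
    """
    sp = math.radians(slant_persp_deg)
    sd = math.radians(slant_disp_deg)
    d = distance_cm

    # 1. Perspective: rotate the grid by sp about the vertical axis and project
    #    it from the cyclopean eye (at the origin) onto the screen at z = d.
    z = d + u * math.sin(sp)
    xs = d * u * math.cos(sp) / z
    ys = d * v / z

    # 2. Disparity: move that screen point along its cyclopean line of sight
    #    onto a plane through the screen center that is slanted by sd.
    t = d / (d - xs * math.tan(sd))
    px, py, pz = t * xs, t * ys, t * d

    # 3. Re-project the resulting 3D point onto the screen from each eye.
    half = interocular_cm / 2.0
    scale = d / pz
    left = (-half + (px + half) * scale, py * scale)
    right = (half + (px - half) * scale, py * scale)
    return left, right

# A point 10 cm right of center: rectangle-like perspective, disparity slant +40 deg.
left, right = half_images(10.0, 5.0, 0.0, 40.0)
disparity_cm = right[0] - left[0]  # positive (uncrossed): the point lies behind the screen
```

In this construction the mean of the two half-images does not depend on the disparity-specified slant, so the perspective-specified shape can be held fixed while the disparity-specified slant is varied.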
The stimuli ( Figures
2 and 3) were planar grids (subtending 15
× 11 deg in unslanted conditions)
presented dichoptically by a conventional red-green anaglyphic technique. The
correct perspective and disparity distortions of the stimuli were generated
using OpenGL libraries. The stimuli were rear-projected onto a large screen (92
× 77 deg). A surrounding pattern (92 × 39 deg) consisting of small
squares (1 × 1 deg) provided a zero-slant reference and prevented depth
contrast illusions. Only 80% of these surrounding squares were shown to prevent
fixation in the wrong depth plane (wallpaper effect). Subjects were seated at a
viewing distance of 114 cm. The head was stabilized with a chin and forehead
rest. Subjects were free to move their eyes. 4 Line widths were 6.3
arcmin. The intensities of the red and green half-images were adjusted until
they appeared equiluminant when viewed through the red and green filters.
Photometric measurements showed that minuscule amounts (0.3%) of the green and
the red light leaked through the custom-made red and green filters,
respectively. The room was completely dark, so only the grid and the reference
were visible.
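The angular sizes above can be converted to physical sizes on the screen with the standard visual-angle relation, size = 2 d tan(θ/2). The sketch below applies it at the stated viewing distance of 114 cm; the function and variable names are illustrative, and it assumes each object is centered on the line of sight and lies in the plane of the screen.

```python
import math

VIEWING_DISTANCE_CM = 114.0  # viewing distance stated in the text

def angle_to_size_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Extent on a frontoparallel screen of a centered object subtending angle_deg."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

grid_width_cm = angle_to_size_cm(15.0)        # unslanted grid width, about 30 cm
grid_height_cm = angle_to_size_cm(11.0)       # unslanted grid height
square_cm = angle_to_size_cm(1.0)             # reference squares, about 2 cm
line_width_cm = angle_to_size_cm(6.3 / 60.0)  # 6.3 arcmin line width
```

At this distance one degree of visual angle corresponds to roughly 2 cm on the screen.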
To investigate systematically how both perspective- and
disparity-specified slant contribute to bi-stable 3D perception, we varied both
disparity-specified slant (–70 to 70 deg in 10 steps) and perspective-specified
slant (–70 to 70 deg in 6 steps), giving 11 × 7 = 77 stimulus conditions.
Positive slants are defined as right side
away. In each block of 77 trials, every stimulus condition appeared once,
in random order, for 10 s. There were five trial
blocks. The subjects were instructed that both
ambiguous (flip) and nonambiguous (non-flip) stimuli would be presented and that
the stimuli could be either trapezoidal or
rectangular.
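The trial structure described above can be sketched as follows. This is an illustration only: it assumes the slant levels were equally spaced between –70 and 70 deg (the text gives only the range and the number of steps), and the names `levels` and `make_blocks` are hypothetical.

```python
import random

def levels(lo, hi, n_steps):
    """n_steps + 1 equally spaced values from lo to hi (equal spacing is assumed)."""
    return [lo + (hi - lo) * i / n_steps for i in range(n_steps + 1)]

def make_blocks(n_blocks=5, seed=None):
    """Return n_blocks lists, each holding all 77 conditions in a fresh random order."""
    rng = random.Random(seed)
    conditions = [(disparity, perspective)
                  for disparity in levels(-70, 70, 10)   # 11 disparity-specified slants
                  for perspective in levels(-70, 70, 6)]  # 7 perspective-specified slants
    blocks = []
    for _ in range(n_blocks):
        block = conditions[:]   # copy so that each block is shuffled independently
        rng.shuffle(block)
        blocks.append(block)
    return blocks

blocks = make_blocks(seed=1)
```

Each element of a block is a (disparity-specified slant, perspective-specified slant) pair in degrees; every pair occurs exactly once per block.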
The subjects’ task was to estimate the perceived
slant of the grid. The slant estimation procedure ( van Ee & Erkelens, 1996) is depicted in Figure 3. The subject initiated the stimulus
onset by a mouse click. A subject was instructed first to decide whether he or
she was able to see either the left side in front or the right side in front.
Then the task was to estimate the respective slants and to remember the
estimated angles. After presentation of the stimulus (10 s), three
frontoparallel lines were presented on the screen ( Figure 3d). One of the lines was horizontal and
the other two lines could be rotated about their center. The horizontal line was
fixed and represented a top view of the unslanted reference; each of the other
lines represented the top-view of the perceived grid in either the
perspective-dominated percept or in the disparity-dominated percept ( Figure 3d). Subjects were instructed to match
the angles between the rotatable lines and the horizontal line to the two
perceived slants. If observers were not able to experience bi-stability, they
matched both angles to the (single) slant they perceived. Because the lines were
displayed in the plane of the screen, the lines also served as a zero-slant
reference between successive stimuli. 5
Observers who were able to perceive bi-stability are
particularly interesting for the purposes of this work. We therefore carried out
a complete experiment with five subjects who were able to perceive bi-stability.
In order to obtain a reasonably complete overview of the spectrum of possible
results, we asked one only-perspective-dominant observer and one
only-disparity-dominant observer to participate in a complete experiment. Both the five
observers who were able to perceive bi-stability and the only-disparity-dominant
observer had excellent stereo vision: their stereoacuity thresholds were below 10
arcsec, and they were also able to distinguish disparities of different signs and
magnitudes within a range of –1 to 1 deg in a stereoanomaly test ( van Ee & Richards, 2002). The
only-perspective-dominant observer participating was unable to distinguish
disparities of different signs and magnitudes even while making eye movements.
Prior to participation, the candidates were also tested for consistency in their
responses when estimating the slants of both real and dichoptically presented
planes. The seven subjects knew that they were participating in an experiment
containing ambiguous (flip) and nonambiguous (non-flip) stimuli, but they were
not informed about the purpose of the experiment.
Figures 4, 5, and 6 show the mean perceived
slants for a range of varying perspectives and disparities. Figure 4 shows the mean perceived slants across the five
subjects who were able to perceive bi-stability. The data show clearly that
there are two different domains. In one domain (when disparity-specified slant
and perspective-specified slant were similar) only one slant is perceived. In
this domain, slants derived from perspective and disparity have been reconciled,
engendering an intermediate perceived slant (in Figure 4,
this situation is represented in the left part of the three top panels and the
right part of the three bottom panels). The reconciled data in this domain show
the often reported slant underestimation. In the other domain (when
disparity-specified slant and perspective-specified slant were quite different)
a subject experienced bi-stability and was able to select one of the two
perceived slants. 6 The occurrence of such a
clear bifurcation has been reported before for an ambiguous vertical disparity
stimulus ( Porrill, Frisby, Adams, &
Buckley, 1999), though that study did not report on bi-stability.
In general, after the onset of the stimulus, all five
observers first perceived the perspective-dominated slant (see also Schriever, 1925, and Stevens et al., 1991, for very similar
findings in bi-stable slant and curvature, respectively). After a couple of
seconds, the disparity-dominated percept almost literally “kicked
in.” During the rest of the presentation period, the two percepts remained
present: although spontaneous flips could not be prevented, subjects were able
to select either of the two percepts and to flip between them by switching their
attention. Another study that looked at stereo-perspective conflicts also
reported that perspective dominated initially before disparity took over ( Allison & Howard, 2000a). Other
relevant studies on the timing issues in perspective-disparity conflict are
concerned with slant reversals ( Gillam,
1967; Gillam, 1993). Gillam reported
perceived slants to be in the direction opposite to that predicted when subjects
view a stimulus with rich perspective cues (such as a brick wall) while one of
the retinal images was horizontally scaled relative to the other. Those slant
reversals also involve perspective-specified slant dominating the
disparity-specified slant in initial stages of viewing ( Seagrim, 1967). Van Ee conducted a
perceptual learning experiment. One of the subjects showed reversed slants only
for roughly the first 25 responses of an experimental session. The rest of his
responses were in the predicted direction. The number of initial
slant reversals decreased over one-week interval sessions but did not disappear
(van Ee, 2001).
Figure 4. Experimental results for five observers who perceived bi-stability. The graphs
depict perceived slant as a function of disparity-specified slant (i.e., slants
that were geometrically present in the stimulus). Error bars represent ±1 SD
in the mean across the five subjects. The subjects perceived the grid with
the left side either behind (grey disks) or in front (black diamonds) of the
unslanted reference. The trapezoid-shaped icons depict perspective-specified
slant.
The fact that bi-stability occurs when
the conflict between perspective and disparity is small (see particularly the
center panel of Figure 4 when perspective is zero) seems
to be at odds with the literature on stereo slant perception. Most studies
(including our own) report reconciliation of slant cues in such conditions.
There are at least two explanations for this difference. First, we used grid
stimuli with stronger perspective cues than those that are usually used in
research. Second, we explicitly asked subjects to focus on the occurrence of
bi-stability (see “Discussion”).
Figure 5 shows the mean
perceived slants across five trial repetitions for the observer who was able to
experience only the perspective-dominated percept. This pattern of data
resembles the pattern of data of Figure 4 for the
perspective-dominated percept in bi-stability. Figure 6
shows mean perceived slants across five trial repetitions for the observer who
was able to experience only the disparity-dominated percept. Although
bi-stability occurred occasionally, this pattern of data resembles the pattern
of data of Figure 4 for the disparity-dominated percept
in bi-stability.
All the data figures show that the observers follow
neither the disparity-specified slant nor the perspective-specified slant. So
far we have considered only disparity and perspective as cues for grid slant. In
a stereoscopic experiment in the laboratory, there are, however, more slant cues
available to the visual system; some of them are inevitably conflicting (e.g.,
accommodative blur, the fixed graininess of the pixels on the screen, or the
brightness gradient). In our experiment, these residual cues specify zero slant
- often called flatness - of the grid.
The presence of the flatness cues explains why subjects deviate from both the
disparity-specified and the perspective-specified slant even when the two
specify the same slant.
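One common way to make this argument concrete is a weighted linear combination of cues. The sketch below is an assumption introduced for illustration, not a model fitted in this study: the function name and the weights are hypothetical, and the flatness cue is taken to signal 0 deg as described above.

```python
def combined_slant(disparity_slant, perspective_slant,
                   w_disparity, w_perspective, w_flatness):
    """Weighted average of three slant cues (deg); the flatness cue always signals 0."""
    total = w_disparity + w_perspective + w_flatness
    weighted_sum = (w_disparity * disparity_slant
                    + w_perspective * perspective_slant
                    + w_flatness * 0.0)
    return weighted_sum / total

# Both cues specify 40 deg. Without a flatness cue the estimate is veridical;
# any nonzero flatness weight pulls it toward zero slant.
no_flatness = combined_slant(40, 40, 1, 1, 0)    # 40.0
with_flatness = combined_slant(40, 40, 1, 1, 1)  # about 26.7
```

Under this assumption the estimate falls short of the specified slant whenever the flatness weight is nonzero, even when disparity and perspective agree.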
In the “Introduction,” we referred to
considerable differences across subjects during pilot studies. Such differences
are commonly found in stereo studies (for a review, see Howard & Rogers, 2002). More relevant for
the current study is the fact that two of the studies that reported bi-stability
in perception when monocular and binocular cues conflict also found such
differences ( Schriever, 1925; Stevens et al., 1991). A couple of studies
have related the differences across subjects to stereoanomaly ( Harwerth, Möller, & Wensveen,
1998; van Ee & Richards,
2002; Rouse, Tittle, & Braunstein,
1989). We found that the differences across subjects were considerably
reduced if we selected subjects who were able to distinguish disparities of
different sign and magnitude in a recently developed stereoanomaly test ( van Ee & Richards, 2002). The
subject who continued to see the perspective-dominated percept was unable to
distinguish disparities of different sign and magnitude in this test (even while
making eye movements).
Figure 5. Same as Figure 4 but for a perspective-dominant observer. This
observer hardly perceived bi-stability.
We have examined the metrical aspects of perceived
slant for a broad spectrum of possible combinations of disparity- and
perspective-specified slants. Observers perceived only one slant when the
perspective- and disparity-specified grid orientations were similar. More
interestingly, observers with normal stereopsis were able to select either a
perspective- or a disparity-dominated slant when the specified orientations were
rather different.
Why have so few studies reported on bi-stability in
stereo vision? Many investigators have been interested in the interaction of
binocular and monocular cues. Quite a few have varied monocular cues ( Banks & Backus, 1998; Buckley & Frisby, 1993; Clark, Smith, & Rabe, 1956; Cumming, Johnston, & Parker, 1993; Frisby, Buckley, & Horsman, 1995; Frisby, Buckley, & Freeman, 1996; Gillam, 1968; Gillam & Ryan, 1992; Harwerth et al., 1998; Johnston, Cumming, & Parker, 1993; Ryan & Gillam, 1994; Smith, 1967; Stevens & Brookes, 1988; van Ee, Banks, & Backus, 1999; Youngs, 1976), and others have gone so far as
to present binocular cues that specified a depth sign that was opposite to the
depth sign specified by monocular cues ( Allison & Howard, 2000a; Allison & Howard, 2000b; Braunstein, Andersen, Rouse, & Tittle,
1986; Bülthoff & Mallot,
1988; Bülthoff & Mallot,
1990; Dosher, Sperling, & Wurst,
1986; Gillam & Cook, 2001; Rogers & Collett, 1989; Turner, Braunstein, & Andersen, 1997; van der Meer, 1979). Most of the
above-mentioned studies, however, were not concerned primarily with the study of
bi-stability. Therefore, they did not employ disparity and perspective stimuli
that consisted of large differences in depth magnitude. This might be a first
reason why so few studies explicitly report on bi-stability. Second, most
studies that employed rather conflicting cues used short presentation times,
which did not leave time to build up the bi-stable percepts. Third, we used grid
stimuli with perspective cues that are stronger than those used in most existing
studies. Finally, and perhaps most importantly, we explicitly asked subjects to
focus on the occurrence of bi-stability. This relates to the classical question
of cognitive intervention in perceptual responses and is difficult to rule out.
Some naïve observers might not perceive bi-stability when they are not
explicitly instructed to look for it.
Figure 6. Same as Figure 5 but for a disparity-dominant
observer. Just as the observer in Figure 5, this observer
hardly perceived bi-stability.
Why is it interesting to study bi-stability? Wheatstone (1852, p. 13) wrote 7 “the relief and distance of objects is
not suggested to the mind solely by the binocular pictures and the convergence
of the optic axes, but also by other signs” (nowadays called cues),
“which are perceived by means of each eye singly. One idea being therefore
suggested to the mind by one set of signs, and another totally incompatible idea
by another set, according as the mental attention is directed to the one and
abstracted from the other, the normal form or its converse is
perceived.” Generally, and often
in psychophysics, it is beneficial to study signal interaction under conflicting
conditions. Another way of studying perception is to expose the visual system to
an ambiguous stimulus that generates bi-stable perception because it creates the
rare opportunity of having two states in neural processing that are related to
the percepts rather than to the stimulus. Although to our knowledge Brewster and
von Helmholtz were not explicit about the occurrence of bi-stability in
binocular vision, it might be of historical interest to compare their analyses
of reverse perspective. Brewster stated that the reverse perspective illusion
“is the result of an operation of our minds, whereby we judge the forms of
bodies by the knowledge we have acquired” (quoted in Wheatstone, 1838, p. 383) and von
Helmholtz noted that we see objects as those
that “produce the same impression
on the nervous mechanism under ordinary normal conditions” ( von Helmholtz, 1866, Vol. III,
§26). These authors understood that we use prior knowledge of the world
(and not just the information on the retinae) to infer the object that would
most likely have produced the stimulus, an analysis that is now advanced in
Bayesian-like analyses of the visual system.
The use that is made of prior knowledge is evident in
one of the most striking examples of depth inversion, namely a hollow relief
mask which can be seen in reversed perspective despite stereopsis ( Yellott & Kaiwi, 1979). Yellott and
Kaiwi report that if a random-dot stereogram is projected onto such a mask,
stereopsis can be achieved for the stereogram, and its depth planes can be seen
correctly while the mask itself, including the region covered by the stereogram,
is simultaneously perceived with depth inverted. Frisby and Mayhew (1979) published another
striking example on depth inversion in random-dot stereograms. Their observers
viewed, by crossing their eyes, a classical Julesz random-dot stereogram that
contained a square that receded relative to the surround. If their observers
stared at the square while at the same time they forcibly converged their gaze
even further, then there came a point at which the depth direction of the square
changed and it appeared to protrude in front of the surround, rather than
recede. Their explanation for the depth inversion was as follows: “the
deliberate act of verging away from the square’s proper depth plane
disturbs the usual fusion of the two halves and permits a new fusional state to
come about which carries with it depth inversion.” Such an explanation
does not account for our results because in our stimulus there is no fusion
problem: the disparities are relatively small, and there is no matching
ambiguity such as occurs in random-dot stereograms.
Although spontaneous flips could not be prevented, the
flips between the two percepts were attention-driven. 8 Our study does not make clear what happens
in a subject’s mind while he or she flips between the two percepts. One
way of viewing bi-stability is in terms of the brain constructing an a
posteriori probability of the world’s state of affairs conditional on the
image data ( Kersten, Bülthoff, Schwartz,
& Kurtz, 1992). Such a Bayesian approach 9 is consistent with the general notion that
the visual system is picking rational and plausible interpretations of scene
properties causing the image. An example for a rational interpretation of scene
attributes is the finding that a binocularly viewed curved surface is only
perceived as glossy if the specular highlight is close to the correct
(geometrically derived) distance from the surface ( Blake & Bülthoff, 1990).
In our study, subjects were instructed that both
ambiguous (flip) and nonambiguous (non-flip) stimuli would be presented and some
observers noted “I just wanted to see left in front or right in
front” (again, see Schriever,
1925, and Stevens et al., 1991, for
very similar findings). Subjects were also informed that the stimuli could be
either trapezoidal or rectangular, and some observers explicitly used this
information and noted that they switched their attention from attempting to see
a trapezoid to attempting to see a rectangle. 10 All stimuli were consistent with a
real-world object, which may be trapezoidal or rectangular. It is only by using
this type of assumption that linear perspective can be informative. Elsewhere we
present a coherent Bayesian model for bi-stability in which it is assumed that
observers flipped between the two perceived slants by changing the strength of
the rectangularity assumption ( van Ee, Adams, & Mamassian,
2002). In the strong-rectangularity mode, the observer is assuming that the
object in the world was a rectangle, and deviations from rectangularity in the
image are a consequence of perspective projection. In the disparity dominant
mode, it is assumed that the observer is implementing a weak rectangularity
assumption. In the Bayesian bi-stability model, there is one set of parameters
(at the chosen viewing distance) that can explain perceptual bi-stability in
stereoscopic vision for the complete spectrum of combinations of perspective and
disparity.
We are grateful to Nieneke Elsenaar for finding many
references and to Pieter Schiphorst for technical assistance. We thank Drs. A.V.
van den Berg and W.J. Adams for helpful discussions, and the subjects for
participating in this study. R.V.E. was supported by The Netherlands
Organization for Scientific Research. Commercial Relationships: None.
1. See also Stevens, Lees, and Brookes
(1991), who reported bi-stability in stereoscopic curvature
perception; no metrical analyses were involved. Two more studies reported
bi-stability for a 3D stimulus in which disparity was varied ( Virsu, 1975; and Harris, 1980). They studied the effect of
disparity adaptation on the probability of seeing the Schröder staircase
either from below or from above. This bi-stability is different from that in our
study because the Schröder staircase bi-stability is essentially monocular.
Also, to observe a reversal for the Schröder staircase, the observer would
have to assume that the relative position of the observer and the object had
been altered (for other work on bi-stability, see Gregory, 1970; Papathomas, 2000; Papathomas, 2002; and Wade & Hughes, 1999).
2. Wheatstone (1838) also described the
shape deformations in rotating stimuli that Ames described in 1951 (for the
famous rotating trapezoidal window). Such shape deformations can be compared
with the nonrigidity that is experienced when cues from stereo and
structure-from-motion interact while a rotating figure is observed in reversed
perspective ( Turner, Braunstein, &
Andersen, 1997).
In
his 1838 study, Wheatstone is not entirely clear about bi-stability in binocular
vision. On page 381, he writes that the bi-stability “phenomenon takes
place, though less decidedly, when the drawing is seen with both eyes,”
and there are a couple of examples. But on page 382, he states very clearly that
“no illusion of this kind can take place when an object of three
dimensions is seen with both eyes.” In 1839, about 6 months after the
appearance of the 1838 study, photography was being introduced and shortly
afterwards Wheatstone used photographs of objects in his stereoscope. The use of
photographs enabled him to use more realistic monocular cues to depth.
Although
eye movements play a role, the general consensus is that the bi-stability is
predominantly central (for a review, see Howard, 1961). We are currently measuring eye
movements while subjects experience bi-stability in our grid stimuli. Our
preliminary conclusion is that flips between the two percepts in the
bi-stability evoked by our stimuli can occur by effort of will while subjects
keep strict fixation.
A
reasonable objection to this metrical slant estimation method is that it is hard
to interpret the data because a slant angle that is estimated to be 35 deg in
one trial might look like 40 deg in another trial. Previous work has
demonstrated, however, that subjects have a relatively constant internal
reference and that they do not regard this task as difficult. This estimation
method has been used previously for real planes (van Ee, Banks, & Backus, 1999) and when
subjects wore distorting lenses (Adams, Banks,
& van Ee, 2001). In addition, a similar metrical depth estimation method
was successfully used for volumetric stimuli (van Ee & Anderson, 2001).
A
reviewer posed the interesting question as to whether observers are able to
perceive bi-stability in monocular stimuli. We found that subjects did not
perceive bi-stability when we presented them with monocular trapezoids and
rectangles (van Ee, Hol, &
Erkelens, 2001). We provided controls for monocular images for both the
right (green) eye and the left (red) eye and also for synoptic presentation, in
which the images were presented in yellow with a vergence posture as if the
stimulus were located at infinity.
Note
that as early as 1852, he explains the resulting percepts in terms of cue
combination.
See
Wheatstone (1852), and particularly
McDougall (1906) and Flügel (1913), for seminal discussions
about the role of attention (or central processing, as opposed to an influence of
eye movements).
Bayesian
modeling has been successfully applied in computer vision (for a review, see Knill & Richards, 1996), and in the last
decade, several investigators have started to apply this framework to human
vision (Porrill, Frisby, Adams, &
Buckley, 1999; Bülthoff &
Yuille, 1991; Bülthoff &
Mallot, 1990; Clark & Yuille, 1990;
Yuille & Bülthoff, 1996; Freeman, 1994; Freeman, 1996; Bülthoff & Yuille, 1990; Bülthoff, 1991; Yuille, Geiger, & Bülthoff, 1991; Ascher & Grzywacz, 2000; Hogervorst & Eagle, 1998; Kontsevich & Tyler, 1999; Mamassian & Landy, 1998; Mamassian & Landy, 2001; Read, 2002).
This
is related to the generic viewpoint assumption that observers make while viewing
visual objects (Freeman, 1994; Nakayama & Shimojo, 1992).
Adams, W. J., Banks, M. S.,
& van Ee, R. (2001). Adaptation to three-dimensional distortions in human
vision. Nature Neuroscience,
4, 1063-1064. [PubMed]
Allison, R. S., &
Howard, I. P. (2000a). Temporal dependencies in resolving monocular and
binocular cue conflict in slant perception.
Vision Research,
40, 1869-1886. [PubMed]
Allison, R. S., &
Howard, I. P. (2000b). Stereopsis with persisting and dynamic textures.
Vision Research,
40, 3823-3827. [PubMed]
Ames, A. (1951). Visual
perception and the rotating trapezoidal window.
Psychological Monographs: General and
Applied, 65 (No. 324),
1-31.
Ascher, D., & Grzywacz,
N. M. (2000). A Bayesian model for the measurement of visual velocity.
Vision Research,
40, 3427-3434. [PubMed]
Banks, M. S., & Backus, B.
T. (1998). Extra-retinal and perspective cues cause the small range of the
induced effect. Vision Research,
38, 187-194. [PubMed]
Blake, A., &
Bülthoff, H. H. (1990). Does the brain know the physics of specular
reflection? Nature,
343, 165-169. [PubMed]
Braunstein, M. L.,
Andersen, G. J., Rouse, M. W., & Tittle, J. S. (1986). Recovering
viewer-centered depth from disparity, occlusion, and velocity gradients.
Perception & Psychophysics,
40, 216-224. [PubMed]
Buckley, D., & Frisby,
J. P. (1993). Interaction of stereo, texture and outline cues in the shape
perception of three-dimensional ridges. Vision
Research, 33, 919-933. [PubMed]
Bülthoff, H. H.
(1991). Shape from X: Psychophysics and computation. In M. S. Landy & J. A.
Movshon (Eds.), Computational models of visual
processing (pp. 305-330). Cambridge, MA: MIT Press.
Bülthoff, H. H., &
Mallot, H. A. (1988). Integration of depth modules: Stereo and shading.
Journal of the Optical Society of America
A, 5, 1749-1758. [PubMed]
Bülthoff, H. H.,
& Mallot, H. A. (1990). Integration of stereo, shading and texture. In A.
Blake & T. Troscianko (Eds.), AI and the
eye (pp. 119-146). New York: John Wiley & Sons.
Bülthoff, H. H.,
& Yuille, A. L. (1990). Shape from X: Psychophysics and computation.
SPIE Sensor fusion III: 3-D perception and
recognition, 1383,
235-246.
Bülthoff, H. H.,
& Yuille, A. L. (1991). Bayesian models for seeing surfaces and depth.
Comments on Theoretical Biology,
2, 283-314.
Clark, J. J., & Yuille, A.
L. (1990). Data fusion for sensory information
processing systems. Boston, MA: Kluwer.
Clark, W. C., Smith, A. H.,
& Rabe, A. (1955). Retinal gradient of outline as a stimulus for slant.
Canadian Journal of Psychology,
9, 247-253.
Clark, W. C., Smith, A. H.,
& Rabe, A. (1956). Retinal gradients of outline distortion and binocular
disparity as stimuli for slant. Canadian
Journal of Psychology, 10,
77-81.
Cumming, B., Johnston, E.,
& Parker, A. (1993). Effects of different texture cues on curved surfaces
viewed stereoscopically. Vision
Research, 33, 827-838. [PubMed]
Cutting, J. E., &
Millard, R. T. (1984). Three gradients and the perception of flat and curved
surfaces. Journal of Experimental Psychology:
General, 113, 198-216. [PubMed]
Dosher, B. A., Sperling, G.,
& Wurst, S. A. (1986). Tradeoffs between stereopsis and proximity luminance
covariance as determinants of perceived 3D structure.
Vision Research,
26, 973-990. [PubMed]
Flügel, J. C. (1913).
The influence of attention in illusions of reversible perspective.
British Journal of Psychology,
5, 357-397.
Freeman, R. B. (1966).
Absolute threshold for visual slant: The effect of stimulus size and retinal
perspective. Journal of Experimental
Psychology, 71, 170-176. [PubMed]
Freeman, W. T. (1994). The
generic viewpoint assumption in a framework for visual perception.
Nature,
368, 542-545. [PubMed]
Freeman, W. T. (1996). The
generic viewpoint assumption in a Bayesian framework. In D. C. Knill & W.
Richards (Eds.), Perception as Bayesian
inference (pp. 365-389). Cambridge, UK: Cambridge University Press.
Frisby, J. P., Buckley, D.,
& Freeman, J. (1996). Stereo and texture cue integration in the perception
of planar and curved large real surfaces. In T. Inui & J. L. McClelland
(Eds.), Attention and performance 16:
Information integration in perception and communication (pp. 71-91).
Cambridge, MA: MIT Press.
Frisby, J. P., Buckley, D.,
& Horsman, J. M. (1995). Integration of stereo, texture, and outline cues
during pinhole viewing of real ridge-shaped objects and stereograms of ridges.
Perception,
24, 181-198. [PubMed]
Frisby, J. P., & Mayhew,
J. E. (1979). Depth inversion in random-dot stereograms.
Perception,
8, 397-399. [PubMed]
Gillam, B. J. (1967). Changes
in the direction of induced aniseikonic slant as a function of distance.
Vision Research,
7, 777-783. [PubMed]
Gillam, B. J. (1968).
Perception of slant when perspective and stereopsis conflict: Experiments with
aniseikonic lenses. Journal of Experimental
Psychology, 78, 299-305. [PubMed]
Gillam, B. J. (1993).
Stereoscopic slant reversals: A new kind of 'induced' effect.
Perception,
22, 1025-1036. [PubMed]
Gillam, B. J., & Cook, M.
L. (2001). Perspective based on stereopsis and occlusion.
Psychological Science,
12, 424-429. [PubMed]
Gillam, B. J., & Ryan, C.
(1992). Perspective, orientation disparity, and anisotropy in stereoscopic slant
perception. Perception,
21, 427-439. [PubMed]
Gregory, R. (1970).
The intelligent eye. London, UK:
Weidenfeld and Nicolson.
Harris, J. P. (1980). How
does adaptation to disparity affect the perception of reversible figures?
American Journal of Psychology,
93, 445-457. [PubMed]
Harwerth, R. S.,
Möller, M. C., & Wensveen, J. M. (1998). Effects of cue context on the
perception of depth from combined disparity and perspective cues.
Optometry and Vision Science,
75, 433-444. [PubMed]
Hogervorst, M. A., &
Eagle, R. A. (1998). Biases in three-dimensional structure-from-motion arise from
noise in the early visual system. Proceedings
of the Royal Society of London. Series B:
Biological Sciences,
265, 1587-1593. [PubMed]
Howard, I. P. (1961). An
investigation of a satiation process in the reversible perspective of revolving
skeletal shapes. Quarterly Journal of
Experimental Psychology, 13,
19-33.
Howard, I. P., & Rogers,
B. J. (2002). Depth perception.
Toronto, Ontario, Canada: I. Porteous.
Johnston, E. B., Cumming,
B. G., & Parker, A. J. (1993). Integration of depth modules: Stereopsis and
texture. Vision Research,
33, 813-826. [PubMed]
Kersten, D., Bülthoff,
H. H., Schwartz, B. L., & Kurtz, K. J. (1992). Interaction between
transparency and SFM. Neural
Computation, 4, 573-589.
Knill, D. C., & Richards,
W. (1996). Perception as Bayesian
inference. Cambridge, UK: Cambridge University Press.
Kontsevich, L. L., &
Tyler, C. W. (1999). Bayesian adaptive estimation of psychometric slope and
threshold. Vision Research,
39, 2729-2737. [PubMed]
Mamassian, P., &
Landy, M. S. (1998). Observer biases in the 3D interpretation of line drawings.
Vision Research,
38, 2817-2832. [PubMed]
Mamassian, P., &
Landy, M. S. (2001). Interaction of visual prior constraints.
Vision Research,
41, 2653-2668. [PubMed]
McDougall, W. (1906).
Physiological factors of the attention process (IV).
Mind,
15, 329-359.
Nakayama, K., &
Shimojo, S. (1992). Experiencing and perceiving visual surfaces.
Science,
257, 1357-1363. [PubMed]
Papathomas, T. V. (2000).
See how they turn: False depth and motion in Hughes' reverspectives.
Human Vision and Electronic Imaging V,
Proceedings of SPIE, 3959,
506-517.
Papathomas, T. V. (2002).
Top-down and bottom-up processes in 3-D face perception: Psychophysics and
computational model. Perception,
31, 521-530. [PubMed]
Porrill, J., Frisby, J. P.,
Adams, W. J., & Buckley, D. (1999). Robust and optimal use of information in
stereo vision. Nature,
397, 63-66. [PubMed]
Read, J. C. A. (2002). A
Bayesian model of stereopsis depth and motion direction discrimination.
Biological Cybernetics,
86, 117-136. [PubMed]
Rogers, B. J., & Collett,
T. S. (1989). The appearance of surfaces specified by motion parallax and
binocular disparity. Quarterly Journal of
Experimental Psychology, 41A,
697-717.
Rouse, M. W., Tittle, J. S.,
& Braunstein, M. L. (1989). Stereoscopic depth perception by static
stereo-deficient observers in dynamic displays with constant and changing
disparities. Optometry and Vision
Science, 66, 355-362.
Ryan, C., & Gillam, B.
(1994). Cue conflict and stereoscopic surface slant about horizontal and
vertical axes. Perception,
23, 645-658. [PubMed]
Schriever, W. (1925).
Experimentelle Studien über stereoskopisches Sehen.
Zeitschrift für Psychologie und
Physiologie der Sinnesorgane,
96, 113-170.
Seagrim, G. N. (1967).
Stereoscopic vision and aniseikonic lenses. I.
British Journal of Psychology,
58, 337-350. [PubMed]
Smith, A. H. (1967). Perceived
slant as a function of stimulus contour and vertical dimension.
Perceptual and Motor Skills,
24, 167-173.
Stevens, K. A. (1981). The
information content of texture gradients.
Biological Cybernetics,
42, 95-105. [PubMed]
Stevens, K. A., &
Brookes, A. (1988). Integrating stereopsis with monocular interpretations of
planar surfaces. Vision Research,
28,
371-386. [PubMed]
Stevens, K. A., Lees, M.,
& Brookes, A. (1991). Combining binocular and monocular curvature features.
Perception,
20, 425-440. [PubMed]
Turner, J., Braunstein, M.
L., & Andersen, G. J. (1997). Relationship between binocular disparity and
motion parallax in surface detection.
Perception & Psychophysics,
59, 370-380. [PubMed]
van der Meer, H. C.
(1979). Interrelation of the effects of binocular disparity and perspective cues
on judgments of depth and height. Perception
& Psychophysics, 29,
481-488.
van Ee, R. (2001). Perceptual
learning without feedback and the stability of stereoscopic slant estimation.
Perception,
30, 95-114. [PubMed]
van Ee, R.,
Adams, W. J., & Mamassian, P. (2002). Bayesian modelling of perceived slant
in bi-stable stereoscopic perception. Manuscript submitted for
publication.
van Ee, R., &
Anderson, B. L. (2001). Motion direction, speed, and orientation in binocular
matching. Nature,
410, 690-694. [PubMed]
van Ee, R., Banks, M. S.,
& Backus, B. T. (1999). An analysis of binocular slant contrast.
Perception,
28, 1121-1145. [PubMed]
van Ee, R., & Erkelens, C.
J. (1996). Temporal aspects of binocular slant perception.
Vision Research,
36, 43-51. [PubMed]
van Ee, R., Hol,
K., & Erkelens, C. J. (2001). Bistable stereoscopic percepts and depth cue
combination. Perception,
30, S42.
van Ee, R., &
Richards, W. (2002). A planar and a volumetric test for stereoanomaly.
Perception,
31, 51-64. [PubMed]
Virsu, V. (1975).
Determination of perspective reversals.
Nature,
257, 786-787. [PubMed]
von Helmholtz, H.
(1866). Handbuch der Physiologischen Optik:
Vol. III, § 26. Hamburg, Germany: Voss.
Wade, N. J., & Hughes, P.
(1999). Fooling the eyes: Trompe l'oeil
and reverse perspective. Perception,
28, 1115-1119. [PubMed]
Wheatstone, C. (1838).
Contributions to the physiology of vision - Part the first; On some remarkable
and hitherto unobserved phenomena of binocular vision.
Philosophical Transactions of the Royal
Society of London, 128,
371-394.
Wheatstone, C. (1852).
The Bakerian Lecture: Contributions to the physiology of vision - Part the
second; On some remarkable and hitherto unobserved phenomena of binocular
vision. Philosophical Transactions of the
Royal Society of London, 142,
1-17.
Yellott, J. I., & Kaiwi,
J. L. (1979). Depth inversion despite stereopsis: The appearance of random-dot
stereograms on surfaces seen in reverse perspective.
Perception,
8, 135-142. [PubMed]
Youngs, W. M. (1976). The
influence of perspective and disparity cues on the perception of slant.
Vision Research,
16, 79-82. [PubMed]
Yuille, A. L., &
Bülthoff, H. H. (1996). Bayesian decision theory and psychophysics. In D.
C. Knill & W. Richards (Eds.), Perception
as Bayesian inference (pp. 123-161). Cambridge, UK: Cambridge University
Press.
Yuille, A. L., Geiger, D.,
& Bülthoff, H. H. (1991). Stereo integration mean field theory and
psychophysics. Network,
2, 423-442.