Volume 3, Number 8, Article 4, Pages 562-572
doi:10.1167/3.8.4
http://journalofvision.org/3/8/4/
ISSN 1534-7362
Occlusion cues resolve sudden onsets into morphing or line motion, disocclusion, and sudden materialization
Alex O. Holcombe
Department of Psychology, University of California, San Diego, La Jolla, CA, USA
Abstract
An abrupt appearance of a new stimulus, or sudden onset, has several possible perceptual interpretations. The change may reflect an object new to the scene or instead be caused by disocclusion of a pre-existing object. Alternatively, the sudden onset may be interpreted as the morphing of a pre-existing figure (as in “line motion”). Previous work has focused on the morphing percept to the exclusion of other interpretations of sudden onsets. This paper supports the idea that morphing, and the other interpretations of sudden onsets, reflect occlusion cues indicating the most likely cause of the stimulus. Consider a line segment that appears abruptly. The data herein show that when the segment has already been represented as present in the scene (via amodal completion), its onset is likely to be perceived as a disocclusion event, with no appearance of morphing. Even when individual frames do not support amodal completion, dynamic (although motionless) cues can favor the disocclusion interpretation, again vetoing the perception of line motion. Some final demonstrations address sudden materialization, in which previously unseen objects suddenly appear. Again there is ambiguity in that sudden materialization and disocclusion can be caused by image changes that are locally identical. Remote cues to occlusion are shown to give these stimuli distinct appearances. The existence of these ambiguities, and the role of occlusion cues in resolving them, has implications for theories of motion perception and attentional capture.
History
Received January 26, 2003; published October 2, 2003
Citation
Holcombe, A. O. (2003). Occlusion cues resolve sudden onsets into morphing or line motion, disocclusion, and sudden materialization.
Journal of Vision, 3(8):4, 562-572,
http://journalofvision.org/3/8/4/,
doi:10.1167/3.8.4.
Keywords
occlusion, sudden onset, attentional capture, line motion, depth, transformational apparent motion, prior entry
A stimulus on a computer screen that becomes visible
instantly rather than gradually is said to have a “sudden onset”. A
sudden onset is sometimes interpreted perceptually as an outgrowth or morphing
of a pre-existing object. This phenomenon, perhaps first noted by Kanizsa (1951), was termed "line motion" by
Hikosaka, Miyauchi, and Shimojo (1993).
The broader terms "transformational apparent motion" and "morphing" have been
introduced to include the experience of motion in the change of other figures in
addition to lines (Tse, Cavanagh, & Nakayama
1998; Baloch & Grossberg 1997).
“Morphing” will be used to refer to all these phenomena in the
present paper.
In the study of perception, showing that visual
mechanisms determine the most likely interpretation of the retinal stimulus is a
venerable tradition (von Helmholtz
1867). For example, stereovision research elucidates how the perceptual
interpretation of retinal disparities matches the likely corresponding 3-D scenes
(e.g., Gillam, Blackburn, & Nakayama
1999).
This, however, has not been the predominant tradition
for research into the phenomenon of morphing motion. Instead, morphing motion
has been explained by a putative effect of attention on perceptual latency ( Hikosaka, Miyauchi, & Shimojo 1993; Faubert & von Grunau 1995), as the
result of brain oscillations ( Holt-Hansen,
1970), or by the responses of elementary motion detectors ( Zanker 1994). For example, the attentional
theory posits that attention reduces the perceptual latency of the attended end
of the sudden onset. Subsequently, this would be expected to cause motion
detectors selective for motion originating at the attentional locus to respond,
yielding the percept of motion.
More recently, Tse,
Cavanagh, & Nakayama (1998) instead advocated the
most-likely-interpretation explanation of morphing motion, and Baloch & Grossberg (1997) presented a
model which instantiates aspects of this idea. The argument is that segmentation
mechanisms designed to determine the objects and their changes in the scene
produce the perception of morphing motion, and that this segmentation operates
on a space-time representation rather than operating on each frame separately.
The evidence is a qualitative correspondence between percepts and what is
intuitively expected of a rational segmentation mechanism. The purpose of this
paper is to validate this correspondence with additional qualitative evidence
and extend it to other perceptual interpretations of sudden onsets. Note,
however, that this framework is not incompatible with an additional effect of
attention by priming or by reducing perceptual latency.
In their examination of morphing, Tse et al. (1998) explored the effects of
figural continuity between stimuli in two-frame displays. Building on Faubert & von Grunau (1995), Tse et al. (1998) noted that when a stimulus
appears suddenly and is spatially continuous with a pre-existing stimulus, it
appears to grow out of the static stimulus. When the sudden onset is adjacent to
two static stimuli rather than just one, the sudden onset appears to grow out of
the static stimulus that is more smoothly continuous with it, even when the
responses of simple motion detectors would favor the other direction (see Figure
8.7 in Tse et al. 1998). The putative effect
of attention on perceptual latency cannot explain these results. Instead, in
these cases the visual system seems to use segmentation cues to determine the
origin of the sudden-onset stimulus.
Indeed, in a subsequent paper Tse & Logothetis (2002) found that the
direction of morphing motion reflects correspondence of textures and colors in
addition to luminance-defined continuity. In addition, their results suggested
that the perception of morphing motion reflects a full 3-D, cue-invariant
representation of the stimulus. This evidence provided some validation for the
proposal of Tse et al. (1998) that
the full sophistication of human image segmentation machinery is brought to bear
to determine the motion perceived.
Tse et al (1998)
suggest that morphing motion is a manifestation of the same parsing processes
that yield the phenomenon of spatial amodal completion — the perception
that an object part behind an occluder joins together a partially occluded
object. An intriguing prediction of this idea is that the same image cues that
affect spatial amodal completion will also affect morphing motion. This paper
presents strong evidence for this. Indeed, the perception of morphing reflects
amodal completion cues in a way consistent with a process designed to determine
the most likely interpretation of the stimulus. Further demonstrations show that
another stimulus ambiguity – whether sudden onsets are caused by
disocclusion or by sudden materialization – is also resolved by cues to
occlusion. Hence the interpretation of image cues to occlusion explains not only
the perception of sudden onsets as morphing motion, but also the perception of
onsets as disocclusion and materialization.
Previous, Poor Evidence that Amodal Completion can Affect the Perception of a Sudden Onset
This section argues that previous work provides only
weak support for the hypothesis that amodal completion can affect whether
morphing motion is perceived. In contrast, experiments reported in this paper
will provide strong support for the hypothesis.
In a classic display of morphing motion, a filled
rectangle suddenly appears, and is coterminous with an adjacent stable shape.
Typically one perceives the rectangle to shoot out from the stable shape as if
it were an extension of it. To explain the claim that the matching process in
morphing motion can operate on an amodally completed representation, Tse et al offered the stimuli in Figure 1 (adapted from their Fig 8.5a).
The right half of the display is similar to previous line motion displays by Hikosaka et al (1993). Thus it is
unsurprising that the right line segment is perceived to shoot to the left. But
Tse et al reported that the leftmost segment
of the grey line in Figure 1 is also
perceived to shoot to the left. To them, this indicated a role for the perceived
continuity via amodal completion of the two line segments. Tse et al's (1998) idea is that morphing motion
mechanisms can operate on an amodally completed representation. However, the
evidence in this case is quite weak. If the part of the stimulus which
facilitates the amodal completion (the right half of Figure 1) is deleted, one would still
expect to perceive the remaining line segment as shooting to the left, for the
reason described below.
Figure 1. According to Tse et al.
(1998), the grey rectangles are perceived to shoot to the left, which they
considered evidence that morphing motion reflects an amodally-completed
representation.
The expectation that this percept might occur even
without amodal completion is based on the strong similarity of the display to those
of Figure 2 (adapted from Tse et al's Figures 8.2
and 8.4). In each case in Figure 2, Tse et al report that the line segment is
perceived to shoot to the left. For these displays, the authors posit that the
contiguity between the line segment and the shape on its right results in
interpretation of the line segment as an extension of the shape on the right,
yielding the perception of shooting to the left. However, these demonstrations
undermine their claim that amodal completion causes the perception of motion in
Figure 1, as the continuity factor
documented in Figure 2 could be used to explain
the morphing motion of both line segments of Figure 1, without reference to amodal
completion. Indeed, Tse et al provide
additional examples to show that good continuity alone can determine the line
motion percept (their Figure 8.3). Whether this continuity effect has anything
to do with amodal completion remains an open question.
Figure 2. A display adapted from Tse
et al. (1998). The arrows in the static version indicate the percept and
were not present in the experimental display. Reportedly, observers perceived
morphing motion to the left in each case.
Inspired by a preliminary report of Tse et al's work, Baloch & Grossberg (1997) also claimed
that amodal completion plays a role in morphing motion. As evidence, they
pointed to the display depicted by Figure 3
and reported that the rectangular red ring is perceived to shoot to the
right, whereas the green line is perceived to shoot to the left.
Figure 3. A reproduction of Figure 10b from Baloch & Grossberg (1997). On some
display devices there may appear to be a small gap between the red and green
shape, but the experimental display contained no gap, only crisp
T-junctions.
Baloch &
Grossberg (1997) suggested that perception of the green line as shooting to
the left implies an effect of amodal completion on morphing motion. But consider
what the simple continuity factor would predict, irrespective of the occurrence
of amodal completion. The physical continuity of the leftmost green line segment
with the red rectangle on the first frame might cause it to be perceived as
shooting to the left. In contrast, the continuity of the central green line
segment with frame 1's red rectangle might cause the central segment to be
perceived as shooting to the right. Baloch
& Grossberg (1997) report that instead, both green line segments are
perceived to shoot to the left. This is consistent with their hypothesis of a
role for amodal completion. However, they do not provide control displays to
ensure that the leftward motion does not result from another factor. This is a
concern because the simple continuity factor predicts the presence of leftward
motion in one part of the green line, and in this complex display it is possible
that another factor might cause it to win out over the rightward motion also
predicted by contiguity. For example, Steinman et al. (1995) provided evidence
that line motion cues cause a center-surround opponent effect, which might favor
perceiving the green line to move in the direction opposite to that of the red ring.
Further reason for uncertainty came from the present
author's subjective experience, in which the green line seems to appear all at
once without any shooting sensation. It is unclear whether this informal
observation constitutes inferior data to that of Baloch & Grossberg, since they did not
report data in their paper nor say what sort of experiment they conducted with
the display.
In the following experiment, a modified version of this
display as well as some novel control displays were used to provide a better
test of the possibility that amodal completion is a factor in the perception of
morphing motion.
New Evidence for a Role for Amodal Completion: Modification of Baloch & Grossberg
The Baloch &
Grossberg (1997) display was modified (one of the breaks in the green bar
was eliminated) to create Figure 4A. The
modification helped to clarify the question asked of the observers, by avoiding
a potential need to distinguish between differing percepts in different parts of
the green bar. If most observers perceive leftward motion of the green bar in A,
then the data from the control displays (see Figure 4B, C, and D) should reveal whether this is
due to amodal completion of the flashing green bar with the static green bar.
Figure 4. Observers were more likely to perceive the green line as
shooting to the left in the critical display A than in control displays B, C,
and D. Apparently, the perceived continuity of the green segment with the
preexisting segment via amodal completion increases the likelihood of perceiving
motion. However, the movie version of this figure may result in a different
experience than in the experiment, for various reasons, including perceptual
interactions among animations simultaneously presented. Also, some display
devices may introduce thin lines between the red and green shapes. These were
not present in the experiment.
A total of 31 subjects participated. All reported
normal or corrected-to-normal vision. Of these, 28 were completely naïve
undergraduate and graduate students and 3 were experienced psychophysical
observers.
To familiarize the observers with the morphing motion
percept, each person was provided with a few examples. In one of these fairly
unambiguous example displays, a rectangle flashed on and off next to a smoothly
contiguous rectangle, yielding a morphing percept. In a second example, the
flashing rectangle was contiguous with stable rectangles on both sides. Here,
all observers reported experiencing motion from both sides at once, towards the
center. As an example of nonmotion, observers were shown a flashing rectangle
which was flanked by shapes contiguous with the flashing shape. But in this case
the point of contact between the shapes was tiny and as expected, no observers
reported perceiving motion.
After becoming familiar with morphing motion, each
observer was shown the displays in pseudorandom order. After each display, they
were asked to report whether they perceived any motion of the red and the green
figures and, if they did perceive motion, the direction of motion perceived for
each. The first 11 observers viewed only displays 4A, 4B, and 4C, as 4D was
added after it was suggested by a reviewer.
The theory that morphing motion is adaptively affected
by cues to amodal completion made specific predictions for the displays of this
experiment. Specifically, when cues to amodal completion indicate that the
suddenly-appearing green line is an extension of the formerly present green
line, observers should experience motion to the left. Indeed this was the claim
made by Baloch & Grossberg (1997),
although they did not report any data.
Contrary to what would be expected from Baloch & Grossberg (1997), in the present
experiment the majority of observers (52%) did not perceive the green bar to
shoot to the left in the display depicted in Figure
4A. Nevertheless, a closer look at the data does indicate some role for
amodal completion. Specifically, compare the number of observers who perceived
leftward motion in 4A with the number of observers who perceived leftward motion
in the control displays ( Figure 4B, C, and
D).
In the case of the display of Figure 4A, 48% of the 31 observers reported that
the green part shot to the left when it appeared, 35% reported no motion, and
16% reported that it shot to the right.
In the control display of Figure 4B, zero observers reported that the green
shot to the left, for 4C only 13% did, and for the display of 4D, only 11% did.
Hence significantly more observers perceived leftward green motion in 4A than in any of
the control displays (pooled test t(60)=5.30, p<0.0001; t(60)=3.23, p=0.002;
and t(60)=4.13, p=0.0001, respectively). The complete
results are provided in Table 1.
Table 1. Results for Displays Depicted in Figure 4.
| Percept | A: Green | A: Red | B: Green | C: Green | C: Red | D: Green | D: Red |
| Left    | 48%      | 3%     | 0%       | 13%      | 0%     | 11%      | 0%     |
| Right   | 35%      | 29%    | 100%     | 61%      | 39%    | 50%      | 37%    |
| None    | 16%      | 68%    | 0%       | 26%      | 61%    | 30%      | 63%    |
The incidences of the perception of leftward,
rightward, or no motion for the green and red shapes of the critical display A
and the three control displays, B, C, and D of Figure 4. The incidences of the most critical
percept, leftward motion of the green shape, are in the shaded cells.
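The pooled comparisons reported above can be checked with a short calculation. The sketch below is a minimal illustration, not the author's analysis script. It assumes that the test was a pooled-variance two-sample t test on binary responses (1 = leftward green motion reported, 0 = otherwise), and it assumes observer counts of 15, 0, and 4 out of 31 for displays A, B, and C, inferred from the reported 48%, 0%, and 13%. The function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(k1, n1, k2, n2):
    """Pooled-variance two-sample t statistic for binary (0/1) responses.

    k1, k2: number of observers reporting the percept in each group.
    n1, n2: number of observers in each group.
    Returns (t, degrees_of_freedom).
    """
    m1, m2 = k1 / n1, k2 / n2
    # Sum of squared deviations of 0/1 data around the group mean: k(n - k)/n.
    ss1 = k1 * (n1 - k1) / n1
    ss2 = k2 * (n2 - k2) / n2
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Assumed counts (inferred from the reported percentages of 31 observers).
t_ab, df_ab = pooled_t(15, 31, 0, 31)   # display A vs. control B
t_ac, df_ac = pooled_t(15, 31, 4, 31)   # display A vs. control C
print(f"A vs B: t({df_ab}) = {t_ab:.2f}")
print(f"A vs C: t({df_ac}) = {t_ac:.2f}")
```

Under these assumptions the two statistics round to the reported t(60)=5.30 and t(60)=3.23.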
This difference between the experimental and the
control displays rules out several explanations of the perceived leftward motion
of the green shape. That observers perceived the green shape of display B to
shoot to the right instead of to the left implies that the presence of the
static green rectangle on the right is not sufficient to cause leftward motion.
The low incidence of leftward motion of the green shape in display C shows that
the red figures of 4A also are not responsible for the prevalence of leftward
motion in 4A. Finally, the low rate of leftward green motion in display D
suggests that the mere presence of the red ring and the green bar, in
combination, also cannot explain the leftward motion in A.
The factor responsible for the perception of leftward
motion of the green shape in 4A, then, is the combination of the presence of the
green rectangle on the right with the red ring in an occluding relationship.
This indicates that the green line was perceived to shoot leftward because of
its continuity with the green line on the right via an amodal representation. We
can conclude that the continuity factor, which causes a shape to seem to shoot
from one side, can be caused by continuity with amodal representations as well
as visible representations.
Although 48% of observers perceived the predicted green
leftward motion in display A, which was significantly more than in the control
displays, this still means that fewer than 50% of the observers
experienced the predicted percept. If continuity with an amodal representation
is sufficient to cause morphing motion, then why did only a minority of
observers perceive the morphing motion in Figure
4A? The motion of the nearby red shape, and other factors, may compete with
amodal completion in determining the final percept. Other possible reasons for
variability with these types of displays are discussed in the conclusions of the
next section. In any case, since the evidence of this first experiment was not
definitive, the next experiment used a different approach, which yielded
stronger evidence for a role for amodal completion.
New Evidence for a Role for Amodal Completion: Novel Displays
The previous experiment provided tentative evidence for
the idea that morphing motion is adaptively affected by image cues to amodal
completion. Specifically, when a sudden onset appeared attached to a preexisting
object through amodal completion, for many observers it appeared in motion,
morphing from the side of the preexisting object.
But what if spatial amodal completion cues specified
that a suddenly appearing figure had already been present, although occluded? If
interpretation of sudden onsets intelligently incorporates amodal completion,
the visual system should report that the sudden onset stimulus appeared by
virtue of disocclusion rather than by morphing motion. Does our visual system do
this? In the following experiment, we test this with pairs of displays that
differ in whether cues to amodal completion are present.
If it turns out that morphing motion is indeed vetoed
by cues to occlusion, a further question is whether the system makes this
determination on the basis of individual frames, or whether it also integrates
information from other frames. For continuous nonmorphing motion, it has already
been shown that subsequent information is used to decide whether occlusion
exists in a previous frame. For example, in the displays of Cicerone et al. (1995), an invisible,
moving occluder is perceived when the movie is viewed, even though it is not
perceived from individual static frames. Using some additional displays, we
tested whether this would occur for two-frame morphing motion displays. But, to
start, the question was whether regions which appear suddenly would not be
perceived to morph if they were previously represented amodally.
Twenty subjects who reported normal or corrected-to-normal
vision participated. Three were experienced psychophysical observers associated
with the laboratory, and the remainder were undergraduates who participated for
course credit or for pay. All were naive to the purposes of the
experiment.
Subjects viewed the stimuli through a mirror
stereoscope which directed light from the left half of the CRT to the left eye
and from the right half to the right eye, along a path from eye to screen of
about 58 cm. Two copies of the stimulus, one for each eye, were displayed on the
screen with an appropriate difference between the two copies when binocular
disparity was desired.
Observers were screened for stereovision in a short
test. First, observers reported the relative depth of several static targets
which had relative disparity of ~0.2 deg. Subsequently they were tested on more
targets for a total of 16 tests. These subsequent targets were presented only
briefly, for ~250 msec (van Ee & Richards, 2002). Five of the 20
observers failed the screen by reporting the depth of more than 4 of the 16
targets incorrectly. Data from these observers were discarded.
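To give a sense of how strict this criterion is, the sketch below computes the probability that an observer responding at random would pass. It assumes, purely for illustration, that each depth report was a binary near/far judgment with a 50% guessing rate; the text does not state the response format. The names `pass_probability`, `N_TARGETS`, and `MAX_ERRORS` are hypothetical.

```python
import math

N_TARGETS = 16   # total screening targets (from the text)
MAX_ERRORS = 4   # more than 4 errors fails the screen (from the text)

def pass_probability(p_error, n=N_TARGETS, max_errors=MAX_ERRORS):
    """Binomial probability of making at most max_errors errors in n trials."""
    return sum(
        math.comb(n, k) * p_error**k * (1 - p_error)**(n - k)
        for k in range(max_errors + 1)
    )

# Assumption: a guessing observer errs on half of the binary judgments.
p_guess = pass_probability(0.5)
print(f"P(pass by guessing) = {p_guess:.4f}")
```

Under that assumption, a guessing observer would pass the screen less than 4% of the time.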
After the stereovision screening procedure, observers
were familiarized with morphing motion using the same procedure as that used in
the previous experiment of this paper. Then each subject viewed in pseudorandom
order a total of 18 displays, counting the multiple display speeds and the
variations of stereodisparity (in front vs. behind) that were appropriate for a
few stimuli. Each observer was asked to report whether the relevant dynamic part
of the stimulus appeared to be in motion ("shooting") or not. The two frames of
each display were shown in alternation until the observer responded.
The displays most critical to the main hypothesis are
schematized in Figure 5. The red shape of frame
2 was 8.5 deg wide and 2.42 deg high, and each of the red shapes was textured to
include disparity signals from the shape’s interior.
Figure 5. The numbers
indicate the percentage of the 15 observers who experienced morphing motion for
displays A and B at the slowest alternation rate. Observers were more likely to
experience motion when disparity indicated that the red was closer than the
green than when the red was farther than the green. The implication is that when
the red flankers are joined by an amodal representation, the sudden appearance
of the central portion is interpreted as disocclusion rather than as morphing
motion.
Observers viewed each of the two displays of Figure 5 at three different speeds: 360, 260, and
175 msec per frame. The displays were presented using Macromedia Flash MX™,
which does not provide
precise timing. Hence the displays were not synchronized with the screen refresh
and presentation rate could vary by up to 70 msec per frame.
For each rate, observers viewed one version of each
display in which the relative binocular disparity indicated that the red shapes
were in front of the green, and another version in which the red shapes were
behind the green, by 0.42 deg of disparity.
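For readers who wish to relate the angular sizes above to physical extents on the screen, the sketch below converts visual angle to on-screen extent at the stated optical path of about 58 cm. It is an illustrative calculation only. It assumes a flat screen viewed at its centre and treats 58 cm as exact, and the function name `angle_to_extent_cm` is hypothetical.

```python
import math

VIEWING_DISTANCE_CM = 58.0  # eye-to-screen path length ("about 58 cm" in the text)

def angle_to_extent_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """On-screen extent (cm) subtending angle_deg, centred on the line of sight."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

width_cm = angle_to_extent_cm(8.5)       # width of the red shape in frame 2
height_cm = angle_to_extent_cm(2.42)     # height of the red shape in frame 2
disparity_cm = angle_to_extent_cm(0.42)  # offset corresponding to 0.42 deg of disparity

for label, value in [("width", width_cm), ("height", height_cm),
                     ("disparity", disparity_cm)]:
    print(f"{label}: {value:.2f} cm")
```

At this distance one degree corresponds to roughly one centimetre on the screen, so the values in centimetres stay close to the values in degrees.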
To determine whether other cues to occlusion could also
play a role, the displays of Figure 6
were presented. The displays of Figure
6A, B, C, and D were shown at a rate of 635 msec/frame and those of 6E and
6F were presented at 450 msec/frame. These durations provided enough time for
form analysis to exert its full effects (according to Tse & Logothetis,
2002).
Figure 6. The numbers indicate the percentage of the 15 observers
who reported motion for each of 6 displays. Fewer perceived motion when
pictorial cues favored a preexisting amodal representation of the
suddenly-appearing figure than when such cues were not present. This was true
for the comparison of A to control displays B, C, and D, and also for E compared
to control display F.
The results show that when a suddenly appearing
stimulus was already represented amodally, observers were less likely to
experience motion. As tabulated in Figure 5,
observers were significantly less likely to experience motion when the red
flankers were behind the green shape than when they were closer than the green
shape. An ANOVA with subject, depth of the red flankers, display variant, and
speed of the alternation as factors indicates that the depth of the flankers has
a significant effect, F(1,14)=28.7, p<.0001, whereas none of the
interactions is significant. There is also a main effect of display variant,
with subjects significantly less likely to experience line motion in variant B,
F(1,14)=5.7, p=.031. This was expected. The
irregular contour of variant B was designed to make the stimulus more amenable
to interpretation as green occluding the red. Specifically, in B it is easier to
perceive the intersection of the red and green shapes as being intrinsic to the
green shape rather than caused by the red occluding the green. A pilot
experiment had prompted concern that observers would interpret variant A as the
red shape occluding the green even when the stereo disparity specified
otherwise, due to the strength of the T-junction as an occlusion cue. Display B
was devised to counter this possible tendency towards a ceiling effect.
Presentation rate was also varied in case any
particular speed significantly favored one interpretation, but no significant
effect obtained. Overall, we have a clear result indicating that amodal
completion can determine whether morphing motion is perceived. The sudden
appearance of a figure that was already represented, albeit previously
invisible, was not experienced as motion.
The displays of Figure 6 were included in the experiment for two purposes. The first was to
determine whether displays even without stereo disparity could nonetheless
change from morphing motion to disocclusion on the basis of a viable occlusion
interpretation. The second was to determine whether sudden onsets could be interpreted
as disocclusion even without an amodal representation prompted by the static
display. This possibility arose because in the displays of Figure 6A and Figure 6E, the individual frames were not
sufficient to induce the perception of occlusion.
The first frame of 6A is typically perceived as a red
rectangle adjacent to a green rectangle, and in 6E the blue rectangle is
perceived as flanked by a coplanar reddish shape and orange shape. However, the
second frame of these displays suggests that, in the case of 6A, the red shape
was but the visible portion of a larger bar, and in the case of 6E, that the
orange and reddish shapes were in fact part of one shape behind a blue
rectangle. This constitutes a dynamic cue to occlusion, as it requires the
comparison of two frames. The question was whether this dynamic occlusion cue
will reduce, compared to control displays, the perception of morphing motion of
the red rectangle in 6A and of the orange/red shapes in 6E.
Indeed the data indicate that the dynamic occlusion cue
was effective. The probability of perceiving morphing motion was much lower in
the display of Figure 6A than in the
control displays of Figure 6B (t(28)=-4.91, p<.0001)
and 6C (t(28)=-3.35, p=.0012). This result was predicted, as
the display of Figure 6A is more
consistent with disocclusion of the extension of the red than are displays 6B
and 6C. The cue to disocclusion is the coincident disappearance of the green
shape when the extension of the red appeared. Still, a critic might attribute
the difference in number of subjects reporting motion to some general disruption
of motion signals caused by the transient of the green shape's disappearance.
The result with control display 6D counters that criticism. In display 6D the
green rectangle disappeared just as it did in 6A, but the display in 6D was less
consistent with the red being amodally completed behind the green rectangle, and indeed motion was
more likely to be perceived (t(28)=-1.89, p=.035).
Comparison of display E with control display F further
supports the hypothesis that a motionless dynamic pictorial occlusion cue can
cause the perception of disocclusion instead of morphing motion. When the first
frame of display E is viewed alone, typically three aligned shapes are perceived
rather than any occluding relationships. This is consistent with an
interpretation of morphing motion when the next frame appears — the orange
shapes morph together, closing like curtains over the blue rectangle. However,
only 47% of observers perceived motion in this display. In this experiment
observers did not report the nature of the occlusion perceived, so we cannot be
sure of their interpretation, but in an unpublished experiment all those who
reported motion described it as the curtain-closing percept. In the present
experiment, instead of perceiving motion in display E, most reported that the
blue shape appeared to dematerialize with the appearance of the second frame,
revealing the center of the longer shape that previously had been occluded.
Hence disocclusion was perceived despite the lack of any clue to the existence
of the hidden material in the first frame. This suggests that the dynamic
covering and uncovering itself can sometimes result in a disocclusion percept,
effectively vetoing a morphing motion percept.
Once again, in display E, approximately half of
observers perceived dematerialization of the blue, revealing an orange shape
behind. None of the observers ever reported this in the case of display F, even
though the center strip of display F was identical to display E. Instead, in
display F observers reported morphing motion (100% vs. 47% of display E,
t(28)=-4.0, p=.0002). The addition of the
contiguous blue context to E, creating display F, resulted in interpretation of
blue as the background. With the orange pattern seen in front, then, appearance
of the orange could not be attributed to disocclusion and thus morphing motion
of the orange was perceived.
Together these displays show that the percept of
occlusion vs. dematerialization can occur even when no single frame yields
amodal completion. Interestingly, the occlusion cues of these displays
are dynamic but do not involve motion, despite the emphasis in the literature on
motion as the dynamic occlusion cue.
A final point is that none of the observers perceived
the blue shape of display 6E to transform or morph into the visible material of
the second display. When one stimulus replaces another, how does the system
decide whether to consider it a transformation of the old object instead of a
new object? In the present case, the decision clearly has something to do with
whether an occlusion relationship is plausible, but this question remains
mostly unexplored.
Although in display F the prediction that morphing
would be perceived was fulfilled without exception, most of the manipulations
resulted in more modest increases in the likelihood of the predicted outcome.
One reason is that the cues manipulated in the displays did not always result in
a change in the perceived depth. For example, in the case of the displays in
which stereo disparity was manipulated (Figure 5), a preliminary experiment
showed that despite stereo disparity's reputation as a potent depth cue, for
many naïve subjects it did not overcome the interpretation they favored when
the stimulus had no stereo disparity. Such dominance by non-stereo cues is not
unprecedented (for a review, see Howard & Rogers, 2002).
The other displays were also sometimes seen in ways
unanticipated by the author. Furthermore, it seemed that attention could be used
to perceive motion where at first glance motion was not perceived. This is
consistent with previous results that, at least with some stimuli, attention can
determine the motion perceived (Verstraten, Cavanagh, & Labianca,
2000; Cavanagh, 1992). Conflicts
between motion energy and ecological cues in some displays probably were another
contributor to the variability of the results. Because of this variability, and
especially because attention allows observer expectations to influence the
results, collecting data from naïve subjects is quite important. In some
previous reports in the literature, the methods were unfortunately not
described, making the number of naïve observers used unknown.
(Dis)occlusion vs. (De)materialization
The previous experiments suggest that the perception of
morphing reflects segmentation mechanisms designed to determine the cause of a
dynamic visual stimulus. Cues to amodal completion were shown to adjudicate
between an interpretation of morphing and other interpretations.
In the remainder of this paper it is shown that sudden
onsets contain yet another ambiguity: between disocclusion and sudden
materialization. Further, a demonstration shows that the ambiguity can be
resolved by image cues to occlusion. A larger point is that the ecological cue
account of morphing motion not only explains many aspects of morphing motion,
but also explains other interpretations of sudden onsets. By contrast, other
theories of morphing motion do not have this generality.
A sudden onset of one stimulus also results in a sudden
offset of the stimulus that was on the screen immediately before, be it a blank
background or a figural pattern. An attendant ambiguity (Gibson, 1966) is that the new stimulus could
be a new object or it could instead have always been present in the scene and
appeared simply by disocclusion. When a lone stimulus suddenly disappears from
the screen, the new stimulus (a portion of the background) is interpreted as
appearing due to disocclusion rather than sudden materialization. The
interpretations of disocclusion and sudden materialization can be quite distinct
in one's phenomenology. On the left side of Figure 7, the bright central bars seem to
appear by virtue of sudden materialization, whereas on the right side they seem
to appear due to disocclusion.
Figure 7. In this movie, the appearance of the central grey areas
(outlined in red in the static figure) typically has distinct perceptual
interpretations: a sense of disocclusion on the left and sudden materialization
on the right.
But does the differing experience of the two parts of
the display depicted in Figure 7 show
that disocclusion and sudden materialization always correspond to distinct
percepts? A skeptic might argue that the predominant difference between the two
parts of the display is a difference in salience, with the appearance of the
background being much less salient than the disappearance of the figure. The
salience difference is unsurprising given the difference in contrast from the
surround: the background portion does not differ from the surround, whereas the
smaller stimulus is very different from its surround. Higher surround contrast
results in higher salience (Nothdurft, 1993). Would two-frame materialization
and disocclusion look different even if contrast with the surround were
equated, or is the perceptual difference in Figure 7 caused by nothing more
than the contrast difference?
J.J. Gibson (1966) believed that there is more to the perceptual difference than a
salience difference caused by contrast. He supported this with displays of
temporally extended events, in which a gradual deletion cue can signal
occlusion. However, he apparently did not control for local surround contrast.
The display schematized in Figure 8
shows that even without a difference in local contrast, materialization and
disocclusion look quite different.
Figure 8. In this movie, an illusory square is seen to occlude the
red square in the occlusion display. When the red rectangle disappears, it seems
to continue to exist behind the illusory square. By contrast, in the animation
on the right the red rectangle appears to vanish from the scene, as if it
dematerialized. Binocular disparity strengthens the effect but is not necessary
for some observers.
The phenomenology of the appearance of the red square
differs between the two displays of Figure 8. In
the left-hand display of Figure 8, dematerialization of the illusory square,
without perceived motion, leads to the
perception of disocclusion of the red square.
In both displays, the local image changes around the
red square are identical. Nevertheless, the remote spatial context of frame 1
(which determines whether an illusory contour is created) causes the phenomenal
appearance of the red square to be quite different in the two cases. In the
disocclusion display, the notched circles yield the sense that a large grey
square is present in frame 1. The filling-in of the circles in frame 2 causes
the visual system to interpret the large illusory square as dematerializing,
revealing by disocclusion the smaller red square. In the movie, stereo disparity
is added to the large illusory square to increase the evidence for this
perceptual interpretation.
In "materialization", the addition of arcs to complete
the blue circles, eliminating the illusory contour, plus the reversed stereo
disparity cause the large square to be perceived behind the red square instead
of in front.
These demonstrations constitute another case where an
ecological cue causes sudden onsets to be interpreted differently. Thus,
although both disocclusion and materialization can correspond to the sudden
appearance of previously unseen, identical objects with identical surrounds,
with appropriate cues such events still appear phenomenally distinct.
In this case, the animation embodies a degenerate,
reduced form of the classic accretion and deletion cue to occlusion. Sigman & Rock (1974) showed that this cue
can also adjudicate between perception of a sudden onset as a disoccluded object
versus a moving object. But unlike previous investigations of this cue, in the
present case there is no motion perceived in the display.
This paper has documented the influence of occlusion
cues on several perceptual interpretations of sudden onsets. First, strong
experimental support was provided for the idea of Tse et al. (1998) that parsing and segmentation
cues for amodal completion can determine the perceptual interpretation of
morphing motion displays. Second, this idea was extended to the interpretations
of sudden materialization and disocclusion. It was shown that these last two
interpretations yield a distinction in observers' phenomenology, above and
beyond differences in salience from local contrast. That the same occlusion cues
can arbitrate between these varied interpretations hints at a common mechanism
underlying the different interpretation of sudden onsets. The morphing or line
motion interpretation is but one outcome of this process.
Previous literature has concentrated on the single
distinction of the perception of motion vs. non-motion, and moreover has not
fractionated non-motion into materialization and disocclusion. This additional
distinction provides a challenge to models of the perception of dynamic scenes.
After an identified object is no longer at a particular location, computational
models typically only attempt to determine whether there is subsequently a
corresponding match in another location (which would indicate motion) (Jojic & Frey, 2001). If there is no
correspondence match, models ought to also determine whether the object has
simply disappeared or is instead still present but occluded.
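The extra decision step suggested here can be sketched as a toy rule. This is a minimal illustration under stated assumptions, not the model of Jojic & Frey (2001) or any published algorithm: the `Blob` type, the `classify_disappearance` function, and the boolean `occluder_at_old_position` (standing in for whatever occlusion evidence a real model would extract, such as an illusory contour) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Blob:
    label: str                  # hypothetical identity feature, e.g. "red_square"
    position: Tuple[int, int]   # hypothetical image coordinates


def classify_disappearance(old: Blob,
                           next_frame: List[Blob],
                           occluder_at_old_position: bool) -> str:
    """Label what happened to `old` between one frame and the next."""
    # Same object at the same place: nothing happened.
    if any(b.label == old.label and b.position == old.position for b in next_frame):
        return "still visible"
    # The usual step: a correspondence match elsewhere indicates motion.
    if any(b.label == old.label for b in next_frame):
        return "motion"
    # The additional step: no match, so choose between the two non-motion
    # interpretations using the (assumed) occlusion evidence.
    if occluder_at_old_position:
        return "occluded"
    return "dematerialized"
```

For example, `classify_disappearance(Blob("red_square", (5, 5)), [], True)` yields `"occluded"`, whereas the same call with `False` yields `"dematerialized"`, even though the local image change is identical in the two cases.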
Another challenge to models arises because previous
studies of dynamic occlusion have focused on motion, usually progressive
appearance and disappearance, as a critical cue. Sigman & Rock (1974) already showed that
progressive change is not needed. The displays here show that motion itself is
also not necessary to perceive dynamic occlusion.
Given the infrequency of sudden materialization and
dematerialization in the natural world, one may wonder how the visual system
came to distinguish between sudden dematerialization and sudden occlusion. The
pioneering developmentalist Jean Piaget
(1929) believed that young children fail to represent objects as continuing
to exist when they disappear due to occlusion. This amounts to the notion that
children represent disappeared objects as dematerialized, at least implicitly.
Subsequent research showed that Piaget's notion is wrong: children
continue to represent objects which have been fully occluded (Kellman & Spelke, 1983; Johnson, Bremner, Slater, Mason, & Foster,
2002; Aguiar & Baillargeon, 2002).
Many dozens of experiments have now probed children's representation of
occlusion events. Nevertheless, it seems that research has not been directed at
the cues that might cause a child to perceive dematerialization rather than
occlusion. Such an investigation could yield some surprising results, taking us
even farther from Piaget’s original belief. Specifically, if our
perception of a distinction between occlusion and dematerialization comes from
experience with dematerialization events, it may be that the young and
inexperienced will perceive as occlusion many events that adults perceive as
dematerialization.
The perception of sudden onsets is affected not only by
the occlusion cues investigated here, but also by other factors (e.g., von Grunau, Dubé, & Kwas, 1996;
Eagleman & Sejnowski, 2003).
A major focus of previous work has been the influence of spatial attention on
morphing motion. Attending to one end of a suddenly appearing bar can cause the
experience of motion originating from the attended end (Hikosaka et al., 1993). Several researchers
have concluded that attention reduces perceptual latency, causing the attended
part of the figure to appear first. In their focus on mechanism, however, they
seem to have overlooked functional consequences of this phenomenon. A typical
real-world scene evokes a large number of low-level motion responses, many of
which are spurious or otherwise not of interest to the observer. To speculate,
this may be one reason for the role of attention in resolving motion ambiguity
(Cavanagh, 1992). In particular, one
consequence of reducing the perceptual latency of an attended item may be to
bias motion detectors to respond to the movement of the attended item rather
than, for example, other items moving into the attended location. Attended
figures with shorter latencies should reach motion detectors before figures in
unattended areas. Since motion detectors are activated when receiving
stimulation from one location shortly before another location, this will create
a bias to perceive motion originating in the attended location. Hence rather
than being an entirely non-adaptive byproduct of the circuitry of attention,
reduction in perceptual latency may help to track items of interest.
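The latency argument can be illustrated with a minimal simulation. The sketch below assumes a simple opponent delay-and-compare (Reichardt-style) detector, which is one common way to formalize "stimulation from one location shortly before another" but is not specified in the text. All numbers (a baseline latency of 10 time steps, an attentional speed-up of 2 steps, a detector delay of 2 steps) and the function names are hypothetical. Two locations, A and B, are stimulated simultaneously; attention shortens the latency at one of them.

```python
def onset_signal(arrival_time, n_steps):
    """Sustained response that switches on when the stimulus 'arrives'."""
    return [1 if t >= arrival_time else 0 for t in range(n_steps)]


def net_motion(a, b, delay):
    """Opponent delay-and-compare output; positive means motion from A toward B."""
    a_to_b = sum(a[t - delay] * b[t] for t in range(delay, len(a)))
    b_to_a = sum(b[t - delay] * a[t] for t in range(delay, len(a)))
    return a_to_b - b_to_a


def simulate(attended, base_latency=10, attention_gain=2, delay=2, n_steps=30):
    """Physically simultaneous onsets at A and B; attention shortens one latency."""
    latency_a = base_latency - (attention_gain if attended == "A" else 0)
    latency_b = base_latency - (attention_gain if attended == "B" else 0)
    return net_motion(onset_signal(latency_a, n_steps),
                      onset_signal(latency_b, n_steps),
                      delay)
```

With no attended location the two subunits cancel and the net output is zero; attending to A gives a positive output (motion away from A) and attending to B gives a negative one, in line with the bias described above.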
Another line of research which heavily utilizes sudden
onsets is the investigation of attentional capture. Research in this area has
distinguished between sudden onsets that are "new objects" (Yantis, 1993), those that are object
disappearances (Samuel & Weiner,
2001), and those that are only a brightness or color change (Enns, Austen, Di Lollo, Rauschenberger, & Yantis,
2001), and found evidence that "new objects" (which roughly seems to mean
suddenly-materializing figures) are the most potent for attracting attention.
But note that this classification does not differentiate between objects that
appear by disocclusion, by materialization, or by morphing motion. The cues
documented here should allow future work to determine the relative
attention-summoning potency of these various interpretations of sudden onsets.
Portions of this research were reported at the 2002
Vision Sciences Society Meeting, Sarasota, Florida, and the 2002 Joint Symposium
on Neural Computation, Pasadena, California. This research, and its publication
in an open-access journal, was supported by a postdoctoral NEI NRSA to AOH and
EY01711 awarded to D. I. A. MacLeod. I thank David Eagleman and Edward Hubbard
for comments on the manuscript.
Commercial relationships: none.
Aguiar, A., & Baillargeon, R. (2002). Developments in young infants' reasoning about occluded objects. Cognitive Psychology, 45(2), 267-336.
Baloch, A. A., & Grossberg, S. (1997). A neural model of high-level motion processing: Line motion and formotion dynamics. Vision Research, 37(21), 3037-3059.
Cavanagh, P. (1992). Attention-based motion perception. Science, 257, 1563-1565.
Cicerone, C. M., Hoffman, D. D., Gowdy, P. D., & Kim, J. S. (1995). The perception of color from motion. Perception & Psychophysics, 57(6), 761-777.
Eagleman, D. M., & Sejnowski, T. J. (2003). The line-motion illusion can be reversed by motion signals after the line disappears. Perception, 32(8), 963-968.
Enns, J. T., Austen, E. L., Di Lollo, V., Rauschenberger, R., & Yantis, S. (2001). New objects dominate luminance transients in setting attentional priority. Journal of Experimental Psychology: Human Perception and Performance, 27, 1287-1302.
Faubert, J., & Von Grunau, M. (1995). The influence of two spatially distinct primers and attribute priming on motion induction. Vision Research, 35(22), 3119-3130.
Gillam, B., Blackburn, S., & Nakayama, K. (1999). Stereopsis based on monocular gaps: metrical encoding of depth and slant without matching contours. Vision Research, 39(3), 493-502.
Helmholtz, H. von (1910/1962). Treatise on Physiological Optics, volume 3 (New York: Dover, 1962). English translation by J.P.C. Southall for the Optical Society of America (1925) from the 3rd German edition of Handbuch der physiologischen Optik (Hamburg: Voss, 1910; first published in 1867, Leipzig: Voss).
Hikosaka, O., Miyauchi, S., & Shimojo, S. (1993). Voluntary and stimulus-induced attention detected as motion sensation. Perception, 22(5), 517-526.
Holt-Hansen, K. (1970). Perception of a straight line briefly exposed. Perceptual & Motor Skills, 31(1), 59-69.
Howard, I. P., & Rogers, B. J. (2002). Chapter 27 of Depth Perception, v.2 of Seeing in Depth. Thornhill, Ontario: I. Porteous.
Johnson, S. P., Bremner, J. G., Slater, A., Mason, U., & Foster, K. (2002). Young infants' perception of unity and form in occlusion displays. Journal of Experimental Child Psychology, 81, 358-374.
Jojic, N., & Frey, B. (2001). Learning flexible sprites in video layers. Paper presented at the IEEE Conference on Computer Vision and Pattern Recognition.
Jonides, J., & Yantis, S. (1988). Uniqueness of abrupt visual onset in capturing attention. Perception & Psychophysics, 43(4), 346-354.
Kanizsa, G. (1951). Sulla polarizzazione del movimento gamma. Archivio di Psicologia, Neurologia e Psichiatria, 3, 224-267.
Kellman, P. J., & Spelke, E. S. (1983). Perception of partly occluded objects in infancy. Cognitive Psychology, 15(4), 483-524.
Nothdurft, H. C. (1993). The conspicuousness of orientation and motion contrast. Spatial Vision, 7(4), 341-363.
Piaget, J. (1929). The child's conception of the world. New York: Harcourt Brace.
Samuel, A. G., & Weiner, S. K. (2001). Attentional consequences of object appearance and disappearance. Journal of Experimental Psychology: Human Perception and Performance, 27, 1433-1451.
Sigman, E., & Rock, I. (1974). Stroboscopic movement based on perceptual intelligence. Perception, 3(1), 9-28.
Steinman, B. A., Steinman, S. B., & Lehmkuhle, S. (1995). Visual attention mechanisms show a center-surround organization. Vision Research, 35(13), 1859-1869.
Tse, P. U., Cavanagh, P., & Nakayama, K. (1998). The role of parsing in high-level motion processing. In T. Watanabe (Ed.), High level motion processing: Computational, neurobiological and psychophysical perspectives (pp. 249-266). Cambridge, MA: MIT Press.
Tse, P. U., & Logothetis, N. K. (2002). The duration of 3D form analysis in transformational apparent motion. Perception & Psychophysics, 64(2), 244-265.
van Ee, R., & Richards, W. (2002). A planar and a volumetric test for stereoanomaly. Perception, 31, 51-64.
von Grunau, M., Dubé, S., & Kwas, M. (1996). Two contributions of motion induction: A preattentive effect and facilitation due to attentional capture. Vision Research, 36(16), 2447-2457.
Verstraten, F. A. J., Cavanagh, P., & Labianca, A. (2000). Limits of attentive tracking reveal temporal properties of attention. Vision Research, 40(26), 3651-3664.
Yantis, S. (1993). Stimulus-driven attentional capture. Current Directions in Psychological Science, 2, 156-161.
Zanker, J. (1994). Modeling human motion perception: I. Classical stimuli. Naturwissenschaften, 81, 156-163.