Volume 3, Number 3, Article 3, Pages 209-229
doi:10.1167/3.3.3
http://journalofvision.org/3/3/3/
ISSN 1534-7362
Comparison of two weighted integration models for the cueing task: linear and likelihood
Steven S. Shimozaki
Department of Psychology, University of California,
Santa Barbara, CA, USA
Miguel P. Eckstein
Department of Psychology, University of California,
Santa Barbara, CA, USA
Craig K. Abbey
Department of Biomedical Engineering,
University of California, Davis, CA, USA
Abstract
In a task in which the observer must detect a signal at two locations, presenting a precue that predicts the location of a signal leads to improved performance with a valid cue (signal location matches the cue), compared to an invalid cue (signal location does not match the cue). The cue validity effect has often been explained with a limited capacity attentional mechanism improving the perceptual quality at the cued location. Alternatively, the cueing effect can also be explained by unlimited capacity models that assume a weighted combination of noisy responses across the two locations. We compare two weighted integration models, a linear model and a sum of weighted likelihoods model based on a Bayesian observer. While qualitatively these models are similar, quantitatively they predict different cue validity effects as the signal-to-noise ratios (SNR) increase. To test these models, 3 observers performed in a cued discrimination task of Gaussian targets with an 80% valid precue across a broad range of SNR’s. Analysis of a limited capacity attentional switching model was also included and rejected. The sum of weighted likelihoods model best described the psychophysical results, suggesting that human observers approximate a weighted combination of likelihoods, and not a weighted linear combination.
History
Received February 1, 2002; published April 15, 2003
Citation
Shimozaki, S. S., Eckstein, M. P., & Abbey, C. K. (2003). Comparison of two weighted integration models for the cueing task: linear and likelihood.
Journal of Vision, 3(3):3, 209-229,
http://journalofvision.org/3/3/3/,
doi:10.1167/3.3.3.
Keywords
cueing, Bayesian observer, selective attention
The cueing
task has been an important paradigm in the study of attention, and has had broad
applications in vision, cognitive psychology, and cognitive neuroscience (see
Pashler,
1998, or Posner & Peterson,
1990, for a review). In a
typical simple cueing task, observers are asked to detect a signal appearing at one
of two locations. A cue appears prior to the signal stimulus at one of the two
locations, and typically predicts the probable appearance of the signal at the
cued location. Besides the simple cueing task, there have been many variations
of the cueing paradigm that have been studied, such as changing the number of
locations and the validity of the precue. The common finding in the cueing task
is that performance, measured either as accuracy or response time, is better in
those trials in which the cue is valid (signal appears at the cued location)
than the trials in which the cue is invalid (signal appears at the uncued
location). The cue validity effect, or cueing effect, has often been explained
by assuming that attention is a limited-capacity resource, and that attention is
drawn or allocated to the precue location. Limited capacity is often implied by
descriptions of attention enhancing the stimulus at the cued location (e.g.,
Eriksen & James,
1986; Posner,
1980; Allport,
1987; Posner & Peterson,
1990; Spitzer, Desimone, & Moran,
1988; Henderson,
1996; Luck, et al.,
1994; Luck, Hillyard, Mouloua, & Hawkins,
1996), or degrading the stimulus
at the uncued location (e.g., Broadbent,
1958; Kahneman,
1973; Treisman,
1992). Common amongst these
descriptions is the concept that attention improves the perceptibility, or
quality of processing, of the signal at the cued location, relative to the
uncued location. (It should be noted that some of these authors’
descriptions (e.g., Posner,
1980; Broadbent,
1958), while implying limited
capacity, might be interpreted more generally to include the weighted
integration models discussed later.) Several authors (Shaw,
1980; Kinchla, Chen, & Evert,
1995; Sperling,
1984; Sperling & Dosher,
1986; Shiu & Pashler,
1994, 1995;
Eckstein, Shimozaki, & Abbey,
2002; Shimozaki, Eckstein, & Abbey,
2001) have noted that a cueing
effect can be predicted by an unlimited capacity model that has the same
perceptual quality at the cued and uncued locations. These models represent a
challenge to limited capacity attentional models of the cueing paradigm, which
imply a change in perceptual quality across the cued and uncued locations. The
two key components of the unlimited capacity models are, first, that attention is
only a selective mechanism, and second, that the response at each location is perturbed by
internal noise. Because of the stochastic nature of the internal response,
these models are often considered Signal Detection Theory (SDT) models
(Green & Swets,
1974).1
Another well-known result in attention that also has been modeled with this
approach is the decrease in performance with increasing number of items in a
visual search task, also known as the set-size effect (Kinchla, 1974,
1977;
Shaw, 1980,
1984;
Eckstein, et al.,
2000; Palmer,
1995; Palmer, et al.,
1993; Palmer, et al.,
2000; Verghese,
2001). As with the cueing
effect, the set-size effect has often been attributed to various limited
capacity attentional mechanisms, and the SDT models of visual search challenge
this attribution.
Two
different unlimited capacity models have been proposed based on the concepts
developed in SDT. The first is a weighted linear model proposed by Kinchla, et
al., (1995).
In this model, the response at each location is perturbed by noise, and the
weighted responses are linearly combined to form a single decision variable.
The second model is a sum of weighted likelihoods model based on a Bayesian
optimal observer (Eckstein, et al.,
2002; Shimozaki, et al.,
2001). In this model, the
responses at each location are assumed to be noisy, the likelihood of the data
given target presence is calculated for each of the two locations (cued and
uncued), and a weighted sum of the likelihoods is computed to form a single
decision variable. Both models weight information differentially, with the cued
location being weighted more heavily. Because of this differential weighting,
both models predict a cueing effect with the same sensitivity
(d’)
at the cued and uncued locations. The equivalent sensitivity implies that these
models do not change their perceptual quality at the cued (attended) location,
and therefore, they can be considered unlimited capacity models. These models
are also known as selective attention models (Graham,
1989; Palmer, Ames, & Lindsey,
1993; Palmer,
1995), as attention only
determines the weights in these models, or how information is selected, or
chosen, for the task.
While the
two weighted integration models (weighted linear and Bayesian weighted
likelihood) predict cueing effects that are qualitatively similar, it can be
shown that these models differ substantially in their quantitative predictions
of the size of the cueing effect as a function of signal-to-noise ratio (SNR).
In particular, the weighted linear integration model predicts larger cueing
effects than the Sum of Weighted Likelihoods Bayesian model as target/non-target
discriminability increases. Therefore, from an empirical point of view, it
seems important to determine whether human observer performance is consistent
with one or the other model.
In
addition, in some cases the Sum of Weighted Likelihoods model is equivalent to
the optimal Bayesian (ideal) observer, or, in other words, it predicts the best
possible performance for the cueing task across both valid and invalid cue
trials. Because of this, it can be used as a standard of objective comparison
(Shimozaki, et al.,
2001). Finally, Bayesian models
have been used successfully to model many aspects of human visual perception,
starting with simple detection and discrimination (Green & Swets,
1974; Barlow,
1978; Burgess, Wagner, Jennings, & Barlow,
1981; Kersten,
1984; Eckstein, Ahumada, & Watson,
1997), including motion
(Weiss & Adelson,
1998), texture (Knill, in
press), object recognition
(Braje, Tjan, & Legge,
1995; Liu, Knill, & Kersten,
1995; Tjan, Braje, Legge, & Kersten,
1995), color constancy
(Brainard & Freeman,
1994), perceptual learning
(Gold, Bennett, & Sekuler,
1999; Abbey, Eckstein, & Shimozaki,
2001), heading (Crowell & Banks,
1996), and reading (Legge, Klitz, & Tjan,
1997). Therefore, also from a
theoretical point of view, it seems important to distinguish whether human
performance in the cueing paradigm is consistent with a Bayesian type model or a
linear weighted integration model.
To test
these models, human observers performed in a cued discrimination task of
Gaussian signals across a range of signal-to-noise ratios, and the models were
fit to their results. Also, analysis of a third ‘attentional
switching’ limited capacity model is included, which assumes that the
observer switches his or her attention to either the cued or the uncued location
with a certain probability from trial to trial. If attention is not placed at
the target location, the model assumes that the observer is at chance
performance.
Weighted Linear Integration
Figure 1. The weighted Linear model of the cueing task. A yes/no cued detection task is
depicted in which the observer must report signal presence at either location.
A valid signal present trial (both cue and signal on the left) is shown. The
schematic starts on the left with the responses to the cued (xc) and uncued (xuc) locations.
The linear
model was proposed by Kinchla, first to explain set size effects (Kinchla,
1974,
1977),
and then applied to cueing effects (Kinchla, Chen, & Evert,
1995). The schematic in Figure 1 describes the behavior of the
model for a cued detection task in which the observer must report signal
presence at either of the two locations (and not the location of the signal).
First, a response is generated at each location (xc, xuc), perturbed by internal noise (N).
Each response is weighted (scaled) by a separate weighting factor (wc, wuc),
and then summed to give a single decision variable (y),
which is compared to a criterion to make a decision on signal presence. As both
xc and xuc are assumed to be Gaussian-distributed, y is also Gaussian-distributed.
The mathematical expressions to obtain
performance for the weighted linear integration model are developed in Appendix B.
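The weighted linear rule described above can be sketched in a few lines. This is a minimal illustration, not the derivation in Appendix B: it assumes independent, unit-variance Gaussian responses with mean d' at the signal location and 0 elsewhere, and it assumes the weights sum to one (wuc = 1 - wc). The function names and the example criterion are hypothetical.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def linear_model_rates(d_prime, w_c, criterion):
    """Return (valid hit, invalid hit, false alarm) rates for y = w_c*x_c + w_uc*x_uc.

    Assumptions (illustrative only): unit-variance independent Gaussian noise,
    signal mean d_prime, noise mean 0, and w_uc = 1 - w_c.
    """
    w_uc = 1.0 - w_c
    # y is a weighted sum of two independent Gaussians, so it is Gaussian
    # with standard deviation sqrt(w_c^2 + w_uc^2).
    sd_y = math.sqrt(w_c ** 2 + w_uc ** 2)
    hit_valid = phi((w_c * d_prime - criterion) / sd_y)     # signal at cued location
    hit_invalid = phi((w_uc * d_prime - criterion) / sd_y)  # signal at uncued location
    false_alarm = phi(-criterion / sd_y)                    # no signal
    return hit_valid, hit_invalid, false_alarm

hv, hi, fa = linear_model_rates(d_prime=2.0, w_c=0.62, criterion=0.6)
print(f"valid hit={hv:.3f}  invalid hit={hi:.3f}  false alarm={fa:.3f}")
```

With wc greater than 0.5 the valid hit rate exceeds the invalid hit rate even though d' is identical at both locations, which is the sense in which the model produces a cueing effect without a change in perceptual quality.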
Sum of Weighted Likelihoods (Bayesian)
Figure 2. The Sum of Weighted Likelihoods model of the cueing task. A valid signal present trial
(both cue and signal on the left) of a yes/no cued detection task is depicted.
The schematic starts on the left with the responses to the cued (xc) and uncued (xuc) locations.
The second
model is a Sum of Weighted Likelihoods model based on a Bayesian observer
(Green and Swets,
1974), and has been presented
previously by Eckstein, et al. (2002),
and Shimozaki, et al. (2001).
Figure 2 gives a schematic of the model, and Appendix C gives the mathematical
equations for predicting performance for the model. As in the linear model, a
response is generated at each location (xc, xuc), and perturbed by internal noise (N).
Then, the model determines the likelihoods of the responses xc and xuc,
given a signal at the cued location (upper branch, valid trial, signal at cued
location = sc, noise at uncued location = nuc),
given a signal at the uncued location (middle branch, invalid trial, noise at
cued location = nc, signal at uncued location = suc),
and given signal absence (lower branch, noise at cued location = nc,
noise at uncued location = nuc).
The likelihoods are computed with assumed probability density functions for
xc and xuc given signal presence and absence; these probability density functions are
assumed to be Gaussian. The likelihoods of the responses given a signal at the
cued and uncued locations are then weighted separately (wc, wuc)
and summed to give an overall weighted likelihood given signal presence (weighted Ls).
Ls is divided by the likelihood of the responses given signal absence (Ln),
resulting in a weighted likelihood ratio (Ls/Ln).
The weighted likelihood ratio is then compared to a criterion to make a decision
on signal presence.
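The steps above can be sketched numerically. The following is a minimal sketch under stated assumptions, not the equations of Appendix C: responses are taken to be independent, unit-variance Gaussians with mean d' at the signal location and 0 elsewhere, the weights are assumed to sum to one, and the criterion is applied to the ratio Ls/Ln itself. Under these assumptions each single-location likelihood ratio reduces to exp(d'x - d'²/2). All function names are hypothetical.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def weighted_likelihood_ratio(x_c, x_uc, d_prime, w_c):
    """Ls/Ln for unit-variance Gaussian responses; assumes w_uc = 1 - w_c."""
    lr_c = math.exp(d_prime * x_c - 0.5 * d_prime ** 2)    # signal at cued location
    lr_uc = math.exp(d_prime * x_uc - 0.5 * d_prime ** 2)  # signal at uncued location
    return w_c * lr_c + (1.0 - w_c) * lr_uc

def prob_yes(mean_c, mean_uc, d_prime, w_c, criterion, n_steps=4000):
    """P(Ls/Ln > criterion), integrating over x_uc with the trapezoid rule.

    Requires d_prime > 0 and 0 < w_c < 1.
    """
    w_uc = 1.0 - w_c
    lo, hi = mean_uc - 8.0, mean_uc + 8.0
    step = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        x_uc = lo + i * step
        # What the cued term must still exceed once the uncued term is known.
        remainder = criterion - w_uc * math.exp(d_prime * x_uc - 0.5 * d_prime ** 2)
        if remainder <= 0.0:
            p_given = 1.0  # uncued term alone already exceeds the criterion
        else:
            threshold = (math.log(remainder / w_c) + 0.5 * d_prime ** 2) / d_prime
            p_given = 1.0 - phi(threshold - mean_c)
        edge = 0.5 if i in (0, n_steps) else 1.0
        total += edge * normal_pdf(x_uc - mean_uc) * p_given * step
    return total

def swl_rates(d_prime, w_c, criterion=1.0):
    """Return (valid hit, invalid hit, false alarm) rates for the sketch model."""
    hit_valid = prob_yes(d_prime, 0.0, d_prime, w_c, criterion)
    hit_invalid = prob_yes(0.0, d_prime, d_prime, w_c, criterion)
    false_alarm = prob_yes(0.0, 0.0, d_prime, w_c, criterion)
    return hit_valid, hit_invalid, false_alarm

hv, hi, fa = swl_rates(d_prime=2.0, w_c=0.8)
print(f"valid hit={hv:.3f}  invalid hit={hi:.3f}  false alarm={fa:.3f}")
```

The default criterion of 1.0 on the ratio corresponds to a log likelihood ratio of zero, which is the unbiased choice when signal present and signal absent trials are equally frequent.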
The
resulting decision variable includes the sum of the weighted likelihoods
(exponentials, because Gaussian probability distributions are assumed), and not
the responses themselves. Therefore, unlike the linear model, the decision
variable for the Sum of Weighted Likelihoods model is not normally distributed; it is a weighted sum of log-normally distributed terms.
Also, when the weights are equal to the actual prior probabilities of target
appearance at the cued and uncued locations (the cue validity), this model is an
optimal decision (Bayesian observer) model that maximizes performance over all
trials. Thus, in this special case, this model can be used as a comparison
standard, relative to human performance (Shimozaki, et al.,
2001), a status that cannot be
applied to the linear model.
Figure 3. The Attentional Switching model of the cueing task. A valid signal present trial
(both cue and signal on the left) of a yes/no cued detection task is depicted.
The schematic starts on the left with the responses to the cued (xc) and uncued (xuc)
locations. The model switches to either attend the cued location (red branches)
or the uncued location (blue branches).
The last
model is a limited capacity model, and is based on the observer changing
attentional strategies from trial to trial by choosing to attend to either the
cued or uncued location with a certain probability (e.g., Sperling & Melchner,
1978; Shaw,
1982, the ‘all-or-none
mixture model’). This model’s predictions are qualitatively similar
to those of the unlimited capacity models, and the model is therefore difficult to
eliminate from consideration in many cases. For example, a previous study
employed the use of classification images in the cueing task, and developed a
technique for distinguishing between the general class of selective, unlimited
capacity models of attention, and models where attention improves the tuning of
the perceptual filter (Eckstein, et
al., 2002). The classification
image technique, however, could not distinguish between selective attention
models and the attentional switching model. Nevertheless, it can be shown that this
model can be distinguished psychophysically in its performance predictions from
the unlimited capacity models over a broad range of signal-to-noise ratios (see
next section).
In the attentional switching model (Figure 3, Appendix D), the observer can only attend
to one location at a time, and chooses to attend the cued location (in red) with
a certain probability (switch).
On the other trials (1-switch), the uncued location is attended (in blue). Sensitivity (d’)
at the unattended location is assumed to be zero, whereas sensitivity at the
attended location is assumed to be nonzero. Thus, this can be considered a limited
capacity model, as the sensitivities at the cued and uncued locations differ. At
both locations, an internal response is generated (xatt, xunatt)
that is perturbed by internal noise, and that is assumed to be
Gaussian-distributed. Because the sensitivity at the unattended location is
zero, the response at the unattended location is ignored, and the model only uses
the response at the attended location. That value (xatt)
is then compared to a criterion to make a decision on signal presence.
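As a rough illustration of this description (not the derivation in Appendix D), the switching model's rates follow from mixing two cases: the attended location contains the signal, or it contains only noise. The sketch assumes a unit-variance Gaussian response at the attended location and uses the hypothetical name p_cued for the probability written as switch in the text.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def switching_model_rates(d_prime, p_cued, criterion):
    """Return (valid hit, invalid hit, false alarm) rates.

    p_cued is the probability of attending the cued location on a trial.
    Only the attended response x_att is compared to the criterion.
    """
    yes_if_signal_attended = phi(d_prime - criterion)  # attended location holds the signal
    yes_if_noise_attended = phi(-criterion)            # attended location holds only noise
    hit_valid = p_cued * yes_if_signal_attended + (1.0 - p_cued) * yes_if_noise_attended
    hit_invalid = p_cued * yes_if_noise_attended + (1.0 - p_cued) * yes_if_signal_attended
    false_alarm = yes_if_noise_attended
    return hit_valid, hit_invalid, false_alarm

# Illustrative sweep with the criterion placed midway between noise and signal means.
for d in (1.0, 2.0, 4.0, 8.0):
    hv, hi, fa = switching_model_rates(d, p_cued=0.8, criterion=d / 2.0)
    print(f"d'={d:4.1f}  cueing effect={hv - hi:.3f}")
```

Under these assumptions the cueing effect equals (2·p_cued - 1) times the difference between the two attended-location response probabilities, so it rises steadily toward 2·p_cued - 1 as d' grows.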
Comparisons of Model Predictions as a Function of Signal-to-Noise Ratio
Figures 4a, 4b, and 4c show
the predicted hit and false alarm rates for each of the models in a cueing
paradigm with an 80% valid precue. The x-axes are expressed as the
Signal-to-Noise Ratio (SNR), which is equivalent to the sensitivity (d’)
of the models in a simple discrimination task at a single location. For the
weighted integration models, the SNR’s at the cued and uncued locations
were the same; thus, these models are unlimited capacity, as there is no change
in perceptual quality across the cued and uncued locations. Also, the weights
of the weighted integration models were chosen to maximize performance across
all conditions. For the Sum of Weighted Likelihoods
model,2
this weight is the cue validity; also, the Sum of Weighted Likelihoods model is
the ideal observer in this case. For the Linear model, the weight at the cued
location
(wc)
that maximized performance was 0.62. For the attentional switching model, the
optimal performance is obtained by not switching attention, and maintaining
attention on the cued location. As this is a relatively trivial simulation,
instead the attentional switching model in this simulation was chosen so that
the switching probability matched the cue validity, or 80%.
Figure 4. Hit and False Alarm rates as a
function of SNR for three models: (a) Linear, (b) Sum of Weighted Likelihoods,
(c) Attentional Switching.
Figure
5 shows the cueing effects for
each of the models, expressed as the difference in the hit rate for the valid
trials and the hit rate for the invalid trials. First, it should be noted that
both weighted integration models show a cueing effect with the same perceptual
quality at the cued and uncued locations. This shows that a cueing effect, by
itself, does not suggest a limited capacity attentional mechanism (Shaw,
1980; Kinchla, et al.,
1995; Sperling,
1984; Sperling & Dosher,
1986; Shiu & Pashler,
1994,
1995;
Eckstein, et al.,
2002; Shimozaki, et al.,
2001). Second, the size of the
cueing effect across signal-to-noise ratios varies differentially for the three
models. The attention switching model shows a continuous increase in cueing
effect with SNR, with an asymptote surpassing the cueing effects of the weighted
integration models. Note that using a switching probability other than 80%,
or simply reducing the sensitivity at the unattended location compared to the
attended location instead of setting it to zero, would not change the
qualitative aspect of the increasing cueing effect with SNR, only the absolute
values of the asymptotes. The two weighted integration models show first an
increase, then a decrease in their cueing effects, with the Sum of Weighted
Likelihoods model having its peak cueing effect at a smaller SNR. Also,
compared to the Sum of Weighted Likelihoods model, the linear model shows a
smaller cueing effect at smaller SNR’s, and a larger cueing effect at
larger SNR’s. Thus, it may be possible to distinguish between the three
models by measuring the cueing effects across a broad range of SNR’s.
Figure 5. Predicted cueing effects as a function of SNR for the three models. The cueing effect is defined
as valid hit rate (Hv) – invalid hit rate (Hi).
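The contrast between a rising-then-falling cueing effect and a steadily rising one can be reproduced with a short sweep over SNR. This is a sketch under assumptions, not a reproduction of Figure 5: it covers only the linear and attentional switching models, assumes unit-variance Gaussian responses, takes wc = 0.62 and a switching probability of 0.8 from the text, and picks each criterion by a grid search that maximizes proportion correct with half the trials signal present. The function names and the grid are hypothetical.

```python
import math

CUE_VALIDITY = 0.8  # 80% valid precue, as in the text

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def linear_rates(d, c, w_c=0.62):
    """(valid hit, invalid hit, false alarm) for the weighted linear model."""
    w_uc = 1.0 - w_c
    sd = math.sqrt(w_c ** 2 + w_uc ** 2)
    return phi((w_c * d - c) / sd), phi((w_uc * d - c) / sd), phi(-c / sd)

def switching_rates(d, c, p_cued=0.8):
    """(valid hit, invalid hit, false alarm) for the attentional switching model."""
    yes_signal, yes_noise = phi(d - c), phi(-c)
    hit_valid = p_cued * yes_signal + (1.0 - p_cued) * yes_noise
    hit_invalid = p_cued * yes_noise + (1.0 - p_cued) * yes_signal
    return hit_valid, hit_invalid, yes_noise

def proportion_correct(hv, hi, fa):
    """Overall accuracy with half signal-present and half signal-absent trials."""
    return 0.5 * (1.0 - fa) + 0.5 * (CUE_VALIDITY * hv + (1.0 - CUE_VALIDITY) * hi)

def cueing_effect(rates, d, step=0.01):
    """Hv - Hi at the grid criterion that maximizes proportion correct."""
    best_pc, best_effect = -1.0, 0.0
    n_points = int((d + 4.0) / step)
    for i in range(n_points + 1):
        c = -2.0 + i * step  # criteria from -2 to d + 2
        hv, hi, fa = rates(d, c)
        pc = proportion_correct(hv, hi, fa)
        if pc > best_pc:
            best_pc, best_effect = pc, hv - hi
    return best_effect

for snr in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"SNR={snr:4.1f}  linear={cueing_effect(linear_rates, snr):.3f}  "
          f"switching={cueing_effect(switching_rates, snr):.3f}")
```

The Sum of Weighted Likelihoods model is left out here only to keep the sketch short; its decision variable has no closed-form distribution and needs numerical integration or simulation.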
To test the
three models of the cueing paradigm, three female observers (AH, age 22; KC, age
21; LL, age 21) participated in a cued contrast discrimination task of increment
Gaussian disks (σ = 12.4’ visual angle) presented for 50 ms in white noise
(σ = 4.88 cd/m², N.S.D. = 1.62 × 10⁻⁵ deg², mean = 25.0 cd/m²)
over a large range of
ideal observer signal-to-noise ratios (0.74 to 10.4, computed directly from the
image statistics, see Appendix A) manipulated by changing the
contrast increment of the Gaussian signal added to a pedestal of 6.25% peak
contrast ((peak luminance – mean luminance)/mean luminance).
The left
column of Figure
6 describes a single
‘signal present’ trial in which the cue was ‘valid’.
The observer initiated each trial by pressing a key on a computer keyboard. One
second after the key press, a square precue (5.86 cd/m², side length = 2.5°)
appeared for 150 ms around one or both of the potential signal locations
(centered 2.5° to the right and left of the center of the display). The validity and the number of
the precues determined the condition for that trial, as explained below.
Immediately following the precue, the stimulus display appeared for 50 ms. As
the signal appeared at the cued location in the left column of Figure
6, the figure represents a
‘valid’ signal present trial. Overall, half the trials were signal
present trials, with the other half being signal absent trials. The stimulus
plus cue duration of 200 ms was chosen to negate the effects of saccades, which
typically have a latency of approximately 200 ms. A white noise mask immediately
followed for 100 ms, having the same mean background luminance (25.0 cd/m²)
and twice the contrast (σ = 9.76 cd/m²)
as the noise fields in the stimulus displays. The observers then pressed one of
two keys on a computer keyboard to indicate their decision for signal presence
on that trial. A feedback interval of 400 ms followed, visually indicating if
the observer was correct.
Figure 6. Types of trials in
the experiment.
The
validity of the precue determined the condition of each trial, with the valid
trials comprising approximately 80% of the signal present trials, and with the
remaining 20% of the trials being ‘invalid’, in which the signal
appeared at the uncued location (second column, Figure
6). Thus, the ‘cue
validity’ was approximately 80% for this study. There was only one type of
signal absent trial, as without the presence of the signal, the
‘valid’ and ‘invalid’ signal absent trials were
identical.
A small fixation cross (0.5° by 0.5°, 5.86 cd/m²)
appeared continuously in the center of the display. Also, to reduce the
observer’s intrinsic uncertainty of the signal locations, four small dark
lines (5.86 cd/m², length = 0.5°, width = 0.034°) were
continuously displayed near the potential locations of the signal (nearest point
0.5° from center of the signal location).
Separate
studies were run at each SNR, each comprising approximately 250 invalid trials,
approximately 1000 valid trials, and approximately 1250 signal absent trials for
each observer. The trials were broken into 10 sessions of 250 trials each, with
the valid, invalid, and signal absent trials randomly intermixed. The types of
trials were determined by sampling with replacement; thus the approximate
division of trials in each session was 25 invalid, 100 valid, and 125 signal
absent trials. For KC and AH, approximately 50 neutral trials (50% valid
precues at each location) were also included in each session; these neutral
trials were not used in the subsequent analysis.
Within a
session, the order of the signal present trials was randomized, as was the
placement of the signal between the left and right locations. Hit rates
(correctly detecting the signal when present) for the valid and invalid trials
and an overall false alarm rate (incorrectly stating signal presence on a signal
absent trial) were computed for each session. Standard errors of the mean for
the hit and false alarm rates were based on the values over the ten sessions for
each SNR.
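The per-session bookkeeping described above amounts to a few lines of arithmetic. The sketch below uses made-up counts for four hypothetical sessions (the study used ten per SNR) and assumes the standard error is the sample standard deviation across sessions divided by the square root of the number of sessions; the text does not state which estimator was used.

```python
import math
import statistics

def mean_and_sem(session_values):
    """Mean and standard error of the mean across sessions (n - 1 estimator assumed)."""
    n = len(session_values)
    return statistics.mean(session_values), statistics.stdev(session_values) / math.sqrt(n)

# Hypothetical per-session counts, for illustration only (not data from the study).
sessions = [
    {"valid_hits": 82, "valid_n": 100, "invalid_hits": 15, "invalid_n": 25, "fa": 21, "absent_n": 125},
    {"valid_hits": 79, "valid_n": 98,  "invalid_hits": 14, "invalid_n": 27, "fa": 25, "absent_n": 125},
    {"valid_hits": 85, "valid_n": 103, "invalid_hits": 13, "invalid_n": 22, "fa": 19, "absent_n": 125},
    {"valid_hits": 80, "valid_n": 101, "invalid_hits": 16, "invalid_n": 24, "fa": 23, "absent_n": 125},
]

valid_hit_rates = [s["valid_hits"] / s["valid_n"] for s in sessions]
invalid_hit_rates = [s["invalid_hits"] / s["invalid_n"] for s in sessions]
false_alarm_rates = [s["fa"] / s["absent_n"] for s in sessions]
# Cueing effect per session: valid hit rate minus invalid hit rate.
cueing_effects = [hv - hi for hv, hi in zip(valid_hit_rates, invalid_hit_rates)]

for label, values in (("valid hit", valid_hit_rates),
                      ("invalid hit", invalid_hit_rates),
                      ("false alarm", false_alarm_rates),
                      ("cueing effect", cueing_effects)):
    mean, sem = mean_and_sem(values)
    print(f"{label:13s} mean={mean:.3f}  SEM={sem:.3f}")
```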
Stimuli
were presented on a monochrome monitor (viewing size = 32.51 by 24.38 cm,
resolution = 1024 by 768 pixels, Image Systems Corp., Minnetonka, MN, 55343),
sitting 50 cm from the observer. At this distance, each pixel subtended
0.034° of visual angle.
Luminance calibrations were performed with software and equipment from Dome
Imaging Systems, Inc. (Luminance Calibration System, Waltham, MA).
Figure 7 gives the results for the three
observers in terms of hit and false alarm rates. As expected, all observers had
an improvement in performance with increasing SNR, as shown by the increasing
hit rates and decreasing false alarm rates. Figure 8
gives the cueing effects for the human observers, computed as the difference
between the hit rates for the valid and invalid trials. All observers first had
an increase in the cueing effect, followed by a decrease, as SNR increases. KC
had a somewhat smaller cueing effect than AH and LL across all
SNR’s.
Figure 7. Hit and false alarm rates for three
observers. Error bars are standard errors of the mean.
Figure 8. Cueing effects for the human observers, defined as valid hit rate (Hv)
– invalid hit rate (Hi).
Error bars are standard errors of the mean.
Model fits
were found by minimizing the χ² error for the valid and
invalid hit rates, and the false alarm rates for each observer separately. For
the two weighted integration models, a single parameter defined the weight
placed on the cued location relative to the uncued location
(wc).
For each observer fit, the weighting parameter was fixed across SNR’s so
that only one free weighting parameter was allowed for each observer. A linear
relationship was assumed between human sensitivity and image SNR; thus, a slope
and intercept were estimated to predict the human sensitivity as a linear
function of image SNR. (A fit using a non-linear relationship between d’
and image SNR (Eckstein et al.,
1997) was not attempted given
the relatively good results obtained with the linear function.) For each
observer’s results, and for each model, there were 9 free parameters: 1
weight of the information at the cued location (wc),
one slope (b) and intercept (a)
relating the human index of detectability (d’)
and image SNR, and 6 decision criteria (crit),
one for each image SNR. These parameters were used to estimate 18 data values: 6 valid hit rates, 6 invalid
hit rates, and 6 false alarm rates. For the attentional switching model, the
fits were done with the switch probability (switch)
as a free parameter instead of a weighting parameter.
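The structure of the fit can be sketched as an error function over the 18 observed rates. The text does not give the exact form of the χ² error, so the version below is an assumption: a Pearson-style statistic in which each squared deviation is scaled by the binomial variance of the predicted rate. The function names and example values are hypothetical.

```python
N_SNR_LEVELS = 6
# 1 weight (wc) + slope (b) + intercept (a) + one criterion per SNR level.
N_FREE_PARAMS = 1 + 2 + N_SNR_LEVELS
# Valid hit, invalid hit, and false alarm rate at each SNR level.
N_DATA_VALUES = 3 * N_SNR_LEVELS

def d_prime_from_snr(snr, a, b):
    """Linear mapping from image SNR to human sensitivity: d' = a + b * SNR."""
    return a + b * snr

def chi_square(observed, predicted, n_trials):
    """Sum of squared deviations scaled by binomial variance (assumed form).

    observed, predicted: sequences of rates; predicted must lie strictly in (0, 1).
    n_trials: number of trials behind each observed rate.
    """
    total = 0.0
    for obs, pred, n in zip(observed, predicted, n_trials):
        variance = pred * (1.0 - pred) / n
        total += (obs - pred) ** 2 / variance
    return total

# Illustrative values only: one SNR level with roughly the trial counts in the text.
observed = [0.82, 0.58, 0.20]   # valid hit, invalid hit, false alarm
predicted = [0.80, 0.60, 0.21]
n_trials = [1000, 250, 1250]
print(N_FREE_PARAMS, N_DATA_VALUES, round(chi_square(observed, predicted, n_trials), 3))
```

In a full fit, the predicted rates would come from one of the models evaluated at d' = a + b·SNR with that SNR level's criterion, and the nine parameters would be adjusted by a numerical optimizer to minimize this error.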
The fits
for the three observers are shown in Figure
9, and Table
1 in Appendix
E gives the estimated parameters
from the fits. The fits for the attentional switching model are not shown, as
they were exceedingly poor, with all χ²(18) values greater
than 1932. As the attentional switching model predicts a continuous increase in
the cueing effect with SNR (see Figure
5), a pattern not indicated by
any observer (see Figure
8), the poor fits were not
surprising. Across all three observers, the Sum of Weighted Likelihoods model
gave the better fits, compared to the linear model, particularly for
intermediate SNR values. This is due to the fact that the linear model could
not reconcile the relatively large cueing effects for the intermediate
SNR’s and the relatively small cueing effects for the larger SNR’s
found for the human observers. For both models, the estimated weight for KC was
smaller than the weights for the other two observers, consistent with KC’s
smaller cueing effect. Note that all fits for the false alarm rates were
relatively good for both models, and therefore, the differences in the fits were
nearly all in the valid and invalid hit rates. Therefore, for a better
comparison of the fits, Figure 10 gives
the fits for the predicted cueing effects as the difference between the valid
and invalid hit rates. Here the advantage for the Sum of Weighted Likelihoods
model is clearly seen for all observers throughout all SNR values, particularly
at the intermediate values. Table
1 in Appendix E indicates that
the best fits for the Sum of Weighted Likelihoods model led to predictions of
decreasing criteria with increasing SNR for AH and LL. This represents a deviation
from the optimal criterion of zero (expressed as log(likelihood ratio), and for
the same number of signal present and signal absent trials), and suggests that
AH and LL became increasingly conservative in their judgments as SNR
increased.
Figure 9. Fits of the models to hit
and false alarm rates for three observers.
Figure 10. Fits of the models to cueing effects for three observers. Cueing effect is given as
valid hit rate (Hv) – invalid hit rate (Hi).
Limited Capacity versus Weighted Integration Models
A limited
capacity model was proposed that assumed an attentional switching strategy
between the cued and uncued locations from trial to trial. With a certain
probability, this model either chooses to follow the cue, or to attend the
uncued location, and performance is assumed to be at chance at the unattended
location. The attentional switching model predicts a cueing effect that
increases asymptotically with SNR, unlike the cueing effects for the human
observers, which first increased, then decreased, with SNR. Thus, the
attentional switching model was clearly rejected for this study.
The
attentional switching model is a specific version of a limited capacity model,
and the proposition that an observer switches a unitary attentional mechanism
from trial to trial might be seen as somewhat unlikely. Another general limited
capacity model might propose that attention weights the information at each
location equally, and induces a change in the tuning of the perceptual filters
at the cued and uncued location, such that the perceptual filter at the cued
location more closely matches the signal. This more general
‘tuning’ model captures the essential qualities of most descriptions
of a limited capacity attentional
mechanism3
(Eriksen & James,
1986; Posner,
1980; Allport,
1987; Posner & Peterson,
1990; Spitzer, et al.,
1988; Henderson,
1996; Broadbent,
1958; Kahneman,
1973; Treisman,
1992; Luck, et al.,
1994; Luck, et al.,
1996), and has been tested with
the same cueing task with the classification image technique (Eckstein, et al.,
2002). This technique allows
the investigator to estimate the shape of the perceptual filters or templates
from the observer’s trial to trial decisions and the image noise
samples.4
In the
cueing paradigm, it can be shown that a selective (weighting) attentional
mechanism and a limited capacity tuning attentional mechanism lead to
differential changes in the classification images. Roughly speaking, a
weighting mechanism leads to simple scalar (magnitude) changes in the
classification images, with the classification image for the cued location
having the greater magnitude. The changes predicted from an attentional tuning
mechanism, however, lead to qualitative ‘shape’ changes between the
cued and uncued classification images that cannot be simply scaled into each
other. The evidence from the four observers in Eckstein, et al. (2002)
strongly supported scalar changes in the classification images at the cued and
uncued locations, as opposed to changes in shape. Thus, the perceptual tuning
version of a limited capacity model of attention can be discounted for this
cueing task, based on the study of classification images.
Bayesian Sum of Weighted Likelihoods versus Weighted Linear Integration
The two
weighted integration models, linear and likelihood, have theoretically
significant qualitative similarities. Both assume a weighted integration of
noisy information across the cued and uncued locations, and propose an
attentional mechanism that is purely selective, or unlimited capacity, with no
change of perceptual quality at the cued and uncued locations. Notably, both
predict a cueing effect without a limited capacity attentional mechanism. While
this issue has been discussed by several authors (Shaw,
1980; Kinchla, et al.,
1995; Sperling,
1984; Sperling & Dosher,
1986; Shiu & Pashler,
1994,
1995;
Eckstein, et al.,
2002; Shimozaki, et al.,
2001), these models are still an
important demonstration, given the pervasiveness of limited capacity models of
the cueing effect. There are also, however, important differences between the
two models. First, the Sum of Weighted Likelihoods model has the theoretical
advantage that, in certain circumstances, it is equivalent to an optimal
Bayesian rule, and may be used as a standard of comparison (Shimozaki, et al.,
2001). The optimal weight for
the Sum of Weighted Likelihoods model is the cue validity, and the cueing
effects in this case may be taken as a ‘boundary’ condition. Cueing
effects less than or equal to the cueing effects found with the optimal weights
do not require the proposition of a limited capacity attentional mechanism.
This use of the Bayesian observer in the cueing paradigm is similar to the use
of SDT ‘uncertainty’ models for set-size effect in visual search
(Shaw, 1980, 1984; Eckstein & Whiting,
1996; Eckstein,
1998; Eckstein, et al.,
2000; Palmer, 1995;
Palmer, et al.,
1993; Palmer, et al.,
2000; Verghese,
2001). In those cases, the SDT
model predicts a set size effect with an unlimited capacity attentional model.
Thus, any set size effect equal to or less than the SDT prediction can also be
explained without a limited capacity attentional
mechanism.
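One way to see why the cue validity is the optimal weight is to write out the posterior odds for a Bayesian observer. The sketch below assumes a cue validity $v$ (0.80 in the current study) and uses $L(\cdot)$ as shorthand for the likelihood of the two responses; this notation is introduced here for illustration only.

```latex
\frac{P(\text{signal} \mid x_c, x_{uc})}{P(\text{no signal} \mid x_c, x_{uc})}
= \frac{P(\text{signal})}{P(\text{no signal})} \cdot
\frac{v \, L(x_c, x_{uc} \mid s_c, n_{uc}) + (1 - v) \, L(x_c, x_{uc} \mid n_c, s_{uc})}
     {L(x_c, x_{uc} \mid n_c, n_{uc})}
```

The numerator of the second factor is a sum of likelihoods weighted by $v$ and $1 - v$, so a Sum of Weighted Likelihoods observer whose cued weight equals $v$ computes the same quantity as the Bayesian observer.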
Second,
there are relatively large differences in the predicted cueing effects as a
function of SNR for the two models (see Figure
5). While both weighted
integration models gave improved fits relative to the attentional switching
model, the Sum of Weighted Likelihoods model clearly gave the best fits to the
observers’ data. Thus, for this simple cued discrimination task, the
observers’ performance was best described by the Sum of Weighted
Likelihoods model. As seen in Figure
5, the Sum of Weighted
Likelihoods model differs largely from the linear model in the size of the
cueing effect at larger SNR’s, with the linear model predicting the larger
cueing effect, apparently due to underpredicting the invalid hit rate at the
larger SNR’s. Thus, it appears that the linear weighting rule penalizes
information from the uncued location too heavily, compared to the Sum of
Weighted Likelihoods model at high SNR’s.
Overall,
the combination of the classification image technique in Eckstein, et al.
(2002)
and the modeling of psychophysics in the current study can be a powerful method
in assessing the perceptual mechanisms involved in the cueing paradigm. For a
simple cued discrimination task, the classification image analysis can
discriminate between an attentional tuning model and a selective weighting
attentional model, and the evidence from Eckstein, et al. (2002)
strongly indicates the weighting hypothesis. The modeling of performance in the
current study distinguishes among a limited capacity attentional switching model
and two weighted integration models, a distinction that might be difficult
within the classification image technique. The results from this study suggest
that the Sum of Weighted Likelihoods model best describes performance. We
believe that employing these methodologies and others (including those based on
varying the image noise and measuring efficiency; see Dosher & Lu, 2000a, 2000b;
Lu & Dosher, 1998, 2000; Gold, et al., 1999) in concert may lead to a
clearer understanding of the cueing paradigm.
Weighted Integration Versus Maximum-Value Models
Common
variants of the integration models are known as ‘maximum-value’
models, in which the maximum value for a particular decision variable is chosen
amongst a number of locations or alternatives. These include ‘spatial
uncertainty’ models of visual search (Shaw, 1980,
1984;
Eckstein,
1998; Eckstein, et al.,
2000; Palmer,
1995; Palmer, et al.,
1993; Palmer, et al.,
2000; Verghese,
2001), and models of summation
across channels or features (i.e., probability summation, Graham,
1989; Tyler & Chen,
2000), and in some cases, they
are seen as approximations to ideal observer models (Nolte & Jaarsma,
1967; Palmer, et al.,
2000). Appendix
F compares the predicted cueing
effect of the two weighted integration models to their analogous maximum-value
models.
The first
section of Appendix
F describes a comparison of the
Linear model to a corresponding Maximum of Weighted Responses model, summarized
in Figure
12. This figure depicts the
Linear model with the same weighting as in Figures 4a and
5
(0.62), along with the Maximum of Weighted Responses model with different
weightings. First, the maximum-value model requires a greater weighting of the
cued location to produce a cueing effect comparable in size to that of the
Linear model. Second, the maximum-value model’s cueing effect rises less
steeply with increases in SNR, such that it has a smaller cueing effect at
lower SNR’s across all weightings. Thus, the maximum-value
model does not appear to be a good approximation of the Linear model. Also,
part of the difficulty of the Linear model in fitting the human observer’s
data was that the Linear model’s cueing effect did not rise quickly enough
with SNR. As the maximum-value model rises even less quickly, this model would
provide worse fits to the human data.
The second
section of Appendix
F describes the comparison of
the Sum of Weighted Likelihoods model with an analogous maximum-value model of
weighted likelihoods. This comparison is summarized in Figure
14, with Figure
14a showing the hit and false
alarm rates, and Figure
14b showing the cueing effects,
for the same weighting depicted in Figures 4b and
5
(0.80). The Maximum Likelihood model approximated the Sum of Weighted
Likelihoods model well for high SNR’s, and less well for intermediate
SNR’s. This result agrees with a previous study by Nolte and Jaarsma (1967),
suggesting that a maximum-value decision rule closely approximates the optimal
Bayesian decision rule at higher SNR’s for an m-AFC
task.5
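The relationship between the two decision rules can be illustrated numerically. The Python sketch below assumes unit-variance Gaussian responses and, as an assumption not taken from Appendix F, treats the maximum-value analogue as the larger of the two weighted likelihood ratios. All function names are hypothetical.

```python
import math

def weighted_log_terms(x_c, x_uc, d_prime, w_c):
    """Log of each weighted likelihood ratio (cued, uncued).

    Assumes unit-variance Gaussian responses with signal mean d_prime,
    and 0 < w_c < 1 so that both logarithms are defined.
    """
    t_c = math.log(w_c) + d_prime * x_c - 0.5 * d_prime ** 2
    t_uc = math.log(1.0 - w_c) + d_prime * x_uc - 0.5 * d_prime ** 2
    return t_c, t_uc

def log_sum_rule(x_c, x_uc, d_prime, w_c):
    """Log of the sum of the two weighted likelihood ratios."""
    t_c, t_uc = weighted_log_terms(x_c, x_uc, d_prime, w_c)
    m = max(t_c, t_uc)
    # Subtract the maximum before exponentiating for numerical stability.
    return m + math.log(math.exp(t_c - m) + math.exp(t_uc - m))

def log_max_rule(x_c, x_uc, d_prime, w_c):
    """Log of the larger weighted likelihood ratio (assumed max-value rule)."""
    return max(weighted_log_terms(x_c, x_uc, d_prime, w_c))

def rule_gap(x_c, x_uc, d_prime, w_c):
    """Difference between the two decision variables; never negative."""
    return log_sum_rule(x_c, x_uc, d_prime, w_c) - log_max_rule(x_c, x_uc, d_prime, w_c)

# Illustrative valid trial: cued response near the signal mean, uncued near zero.
low_snr_gap = rule_gap(x_c=0.5, x_uc=0.0, d_prime=0.5, w_c=0.8)
high_snr_gap = rule_gap(x_c=4.0, x_uc=0.0, d_prime=4.0, w_c=0.8)
```

Under these assumptions the gap between the two decision variables is bounded by log 2 and shrinks as one term dominates the other, which is what tends to happen when the signal is strong.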
Relationship to Previous ‘SDT’ (Uncertainty) Models of Visual Search
The
weighted integration models for the cueing paradigm are closely related to the
SDT models for set-size effects in visual search, and, in fact, could be seen as
an extension of these models (e.g., Kinchla, 1974,
1977;
Shaw, 1980,
1984;
Eckstein, et al.,
2000; Palmer,
1995; Palmer, et al.,
1993; Palmer, et al.,
2000; Verghese,
2001). (Also, an SDT model
predicting the feature/conjunction dichotomy in visual search can be found in
Eckstein (1998)
and Eckstein, et al. (2000)).
Both classes of models for visual search and the cueing tasks are unlimited
capacity selective attention models operating on responses perturbed by noise.
Also, as noted in the Introduction, the SDT models of visual search and Sum of
Weighted Likelihoods model can be employed similarly in setting an upper bound
for the expected set size and cueing effects for an unlimited capacity
attentional model. However, the weighted integration models are not equivalent
to the SDT models for visual search, as the weighted integration models explain
cueing effects, and not set-size effects. These two effects, besides being two
of the more important phenomena in attention, have also been treated as distinct
phenomena, and therefore we believe that they deserve separate and distinct
treatments. The models for set-size effects, in essence, assume an equal
weighting across locations, and would need to expand to include unequal
weightings to model cueing effects (as has been done by Kinchla in his linear
model, for both set-size (Kinchla, 1974,
1977)
and cueing effects (Kinchla, et al.,
1995)). Another difference is
that the previous visual search models in general employ a maximum-value
decision rule (e.g., Shaw, 1980,
1984;
Eckstein,
1998; Eckstein, et al.,
2000; Palmer,
1995; Palmer, et al.,
1993; Palmer, et al.,
2000; Verghese,
2001), choosing the single
location most likely to contain the signal, as opposed to integrating
information across locations. As mentioned above, in general these
maximum-value models of visual search are an approximation to the optimal
Bayesian observer, and they become indistinguishable at high SNR’s
(Nolte & Jaarsma,
1967; Appendix
F).
In a visual search task of orientation discrimination, Baldassi and Verghese
(2002) tested two models
analogous to the weighted integration models in the current study, a linear
combination rule and a maximum-output rule that approximates a Bayesian
observer. In their task, observers were asked to identify an oriented Gabor
patch, tilted either clockwise or counterclockwise relative to vertical
distractor Gabors. Analogous to the current study, both combination rules
predicted a set-size effect without a limited capacity attentional mechanism;
however, the two models gave different predictions for the psychometric
functions for increasing set size. Notably, the linear combination gave poorer
fits to the human observer psychometric functions than the maximum-output
approximation of a Bayesian observer. Smith (1998)
also tested unweighted linear integration and maximum-value decision rules based
on visual search uncertainty models in a cueing paradigm, but the fits of the
two types of models could not be distinguished. The unweighted
combination rules led to the conclusion that the cueing effects found in the
study were due to differences in discriminability at the cued and uncued
locations.
Three
models have been presented for the cue validity effect in the cueing paradigm,
two weighted integration models, linear and Bayesian likelihood, and a limited
capacity attentional switching model. The first two models are selective,
unlimited capacity models that may be categorized loosely as SDT models, and
have two key components. First, information is integrated across all locations,
and second, the information at the cued location is weighted more heavily. Both
models predict a cueing effect while having the same perceptual quality at the
cued and uncued locations, contradicting the notion that a cueing effect implies
a limited capacity attentional mechanism. While qualitatively similar, analyses
of the predicted cueing effect found that the three models gave different
predictions with respect to signal-to-noise ratio; to test these predictions,
three observers performed in a cued discrimination task across a range of
SNR’s. For this task, fits for the attentional switching models were
poor, and human observer performance was better predicted by a Bayesian model
that combines weighted likelihoods than a weighted linear integration of the
internal responses at the cued and uncued locations.
Appendix A. Image Signal-to-Noise Ratio (SNR)
For the
case of a signal embedded in white Gaussian noise, the image signal to noise
ratio (SNR), which corresponds to the ideal observer index of detectability, can
be calculated as follows:
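Let $s(x,y)$ be the array describing the luminance profile of the stimulus and
$\sigma_{pixel}$ be the standard deviation of the Gaussian pixel noise. Then the
signal energy $E$ is:
$$E = \sum_{x}\sum_{y} s(x,y)^2 \qquad \text{(A1)}$$
In Gaussian (white) noise, image SNR is given by
$$\mathrm{SNR} = \frac{\sqrt{E}}{\sigma_{pixel}} \qquad \text{(A2)}$$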
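The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard white-noise result that image SNR is the square root of the signal energy divided by the pixel noise standard deviation; the function names and the example Gaussian profile (size, spread, peak) are hypothetical and are not the stimulus parameters of the study.

```python
import math

def image_snr(signal, sigma_pixel):
    """Image SNR of a known signal in white Gaussian pixel noise.

    signal: 2-D list of luminance values (signal profile, background removed).
    sigma_pixel: standard deviation of the pixel noise.
    """
    # Signal energy: sum of squared signal values over all pixels.
    energy = sum(value * value for row in signal for value in row)
    return math.sqrt(energy) / sigma_pixel

def gaussian_profile(size, sd, peak):
    """Hypothetical test signal: a 2-D Gaussian luminance profile."""
    center = (size - 1) / 2.0
    return [
        [peak * math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2.0 * sd ** 2))
         for x in range(size)]
        for y in range(size)
    ]

# Illustrative values only.
profile = gaussian_profile(size=33, sd=4.0, peak=1.0)
snr = image_snr(profile, sigma_pixel=2.0)
```

Because the energy is quadratic in the signal amplitude, doubling the peak of the profile doubles the SNR, and doubling the noise standard deviation halves it.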
Appendix B. Linear Model
This model
formulation was derived by Kinchla, et al. (1995),
and the terminology and development follow from that study.
Let
$x_c$ = response at the cued location
$x_{uc}$ = response at the uncued location
$s_c$ = the event of a signal at the cued location
$s_{uc}$ = the event of a signal at the uncued location
$n_c$ = the event of a noise stimulus at the cued location
$n_{uc}$ = the event of a noise stimulus at the uncued location
$w_c$ = the weight for the cued location
$w_{uc} = 1 - w_c$ = the weight for the uncued location
The model computes a decision variable ($y$) that is the weighted sum of the
responses at the cued and uncued locations ($x_c$, $x_{uc}$),
$$y = w_c x_c + w_{uc} x_{uc} \qquad \text{(B1)}$$
The model assumes that the prior probability of response at each location
($x_c$, $x_{uc}$) is determined by two Gaussian distributions, one for the signal
and one for the noise. The mean of the signal distribution is 1, the mean of the
noise distribution is zero, and the variance of both distributions is $\alpha$.
These distributions do not have unit variance, unlike the standard SDT
assumption, and sensitivity is not described as the distance between the two
distributions ($d'$); instead, sensitivity in this model is described in terms
of the variance. For development of an equivalent model with unit variance, see
Eckstein, et al. (2000). Note that $\alpha$ is the same at the cued and uncued
locations, so that this model may be defined as an unlimited capacity model.
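First, $d'$ was determined from the image SNR by the following linear equation
with a given slope ($b$) and intercept ($a$):
$$d' = a + b \cdot \mathrm{SNR} \qquad \text{(B2)}$$
The sensitivity of the model is expressed in terms of $\alpha$. Because the
signal and noise means differ by 1 and both distributions have variance
$\alpha$,
$$\alpha = \frac{1}{d'^2} \qquad \text{(B3)}$$
For the prior probability distributions of $x_c$ and $x_{uc}$ given signal
presence ($s$) and signal absence ($n$),
For a noise stimulus,
$$p(x \mid n) = \frac{1}{\sqrt{2\pi\alpha}} \exp\left(-\frac{x^2}{2\alpha}\right) \qquad \text{(B4)}$$
For a signal stimulus,
$$p(x \mid s) = \frac{1}{\sqrt{2\pi\alpha}} \exp\left(-\frac{(x-1)^2}{2\alpha}\right) \qquad \text{(B5)}$$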
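The model predicts performance by evaluating the decision variable ($y$). As
$x_c$ and $x_{uc}$ are Gaussian distributed, $y$ is also Gaussian distributed.
The expected values of $y$ for valid trials ($s_c$, $n_{uc}$), invalid trials
($n_c$, $s_{uc}$), and signal absent trials ($n_c$, $n_{uc}$) are as follows:
$$E[y \mid s_c, n_{uc}] = w_c \qquad \text{(B6)}$$
$$E[y \mid n_c, s_{uc}] = w_{uc} \qquad \text{(B7)}$$
$$E[y \mid n_c, n_{uc}] = 0 \qquad \text{(B8)}$$
The variance of $y$ is given by the following:
$$\sigma_y^2 = \alpha \, (w_c^2 + w_{uc}^2) \qquad \text{(B9)}$$
Thus, a $d'$ measure can be derived for the valid and invalid trials (compared
to the signal absent trials),
$$d'_{valid} = \frac{w_c}{\sqrt{\alpha \, (w_c^2 + w_{uc}^2)}} \qquad \text{(B10)}$$
$$d'_{invalid} = \frac{w_{uc}}{\sqrt{\alpha \, (w_c^2 + w_{uc}^2)}} \qquad \text{(B11)}$$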
To generate the hit and false alarm rates, values of $y$ were normalized to unit
variance, which has no effect on the predictions of the model.
$$y_z = \frac{y}{\sqrt{\alpha \, (w_c^2 + w_{uc}^2)}} \qquad \text{(B12)}$$
Then, the hit and false alarm rates predicted by the model were found by
choosing a criterion ($\mathit{crit}$) so that the normalized $y$ values ($y_z$)
above the criterion led to ‘signal present’ decisions for the model.
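$$H_v = \text{valid hit rate} = \Pr(y_z > \mathit{crit} \mid s_c, n_{uc}) \qquad \text{(B13)}$$
$$H_i = \text{invalid hit rate} = \Pr(y_z > \mathit{crit} \mid n_c, s_{uc}) \qquad \text{(B14)}$$
$$FA = \text{false alarm rate} = \Pr(y_z > \mathit{crit} \mid n_c, n_{uc}) \qquad \text{(B15)}$$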
Defining $g(x)$ as the Gaussian probability density function of unit variance
and mean equal to zero, and $G(x)$ as the cumulative distribution function for
$g(x)$, we can substitute these functions into the above equations.
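Therefore,
$$H_v = \Pr(y_z > \mathit{crit} \mid s_c, n_{uc}) = 1 - G(\mathit{crit} - d'_{valid}) \qquad \text{(B16)}$$
$$H_i = \Pr(y_z > \mathit{crit} \mid n_c, s_{uc}) = 1 - G(\mathit{crit} - d'_{invalid}) \qquad \text{(B17)}$$
$$FA = \Pr(y_z > \mathit{crit} \mid n_c, n_{uc}) = 1 - G(\mathit{crit}) \qquad \text{(B18)}$$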
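The steps of the linear model can be sketched in Python as follows. This is a minimal illustration, not the fitting code used in the study. It assumes that $d'$ is a linear function of image SNR with intercept `a` and slope `b`, that the variance is $\alpha = 1/d'^2$ (which follows from a signal mean of 1 and a noise mean of 0), and that the responses at the two locations are independent. The function names are hypothetical; the example parameter values are those listed for observer AH's Linear fit in Table E1.

```python
import math

def G(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def linear_model_rates(snr, w_c, a, b, crit):
    """Valid hit, invalid hit, and false alarm rates for the linear model."""
    d_prime = a + b * snr              # assumed linear mapping from image SNR
    alpha = 1.0 / d_prime ** 2         # variance when the signal mean is 1
    w_uc = 1.0 - w_c
    # Standard deviation of y = w_c * x_c + w_uc * x_uc (independent responses).
    sd_y = math.sqrt(alpha * (w_c ** 2 + w_uc ** 2))
    d_valid = w_c / sd_y               # mean of y on valid trials, normalized
    d_invalid = w_uc / sd_y            # mean of y on invalid trials, normalized
    h_valid = 1.0 - G(crit - d_valid)
    h_invalid = 1.0 - G(crit - d_invalid)
    false_alarm = 1.0 - G(crit)
    return h_valid, h_invalid, false_alarm

# Example with one SNR level and criterion from Table E1 (observer AH, Linear).
h_v, h_i, fa = linear_model_rates(snr=5.87, w_c=0.57, a=0.25, b=0.34, crit=0.77)
cueing_effect = h_v - h_i
```

With equal weights (`w_c = 0.5`) the valid and invalid hit rates coincide, so the cueing effect in this sketch comes entirely from the unequal weighting and not from any difference in `alpha` between locations.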
Appendix C. Sum of Weighted Likelihoods Model
This model
is a modification of an optimal Bayesian observer (Green & Swets,
1974), and begins with the human
internal response. A version of this model beginning with the image
(correlation of perceptual filters with the image) can also be found in
Eckstein, et al. (2002).
The model calculates three likelihoods of the responses $x_c$ and $x_{uc}$.
The likelihood of $x_c$ and $x_{uc}$ given that a signal stimulus was present at
the cued location and a noise stimulus was present at the uncued location (valid
trial),
$$p(x_c, x_{uc} \mid s_c, n_{uc}) \qquad \text{(C1)}$$
The likelihood of $x_c$ and $x_{uc}$ given that a noise stimulus was present at
the cued location and a signal stimulus was present at the uncued location
(invalid trial),
$$p(x_c, x_{uc} \mid n_c, s_{uc}) \qquad \text{(C2)}$$
The likelihood of $x_c$ and $x_{uc}$ given that a noise stimulus was present at
the cued location and at the uncued location (signal absent trial),
$$p(x_c, x_{uc} \mid n_c, n_{uc}) \qquad \text{(C3)}$$
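Each likelihood is the product of the probability of $x_c$ given the stimulus at
the cued location and the probability of $x_{uc}$ given the stimulus at the
uncued location,
$$p(x_c, x_{uc} \mid s_c, n_{uc}) = p(x_c \mid s_c) \, p(x_{uc} \mid n_{uc}) \qquad \text{(C4)}$$
$$p(x_c, x_{uc} \mid n_c, s_{uc}) = p(x_c \mid n_c) \, p(x_{uc} \mid s_{uc}) \qquad \text{(C5)}$$
$$p(x_c, x_{uc} \mid n_c, n_{uc}) = p(x_c \mid n_c) \, p(x_{uc} \mid n_{uc}) \qquad \text{(C6)}$$
The model calculates the weighted likelihoods of the responses given a signal
trial (weighted $L_s$) by summing the weighted likelihoods given a valid signal
trial ($s_c$, $n_{uc}$), and given an invalid signal trial ($n_c$, $s_{uc}$).
$$L_s = w_c \, p(x_c, x_{uc} \mid s_c, n_{uc}) + (1 - w_c) \, p(x_c, x_{uc} \mid n_c, s_{uc}) \qquad \text{(C7)}$$
The model calculates the weighted likelihood ratio of the responses.
$$L_{s/n} = \frac{L_s}{p(x_c, x_{uc} \mid n_c, n_{uc})} \qquad \text{(C8)}$$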
The model assumes that the probability density functions of the response at each
location are determined by two Gaussian distributions of unit variance, one for
the signal and one for the noise. The mean of the noise distribution is zero,
and sensitivity is defined as the mean of the signal distribution ($d'$). This
is the standard Signal Detection Theory assumption. Note that $d'$ is the same
regardless of whether the location is cued or uncued.
The index of detectability was related to the ideal observer (image) SNR by the
following linear relationship:
$$d' = a + b \cdot \mathrm{SNR} \qquad \text{(C9)}$$
where $b$ is a term that includes the suboptimal nature of the human perceptual
filter (sampling efficiency, Burgess et al., 1981) and internal noise
(Burgess & Colborne, 1988).
Then, for a noise stimulus,
$$p(x \mid n) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) \qquad \text{(C10)}$$
For a signal stimulus,
$$p(x \mid s) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x - d')^2}{2}\right) \qquad \text{(C11)}$$
Substituting the Gaussian assumption into the weighted likelihood ratio,
$$L_{s/n} = w_c \exp\left(d' x_c - \frac{d'^2}{2}\right) + (1 - w_c) \exp\left(d' x_{uc} - \frac{d'^2}{2}\right) \qquad \text{(C12)}$$
To find the predicted hit and false alarm rates, the log likelihood ratio was
compared to a criterion ($\mathit{crit}$). (The monotonic log transform of the
likelihood ratio has no effect on the predictions of the model.) Responses
generating log likelihood ratios greater than the criterion led to ‘signal
present’ decisions for the model.
Therefore,
$$H_v = \text{valid hit rate} = \Pr(\log(L_{s/n}) > \mathit{crit} \mid s_c, n_{uc}) \qquad \text{(C13)}$$
$$H_i = \text{invalid hit rate} = \Pr(\log(L_{s/n}) > \mathit{crit} \mid n_c, s_{uc}) \qquad \text{(C14)}$$
$$FA = \text{false alarm rate} = \Pr(\log(L_{s/n}) > \mathit{crit} \mid n_c, n_{uc}) \qquad \text{(C15)}$$
Monte Carlo simulations of 10,000 trials for each SNR were performed to generate
predictions for the model. When the weights match the cue validity, the Sum of
Weighted Likelihoods model becomes the Bayesian model. If, in addition, $d'$ is
replaced by the image SNR (which assumes that the perceptual filter perfectly
matches the signal and there is no internal noise), then the model becomes the
optimal observer.
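A Monte Carlo simulation of this kind can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the simulation code used in the study: responses are independent unit-variance Gaussians, the uncued weight is one minus the cued weight, the natural logarithm is used for the log likelihood ratio, and the function names, seed, and example parameter values are hypothetical.

```python
import math
import random

def log_weighted_lr(x_c, x_uc, d_prime, w_c):
    """Log of the weighted likelihood ratio for unit-variance Gaussians.

    Requires 0 < w_c < 1 so that both logarithms are defined.
    """
    t_c = math.log(w_c) + d_prime * x_c - 0.5 * d_prime ** 2
    t_uc = math.log(1.0 - w_c) + d_prime * x_uc - 0.5 * d_prime ** 2
    m = max(t_c, t_uc)
    # Log-sum-exp form avoids overflow for large responses.
    return m + math.log(math.exp(t_c - m) + math.exp(t_uc - m))

def simulate_rates(d_prime, w_c, crit, n_trials=10000, seed=0):
    """Estimate valid hit, invalid hit, and false alarm rates by simulation."""
    rng = random.Random(seed)

    def rate(mean_c, mean_uc):
        # Proportion of trials whose log likelihood ratio exceeds the criterion.
        count = 0
        for _ in range(n_trials):
            x_c = rng.gauss(mean_c, 1.0)
            x_uc = rng.gauss(mean_uc, 1.0)
            if log_weighted_lr(x_c, x_uc, d_prime, w_c) > crit:
                count += 1
        return count / n_trials

    h_valid = rate(d_prime, 0.0)      # signal at the cued location
    h_invalid = rate(0.0, d_prime)    # signal at the uncued location
    false_alarm = rate(0.0, 0.0)      # noise at both locations
    return h_valid, h_invalid, false_alarm

# Illustrative values: weight equal to the 80% cue validity, d' of 2.
h_v, h_i, fa = simulate_rates(d_prime=2.0, w_c=0.8, crit=0.0)
```

Fixing the seed makes the estimates reproducible; with 10,000 trials per condition the sampling error of each rate is on the order of half a percentage point.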
Appendix D. Attentional Switching Model
This
section describes an attentional switching model (Sperling & Melchner,
1978; Shaw,
1982) in which the model chooses
to attend to either the cued or the uncued location with a certain probability
on each trial.
Let
$x_{att}$ = response at the attended location
$x_{unatt}$ = response at the unattended location
$\mathit{switch}$ = the probability of attending the cued location
$1 - \mathit{switch}$ = the probability of attending the uncued location
On each trial, this model chooses to attend to either the cued location with a
probability equal to $\mathit{switch}$ ($p(\text{attend cued}) = \mathit{switch}$),
or the uncued location with a probability equal to $1 - \mathit{switch}$
($p(\text{attend uncued}) = 1 - \mathit{switch}$). The response at the unattended
location ($x_{unatt}$) is assumed to have zero sensitivity, and is ignored. Thus,
the decisions of the attentional switching model are made solely on the response
from the attended location ($x_{att}$).
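As in the Sum of Weighted Likelihoods model, assume that the probability density
functions for $x_{att}$ are described by two Gaussian distributions of unit
variance, one for the signal ($\mu_{att-s} = d'$) and one for the noise
($\mu_{att-n} = 0$).
Then, for a noise stimulus,
$$p(x_{att} \mid n) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x_{att}^2}{2}\right) \qquad \text{(D1)}$$
For a signal stimulus,
$$p(x_{att} \mid s) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x_{att} - d')^2}{2}\right) \qquad \text{(D2)}$$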
To calculate the hit and false alarm rates, a criterion ($\mathit{crit}$) was
chosen so that attended responses ($x_{att}$) above the criterion led to ‘signal
present’ decisions for the model.
$$H_v = \text{valid hit rate} = \Pr(x_{att} > \mathit{crit} \mid s_c, n_{uc}) \qquad \text{(D3)}$$
$$H_i = \text{invalid hit rate} = \Pr(x_{att} > \mathit{crit} \mid n_c, s_{uc}) \qquad \text{(D4)}$$
$$FA = \text{false alarm rate} = \Pr(x_{att} > \mathit{crit} \mid n_c, n_{uc}) \qquad \text{(D5)}$$
These probabilities are composed of:
A) the probability of $x_{att}$ exceeding the criterion when attending the cued
location, weighted by $p(\text{attend cued}) = \mathit{switch}$, and
B) the probability of $x_{att}$ exceeding the criterion when attending the
uncued location, weighted by $p(\text{attend uncued}) = 1 - \mathit{switch}$.
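Therefore,
$$H_v = \Pr(x_{att} > \mathit{crit} \mid s_c, n_{uc}) = \mathit{switch} \, \Pr(x_{att} > \mathit{crit} \mid s_c) + (1 - \mathit{switch}) \, \Pr(x_{att} > \mathit{crit} \mid n_{uc}) \qquad \text{(D6)}$$
$$H_i = \Pr(x_{att} > \mathit{crit} \mid n_c, s_{uc}) = \mathit{switch} \, \Pr(x_{att} > \mathit{crit} \mid n_c) + (1 - \mathit{switch}) \, \Pr(x_{att} > \mathit{crit} \mid s_{uc}) \qquad \text{(D7)}$$
$$FA = \Pr(x_{att} > \mathit{crit} \mid n_c, n_{uc}) = \mathit{switch} \, \Pr(x_{att} > \mathit{crit} \mid n_c) + (1 - \mathit{switch}) \, \Pr(x_{att} > \mathit{crit} \mid n_{uc}) \qquad \text{(D8)}$$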
Substituting the cumulative Gaussian distribution function,
$$\Pr(x_{att} > \mathit{crit} \mid s) = 1 - G(\mathit{crit} - d'),$$
$$\Pr(x_{att} > \mathit{crit} \mid n) = 1 - G(\mathit{crit}).$$
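Therefore,
$$H_v = \mathit{switch} \, (1 - G(\mathit{crit} - d')) + (1 - \mathit{switch}) \, (1 - G(\mathit{crit})) \qquad \text{(D9)}$$
$$H_i = \mathit{switch} \, (1 - G(\mathit{crit})) + (1 - \mathit{switch}) \, (1 - G(\mathit{crit} - d')) \qquad \text{(D10)}$$
$$FA = 1 - G(\mathit{crit}) \qquad \text{(D11)}$$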
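Equations D9 to D11 are closed-form, so the attentional switching model can be evaluated directly. The Python sketch below is a minimal illustration; the function names and the parameter name `p_switch` (standing in for the switch probability) are hypothetical, and the example values are arbitrary.

```python
import math

def G(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def switching_model_rates(d_prime, p_switch, crit):
    """Valid hit, invalid hit, and false alarm rates (equations D9 to D11).

    p_switch is the probability of attending the cued location.
    """
    p_signal = 1.0 - G(crit - d_prime)   # attended location holds the signal
    p_noise = 1.0 - G(crit)              # attended location holds noise
    h_valid = p_switch * p_signal + (1.0 - p_switch) * p_noise
    h_invalid = p_switch * p_noise + (1.0 - p_switch) * p_signal
    false_alarm = p_noise
    return h_valid, h_invalid, false_alarm

# Illustrative values: attend the cued location on 80% of trials.
h_v, h_i, fa = switching_model_rates(d_prime=2.0, p_switch=0.8, crit=1.0)
```

In this sketch the invalid hit rate cannot exceed `p_switch * p_noise + (1 - p_switch)` no matter how large `d_prime` becomes, because the signal at the uncued location is seen only on the trials when that location is attended.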
Appendix E. Model Fit Parameters
Table E1. Parameters for the Model Fits.
| | Sum of Weighted Likelihoods | | | Linear | | |
| Observer | Parameter | Image SNR | criterion | Parameter | Image SNR | criterion |
| AH | wc = 0.82 | 0.745 | 0.08 | wc = 0.57 | 0.745 | 0.4 |
| | b = 0.29 | 3.21 | 0.02 | b = 0.34 | 3.21 | 0.59 |
| | a = 0.14 | 5.87 | -0.18 | a = 0.25 | 5.87 | 0.77 |
| | χ2(18) = 57.81 | 8.57 | -0.6 | χ2(18) = 77.34 | 8.57 | 0.96 |
| | | 9.97 | -0.76 | | 9.97 | 1.15 |
| | | 10.40 | -0.88 | | 10.40 | 1.14 |
|
KC
|
wc
= 0.73
|
0.745
|
-0.02
|
wc
= 0.52
|
| |