Volume 3, Number 2, Article 2, Pages 116-145
doi:10.1167/3.2.2
http://journalofvision.org/3/2/2/
ISSN 1534-7362
A linear cue combination framework for understanding selective attention
Richard F. Murray
Department of Psychology, University of Toronto, Toronto, Canada

Allison B. Sekuler
Department of Psychology, McMaster University, Hamilton, Canada

Patrick J. Bennett
Department of Psychology, McMaster University, Hamilton, Canada

Abstract
Using a linear cue combination framework, we develop a measure of selective attention that describes the relative weight that an observer assigns to attended and unattended parts of a stimulus when making perceptual judgments. We call this measure attentional weight. We present two methods for measuring attentional weight by calculating the trial-by-trial correlation between the strength of attended and unattended parts of a stimulus and the observer's responses. We illustrate these methods in three experiments that investigate whether observers can direct selective attention according to contrast polarity when judging global direction of motion or global orientation. We find that when observers try to judge the global direction or orientation of the parts of a stimulus with a given contrast polarity (white or black), their responses are nevertheless strongly influenced by parts of the stimulus that have the opposite contrast polarity. Our measure of selective attention indicates that the influence of the opposite-polarity distractors on observers' responses is typically 65% as strong as the influence of the targets in the motion task, and typically 25% as strong as the targets in the orientation task, demonstrating that observers have only a limited ability to direct attention according to contrast polarity. We discuss some of the advantages of using a linear cue combination framework to study selective attention.
History
Received June 18, 2001; published March 18, 2003
Citation
Murray, R. F., Sekuler, A. B., & Bennett, P. J. (2003). A linear cue combination framework for understanding selective attention. Journal of Vision, 3(2):2, 116-145, http://journalofvision.org/3/2/2/, doi:10.1167/3.2.2.
Keywords
selective attention, contrast polarity, global motion, texture, signal detection theory
When we make visual judgments about a scene, we can base our judgments on selected parts of the scene, and ignore other parts. This ability is called selective visual attention. We can direct visual attention according to simple stimulus properties, such as spatial location (Posner, Snyder, & Davidson, 1980), color (Brawn & Snowden, 1999), direction of motion (Ball & Sekuler, 1981), and spatial frequency (Davis & Graham, 1981), and perhaps also according to more complex criteria, such as the perceptual segmentation of a scene (Baylis & Driver, 1992; Duncan, 1984; Egly, Driver, & Rafal, 1994). However, selective attention is sometimes imperfect: if targets and distractors differ along certain dimensions, we find that even when we try to attend only to the targets, our judgments are nevertheless influenced by the distractors. This raises the question of how targets and distractors together determine an observer’s responses, and the closely related question of how we should measure intermediate degrees of selective attention.
The problem of how observers combine information from two or more sources to arrive at a single response has a long history in perceptual psychology (Anderson, 1974). One particularly simple hypothesis is that observers calculate a weighted sum of internal responses to individual sources of information. Such weighted sum models have been used to describe how observers perform many different tasks, including detecting an auditory signal with two frequency components that activate different auditory channels (Green, 1958), combining redundant stimulus properties in complex figures (Kinchla, 1977), combining multiple depth cues (Landy, Maloney, Johnston, & Young, 1995), and combining information across different senses (Ernst, Banks, & Bülthoff, 2000; Jacobs, 1999). Applied to the problem of selective attention, the weighted sum hypothesis suggests that if $T$ is an internal response to targets and $D$ is an internal response to wholly or partly unattended distractors, then the observer bases his responses on a decision variable of the form

$$s = T + kD. \tag{1}$$

The weighting factor $k$ measures the influence of the distractors on the observer’s responses, and we will call it the attentional weight that the observer assigns to the distractors.
Here we investigate some theoretical and empirical
aspects of this weighted sum theory of selective attention. First, we discuss
why we might expect selective attention to work this way. We present a general
Bayesian description of how observers perform discrimination tasks, and we show
that in many circumstances, it is entirely natural for observers to combine
information from attended and partly unattended sources in a weighted sum, as in
Equation 1.
Second, we derive two methods for measuring the attentional weight $k$ assigned to distractors in a wide range of tasks, and we show that these methods work even when we do not know how the observer computes the internal responses $T$ and $D$ to the targets and distractors. We illustrate these methods in three experiments that investigate whether observers can direct selective attention according to contrast polarity when judging global direction of motion, or when judging global orientation. Several recent studies have investigated the first question concerning global motion and have given conflicting results (Croner & Albright, 1997; Edwards & Badcock, 1994; Li & Kingdom, 2001; Snowden & Edmunds, 1999; van der Smagt & van de Grind, 1999). The methods that we introduce avoid some of the problems of these earlier studies, and so we hope to give a more convincing answer to the question of whether observers can direct attention according to contrast polarity.
Third, we test an assumption that is implicit in the weighted sum hypothesis, namely that selective attention only affects the relative weight that an observer assigns to the internal responses to the targets and distractors, $T$ and $D$, without changing the internal responses themselves. This issue is crucial for the problem of how to measure selective attention. If selective attention affects only the relative weight assigned to targets and distractors, then it can be described by a scalar, such as attentional weight. On the other hand, if selective attention qualitatively changes how an observer computes the internal responses $T$ and $D$, then a more complex description may be necessary. We show how methods developed by Chubb and colleagues (Chubb, 1999; Chubb, Econopouly, & Landy, 1994) can be used to investigate how observers process attended and unattended stimuli, and we illustrate these methods by measuring directional selectivity for attended and partly unattended motion signals in a global direction discrimination task.
We begin with the question of why selective attention
might take the form of a single weighting factor.
When studying human performance in a perceptual task, it is often revealing to model observers as Bayesian decision-makers who are limited by simple degradations of the stimulus or by imperfect knowledge of the stimulus. For instance, in many shape discrimination tasks, human observers behave like Bayesian observers who view stimuli through a small amount of additive Gaussian noise and have an imperfect representation of the shapes to be discriminated (Barlow, 1956; Lu & Dosher, 1998; Pelli, 1990). Bayesian models are often illuminating, because they make explicit claims about what information observers use to perform a task, and about what types of inefficiencies limit observers’ performance (Geisler, 1989; Watson, 1987). We follow a similar approach to define a measure of selective attention.
Consider a task in which the observer discriminates between two classes of stimuli, $A$ and $B$. A Bayesian decision-maker performs this task by viewing the stimulus $U$ on each trial, and evaluating the probability that the stimulus was drawn from class $A$ or class $B$, given that the observed stimulus was $U$. Bayes’ theorem shows that these probabilities are

$$P(A \mid U) = \frac{P(U \mid A)\,P(A)}{P(U)}, \tag{2}$$

$$P(B \mid U) = \frac{P(U \mid B)\,P(B)}{P(U)}. \tag{3}$$

Equivalently, the observer can base his responses on the likelihood ratio $L$:

$$L = \frac{P(U \mid A)}{P(U \mid B)}. \tag{4}$$

If stimulus types $A$ and $B$ appear equally often, and if the observer’s goal is to maximize the number of correct responses, then the optimal strategy is to respond ‘A’ if $L > 1$, and ‘B’ otherwise (Green & Swets, 1974).
If the stimulus $U$ is composed of many independently varying elements $U_i$ (e.g., a noisy $N$ pixel stimulus, or a random dot cinematogram with $N$ independent dot displacements), then the likelihood ratio $L$ is the product of many subsidiary likelihood ratios $u_i$ computed from the stimulus elements $U_i$:

$$L = \prod_i u_i, \quad \text{where} \quad u_i = \frac{P(U_i \mid A)}{P(U_i \mid B)}. \tag{5}$$

Equivalently, the observer can calculate the logarithm of this likelihood ratio, which is the sum of the subsidiary log likelihood ratios:

$$\log L = \sum_i \log u_i. \tag{6}$$

A likelihood ratio $u_i > 1$ makes it more likely that $U$ belongs to $A$, and a likelihood ratio $u_i < 1$ makes it more likely that $U$ belongs to $B$. A likelihood ratio $u_i = 1$ does not shift the overall likelihood ratio $L$ either way.
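The product rule in Equation 5 and its logarithmic form in Equation 6 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes, purely for illustration, that each element is drawn from a unit-variance Gaussian with mean +0.5 under class A and -0.5 under class B. The function names and parameter values are hypothetical.

```python
import math
import random

def gaussian_pdf(x, mean, sd=1.0):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def element_likelihood_ratio(x, mean_a=0.5, mean_b=-0.5):
    """Subsidiary ratio u_i = P(U_i | A) / P(U_i | B) for the assumed Gaussian classes."""
    return gaussian_pdf(x, mean_a) / gaussian_pdf(x, mean_b)

def likelihood_ratio(elements):
    """Overall likelihood ratio L as the product of the u_i."""
    ratio = 1.0
    for x in elements:
        ratio *= element_likelihood_ratio(x)
    return ratio

def log_likelihood_ratio(elements):
    """log L as the sum of the log u_i."""
    return sum(math.log(element_likelihood_ratio(x)) for x in elements)

def respond(elements):
    """Unbiased decision rule: 'A' if L > 1 (log L > 0), otherwise 'B'."""
    return "A" if log_likelihood_ratio(elements) > 0.0 else "B"

rng = random.Random(0)
stimulus = [rng.gauss(0.5, 1.0) for _ in range(20)]  # one trial drawn from class A
L = likelihood_ratio(stimulus)
log_L = log_likelihood_ratio(stimulus)
```

Working with `log_L` rather than `L` is usually preferable in practice, because a product of many ratios can overflow or underflow while the sum of logarithms stays well scaled.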
We should point out that the observer’s estimates
of the likelihood ratio L may be
correct or incorrect. Often we use a Bayesian framework to derive the
ideal observer for a particular task,
and certainly the ideal observer must compute the relevant likelihood ratios
correctly. More generally, though, a Bayesian framework allows us to model an
observer’s beliefs about what can
be inferred from an observation, and these beliefs may be correct or incorrect.
In other words, just because we describe an observer in a Bayesian framework, we
need not assume that the observer follows an ideal strategy.
How could we represent selective attention in this well-known Bayesian pattern classification framework? Suppose that a stimulus contains two classes of elements, $U_i$ and $V_j$. When the observer selectively attends to $U_i$, he takes these elements as being more relevant to the task than $V_j$, and he reduces the influence of $V_j$ on his responses. Another way of saying this is that the observer discounts the evidence provided by $V_j$, and assigns it a smaller weight in his decision. If we regard the observer as basing his responses on a likelihood ratio as in Equation 5, this amounts to his adjusting the likelihood ratios $u_i$ and $v_j$ that are computed from the two classes of stimulus elements, $U_i$ and $V_j$. For instance, if on a particular trial an element $V_1$ would contribute a likelihood ratio of $v_1 = 1.2$ if attended to, hence biasing the observer’s response toward 'A', an observer who selectively attends away from $V_1$ can be thought of as adjusting the likelihood ratio $v_1$ toward 1.0, so that $V_1$ has less influence on his response. That is, when the observer selectively attends to $U_i$, he adjusts the likelihood ratios $v_j$ by some function $f$:

$$L = \prod_i u_i \prod_j f(v_j). \tag{7}$$

We will assume that selective attention affects only the likelihood ratios $v_j$ corresponding to the elements $V_j$ that the observer selectively attends away from. Later in this section we show that this makes our model only slightly less general than if we allow selective attention to affect both sets of likelihood ratios, $u_i$ and $v_j$.
For this description of selective attention to be meaningful, the attenuating function $f$ must satisfy a simple constraint: the likelihood ratio $L$ computed in Equation 7 should not depend on how we conceptually divide the stimulus into independently varying elements $U_i$ and $V_j$. In particular, our predictions concerning the effects of selective attention should not change if we reformulate our model so that two elements $V_1$ and $V_2$ with likelihood ratios $v_1$ and $v_2$ are now regarded as a single element $V_{1,2}$ with likelihood ratio $v_1 v_2$. It follows that

$$f(v_1 v_2) = f(v_1)\,f(v_2). \tag{8}$$

The theory of functional equations (Falmagne, 1985) shows that Equation 8 implies that $f$ is a power function,

$$f(v) = v^k. \tag{9}$$

Hence, a reasonable guess for the form of selective attention is

$$L = \prod_i u_i \prod_j v_j^{k} \tag{10}$$

$$\phantom{L} = \Big(\prod_i u_i\Big) \Big(\prod_j v_j\Big)^{k}. \tag{11}$$

The corresponding log likelihood ratio is

$$\log L = \sum_i \log u_i + k \sum_j \log v_j. \tag{12}$$

If $k = 0$, all likelihood ratios $v_j$ are mapped to 1, and the distractor elements $V_j$ have no effect on the observer’s responses. If $k = 1$, the likelihood ratios $v_j$ are unaffected, and $V_j$ have their full effect. Note the similarity of Equation 12 to Equation 1, where we defined $k$ as the attentional weight assigned to the distractors. 1
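To make the role of the exponent concrete, here is a small sketch (an illustration, not the authors' code) that applies a power-function attenuation $f(v) = v^k$ to hypothetical distractor likelihood ratios. It checks the two properties used in the derivation: the result does not depend on how distractor elements are grouped, and the log likelihood ratio is a weighted sum with weight $k$. The function names and numerical values are assumptions chosen for the example.

```python
import math

def attenuate(v, k):
    """Power-function attenuation f(v) = v**k of a distractor likelihood ratio."""
    return v ** k

def attended_log_likelihood_ratio(target_ratios, distractor_ratios, k):
    """log L = sum_i log u_i + k * sum_j log v_j."""
    target_term = sum(math.log(u) for u in target_ratios)
    distractor_term = sum(math.log(v) for v in distractor_ratios)
    return target_term + k * distractor_term

u = [1.5, 0.8, 1.1]   # hypothetical target likelihood ratios u_i
v = [1.2, 0.9]        # hypothetical distractor likelihood ratios v_j
k = 0.25              # hypothetical attentional weight

# Grouping invariance: treating V_1 and V_2 as a single element changes nothing.
grouped = attenuate(v[0] * v[1], k)
separate = attenuate(v[0], k) * attenuate(v[1], k)

# The product form of L and its logarithm agree.
L = math.prod(u) * math.prod(attenuate(x, k) for x in v)
log_L = attended_log_likelihood_ratio(u, v, k)
```

Setting `k = 0` in `attended_log_likelihood_ratio` removes the distractor term entirely, and `k = 1` restores the unweighted sum, matching the two limiting cases described in the text.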
The idea that observers combine information from
different sources in a weighted sum has been proposed by many authors for many
different tasks, as we discussed in the 'Introduction.' This derivation shows
that in tasks where observers selectively attend to one information source
rather than another, there are good reasons why they might combine information
this way. This formulation leads directly to the notion of attentional weight,
which provides a very general way of measuring selective attention, and even
gives a meaningful way of comparing the efficacy of selective attention across
different tasks.
Finally, suppose that we allow selective attention to affect the likelihoods computed from both targets and distractors:

$$L = \prod_i u_i^{k_1} \prod_j v_j^{k_2} \tag{13}$$

$$\phantom{L} = \Big(\prod_i u_i\Big)^{k_1} \Big(\prod_j v_j\Big)^{k_2}. \tag{14}$$

If we set the attentional weight in Equation 10 to $k = k_2/k_1$, then the likelihood ratio in Equation 10 exceeds 1 if and only if the likelihood ratio in Equation 14 exceeds 1, so an unbiased observer would give the same response regardless of which expression he used. Hence, for an unbiased observer, we can assume that selective attention affects only the likelihood ratios corresponding to unattended stimuli. If an observer is biased (i.e., adopts a likelihood ratio criterion different from 1), then models (10) and (14) are not equivalent, and we might be able to compare these models experimentally by persuading the observer to use an extreme criterion. Here we do not consider the case of a biased observer. 2
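The equivalence for an unbiased observer can be made explicit with a short worked step. This sketch assumes $k_1 > 0$ and writes $k_1$ and $k_2$ for the exponents applied to the target and distractor likelihood ratios $u_i$ and $v_j$, respectively:

```latex
\prod_i u_i^{k_1} \prod_j v_j^{k_2}
  = \Big( \prod_i u_i \prod_j v_j^{\,k_2/k_1} \Big)^{k_1}
  = \Big( \prod_i u_i \prod_j v_j^{\,k} \Big)^{k_1},
  \qquad k = k_2 / k_1 .
```

Because $x \mapsto x^{k_1}$ is increasing and maps 1 to 1 when $k_1 > 0$, the left-hand side exceeds 1 exactly when $\prod_i u_i \prod_j v_j^{k}$ exceeds 1. For a criterion $c \neq 1$, the condition on the left-hand side becomes $\prod_i u_i \prod_j v_j^{k} > c^{1/k_1}$ rather than $> c$, which is why the two models can differ for a biased observer.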
An Illustration: Selective Attention and Contrast Polarity
As an illustration, we will apply this framework to the question of whether observers can direct attention according to contrast polarity when judging global direction of motion. Edwards and Badcock (1994) argued that this question is relevant to whether signals in ON and OFF pathways merge before reaching cortical area MT, which plays an important role in computing global direction of motion (Newsome & Paré, 1988). The question is also interesting from a purely psychological point of view, as it addresses a basic question about the capabilities of selective attention.
In Edwards and
Badcock’s (1994) experiments, observers viewed random dot
cinematograms that contained an equal number of white target dots and black
distractor dots. A small number of white target dots all moved either directly
upward or directly downward, whereas the remaining white target dots and all the
black distractor dots moved in random directions. Observers judged whether all
the white dots moved on average upward or downward. The question Edwards and Badcock (1994) posed was,
“Can observers judge the direction of only the white dots, or do the black
dots disrupt the ability to discriminate between upward and downward motion of
the white dots?” (In the following section, we will assume that the dots
move on average to the left or to the right, rather than upward or downward, as
this was the case in the experiments we report later in this work.)
In this task, a Bayesian observer could take each dot
displacement as a piece of evidence that the correct answer is
“left” or “right,” as in Equation 5. Such an observer would compute the
product of the likelihood ratios corresponding to the individual dot
displacements, and set a criterion to discriminate between movement to the left
and to the right. Equivalently, the observer could calculate the sum of the log
likelihood ratios corresponding to the dot displacements, as in Equation 6. This sum of quantities corresponding to
individual dot displacements can often be redescribed more intuitively. For
instance, if the observer assumes that the distribution of dot directions is
Gaussian, then the sum of log likelihood ratios simply measures the total
horizontal displacement of all the target dots; an unbiased observer who follows
this strategy responds “left” if the total displacement is leftward,
and “right” if the displacement is rightward ( Watamaniuk, 1993). Alternatively, the
observer could base his responses on the output of more narrowly tuned motion
channels, perhaps considering only the number of dots that move directly to the
left or to the right. To be concrete, we will assume that observers base their
responses on the total horizontal displacement of all the dots, but in a later
section (“A More General Model”) we show that our results do not
depend on this assumption. 3
We can plot the total horizontal displacements of the white target dots and the black distractor dots on orthogonal axes (Figure 1). In this plot, each point represents a single trial. The x-component of each point is the total horizontal displacement of all the target dots on that trial (i.e., the sum of the horizontal displacements of the individual target dots), and the y-component is the total horizontal displacement of all the distractor dots. The cluster on the left represents trials on which the correct answer is “left” and the cluster on the right represents trials on which the correct answer is “right.” Because the dots take finite random walks, there is trial-to-trial variability in their horizontal displacements.

Figure 1. A hypothetical observer’s decision space in Experiment 1. Each point represents a single trial. The x-coordinate of each point is the total horizontal displacement of all the target dots on a trial, and the y-coordinate is the horizontal displacement of the distractor dots. The red and blue lines are illustrative decision lines.

This plot represents the decision space of an observer who bases his responses on the total horizontal displacements of the target and distractor dots. Ideally, the observer should ignore the displacement of the distractors, as this quantity gives no information as to the correct response. For such an observer, the decision variable, which we will call $s$, is equal to the target displacement, which we will call $T$. An unbiased observer of this type responds “right” if $s$ is greater than zero, and “left” if $s$ is less than zero. This strategy can be represented as a vertical decision line that divides the decision space in two (e.g., the red line in Figure 1). On the other hand, if the observer cannot selectively attend to the target dots, his responses will be based on some combination of the total horizontal target displacement $T$ and the total horizontal distractor displacement, which we will call $D$. As in Equation 12, we will model the observer’s decision variable $s$ as a weighted sum of the internal responses to the target and distractor dots:

$$s = T + kD. \tag{15}$$

The attentional weight $k$ assigned to the distractor dots determines the influence of the distractors on the observer’s responses. For an observer for whom $k \neq 0$, the decision line $T + kD = 0$ is not vertical, but rather has slope $-1/k$ (e.g., the blue line in Figure 1).
The weighted sum of target and distractor displacements in Equation 15 would be a natural first attempt at modeling selective attention in this task, even on grounds of simplicity. We wish to emphasize, though, that we arrived at Equation 15 by a different, less pragmatic argument. First, we noted that several studies support the notion that observers judge the global direction of a random dot cinematogram by summing internal responses to individual dot displacements (Watamaniuk, 1993; Watamaniuk, Sekuler, & Williams, 1989; Williams, Tweten, & Sekuler, 1991; Zohary, Scase, & Braddick, 1996). Second, we reasoned that any observer who arrives at a response by summing several quantities computed from a stimulus can be regarded as a Bayesian observer who sums log likelihood ratios, as in Equation 6. Third, our derivation of attentional weight showed that one simple and plausible way of describing the effects of selective attention is with a single weighting factor in a sum of log likelihood ratios, as in Equation 12. These considerations show that the weighting factor $k$ in Equation 15 is not just an arbitrary free parameter, but that it is actually the attentional weight that our hypothetical observer assigns to the distractors. Hence Equation 15 results from a direct application of our account of attentional weight to the task of judging the direction of target dots mixed with distractor dots in a random dot cinematogram.
How to Measure Attentional Weight
According to the account we have outlined, a key
problem in the study of selective attention is measuring the attentional weight
that an observer assigns to distractors. We now describe a simple method of
doing this.
Figure 2 is a plot of our hypothetical observer’s decision space, showing only trials on which the correct answer is “right.” The large black dot $M$ shows the mean total horizontal displacement of the target and distractor dots over all trials where the target signal dots move right, indicating that on average the target dots move to the right and the distractor dots have zero displacement. The green dot $M_R$ shows the mean target and distractor displacements over all trials where the target moves right and the observer responds “right.” As indicated by the dashed line in Figure 2, this conditional mean is shifted from the unconditional mean along a line that is perpendicular to the decision line. This follows from the fact that the distribution of target and distractor displacements is radially symmetric: the part of the distribution that falls on one side of the decision line is mirror-symmetric about a line that is perpendicular to the decision line and passes through $M$, so the mean over all trials where the observer responds “right” must lie along this line. Similarly, the mean displacement $M_L$ over all trials where the target moves right and the observer responds “left” is shifted from the overall mean along the same line in the opposite direction, as indicated by the large red dot. The slope of the decision line is $-1/k$, so the slope of the perpendicular line connecting the two conditional means is $k$, the attentional weight that the observer assigns to the distractor dots.

Figure 2. Part of a hypothetical observer’s decision space, showing trials on which the target signal dots move to the right. $M$ is the mean over all trials, $M_R$ is the mean over trials where the observer responds “right,” and $M_L$ is the mean over trials where the observer responds “left.”

Let the random variable $C$ represent the correct response on a given trial, taking the value $+1$ or $-1$ on trials where the correct response is “right” or “left,” respectively. Similarly, let the random variable $R$ represent the observer’s responses, taking the value $+1$ or $-1$ on trials where the observer responds “right” or “left,” respectively. With this notation, the coordinates of the conditional mean displacements $M_R$ and $M_L$ are

$$M_R = \big(E[T \mid R=+1, C=+1],\; E[D \mid R=+1, C=+1]\big), \quad M_L = \big(E[T \mid R=-1, C=+1],\; E[D \mid R=-1, C=+1]\big), \tag{16}$$

and we have just shown that the slope of the line connecting these points is $k$:

$$k = \frac{E[D \mid R=+1, C=+1] - E[D \mid R=-1, C=+1]}{E[T \mid R=+1, C=+1] - E[T \mid R=-1, C=+1]}. \tag{17}$$

We can obtain a second, independent estimate of $k$ by using Equation 17 with $C=-1$ (i.e., finding the slope of the line connecting the conditional means over all trials where the correct answer is “left”).
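We could stop here, as Equation 17 shows how to calculate $k$ from measurable quantities, but a reformulation makes the meaning of this expression much clearer. First, note that the coordinates of $M_R$ and $M_L$ with respect to an origin $M = (\bar{T}, \bar{D}) = (E[T \mid C=+1],\, E[D \mid C=+1])$ at the mean of the distribution of target and distractor displacements are

$$M_R' = M_R - M, \quad M_L' = M_L - M, \tag{18}$$

and, of course, we obtain the same value of $k$ if we calculate the slope of the connecting line in this coordinate frame. Second, because $M_R'$, $M_L'$, and $M$ are collinear, we obtain the same value for $k$ if we multiply $M_R'$ by $p_R$ and multiply $M_L'$ by $p_L$, where $p_R = P(R=+1 \mid C=+1)$ and $p_L = P(R=-1 \mid C=+1)$. These transformations convert Equation 17 into a ratio of covariances:

$$k = \frac{p_R\,E[D - \bar{D} \mid R=+1, C=+1] - p_L\,E[D - \bar{D} \mid R=-1, C=+1]}{p_R\,E[T - \bar{T} \mid R=+1, C=+1] - p_L\,E[T - \bar{T} \mid R=-1, C=+1]} \tag{20}$$

$$\phantom{k} = \frac{\mathrm{Cov}(D, R \mid C=+1)}{\mathrm{Cov}(T, R \mid C=+1)}. \tag{21}$$
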
Hence to find the attentional weight
that the observer assigns to the distractor dots, we can measure the covariance
between the total horizontal target and distractor dot displacements and the
observer’s responses, over all trials where the correct answer is
“right,” and take the ratio of these two covariances. That is, the
attentional weight is equal to the influence of the distractor dots on the
observer’s responses, as a proportion of the influence of the target dots
on the observer’s responses.
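The two estimates of attentional weight described above, the slope of the line joining the conditional means and the ratio of covariances, can be tried out on simulated data. The following is a minimal sketch under stated assumptions, not the authors' analysis code: the observer is noiseless with decision variable $s = T + kD$, $T$ and $D$ are independent unit-variance Gaussians, only trials where the correct answer is “right” are simulated, and all function names and parameter values are hypothetical.

```python
import random

def simulate_trials(k, n_trials, target_mean=1.0, seed=1):
    """Simulate 'rightward' trials for a noiseless observer with s = T + k*D.

    T and D are independent unit-variance Gaussians (an assumption that makes
    their joint distribution radially symmetric). The response R is +1
    ('right') when s > 0 and -1 ('left') otherwise.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        t = rng.gauss(target_mean, 1.0)
        d = rng.gauss(0.0, 1.0)
        r = 1 if t + k * d > 0 else -1
        trials.append((t, d, r))
    return trials

def mean(values):
    return sum(values) / len(values)

def weight_from_conditional_means(trials):
    """Slope of the line joining the mean (T, D) on 'right' and 'left' responses."""
    right = [(t, d) for t, d, r in trials if r == 1]
    left = [(t, d) for t, d, r in trials if r == -1]
    dt = mean([t for t, _ in right]) - mean([t for t, _ in left])
    dd = mean([d for _, d in right]) - mean([d for _, d in left])
    return dd / dt

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def weight_from_covariances(trials):
    """Ratio Cov(D, R) / Cov(T, R) over trials sharing the same correct answer."""
    ts = [t for t, _, _ in trials]
    ds = [d for _, d, _ in trials]
    rs = [r for _, _, r in trials]
    return covariance(ds, rs) / covariance(ts, rs)

trials = simulate_trials(k=0.5, n_trials=60000)
k_slope = weight_from_conditional_means(trials)
k_cov = weight_from_covariances(trials)
```

On any one data set the two estimators return the same number up to rounding, because the sample covariance with a binary response is proportional to the difference between the two conditional means. Both should approach the simulated weight as the number of trials grows.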
Strictly speaking, Equation 21 requires a small correction. We have assumed that the distribution of $(T, D)$ is radially symmetric over all trials where the target dots move in a given direction. The random variables $T$ and $D$ are independent, so this is true only if they are Gaussian and their variances are equal. Both $T$ and $D$ are the sum of many horizontal dot displacements, so the central limit theorem ensures that they will be approximately Gaussian. However, in the random dot cinematograms in the experiments we report below, there are an equal number of target and distractor dots, and a small number of target dots always move in a given direction, so there are slightly fewer randomly moving target dots than randomly moving distractor dots. Consequently, the variance of $T$ is actually slightly less than the variance of $D$. In Appendix A, we show that we can correct for this difference by adjusting $k$ by a factor $(N - n_T)/N$, where $N$ is the total number of target dots, and $n_T$ is the number of target dots that move directly left or right. The corrected expressions are

$$k = \frac{N - n_T}{N} \cdot \frac{E[D \mid R=+1, C=+1] - E[D \mid R=-1, C=+1]}{E[T \mid R=+1, C=+1] - E[T \mid R=-1, C=+1]}, \tag{22}$$

$$k = \frac{N - n_T}{N} \cdot \frac{\mathrm{Cov}(D, R \mid C=+1)}{\mathrm{Cov}(T, R \mid C=+1)}. \tag{23}$$

When the coherently moving target dots make up only a small proportion of the dots in the cinematogram, as is usual, this correction is negligible compared to experimental error.
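The origin of a variance correction of this kind can be sketched in a few lines. This is an illustrative derivation, not the argument of Appendix A. It assumes that $T$ and $D$ are independent and Gaussian with variances $\sigma_T^2$ and $\sigma_D^2$, that the decision variable is $s = T + kD$, and that each randomly moving dot contributes the same variance $\sigma^2$ to the total displacement while each coherently moving dot contributes none:

```latex
\frac{\mathrm{Cov}(D, R)}{\mathrm{Cov}(T, R)}
  = \frac{\mathrm{Cov}(D, s)}{\mathrm{Cov}(T, s)}
  = \frac{k\,\sigma_D^2}{\sigma_T^2}
  = k\,\frac{N \sigma^2}{(N - n_T)\,\sigma^2}
\quad\Longrightarrow\quad
k = \frac{N - n_T}{N} \cdot \frac{\mathrm{Cov}(D, R)}{\mathrm{Cov}(T, R)} .
```

The first equality uses the fact that, for jointly Gaussian variables, the covariance of a variable with the thresholded response $R$ is proportional to its covariance with $s$, with the same constant for $T$ and $D$. As a hypothetical example, with $N = 100$ target dots of which $n_T = 5$ move coherently, the factor is 0.95.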
This correlation method is closely related to the classification image method used in psychophysics to characterize the computation that an observer uses to perform a perceptual task (Ahumada & Lovell, 1971; Beard & Ahumada, 1998; Gold, Murray, Bennett, & Sekuler, 2000; Neri, Parker, & Blakemore, 1999), and to the reverse correlation method used in neurophysiology to map receptive fields (Chichilnisky, 2001; Pinter & Nabet, 1992). Our method reduces the stimulus to two numbers, the total horizontal target and distractor displacements, and measures the correlation of these quantities with the observer’s responses. As in the classification image and reverse correlation methods, these correlations reveal the linear component of the computation that the observer uses to perform the task.
It should be clear that this correlation method could
be useful even outside the linear cue combination framework. If we measure the
correlation of targets and distractors with an observer’s responses, and
find that the distractors have as strong an influence on an observer’s
responses as the targets do, then clearly we can conclude that the observer has
little ability to selectively attend to the targets, even if we have no reason
to believe that the observer uses a decision variable that is a weighted sum of
internal responses, as in Equation 1. That is,
regardless of how the observer makes his responses, the correlation ratio gives
a rough measure of how much an observer’s responses are influenced by
distractors.
In Experiment 1, we illustrate this correlation method
by measuring the attentional weight that observers assign to distractor dots in
a global direction discrimination task.
Up to now, we have assumed that the observer’s decision variable is a weighted sum of the total horizontal displacements of the targets and distractors, $s = T + kD$. This allowed us to calculate the exact value of the random variables $T$ and $D$ on each trial, directly from the stimulus. With this information, we were able to locate each trial in the observer’s decision space, as in Figure 2, and recover the attentional weight $k$ by finding the slope of the line connecting the mean internal responses over all trials where the observer responded “left” or “right.” However, real observers’ decision variables are certainly not exactly $s = T + kD$. First of all, real observers have internal noise, and, second, observers might compute some quantity other than the horizontal displacement of the target and distractor dots (e.g., an observer might count the number of dots that move directly to the left or right, or monitor the activation of 30º-wide motion channels). This seems to pose a problem for our method of measuring attentional weight, as this method apparently relies on our knowing the observer’s internal responses to the target and distractor dots on every trial.
In fact, the methods given by Equations 17 and 21
are valid under a much broader range of conditions than we have shown so far. In
Appendix A, we show that we need assume
only that the observer’s decision variable fits the following model, which
is similar to the very general Bayesian decision variable in Equation 12, except that it explicitly introduces
noise into the observer’s decisions.
First, we assume that the observer’s decision variable is a weighted sum of a quantity $T^*$ computed from the target dots and a quantity $D^*$ computed from the distractor dots:

$$s = T^* + kD^*. \tag{24}$$

Second, we assume that $T^*$ and $D^*$ are computed by summing responses to individual target and distractor dot displacements, and that the observer has the same selectivity $f$ for target and distractor dot displacements. We also assume that $T^*$ and $D^*$ are contaminated by independent, equal-variance internal noise sources $Z_T$ and $Z_D$. Thus we can write the internal responses $T^*$ and $D^*$ as

$$T^* = \sum_i f(t_i) + Z_T, \tag{25}$$

$$D^* = \sum_i f(d_i) + Z_D. \tag{26}$$

Here $t_i$ and $d_i$ are random variables, perhaps multidimensional, that describe the relevant properties of individual target and distractor dot displacements, respectively. For instance, to describe an observer who performs the direction discrimination task using 30º-wide motion channels, but is less affected by dots at greater eccentricities, the random variables $t_i$ and $d_i$ would report the direction and eccentricity of each dot displacement, and the function $f$ would describe the observer’s selectivity to dots in each direction, at each eccentricity. Such noisy linear-filter models have been found to give a good account of global motion perception under a wide range of conditions (Watamaniuk, 1993; Zohary et al., 1996).
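As a concrete instance of this noisy linear-filter model, the sketch below computes a decision variable of the form $s = T^* + kD^*$ for one trial. It is only an illustration under stated assumptions: the selectivity function is a hypothetical pair of opponent 30º-wide channels (rightward minus leftward), dot directions are given in degrees with 0 meaning rightward, eccentricity is ignored, and all names and values are invented for the example.

```python
import random

def channel_response(direction_deg, preferred_deg, half_width_deg=15.0):
    """Hypothetical channel: 1 if the dot direction lies within a 30-degree-wide
    window centred on `preferred_deg`, else 0."""
    diff = (direction_deg - preferred_deg + 180.0) % 360.0 - 180.0
    return 1.0 if abs(diff) <= half_width_deg else 0.0

def selectivity(direction_deg):
    """f for one dot: rightward-channel response minus leftward-channel response."""
    return channel_response(direction_deg, 0.0) - channel_response(direction_deg, 180.0)

def internal_response(directions_deg, noise_sd, rng):
    """Sum of f over dots plus Gaussian internal noise (gives T* or D*)."""
    return sum(selectivity(d) for d in directions_deg) + rng.gauss(0.0, noise_sd)

def decision_variable(target_dirs, distractor_dirs, k, noise_sd, rng):
    """s = T* + k * D*, with independent, equal-variance noise on T* and D*."""
    t_star = internal_response(target_dirs, noise_sd, rng)
    d_star = internal_response(distractor_dirs, noise_sd, rng)
    return t_star + k * d_star

rng = random.Random(2)
targets = [0.0, 0.0, 10.0, 90.0, 200.0]         # three dots fall in the rightward channel
distractors = [180.0, 170.0, 45.0, 300.0, 5.0]  # two leftward, one rightward
s = decision_variable(targets, distractors, k=0.5, noise_sd=0.0, rng=rng)
```

With the internal noise switched off, as here, $T^* = 3$ and $D^* = -1$, so $s = 3 + 0.5 \times (-1) = 2.5$ and an unbiased observer would respond “right.” Setting `noise_sd` above zero adds the independent noise terms $Z_T$ and $Z_D$.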
One straightforward way of testing this model is by measuring the observer’s psychometric function, which the following argument shows should be linear when plotted as $d'$ versus the number of signal dots. Let $f_R$ and $f_L$ be the mean value of $f(t_i)$ when $t_i$ is a dot that steps directly to the right or to the left, respectively. If an observer can perform the direction discrimination task at all, then $f_R \neq f_L$, and in a task with $n_T$ target signal dots, the difference in the mean of $T^*$ when the dots move to the right or to the left is $n_T (f_R - f_L)$. Furthermore, if $n_T$ is much smaller than the total number of dots in the cinematogram, then the variance $\sigma_s^2$ of the observer’s decision variable is largely independent of $n_T$. Consequently, the observer’s sensitivity is $d' = n_T (f_R - f_L)/\sigma_s$, indicating that the psychometric function is linear when plotted as $d'$ versus the number of signal dots. In Experiment 1, we measured psychometric functions in a global direction discrimination task to test the linearity assumption implicit in this model.
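The predicted linearity is easy to see numerically. The sketch below evaluates $d' = n_T (f_R - f_L)/\sigma_s$ for several signal-dot counts, and also converts $d'$ to proportion correct using the standard equal-variance signal detection result for an unbiased observer, $\Phi(d'/2)$. The parameter values are hypothetical and chosen only for illustration.

```python
import math

def d_prime(n_signal, f_right, f_left, sigma_s):
    """Sensitivity d' = n_T * (f_R - f_L) / sigma_s."""
    return n_signal * (f_right - f_left) / sigma_s

def proportion_correct(d):
    """Proportion correct Phi(d'/2) for an unbiased observer with
    equal-variance Gaussian decision variables."""
    return 0.5 * (1.0 + math.erf(d / (2.0 * math.sqrt(2.0))))

# Hypothetical values, chosen only to illustrate the linear relationship.
f_right, f_left, sigma_s = 1.0, -1.0, 10.0
signal_counts = [2, 4, 6, 8]
sensitivities = [d_prime(n, f_right, f_left, sigma_s) for n in signal_counts]

# Successive differences are constant when d' is linear in n_T.
steps = [b - a for a, b in zip(sensitivities, sensitivities[1:])]
accuracies = [proportion_correct(d) for d in sensitivities]
```

Here `steps` is constant while `accuracies` rises along a curved, saturating function, which is why linearity is tested on $d'$ rather than on proportion correct.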
Same Selectivity for Attended and Unattended Stimuli?
According to Equation 24, the observer computes the same internal response $D^*$ from the distractors, regardless of whether the distractors are fully attended ($k = 1$) or partially or completely unattended ($k < 1$); selective attention merely modulates the influence of this internal response on the decision variable. In other words, this account implies that selective attention does not qualitatively change how the observer processes the distractors, but only attenuates the influence that the distractors have on the observer’s responses. Of course, we cannot know a priori whether this is true of human observers, and it may be that in some tasks, processing of attended and unattended stimuli is qualitatively different. For instance, it may be that when observers judge global direction of motion in random dot cinematograms, the directional selectivity of motion channels is different for attended and partly unattended dots. Accordingly, we cannot be certain that attentional weight is an appropriate measure of selective attention until we compare how observers process attended and unattended stimuli.
Chubb and colleagues (Chubb, 1999; Chubb et al., 1994) have developed a method of
characterizing observers’ strategies in perceptual tasks by measuring the
influence of small stimulus elements on the observers’ responses. They
call this method histogram contrast analysis (HCA). In Appendix B, we describe a version of HCA that
allows us to measure the directional selectivity of the motion channels that an
observer applies to attended and unattended stimuli. We show that if the
observer bases his responses on a linear motion channel with directional
selectivity $f(\theta)$,
then we can estimate the directional selectivity function $f(\theta)$
by measuring the influence that each dot moving in direction $\theta$
has on the observer’s responses. Specifically, we show that the
conditional probability that an observer responds “right” when an
arbitrarily chosen dot moves in direction $\theta$
is related to the directional selectivity function $f(\theta)$
as follows:
$$P(\text{“right”} \mid \theta) = u + v\,f(\theta), \tag{27}$$
where $u$ and $v$ are constants. In Experiment 1, we
used the HCA method to compare direction selectivity for attended and unattended
dots in a global direction discrimination task.
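As a rough sketch of the kind of tabulation this analysis requires, the following Python function estimates the probability of a “right” response conditional on dot direction by pooling dots into direction bins. The data layout (a list of per-trial dot directions paired with the response), the function name, and the bin count are assumptions made for illustration; the actual procedure is the one given in Appendix B.

```python
import math


def conditional_p_right(trials, n_bins=8):
    """Estimate P("right" | dot direction) in equal-width direction bins.

    trials: iterable of (dot_directions_deg, responded_right) pairs, where
    dot_directions_deg lists the direction of every dot on that trial.
    Bin b covers directions from b * 360 / n_bins up to (b + 1) * 360 / n_bins.
    Returns one probability per bin (NaN for bins with no dots).
    """
    counts = [0] * n_bins
    rights = [0] * n_bins
    width = 360.0 / n_bins
    for directions, responded_right in trials:
        for theta in directions:
            # Wrap into [0, 360) and guard against rounding up to n_bins.
            b = min(int((theta % 360.0) // width), n_bins - 1)
            counts[b] += 1
            if responded_right:
                rights[b] += 1
    return [r / c if c else math.nan for r, c in zip(rights, counts)]
```

Running this separately on target dots and on distractor dots gives two tuning curves that can then be compared in shape and amplitude.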
In the first experiment, we applied the methods we have
described in the previous sections to a global direction discrimination task.
First, we measured psychometric functions in a task where observers judged the
global direction of black or white random dot cinematograms, in order to see
whether observers met the linearity assumption of the model given by Equations 24 through 26, which underpins our other methods. Second, we
measured the attentional weight that observers assigned to distractors, in a
task where observers judged the global direction of motion of white target dots
in a random dot cinematogram. In one condition, the white target dots were mixed
with black distractor dots. This condition tested whether observers could direct
attention according to contrast polarity. In a second condition, the white
target dots were mixed with white distractor dots. This condition served as a
validation condition for our method of measuring attentional weight, as
observers could not distinguish between targets and distractors,⁴ and so we knew in advance that the correct value of
attentional weight was k=1. Finally, we
used the HCA method developed by Chubb et al.
(1999) to measure directional selectivity for target and distractor dots, to
see whether selective attention led to qualitative differences in processing of
targets and distractors, or merely reduced the influence of distractors on
observers’ responses.
One author (R.F.M.) and four University of Toronto
students participated. Two observers (R.F.M. and C.P.T.) were practiced at
direction discrimination in random dot cinematograms and were aware of the
hypotheses being tested. The other three observers were not practiced at this
task and were unaware of the hypotheses. All observers in all experiments
reported in this paper had normal or corrected-to-normal Snellen acuity.
Psychometric Function Conditions (100L, 100D)
The stimuli in the psychometric function conditions
were eight-frame random dot cinematograms (Figure 3).
Each frame lasted 45 ms, and the entire cinematogram lasted 360 ms. In
each frame, 100 dots of radius 0.10 deg of visual angle appeared in a circular
aperture of radius 6.0 deg. Between successive frames, a number of dots (the
“signal dots”) moved 0.30 deg to the left or to the right, and the
remainder (the “noise dots”) moved an equal distance in random
directions. On a given trial, all the signal dots moved in the same direction.
On each frame, a new random subset of dots was chosen as signal dots. The
lifetime of each dot was eight frames. In the 100L condition, the dots were
white (Weber contrast 0.40; Figure 3a), and
in the 100D condition, the dots were black (Weber contrast –0.40; Figure 3b). Weber contrast is defined as
$(L - L_{bg})/L_{bg}$, where $L$ is
the luminance of the point of interest, and $L_{bg}$
is background luminance. The stimuli were shown on a gray background of
luminance 40 cd/m².
Figure 3. Stimuli in Experiments 1 and 2. (a) 100L. (b) 100D. (c) 50L50L. (d) 50L50D.
Stimuli were displayed on an AppleVision 1710 monitor
(640 × 480 resolution, pixel size 0.467 mm, refresh rate 67 Hz). Observers
viewed the stimuli binocularly from a distance of 1 m, and head position was
stabilized using a chin-and-forehead rest.
Attention Conditions (50L50L, 50L50D)
The stimuli in the attention conditions were similar to
those in the psychometric function conditions, but the dots were divided into
two 50-dot subsets. Fifty dots were target dots: between successive frames, a
number of dots in this subset (the signal dots) moved 0.30 deg to the left or
to the right, and the remainder (the noise dots) moved an equal distance in
random directions. From frame to frame, a new random subset of the 50 target
dots was chosen as signal dots. The other 50 dots in the cinematogram were
distractor dots: between successive frames, all the dots in this subset moved
0.30 deg in random directions. In both the target and the distractor subsets,
the lifetime of each dot was eight frames. In the 50L50L stimulus, both the
targets and the distractors were white (Weber contrast 0.40; Figure 3c). In the 50L50D stimulus, the targets
were white and the distractors were black (Weber contrast ±0.40; Figure 3d). These stimuli are similar to those
used by Edwards and Badcock (1994), the
main difference being that in Edwards and Badcock’s 100L stimulus, any of
the 100 dots could become a signal dot, whereas in our 50L50L stimulus, only the
50 target dots could become signal dots, and all 50 distractor dots took
unbiased random walks.
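The dot update rule described above can be sketched in Python as follows. This is a simplified illustration, not the code used in the experiments: it computes only the horizontal displacements for a single frame transition, ignores dot positions and the circular aperture, and all names and default values other than the 0.30 deg step and the 50-dot subsets are hypothetical.

```python
import math
import random

STEP_DEG = 0.30  # displacement of every dot between successive frames


def frame_displacements(n_target=50, n_distractor=50, n_signal=8,
                        signal_right=True, rng=random):
    """Total horizontal displacement of target and distractor dots for one
    frame transition. A fresh random subset of target dots carries the signal;
    all other dots step in uniformly random directions."""
    sign = 1.0 if signal_right else -1.0
    signal_idx = set(rng.sample(range(n_target), n_signal))
    target_total = 0.0
    for i in range(n_target):
        if i in signal_idx:
            target_total += sign * STEP_DEG
        else:
            target_total += STEP_DEG * math.cos(rng.uniform(0.0, 2.0 * math.pi))
    distractor_total = sum(
        STEP_DEG * math.cos(rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(n_distractor)
    )
    return target_total, distractor_total
```

Calling the function once per frame transition, with the same `signal_right` value throughout a trial, mimics the constraint that all signal dots on a trial move in the same direction.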
Psychometric Function Conditions
Two observers (J.A.P. and S.U.M.) participated in two
to three 1-hr sessions. Each session consisted of 18 blocks of 100 trials. One
half the blocks were 100L blocks, one half were 100D blocks, and the session
alternated between the two types of blocks. Each trial began with a 500-ms
fixation interval, followed by a 360-ms random dot cinematogram, followed by a
response interval in which the observer pressed one of two keys to indicate
whether the mean direction of the dots was to the left or to the right. Auditory
feedback indicated whether the observer’s response was correct. A small
white fixation dot appeared at the center of the screen throughout the trial in
the 100L condition, and a small black fixation dot appeared in the 100D
condition. The number of signal dots varied across trials according to the
method of constant stimuli. The numbers of signal dots were chosen to span each
observer’s psychometric function, based on a short pilot session. For
observer J.A.P., the signal levels were 2, 4, 8, 12, and 16 signal dots per
frame, and for observer S.U.M., they were 5, 10, 15, 20, and 25 signal dots per
frame.
Three observers (A.N.C., C.P.T., and R.F.M.)
participated in four to eight 1-hr sessions. Each session consisted of eight
blocks of 300 trials. One half the blocks were 50L50L blocks, one half were
50L50D blocks, and the session alternated between the two types of blocks. The
sequence of events in a trial was the same as in the 100L and 100D conditions.
For each observer, the number of signal dots per frame was fixed at a number
found during a pilot session to give approximately 70% correct performance. For
observer A.N.C., this was eight signal dots per frame, for C.P.T., six signal
dots per frame, and for R.F.M., two signal dots per frame.
In both the 50L50L and 50L50D conditions, observers
were instructed to indicate the mean direction of the white dots. In the 50L50L
condition, the targets and distractors were indistinguishable, so we assumed
that instructions to selectively attend to the target dots would merely
frustrate the observers. Furthermore, the purpose of the 50L50L condition was to
measure attentional weight in a condition where observers attended equally to
the targets and the distractors, and instructions to judge the mean direction of
all the white dots encouraged observers to follow this strategy.
Figure 4 shows
psychometric functions for both observers in the 100L and 100D conditions. The
functions were approximately linear, supporting our hypothesis that the
observers’ decision variable is a linear sum of responses to individual
dot displacements.
Figure 4. Psychometric functions in the 100L and
100D conditions. The error bars are SEs, and are often smaller than the data
points.
Figure 5 shows the
results of the 50L50L condition for all three observers. Each small X represents
a single trial on which the observer responded “left,” and each
small O represents a trial on which the observer responded “right.”
The x-coordinate of each small X and O shows the total horizontal displacement
of the target dots on that trial, and the y-coordinate shows the total
horizontal displacement of the distractor dots. The cluster on the left
represents trials on which the correct answer was “left,” and the
cluster on the right represents trials on which the correct answer was
“right.” Only 150 randomly chosen trials are shown, to keep
the graphs from being too cluttered. The red and green dots represent the mean
displacements over all trials on which the observer responded “left”
and “right,” respectively. The pair of red and green dots on the
left side of each observer’s plot represents the means over all trials on
which the correct answer was “left,” and the pair on the right
represents the means over all trials on which the correct answer was
“right.”
Figure 5.
Results of Experiment 1, 50L50L condition. Each X represents a trial on which
the observer responded “left,” and each O represents a trial on
which the observer responded “right.” The x-coordinate of each X and
O shows the total horizontal displacement of the target dots on that trial, and
the y-coordinate shows the total horizontal displacement of the distractor dots.
The cluster on the left represents trials on which the correct answer was
“left,” and the cluster on the right represents trials on which the
correct answer was “right.” The red and green dots show the mean
displacements over all trials on which the observer responded “left”
and “right,” respectively. The blue lines are the observers’
decision lines, T+kD=0.
The left and right clusters of data points are
separated by different distances in each observer’s plot, because each
observer required a different number of signal dots per frame to maintain 70%
correct performance. For instance, observer A.N.C. required eight signal dots,
whereas R.F.M. required only two, so the distance between the clusters is four
times larger for A.N.C. than for R.F.M. For the highly practiced observer
R.F.M., a large number of the trials on which the nominally correct answer was
“right” actually had mean target displacements to the left, and vice
versa. This indicates that R.F.M. used such an efficient strategy that his
performance was largely limited by statistical noise in the stimulus
itself.
Note that the mean displacements on trials where the
observer responded “right” (the green dots) are shifted upward and
to the right of the mean displacements on trials where the observer responded
“left” (the red dots). The horizontal shift indicates that the
target displacement was correlated with observers’ left/right responses,
and the vertical shift indicates that the distractor displacement was also
correlated with observers’ responses. Furthermore, the vertical
shift was typically about as large as the horizontal shift,
indicating that the distractors influenced observers’ responses as much as
the targets did. This is precisely what we expect in the 50L50L condition, as
observers had no way of distinguishing between target and distractor dots.
Because these shifts are small, and not easily seen in some observers’
plots, we have listed the conditional mean displacements of the target and
distractor dots in Table 1.
We calculated the attentional weight that observers
assigned to the distractor dots using Equation 22
and the conditional mean displacements in Table 1.
For observer A.N.C., k=0.95 ± 0.16, for C.P.T., k=0.87 ±
0.19, and for R.F.M., k=0.99 ±
0.09. The error values are SEs. None of these estimates of
k is significantly different from the
anticipated value of 1 (p > .40 for
all comparisons in a two-tailed test). The slanted blue lines in Figure 5 show the decision lines, T+kD=0,
corresponding to these values of k.
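To give a feel for how the conditional means in Table 1 carry information about attentional weight, here is a rough back-of-the-envelope estimator in Python. It is not Equation 22. It assumes a linear decision variable with Gaussian statistics, so that the response-conditional shift of each component is proportional to its weight times its variance, and it assumes that the variance of each subset is proportional to the number of randomly moving dots in it. The function name and argument layout are hypothetical.

```python
def rough_attentional_weight(target_means, distractor_means, n_signal,
                             n_target=50, n_distractor=50):
    """Rough estimate of k from response-conditional mean displacements.

    target_means, distractor_means: (mean given response R, mean given
    response L) for trials with one fixed signal direction.
    Assumption: var(T) / var(D) = (n_target - n_signal) / n_distractor,
    because signal dots do not contribute to the variance.
    """
    shift_target = target_means[0] - target_means[1]
    shift_distractor = distractor_means[0] - distractor_means[1]
    variance_ratio = (n_target - n_signal) / n_distractor
    return (shift_distractor / shift_target) * variance_ratio


if __name__ == "__main__":
    # Observer A.N.C., 50L50L condition, eight signal dots (values from Table 1).
    k_right = rough_attentional_weight((2.364, 2.263), (0.022, -0.087), 8)
    k_left = rough_attentional_weight((-2.277, -2.389), (0.083, -0.047), 8)
    print((k_right + k_left) / 2.0)
```

For the 50L50L condition, averaging the two signal directions in this way gives values within the reported standard errors of the published estimates, but the published values should be taken from Equation 22, which may weight the data differently.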
Figure 6 shows the
results of the 50L50D condition for all three observers, and Table 1 lists the conditional mean displacements
of the target and distractor dots. Again, both target and distractor
displacements were correlated with observers’ responses, indicating that
observers were unable to restrict their attention to the white target dots. For
observer A.N.C., k=0.93 ± 0.20,
for C.P.T., k=0.52 ± 0.15, and for
R.F.M., k=0.84 ± 0.09. All these
estimates of k are significantly
greater than zero (p < .001 for all
comparisons), none is significantly less than the observer’s corresponding
value in the 50L50L condition (p > .10
for all comparisons), and only C.P.T.’s is significantly less than 1
(p < .01).
Figure 6. Results of Experiment 1, 50L50D
condition. See caption of Figure 5 for
details.
Table 1. Results of Experiment 1
| | | Mean target displacement (deg) | | | | Mean distractor displacement (deg) | | | |
| | | Target R | | Target L | | Target R | | Target L | |
| Condition | Observer | Response R | Response L | Response R | Response L | Response R | Response L | Response R | Response L |
| 50L50L | A.N.C. | 2.364 | 2.263 | -2.277 | -2.389 | 0.022 | -0.087 | 0.083 | -0.047 |
| 50L50L | C.P.T. | 1.786 | 1.679 | -1.658 | -1.795 | 0.033 | -0.081 | 0.093 | -0.038 |
| 50L50L | R.F.M. | 0.688 | 0.453 | -0.417 | -0.682 | 0.104 | -0.142 | 0.182 | -0.089 |
| 50L50D | A.N.C. | 2.364 | 2.277 | -2.284 | -2.376 | 0.024 | -0.074 | 0.077 | -0.025 |
| 50L50D | C.P.T. | 1.794 | 1.640 | -1.680 | -1.799 | 0.017 | -0.094 | 0.038 | -0.011 |
| 50L50D | R.F.M. | 0.691 | 0.441 | -0.420 | -0.672 | 0.084 | -0.124 | 0.179 | -0.054 |
This table shows the mean total rightward
displacement of the target and distractor dots, conditional on the target signal
dots moving left or right and the observer responding “left” or
“right.” For example, the top left entry shows that for observer
A.N.C. in the 50L50L condition, the average total target dot displacement was
2.364 deg to the right on trials where the target signal dots moved right and
the observer responded “right.” The values in this table are the
coordinates of the conditional mean displacements shown in Figures 5 and 6 as red and green dots. Note that both the
target and the distractor displacements were correlated with observers’
responses: all mean displacements were further to the right when observers
responded “right” than when observers responded “left.”
This was true even in the 50L50D condition, where observers tried to ignore the
distractor dots.
Clearly, observers’ abilities to direct attention
according to contrast polarity were limited at best: two of the three observers
were not influenced significantly less by opposite-polarity distractors than by
same-polarity distractors, and the third observer was influenced 52% as much by
opposite-polarity distractors as by same-polarity distractors. These results are
consistent with previous findings that opposite-polarity distractors have a
large influence on observers’ responses (Edwards & Badcock, 1994; Li & Kingdom, 2001; Snowden & Edmunds, 1999; van der Smagt & van de Grind, 1999).
These results do not mean that observers
misperceive distractors as targets. At
a Weber contrast of ±40%, the targets and distractors are highly
discriminable. Rather, these results show that observers cannot make global
direction judgments based solely on the directions of white target dots, in the
presence of black distractor dots.
We compared observers’ directional selectivity
for attended and unattended stimuli in the 50L50D condition, using the version
of HCA presented in Appendix B. For each
target and distractor dot in the cinematogram, we measured the probability of
the observer responding “right” when the dot moved in direction
θ,
and we averaged these direction selectivity functions separately over all target
dots and over all distractor dots. Figure 7
shows the influence of target dots and distractor dots on the observers’
responses, as a function of dot direction, averaged across all three observers.
The directional tuning was approximately sinusoidal for both attended and
unattended dots, varying as $\cos\theta$, indicating that observers based their responses
on the horizontal displacements of both target and distractor dots. Evidently
observers processed attended and unattended stimuli in the same way, at least in
terms of their directional selectivity. Furthermore, the best-fitting sinusoids
had slightly different amplitudes, reflecting the fact that unattended dots had
less overall influence on the observer’s responses. These results support
the notion that observers have the same selectivity for attended and unattended
stimulus elements, and that selective attention operates by uniformly reducing
the influence of distractor elements on observers’ responses.
Figure 7. Histogram contrast analysis of
Experiment 1, 50L50D condition. The plot shows the probability of a rightward
response, as a function of the direction of each target or distractor dot,
averaged across observers. The solid line is the best-fitting sinusoid to the
target dot data, and the dotted line is the best-fitting sinusoid to the
distractor dot data. The mean, amplitude, and phase of the sinusoids were chosen
to give the best sum-of-squares fit. The amplitude of the distractor sinusoid is
0.88 times the amplitude of the target sinusoid, which is approximately the same
as the mean value of attentional weight measured in Experiment 1,
k=0.86.
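A least-squares sinusoid fit of the kind described in the caption can be sketched in a few lines of Python. This sketch assumes that the response probabilities are sampled at equally spaced directions covering the full circle, in which case the least-squares mean, amplitude, and phase are given directly by the first Fourier coefficients. The function name and the sampling grid are hypothetical, and the fitting procedure actually used for Figure 7 may differ.

```python
import math


def fit_sinusoid(directions_deg, p_right):
    """Least-squares fit of p(theta) = mean + amplitude * cos(theta - phase).

    Valid as a least-squares solution only when directions_deg are equally
    spaced over 360 degrees (at least three samples).
    Returns (mean, amplitude, phase_deg).
    """
    n = len(directions_deg)
    mean = sum(p_right) / n
    a = 2.0 / n * sum(p * math.cos(math.radians(t))
                      for t, p in zip(directions_deg, p_right))
    b = 2.0 / n * sum(p * math.sin(math.radians(t))
                      for t, p in zip(directions_deg, p_right))
    return mean, math.hypot(a, b), math.degrees(math.atan2(b, a))


if __name__ == "__main__":
    directions = list(range(0, 360, 45))
    # Synthetic tuning curves, for illustration only.
    target = [0.5 + 0.100 * math.cos(math.radians(t)) for t in directions]
    distractor = [0.5 + 0.088 * math.cos(math.radians(t)) for t in directions]
    amp_target = fit_sinusoid(directions, target)[1]
    amp_distractor = fit_sinusoid(directions, distractor)[1]
    print(amp_distractor / amp_target)
```

The ratio of the fitted distractor amplitude to the fitted target amplitude is the quantity that the caption compares with attentional weight.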
Recently, Eckstein,
Shimozaki, and Abbey (2001) and Shimozaki, Abbey, and Eckstein
(2001) used the response classification
method to compare processing of attended and partly unattended stimuli in a very
different task, namely a detection task in which observers were given a
partially valid cue as to where the target would appear, if it appeared at all.
Eckstein et al. also found that observers processed cued and uncued locations
similarly, and simply gave more weight to cued locations in their responses:
classification images at cued and uncued locations had the same spatial profile,
and differed only in amplitude. This finding strongly supports Kinchla’s (1974) and Kinchla and Collyer’s (1974)
weighted sum account of cued detection tasks, and is persuasive evidence that
attentional weight is an appropriate measure of attention in such
tasks.
A technical but potentially troublesome issue is
whether we have properly equated the contrast magnitudes of black and white dots
in this experiment. We showed black and white dots with equal Weber contrast
magnitudes (±40%), but there are other ways of measuring contrast besides
Weber contrast. For instance, the Michelson contrast of the black and white dots
was –25% and +17%, respectively, so according to this measure, the
contrast of the black dots was 1.5 times too high, compared to the white dots.
(Michelson contrast is defined as $(L_{max} - L_{min})/(L_{max} + L_{min})$, where
$L_{max}$ is the maximum luminance in the region of interest and
$L_{min}$ is the minimum luminance.) A mismatch like this might lead us to underestimate
observers’ abilities to direct attention according to contrast polarity,
as the black dots might evoke a stronger response in motion channels than the
white dots, and therefore be more difficult to ignore than black dots with
properly equated contrast magnitudes.
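The contrast figures quoted above can be checked with a short Python calculation, using the 40 cd/m² background and the ±0.40 Weber contrasts given in the stimulus description. The function names are hypothetical, and the Michelson value is given the sign of the dot's polarity so that black dots come out negative, following the signed values quoted in the text.

```python
def luminance_from_weber(weber_contrast, l_background):
    """Invert Weber contrast, (L - L_bg) / L_bg, to get dot luminance."""
    return l_background * (1.0 + weber_contrast)


def signed_michelson(l_dot, l_background):
    """Michelson contrast of a dot against its background, signed by polarity:
    (L_max - L_min) / (L_max + L_min), negative for dots darker than background."""
    return (l_dot - l_background) / (l_dot + l_background)


if __name__ == "__main__":
    l_bg = 40.0  # cd/m^2
    white = luminance_from_weber(0.40, l_bg)    # 56 cd/m^2
    black = luminance_from_weber(-0.40, l_bg)   # 24 cd/m^2
    m_white = signed_michelson(white, l_bg)     # about +0.17
    m_black = signed_michelson(black, l_bg)     # -0.25
    print(m_white, m_black, abs(m_black) / m_white)
```

The ratio of the two Michelson magnitudes is 1.5, which is the mismatch described in the text.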
However, for the following reasons, we believe that we
have correctly matched the contrasts of the white and black dots by setting them
to ±40% Weber contrast. First, Edwards,
Badcock, and Nishida (1996) found that performance in an up-down direction
discrimination task only improved with stimulus contrast up to about 15% Weber
contrast, suggesting that at our contrast level of ±40%, moderate
differences in contrast should have little effect on performance. Second, in the
psychometric function conditions of this experiment, performance was
approximately the same for white and black cinematograms at ±40% contrast.
These two facts are not conclusive, however, as Edwards et al. (1996) found that even though
direction discrimination performance saturated at about 15% contrast when all
dots in a cinematogram had the same contrast, performance worsened when the
contrast of selected noise dots was increased, and continued to worsen as the
contrast of the noise dots was increased up to 80% contrast. Similarly, in a
previous study, we found that when observers judged small differences in the
global direction of a random dot cinematogram, rather than 180° left-right
or up-down direction differences, performance improved with stimulus contrast up
to at least 80% contrast (Murray, Sekuler,
Bennett, & Sekuler, 1998). These studies show that perceptual responses
to global motion stimuli do not always saturate at low contrasts, so it is
important to properly match the contrasts of white and black dots. The most
persuasive evidence, therefore, is that Murray
et al. (1998) measured performance in global direction discrimination tasks
over a wide range of positive and negative contrasts, and found that observers
performed equally well with white and black cinematograms that were equated for
Weber contrast magnitude (e.g., ±40% Weber contrast). All these factors
indicate that we have correctly equated the strength of the targets and
distractors in our stimuli, so that we can use the influence of the distractors
on observers’ responses as an unbiased measure of the attentional weight
that observers assign to the distractors.
A Faster Correlation Method
The correlation method we used in Experiment 1 requires
a large number of trials, because it measures the effect of small statistical
variations in the targets and distractors on the observer’s responses. One
way of measuring attentional weight more quickly would be to introduce larger
trial-to-trial variations into the target and distractor displacements, and to
measure the effect of these variations on the observer’s responses. Here
we describe a method that takes this
approach.
Figure 8 shows a
plot of a hypothetical observer’s decision space for a task in which we
vary both the mean target displacement and the mean distractor displacement from
trial to trial. In this task, signal dots in the target distribution move left
or right, and signal dots in the distractor distribution also move left or
right. The directions of the target and distractor signal dots are chosen
independently on each trial, so the decision space has four clusters of points
corresponding to the four types of trials: target right, distractor right;
target right, distractor left; target left, distractor right; and target left,
distractor left. The observer’s task is to judge the mean direction of the
target dots. Note that the distractor signal dots contain no information as to
the correct response; we call them signal dots only because they move
coherently, rather than moving in random directions. In the task depicted in Figure 8, there are twice as many target signal
dots as distractor signal dots, as indicated by the fact that the mean of each
of the four clusters of trials is twice as far along the target axis as along
the distractor axis.
Figure 8. A hypothetical observer’s
decision space in Experiment 2.
If the observer’s responses are influenced by the
distractor dots in this task, he will give more rightward responses on trials
where both the targets and the distractors move right than on trials where the
target moves right and the distractor moves left. In the decision space, this is
represented by the fact that a greater proportion of trials falls on the right
side of the decision line when both the targets and the distractors move right
(the top-right cluster in Figure 8) than when
the targets move right and the distractors move left (the bottom-right cluster).
By measuring the difference in the proportion of rightward responses, depending
on whether the distractors move right or left, we can determine how much
influence the distractors have on the observer’s responses, and we can
estimate the attentional weight assigned to the distractors.
The Probe Method of Measuring Attentional Weight
Let us consider in more detail how to measure
attentional weight this way. Again, we will assume that the observer uses a
decision variable, $s = T^* + kD^*$, described by Equations 24 through 26. We will derive expressions that show how
attentional weight is related to the probability that the observer responds
“right,” depending on whether the target and distractor signal dots
move left or right.
First, consider the statistics of $T^*$. Let $\mu_T$ be the expected value
of $T^*$ over all trials, and let $\Delta_T$ be the difference in the expected value of
$T^*$ between trials where the target
signal dots move right and trials where they move left. There are an equal
number of signal-left and signal-right trials, so the overall mean $\mu_T$
lies midway between the means over signal-left trials and signal-right trials,
and we can write the mean of $T^*$ as
$\mu_T \pm \Delta_T/2$, where the sign depends on whether the target signal
dots move right or left. Furthermore, the variance of
$T^*$ is the same over trials where the
target signal dots move right and trials where they move left, because the
signal dot displacements are constant within the signal-left and signal-right
classes of trials, and do not contribute to the variance. We will denote the
variance of $T^*$ on signal-left or
signal-right trials as $\sigma_T^2$. For later convenience, we define $d'_T = \Delta_T/\sigma_T$,
which is the sensitivity of $T^*$ to the
difference between signal-left and signal-right trials.
Second, consider the statistics of $D^*$. Just as with
$T^*$, we can write the mean of
$D^*$ as $\mu_D \pm \Delta_D/2$, where the sign
depends on whether the distractor signal dots move left or right. Also, the
variance of $D^*$ is the same regardless
of whether the distractor signal dots move left or right, and we will denote
this variance by $\sigma_D^2$.
Third, consider how the statistics of $T^*$ and $D^*$ are related. Both
$T^*$ and $D^*$ are calculated by summing internal
responses to individual dot displacements, so $\Delta_T$ and $\Delta_D$
are proportional to the number of target and distractor signal dots,
respectively. In the task we are considering (Figure 8), there are twice as many target signal
dots as distractor signal dots, so $\Delta_D = \Delta_T/2$. Furthermore, there
are approximately the same number of target noise dots as distractor noise dots,
so the variances of $T^*$ and
$D^*$ are approximately equal: $\sigma_D^2 \approx \sigma_T^2$.
(We will return to this approximation shortly.)
Finally, consider the statistics of the decision
variable, $s = T^* + kD^*$. The mean of $s$
is $(\mu_T \pm \Delta_T/2) + k(\mu_D \pm \Delta_D/2)$, where the
signs depend on whether the target and distractor signal dots move left or
right. In this task, $\Delta_D = \Delta_T/2$, so the mean of
$s$ is $\mu_T + k\mu_D \pm \Delta_T/2 \pm k\Delta_T/4$. The variance
of $s$ is $\sigma_T^2 + k^2\sigma_D^2$, and to a close
approximation $\sigma_D^2 = \sigma_T^2$, so we can rewrite the variance as
$(1 + k^2)\sigma_T^2$. The midpoint
of the distribution of $s$ is $\mu_T + k\mu_D$, which is
therefore the response criterion of an unbiased observer.
Now we are in a position to see how attentional weight
is related to the probability of a “right” response, depending on
the direction of the target and distractor signal dots. On trials where both the
target and the distractor signal dots move to the right, which we will call RR
trials, the mean of the decision variable is $\mu_T + k\mu_D + \Delta_T/2 + k\Delta_T/4$ and the variance is
$(1 + k^2)\sigma_T^2$. Hence on an RR
trial, the probability that the decision variable exceeds the observer’s
criterion, and the observer responds “right,” is
$$p_{RR} = 1 - \Phi\!\left(\mu_T + k\mu_D,\ \mu_T + k\mu_D + \frac{\Delta_T}{2} + \frac{k\Delta_T}{4},\ \sigma_T\sqrt{1+k^2}\right) = \Phi\!\left(\frac{(2+k)\,d'_T}{4\sqrt{1+k^2}}\right). \tag{30}$$
Here $\Phi(x, \mu, \sigma)$ is the normal
cumulative distribution function, and when we omit arguments $\mu$ and $\sigma$,
they default to 0 and 1, respectively.
On trials where the target signal dots move right and
the distractor signal dots move left, which we will call RL trials, the mean of
the decision variable is $\mu_T + k\mu_D + \Delta_T/2 - k\Delta_T/4$, and the variance is the same as on
RR trials. Hence the probability of the
observer responding “right” on an RL trial is
$$p_{RL} = \Phi\!\left(\frac{(2-k)\,d'_T}{4\sqrt{1+k^2}}\right). \tag{31}$$
Similarly, the probabilities of the
observer responding “right” when the target moves to the left and
the distractor moves to the left ($p_{LL}$) or to the right ($p_{LR}$) are
$$p_{LL} = \Phi\!\left(-\frac{(2+k)\,d'_T}{4\sqrt{1+k^2}}\right), \tag{32}$$
$$p_{LR} = \Phi\!\left(-\frac{(2-k)\,d'_T}{4\sqrt{1+k^2}}\right). \tag{33}$$
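The relationship between the conditional response probabilities and attentional weight can be explored numerically with the following Python sketch. It assumes that the four predicted probabilities take the form $\Phi(\pm(2 \pm k)\,d'_T / (4\sqrt{1+k^2}))$, which is what the derivation above implies for an unbiased observer when there are twice as many target signal dots as distractor signal dots. It then inverts the RR and RL predictions in closed form. This is a simpler procedure than fitting all four probabilities at once, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def predicted_p_right(k, d_target, target_right, distractor_right):
    """Predicted probability of a "right" response for an unbiased observer,
    assuming twice as many target signal dots as distractor signal dots."""
    t = 1.0 if target_right else -1.0
    d = 1.0 if distractor_right else -1.0
    return _STD_NORMAL.cdf(d_target * (2.0 * t + k * d) / (4.0 * sqrt(1.0 + k * k)))


def solve_from_rr_rl(p_rr, p_rl):
    """Recover (k, d'_T) from the RR and RL response probabilities.

    With z = inverse-normal(p): z_rr / z_rl = (2 + k) / (2 - k), so
    k = 2 (z_rr - z_rl) / (z_rr + z_rl). Requires 0 < p < 1 and z_rr != -z_rl.
    """
    z_rr = _STD_NORMAL.inv_cdf(p_rr)
    z_rl = _STD_NORMAL.inv_cdf(p_rl)
    k = 2.0 * (z_rr - z_rl) / (z_rr + z_rl)
    d_target = 4.0 * z_rr * sqrt(1.0 + k * k) / (2.0 + k)
    return k, d_target


if __name__ == "__main__":
    # Hypothetical model observer with known parameters, for illustration.
    p_rr = predicted_p_right(0.6, 2.0, True, True)
    p_rl = predicted_p_right(0.6, 2.0, True, False)
    print(solve_from_rr_rl(p_rr, p_rl))
```

With noiseless probabilities the closed-form inversion recovers the generating parameters exactly. With empirical proportions, an estimate that uses all four trial types is likely to be more stable than one based on a single pair.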
We could solve Equations 30 and 31
to find $k$ and $d'_T$
as a function of the conditional response probabilities, and solve Equations 32 and 33
to give another independent estimate. However, when analyzing data from
simulated model observers with known values of attentional weight, we have found
the estimates of $k$ and $d'_T$
to be less variable when we solve all four equations simultaneously, using a
simplex search to find the values of k
and
d' |