Volume 3, Number 2, Article 2, Pages 116-145 doi:10.1167/3.2.2 http://journalofvision.org/3/2/2/ ISSN 1534-7362
A linear cue combination framework for understanding selective attention
Richard F. Murray
Department of Psychology, University of Toronto, Toronto, Canada
Allison B. Sekuler
Department of Psychology, McMaster University, Hamilton, Canada
Patrick J. Bennett
Department of Psychology, McMaster University, Hamilton, Canada
Abstract

Using a linear cue combination framework, we develop a measure of selective attention that describes the relative weight that an observer assigns to attended and unattended parts of a stimulus when making perceptual judgments. We call this measure attentional weight. We present two methods for measuring attentional weight by calculating the trial-by-trial correlation between the strength of attended and unattended parts of a stimulus and the observer's responses. We illustrate these methods in three experiments that investigate whether observers can direct selective attention according to contrast polarity when judging global direction of motion or global orientation. We find that when observers try to judge the global direction or orientation of the parts of a stimulus with a given contrast polarity (white or black), their responses are nevertheless strongly influenced by parts of the stimulus that have the opposite contrast polarity. Our measure of selective attention indicates that the influence of the opposite-polarity distractors on observers' responses is typically 65% as strong as the influence of the targets in the motion task, and typically 25% as strong as the targets in the orientation task, demonstrating that observers have only a limited ability to direct attention according to contrast polarity. We discuss some of the advantages of using a linear cue combination framework to study selective attention.




History
Received June 18, 2001; published March 18, 2003
Citation
Murray, R. F., Sekuler, A. B., & Bennett, P. J. (2003). A linear cue combination framework for understanding selective attention. Journal of Vision, 3(2):2, 116-145, http://journalofvision.org/3/2/2/, doi:10.1167/3.2.2.
Keywords
selective attention, contrast polarity, global motion, texture, signal detection theory


Introduction
When we make visual judgments about a scene, we can base our judgments on selected parts of the scene, and ignore other parts. This ability is called selective visual attention. We can direct visual attention according to simple stimulus properties, such as spatial location (Posner, Snyder, & Davidson, 1980), color (Brawn & Snowden, 1999), direction of motion (Ball & Sekuler, 1981), and spatial frequency (Davis & Graham, 1981), and perhaps also according to more complex criteria, such as the perceptual segmentation of a scene (Baylis & Driver, 1992; Duncan, 1984; Egly, Driver, & Rafal, 1994). However, selective attention is sometimes imperfect: if targets and distractors differ along certain dimensions, we find that even when we try to attend only to the targets, our judgments are nevertheless influenced by the distractors. This raises the question of how targets and distractors together determine an observer’s responses, and the closely related question of how we should measure intermediate degrees of selective attention.
The problem of how observers combine information from two or more sources to arrive at a single response has a long history in perceptual psychology (Anderson, 1974). One particularly simple hypothesis is that observers calculate a weighted sum of internal responses to individual sources of information. Such weighted sum models have been used to describe how observers perform many different tasks, including detecting an auditory signal with two frequency components that activate different auditory channels (Green, 1958), combining redundant stimulus properties in complex figures (Kinchla, 1977), combining multiple depth cues (Landy, Maloney, Johnston, & Young, 1995), and combining information across different senses (Ernst, Banks, & Bülthoff, 2000; Jacobs, 1999). Applied to the problem of selective attention, the weighted sum hypothesis suggests that if T is an internal response to targets and D is an internal response to wholly or partly unattended distractors, then the observer bases his responses on a decision variable of the form
$$s = T + kD. \tag{1}$$
The weighting factor k measures the influence of the distractors on the observer’s responses, and we will call it the attentional weight that the observer assigns to the distractors.
Here we investigate some theoretical and empirical aspects of this weighted sum theory of selective attention. First, we discuss why we might expect selective attention to work this way. We present a general Bayesian description of how observers perform discrimination tasks, and we show that in many circumstances, it is entirely natural for observers to combine information from attended and partly unattended sources in a weighted sum, as in Equation 1.
Second, we derive two methods for measuring the attentional weight k assigned to distractors in a wide range of tasks, and we show that these methods work even when we do not know how the observer computes the internal responses T and D to the targets and distractors. We illustrate these methods in three experiments that investigate whether observers can direct selective attention according to contrast polarity when judging global direction of motion, or when judging global orientation. Several recent studies have investigated the first question concerning global motion and have given conflicting results (Croner & Albright, 1997; Edwards & Badcock, 1994; Li & Kingdom, 2001; Snowden & Edmunds, 1999; van der Smagt & van de Grind, 1999). The methods that we introduce avoid some of the problems of these earlier studies, and so we hope to give a more convincing answer to the question of whether observers can direct attention according to contrast polarity.
Third, we test an assumption that is implicit in the weighted sum hypothesis, namely that selective attention only affects the relative weight that an observer assigns to the internal responses to the targets and distractors, T and D, without changing the internal responses themselves. This issue is crucial for the problem of how to measure selective attention. If selective attention affects only the relative weight assigned to targets and distractors, then it can be described by a scalar, such as attentional weight. On the other hand, if selective attention qualitatively changes how an observer computes the internal responses T and D, then a more complex description may be necessary. We show how methods developed by Chubb and colleagues (Chubb, 1999; Chubb, Econopouly, & Landy, 1994) can be used to investigate how observers process attended and unattended stimuli, and we illustrate these methods by measuring directional selectivity for attended and partly unattended motion signals in a global direction discrimination task.
We begin with the question of why selective attention might take the form of a single weighting factor.
Why Attentional Weight?
When studying human performance in a perceptual task, it is often revealing to model observers as Bayesian decision-makers who are limited by simple degradations of the stimulus or by imperfect knowledge of the stimulus. For instance, in many shape discrimination tasks, human observers behave like Bayesian observers who view stimuli through a small amount of additive Gaussian noise and have an imperfect representation of the shapes to be discriminated (Barlow, 1956; Lu & Dosher, 1998; Pelli, 1990). Bayesian models are often illuminating, because they make explicit claims about what information observers use to perform a task, and about what types of inefficiencies limit observers’ performances (Geisler, 1989; Watson, 1987). We follow a similar approach to define a measure of selective attention.
Consider a task in which the observer discriminates between two classes of stimuli, A and B. A Bayesian decision-maker performs this task by viewing the stimulus U on each trial, and evaluating the probability that the stimulus was drawn from class A or class B, given that the observed stimulus was U. Bayes’ theorem shows that these probabilities are
$$P(A \mid U) = \frac{P(U \mid A)\,P(A)}{P(U)} \tag{2}$$
$$P(B \mid U) = \frac{P(U \mid B)\,P(B)}{P(U)}. \tag{3}$$
Equivalently, the observer can base his responses on the likelihood ratio L:
$$L = \frac{P(U \mid A)}{P(U \mid B)}. \tag{4}$$
If stimulus types A and B appear equally often, and if the observer’s goal is to maximize the number of correct responses, then the optimal strategy is to respond 'A' if L > 1, and 'B' otherwise (Green & Swets, 1974).
If the stimulus U is composed of many independently varying elements Ui (e.g., a noisy N pixel stimulus, or a random dot cinematogram with N independent dot displacements), then the likelihood ratio L is the product of many subsidiary likelihood ratios ui computed from the stimulus elements Ui:
$$L = \prod_{i=1}^{N} u_i, \quad \text{where } u_i = \frac{P(U_i \mid A)}{P(U_i \mid B)}. \tag{5}$$
Equivalently, the observer can calculate the logarithm of this likelihood ratio, which is the sum of the subsidiary log likelihood ratios:
$$\log L = \sum_{i=1}^{N} \log u_i. \tag{6}$$
A likelihood ratio ui>1 makes it more likely that U belongs to A, and a likelihood ratio ui<1 makes it more likely that U belongs to B. A likelihood ratio ui=1 does not shift the overall likelihood ratio L either way.
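The product and sum forms in Equations 5 and 6 can be sketched in a few lines. This is only an illustration under an assumed toy model that is not part of the text: each element is a scalar drawn from a Gaussian with mean +mu under class A and -mu under class B, with a common standard deviation sigma. The function names and the example stimulus are hypothetical.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def element_likelihood_ratios(elements, mu=1.0, sigma=2.0):
    """u_i = P(U_i | A) / P(U_i | B) for each element.
    Assumed toy model: class A has mean +mu, class B has mean -mu."""
    return [gaussian_pdf(x, mu, sigma) / gaussian_pdf(x, -mu, sigma) for x in elements]

def likelihood_ratio(elements, mu=1.0, sigma=2.0):
    """L as the product of the element likelihood ratios."""
    return math.prod(element_likelihood_ratios(elements, mu, sigma))

def log_likelihood_ratio(elements, mu=1.0, sigma=2.0):
    """log L as the sum of the element log likelihood ratios."""
    return sum(math.log(u) for u in element_likelihood_ratios(elements, mu, sigma))

def respond(elements, mu=1.0, sigma=2.0):
    """Unbiased rule: respond 'A' if L > 1 (that is, log L > 0), 'B' otherwise."""
    return "A" if log_likelihood_ratio(elements, mu, sigma) > 0.0 else "B"

stimulus = [0.5, -0.2, 1.3, 0.1]  # hypothetical element values
```

Under this Gaussian assumption each log likelihood ratio reduces to 2·mu·x/sigma², so the decision depends only on the sum of the element values.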
We should point out that the observer’s estimates of the likelihood ratio L may be correct or incorrect. Often we use a Bayesian framework to derive the ideal observer for a particular task, and certainly the ideal observer must compute the relevant likelihood ratios correctly. More generally, though, a Bayesian framework allows us to model an observer’s beliefs about what can be inferred from an observation, and these beliefs may be correct or incorrect. In other words, just because we describe an observer in a Bayesian framework, we need not assume that the observer follows an ideal strategy.
How could we represent selective attention in this well-known Bayesian pattern classification framework? Suppose that a stimulus contains two classes of elements, Ui and Vj. When the observer selectively attends to Ui, he takes these elements as being more relevant to the task than Vj, and he reduces the influence of Vj on his responses. Another way of saying this is that the observer discounts the evidence provided by Vj, and assigns it a smaller weight in his decision. If we regard the observer as basing his responses on a likelihood ratio as in Equation 5, this amounts to his adjusting the likelihood ratios ui and vj that are computed from the two classes of stimulus elements, Ui and Vj. For instance, if on a particular trial an element V1 would contribute a likelihood ratio of v1=1.2 if attended to, hence biasing the observer’s response toward 'A', an observer who selectively attends away from V1 can be thought of as adjusting the likelihood ratio v1 toward 1.0, so that V1 has less influence on his response. That is, when the observer selectively attends to Ui, he adjusts the likelihood ratios vj by some function f:
$$L = \prod_i u_i \prod_j f(v_j). \tag{7}$$
We will assume that selective attention affects only the likelihood ratios vj corresponding to the elements Vj that the observer selectively attends away from. Later in this section we show that this makes our model only slightly less general than if we allow selective attention to affect both sets of likelihood ratios, ui and vj.
For this description of selective attention to be meaningful, the attenuating function f must satisfy a simple constraint: the likelihood ratio L computed in Equation 7 should not depend on how we conceptually divide the stimulus into independently varying elements Ui and Vj. In particular, our predictions concerning the effects of selective attention should not change if we reformulate our model so that two elements V1 and V2 with likelihood ratios v1 and v2 are now regarded as a single element V1,2 with likelihood ratio v1v2. It follows that
$$f(v_1 v_2) = f(v_1)\,f(v_2). \tag{8}$$
The theory of functional equations (Falmagne, 1985) shows that Equation 8 implies that f is a power function,
$$f(v) = v^{k}. \tag{9}$$
Hence, a reasonable guess for the form of selective attention is
$$L = \prod_i u_i \prod_j v_j^{k} \tag{10}$$
$$\phantom{L} = \prod_i u_i \Big(\prod_j v_j\Big)^{k}. \tag{11}$$
The corresponding log likelihood ratio is
$$\log L = \sum_i \log u_i + k \sum_j \log v_j. \tag{12}$$
If k=0, all likelihood ratios vj are mapped to 1, and the distractor elements Vj have no effect on the observer’s responses. If k=1, the likelihood ratios vj are unaffected, and Vj have their full effect. Note the similarity of Equation 12 to Equation 1, where we defined k as the attentional weight assigned to the distractors.1
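A minimal sketch of the power-function attenuation and the resulting weighted log likelihood ratio follows. The numerical likelihood ratios are hypothetical values chosen for illustration, and the function names are not from the text.

```python
import math

def attenuate(v, k):
    """Power-function attenuation f(v) = v**k of a distractor likelihood ratio."""
    return v ** k

def attended_log_likelihood_ratio(u, v, k):
    """Weighted sum: log L = sum(log u_i) + k * sum(log v_j)."""
    return sum(math.log(x) for x in u) + k * sum(math.log(x) for x in v)

u = [1.5, 0.8, 1.1]  # likelihood ratios from attended elements (hypothetical)
v = [1.2, 1.4]       # likelihood ratios from unattended elements (hypothetical)

ignored = attended_log_likelihood_ratio(u, v, 0.0)   # distractors have no effect
full = attended_log_likelihood_ratio(u, v, 1.0)      # distractors have full effect
```

The power function is multiplicative, so attenuating the merged ratio v1·v2 gives the same result as attenuating v1 and v2 separately, which is the constraint that motivated this form.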
The idea that observers combine information from different sources in a weighted sum has been proposed by many authors for many different tasks, as we discussed in the 'Introduction.' This derivation shows that in tasks where observers selectively attend to one information source rather than another, there are good reasons why they might combine information this way. This formulation leads directly to the notion of attentional weight, which provides a very general way of measuring selective attention, and even gives a meaningful way of comparing the efficacy of selective attention across different tasks.
Finally, suppose that we allow selective attention to affect the likelihoods computed from both targets and distractors:
$$L = \prod_i f_1(u_i) \prod_j f_2(v_j) \tag{13}$$
$$\phantom{L} = \prod_i u_i^{k_1} \prod_j v_j^{k_2}. \tag{14}$$
If we set the attentional weight in Equation 10 to k=k2/k1 (assuming k1 > 0), then the likelihood ratio in Equation 10 exceeds 1 if and only if the likelihood ratio in Equation 14 exceeds 1, so an unbiased observer would give the same response regardless of which expression he used. Hence, for an unbiased observer, we can assume that selective attention affects only the likelihood ratios corresponding to unattended stimuli. If an observer is biased (i.e., adopts a likelihood ratio criterion different from 1), then models (10) and (14) are not equivalent, and we might be able to compare these models experimentally by persuading the observer to use an extreme criterion. Here we do not consider the case of a biased observer.2
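The equivalence for an unbiased observer can be checked numerically. The sketch below assumes that the two-weight model raises the target likelihood ratios to a power k1 > 0 and the distractor likelihood ratios to a power k2, and it draws hypothetical likelihood ratios at random. It is an illustration of the argument, not a procedure from the text.

```python
import math
import random

def log_lr_single_weight(u, v, k):
    """Only the distractor ratios are attenuated: sum(log u) + k * sum(log v)."""
    return sum(math.log(x) for x in u) + k * sum(math.log(x) for x in v)

def log_lr_two_weights(u, v, k1, k2):
    """Both sets of ratios are attenuated: k1 * sum(log u) + k2 * sum(log v)."""
    return k1 * sum(math.log(x) for x in u) + k2 * sum(math.log(x) for x in v)

def decisions_agree(k1, k2, n_trials=1000, seed=0):
    """True if an unbiased observer (criterion log L = 0) gives the same response
    under both models on every simulated trial, with k = k2 / k1."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        # Hypothetical likelihood ratios, log-normally distributed around 1.
        u = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(5)]
        v = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(5)]
        single = log_lr_single_weight(u, v, k2 / k1) > 0.0
        double = log_lr_two_weights(u, v, k1, k2) > 0.0
        if single != double:
            return False
    return True
```

The two log likelihood ratios differ only by the positive factor k1, so they always share a sign. With a criterion other than zero this no longer holds, which is why the biased case is different.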
An Illustration: Selective Attention and Contrast Polarity
As an illustration, we will apply this framework to the question of whether observers can direct attention according to contrast polarity when judging global direction of motion. Edwards and Badcock (1994) argued that this question is relevant to whether signals in ON and OFF pathways merge before reaching cortical area MT, which plays an important role in computing global direction of motion (Newsome & Paré, 1988). The question is also interesting from a purely psychological point of view, as it addresses a basic question about the capabilities of selective attention.
In Edwards and Badcock’s (1994) experiments, observers viewed random dot cinematograms that contained an equal number of white target dots and black distractor dots. A small number of white target dots all moved either directly upward or directly downward, whereas the remaining white target dots and all the black distractor dots moved in random directions. Observers judged whether all the white dots moved on average upward or downward. The question Edwards and Badcock (1994) posed was, “Can observers judge the direction of only the white dots, or do the black dots disrupt the ability to discriminate between upward and downward motion of the white dots?” (In the following section, we will assume that the dots move on average to the left or to the right, rather than upward or downward, as this was the case in the experiments we report later in this work.)
In this task, a Bayesian observer could take each dot displacement as a piece of evidence that the correct answer is “left” or “right,” as in Equation 5. Such an observer would compute the product of the likelihood ratios corresponding to the individual dot displacements, and set a criterion to discriminate between movement to the left and to the right. Equivalently, the observer could calculate the sum of the log likelihood ratios corresponding to the dot displacements, as in Equation 6. This sum of quantities corresponding to individual dot displacements can often be redescribed more intuitively. For instance, if the observer assumes that the distribution of dot directions is Gaussian, then the sum of log likelihood ratios simply measures the total horizontal displacement of all the target dots; an unbiased observer who follows this strategy responds “left” if the total displacement is leftward, and “right” if the displacement is rightward (Watamaniuk, 1993). Alternatively, the observer could base his responses on the output of more narrowly tuned motion channels, perhaps considering only the number of dots that move directly to the left or to the right. To be concrete, we will assume that observers base their responses on the total horizontal displacement of all the dots, but in a later section (“A More General Model”) we show that our results do not depend on this assumption.3
We can plot the total horizontal displacements of the white target dots and the black distractor dots on orthogonal axes (Figure 1). In this plot, each point represents a single trial. The x-component of each point is the total horizontal displacement of all the target dots on that trial (i.e., the sum of the horizontal displacements of the individual target dots), and the y-component is the total horizontal displacement of all the distractor dots. The cluster on the left represents trials on which the correct answer is “left” and the cluster on the right represents trials on which the correct answer is “right.” Because the dots take finite random walks, there is trial-to-trial variability in their horizontal displacements.
fig01.gif
Figure 1. A hypothetical observer’s decision space in Experiment 1. Each point represents a single trial. The x-coordinate of each point is the total horizontal displacement of all the target dots on a trial, and the y-coordinate is the horizontal displacement of the distractor dots. The red and blue lines are illustrative decision lines.
This plot represents the decision space of an observer who bases his responses on the total horizontal displacements of the target and distractor dots. Ideally, the observer should ignore the displacement of the distractors, as this quantity gives no information as to the correct response. For such an observer, the decision variable, which we will call s, is equal to the target displacement, which we will call T. An unbiased observer of this type responds “right” if s is greater than zero, and “left” if s is less than zero. This strategy can be represented as a vertical decision line that divides the decision space in two (e.g., the red line in Figure 1). On the other hand, if the observer cannot selectively attend to the target dots, his responses will be based on some combination of the total horizontal target displacement T and the total horizontal distractor displacement, which we will call D. As in Equation 12, we will model the observer’s decision variable s as a weighted sum of the internal responses to the target and distractor dots:
$$s = T + kD. \tag{15}$$
The attentional weight k assigned to the distractor dots determines the influence of the distractors on the observer’s responses. For an observer for whom k ≠ 0, the decision line is not vertical, but rather has slope –1/k (e.g., the blue line in Figure 1).
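As a small illustration of this decision rule, the sketch below computes the decision variable as the weighted sum of target and distractor displacement, with weight k on the distractors, and locates the decision line. The function names and the example displacements are hypothetical.

```python
def decision_variable(t, d, k):
    """Weighted sum of target displacement t and distractor displacement d."""
    return t + k * d

def respond(t, d, k):
    """Unbiased observer: 'right' if the decision variable is positive, else 'left'."""
    return "right" if decision_variable(t, d, k) > 0.0 else "left"

def decision_line_d(t, k):
    """Distractor displacement on the decision line (decision variable = 0)
    for a given target displacement t. Requires k != 0; the line has slope -1/k."""
    return -t / k

# With k = 0 a leftward distractor displacement cannot change the response;
# with k = 0.5 a large enough leftward distractor displacement flips it.
ideal = respond(1.0, -3.0, 0.0)
leaky = respond(1.0, -3.0, 0.5)
```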
The weighted sum of target and distractor displacements in Equation 15 would be a natural first attempt at modeling selective attention in this task, even on grounds of simplicity. We wish to emphasize, though, that we arrived at Equation 15 by a different, less pragmatic argument. First, we noted that several studies support the notion that observers judge the global direction of a random dot cinematogram by summing internal responses to individual dot displacements (Watamaniuk, 1993; Watamaniuk, Sekuler, & Williams, 1989; Williams, Tweten, & Sekuler, 1991; Zohary, Scase, & Braddick, 1996). Second, we reasoned that any observer who arrives at a response by summing several quantities computed from a stimulus can be regarded as a Bayesian observer who sums log likelihood ratios, as in Equation 6. Third, our derivation of attentional weight showed that one simple and plausible way of describing the effects of selective attention is with a single weighting factor in a sum of log likelihood ratios, as in Equation 12. These considerations show that the weighting factor k in Equation 15 is not just an arbitrary free parameter, but that it is actually the attentional weight that our hypothetical observer assigns to the distractors. Hence Equation 15 results from a direct application of our account of attentional weight to the task of judging the direction of target dots mixed with distractor dots in a random dot cinematogram.
How to Measure Attentional Weight
According to the account we have outlined, a key problem in the study of selective attention is measuring the attentional weight that an observer assigns to distractors. We now describe a simple method of doing this.
Figure 2 is a plot of our hypothetical observer’s decision space, showing only trials on which the correct answer is “right.” The large black dot M shows the mean total horizontal displacement of the target and distractor dots over all trials where the target signal dots move right, indicating that on average the target dots move to the right and the distractor dots have zero displacement. The green dot MR shows the mean target and distractor displacements over all trials where the target moves right and the observer responds “right.” As indicated by the dashed line in Figure 2, this conditional mean is shifted from the unconditional mean along a line that is perpendicular to the decision line. This follows from the fact that the distribution of target and distractor displacements is radially symmetric: the part of the distribution that falls on one side of the decision line is mirror-symmetric about a line that is perpendicular to the decision line and passes through M, so the mean over all trials where the observer responds “right” must lie along this line. Similarly, the mean displacement ML over all trials where the target moves right and the observer responds “left” is shifted from the overall mean along the same line in the opposite direction, as indicated by the large red dot. The slope of the decision line is –1/k, so the slope of the perpendicular line connecting the two conditional means is k, the attentional weight that the observer assigns to the distractor dots.
fig02.gif
Figure 2. Part of a hypothetical observer’s decision space, showing trials on which the target signal dots move to the right. M is the mean over all trials, MR is the mean over trials where the observer responds “right,” and ML is the mean over trials where the observer responds “left.”
Let the random variable C represent the correct response on a given trial, taking the value +1 or –1 on trials where the correct response is “right” or “left,” respectively. Similarly, let the random variable R represent the observer’s responses, taking the value +1 or –1 on trials where the observer responds “right” or “left,” respectively. With this notation, the coordinates of the conditional mean displacements MR and ML are
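$$M_R = \big(E[T \mid R=+1, C=+1],\; E[D \mid R=+1, C=+1]\big), \quad M_L = \big(E[T \mid R=-1, C=+1],\; E[D \mid R=-1, C=+1]\big), \tag{16}$$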
and we have just shown that the slope of the line connecting these points is k:
$$k = \frac{E[D \mid R=+1, C=+1] - E[D \mid R=-1, C=+1]}{E[T \mid R=+1, C=+1] - E[T \mid R=-1, C=+1]}. \tag{17}$$
We can obtain a second, independent estimate of k by using Equation 17 with C=–1 (i.e., finding the slope of the line connecting the conditional means over all trials where the correct answer is “left”).
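The conditional-mean method can be tried on simulated data. The sketch below assumes an idealized noiseless observer whose decision variable is T + kD, with T and D drawn as independent equal-variance Gaussians on trials where the correct answer is "right". The parameter values, seed, and function names are hypothetical.

```python
import random

def simulate_trials(k, n_trials=40000, signal=1.0, sigma=1.0, seed=1):
    """Simulate trials with C = +1. Returns (t, d, r) tuples, where the observer
    responds r = +1 ("right") if t + k * d > 0 and r = -1 ("left") otherwise."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        t = rng.gauss(signal, sigma)  # total target displacement
        d = rng.gauss(0.0, sigma)     # total distractor displacement
        r = 1 if t + k * d > 0.0 else -1
        trials.append((t, d, r))
    return trials

def mean(xs):
    return sum(xs) / len(xs)

def slope_estimate(trials):
    """Slope of the line joining the conditional means for 'right' and 'left' responses."""
    right = [(t, d) for t, d, r in trials if r == 1]
    left = [(t, d) for t, d, r in trials if r == -1]
    delta_t = mean([t for t, _ in right]) - mean([t for t, _ in left])
    delta_d = mean([d for _, d in right]) - mean([d for _, d in left])
    return delta_d / delta_t

k_hat = slope_estimate(simulate_trials(k=0.5))  # should be close to 0.5
```

The estimate is only defined when the simulated observer makes both kinds of response, so the signal strength has to leave some errors.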
We could stop here, as Equation 17 shows how to calculate k from measurable quantities, but a reformulation makes the meaning of this expression much clearer. First, note that the coordinates of MR and ML with respect to an origin M = (E[T | C=+1], E[D | C=+1]) at the mean of the distribution of target and distractor displacements are
$$M_R' = \big(E[T \mid R=+1, C=+1] - E[T \mid C=+1],\; E[D \mid R=+1, C=+1] - E[D \mid C=+1]\big), \quad M_L' = \big(E[T \mid R=-1, C=+1] - E[T \mid C=+1],\; E[D \mid R=-1, C=+1] - E[D \mid C=+1]\big), \tag{18}$$
and, of course, we obtain the same value of k if we calculate the slope of the connecting line in this coordinate frame. Second, because MR', ML', and M are collinear, we obtain the same value for k if we multiply MR' by p and multiply ML' by 1 – p, where p = P(R=+1 | C=+1). These transformations convert Equation 17 into a ratio of covariances:
$$k = \frac{p\,\big(E[D \mid R=+1, C=+1] - E[D \mid C=+1]\big) - (1-p)\,\big(E[D \mid R=-1, C=+1] - E[D \mid C=+1]\big)}{p\,\big(E[T \mid R=+1, C=+1] - E[T \mid C=+1]\big) - (1-p)\,\big(E[T \mid R=-1, C=+1] - E[T \mid C=+1]\big)} \tag{19}$$
$$\phantom{k} = \frac{E[DR \mid C=+1] - E[D \mid C=+1]\,E[R \mid C=+1]}{E[TR \mid C=+1] - E[T \mid C=+1]\,E[R \mid C=+1]} \tag{20}$$
$$\phantom{k} = \frac{\mathrm{Cov}(D, R \mid C=+1)}{\mathrm{Cov}(T, R \mid C=+1)}. \tag{21}$$
Hence to find the attentional weight that the observer assigns to the distractor dots, we can measure the covariance between the total horizontal target and distractor dot displacements and the observer’s responses, over all trials where the correct answer is “right,” and take the ratio of these two covariances. That is, the attentional weight is equal to the influence of the distractor dots on the observer’s responses, as a proportion of the influence of the target dots on the observer’s responses.
Strictly speaking, Equation 21 requires a small correction. We have assumed that the distribution of (T, D) is radially symmetric over all trials where the target dots move in a given direction. The random variables T and D are independent, so this is true only if they are Gaussian and their variances are equal. Both T and D are the sum of many horizontal dot displacements, so the central limit theorem ensures that they will be approximately Gaussian. However, in the random dot cinematograms in the experiments we report below, there are an equal number of target and distractor dots, and a small number of target dots always move in a given direction, so there are slightly fewer randomly moving target dots than randomly moving distractor dots. Consequently, the variance of T is actually slightly less than the variance of D. In Appendix A, we show that we can correct for this difference by adjusting k by a factor (N – nT)/N, where N is the total number of target dots, and nT is the number of target dots that move directly left or right. The corrected expressions are
$$k = \frac{N - n_T}{N} \cdot \frac{E[D \mid R=+1, C=+1] - E[D \mid R=-1, C=+1]}{E[T \mid R=+1, C=+1] - E[T \mid R=-1, C=+1]} \tag{22}$$
$$k = \frac{N - n_T}{N} \cdot \frac{\mathrm{Cov}(D, R \mid C=+1)}{\mathrm{Cov}(T, R \mid C=+1)}. \tag{23}$$
When the coherently moving target dots make up only a small proportion of the dots in the cinematogram, as is usual, this correction is negligible compared to experimental error.
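The covariance-ratio method, together with the variance correction, can be sketched with a dot-level simulation. Several things here are assumptions made for illustration: each dot takes a single unit step, random dots move in uniformly distributed directions, the observer is noiseless with decision variable T + kD, and the correction factor is taken to be the variance ratio (N - nT)/N implied by the argument about the number of randomly moving dots (the original expression is derived in Appendix A). The dot counts, seed, and function names are hypothetical.

```python
import math
import random

def simulate_dot_trials(k, n_dots=20, n_signal=2, n_trials=20000, seed=2):
    """Trials on which the signal dots step directly right (C = +1).
    Returns (t, d, r): total horizontal target displacement, total horizontal
    distractor displacement, and the response (+1 = right, -1 = left)."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        # Signal dots contribute +1 each; the remaining target dots are random.
        t = float(n_signal) + sum(
            math.cos(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_dots - n_signal))
        d = sum(math.cos(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_dots))
        r = 1 if t + k * d > 0.0 else -1
        trials.append((t, d, r))
    return trials

def covariance(xs, ys):
    """Sample covariance (population form) of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def covariance_ratio(trials):
    """Uncorrected estimate: Cov(D, R) / Cov(T, R) over the given trials."""
    ts = [t for t, _, _ in trials]
    ds = [d for _, d, _ in trials]
    rs = [r for _, _, r in trials]
    return covariance(ds, rs) / covariance(ts, rs)

def corrected_weight(trials, n_dots, n_signal):
    """Estimate scaled by the assumed variance-ratio correction (N - nT) / N."""
    return covariance_ratio(trials) * (n_dots - n_signal) / n_dots

trials = simulate_dot_trials(k=0.6)
k_hat = corrected_weight(trials, n_dots=20, n_signal=2)  # should be close to 0.6
```

With only 20 dots the displacement sums are only approximately Gaussian, so the recovered weight is approximate even apart from sampling error.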
This correlation method is closely related to the classification image method used in psychophysics to characterize the computation that an observer uses to perform a perceptual task (Ahumada & Lovell, 1971; Beard & Ahumada, 1998; Gold, Murray, Bennett, & Sekuler, 2000; Neri, Parker, & Blakemore, 1999), and to the reverse correlation method used in neurophysiology to map receptive fields (Chichilnisky, 2001; Pinter & Nabet, 1992). Our method reduces the stimulus to two numbers, the total horizontal target and distractor displacements, and measures the correlation of these quantities with the observer’s responses. As in the classification image and reverse correlation methods, these correlations reveal the linear component of the computation that the observer uses to perform the task.
It should be clear that this correlation method could be useful even outside the linear cue combination framework. If we measure the correlation of targets and distractors with an observer’s responses, and find that the distractors have as strong an influence on an observer’s responses as the targets do, then clearly we can conclude that the observer has little ability to selectively attend to the targets, even if we have no reason to believe that the observer uses a decision variable that is a weighted sum of internal responses, as in Equation 1. That is, regardless of how the observer makes his responses, the correlation ratio gives a rough measure of how much an observer’s responses are influenced by distractors.
In Experiment 1, we illustrate this correlation method by measuring the attentional weight that observers assign to distractor dots in a global direction discrimination task.
A More General Model
Up to now, we have assumed that the observer’s decision variable is a weighted sum of the total horizontal displacements of the targets and distractors, s = T + kD. This allowed us to calculate the exact value of the random variables T and D on each trial, directly from the stimulus. With this information, we were able to locate each trial in the observer’s decision space, as in Figure 2, and recover the attentional weight k by finding the slope of the line connecting the mean internal responses over all trials where the observer responded “left” or “right.” However, real observers’ decision variables are certainly not s = T + kD. First of all, real observers have internal noise, and, second, observers might compute some quantity other than the horizontal displacement of the target and distractor dots (e.g., an observer might count the number of dots that move directly to the left or right, or monitor the activation of 30º-wide motion channels). This seems to pose a problem for our method of measuring attentional weight, as this method apparently relies on our knowing the observer’s internal responses to the target and distractor dots on every trial.
In fact, the methods given by Equations 17 and 21 are valid under a much broader range of conditions than we have shown so far. In Appendix A, we show that we need assume only that the observer’s decision variable fits the following model, which is similar to the very general Bayesian decision variable in Equation 12, except that it explicitly introduces noise into the observer’s decisions.
First, we assume that the observer’s decision variable is a weighted sum of a quantity T* computed from the target dots and a quantity D* computed from the distractor dots:
$$s = T^* + kD^*. \tag{24}$$
Second, we assume that T* and D* are computed by summing responses to individual target and distractor dot displacements, and that the observer has the same selectivity f for target and distractor dot displacements. We also assume that T* and D* are contaminated by independent, equal-variance internal noise sources ZT and ZD. Thus we can write the internal responses T* and D* as
$$T^* = \sum_{i=1}^{N} f(t_i) + Z_T \tag{25}$$
$$D^* = \sum_{i=1}^{N} f(d_i) + Z_D. \tag{26}$$
Here ti and di are random variables, perhaps multidimensional, that describe the relevant properties of individual target and distractor dot displacements, respectively. For instance, to describe an observer who performs the direction discrimination task using 30º-wide motion channels, but is less affected by dots at greater eccentricities, the random variables ti and di would report the direction and eccentricity of each dot displacement, and the function f would describe the observer’s selectivity to dots in each direction, at each eccentricity. Such noisy linear-filter models have been found to give a good account of global motion perception under a wide range of conditions (Watamaniuk, 1993; Zohary et al., 1996).
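To make the structure of this model concrete, the sketch below implements one hypothetical selectivity function of the kind described: 30-degree-wide rightward and leftward channels whose response falls off with eccentricity. The fall-off rule 1/(1 + eccentricity), the dot values, and all function names are assumptions for illustration only.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

def selectivity(dot):
    """Hypothetical f: +w for dots within 15 deg of rightward (0 deg), -w for dots
    within 15 deg of leftward (180 deg), 0 otherwise, with w = 1 / (1 + eccentricity)."""
    direction_deg, eccentricity = dot
    weight = 1.0 / (1.0 + eccentricity)
    if angular_distance(direction_deg, 0.0) <= 15.0:
        return weight
    if angular_distance(direction_deg, 180.0) <= 15.0:
        return -weight
    return 0.0

def internal_response(dots, noise=0.0):
    """Sum of f over the dots plus an internal noise sample (T* or D*)."""
    return sum(selectivity(dot) for dot in dots) + noise

def decision_variable(targets, distractors, k, z_t=0.0, z_d=0.0):
    """Decision variable T* + k * D*, using the same f for targets and distractors."""
    return internal_response(targets, z_t) + k * internal_response(distractors, z_d)

# Each dot is (direction in degrees, eccentricity); values are hypothetical.
targets = [(0.0, 0.0), (180.0, 1.0), (90.0, 0.0)]
distractors = [(10.0, 1.0), (350.0, 3.0)]
s = decision_variable(targets, distractors, k=0.4)
```

In a simulation the noise arguments would be drawn independently on each trial from distributions with equal variance, as the model assumes.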
One straightforward way of testing this model is by measuring the observer’s psychometric function, which the following argument shows should be linear when plotted as d' versus the number of signal dots. Let fR and fL be the mean value of f(ti) when ti is a dot that steps directly to the right or to the left, respectively. If an observer can perform the direction discrimination task at all, then fR ≠ fL, and in a task with nT target signal dots, the difference in the mean of T* when the dots move to the right or to the left is nT(fR – fL). Furthermore, if nT is much smaller than the total number of dots in the cinematogram, then the variance σs² of the observer’s decision variable is largely independent of nT. Consequently, the observer’s sensitivity is d' = nT(fR – fL)/σs, indicating that the psychometric function is linear when plotted as d' versus the number of signal dots. In Experiment 1, we measured psychometric functions in a global direction discrimination task to test the linearity assumption implicit in this model.
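The predicted linearity can be written out directly. In the sketch below the sensitivity formula follows the argument above, while the conversion to proportion correct assumes an unbiased observer and an equal-variance Gaussian decision variable, which is a standard signal detection assumption rather than something stated in this section. The parameter values are hypothetical.

```python
import math

def d_prime(n_signal, f_right, f_left, sigma_s):
    """Sensitivity d' = nT * (fR - fL) / sigma_s, linear in the number of signal dots."""
    return n_signal * (f_right - f_left) / sigma_s

def proportion_correct(d):
    """Proportion correct for an unbiased observer with an equal-variance Gaussian
    decision variable: Phi(d' / 2), where Phi is the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(d / (2.0 * math.sqrt(2.0))))

# Doubling the number of signal dots doubles d' when sigma_s is held fixed.
low = d_prime(10, 1.0, -1.0, 8.0)
high = d_prime(20, 1.0, -1.0, 8.0)
```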
Same Selectivity for Attended and Unattended Stimuli?
According to Equation 24, the observer computes the same internal response D* from the distractors, regardless of whether the distractors are fully attended (k=1) or partially or completely unattended (k<1); selective attention merely modulates the influence of this internal response on the decision variable. In other words, this account implies that selective attention does not qualitatively change how the observer processes the distractors, but only attenuates the influence that the distractors have on the observer’s responses. Of course, we cannot know a priori whether this is true of human observers, and it may be that in some tasks, processing of attended and unattended stimuli is qualitatively different. For instance, it may be that when observers judge global direction of motion in random dot cinematograms, the directional selectivity of motion channels is different for attended and partly unattended dots. Accordingly, we cannot be certain that attentional weight is an appropriate measure of selective attention until we compare how observers process attended and unattended stimuli.
Chubb and colleagues (Chubb, 1999; Chubb et al., 1994) have developed a method of characterizing observers’ strategies in perceptual tasks by measuring the influence of small stimulus elements on the observers’ responses. They call this method histogram contrast analysis (HCA). In Appendix B, we describe a version of HCA that allows us to measure the directional selectivity of the motion channels that an observer applies to attended and unattended stimuli. We show that if the observer bases his responses on a linear motion channel with directional selectivity f(θ), then we can estimate the directional selectivity function f(θ) by measuring the influence that each dot moving in direction θ has on the observer’s responses. Specifically, we show that the conditional probability that an observer responds ”right” when an arbitrarily chosen dot moves in direction θ is related to the directional selectivity function f(θ) as follows:
$P(\text{right} \mid \theta) = u + v\,f(\theta)$, (27)
where u and v are constants. In Experiment 1, we used the HCA method to compare direction selectivity for attended and unattended dots in a global direction discrimination task.
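The idea behind this analysis can be illustrated with a simulated model observer. The sketch below is not the procedure of Appendix B. It assumes a hypothetical observer whose decision variable is the sum of cos θ over all dots plus Gaussian internal noise, quantizes dot directions to eight values, and tabulates the proportion of “right” responses as a function of the direction of each dot. All constants are illustrative.

```python
import math
import random

rng = random.Random(1)
N_DIRECTIONS = 8          # hypothetical: directions quantized to multiples of 45 deg
N_DOTS = 20               # hypothetical number of dots per trial
N_TRIALS = 4000
INTERNAL_NOISE_SD = 1.0

def f(theta):
    """Assumed directional selectivity of the model observer."""
    return math.cos(theta)

right_counts = [0] * N_DIRECTIONS
total_counts = [0] * N_DIRECTIONS

for _ in range(N_TRIALS):
    bins = [rng.randrange(N_DIRECTIONS) for _ in range(N_DOTS)]
    decision = sum(f(2 * math.pi * b / N_DIRECTIONS) for b in bins)
    decision += rng.gauss(0.0, INTERNAL_NOISE_SD)
    responded_right = decision > 0
    # Credit the response to the direction of every dot shown on this trial.
    for b in bins:
        total_counts[b] += 1
        right_counts[b] += responded_right

p_right = [r / t for r, t in zip(right_counts, total_counts)]
for b, p in enumerate(p_right):
    print(f"theta = {b * 360 // N_DIRECTIONS:3d} deg   P(right | theta) = {p:.3f}")
```

For this model observer, the tabulated probabilities rise above 0.5 for rightward dots and fall below 0.5 for leftward dots, tracing out a scaled and shifted copy of the assumed selectivity function.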
Experiment 1
In the first experiment, we applied the methods we have described in the previous sections to a global direction discrimination task. First, we measured psychometric functions in a task where observers judged the global direction of black or white random dot cinematograms, in order to see whether observers met the linearity assumption of the model given by Equations 24 through 26, which underpins our other methods. Second, we measured the attentional weight that observers assigned to distractors, in a task where observers judged the global direction of motion of white target dots in a random dot cinematogram. In one condition, the white target dots were mixed with black distractor dots. This condition tested whether observers could direct attention according to contrast polarity. In a second condition, the white target dots were mixed with white distractor dots. This condition served as a validation condition for our method of measuring attentional weight, as observers could not distinguish between targets and distractors,4 and so we knew in advance that the correct value of attentional weight was k=1. Finally, we used the HCA method developed by Chubb et al. (1999) to measure directional selectivity for target and distractor dots, to see whether selective attention led to qualitative differences in processing of targets and distractors, or merely reduced the influence of distractors on observers’ responses.
Methods
Participants
One author (R.F.M.) and four University of Toronto students participated. Two observers (R.F.M. and C.P.T.) were practiced at direction discrimination in random dot cinematograms and were aware of the hypotheses being tested. The other three observers were not practiced at this task and were unaware of the hypotheses. All observers in all experiments reported in this paper had normal or corrected-to-normal Snellen acuity.
Stimuli
Psychometric Function Conditions (100L, 100D)
The stimuli in the psychometric function conditions were eight-frame random dot cinematograms (Figure 3). Each frame lasted 45 ms, and the entire cinematogram lasted 360 ms. In each frame, 100 dots of radius 0.10 deg of visual angle appeared in a circular aperture of radius 6.0 deg. Between successive frames, a number of dots (the “signal dots”) moved 0.30 deg to the left or to the right, and the remainder (the “noise dots”) moved an equal distance in random directions. On a given trial, all the signal dots moved in the same direction. On each frame, a new random subset of dots was chosen as signal dots. The lifetime of each dot was eight frames. In the 100L condition, the dots were white (Weber contrast 0.40; Figure 3a), and in the 100D condition, the dots were black (Weber contrast –0.40; Figure 3b). Weber contrast is defined as $(L - L_{bg})/L_{bg}$, where L is the luminance of the point of interest, and Lbg is background luminance. The stimuli were shown on a gray background of luminance 40 cd/m².
Figure 3. Stimuli in Experiments 1 and 2. Panels: (a) 100L, (b) 100D, (c) 50L50L, (d) 50L50D.
Stimuli were displayed on an AppleVision 1710 monitor (640 × 480 resolution, pixel size 0.467 mm, refresh rate 67 Hz). Observers viewed the stimuli binocularly from a distance of 1 m, and head position was stabilized using a chin-and-forehead rest.
Attention Conditions (50L50L, 50L50D)
The stimuli in the attention conditions were similar to those in the psychometric function conditions, but the dots were divided into two 50-dot subsets. Fifty dots were target dots: between successive frames, a number of dots in this subset (the signal dots) moved 0.30 deg to the left or to the right, and the remainder (the noise dots) moved an equal distance in random directions. From frame to frame, a new random subset of the 50 target dots was chosen as signal dots. The other 50 dots in the cinematogram were distractor dots: between successive frames, all the dots in this subset moved 0.30 deg in random directions. In both the target and the distractor subsets, the lifetime of each dot was eight frames. In the 50L50L stimulus, both the targets and the distractors were white (Weber contrast 0.40; Figure 3c). In the 50L50D stimulus, the targets were white and the distractors were black (Weber contrast ±0.40; Figure 3d). These stimuli are similar to those used by Edwards and Badcock (1994), the main difference being that in Edwards and Badcock’s 100L stimulus, any of the 100 dots could become a signal dot, whereas in our 50L50L stimulus, only the 50 target dots could become signal dots, and all 50 distractor dots took unbiased random walks.
Procedure
Psychometric Function Conditions
Two observers (J.A.P. and S.U.M.) participated in two to three 1-hr sessions. Each session consisted of 18 blocks of 100 trials. One half the blocks were 100L blocks, one half were 100D blocks, and the session alternated between the two types of blocks. Each trial began with a 500-ms fixation interval, followed by a 360-ms random dot cinematogram, followed by a response interval in which the observer pressed one of two keys to indicate whether the mean direction of the dots was to the left or to the right. Auditory feedback indicated whether the observer’s response was correct. A small white fixation dot appeared at the center of the screen throughout the trial in the 100L condition, and a small black fixation dot appeared in the 100D condition. The number of signal dots varied across trials according to the method of constant stimuli. The numbers of signal dots were chosen to span each observer’s psychometric function, based on a short pilot session. For observer J.A.P., the signal levels were 2, 4, 8, 12, and 16 signal dots per frame, and for observer S.U.M., they were 5, 10, 15, 20, and 25 signal dots per frame.
Attention Conditions
Three observers (A.N.C., C.P.T., and R.F.M.) participated in four to eight 1-hr sessions. Each session consisted of eight blocks of 300 trials. One half the blocks were 50L50L blocks, one half were 50L50D blocks, and the session alternated between the two types of blocks. The sequence of events in a trial was the same as in the 100L and 100D conditions. For each observer, the number of signal dots per frame was fixed at a number found during a pilot session to give approximately 70% correct performance. For observer A.N.C., this was eight signal dots per frame, for C.P.T., six signal dots per frame, and for R.F.M., two signal dots per frame.
In both the 50L50L and 50L50D conditions, observers were instructed to indicate the mean direction of the white dots. In the 50L50L condition, the targets and distractors were indistinguishable, so we assumed that instructions to selectively attend to the target dots would merely frustrate the observers. Furthermore, the purpose of the 50L50L condition was to measure attentional weight in a condition where observers attended equally to the targets and the distractors, and instructions to judge the mean direction of all the white dots encouraged observers to follow this strategy.
Results and Discussion
Psychometric Functions
Figure 4 shows psychometric functions for both observers in the 100L and 100D conditions. The functions were approximately linear, supporting our hypothesis that the observers’ decision variable is a linear sum of responses to individual dot displacements.
Figure 4. Psychometric functions in the 100L and 100D conditions. The error bars are SEs, and are often smaller than the data points.
50L50L Condition
Figure 5 shows the results of the 50L50L condition for all three observers. Each small X represents a single trial on which the observer responded “left,” and each small O represents a trial on which the observer responded “right.” The x-coordinate of each small X and O shows the total horizontal displacement of the target dots on that trial, and the y-coordinate shows the total horizontal displacement of the distractor dots. The cluster on the left represents trials on which the correct answer was “left,” and the cluster on the right represents trials on which the correct answer was “right.” Only 150 randomly chosen trials are shown, to keep the graphs from being too cluttered. The red and green dots represent the mean displacements over all trials on which the observer responded “left” and “right,” respectively. The pair of red and green dots on the left side of each observer’s plot represents the means over all trials on which the correct answer was “left,” and the pair on the right represents the means over all trials where the correct answer was “right.”
Figure 5. Results of Experiment 1, 50L50L condition. Each X represents a trial on which the observer responded “left,” and each O represents a trial on which the observer responded “right.” The x-coordinate of each X and O shows the total horizontal displacement of the target dots on that trial, and the y-coordinate shows the total horizontal displacement of the distractor dots. The cluster on the left represents trials on which the correct answer was “left,” and the cluster on the right represents trials on which the correct answer was “right.” The red and green dots show the mean displacements over all trials on which the observer responded “left” and “right,” respectively. The blue lines are the observers’ decision lines, T+kD=0.
The left and right clusters of data points are separated by different distances in each observer’s plot, because each observer required a different number of signal dots per frame to maintain 70% correct performance. For instance, observer A.N.C. required eight signal dots, whereas R.F.M. required only two, so the distance between the clusters is four times larger for A.N.C. than for R.F.M. For the highly practiced observer R.F.M., a large number of the trials on which the nominally correct answer was “right” actually had mean target displacements to the left, and vice versa. This indicates that R.F.M. used such an efficient strategy that his performance was largely limited by statistical noise in the stimulus itself.
Note that the mean displacements on trials where the observer responded “right” (the green dots) are shifted upward and to the right of the mean displacements on trials where the observer responded “left” (the red dots). The horizontal shift indicates that the target displacement was correlated with observers’ left/right responses, and the vertical shift indicates that the distractor displacement was also correlated with observers’ responses. Furthermore, the vertical displacement was typically just as large as the horizontal displacement, indicating that the distractors influenced observers’ responses as much as the targets did. This is precisely what we expect in the 50L50L condition, as observers had no way of distinguishing between target and distractor dots. Because these shifts are small, and not easily seen in some observers’ plots, we have listed the conditional mean displacements of the target and distractor dots in Table 1.
We calculated the attentional weight that observers assigned to the distractor dots using Equation 22 and the conditional mean displacements in Table 1. For observer A.N.C., k=0.95 ± 0.16, for C.P.T., k=0.87 ± 0.19, and for R.F.M., k=0.99 ± 0.09. The error values are SEs. None of these estimates of k is significantly different from the anticipated value of 1 (p > .40 for all comparisons in a two-tailed test). The slanted blue lines in Figure 5 show the decision lines, T+kD=0, corresponding to these values of k.
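Equation 22 is not reproduced in this section, so the sketch below does not claim to implement it. It assumes instead a generic property of a linear decision variable T + kD with approximately Gaussian T and D: the shift in each variable's mean between “right” and “left” responses is proportional to its weight times its variance, so that k ≈ (ΔD/ΔT)(σT²/σD²). It further assumes that the variance ratio equals the ratio of randomly moving dots in the two subsets, (50 − n_signal)/50. The conditional means are the 50L50L entries of Table 1. The function names and the way the two target directions are pooled are illustrative choices.

```python
# Conditional mean displacements (deg), 50L50L condition, from Table 1.
# Order within each tuple:
# (target R, resp R), (target R, resp L), (target L, resp R), (target L, resp L)
TABLE_50L50L = {
    "A.N.C.": {"target": (2.364, 2.263, -2.277, -2.389),
               "distractor": (0.022, -0.087, 0.083, -0.047),
               "n_signal": 8},
    "C.P.T.": {"target": (1.786, 1.679, -1.658, -1.795),
               "distractor": (0.033, -0.081, 0.093, -0.038),
               "n_signal": 6},
    "R.F.M.": {"target": (0.688, 0.453, -0.417, -0.682),
               "distractor": (0.104, -0.142, 0.182, -0.089),
               "n_signal": 2},
}
DOTS_PER_SUBSET = 50

def response_shift(means):
    """Mean(resp R) - mean(resp L), averaged over the two target directions."""
    rr, rl, lr, ll = means
    return ((rr - rl) + (lr - ll)) / 2.0

def approximate_weight(entry):
    shift_target = response_shift(entry["target"])
    shift_distractor = response_shift(entry["distractor"])
    # Assumed sigma_T^2 / sigma_D^2: only noise dots contribute variance.
    variance_ratio = (DOTS_PER_SUBSET - entry["n_signal"]) / DOTS_PER_SUBSET
    return (shift_distractor / shift_target) * variance_ratio

weights = {name: approximate_weight(entry) for name, entry in TABLE_50L50L.items()}
for name, k in weights.items():
    print(f"{name}: k is approximately {k:.2f}")
```

Under these assumptions the approximate weights come out close to the reported estimates for all three observers, which suggests that the response-conditional means in Table 1 carry essentially all the information needed to estimate k.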
50L50D Condition
Figure 6 shows the results of the 50L50D condition for all three observers, and Table 1 lists the conditional mean displacements of the target and distractor dots. Again, both target and distractor displacements were correlated with observers’ responses, indicating that observers were unable to restrict their attention to the white target dots. For observer A.N.C., k=0.93 ± 0.20, for C.P.T., k=0.52 ± 0.15, and for R.F.M., k=0.84 ± 0.09. All these estimates of k are significantly greater than zero (p < .001 for all comparisons), none is significantly less than the observer’s corresponding value in the 50L50L condition (p > .10 for all comparisons), and only C.P.T.’s is significantly less than 1 (p < .01).
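The comparisons against 0 and 1 can be roughly checked from the reported estimates and SEs. The text does not say which test produced the p values, so the sketch below assumes a two-tailed z test under a normal approximation. The helper name `two_tailed_p` is hypothetical.

```python
from statistics import NormalDist

def two_tailed_p(estimate, se, null_value):
    """Two-tailed p value for (estimate - null_value) / se under a normal approximation."""
    z = (estimate - null_value) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Reported attentional weights and SEs in the 50L50D condition.
K_50L50D = {"A.N.C.": (0.93, 0.20), "C.P.T.": (0.52, 0.15), "R.F.M.": (0.84, 0.09)}

for name, (k, se) in K_50L50D.items():
    p_vs_zero = two_tailed_p(k, se, 0.0)
    p_vs_one = two_tailed_p(k, se, 1.0)
    print(f"{name}: p(k = 0) = {p_vs_zero:.2g}, p(k = 1) = {p_vs_one:.2g}")
```

Under this approximation, all three estimates differ from zero at p < .001, and only C.P.T.'s estimate differs from 1 at p < .01, in agreement with the pattern reported above.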
Figure 6. Results of Experiment 1, 50L50D condition. See caption of Figure 5 for details.
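Table 1. Results of Experiment 1
Column key: "Tgt" = mean target displacement (deg); "Dis" = mean distractor displacement (deg); "Target R/L" = direction of the target signal dots; "Resp R/L" = observer's response.

| Condition | Observer | Tgt, Target R, Resp R | Tgt, Target R, Resp L | Tgt, Target L, Resp R | Tgt, Target L, Resp L | Dis, Target R, Resp R | Dis, Target R, Resp L | Dis, Target L, Resp R | Dis, Target L, Resp L |
|---|---|---|---|---|---|---|---|---|---|
| 50L50L | A.N.C. | 2.364 | 2.263 | -2.277 | -2.389 | 0.022 | -0.087 | 0.083 | -0.047 |
| 50L50L | C.P.T. | 1.786 | 1.679 | -1.658 | -1.795 | 0.033 | -0.081 | 0.093 | -0.038 |
| 50L50L | R.F.M. | 0.688 | 0.453 | -0.417 | -0.682 | 0.104 | -0.142 | 0.182 | -0.089 |
| 50L50D | A.N.C. | 2.364 | 2.277 | -2.284 | -2.376 | 0.024 | -0.074 | 0.077 | -0.025 |
| 50L50D | C.P.T. | 1.794 | 1.640 | -1.680 | -1.799 | 0.017 | -0.094 | 0.038 | -0.011 |
| 50L50D | R.F.M. | 0.691 | 0.441 | -0.420 | -0.672 | 0.084 | -0.124 | 0.179 | -0.054 |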
This table shows the mean total rightward displacement of the target and distractor dots, conditional on the target signal dots moving left or right and the observer responding “left” or “right.” For example, the top left entry shows that for observer A.N.C. in the 50L50L condition, the average total target dot displacement was 2.364 deg to the right on trials where the target signal dots moved right and the observer responded “right.” The values in this table are the coordinates of the conditional mean displacements shown in Figures 5 and 6 as red and green dots. Note that both the target and the distractor displacements were correlated with observers’ responses: all mean displacements were further to the right when observers responded “right” than when observers responded “left.” This was true even in the 50L50D condition, where observers tried to ignore the distractor dots.
Clearly, observers’ abilities to direct attention according to contrast polarity were limited at best: two of the three observers were not influenced significantly less by opposite-polarity distractors than by same-polarity distractors, and the third observer was influenced 52% as much by opposite-polarity distractors as by same-polarity distractors. These results are consistent with previous findings that opposite-polarity distractors have a large influence on observers’ responses (Edwards & Badcock, 1994; Li & Kingdom, 2001; Snowden & Edmunds, 1999; van der Smagt & van de Grind, 1999). These results do not mean that observers misperceive distractors as targets. At a Weber contrast of ±40%, the targets and distractors are highly discriminable. Rather, these results show that observers cannot make global direction judgments based solely on the directions of white target dots, in the presence of black distractor dots.
Directional Selectivity
We compared observers’ directional selectivity for attended and unattended stimuli in the 50L50D condition, using the version of HCA presented in Appendix B. For each target and distractor dot in the cinematogram, we measured the probability of the observer responding “right” when the dot moved in direction θ, and we averaged these direction selectivity functions separately over all target dots and over all distractor dots. Figure 7 shows the influence of target dots and distractor dots on the observers’ responses, as a function of dot direction, averaged across all three observers. The directional tuning was approximately sinusoidal for both attended and unattended dots, varying as $\cos\theta$, indicating that observers based their responses on the horizontal displacements of both target and distractor dots. Evidently observers processed attended and unattended stimuli in the same way, at least in terms of their directional selectivity. Furthermore, the best-fitting sinusoids had slightly different amplitudes, reflecting the fact that unattended dots had less overall influence on the observer’s responses. These results support the notion that observers have the same selectivity for attended and unattended stimulus elements, and that selective attention operates by uniformly reducing the influence of distractor elements on observers’ responses.
Figure 7. Histogram contrast analysis of Experiment 1, 50L50D condition. The plot shows the probability of a rightward response, as a function of the direction of each target or distractor dot, averaged across observers. The solid line is the best-fitting sinusoid to the target dot data, and the dotted line is the best-fitting sinusoid to the distractor dot data. The mean, amplitude, and phase of the sinusoids were chosen to give the best sum-of-squares fit. The amplitude of the distractor sinusoid is 0.88 times the amplitude of the target sinusoid, which is approximately the same as the mean value of attentional weight measured in Experiment 1, k=0.86.
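A sum-of-squares sinusoid fit of the kind described in the caption can be sketched as follows. The sketch assumes that the response probabilities are sampled at evenly spaced directions covering the full circle, in which case the least-squares mean, amplitude, and phase reduce to simple Fourier projections. The data values are invented for illustration and are not the measurements plotted in Figure 7.

```python
import math

def fit_sinusoid(p_by_direction):
    """Least-squares fit of m + A*cos(theta - phase) to samples taken at
    evenly spaced directions theta_i = 2*pi*i/n (requires n >= 3)."""
    n = len(p_by_direction)
    thetas = [2 * math.pi * i / n for i in range(n)]
    mean = sum(p_by_direction) / n
    # Projections onto cos and sin, which are orthogonal for even spacing.
    a = 2.0 / n * sum(p * math.cos(t) for p, t in zip(p_by_direction, thetas))
    b = 2.0 / n * sum(p * math.sin(t) for p, t in zip(p_by_direction, thetas))
    return mean, math.hypot(a, b), math.atan2(b, a)

# Hypothetical response probabilities at 8 directions (0, 45, ..., 315 deg).
directions = [2 * math.pi * i / 8 for i in range(8)]
target_p = [0.5 + 0.100 * math.cos(t) for t in directions]
distractor_p = [0.5 + 0.088 * math.cos(t) for t in directions]

mean_t, amp_t, phase_t = fit_sinusoid(target_p)
mean_d, amp_d, phase_d = fit_sinusoid(distractor_p)
amplitude_ratio = amp_d / amp_t
print(f"distractor/target amplitude ratio = {amplitude_ratio:.2f}")
```

The ratio of the fitted amplitudes plays the same role as attentional weight: it expresses how strongly a distractor dot moving in a given direction influences the response, relative to a target dot moving in the same direction.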
Recently, Eckstein, Shimozaki, and Abbey (2001) and Shimozaki, Abbey, & Eckstein (2001) used the response classification method to compare processing of attended and partly unattended stimuli in a very different task, namely a detection task in which observers were given a partially valid cue as to where the target would appear, if it appeared at all. Eckstein et al. also found that observers processed cued and uncued locations similarly, and simply gave more weight to cued locations in their responses: classification images at cued and uncued locations had the same spatial profile, and differed only in amplitude. This finding strongly supports Kinchla’s (1974) and Kinchla and Collyer’s (1974) weighted sum account of cued detection tasks, and is persuasive evidence that attentional weight is an appropriate measure of attention in such tasks.
Contrast Magnitude
A technical but potentially troublesome issue is whether we have properly equated the contrast magnitudes of black and white dots in this experiment. We showed black and white dots with equal Weber contrast magnitudes (±40%), but there are other ways of measuring contrast besides Weber contrast. For instance, the Michelson contrast of the black and white dots was –25% and +17%, respectively, so according to this measure, the contrast of the black dots was 1.5 times too high, compared to the white dots. (Michelson contrast is defined as $(L_{max} - L_{min})/(L_{max} + L_{min})$, where Lmax is the maximum luminance in the region of interest and Lmin is the minimum luminance.) A mismatch like this might lead us to underestimate observers’ abilities to direct attention according to contrast polarity, as the black dots might evoke a stronger response in motion channels than the white dots, and therefore be more difficult to ignore than black dots with properly equated contrast magnitudes.
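The two contrast measures can be compared directly. The sketch below uses the background luminance and Weber contrasts given in the Methods, and takes the "region of interest" for Michelson contrast to be a dot plus the adjacent background, which is an assumption. The function names are illustrative, and the Michelson function returns an unsigned magnitude.

```python
def weber_contrast(luminance, background):
    """Signed Weber contrast, (L - L_bg) / L_bg."""
    return (luminance - background) / background

def michelson_contrast(luminance, background):
    """Unsigned Michelson contrast, (L_max - L_min) / (L_max + L_min)."""
    l_max = max(luminance, background)
    l_min = min(luminance, background)
    return (l_max - l_min) / (l_max + l_min)

BACKGROUND = 40.0                      # cd/m^2, from the Methods
white = BACKGROUND * (1.0 + 0.40)      # luminance at Weber contrast +0.40
black = BACKGROUND * (1.0 - 0.40)      # luminance at Weber contrast -0.40

for label, lum in (("white", white), ("black", black)):
    print(f"{label}: L = {lum:.0f} cd/m^2, "
          f"Weber = {weber_contrast(lum, BACKGROUND):+.2f}, "
          f"Michelson magnitude = {michelson_contrast(lum, BACKGROUND):.3f}")
```

With a 40 cd/m² background, the dot luminances are 56 and 24 cd/m², giving Michelson magnitudes of about 17% and 25%, in line with the values quoted above and with the stated ratio of 1.5.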
However, for the following reasons, we believe that we have correctly matched the contrasts of the white and black dots by setting them to ±40% Weber contrast. First, Edwards, Badcock, and Nishida (1996) found that performance in an up-down direction discrimination task only improved with stimulus contrast up to about 15% Weber contrast, suggesting that at our contrast level of ±40%, moderate differences in contrast should have little effect on performance. Second, in the psychometric function conditions of this experiment, performance was approximately the same for white and black cinematograms at ±40% contrast. These two facts are not conclusive, however, as Edwards et al. (1996) found that even though direction discrimination performance saturated at about 15% contrast when all dots in a cinematogram had the same contrast, performance worsened when the contrast of selected noise dots was increased, and continued to worsen as the contrast of the noise dots was increased up to 80% contrast. Similarly, in a previous study, we found that when observers judged small differences in the global direction of a random dot cinematogram, rather than 180º left-right or up-down direction differences, performance improved with stimulus contrast up to at least 80% contrast (Murray, Sekuler, Bennett, & Sekuler, 1998). These studies show that perceptual responses to global motion stimuli do not always saturate at low contrasts, so it is important to properly match the contrasts of white and black dots. The most persuasive evidence, therefore, is that Murray et al. (1998) measured performance in global direction discrimination tasks over a wide range of positive and negative contrasts, and found that observers performed equally well with white and black cinematograms that were equated for Weber contrast magnitude (e.g., ±40% Weber contrast). 
All these factors indicate that we have correctly equated the strength of the targets and distractors in our stimuli, so that we can use the influence of the distractors on observers’ responses as an unbiased measure of the attentional weight that observers assign to the distractors.
A Faster Correlation Method
The correlation method we used in Experiment 1 requires a large number of trials, because it measures the effect of small statistical variations in the targets and distractors on the observer’s responses. One way of measuring attentional weight more quickly would be to introduce larger trial-to-trial variations into the target and distractor displacements, and to measure the effect of these variations on the observer’s responses. Here we describe a method that takes this approach.
Figure 8 shows a plot of a hypothetical observer’s decision space for a task in which we vary both the mean target displacement and the mean distractor displacement from trial to trial. In this task, signal dots in the target distribution move left or right, and signal dots in the distractor distribution also move left or right. The directions of the target and distractor signal dots are chosen independently on each trial, so the decision space has four clusters of points corresponding to the four types of trials: target right, distractor right; target right, distractor left; target left, distractor right; and target left, distractor left. The observer’s task is to judge the mean direction of the target dots. Note that the distractor signal dots contain no information as to the correct response; we call them signal dots only because they move coherently, rather than moving in random directions. In the task depicted in Figure 8, there are twice as many target signal dots as distractor signal dots, as indicated by the fact that the mean of each of the four clusters of trials is twice as far along the target axis as along the distractor axis.
Figure 8. A hypothetical observer’s decision space in Experiment 2.
If the observer’s responses are influenced by the distractor dots in this task, he will give more rightward responses on trials where both the targets and the distractors move right than on trials where the target moves right and the distractor moves left. In the decision space, this is represented by the fact that a greater proportion of trials falls on the right side of the decision line when both the targets and the distractors move right (the top-right cluster in Figure 8) than when the targets move right and the distractors move left (the bottom-right cluster). By measuring the difference in the proportion of rightward responses, depending on whether the distractors move right or left, we can determine how much influence the distractors have on the observer’s responses, and we can estimate the attentional weight assigned to the distractors.
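This reasoning can be illustrated with a simulated observer. The sketch below assumes a hypothetical observer who responds “right” whenever T + kD > 0, with unit-variance Gaussian T and D, a target shift twice as large as the distractor shift, and an arbitrarily chosen true weight. All constants and names are illustrative.

```python
import random

rng = random.Random(7)
TRUE_K = 0.6                          # hypothetical attentional weight
TARGET_SHIFT = 0.75                   # hypothetical mean of T on target-right trials
DISTRACTOR_SHIFT = TARGET_SHIFT / 2   # half as many distractor signal dots
N_TRIALS = 5000                       # trials per trial type

def proportion_right(target_dir, distractor_dir):
    """Proportion of 'right' responses for one of the four trial types (dir = +1 or -1)."""
    count = 0
    for _ in range(N_TRIALS):
        t = rng.gauss(target_dir * TARGET_SHIFT, 1.0)
        d = rng.gauss(distractor_dir * DISTRACTOR_SHIFT, 1.0)
        if t + TRUE_K * d > 0:
            count += 1
    return count / N_TRIALS

p = {(td, dd): proportion_right(td, dd) for td in (+1, -1) for dd in (+1, -1)}
for (td, dd), value in p.items():
    print(f"target {'R' if td > 0 else 'L'}, distractor {'R' if dd > 0 else 'L'}: "
          f"P(right) = {value:.3f}")
```

For a fixed target direction, the simulated observer responds “right” more often when the distractors also move right, and the size of that difference grows with the weight assigned to the distractors.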
The Probe Method of Measuring Attentional Weight
Let us consider in more detail how to measure attentional weight this way. Again, we will assume that the observer uses a decision variable, $s = T^* + kD^*$, described by Equations 24 through 26. We will derive expressions that show how attentional weight is related to the probability that the observer responds “right,” depending on whether the target and distractor signal dots move left or right.
First, consider the statistics of T*. Let $\mu_T$ be the expected value of T* over all trials, and let $\Delta\mu_T$ be the difference in the expected value of T* between trials where the target signal dots move right and trials where they move left. There are an equal number of signal-left and signal-right trials, so the overall mean $\mu_T$ lies midway between the means over signal-left trials and signal-right trials, and we can write the mean of T* as $\mu_T \pm \Delta\mu_T/2$, where the sign depends on whether the target signal dots move right or left. Furthermore, the variance of T* is the same over trials where the target signal dots move right and trials where they move left, because the signal dot displacements are constant within the signal-left and signal-right classes of trials, and do not contribute to the variance. We will denote the variance of T* on signal-left or signal-right trials as $\sigma_T^2$. For later convenience, we define $d'_T = \Delta\mu_T/\sigma_T$, which is the sensitivity of T* to the difference between signal-left and signal-right trials.
Second, consider the statistics of D*. Just as with T*, we can write the mean of D* as $\mu_D \pm \Delta\mu_D/2$, where the sign depends on whether the distractor signal dots move left or right. Also, the variance of D* is the same regardless of whether the distractor signal dots move left or right, and we will denote this variance by $\sigma_D^2$.
Third, consider how the statistics of T* and D* are related. Both T* and D* are calculated by summing internal responses to individual dot displacements, so $\Delta\mu_T$ and $\Delta\mu_D$ are proportional to the number of target and distractor signal dots, respectively. In the task we are considering (Figure 8), there are twice as many target signal dots as distractor signal dots, so $\Delta\mu_T = 2\Delta\mu_D$. Furthermore, there are approximately the same number of target noise dots as distractor noise dots, so the variances of T* and D* are approximately equal: $\sigma_T^2 \approx \sigma_D^2$. (We will return to this approximation shortly.)
Finally, consider the statistics of the decision variable, $s = T^* + kD^*$. The mean of s is $(\mu_T \pm \Delta\mu_T/2) + k(\mu_D \pm \Delta\mu_D/2)$, where the signs depend on whether the target and distractor signal dots move left or right. In this task, $\Delta\mu_D = \Delta\mu_T/2$, so the mean of s is $\mu_T + k\mu_D \pm \Delta\mu_T/2 \pm k\Delta\mu_T/4$. The variance of s is $\sigma_T^2 + k^2\sigma_D^2$, and to a close approximation $\sigma_D^2 \approx \sigma_T^2$, so we can rewrite the variance as $(1 + k^2)\sigma_T^2$. The midpoint of the distribution of s is $\mu_T + k\mu_D$, which is therefore the response criterion of an unbiased observer.
Now we are in a position to see how attentional weight is related to the probability of a “right” response, depending on the direction of the target and distractor signal dots. On trials where both the target and the distractor signal dots move to the right, which we will call RR trials, the mean of the decision variable is $\mu_T + k\mu_D + (2 + k)\Delta\mu_T/4$ and the variance is $(1 + k^2)\sigma_T^2$. Hence on an RR trial, the probability that the decision variable exceeds the observer’s criterion, and the observer responds “right,” is
$p_{RR} = P(s > \mu_T + k\mu_D)$ (28)
$\phantom{p_{RR}} = 1 - \Phi\left(\mu_T + k\mu_D;\ \mu_T + k\mu_D + (2 + k)\Delta\mu_T/4,\ \sqrt{1 + k^2}\,\sigma_T\right)$ (29)
$\phantom{p_{RR}} = \Phi\left(\dfrac{(2 + k)\,d'_T}{4\sqrt{1 + k^2}}\right)$. (30)
Here $\Phi(x; \mu, \sigma)$ is the normal cumulative distribution function, and when we omit arguments μ and σ, they default to 0 and 1, respectively.
On trials where the target signal dots move right and the distractor signal dots move left, which we will call RL trials, the mean of the decision variable is $\mu_T + k\mu_D + (2 - k)\Delta\mu_T/4$, and the variance is the same as on RR trials. Hence the probability of the observer responding “right” on an RL trial is
$p_{RL} = \Phi\left(\dfrac{(2 - k)\,d'_T}{4\sqrt{1 + k^2}}\right)$. (31)
Similarly, the probabilities of the observer responding “right” when the target moves to the left and the distractor moves to the left (pLL) or to the right (pLR) are
$p_{LL} = \Phi\left(-\dfrac{(2 + k)\,d'_T}{4\sqrt{1 + k^2}}\right)$ (32)
$p_{LR} = \Phi\left(-\dfrac{(2 - k)\,d'_T}{4\sqrt{1 + k^2}}\right)$. (33)
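The relationship between the four response probabilities and the two unknowns can be sketched in code. The sketch assumes that, for an unbiased observer with twice as many target as distractor signal dots and equal variances of T* and D*, the probabilities take the form Φ(±(2 ± k) d'T / (4√(1 + k²))), as follows from the statistics described above. It then inverts the two target-right probabilities in closed form. The function names are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal: mean 0, SD 1

def predicted_probabilities(k, d_t):
    """P('right') on RR, RL, LL, and LR trials under the assumed model."""
    scale = d_t / (4.0 * math.sqrt(1.0 + k * k))
    return {
        "RR": PHI.cdf((2.0 + k) * scale),
        "RL": PHI.cdf((2.0 - k) * scale),
        "LL": PHI.cdf(-(2.0 + k) * scale),
        "LR": PHI.cdf(-(2.0 - k) * scale),
    }

def solve_from_right_trials(p_rr, p_rl):
    """Recover (k, d'_T) from the two target-right probabilities.
    Undefined when z_rr + z_rl == 0, i.e. when d'_T is zero."""
    z_rr = PHI.inv_cdf(p_rr)
    z_rl = PHI.inv_cdf(p_rl)
    k = 2.0 * (z_rr - z_rl) / (z_rr + z_rl)
    d_t = (z_rr + z_rl) * math.sqrt(1.0 + k * k)
    return k, d_t

probs = predicted_probabilities(k=0.6, d_t=1.5)
k_hat, d_hat = solve_from_right_trials(probs["RR"], probs["RL"])
print(probs)
print(f"recovered k = {k_hat:.3f}, d'_T = {d_hat:.3f}")
```

The closed-form inversion uses only two of the four probabilities, so with real response proportions it is sensitive to sampling noise in those two cells. Fitting all four probabilities at once uses the data more fully, at the cost of requiring a numerical search.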
We could solve Equations 30 and 31 to find k and d'T as a function of the conditional response probabilities, and solve Equations 32 and 33 to give another independent estimate. However, when analyzing data from simulated model observers with known values of attentional weight, we have found the estimates of k and d'T to be less variable when we solve all four equations simultaneously, using a simplex search to find the values of k and d'