Volume 2, Number 8, Article 3, Pages 559-570
doi:10.1167/2.8.3
http://journalofvision.org/2/8/3/
ISSN 1534-7362
Comparing integration rules in visual search
Stefano Baldassi
The Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA; Dipartimento di Psicologia, Università di Firenze, Italy
Preeti Verghese
The Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA
Abstract
Search performance for a target tilted in a known direction among vertical distractors is well explained by signal detection theory models. Typically these models use a maximum-of-outputs rule (Max rule) to predict search performance. The Max rule bases its decision on the largest response from a set of independent noisy detectors. When the target is tilted in either direction from the reference orientation and the task is to identify the sign of tilt, the loss of performance with set size is much greater than predicted by the Max rule. Here we varied the target tilt and measured psychometric functions for identifying the direction of tilt from vertical. Measurements were made at different set sizes in the presence of various levels of orientation jitter. The orientation jitter was set at multiples of the estimated internal noise, which was invariant across set sizes and measurement techniques. We then compared the data to the predictions of two models: a Summation model that integrates both signal and noise from local detectors and a Signed-Max model that first picks the maxima on both sides of vertical and then chooses the value with the highest absolute deviation from the reference. Although the function relating thresholds to set size had a slope consistent with both the Signed-Max and the Summation models, the shape of individual psychometric functions was in the most crucial conditions better predicted by the Signed-Max model, which chooses the largest tilt while keeping track of the direction of tilt.
History
Received October 30, 2001; published December 2, 2002
Citation
Baldassi, S. & Verghese, P. (2002). Comparing integration rules in visual search.
Journal of Vision, 2(8):3, 559-570,
http://journalofvision.org/2/8/3/,
doi:10.1167/2.8.3.
Keywords
visual search, spatial vision, signal detection theory, identification
Introduction
When a visual target does not pop out, increasing the
number of distractors makes the target harder to find. Classical reaction
time studies suggest that each item in the display is scanned sequentially (for
a review, see Wolfe, 2000). However, more
recent studies based on accuracy measures and signal detection theory (SDT) have
disproved the necessity for serial processing, showing that parallel processing
of the elements can explain the impairment in performance with increasing set
size (for a review, see Verghese, 2001).
Here we will refer to these as parallel models for the sake of clarity, even
though we think they are “not-necessarily-serial” explanations of
search behavior.
Parallel models share the idea that an integration rule
combines the responses to individual inputs to yield a decision variable.
However, they may differ substantially in the specific rule they implement. In
particular, by using threshold measures across various visual dimensions, such
as contrast, length, speed, and orientation discrimination, different studies
have shown that the increase in thresholds with increasing set size is well fit
by the prediction of an SDT model that chooses the strongest response among
independent detectors (Palmer, 1994; Palmer, Ames, & Lindsey, 1993; Palmer, Verghese, & Pavel, 2000; Shaw, 1980; Shaw, 1982; Shiu & Pashler, 1995; Verghese & Stone, 1995; Solomon, Lavie, & Morgan, 1997). For
this class of models, every new element contributes uncertainty. In other words,
the magnitude of the observed threshold increase is consistent with a model that
predicts only an increase in uncertainty with increasing set size. Similar
models have also been successful in explaining the additional disruption of
performance occurring in conjunction search tasks (Eckstein, 1998). All of these studies used
paradigms well suited to this integration rule, commonly referred to as the
maximum-of-outputs rule or Max rule. However, as postulated earlier by Green and Swets (1966) and by Shaw (1982), information might be integrated in
a different way, via mechanisms that pool individual responses according to what
is generally referred to as the summation rule or Sum rule (Graham, Kramer, & Yager, 1987).
Two studies using an identification task (Baldassi & Burr, 2000; Morgan, Ward, & Castet, 1998) have
suggested that the information about a given visual feature (the target) may be
diluted by adding neutral elements (distractors) through the action of a pooling
mechanism. In particular, Baldassi and Burr
(2000) observed that orientation thresholds varied with set size according
to the square root of the number of elements, and that this relationship
persisted across the whole range of noise levels and experimental conditions. In
another task where subjects needed to locate the target, the rise in thresholds
was substantially shallower, being about one half of that observed for
identification. Due to the difference in the behavior of thresholds in these two
tasks, the authors suggested that these tasks dissociated Sum from Max behavior
in visual search. The function relating thresholds to set size in the location
task had a log-log slope of about 0.25, consistent with a standard Max rule.
However, in the identification task, it is debatable whether the square-root
relation (log-log slope of 0.5) reflects summation of target and distractor
responses, or whether it is due to a variant of the Max rule suited to the
particular nature of this task. In fact, both studies (Baldassi & Burr, 2000; Morgan, Ward, & Castet, 1998) used a type
of 2-alternative task where the target was always present, but had a value that
was randomly on one side of the distractor(s), or the other. In particular, they
measured orientation discrimination with a task where the target was tilted
clockwise (CW) or counterclockwise (CCW) from vertical. Following Baldassi and Burr (2000), we will call this
an identification task because the
observer has to identify the sign of the target orientation with respect to a
mean, as opposed to the standard odd-man-out search tasks where the target
varies along a single direction.
The aim of this study is to compare the experimental
data of an identification task with models that consider both Sum- and
Max-of-outputs decision rules. Moreover, we will compare the outcome of
different models by considering their effects on the whole psychometric
function. Psychometric functions reveal information beyond a summary measure
such as threshold at a criterion percent correct (e.g., Solomon & Morgan, 2001). Basic studies
have shown that the slope of the psychometric function increases with
uncertainty or the number of detectors that the observer monitors (Burgess & Ghandeharian, 1984; Pelli, 1985; Swensson & Judy, 1981; Tanner, 1961; Tyler & Chen, 2000). Specific parameters
associated with the steepness of the function and with its horizontal position
allow quantifying, at least through relative estimates, the number of detectors
an observer monitors for any given task. Some studies have used this as a tool
to reveal the functioning of the visual system when a decision must be made
among many competitors, as in the case of detecting
motion trajectories in dynamic noise (Verghese & McKee, 2002). Therefore,
uncertainty analysis may be a useful tool to reveal mechanisms of visual
search.
Models
In this section, we will examine in detail the
theoretical background leading to quantitative predictions of the Max and the
Sum models.
The two models have at least three stages between the
input, consisting of target and distractors, and the output, which is the
observer’s response. Both models share a common first stage, which assumes
that the response to each stimulus element is an independent noisy variable.
Each response is drawn from a distribution whose mean is linearly related to the
stimulus value, and whose variance reflects the internal noise (plus added
external noise; see “Methods” section). The second stage, which
combines the noisy responses, is crucially different between the two. The Sum
model postulates that these responses are added together resulting in a single
variable whose mean and variance are the sum of the individual means and
variances (Green & Swets, 1966). The
Max model postulates that the decision depends on the variable producing the
strongest response. As the task is to identify the target’s direction of
tilt, the Max rule should take this into account. We will call the modified Max
rule that we propose the Signed-Max rule. The output of the second stage is fed
to a third and final stage, where a decision is taken.
The following paragraphs outline the principles of the
Max model and develop the Signed-Max version as a variant that we propose for
identification.
As reported in the “Introduction,” many
studies have successfully explained search performance in tasks where the target
was tilted in a predetermined direction (e.g., clockwise) with a model that
bases its decision on the largest response in the relevant detectors. For
example, in a 2-interval forced-choice experiment with vertical distractors and
a CW tilted target, the observer monitors the activity of a CW tilted detector
at each of the n locations in the 2
intervals and chooses the interval that evokes the greatest response in this
class of detectors. Therefore, a correct decision will be made if the maximum
output comes from any of the locations in the signal interval. If we assume that
the response of the tilted detector is a sample from the probability density
function, f(r) for a vertical stimulus
and f(r-kθ) for a tilted stimulus,
then the probability of a correct response is given
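by

$$P_c = \int_{-\infty}^{\infty} f(r - k\theta)\,F(r)^{2n-1}\,dr \;+\; (n-1)\int_{-\infty}^{\infty} f(r)\,F(r - k\theta)\,F(r)^{2n-2}\,dr \qquad (1)$$

where r is the response from a CW-oriented detector, θ is the orientation of the target, k is a sensitivity parameter that scales the orientation, and F(r) is the cumulative probability distribution, $F(r) = \int_{-\infty}^{r} f(x)\,dx$.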
We will refer to Equation 1 as the standard Max rule. Here we
assume that f(r) has a Gaussian
distribution. The standard Max rule assumes that
at each location there is a single detector tuned to, say, a CW orientation away
from vertical. In an identification task, we then need to consider a modified
form of this rule because the target can be tilted in either direction from
vertical. We propose that at each location
there is one detector tuned for CW and one for CCW orientation (Figure 1). These detectors have preferred
orientations centered beyond the largest tilt used in our experiment
(28° ± jitter), say 30° and -30° from vertical. This allows
the response of these detectors to increase with increasing tilt away from
vertical. These assumptions are consistent with known physiology (e.g., Blakemore & Campbell, 1969; Hubel & Wiesel, 1968): the orientation
tuning of simple cells in primate cortex is broad (full width at half height
equal to 22.5°) and the orientation dimension is sparsely sampled (about 7
unique orientations). The width of this distribution does not represent the
variability in response to a single tilt, but rather the bandwidth of these
orientation-tuned detectors.
Figure 1. Representation of the Signed-Max
model. The big horizontally oriented panel shows the two detectors tuned to
opposite orientations with respect to vertical, as a function of the angle of a
stimulus. The detectors for CCW and CW orientations are represented by dashed
red and continuous blue lines, respectively. The small oriented gratings along
the abscissa show the orientation range spanned by these detectors, with the
vertical orientation in green and a possible target orientation in dark gray.
Because we do not have vertical targets, the two detectors produce responses of
different strengths to a tilted target. The mean response strengths of the two
detectors are marked by horizontal lines that project to the vertical panel on
the right, where the response of the CCW detector to the target is represented
by a dashed red line, that of the CW detector by a continuous blue line, and the
overlapping mean response of the two detectors to the distractor(s) by a dotted
green line. The small vertical panel on the right shows the variability of these
mean responses. The response probability distributions are centered at the
output response of the respective detectors and have noise (σ) from
internal and, in our experiment, external sources. This figure shows the effect
of a CW-oriented stimulus, but the same rationale holds for a CCW target.
For both the target and the distractor locations, one
response is taken from each of these two detectors. A correct choice is made
when the largest response comes from the detector whose preferred direction of
tilt matches the target tilt. Consider the CW tilted target (dark gray) shown
along the abscissa of Figure 1. It produces a
response in both the CW and the CCW detectors (shown by the blue and red
horizontal lines, respectively). Due to noise, these responses vary around a
mean response, as shown by the blue and red distributions on the vertical panel
at the right for the CW and CCW detectors, respectively. A vertical distractor
produces equal mean responses in the two detectors, as shown by the green
distribution (there are two overlapping distributions of responses to the
distractor from CW and CCW detectors).
We assume that the decision requires a comparison
between the outputs of the two detectors. We represent that comparison as the
difference between CW and CCW responses at each location, as subtraction has
been proposed by a number of biologically plausible models of neural competition
(e.g., Desimone & Duncan, 1995). In
the small vertical panel on the right of Figure
1, the responses of the CCW (red dashed line) and CW (blue
solid line) detectors are by their nature positive. These responses are shown in
the top and middle panels of Figure 2, which
plot the response distribution of the CW and CCW detectors, respectively, to 3
different stimulus orientations. The bottom panel of Figure 2 shows the distributions of the
difference between the responses of the two detectors, what we can call a
difference detector. This subtraction
operation produces distractor responses that have a zero mean, CW responses that
have a positive mean value, and CCW responses that have a negative mean. Note
that while the responses of the two individual detectors have a standard
deviation equal to 1, the difference distribution has a standard deviation that
is larger by a factor of √2. The distributions in the bottom panel of
Figure 2 represent the output of this
difference operation.
Figure 2. Probability distributions of detector
responses as a function of orientation. The top panel represents the response
distribution of a CW detector to a CCW target (dashed red), to a vertical
distractor (dotted green), and to a CW target (continuous blue). The CW and CCW
targets have equal and opposite tilt as shown above the mean values of the
distributions. Note that the CW detector responds more strongly to a CW tilted
stimulus. The middle panel shows the response of the CCW detector to the same
stimuli. Here the detector responds more strongly to the CCW target. The bottom
panel shows the distributions of the difference between CW and CCW responses.
This distribution has zero mean response to vertical, a positive mean response
to CW, and a negative mean response to CCW. Each of these distributions has a
standard deviation that is a factor of √2 larger than
that for the individual detectors.
To evaluate which direction of tilt has the larger
response, we determined whether the largest positive or the largest negative
response has the greater absolute value. Any value r generated by the target or the
distractors would be the largest, in absolute magnitude, if all the other
samples produced responses between –r and r.
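The decision rule just described can be illustrated with a small Monte Carlo sketch. This is only an illustration under stated assumptions, not the authors' code: the function names `signed_max_trial` and `proportion_cw` are hypothetical, the difference-detector responses are assumed Gaussian, and CW is arbitrarily coded as positive.

```python
import random

def signed_max_trial(theta, n, k=1.0, sigma=1.0, rng=random):
    """Simulate one trial and return True if the simulated response is 'CW'.

    Assumptions (not from the source): theta > 0 codes a CW target, and each
    of the n locations yields one Gaussian sample of the CW-minus-CCW
    difference signal with standard deviation sigma.
    """
    # Target location has mean k*theta; the n-1 distractor locations have mean 0.
    responses = [rng.gauss(k * theta, sigma)]
    responses += [rng.gauss(0.0, sigma) for _ in range(n - 1)]
    # Signed-Max: compare the largest positive with the largest negative response.
    largest_positive = max(responses)
    largest_negative = min(responses)
    # 'CW' wins when the most extreme response overall is on the positive side.
    return largest_positive > -largest_negative

def proportion_cw(theta, n, trials=20000, seed=0, **kwargs):
    """Estimate P('CW' response) for a CW target of tilt theta at set size n."""
    rng = random.Random(seed)
    hits = sum(signed_max_trial(theta, n, rng=rng, **kwargs) for _ in range(trials))
    return hits / trials
```

Calling `proportion_cw(theta, n)` over a range of tilts and set sizes gives simulated psychometric functions that can be compared with the analytic expression introduced next.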
According to the Signed-Max model, the probability of
choosing the clockwise orientation is given
by

$$P(\mathrm{CW}) = \int_{0}^{\infty}\left\{ f(r-k\theta)\left[\int_{-r}^{r} f(x)\,dx\right]^{n-1} + (n-1)\,f(r)\left[\int_{-r}^{r} f(x-k\theta)\,dx\right]\left[\int_{-r}^{r} f(x)\,dx\right]^{n-2}\right\}dr \qquad (2)$$

where r is the response from an individual detector, θ is the orientation difference, f(r) is the probability density function for responses, k is a sensitivity parameter, and n is the uncertainty parameter. The
Signed-Max model is a variant of the standard Max model that takes absolute
values and keeps track of their sign. The equation is the sum of two terms. The
first part of the equation calculates the probability that the response to a CW
target (a sample from the distribution f(r – kθ)) is larger in magnitude than the responses to the
vertical distractors (n-1 samples from the distribution f(r)). Note that
these responses are samples from the difference distribution of CW and CCW
responses. The second part calculates the probability that one of the
n-1 distractor responses has the
largest positive value, and the responses from the target and the other
distractor elements are smaller in magnitude. The integration limits in the first
and second parts of the equation go from –r to +r. This is because any value
r generated by the target or distractor
will be the largest in absolute magnitude if all of the other responses lie
between –r and r. The integration limits of the entire
expression go from 0 to ∞ because only positive responses are correct for
a clockwise target.¹
For the Sum
model, if we assume Gaussian distributions of equal variance, then the sum of
distributions resulting from 1 CW target and n-1 vertical distractors has a mean
equal to kθ, where θ is the CW target angle, and a variance of nσ².
The probability that a sample from such a distribution has a value
greater than 0 (CW) is

$$P(\mathrm{CW}) = \int_{0}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma_s}\exp\!\left[-\frac{(r-\mu)^2}{2\sigma_s^2}\right]dr \qquad (3)$$

where $\mu = k\theta$ and $\sigma_s^2 = n\sigma^2$.
In Equations 1 and 2, k is a parameter that defines the
signal-to-noise ratio. In Equation 3, the
factor k/n modulates the
signal-to-noise ratio. Because k and n cannot be independently evaluated in
this formulation, we set k equal to the
total variance and calculated the best-fitting value for the parameter n.
For both the Signed-Max and the Sum rules, the above
equations predict the probability of responding correctly for a CW target. The
same rationale applies to CCW targets as we assume no bias in the model for CW
and CCW targets. In fact, we verified this assumption experimentally by
comparing the psychometric functions for CW responses to those for CCW
responses.
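The two predictions can be evaluated numerically. The sketch below follows the verbal description of the Signed-Max and Sum rules given above and assumes Gaussian response distributions; the function names, the trapezoid integration, and the finite upper limit standing in for infinity are illustrative choices, not details taken from the paper.

```python
from statistics import NormalDist

def p_cw_signed_max(theta, n, k=1.0, sigma=1.0, steps=4000):
    """Signed-Max prediction of P(CW) for a CW target of tilt theta.

    Assumption: f is Gaussian with standard deviation sigma (the spread of
    the CW-minus-CCW difference signal), centered at 0 for distractors and
    at k*theta for the target.
    """
    noise = NormalDist(0.0, sigma)
    target = NormalDist(k * theta, sigma)
    upper = abs(k * theta) + 8.0 * sigma   # practical stand-in for infinity
    h = upper / steps

    def integrand(r):
        d_in = noise.cdf(r) - noise.cdf(-r)     # P(-r < distractor < r)
        t_in = target.cdf(r) - target.cdf(-r)   # P(-r < target < r)
        # Term 1: the target is the largest response in magnitude and positive.
        term1 = target.pdf(r) * d_in ** (n - 1)
        # Term 2: one of the n-1 distractors is the largest and positive.
        term2 = (n - 1) * noise.pdf(r) * t_in * d_in ** (n - 2) if n > 1 else 0.0
        return term1 + term2

    # Composite trapezoid rule on [0, upper].
    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h

def p_cw_sum(theta, n, k=1.0, sigma=1.0):
    """Sum prediction: P(sample from N(k*theta, n*sigma^2) exceeds 0)."""
    return NormalDist().cdf(k * theta / (sigma * n ** 0.5))
```

With n = 1 both functions reduce to the same cumulative Gaussian, which is a convenient sanity check before comparing them at larger set sizes.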
The two models have different signatures on the overall
shape of the psychometric functions. For the Signed-Max model, increasing the
set size changes the slope of the function: the greater the uncertainty, the
steeper the function. For the Sum model, increasing the set size shifts the
whole curve across the abscissa. In the context of visual search, psychometric
function analysis offers a clear-cut way to test the predictions of both the Max
and the Sum integration rules. If the decision is based on the signal producing
the highest output among the pool, then one can speculate that as the set size
increases, the number of monitored units increases, resulting in a progressive
steepening of the psychometric functions (Figure 3, blue curves). If instead target and distractor signals are summed
together, then the distractors decrease the signal-to-noise ratio, with the net
effect of a rightward shift of the psychometric function with increasing set
size (Figure 3, red curves). Figure 3 is a schematic of the model predictions
for set sizes 2 to 16.
Figure 3. Schematic representation of the
signature effects of the Signed-Max model (left panel) and the Sum model (right
panel) on the psychometric functions generated at different set sizes (set size
grows from left to right). The Signed-Max model predicts steeper functions with
increasing set size, while the Sum model predicts a rightward shift of the whole
function. The two sets of predictions diverge progressively with increasing set
size. To compare the two models, each panel shows one model’s predictions
superimposed on the other (dashed lines). Note that these psychometric functions
are plotted in log-linear coordinates.
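The rightward shift predicted by the Sum model can be made explicit with a short worked derivation. It is a sketch that assumes the Gaussian, equal-variance case described above (summed mean kθ, summed variance nσ²); Φ denotes the standard normal cumulative distribution and z_c an arbitrary criterion level, both introduced here for illustration.

```latex
% Sum model with Gaussian responses: summed mean k\theta, summed variance n\sigma^2
P_{\mathrm{Sum}}(\mathrm{CW}\mid\theta,n)
  = \Phi\!\left(\frac{k\theta}{\sigma\sqrt{n}}\right)
  = \Phi\!\left(\frac{k}{\sigma}\,\exp\!\left[\ln\theta-\tfrac12\ln n\right]\right)

% Threshold at a fixed criterion z_c = \Phi^{-1}(p_c):
\theta_c(n) = \frac{z_c\,\sigma}{k}\,\sqrt{n}
\quad\Longrightarrow\quad
\log\theta_c(n) = \log\theta_c(1) + \tfrac12\log n
```

Under these assumptions performance depends on θ and n only through θ/√n, so on a logarithmic angle axis each curve is a copy of the set size 1 curve displaced by ½ log n with its shape unchanged, and thresholds grow with a log-log slope of 0.5.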
To determine which one of the two models provides a
better account of the data, we used Equations
2 and 3 to fit the curves relating the
percentage of correct responses to the angle of the target in our task. If we
are able to verify that the major source of noise limiting this task is local
(at the level of the individual elements) and that central noise (the noise
source at the level of the integration) is negligible, then we will be able to
work out predictions for both the Sum and the Max models on a common and
comparable ground. Furthermore, we have the advantage of using empirical
estimates of internal noise. Both models have two parameters. The sensitivity
parameter k represents a scalar that
relates orientation to detector response. The parameter
n represents uncertainty, or the number
of detectors (locations) monitored. Uncertainty will affect the Sum model by
increasing the variance of the summed distribution, and will affect the
Signed-Max model by increasing the number of units to be monitored. We used an iterative
procedure to find the best-fitting values of k and n for a given set of data. In the
Signed-Max model, we left both k and n free to vary, whereas we effectively
had only one free parameter for the Sum model. Because both k and n modulate the signal-to-noise ratio in
the Sum model, we set k equal to the
inverse of the internal noise plus the external noise, and allowed n to vary. It is of interest to note
that the values of k returned by the
Signed-Max fit, where it varied freely, closely matched the fixed value of
k in the Sum model.
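As an illustration of the one-parameter fit for the Sum model, the sketch below fixes k from the noise estimates and searches for the best-fitting n. The source does not specify the optimizer or the objective function, so the grid search and the binomial log-likelihood used here are stand-ins; the reading of "inverse of the internal noise plus the external noise" as one over the square root of the summed variances is also an assumption, as are all function names.

```python
import math
from statistics import NormalDist

def p_cw_sum(theta, n, k):
    """Sum-model probability of a correct response for target tilt theta."""
    return NormalDist().cdf(k * theta / math.sqrt(n))

def neg_log_likelihood(n, k, angles, n_correct, n_trials):
    """Binomial negative log-likelihood of the data under the Sum model."""
    total = 0.0
    for theta, c, t in zip(angles, n_correct, n_trials):
        # Clip predictions away from 0 and 1 so the logarithms stay finite.
        p = min(max(p_cw_sum(theta, n, k), 1e-9), 1.0 - 1e-9)
        total -= c * math.log(p) + (t - c) * math.log(1.0 - p)
    return total

def fit_n_sum(angles, n_correct, n_trials, sigma_int, sigma_ext=0.0):
    """Return the n on a geometric grid (0.25 to 256) that best fits the data.

    k is held fixed at 1 / sqrt(sigma_int^2 + sigma_ext^2) (assumed reading).
    """
    k = 1.0 / math.sqrt(sigma_int ** 2 + sigma_ext ** 2)
    candidates = [2.0 ** (i / 32.0) for i in range(-64, 257)]
    return min(candidates,
               key=lambda n: neg_log_likelihood(n, k, angles, n_correct, n_trials))
```

A quick check is to generate expected counts from the model at a known n and confirm that `fit_n_sum` returns that value; the same pattern extends to the Signed-Max model by adding k as a second search dimension.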
Methods
Stimuli and procedure were designed to match the Baldassi and Burr (2000) task. The stimuli
were generated in MATLAB using the Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997), and displayed on a Sony
Multiscan 210GS monitor at a refresh rate of 75 Hz. Two observers performed the
experiment, one author (S.B.) and a naïve paid subject; both of them had
normal or corrected vision.
Each individual stimulus was a Gabor patch of space
constant equal to 0.5° and spatial frequency of 2 cpd, displayed at 50%
contrast. The patches were 5° eccentric from fixation and their positions
were distributed to maximize the angular separation at different set sizes (Figure 4). Set size was varied from 1 to 16 and
exposure time was 8 frames (106.7 ms). The viewing distance was 57 cm and the
mean luminance 19 cd/m². Pilot data indicated that crowding effects
were under control. Thresholds obtained with two elements were similar whether
the elements were on opposite sides of fixation or were separated by 22.5°
(the angular separation for 16 elements).
Figure 4. Example of the stimuli used. Set
size increases from left to right, going from 2 to 16. In addition, the left
panel displays a no-noise condition with a tilted target and vertical
distractors. The two central panels show intermediate levels of noise. The right
panel shows a very high level of noise, which in our experiment peaked at 4
times the threshold for 1 element.
Each trial began with the observer fixating a small
0.05° fixation square that was always on, followed by the stimulus
presentation. Subjects were asked to report the direction of tilt of the single
target that appeared at a random location around the notional circle. In the set
size 1 condition, the target was randomly displayed in 1 out of 4 predetermined
locations. This avoids foveation of a fixed target location but does not add
significant uncertainty as the target was supra-threshold. Acoustic feedback was
provided and a response triggered the next trial. Sessions were blocked by set
size and jitter level. In some conditions, we added noise (orientation jitter)
to the stimuli. This dimensional noise
(e.g., Verghese & Stone, 1995) is
characterized by variability within the dimension of interest (orientation),
rather than the standard pixel contrast-modulated noise used in similar studies.
It has the advantage of directly affecting orientation detectors responsible for
the task, and not requiring any assumption about the way contrast is related to
orientation. Different amounts of jitter were used in separate sessions. We
decided not to use the QUEST procedure (Watson
& Pelli, 1983) as in the Baldassi
and Burr (2000) study, but rather a fixed set of angles that spanned the
whole range of performance from chance to perfect, interleaved within a session.
While adaptive procedures concentrate trials around the inferred threshold
value, we wanted the same number of trials over the entire psychometric
function. Orientation jitter was introduced by setting the standard deviation of
the orientation distribution of both target and distractors to a multiple of the
internal orientation noise estimate (threshold for 1 element). The mean of the
distractor distribution was 0 (vertical), and the mean of the target
distribution was ± target angle, with 50% probability of being CW or CCW.
Target and distractors were independently drawn from these noisy orientation
distributions. In the no-noise condition, the added noise was 0, so target and
distractors were displayed at the mean of their
distributions.
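The sampling scheme for one trial can be sketched as follows. This is an illustrative reconstruction of the description above, not the original MATLAB code: the function name, the use of Gaussian jitter via `random.gauss`, and the coding of CW as positive degrees from vertical are assumptions.

```python
import random

def draw_orientations(set_size, target_angle, jitter_multiple, sigma_int, rng=random):
    """Return (orientations, target_index, sign) for one trial.

    Orientations are in degrees from vertical. The jitter standard deviation
    is a multiple of the internal noise estimate, as described in the text.
    """
    jitter_sd = jitter_multiple * sigma_int
    sign = rng.choice([+1, -1])               # CW or CCW with 50% probability
    target_index = rng.randrange(set_size)    # random target location
    orientations = []
    for i in range(set_size):
        # Distractors are centered on vertical, the target on +/- target_angle.
        mean = sign * target_angle if i == target_index else 0.0
        if jitter_sd > 0:
            orientations.append(rng.gauss(mean, jitter_sd))
        else:
            orientations.append(mean)         # no-noise condition: show the mean
    return orientations, target_index, sign
```

For example, `draw_orientations(8, 4.0, 2.0, 1.35)` would draw eight orientations with jitter equal to twice an internal noise estimate of 1.35.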
All these conditions produced a set of 20 psychometric
functions (4 set sizes × 5 noise
levels) for each subject for the main experiment. Set sizes included 2, 4, 8,
and 16 elements, and noise levels included no noise, 0.5, 1, 2, and 4 times the
internal noise level estimated in the absence of external noise (see below). A
block of trials had a fixed set size and noise level with 6 interleaved values
of target orientation (in steps of 1 or 0.5 octaves), each presented 20 times.
Each block was repeated 4 to 6 times to yield 80 to 120 trials per orientation.
The equivalent internal orientation noise was estimated in two ways. For both
observers, we obtained an additional psychometric function for set size 1 with
10 target orientations (100 trials each). The best-fitting
k value was used to determine the
equivalent internal orientation noise, a fixed parameter in the model fits to
each observer’s data. As an alternative measure to estimate the equivalent
internal orientation noise for that task, which is a key parameter of the study,
observer S.B. measured additional orientation thresholds at set size 1 at
different orientation noise levels.
As we stated previously, the two models have a common
first stage that produces independent and noisy responses. To provide an
empirical estimate of this noise, we measured psychometric functions for a set
size of 1 with no external noise for both observers. We then calculated the
standard deviation of the Gaussian distribution underlying these functions,
which was 1.35 for S.B. and 1.38 for V.A. We also measured psychometric
functions with various amounts of added orientation noise for S.B. for the set
size 1 condition. We then used the thresholds obtained with added noise (82%
criterion) to estimate the equivalent internal orientation noise by performing
an analysis similar to Pelli’s (1985)
equivalent noise measurement for contrast. Orientation thresholds are plotted as
a function of external orientation noise in Figure 5, where the points represent the data
and the line represents the fit through the data calculated using the following
equation:  | (4) |
where σ_int is the internal variability and σ_ext is
the orientation jitter that we added. The numerator is the d′ value for
82% correct. The text box in Figure 5 reports the actual estimate of σ_int,
or equivalent internal orientation noise, for one observer. This value is
almost identical to the estimated standard deviation of the psychometric
function for set size 1 with no noise. More importantly, it was similar to the
equivalent noise values estimated by reanalyzing thresholds measured across set
sizes ranging from 1 to 16 using pixel contrast-modulated noise (Baldassi & Burr, 2000). The latter set
of data showed that the internal noise does not change with set size, suggesting
that the dominating source of internal noise arises locally from each element,
rather than globally, following the integration process (as in Morgan et al., 1998). This provides empirical
support for the common first stage of the two models stated in the Models section. The value of
k estimated from internal noise was
consistent with the estimate of k from
the Signed-Max model fits to the data. Therefore, we have converging evidence
from various sources to justify the use of this noise estimate in the fits of
the Sum model. It is reasonable to use the standard deviation at set size 1
with no noise as a fixed parameter because this value seems to be independent of
both the set size and the added external noise.
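An equivalent-noise estimate of this kind can be sketched in a few lines. The exact form of Equation 4 is not reproduced here, so the code assumes the standard relation in which squared threshold grows linearly with squared external noise, threshold² = c²(σ_int² + σ_ext²), and fits both the scale c and σ_int by ordinary least squares. That is a two-parameter variant, whereas the fit described above has a single free parameter; the function name and the synthetic values in the usage note are hypothetical.

```python
def fit_equivalent_noise(sigma_ext, thresholds):
    """Estimate (sigma_int, c) from thresholds measured at several jitter levels.

    Assumed model: threshold^2 = c^2 * (sigma_int^2 + sigma_ext^2), so a
    straight line through (sigma_ext^2, threshold^2) has slope c^2 and
    intercept c^2 * sigma_int^2.
    """
    xs = [s * s for s in sigma_ext]
    ys = [t * t for t in thresholds]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Guard against a slightly negative intercept caused by measurement noise.
    sigma_int = max(intercept / slope, 0.0) ** 0.5
    return sigma_int, slope ** 0.5
```

With synthetic thresholds generated from the assumed relation (for instance σ_int = 1.35, jitter at 0, 0.5, 1, 2, and 4 times that value, and an arbitrary scale c), the function recovers the generating σ_int, which is a useful check before applying it to measured thresholds.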
Figure 5. Thresholds versus orientation
jitter for the set size 1 condition for observer S.B. The circles are the
experimental data, whereas the smooth line represents the fit obtained with Equation 4. The text label reports the value
of the free parameter that we assume estimates the internal equivalent noise of
the system.
Psychometric Function Analysis
Panels A and B in Figure
6 report the data for the two observers, S.B. and V.A., and the fit of the
two models, Signed-Max (Equation 2) and Sum
(Equation 3). The variance of each
observer’s internal representation was set to be equal to the estimated
variance for the set size 1 condition measured with no noise; under added noise
conditions, we summed the internal and external noise variances. In both
panels, each row of subplots shows a different set size, from 2 to
16, while the columns mark the orientation noise levels used in the experiment.
The column labels represent the orientation noise, which was a multiple of the
threshold for one element (1.35 for S.B., 1.38 for V.A.). In the added noise
condition, the orientation jitter represented the standard deviation of the
Gaussian distribution from which the actual orientation of each element, target
and distractors, was drawn. The black symbols within each subplot represent the
experimental data, whereas the smooth curves are the model fits for the
Signed-Max and the Sum models, in solid blue and dashed red, respectively. The
subplots with a slightly darker background indicate the conditions where the Sum
model had a statistical advantage over the Signed-Max model. Statistical
comparisons between the two models were performed by comparing the weighted
χ² values for each model fit for each separate condition
(subplot in Figure 6). As previously stated
in more detail, we expect the two models to differ from each other in the way
they fit the whole psychometric function across the conditions we explored. In
particular, a different signature should be represented by diverging slopes of
the fits as the set size increases. Indeed, this is what we observed for both
our subjects. When the set size is small (2 and 4), the two models yield similar
fits, with virtually overlapping functions. As the set
size increases up to 16, however, the psychometric functions become
steeper, consistent with the Signed-Max model, whereas the Sum model
predicts shifts along the abscissa with no change in slope. This is true for both
observers, although it is more evident in V.A.’s data.
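The model comparison for a single condition can be sketched as below. The source does not state how the χ² values were weighted, so this example assumes each squared residual is weighted by the inverse binomial variance of the predicted proportion; the function names are hypothetical, and the sign convention (Sum minus Signed-Max, positive values favoring Signed-Max) follows the caption of Table 1.

```python
def weighted_chi2(p_observed, p_predicted, n_trials):
    """Weighted chi-square between observed and predicted proportions correct.

    Assumed weighting: each squared residual is divided by the binomial
    variance of the predicted proportion, p * (1 - p) / N.
    """
    total = 0.0
    for p_obs, p_pred, n in zip(p_observed, p_predicted, n_trials):
        # Keep predictions away from 0 and 1 so the variance stays positive.
        p_pred = min(max(p_pred, 1e-6), 1.0 - 1e-6)
        total += n * (p_obs - p_pred) ** 2 / (p_pred * (1.0 - p_pred))
    return total

def model_advantage(p_observed, p_sum, p_signed_max, n_trials):
    """Return chi2(Sum) - chi2(Signed-Max); positive values favor Signed-Max."""
    return (weighted_chi2(p_observed, p_sum, n_trials)
            - weighted_chi2(p_observed, p_signed_max, n_trials))
```

Here `p_sum` and `p_signed_max` would be the fitted model predictions at the tested target angles for one subplot.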
Figure 6.
Individual psychometric functions plotting percent correct (on linear axes)
versus angle (on log axes) for observers S.B. (A) and V.A. (B). Noise level
increases along the rows from left to right, and set size increases along the
columns from top to bottom. Each plot shows the model fits for the Sum model
(red dashed curves) and for the Signed-Max model (blue continuous curves). Plots
drawn on a gray background indicate conditions where the statistical comparison
was in favor of the Sum model. For all the others, the Signed-Max model
dominates.
Interestingly, the statistical comparison between the
models shows a very slight advantage for the Sum model at low set sizes, where
the two models barely differ from each other. In these cases, the
advantage is numerically small and effectively negligible. When the two
fits do differ, especially at high set sizes, the advantage of the
Signed-Max over the Sum model becomes much larger, and even visual inspection
shows that it clearly fits the data better.
This is evident in Table 1, which shows weighted χ² values for the Signed-Max and the
Sum models for the two observers.
Table 1. Weighted χ² values obtained
by fitting Equations 2 and 3 (Signed-Max and Sum models, respectively) to
the psychometric functions of the two observers, S.B. and V.A. The
difference column reports the difference between the Sum and the Signed-Max
models; negative values indicate an advantage for the Sum model (italic) and positive
values an advantage for the Signed-Max model (bold).
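As a rough illustration of how such a comparison can be computed, the sketch below evaluates a weighted χ² for one condition. The weighting used here (squared residuals divided by the binomial variance of the predicted proportion) is an assumption, since the exact weighting scheme is not spelled out in this section; the function name `weighted_chi2` is hypothetical.

```python
def weighted_chi2(observed, predicted, n_trials, eps=1e-6):
    """Weighted chi-square between observed and predicted proportions correct.

    Assumption: each squared residual is weighted by the inverse binomial
    variance p(1 - p) / N of the model prediction at that tilt.
    """
    total = 0.0
    for p_obs, p_pred, n in zip(observed, predicted, n_trials):
        # Clamp predictions away from 0 and 1 to avoid division by zero.
        p = min(max(p_pred, eps), 1.0 - eps)
        variance = p * (1.0 - p) / n
        total += (p_obs - p) ** 2 / variance
    return total


def chi2_difference(observed, pred_sum, pred_signed_max, n_trials):
    """Sum minus Signed-Max; positive values favor the Signed-Max model."""
    return (weighted_chi2(observed, pred_sum, n_trials)
            - weighted_chi2(observed, pred_signed_max, n_trials))
```

The sign convention of `chi2_difference` follows the difference column described in the table caption.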
The estimate of n, averaged across noise conditions,
corresponds to the actual set size for observer S.B. for both the Signed-Max and
the Sum models. Observer V.A.’s average estimate of n reflects the number of elements in
the display at low set sizes, but exceeds that number for the two larger set
sizes used. Although the uncertainty increases with set size, this increase is not
proportional to the displayed set size, so it cannot simply be explained by an
intrinsic uncertainty factor. This naïve observer exhibits non-optimal
behavior only at larger set sizes.
The analysis of the entire psychometric function in the
previous section shows an advantage for the Signed-Max model. How do the
predictions of the model compare to the function relating thresholds to set
size? Figure 7 plots the thresholds (just
noticeable differences) as a function of the set size on log-log coordinates
(dark blue circles) along with the predictions of the Sum and the Signed-Max
models (red dotted and blue continuous line, respectively) and of the standard
Max rule for tasks with target varying in a single direction (green dashed line)
(e.g., Palmer, 1994). If we compare the
Sum with the standard Max, it is clear that the former does a better job, even
though some points appear to lie in between. However, the modified Max proposed
in this study (i.e., the Signed-Max) is closer to the data than the other two
models, especially at higher set sizes, where they generate clearly different
predictions.
Even though the log-log slope of the threshold versus
set size function is a standard measure for comparing models of visual search,
plotting thresholds at a single criterion limits the scope of the possible
conclusions. Indeed, the predictions of the two models for the psychometric
functions sketched in Figure 3 show different
behaviors at different criteria for thresholds. As the Sum model predicts a
parallel shift of the whole function with increasing set size, the slope of the
threshold versus set size functions should be independent of the criterion for
threshold. However, the Signed-Max model predicts far apart psychometric
functions at low criteria and converging functions at higher criteria, implying
a change in the slope of the threshold versus set size function with changing
criterion. Figure 8 plots the set size
dependency, that is the log-log slope of the threshold versus set size function,
at five different criteria for the two observers and compares them to the
predictions of two models. Whereas the prediction for the Sum model is flat at a
slope of 0.5, the prediction for Signed-Max ranges from about 0.65 to 0.2 with
increasing criterion for threshold. The subjects’ data show a trend very
similar to the predictions of the latter, with slopes ranging from 0.9 to 0.2 in
the extreme cases. The systematic shift toward steeper slopes probably reflects
the higher uncertainty associated with the two larger set sizes for observer
V.A.
Figure 7. Just noticeable difference (JND)
versus set size functions for the two observers in the noise 4 condition and
predictions of the models. The actual data are plotted as dark blue circles. The
red dotted line shows the predicted thresholds over the set size range 2 to 16
for the Sum model, the blue continuous line for the Signed-Max model, and the green dashed
line for the standard Max model. In all cases, the criterion was 75% correct to
match the criterion used by Baldassi and Burr (2000). This plot confirms that the
model predictions diverge progressively with increasing set size, and that the
Signed-Max model fits the data better at larger set sizes.
We used psychometric function analysis to discriminate
between different strategies of visual search in an identification task.
Previous work showed the Sum and the Max rules to be the appropriate
combination rules for multidimensional stimuli, one better than the other under
different conditions (Graham et al., 1987). Indeed, the different integration
rules we considered have proven to
be optimal (or close to optimal) in different visual tasks with compound
displays. The Sum model is the optimal rule, producing the best predicted
performance, for the so-called summation tasks where all the elements are
equally informative (Graham et al., 1987; Green & Swets, 1966; Verghese & Stone,
1995). It also seems
to be implemented in crowded displays (Parkes, Lund, Angelucci, Solomon, &
Morgan, 2001) and in averaging local signals
for the extraction of orientation-defined textures (Dakin & Watt, 1997). But the Max rule
works more efficiently when the task is to detect a single target among
distractors and when there is little or no interaction between local elements.
Our results (Figure 7) showed a strong set size effect on thresholds at all noise levels
(greater than predicted by the standard Max rule), similar to the data of Baldassi and Burr (2000). This allowed us
to compare the Baldassi and Burr (2000)
model with a version of the Max model modified for this type of task and
procedure, that is, the Signed-Max model. Fitting individual psychometric
functions while allowing the uncertainty (n) and the gain (k) parameters to vary
allowed us to
quantitatively compare the outcomes of the two models (the gain parameter
k was free only in the Signed-Max model).
There has been some debate on possible integration
rules that explain the observer’s strategy in feature search tasks. The
difference between the predictions of the Sum and the Max model has been
characterized by the slopes of the functions (on log-log coordinates) relating
thresholds to the number of elements, with shallow slopes for the Max rule and
steeper slopes for the Sum rule (Baldassi & Burr, 2000; Palmer, 1994; Palmer et
al., 1993). Distinguishing between
these two explanations of the set size effect on the basis of thresholds can be
done only under the assumption that the psychometric functions do not change
slope across set sizes. In other words, the slope of the set size function can
represent a signature only when the shape of the psychometric function is taken
into account. The present data show drastic increases of psychometric function
slopes with set size. This implies that the set size function from thresholds
drawn at different criteria would span the whole range of possible predictions.
Indeed, the plot of Figure 8 shows
significant differences between log-log slopes at different criteria, ranging
from about 0.9 to about 0.3. Moreover, as predicted by the Signed-Max model, a
single value of slope across the whole set size function does not fit the
observed change of slope with set size. In fact, thresholds exhibit a
progressive flattening with increasing set size, particularly between set size 8
and 16. Therefore, we argue that a comparison between
different search strategies should be based on the psychometric
functions rather than on the variation of thresholds across set sizes. When
the whole psychometric function is considered, the two models make similar
predictions at low set sizes; at high set sizes, the extrinsic uncertainty
introduced by the set size clearly shows its effects. When that occurs, the two
models deviate from one another and the Signed-Max model shows a highly
significant statistical advantage over the Sum model. A similar approach has
been successful in showing the advantage of the Max rule for a location task in
a similar display (Solomon & Morgan, 2001).
Figure 8. Threshold versus set size slopes
as a function of different criteria for threshold. The red and blue lines are
the predictions of slopes from the Sum and the Signed-Max model, respectively.
The blue squares and the green circles are the slopes shown by S.B. and V.A.,
respectively, averaged across external noise levels. The thresholds for the data
are estimated from Weibull fits to the psychometric functions. The data’s
trend is similar to the Signed-Max model’s predictions, and very different
from the Sum model’s predictions.
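The dependence of the set size slope on the threshold criterion can be made concrete with a short sketch. It assumes a two-alternative Weibull psychometric function, P(x) = 1 − 0.5·exp(−(x/α)^β), and uses hypothetical α and β values rather than the fitted parameters of either observer; the function names are illustrative only.

```python
import math


def weibull_threshold(alpha, beta, criterion):
    """Tilt at which P(x) = 1 - 0.5 * exp(-(x / alpha) ** beta) equals `criterion`.

    `criterion` is a proportion correct strictly between 0.5 and 1.
    """
    return alpha * (-math.log(2.0 * (1.0 - criterion))) ** (1.0 / beta)


def loglog_slope(set_sizes, thresholds):
    """Least-squares slope of log10(threshold) against log10(set size)."""
    xs = [math.log10(n) for n in set_sizes]
    ys = [math.log10(t) for t in thresholds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def slope_at_criterion(set_sizes, alphas, betas, criterion):
    """Set size slope when thresholds are read off at a given criterion."""
    thresholds = [weibull_threshold(a, b, criterion) for a, b in zip(alphas, betas)]
    return loglog_slope(set_sizes, thresholds)


if __name__ == "__main__":
    sizes = [2, 4, 8, 16]
    alphas = [2.0 * math.sqrt(n) for n in sizes]  # hypothetical scale parameters
    for c in (0.6, 0.75, 0.9):
        parallel = slope_at_criterion(sizes, alphas, [2.0] * 4, c)
        steepening = slope_at_criterion(sizes, alphas, [1.5, 2.0, 2.5, 3.0], c)
        print(f"criterion={c:.2f}  parallel={parallel:.2f}  steepening={steepening:.2f}")
```

When β is the same at every set size (a parallel shift, as the Sum model predicts), the slope does not depend on the criterion. When β grows with set size, the slope falls as the criterion rises, which is the pattern attributed to the Signed-Max model.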
The success of the Signed-Max model in the present form
suggests that the outputs of independent detectors (at least two for each
stimulus location) and a comparison of the respective outputs constitute the
decision variable. Moreover, we propose a plausible basis for the mechanisms
that regulate the local decision related to any single stimulus. In fact, even
though the identification task has been used extensively in the orientation
(e.g., Morgan & Baldassi, 1997) and in
other domains (Solomon et al., 1997), we
think previous accounts either neglected or underestimated its singular nature.
For example, in the Monte Carlo simulation of Morgan et al. (1998), the authors used the
absolute values generated from n = set size
independent random variables to generate predictions of the Max model. Even
though it is computationally equivalent to our account, an absolute Max is not
satisfying conceptually. In our account, each direction of tilt away from
vertical activates two mirrored detectors whose activity is monitored by the
observer and labeled. In effect, a standard Max rule is applied on either side
and then the two maxima are compared, similar to the Max rule applied to 2IFC
tasks, where a max is assumed to be extracted in each interval, and the two
maxima are compared. We think that this labeling takes place in tasks such as
location search and m-alternative forced choice (e.g., 10AFC, Solomon &
Morgan, 2001). Here the outputs are monitored in the location rather than
the orientation domain, and the location producing the highest response is
eventually chosen. The equivalent of Morgan
and colleagues’ (1998) absolute Max in the positional context would be
to collapse all the positional information onto a single abstract space and take
the Max. If such a strategy were used in an mAFC task, then the Max response would
have to be remapped back onto the original space. We think this is neither
biologically plausible nor economical, and assume that such tasks are instead
accomplished by labeled detectors. The model sketched in Figure 1 is biologically plausible and can be
extended to other tasks, once the nature of the task and the behavior of
front-end filters are taken into account.
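The computational equivalence between the labeled account and an absolute Max can be checked with a minimal sketch. It assumes a simplified representation in which each location delivers one signed noisy value, and the two mirrored detectors at that location are modeled as its half-wave rectified positive and negative parts; this rectification is an illustrative assumption, not the detector model of Figure 1, and both function names are hypothetical.

```python
import random


def labeled_max_decision(responses):
    """Take a Max on each side of vertical, then compare the two maxima.

    Assumption: the clockwise-labeled detector at a location outputs max(r, 0)
    and the counter-clockwise-labeled detector outputs max(-r, 0).
    """
    cw_max = max(max(r, 0.0) for r in responses)
    ccw_max = max(max(-r, 0.0) for r in responses)
    return "cw" if cw_max > ccw_max else "ccw"


def absolute_max_decision(responses):
    """Pick the value with the largest absolute deviation and report its sign."""
    winner = max(responses, key=abs)
    return "cw" if winner > 0 else "ccw"


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(10000)]
    agree = sum(labeled_max_decision(t) == absolute_max_decision(t) for t in trials)
    print(f"rules agree on {agree} of {len(trials)} simulated trials")
```

The two rules return the same decision whenever there are no exact ties, so the difference between them is conceptual (where the direction label is kept) rather than numerical.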
Moreover, as our model takes into account the actual
behavior of physiologically plausible orientation detectors, it calls into play
an extra stage between the detectors’ response and the decision variable, as
sketched in Figure 1, that is not considered
in different accounts of the same task (e.g., Carrasco, Penpeci-Talgar, & Eckstein,
2000). We started from orientation detectors whose response as a function of
orientation is non-monotonic (as opposed to the monotonic contrast response
function). By considering the response of far-apart detectors (±30°),
we obtain responses that grow monotonically with tilt away from vertical over
our stimulus range (±28°). Therefore, rather than modeling the
response of matched filters with preferred orientation peaking at the physical
orientation on any given trial and location, we compute the probability that the
detector that matches the direction of the target stimulus tilt produces the
greater response. This choice of signed detectors makes sense given the fact
that 6 different orientations were equally possible on any trial: the respective
responses of two broadly tuned detectors differ by increasing amounts for larger
deviations from vertical.
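The monotonicity argument can be illustrated numerically. The sketch below assumes Gaussian orientation tuning with a hypothetical bandwidth of 20°; the tuning shape and bandwidth are illustrative assumptions and are not taken from the model in Figure 1. Only the preferred orientations (±30°) and the stimulus range (±28°) come from the text.

```python
import math


def detector_response(theta, preferred, bandwidth=20.0):
    """Response of one detector to orientation `theta` (degrees from vertical).

    Assumption: Gaussian tuning centered on `preferred`, with a hypothetical
    standard deviation `bandwidth` in degrees.
    """
    return math.exp(-0.5 * ((theta - preferred) / bandwidth) ** 2)


def opponent_signal(theta):
    """Difference between the responses of the +30 deg and -30 deg detectors."""
    return detector_response(theta, 30.0) - detector_response(theta, -30.0)


if __name__ == "__main__":
    for theta in range(-28, 29, 7):
        print(f"tilt={theta:+3d} deg  difference={opponent_signal(theta):+.3f}")
```

Each detector alone is non-monotonic in orientation, but within the stimulus range the response difference between the two grows steadily with tilt and changes sign at vertical, so comparing the pair is enough to recover the direction of tilt.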
Although both the Sum and Max rules are plausible
integration rules for visual search, we have shown that a version of the Max
rule better describes search performance in an identification task. A detailed
comparison of these rules was achieved by fitting the model to entire
psychometric functions; analysis of thresholds versus set size functions alone
would have obscured such differences.
We thank Miguel Eckstein for an extensive discussion of
the manuscript. This work was made possible by National Eye Institute Grant
R01EY12038 to P.V. Commercial Relationships: None.
1. The Signed-Max model defined above produces
the same fit to the data as Equation A3 in Carrasco et al. (2000). We chose a different
exposition than Carrasco et al. to reflect biological plausibility, that is,
cortical detectors do not usually produce modulations below baseline.
Baldassi, S., & Burr,
D. C. (2000). Feature-based integration of orientation signals in visual search.
Vision Research, 40, 1293-1300. [PubMed]
Blakemore, C., &
Campbell, F. W. (1969). On the existence of neurones in the visual system
selectively sensitive to the orientation and size of retinal images.
Journal of Physiology (London), 225,
437-455.
Brainard, D. H. (1997). The
Psychophysics Toolbox. Spatial Vision,
10, 433-436. [PubMed]
Burgess, A. E., &
Ghandeharian, H. (1984). Visual signal detection. II. Signal-location
identification. Journal of the Optical Society
of America A, 1, 906-910. [PubMed]
Carrasco, M.,
Penpeci-Talgar, C., & Eckstein, M. (2000). Spatial covert attention
increases contrast sensitivity across the CSF: Support for signal enhancement.
Vision Research, 40, 1203-1215. [PubMed]
Dakin, S. C., & Watt, R.
J. (1997). The computation of orientation statistics from visual texture.
Vision Research, 37, 3181-3192. [PubMed]
Desimone, R., & Duncan,
J. (1995). Neural mechanisms of selective visual attention.
Annual Review of Neuroscience, 18,
193-222. [PubMed]
Eckstein, M. (1998). The
lower visual search efficiency for conjunctions is due to noise and not serial
attentional processing. Psychological Science,
9, 111-118.
Graham, N., Kramer, P., &
Yager, D. (1987). Signal-detection models for multidimensional stimuli:
Probability distributions and combination rules.
Journal of Mathematical Psychology, 31,
366-409.
Green, D. M., & Swets, J.
A. (1966). Signal detection theory and
psychophysics. New York: John Wiley & Sons.
Hubel, D. H., & Wiesel, T.
N. (1968). Receptive fields and functional architecture of monkey striate
cortex. Journal of Physiology (London),
195, 215-243. [PubMed]
Morgan, M. J., &
Baldassi, S. (1997). How the human visual system encodes the orientation of a
texture, and why. Current Biology, 7,
999-1002. [PubMed]
Morgan, M. J., Ward, R. M.,
& Castet, E. (1998). Visual search for a tilted target: Tests of spatial
uncertainty models. Quarterly Journal of
Experimental Psychology, 51A, 347-370.
Palmer, J. (1994). Set-size
effects in visual search: The effect of attention is independent of the stimulus
for simple tasks. Vision Research, 34,
1703-1721. [PubMed]
Palmer, J., Ames, C. T.,
& Lindsey, D. T. (1993). Measuring the effect of attention on simple visual
search. Journal of Experimental Psychology:
Human Perception and Performance, 19, 108-130. [PubMed]
Palmer, J., Verghese, P.,
& Pavel, M. (2000). The psychophysics of visual search.
Vision Research, 40, 1227-1268. [PubMed]
Parkes, L., Lund, J.,
Angelucci, A., Solomon, J. A., & Morgan, M. (2001). Compulsory averaging of
crowded orientation signals in human vision.
Nature Neuroscience, 4, 739-744. [PubMed]
Pelli, D. G. (1985).
Uncertainty explains many aspects of visual contrast detection and
discrimination. Journal of the Optical Society
of America A, 2, 1508-1532. [PubMed]
Pelli, D. G. (1997). The
VideoToolbox software for visual psychophysics: Transforming numbers into
movies. Spatial Vision, 10, 437-442. [PubMed]
Shaw, M. L. (1980). Identifying
attentional and decision-making components in information processing. In R.
Nickerson (Ed.), Attention and
performance (Vol. VIII, pp. 106-121). Hillsdale, NJ: Erlbaum.
Shaw, M. L. (1982). Attending
to multiple sources of information. I. The integration of information in
decision making. Cognitive Psychology,
14, 353-409.
Shiu, L. P., & Pashler, H.
(1995). Spatial attention and vernier acuity.
Vision Research, 35, 337-343. [PubMed]
Solomon, J. A., Lavie, N.,
& Morgan, M. J. (1997). Contrast discrimination functions: Spatial cuing
effects. Journal of the Optical Society of
America A, 14, 2443-2448. [PubMed]
Solomon, J. A., &
Morgan, M. J. (2001). Odd-men-out are poorly localized in brief exposures.
Journal of Vision, 1(1), 9-17,
http://journalofvision.org/1/1/2, DOI 10.1167/1.1.2. [Article]
Swensson, R. G., &
Judy, P. F. (1981). Detection of noisy visual targets: Models for the effects of
spatial uncertainty and signal-to-noise ratio.
Perception & Psychophysics, 29,
521-534. [PubMed]
Tanner, W. P., Jr. (1961).
Physiological implications of psychophysical data.
Annals of the New York Academy of
Sciences, 89, 752-765.
Tyler, C. W., & Chen, C.
C. (2000). Signal detection theory in the 2AFC paradigm: Attention, channel
uncertainty and probability summation. Vision
Research, 40, 3121-3144. [PubMed]
Verghese, P. (2001). Visual
search and attention: A signal detection theory approach.
Neuron, 31, 523-535. [PubMed]
Verghese, P., &
McKee, S. P. (2002). Predicting future motion.
Journal of Vision,
2(5), 413-423,
http://journalofvision.org/2/5/5/, DOI 10.1167/2.5.5. [Article]
Verghese, P., &
Nakayama, K. (1994). Stimulus discriminability in visual search.
Vision Research, 34, 2453-2467. [PubMed]
Verghese, P., & Stone,
L. S. (1995). Combining speed information across space.
Vision Research, 35, 2811-2823. [PubMed]
Watson, A. B., & Pelli,
D. G. (1983). QUEST: A Bayesian adaptive psychometric method.
Perception & Psychophysics, 33,
113-120. [PubMed]
Wolfe, J. (2000). Visual
attention. In K. De Valois (Ed.),
Seeing (2nd ed., pp. 335-386). San
Diego, CA: Academic Press.