Volume 4, Number 10, Article 4, Pages 879-890
doi:10.1167/4.10.4
http://journalofvision.org/4/10/4/
ISSN 1534-7362
Perceptual learning: A case for early selection
Manfred Fahle
Institute of Brain Research, Human Neurobiology,
University of Bremen, Germany
Abstract
Perceptual learning is any relatively permanent change of perception as a result of experience. Visual learning leads to sometimes dramatic and quite fast improvements of performance in perceptual tasks, such as hyperacuity discriminations. The improvement is often very specific for the exact task trained, for the precise stimulus orientation, the stimulus position in the visual field, and the eye used during training. This specificity indicates that the underlying changes in the nervous system are located at least partly at the level of the primary visual cortex. The dependence of learning on error feedback and on attention, on the other hand, proves the importance of top-down influences from higher cortical centers. In summary, perceptual learning seems to rely at least partly on changes at a relatively early level of cortical information processing (early selection), such as the primary visual cortex, operating under top-down control (selection and shaping). An alternative explanation based on late selection is discussed.
History
Received April 8, 2002; published October 26, 2004
Citation
Fahle, M. (2004). Perceptual learning: A case for early selection.
Journal of Vision, 4(10):4, 879-890,
http://journalofvision.org/4/10/4/,
doi:10.1167/4.10.4.
Keywords
visual perception, plasticity of visual function, specificity of learning, error feedback, orientation dependence, monocular learning, interocular transfer
Perceptual learning: Hyperacuity as a sensitive probe
Perceptual learning is any relatively permanent change
of perception (usually improvement as measured by changes in perceptual
thresholds or brain physiology) as a result of experience. The specification
“relatively permanent” distinguishes perceptual learning from
sensitization (and habituation) as well as from priming, which denote more
transient changes in perception. In contrast to classical conditioning,
perceptual learning involves individual stimuli rather than the association of
two or more stimuli, and is not restricted to one specific response, as in
operant conditioning. Perceptual learning clearly is of the implicit or
procedural type: it does not lead to conscious insights that can be (easily)
communicated, as is the case in declarative, or factual learning. The brain
circuits storing facts and events (episodes) seem to at least partially differ
from those analyzing the outer world. Hence, in amnesic syndromes, scenes may be
analyzed without subsequent memory (e.g., after lesions of the hippocampal
formation). Perceptual learning, on the other hand, seems to change the very
cortical circuits solving the perceptual task trained. In this review, I will
present results suggesting that perceptual learning is (a) very specific for
elementary attributes of the stimulus, such as its orientation, and (b) able to
change signal processing even on the level of primary sensory cortices that were
considered as “hard wired” in adults in the not too distant
past.
The perceptual task employed to test perceptual
learning in most of the experiments reported here is vernier acuity, a type of
visual hyperacuity (Wülfing, 1892;
Westheimer, 1976). In these hyperacuity
tasks, even untrained observers can attain thresholds around 10 arcsec. (These
are thresholds calculated according to the conventional definition, while the
appropriately calculated thresholds are a factor of 2 higher, cf. Harris &
Fahle, 1995). These thresholds are at least
slightly below the spacing of foveal photoreceptors, and through training, they
can improve by up to a factor of 5 (i.e., to 2 arcsec in especially gifted and
trained observers).
Obviously, performance in these tasks is not determined
primarily by the optics of the eye nor by the photoreceptor spacing, though both
factors are important because they ensure that the requirements of the sampling
theorem for complete, high-resolution reconstruction of the original stimulus
are met (cf., Barlow, 1981; Crick, Marr,
& Poggio, 1981). Performance is instead
limited by the signal-to-noise ratio of the information reaching the cortex and
by the precision and selectivity of cortical processing. Hyperacuity is a good
choice to study learning processes in visual perception because it is a very
sensitive measure based on cortical processing. Moreover, hyperacuity is not
some freak ability of over-trained laboratory observers but can be achieved even
without specific training, and simultaneously at many positions in the visual
field (Fahle, 1991).
Specificity of perceptual learning for stimulus orientation, position, and eye trained
In a first series of experiments, we investigated the
specificity of perceptual learning on low-level features of the stimuli, such as
orientation, position in the visual field, and the eye used during training.
Observers usually sat 2 m or further away from an analogue monitor (HP or
Tektronix), controlled by a Power-Mac computer via custom-made high-speed 16-bit
D/A converters with an output rate of 1 megapixel/s. Vernier stimuli consisted
of thin (1 arcmin), bright greenish lines (around 50-400 cd/m²) on a
dark surround. Each of the vernier elements was usually about 10-arcmin long and
presented for 100-150 ms. Observers had to indicate in a binary forced-choice
task without time pressure (maximum 5-s reaction time allowed) whether the lower
(or left for horizontal verniers) element was offset to the right or left
(respectively up or down) relative to the upper (right) element. Observers had
to indicate their choice by pressing the appropriate one of two push buttons.
Usually, we recorded the number of correct responses for a vernier with a fixed
offset size for experiments consisting of only two sessions, whereas thresholds
were measured in experiments with more sessions. A staircase procedure (PEST;
Taylor & Creelman, 1967) controlled
vernier offset size in these experiments where thresholds were defined as 75%
correct responses. Individual sessions lasted for about 1 hr, usually consisting
of 20 blocks of 80 stimulus presentations each for the experiments with fixed
vernier offsets. Only one session per observer took place each day, with
sessions following each other in intervals of no more than 3 days, usually on
subsequent days.
It turned out that observers, on average, significantly
improved performance in a standard vernier discrimination task, though there was
high inter-observer variance (Fahle & Edelman, 1993). Typically, performance improved quickly
at first and more slowly after about 10-20 min of training (see Figure 1). Earlier, we had found that the
improvement in vernier discrimination through learning did not generalize to a
stimulus rotated by 90° (Poggio, Fahle, & Edelman, 1992). In the present experiment, by stepwise
reducing the rotation of the stimulus after the first training session in six
subsequent groups of observers, we found that stimulus rotation by as little as
10° was sufficient to reduce performance to baseline (i.e., there was no
generalization of improvement in vernier discrimination from one stimulus to a
stimulus rotated by as little as 10°: learning had to start from scratch
for the new orientation) (Figure 1). Even after
a stimulus rotation of 4°, performance decreased slightly and full transfer
occurred only for rotations of no more than 2° (results not shown).
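The 75%-correct threshold criterion mentioned in the methods above can be illustrated with a small sketch. It does not implement the PEST staircase itself. Instead it assumes, purely for illustration, that the observer judges the sign of the true offset plus zero-mean Gaussian internal noise, so that the proportion of correct responses follows a cumulative Gaussian. The noise value `sigma` and the function names are hypothetical and are not taken from the experiments.

```python
from statistics import NormalDist


def p_correct(offset_arcsec, sigma_arcsec):
    """Probability of a correct left/right response for a given offset.

    Assumed model (not from the source): the observer reports the sign of
    the true offset plus zero-mean Gaussian internal noise.
    """
    return NormalDist(mu=0.0, sigma=sigma_arcsec).cdf(offset_arcsec)


def threshold_75(sigma_arcsec):
    """Offset at which the assumed observer is correct on 75% of trials."""
    return NormalDist(mu=0.0, sigma=sigma_arcsec).inv_cdf(0.75)


sigma = 15.0  # hypothetical internal noise in arcsec
threshold = threshold_75(sigma)
print(f"75% threshold: {threshold:.2f} arcsec")
print(f"check: p_correct at threshold = {p_correct(threshold, sigma):.3f}")
```

Under this assumed model, learning that lowers the internal noise lowers the 75% threshold proportionally, which is one simple way to read the threshold improvements reported here.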
Figure 1. No transfer of improvement
through learning after stimulus rotation by 10 deg. Eleven observers practiced
vernier discriminations with a stimulus slanted by 5 deg relative to the
vertical. Performance (i.e., the percentage of correct responses) improved
within 1 hr of training. On the next day, a single block of presentations at the
old orientation (point immediately left of vertical red line) proved that
performance remained constant over night. When the stimulus was rotated by 10
deg to a slant of 5 deg in the opposite direction, performance dropped to
pretraining levels (first point to the right of vertical line). The first
orientation was retested at the end of the experiment. Means and
SEs of 11 observers (after Fahle, 1998).
Improvement is similarly specific for position in the
visual field. Eight observers practiced vernier discrimination sequentially at
eight positions in the visual field, in pseudo-random order. These positions
were all located on an imaginary circle around the fovea with a radius of
10° (i.e., all stimuli were presented at 10° eccentricity). Observers
practiced vernier discrimination for 1 hr at each of the positions while
fixation was monitored. During this time, they improved performance, on average,
by 7% (e.g., from 80% to 87% correct responses) (Figure 2).
Figure 2. Specificity of perceptual learning for the visual field position trained. Eight observers practiced vernier discriminations sequentially at 8 positions at 10° distance from the fovea. At each position, their mean performance improved during the 1 hr of training by, on average, 7% (with one exception: position 4). But when proceeding to the next visual field position, performance dropped by roughly the same amount. Hence, improvement did not transfer between different visual field positions (after Fahle, Edelman & Poggio, 1995).
However, when stimuli were presented at a new visual
field position, performance dropped by 7%; hence, there was no transfer at all
between these visual field positions. Learning, it seems, is highly specific for
location in the visual field: even these regularly spaced locations,
separated by no more than (2 × 10 × π/8)
≈ 8° from their nearest
neighbors, showed no sign of transferring the improvement
achieved during training. A
number of different perceptual learning tasks showed a similar specificity for
visual field position (Dill & Fahle, 1997; cf., however,
Beard et al., 1996).
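The spacing estimate of about 8° between neighboring positions can be checked with a few lines of arithmetic. The sketch below treats visual angle as planar, which is an approximation, and computes both the arc length along the 10° circle (the quantity given in the text) and the straight-line chord between neighbors. The function name is hypothetical.

```python
import math


def neighbor_spacing_deg(eccentricity_deg, n_positions):
    """Spacing between adjacent positions equally spaced on a circle.

    Returns (arc, chord) in degrees of visual angle, using a planar
    approximation of the visual field.
    """
    arc = 2 * math.pi * eccentricity_deg / n_positions
    chord = 2 * eccentricity_deg * math.sin(math.pi / n_positions)
    return arc, chord


arc, chord = neighbor_spacing_deg(10.0, 8)
print(f"arc: {arc:.2f} deg, chord: {chord:.2f} deg")
# prints "arc: 7.85 deg, chord: 7.65 deg"
```

Either measure rounds to roughly 8°, so the conclusion about the lack of transfer between neighboring positions does not depend on which one is used.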
In the third experiment, a new group of six observers
practiced vernier discriminations monocularly, starting with either the left or
the right eye patched for five sessions. Observers improved monocular
performance during these first five sessions, similar to that seen in binocular
improvement. At the start of the sixth session, the contralateral eye was
patched instead. Improvement did not transfer to the previously patched eye;
hence, learning was specific even for the particular eye trained (Figure 3). Several other groups found a similar
specificity of perceptual learning for the particular eye and visual field
position trained in completely different tasks (e.g., Karni & Sagi, 1991).
Figure 3. Specificity of perceptual
learning for the eye used during training. Half of the observers practiced vernier
discrimination with the left eye patched, whereas the right eye was patched for
the other half. After 5 days of training for 1 hr daily, the
contralateral eye was patched during training. Thresholds had improved
significantly over the first 5 days, but increased with an overshoot when the
patch was moved to the contralateral eye (after block 22). Means and
SEs of six observers (after Fahle in
Fahle & Poggio, 2002).
Specificity of perceptual learning for different tasks based on orientation cues
Discriminating the orientation of short line elements
obviously relies on some form of orientation discrimination. The same seems to
be true for vernier discrimination (see Watt, 1984) and also for curvature detection (see Kramer
& Fahle, 1996).
Eighteen observers were divided into six groups who
practiced vernier, curvature, and orientation discrimination for 1 hr each in
counterbalanced order. Each of the three parts of Figure 4 shows the results of three groups of six
observers, each group practicing one of the three tasks for 1 hr each during
each of the sessions. It is obvious that observers improved through training in
each of the sessions but that the improvement did not transfer to another task
(cf., Fahle, 1997). Moreover, separate
analysis of the data of each of the six groups revealed no indication of
transfer between any pair of
tasks.
Figure 4. Upper panel. Not only
orientation discrimination but also vernier offset and curvature discrimination
can be based on orientation cues. Here, the discrimination is between a slant to
the right versus a slant to the left. Lower panel. Performance as a function of
practice in three hyperacuity tasks based on discrimination of orientation cues:
vernier, orientation, and curvature discrimination (after Fahle, 1997). The same number of observers practiced
each of these tasks in each of the sessions; hence, the stimulus condition is
counterbalanced between sessions. In each session, observers significantly
improve performance, and retain this improvement on the first block of the next
session, usually on the next day (first data point to the right of the first two
vertical lines). Changing to a new type of orientation-discrimination task
(second points to the right of lines) decreases performance to baseline level.
The very last data point tests performance for the condition of the first
session. Mean results and SEMs for 18
observers. Insets give slopes and correlation factors of linear regressions
through the data points of each session.
Perceptual learning without (explicit) memory: Amnesic patients
A recent study of six patients suffering from amnesic
syndrome supports the hypothesis of an involvement of relatively low levels of
cortical processing in perceptual learning, rather than a purely cognitive level
of learning. Testing of the patients was similar to that of normal observers,
apart from the fact that the intervals between the response of the patient and
the next stimulus presentation were 3-4 s rather than 0.5 s as with the normal
observers. Moreover, patients indicated the direction of offset verbally, and
the experimenter then pushed the corresponding buttons without himself seeing
the stimuli. Two of the six patients tested clearly improved performance within
as few as 2 sessions with 5 blocks of 80 presentations each, tested at a 1-week
interval (i.e., receiving less than half the number of stimulus presentations
used to train normal observers) (Figure 5;
Fahle & Daum, 2002). Another two
patients improved somewhat, whereas performance of the remaining two patients
stayed constant, similar to the results of about 15% of the healthy student
population of around 300 observers we tested so far. Although after the 1-week
gap following the first session the patients did not recollect that they had
ever before participated in such an experiment (and did not remember the
experimenter), the performance of the six patients as a group improved
significantly as a result of training. This finding indicates that perceptual
learning does not require normal function of the neuronal circuits underlying
explicit or declarative
learning.
Figure 5. Improvement in a vernier
discrimination task in six amnesic patients (after Fahle & Daum, 2002). Six patients suffering from amnesic
syndrome practiced vernier discriminations for 5 blocks with 80 presentations
each and for another 5 blocks 1 week later (gap between blocks 5 and 6 indicates
1-week interval). Thresholds of two patients (RE and MH) improved dramatically
as a result of training, whereas those of the remaining four patients either
improved somewhat (JR and HW) or not at all (AS and HS). Hence, at least some
amnesic patients are capable of perceptual learning.
Does improvement through training in perceptual tasks
require attention or is it automatic, that is, based on mere stimulus
presentation? A recent study reports that performance for detecting the
predominant direction of dot motion improves even if this motion is not
consciously perceived. Hence improvement can be independent not only of
attention to the stimulus but also of conscious perception (Watanabe,
Náñez, & Sasaki, 2001)!
In hyperacuity learning, on the other hand, attention
certainly seems to play an important role. When two vernier stimuli are
presented simultaneously, resembling a cross, and only one of the verniers is
attended, offset discrimination only for this stimulus improves over the course
of training. Half of observers started by indicating the offset of the
horizontal vernier (up versus down; cf., Figure
6; Herzog & Fahle, 1994), whereas the
other half attended to the vertical vernier and indicated its offset (left
versus right). After 1 hr of training, observers’ tasks were exchanged:
those initially responding to the offset of the horizontal vernier now attended
to the vertical one and vice versa. Performance dropped at this transition,
although the stimulus had not been changed at all and only the task was different (Figure 6): an argument against motor improvement
(of accommodation or fixation; see below) as the basis of perceptual learning.
On the other hand, changing the motor instructions (press left button for right
offset and vice versa) yielded perfect transfer of improvement (data not shown).
Hence, the mere presentation of the stimulus elements was not sufficient for
improvement, but the elements had to be attended to yield improved performance,
in contrast to the results with random dot kinematograms (Watanabe et al., 2001).
Figure 6. Two vernier stimuli are
combined. Observers start with discriminating offset directions of either the
horizontal or the vertical vernier stimulus. After 1 hr of training (in the
second session), they respond to the perpendicular stimulus that had not been
attended to during the first session. This switch of attention to another task
decreases performance without any change of the physical stimulus (after Herzog
& Fahle, 1994). One of the vernier targets
is shown as a dotted line in each of the stimuli of the graph only to indicate
that this vernier was not attended to. The physical stimuli were always solid
lines in the experiments and did not change over the course of the
experiment.
Improvement through learning in hyperacuity tasks is
possible even in the absence of external error feedback (McKee & Westheimer,
1978; Fahle et al., 1995) but often is significantly faster if
feedback is present. If only half of the incorrect responses are followed by an
error signal (incomplete feedback), observers’ improvement is almost as
fast as with complete feedback where each incorrect response leads to an error
signal (Herzog & Fahle, 1997).
This finding poses difficult problems for neuronal
network theories of perceptual learning based on a teacher signal, which allows
the observer to classify each stimulus. With partial feedback, half of the
incorrect responses would be classified as being correct, and this should
strongly decrease learning but it does not. (A possible remedy would be to use
only the error signals for response modification, but this method would not be
able to reliably discriminate between correct versus incorrect responses.)
Random feedback signals, on the other hand, without any correlation to the
correctness of the response, effectively prevented improvement through training
if observers assumed that they received correct error feedback (Herzog &
Fahle, 1997).
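The problem that partial feedback poses for teacher-signal models can be made concrete with a short calculation. The sketch assumes a hypothetical learning rule that treats every trial without an error signal as correct. The accuracy value of 0.8 is illustrative only and is not taken from the experiments.

```python
def mislabeled_fraction(p_correct, p_error_signaled):
    """Fraction of all trials on which the implied teacher label is wrong.

    Assumption (hypothetical rule): a trial without an error signal is
    treated as correct, so every unsignaled error is mislabeled.
    """
    return (1.0 - p_correct) * (1.0 - p_error_signaled)


def purity_of_assumed_correct(p_correct, p_error_signaled):
    """Share of 'assumed correct' trials that really were correct."""
    unsignaled_errors = mislabeled_fraction(p_correct, p_error_signaled)
    return p_correct / (p_correct + unsignaled_errors)


p = 0.8  # hypothetical proportion of correct responses
for label, f in (("complete", 1.0), ("partial", 0.5)):
    print(label,
          f"mislabeled={mislabeled_fraction(p, f):.3f}",
          f"purity={purity_of_assumed_correct(p, f):.3f}")
```

With these illustrative numbers, partial feedback leaves one trial in ten carrying a wrong teacher label, which is the contamination that such models would be expected to suffer from and that the behavioral data apparently do not show.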
Improvement is about as fast with block feedback when
the percentage of correct responses is indicated after each block of 80
presentations as it is for complete trial-by-trial feedback (Herzog & Fahle,
1997). Most surprisingly, manipulating the
block feedback in a way similar to the condition of random feedback, by
presenting a number uncorrelated with the actual performance of the observer,
also prevents improvement (Herzog & Fahle, 1997). Hence, feedback can strongly influence
the speed and extent of visual learning, indicating that several top-down
influences, not just attention, must play a major role in this type of learning,
even if it occurs partly on early levels of cortical information
processing.
The role of motor factors
Extremely high visual resolution is possible only at
the very center of the visual field, subserved by the foveola. Resolution starts
deteriorating at a distance of less than 1 deg from the center. So to achieve
optimal performance, targets for most tasks have to fall into this very center
of the fovea (which is still more than 20-30 photoreceptor diameters wide).
This may not be the case for inexperienced observers in the darkish experimental
room under somewhat artificial viewing conditions: Observers may not be able to
maintain a sufficiently precise fixation for the duration of the whole
experiment. Similarly, for optimal retinal image quality, accommodation has to
be very precise, and this may not be easy to achieve throughout a whole session
for inexperienced observers. So several skeptics argued that in reality,
improvement through training might be the result of motor learning: improvement
of accommodation, or fixation, or both. (After all, improvement through training
is based on motor improvement in many forms of procedural learning.) These
skeptics continued by arguing that this motor improvement would be specific for
the stimulus and eye employed and hence disappear after any change of
orientation or eye used in the experiment.
A simple experiment ruled out this suspicion. Half of
observers started to practice a three-dot vernier discrimination task. Here, the
task was to indicate, in the usual binary forced-choice way, whether the middle
dot was offset to the right or left relative to an imaginary line through the
two end points. The other half of observers performed a three-dot bisection task
(i.e., they had to indicate whether the middle one of three dots was closer to
the upper or to the lower end point). These two stimuli are very similar to each
other indeed: Thresholds for discrimination of vernier offsets are in the order
of 10-15 arcsec, usually higher by a factor of around 2 for the bisection task.
Hence, position of the middle point differs by about 1 photoreceptor diameter
between a vernier offset to the left versus to the right and similarly between
displacement up versus down. Differences in the position of the middle dot are
even smaller between the stimuli for two tasks (e.g., between a middle dot
displaced up and one displaced to the right) (Figure 7 and Fahle & Morgan, 1996). According to the theorem of Pythagoras,
this difference between dot positions would be $\sqrt{n_1^2 + n_2^2}$, if $n_1$ and $n_2$
are the displacements at threshold for vernier and for bisection, respectively,
or $\sqrt{2}\,n \approx 1.4n$ for $n_1 = n_2 = n$.
The difference between displacements to the left versus to the right, on the
other hand, would be $2n$.
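The geometric comparison above can be verified numerically. The sketch below places the middle dot at its four threshold positions (left, right, up, down) and compares the distance between two positions belonging to different tasks with the distance between the two positions of one task. The unit displacement is arbitrary and the function names are hypothetical.

```python
import math


def between_task_distance(n1, n2):
    """Distance between a dot displaced sideways by n1 (vernier)
    and a dot displaced vertically by n2 (bisection)."""
    return math.hypot(n1, n2)


def within_task_distance(n):
    """Distance between a dot displaced by n in opposite directions."""
    return 2 * n


n = 1.0  # arbitrary unit displacement, with n1 = n2 = n
print(f"between tasks: {between_task_distance(n, n):.2f} n")
print(f"within task:   {within_task_distance(n):.2f} n")
```

The between-task distance (about 1.4n) is smaller than the within-task distance (2n), which is the basis for calling the stimuli of the two tasks virtually identical.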
Figure 7. Failure to transfer perceptual
improvement between virtually identical stimuli, due to task difference (after
Fahle & Morgan, 1996). Half of the
observers started with a three-dot vernier discrimination task; the other half
of the observers started with practicing a three-dot bisection task for about 1
hr. The transition between tasks is indicated by the thin vertical lines. The
next day observers exchanged tasks. There was no transfer of improvement between
the tasks, though the stimuli were virtually identical. The nearest data point
to the left of the vertical line (21st block) was recorded on the second
day.
Improvement does not transfer between the three-dot bisection and vernier tasks. A repeated measures analysis of variance with two within factors (block and sequence) on the individual data of all observers yielded a significant difference between the first and second condition, with lower performance during the second condition (76.8 ± 0.84) than during
the first condition (80.8 ± 0.75). These results clearly demonstrate
that transfer of improvement may fail even between virtually identical stimuli.
The relevant parameter seems to be the task required from the observer, and
motor components such as steady fixation or accommodation do not play an
important role. Improvement of any of these factors through training with either
the three-dot vernier or bisection task should not be disrupted by a position
change of the middle dot smaller than that between the two stimuli discriminated
during the first part of the
experiment!
What could possibly be the cortical mechanisms
underlying the improvement of vernier discrimination through perceptual
learning? We saw above that improvements on the motor side, as are common in
many forms of procedural learning, can be excluded. So we have to look for
improvements on the sensory side.
Two straightforward lines of reasoning based on
improved sensory processing are able to explain the specificity of perceptual
learning. The first one emphasizes changes in receptive field structure of
specific cortical neurons, whereas the second one emphasizes improvements in
signal selection. These lines of argument are not mutually exclusive, but rather
differ in the type of approach: more physiological versus more formal. Clearly,
changes in input selection of any neuron lead to a change in its receptive field
structure and simultaneously to changed signal selection, so the two processes
are intrinsically related.
For both lines of argument, the question arises
concerning the exact level of processing at which improvement is achieved. This
question concerning the neuronal level of perceptual learning is, in many
respects, similar to the one regarding the location, in the nervous system, of
the neuronal process underlying stimulus selection based on attention. As with
attentional processing, one could contrast an early selection hypothesis of
perceptual learning with a late selection hypothesis (for a discussion on
attention-based selection processes, see Broadbent, 1958; Deutsch & Deutsch, 1963; Johnston & Heinz, 1979).
Here I address this basic controversy regarding the
cortical level on which perceptual learning operates. It has been argued that
the specificity of perceptual learning indicates an early level of the
underlying cortical modifications (e.g., Poggio et al., 1992), whereas this has been questioned by
others. Mollon and Danilova (1996; cf. also
Morgan, 1992) pointed out convincingly
that learning might take place on levels beyond the primary visual cortex in
spite of the high stimulus specificity. These authors argued that even though
neurons on these higher levels are binocularly activated, the activations
stemming from each of the two eyes might differ from each other (e.g., as a
result of slight differences in the photoreceptor geometry between the two
eyes). The controversy will be exemplified first in terms of signal detection
theory and second in terms of physiology (i.e., receptive fields). Hence, in the
following, I will discuss both psychophysical and neurophysiological
findings.
Improvement of signal detection: Early selection
By selecting those signals discriminating best between
two stimuli while ignoring those that respond in a similar way to both stimuli,
the decision level can benefit from a greatly improved signal-to-noise ratio
and, hence, improve performance (cf., Pelli, 1985; similar processes may happen during
childhood; cf., Andrews, 1964). This is
to say that perceptual learning might be based, to a certain amount, on changing
the weights with which individual inputs influence the overall reaction of the
observer. Thus, by eliminating the influence of uninformative inputs, the amount
of noise at the decision level is decreased.
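The benefit of reweighting inputs can be sketched with a simple signal detection calculation. It assumes, for illustration only, that the decision variable is a weighted sum of independent channels with equal Gaussian noise, of which only a few respond differently to the two stimuli. The channel counts and values are hypothetical.

```python
import math


def d_prime(weights, signal_means, noise_sd):
    """Discriminability of a weighted sum of independent noisy channels.

    signal_means holds each channel's mean response difference between
    the two stimuli; every channel is assumed to have the same noise SD.
    """
    signal = sum(w * m for w, m in zip(weights, signal_means))
    noise = noise_sd * math.sqrt(sum(w * w for w in weights))
    return signal / noise


n_channels, n_informative = 100, 4  # hypothetical numbers
means = [1.0] * n_informative + [0.0] * (n_channels - n_informative)

uniform = [1.0] * n_channels  # before learning: all inputs weighted equally
selected = [1.0] * n_informative + [0.0] * (n_channels - n_informative)

print(f"uniform weights:  d' = {d_prime(uniform, means, 1.0):.2f}")
print(f"selected weights: d' = {d_prime(selected, means, 1.0):.2f}")
```

In this toy case, dropping the uninformative channels leaves the signal unchanged but removes their noise, so discriminability rises by the square root of the ratio of total to informative channels.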
Rejection (e.g., by inhibition) of the less relevant
signals could occur principally on all levels of signal analysis before the
decision stage. Two straightforward alternatives come to mind: (1) changes in
signal processing by early selection (e.g., on a level where neurons are still
monocularly activated but show orientation specificity); and (2) late selection
with improvement of input selection and/or signal processing on higher
processing levels and lack of transfer due to subtle changes of activation
patterns elicited by, for example, presentation to different eyes (for monocular
presentation).
Better and more appropriate selection and processing of
input signals on an early level is the most straightforward explanation for the
high stimulus specificity and the lack of transfer between similar stimulus
positions, orientations, and between the eyes (Poggio et al., 1992).
But such a permanent change in input selection on an
early stage would interfere with other perceptual tasks. Moreover, purely
bottom-up driven modifications of input selection cannot explain dependence of
performance on feedback and the lack of transfer between vernier, orientation,
and curvature discrimination (see above and, e.g., Herzog & Fahle, 1998).
Improvement of signal detection: Late selection
The second alternative, more adequate input selection
and processing of signals exclusively on higher processing levels, would require
that the inputs, in the case of monocular stimulus presentation, are too
different to allow generalization between the eyes (Mollon & Danilova, 1996; an extreme form of this alternative would
be input selection on the level of the decision process). This hypothesis allows
for incorporating the effects of attention and feedback. However, the crucial
assumption is that the inputs from one eye differ clearly from those of the
other eye, and that the inputs used at one stimulus orientation differ from
those used at a very similar orientation.
This alternative explanation has to cope with several
problems, too. In healthy observers, the retinal mosaic is relatively similar in
the two eyes and observers cannot, even after training, indicate which eye was
stimulated during short monocular presentations (Helmholtz, 1867). At the same time, the variance of the
exact pattern of neuronal activity evoked by a simple line stimulus must be
enormous in each of the eyes due to fixation instability and tremor. Even under
optimal fixation, observers move their eyes over an area with a side-length of
more than 1 arcmin, and probably much more. Hence, the same stimulus (e.g., the
three dots presented for the three-dot bisection/vernier discrimination task; Figure 7) will activate different groups of
neurons, to a widely differing amount, even if presented repeatedly to the same
eye. The amplitude of eye movements is larger than the (vernier) offset to be
detected. Hence, it would be surprising if the higher cortical areas were able
to achieve such an impressive improvement of performance on the basis of a
highly variable monocular input within a few hundred stimulus presentations. On
the other hand, observers are completely unable to make use of this improvement
when analyzing the ensemble of inputs from the other retina when the partner eye
is tested. These considerations argue against any explanation of eye-specific
learning based on “labeled lines,” given the binocular nature of
most neurons beyond the primary visual
cortex.
A possible solution: Top-down control
This indirect argument does not safely refute late
selection. But the additional evidence of changes induced by perceptual learning
in early components (latency around 50 ms) of evoked potentials, especially over
the occipital cortex (Fahle & Skrandies, 1994), and the results of animal experiments
favor the early selection hypothesis. Moreover, I would argue that selection is
most beneficial if exerted on a neuronal level as early as possible and that
learning occurs at the lowest appropriate level in the visual system (cf.,
Karni, 1996).
The primary visual cortex represents the visual world
in an ordered way with high positional precision and small receptive fields,
whereas receptive fields increase on later levels of visual processing. Hence,
suppression of all inputs not activated by a target is essential to isolate this
target from nearby objects that could interfere with processing of the target.
Suppression would yield highest improvement if exerted before the signals
originating from the target converge, on higher cortical levels, with the
signals evoked by other objects. Hence, I would advocate the first alternative:
selection of the optimal input signals on an early level under top-down-control,
combined with early modification of processing. A change of processing on this
early level must occur, ideally in a task-specific way, under top-down control,
changing, for example, lateral interactions between neurons. Task-specific
selection of the input best suited to solve the task at hand would lead to
temporary modifications of receptive field structure of at least some of the
receptive fields on this level, and there exists an intimate relationship
between input selection and cortical processing. We will now consider possible
neuronal implementations for changes of input
selection.
Neuronal mechanisms: Early selection compatible with neurophysiology of primary visual cortex?
The specificity for orientation, position, and
especially the eye trained immediately points to a specific location, in the
visual system, of the neuronal changes underlying perceptual learning. Only
area 17, the primary visual cortex, contains neurons sensitive both to stimulus
orientation (unlike those in the retina and the lateral geniculate nucleus
[LGN]) and to the eye stimulated (cf., Figure 8).
The most parsimonious explanation for the results presented above is that
perceptual learning relies on an improvement in processing by those neurons in
area 17 best suited to discriminate between verniers offset in opposite
directions, hence an early selection (see Poggio et al., 1992). If this explanation of the experimental
results is correct, then perceptual learning would involve changes in
connectivity between neurons already on the level of the primary visual
cortex.
Figure 8. Schematic diagram of the visual
system indicating the only location of orientation-specific monocular neurons.
Neurons in the retina and in the lateral geniculate nucleus (LGN) have
rotation-symmetric receptive fields; hence, they cannot discriminate between
stimuli oriented at different angles and cannot subserve orientation-specific
learning. Only some of the orientation-specific neurons in the primary visual
cortex are monocularly driven, whereas neurons in higher visual projection areas
are usually binocularly driven and should be unable to discriminate between
separate stimulations of the two eyes. So the most parsimonious explanation for
eye specificity of perceptual learning is based on the assumption that these
monocular cells are involved in the learning.
This may seem an implausible assumption because the
primary visual cortex was considered for quite some time to be a hard-wired
first stage of analysis (Marr, 1982). But more
recent electrophysiological evidence points to plasticity even in the adult
primary visual cortex (Gilbert & Wiesel, 1992; Eysel, Eyding, & Schweigart, 1998; Fahle & Skrandies, 1994; Godde, Leonhardt, Cords, & Dinse, 2002), supporting the hypothesis that perceptual
learning may involve plasticity even of primary visual cortex. The recent
electrophysiological results demonstrating plasticity of the adult primary
visual cortex, therefore, fit nicely with the assumptions developed on the basis
of psychophysical experiments demonstrating the specificity of perceptual
learning as cited above. To conclude, physiological knowledge should no longer
prevent us from speculating about plasticity of the primary visual cortex, hence
from assuming modification of an early cortical
level.
Neuronal mechanisms: Improvement of orientation tuning on an early level?
Assuming that perceptual learning modifies receptive
fields in the primary visual cortex, as suggested by the hypothesis of early
selection, what type of modification would we expect? Receptive fields of
neurons in the primary visual cortex typically consist of an elongated
excitatory field center determining the orientation specificity of the neuron
and of inhibitory surrounds on both sides of the center. The neuron is most
strongly activated by light falling on the receptive field center without
extending into the inhibitory surround. The receptive fields of neurons with
different orientation preferences and slightly differing receptive field
positions are clearly able to discriminate between a straight vernier and an
offset one, or between an offset to the left and an offset to the right (Figure 9; cf., Wilson, 1986). The precision of discrimination depends,
among other factors, on the width of the receptive field center. So a
straightforward and plausible hypothesis regarding the neuronal changes
underlying perceptual learning with vernier stimuli would be that learning leads
to permanently narrower receptive field centers and hence a narrower orientation
bandwidth on an early level, such as the primary visual cortex, a hypothesis discussed
(and rejected) by Herzog and Fahle (1998)
(Figure 9).
Figure 9. A simple hypothesis about the
neuronal basis of visual hyperacuity with vernier stimuli, postulating that
improvement through training may be due to narrower receptive field centers.
Neurons in the visual cortex have receptive fields with antagonistic
center-surround characteristics. Neurons are optimally activated by stimuli
restricted to their receptive field centers without activating the surround.
Narrowing of the field center means that the neurons are better able to
discriminate between different stimulus orientations, and between offset
directions. Most models use orientational mechanisms rotated by several tens of
degrees relative to the target orientation (off-center mechanisms; cf., left and
right parts of figure; Findlay, 1973; Mussap
& Levi, 1996; see also Morgan, 1986).
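The narrowing hypothesis can be made concrete with a toy model. The following Python sketch is only an illustration under stated assumptions, not a model taken from the studies cited here: two receptive fields tilted to either side of the vernier orientation each have a Gaussian excitatory center and a weaker, broader inhibitory surround, and the difference between their responses serves as a signal for offset direction. The function names (`rf_weight`, `response`, `offset_signal`) and all parameter values (20° tilt, center widths of 1 and 2 arcmin, surround gain and width ratio, segment length) are hypothetical.

```python
import math

def rf_weight(u, sigma_center, surround_gain=0.5, surround_ratio=3.0):
    """Weight at perpendicular distance u (arcmin) from the receptive field axis.

    Excitatory Gaussian center minus a broader, weaker inhibitory Gaussian.
    All parameter values are illustrative assumptions.
    """
    center = math.exp(-u * u / (2.0 * sigma_center ** 2))
    sigma_surround = surround_ratio * sigma_center
    surround = surround_gain * math.exp(-u * u / (2.0 * sigma_surround ** 2))
    return center - surround

def response(offset, tilt_deg, sigma_center, half_length=10.0, steps=2000):
    """Summed weight along a vernier: upper segment at x = +offset/2,
    lower segment at x = -offset/2, each half_length arcmin long.
    The receptive field axis is tilted by tilt_deg from vertical."""
    theta = math.radians(tilt_deg)
    dy = half_length / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * dy  # midpoint rule along each segment
        u_upper = (offset / 2.0) * math.cos(theta) - y * math.sin(theta)
        u_lower = (-offset / 2.0) * math.cos(theta) + y * math.sin(theta)
        total += (rf_weight(u_upper, sigma_center)
                  + rf_weight(u_lower, sigma_center)) * dy
    return total

def offset_signal(offset, sigma_center, tilt_deg=20.0):
    """Response difference between fields tilted right and left of vertical.
    Its sign indicates the direction of the vernier offset."""
    return (response(offset, +tilt_deg, sigma_center)
            - response(offset, -tilt_deg, sigma_center))

if __name__ == "__main__":
    for sigma in (2.0, 1.0):  # wide versus narrow field center (arcmin)
        print(sigma, offset_signal(0.2, sigma), offset_signal(-0.2, sigma))
```

In this sketch the sign of the signal follows the offset direction, and the narrower center yields the larger signal for the same offset, which is the intuition behind the hypothesis. Whether real cortical neurons change in this way is exactly what the following paragraphs call into question.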
A general consideration and two specific examples argue
against this hypothesis in its “feedforward” form. First, a change
of receptive field structure could have consequences for the processing of
virtually all visual stimuli, with potentially deleterious effects for other
visual tasks, such as the detection of low-contrast stimuli. The receptive field
may become too small to detect small differences in luminance. (For such
reasons, Marr [1982] postulated a hard-wired
[i.e., nonplastic] early level of processing.)
The first example immediately follows from this
consideration. We know that the receptive fields in the primary visual cortex
are similar to the ones shown in Figure 9, and that
all the information flowing to subsequent levels of analysis passes
through these early filters. If the receptive field width decreased as a
consequence of perceptual learning, the signal-to-noise ratio and hence
performance for orientation discrimination would improve. In consequence,
performance for all perceptual tasks relying on orientation discrimination
should improve. Practicing vernier discriminations should lead to better
orientation discrimination, and vice versa, and practicing either task should
transfer to curvature detection because for low curvatures, too, the feature
used for discrimination seems to be an orientation cue (Kramer & Fahle, 1996). However, improvement through training did
not transfer between these three tasks (see Figure 4).
The second example supporting the argument against
permanent modifications of receptive fields as the basis of vernier learning is
based on the specificity for stimulus orientation. If learning relied on the
narrowing of early, orientation-sensitive receptive field centers, the
improvement should transfer to similar stimulus orientations, as long as the
same neurons are involved in the detection process. This consideration raises
the question about the orientation bandwidths of cortical neurons. These have
not been measured directly in humans, but neurons in the macaque cortex show
orientation-bandwidths of between 20° and 60° (Movshon &
Blakemore, 1973). That is, neuronal
responses to an oriented bar are strongest at an optimal
orientation and fall to half the maximal response if the stimulus is rotated by
10° to 30° to either side. (Neurons with small receptive fields, i.e.,
with best resolution for fine grating stimuli, show the narrowest orientation
tuning.) As we saw in Figure 1, the orientation
tuning of perceptual learning is much finer than 20°. A rotation of the
stimulus by 10° suffices to require completely new learning at the rotated
stimulus orientation. This is a strong argument against the hypothesis that
perceptual learning is based primarily on a permanent modification of receptive
field structure in early visual areas. According to this hypothesis, training
with a stimulus of a given orientation would lead to a narrowing of a wide range
of orientation-selective receptive fields, whereas we find a very strict
orientation selectivity of improvement.
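The size of the mismatch can be checked with a short calculation. The sketch below assumes a Gaussian tuning curve, which is an illustrative choice and not a claim about the cited measurements, and takes the half-widths at half-height of 10° and 30° given above. The function name `tuning_response` is hypothetical.

```python
import math

def tuning_response(rotation_deg, half_width_deg):
    """Relative response of a neuron with Gaussian orientation tuning
    (an assumed shape) to a stimulus rotated away from its optimum.
    half_width_deg is the half-width at half-height of the tuning curve."""
    return math.exp(-math.log(2.0) * (rotation_deg / half_width_deg) ** 2)

if __name__ == "__main__":
    for half_width in (10.0, 30.0):
        r = tuning_response(10.0, half_width)
        print(f"half-width {half_width:.0f} deg: "
              f"relative response after 10 deg rotation = {r:.2f}")
```

Under these assumptions, a neuron tuned to the trained orientation still responds at about 50% (half-width 10°) to about 93% (half-width 30°) of its maximum after a 10° rotation. The same neurons would therefore remain strongly involved at the rotated orientation, so a permanent narrowing of their receptive fields should transfer at least partly, contrary to what is observed.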
In summary, perceptual learning of hyperacuity tasks is
not just a permanent sharpening in the orientation tuning of the (relatively)
peripheral orientation-specific filters. The improvement is specific for each of
the different tasks based on detection of differences in line orientation, and
its specificity for stimulus orientation is far finer than the bandwidth of the
cortical neurons subserving orientation discrimination. Hence, the assumption
that permanently active modifications of early receptive fields
are the exclusive basis of perceptual improvement in a strictly feedforward
system lacks plausibility.
Neuronal mechanisms: Modification on a late cortical level?
We realize that perceptual learning is unlikely to rely
on the permanent modification of receptive field properties of
“early” cortical neurons (e.g., by sharpening their orientation
tuning). The dependence of perceptual learning on attention and on feedback adds
plausibility to the view that improvement cannot be based exclusively on
exposure-dependent bottom-up processes permanently changing signal processing in
the primary visual cortex (i.e., for all visual tasks; see Herzog & Fahle,
1994, 1998). Moreover,
it is undisputed that learning can change processing of visual information on
higher or more cognitive levels of cortical information processing.
Traditionally, the effects of perceptual learning have been attributed
exclusively to changes on these levels. More recently, as the additional
involvement of early levels became clear, the interplay between these different
levels has been elaborated on by classifying different types or levels of
perceptual learning (Ahissar & Hochstein, 1997). Hence, the question is not whether
perceptual learning involves higher cortical levels (it does) but whether
it additionally involves the primary visual
cortex.
What might be the changes of receptive
fields on higher levels of cortical signal processing? A number of different
scenarios are possible. Basically, the neurons on higher levels of processing
may use more complex features to discriminate vernier offsets to the right from
those to the left. Through training, they would learn which input neurons on
preceding or lower levels of cortical processing are best suited to discriminate
between those two classes of stimuli. Learning would consist, at least partly,
in assuring a higher impact of these input neurons on the discriminating neurons on
the higher level and, hence, in changing the receptive fields of the higher-level neurons
in a task-specific way.
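One way to picture such a late-selection scheme is a single higher-level unit that reweights its inputs using error feedback. The following Python sketch is a minimal illustration under assumptions of my own choosing, not a model proposed in the literature cited here: a linear readout trained with the delta rule, eight lower-level inputs of which only two carry information about offset direction, and Gaussian noise on every input. The function name `train_readout` and all numbers are hypothetical.

```python
import random

def train_readout(n_trials=4000, n_inputs=8, learning_rate=0.005, seed=0):
    """Train a linear readout with the delta rule (an assumed learning rule).

    Inputs 0 and 1 are hypothetical lower-level units whose mean response
    depends on offset direction; the remaining inputs carry only noise.
    Returns the learned weights of the higher-level unit.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_inputs
    for _ in range(n_trials):
        label = rng.choice((-1.0, 1.0))  # -1: offset left, +1: offset right
        inputs = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
        inputs[0] += label  # unit responding more to right offsets
        inputs[1] -= label  # unit responding more to left offsets
        output = sum(w * x for w, x in zip(weights, inputs))
        error = label - output  # error feedback after each trial
        weights = [w + learning_rate * error * x
                   for w, x in zip(weights, inputs)]
    return weights

if __name__ == "__main__":
    for index, weight in enumerate(train_readout()):
        print(index, round(weight, 2))
```

In this sketch the two informative inputs end up with clearly larger weights of opposite sign, while the weights of the uninformative inputs stay close to zero. The higher-level unit has thereby selected its inputs without any change in the lower-level units themselves.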
However, these higher level cortical neurons tend to be
binocularly activated and to possess large receptive fields. Therefore, they
would be expected to be less specific for the exact stimulus parameters and be
more disturbed by flanking lines at close distance to the test stimuli (but see
below for a possible counter-argument based on labeled lines).
Let us summarize the findings based on the more
physiological approach to answer the question concerning late versus early
selection or change in receptive field properties. The high stimulus specificity
of perceptual learning with lack of transfer between very similar stimulus
orientations and the lack of transfer between the two eyes as well as the
ability of observers to suppress nearby flanking lines through training (Spang,
Herzog, Holland-Moritz, Stein, & Fahle, 2000) all support the argument in favor of
plasticity involving the level of the primary visual cortex. But the dependence
of learning on error feedback, the lack of generalization between tasks based on
orientation discrimination, as well as theoretical considerations, argue
strongly against plasticity on this early level. These considerations against an
early modification of receptive fields could be invalidated by the postulate
that the structure of receptive fields would be adjusted to each task by
top-down influences from higher cortical areas. A possible explanation of the
stimulus specificity of perceptual learning could then rely on a change of
receptive field properties of low-level cortical neurons, in a task-specific
way, under top-down control. Some final arguments for this proposal
follow.
Early versus late selection: Selected evidence for early selection
It is hardly possible to isolate, in a complex
recurrent system, the level on which a change occurs by means of black-box
methods such as psychophysics. In trying to resolve the controversy between
early and late learning, it will be helpful also to consider the results of
neurophysiological studies. It is reassuring that the plasticity assumed on the
basis of the psychophysical results (e.g., Poggio et al., 1992) has a neuronal counterpart in the visual
cortex (Gilbert & Wiesel, 1992; Fahle
& Skrandies, 1994; Godde et al., 2002) and auditory cortex (e.g., Recanzone, Schreiner, & Merzenich,
1993; Menning, Roberts, & Pantev, 2000; Tremblay et al., 2001), suggesting that early selection takes
place.
Suppression of neuronal activation not relevant to
solve the task is especially important if, for example, flanks are presented on
both sides of vernier targets (cf., Spang et al., 2000). Through learning, the influence of the
flanks can indeed be greatly reduced, a more difficult feat for neurons with
large receptive fields. Similar reasoning would apply for neurons specialized
for orientations different from the stimulus and for those representing an eye
not stimulated under training conditions, making the assumption of early
selection of signals through perceptual learning even more plausible. Moreover,
the psychophysical results of Watanabe et al. (2002) argue strongly in favor of an
involvement of early cortical levels in perceptual learning. These authors find
greater improvement through training in lower-level than in simultaneously
trained higher-level visual motion processing in a perceptual learning task.
Local motion is processed at a very low level of motion processing, whereas
global motion is processed at a higher-level stage by spatiotemporal
integration. Hence, the learning must take place on the lower processing
level.
The hypothesis of early modification and selection of
visual input under top-down control seems to be best suited to explain the
psychophysical and electrophysiological findings on perceptual learning. The
psychophysical indicators for plasticity in adult primary visual cortex agree
well with the results of electrophysiological experiments in humans and animals.
Both types of experiments provide the insight that indeed we have to accept the
notion of plasticity in adult primary sensory cortices, because both the sum
potentials over the occipital pole of human observers and receptive field
properties of single neurons in primary visual cortex change as a result of
training.
Visual perceptual learning leads to sometimes dramatic
and relatively fast improvements of performance in perceptual tasks, such as
hyperacuity discriminations. The improvement often is very specific for the
exact task trained, the precise stimulus orientation, the stimulus position in
the visual field, and the eye used during training. This specificity indicates
location of the underlying changes in the nervous system at least partly on the
level of the primary visual cortex. The dependence of learning on error feedback
and on attention, on the other hand, proves the importance of top-down
influences from higher cortical centers. In summary, perceptual learning seems
to rely on changes on a relatively early level of cortical information
processing, such as the primary visual cortex, under top-down
selection and shaping influences. According to this view, the primary visual
cortex is not a hard-wired filtering device, but modifies its input signals in a
partly task-dependent way under top-down control. By learning, new types of
processing are implemented on this early level. This conclusion is incompatible
with older views of primary sensory cortices that assume a lack of plasticity in
adults, and also with strictly feedforward signal processing
in the cortex. Instead, it favors a model of information processing in a complex
system with strong feedback from higher to lower levels of
processing.
Supported by the German Research Council Center of
Excellence (SFB 517). The author wishes to thank Michael Morgan and John Mollon
for constructive criticism.
Commercial relationships: none.
Corresponding author: Manfred Fahle.
Email: mfahle@uni-bremen.de.
Address: Institute of Brain Research, Human
Neurobiology, University of Bremen,
Germany.
Ahissar, M., & Hochstein, S. (1997). Task difficulty and the specificity of perceptual learning. Nature, 387, 401-406.
Andrews, D. P. (1964). Error-correcting perceptual mechanisms. Quarterly Journal of Experimental Psychology, 16, 104-115.
Barlow, H. B. (1981). The Ferrier Lecture, 1980. Critical limiting factors in the design of the eye and visual cortex. Proceedings of the Royal Society of London B, 212, 1-34.
Beard, B. L., Klein, S. A., Ahumada, A. J., Jr., & Slotnick, S. D. (1996). Training on a vernier acuity task does transfer to untrained retinal locations [Abstract]. Investigative Ophthalmology & Visual Science, 37(Suppl.), S696.
Broadbent, D. E. (1958). Perception and communication. Oxford, UK: Pergamon.
Crick, F. H. C., Marr, D. C., & Poggio, T. (1981). The organization of the cerebral cortex. In F. Schmitt (Ed.), An information-processing approach to understanding the visual cortex (pp. 505-533). Cambridge: MIT Press.
Deutsch, J. A., & Deutsch, D. (1963). Attention: Some theoretical considerations. Psychological Review, 70, 80-90.
Dill, M., & Fahle, M. (1997). The role of visual field position in pattern-discrimination learning. Proceedings of the Royal Society of London B, 264, 1031-1036.
Eysel, U. T., Eyding, D., & Schweigart, G. (1998). Repetitive optical stimulation elicits fast receptive field changes in mature visual cortex. NeuroReport, 9, 949-954.
Fahle, M. (1991). A new elementary feature of vision. Investigative Ophthalmology & Visual Science, 32, 2151-2155.
Fahle, M. (1997). Specificity of learning curvature, orientation, and vernier discriminations. Vision Research, 37, 1885-1895.
Fahle, M. (1998). Perceptual learning and orientation specificity [Abstract]. Perception, 27(Suppl.), 155.
Fahle, M., & Daum, I. (2002). Perceptual learning in amnesia. Neuropsychologia, 40, 1167-1172.
Fahle, M., & Edelman, S. (1993). Long-term learning in vernier acuity: Effects of stimulus orientation, range and of feedback. Vision Research, 33, 397-412.
Fahle, M., & Morgan, M. (1996). No transfer of perceptual learning between similar stimuli in the same retinal position. Current Biology, 6, 292-297.
Fahle, M., & Poggio, T. (Eds.) (2002). Perceptual learning. Cambridge: MIT Press.
Fahle, M., & Skrandies, W. (1994). An electrophysiological correlate of learning in motion perception. German Journal of Ophthalmology, 3, 427-432.
Fahle, M., Edelman, S., & Poggio, T. (1995). Fast perceptual learning in hyperacuity. Vision Research, 35, 3003-3013.
Findlay, J. M. (1973). Feature detectors and vernier acuity. Nature, 241, 135-137.
Gilbert, C. D., & Wiesel, T. N. (1992). Receptive field dynamics in adult primary visual cortex. Nature, 356, 150-152.
Godde, B., Leonhardt, R., Cords, S. M., & Dinse, H. R. (2002). Plasticity of orientation preference maps in the visual cortex of adult cats. Proceedings of the National Academy of Sciences USA, 99, 6352-6357.
Harris, J. P., & Fahle, M. (1995). The detection and discrimination of spatial offsets. Vision Research, 35, 51-58.
Helmholtz, H. v. (1867). Handbuch der physiologischen Optik. Leipzig: Voss.
Herzog, M. H., & Fahle, M. (1994). Learning without attention? In N. Elsner & H. Breer (Eds.), Proceedings of the 22nd Göttingen Neurobiology Conference (p. 817). Stuttgart: Thieme.
Herzog, M. H., & Fahle, M. (1997). The role of feedback in learning a vernier discrimination task. Vision Research, 37, 2133-2141.
Herzog, M. H., & Fahle, M. (1998). Modeling perceptual learning: Difficulties and how they can be overcome. Biological Cybernetics, 78, 107-117.
Johnston, W. A., & Heinz, S. P. (1979). Depth of nontarget processing in an attention task. Journal of Experimental Psychology: Human Perception & Performance, 5, 168-175.
Karni, A. (1996). The acquisition of perceptual and motor skills: A memory system in the adult human cortex. Cognitive Brain Research, 5, 39-48.
Karni, A., & Sagi, D. (1991). Where practice makes perfect in texture discrimination: Evidence for primary visual cortex plasticity. Proceedings of the National Academy of Sciences USA, 88, 4966-4970.
Kramer, D., & Fahle, M. (1996). A simple mechanism for detecting low curvatures. Vision Research, 36, 1411-1419.
Marr, D. (1982). Vision. New York: Freeman.
McKee, S. P., & Westheimer, G. (1978). Improvement in vernier acuity with practice. Perception & Psychophysics, 24, 258-262.
Menning, H., Roberts, L. E., & Pantev, C. (2000). Plastic changes in the auditory cortex induced by intensive frequency discrimination training. NeuroReport, 11, 817-822.
Mollon, J. D., & Danilova, M. V. (1996). Three remarks on perceptual learning. Spatial Vision, 10, 51-58.
Morgan, M. J. (1992). Hyperacuity of those in the know. Current Biology, 2, 481-482.
Morgan, M. J. (1986). The detection of spatial discontinuities: Interactions between contrast and spatial contiguity. Spatial Vision, 1, 291-303.
Movshon, J. A., & Blakemore, C. (1973). Orientation specificity and spatial selectivity in human vision. Perception, 2, 53-60.
Mussap, A. J., & Levi, D. M. (1996). Spatial properties of filters underlying vernier acuity revealed by masking: Evidence for collator mechanisms. Vision Research, 36, 2459-2473.
Pelli, D. G. (1985). Uncertainty explains many aspects of visual contrast detection and discrimination. Journal of the Optical Society of America A, 2, 1508-1532.
Poggio, T., Fahle, M., & Edelman, S. (1992). Fast perceptual learning in visual hyperacuity. Science, 256, 1018-1021.
Recanzone, G. H., Schreiner, C. E., & Merzenich, M. M. (1993). Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys. Journal of Neuroscience, 13, 87-103.
Spang, K. M., Herzog, M. H., Holland-Moritz, A., Stein, M., & Fahle, M. (2000). Suppression learning of masking elements is not orientation specific [Abstract]. Investigative Ophthalmology & Visual Science, 41, 48.
Taylor, M. M., & Creelman, C. D. (1967). PEST: Efficient estimates on probability functions. The Journal of the Acoustical Society of America, 41, 782-787.
Tremblay, K., Kraus, N., McGee, T., Ponton, C., & Otis, B. (2001). Central auditory plasticity: Changes in the N1-P2 complex after speech-sound training. Ear & Hearing, 22, 79-90.
Watanabe, T., Náñez, J. E., Sr., & Sasaki, Y. (2001). Perceptual learning without perception. Nature, 413, 844-848.
Watanabe, T., Náñez, J. E., Sr., Koyama, S., Mukai, I., Liederman, J., & Sasaki, Y. (2002). Greater plasticity in lower-level than higher-level visual motion processing in a passive perceptual learning task. Nature Neuroscience, 5, 1003-1009.
Watt, R. J. (1984). Towards a general theory of the visual acuities for shape and spatial arrangement. Vision Research, 24, 1377-1386.
Westheimer, G. (1976). Diffraction theory and visual hyperacuity. American Journal of Optometry & Physiological Optics, 53, 362-364.
Wilson, H. R. (1986). Responses of spatial mechanisms can explain hyperacuity. Vision Research, 26, 453-469.
Wülfing, E. A. (1892). Ueber den kleinsten Gesichtswinkel. Zeitschrift für Biologie Bd., 29, 199-202.