Connectionist Modeling
Connectionism : The most modern theoretical and
computational approach to understanding human cognition.
Also commonly referred to as :
Neural Net Models : stress the link between neurology
and cognitive psychology.
Parallel Distributed Processing Models
(PDP models) :
These types of models rely extensively on computer
simulation of mathematical equations which represent the
human thought process.
Basic Connectionist Model
Basic assumptions of Connectionism
1. Cognition takes place at multiple levels. Each level
generates its own information, based upon the activation
levels of individual units.
2. All Levels function simultaneously
(Processing in Parallel).
3. Connections between levels have positive (+) and negative (-)
weights. These weights help guide the activation of
higher level units.
4. Top-Down processing is essential for learning to occur
(back propagation), and to speed up basic cognitive tasks
(automaticity).
Three Different PDP Levels and Units involved in visual
word recognition :
THE
Input Levels : At the feature detection level, the most basic visual detector cells examine the visual pattern.
Analogous to the computational & Data demons in the Selfridge model.
Excitatory (+) and Inhibitory(-) Pathways connect these 9 basic visual detectors
to the units in the:
Hidden Level : Now, at the whole letter (or graphemic) level, letter detectors which have received the most
positive activation from the input units become highly active, and the most active connections trigger the answer at the:
Output Level : The output unit with the highest activation becomes the recognized pattern, in this case "THE"
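The step from one level to the next can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names, the three letter units, and every weight value are hypothetical and are not taken from the model in these notes. It shows feature units passing excitatory (+) and inhibitory (-) input to letter units, with the most active unit winning; the same step would repeat from letters to words.

```python
# Hypothetical connection weights from feature detectors to letter units.
# Positive values are excitatory, negative values are inhibitory.
WEIGHTS = {
    "T": {"vertical": 1.0, "horizontal_top": 1.0,
          "horizontal_mid": -1.0, "horizontal_bottom": -1.0},
    "E": {"vertical": 0.5, "horizontal_top": 0.5,
          "horizontal_mid": 0.5, "horizontal_bottom": 0.5},
    "L": {"vertical": 1.0, "horizontal_top": -1.0,
          "horizontal_mid": -1.0, "horizontal_bottom": 1.0},
}


def unit_activations(active_features):
    """Sum the weighted input each higher-level unit receives."""
    return {
        unit: sum(w for feature, w in weights.items() if feature in active_features)
        for unit, weights in WEIGHTS.items()
    }


def recognize(active_features):
    """Return the unit with the highest activation."""
    activations = unit_activations(active_features)
    return max(activations, key=activations.get)


print(recognize({"vertical", "horizontal_top"}))
```

Because the inhibitory weights subtract from a unit's total, a feature that does not belong to a letter actively counts against it rather than being ignored.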
Learning in a PDP network
Fluent reading is such an automatic process, it represents the domination of Top-Down (or conceptually driven) processing.
Learning takes place through :
Back Propagation : Once feedback is received (e.g. "You said TEE, proper answer was THE"), the connection weights between
levels are adjusted slightly, starting from the Output Level and working backwards.
Delta Rule : The mathematical formula for adjusting weights within a PDP system (Learning Rule)
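These notes do not give the formula itself. One common textbook form of the delta rule is: change in weight = learning rate x (target - output) x input. The sketch below applies that form to a single linear output unit; the learning rate, the inputs, and the target value are assumptions chosen for illustration.

```python
def delta_rule_update(weights, inputs, target, learning_rate=0.1):
    """Return new weights after one delta-rule step for one linear unit."""
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output  # the feedback signal
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]


# Repeated feedback nudges the weights a little each time.
weights = [0.0, 0.0]
for _ in range(50):
    weights = delta_rule_update(weights, inputs=[1.0, 1.0], target=1.0)

print(weights)
```

Each step changes the weights only slightly, which matches the description above of weights being "adjusted slightly" after feedback.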
Auditory Pattern Recognition
Auditory Speech recognition also shows the dominance of conceptually driven (Top-Down) processing :
Warren & Warren (1970) : Had subjects perform a shadowing task in which a critical phoneme was replaced by white noise.
Phoneme : smallest unit of sound that distinguishes one word's meaning from another.
As long as the remaining sentence context was coherent,
recall was almost perfect (phonemic restoration) :
It was found that the *eel was on the axle
It was found that the *eel was on the shoe
It was found that the *eel was on the orange
Selective Attention in Auditory Perception
Filter Theories : How do we pay attention to one message, and ignore irrelevant auditory signals ?
Shadowing Experiments:
By using dual messages, and having a subject shadow back just one, we can see attentional filtering at work.
Subjects Noticed :
When the unattended speaker switched from a man to a
woman.
Change from human speech to a tone.
Subjects did not typically notice :
Change of language in unattended ear.
Reversal of language in unattended ear.
How does selective attention work ?
The Earliest model of Broadbent postulated that all auditory
signals were processed to a certain degree, but that only one
signal received additional processing at any one time.
However, the cocktail party phenomenon demonstrated that
even unattended auditory information could suddenly enter
conscious awareness.
This led to a more complex view of attention ;
The two-stage auditory selection model developed by
Eysenck (1982)
2 Stage Auditory Information Selection
One, physical acoustical differences allow us to attend to
one signal or the other. This is called :
Stage 1 selection: people can select to listen based upon
loudness, pitch, location, and other sensory information
(Eysenck, 1982).
Two, if the physical characteristics are the same (same speaker,
same ear, two messages), people can attend according to
meaning.
Stage 2 selection : Message content (semantics)
determines where attention is located.
The cocktail party phenomenon is influenced by stage 2
selection.
Experiments in which subjects cross shadow the message
also show Stage 2 selection.
More on Attention
By making cognitive or physical tasks automatic, we reduce
the demand on attention.
What makes a process "automatic" ?
Posner & Snyder (1975) 3 criteria:
1. An automatic process occurs without intention.
2. An automatic process does not reveal itself to conscious
awareness.
3. A fully automatic process consumes little or no conscious
resources.
Contrast Automaticity with Conscious/Strategic
Processing ;
Where :
The process occurs with intention, the process is open to
conscious awareness, and this process consumes some of
our limited capacity attentional resources.
The route to automaticity is through practice and memory.
The repetition and overlearning inherent in acquiring
language skills help move language perception and
comprehension from being highly strategic to becoming
highly automatic.
Two types of Attention
Conscious Attention : Limited quantity attentional
resources we have to perform a strategic task.
Focal Attention (spotlight attention) : An extremely rapid,
nearly automatic attentional system.
Focal attention can refocus our attention when something in
the environment happens which might require our immediate
attention
Bright Lights, Loud Noises, our Name, are all stimuli which
can refocus our attention automatically.
Short Term Memory
Miller (1956) did experiments to determine the capacity of
short term memory.
Using simple letters and digits, Miller determined we can
hold 5 to 9 (7 ± 2) basic units of information.
To get around this severe processing limitation, we can
combine simple units into a single chunk of information.
Example: a phone number : 462-0584
Could be 7 different units or 2 chunks of information.
Recoding : grouping individual items together (chunking),
and then remembering only the newly formed groups.
Identifying whole words, instead of letters, is one example of
linguistic recoding .
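The phone number example can be written out as a short Python sketch. The helper name `chunk` and the 3 + 4 grouping are assumptions for illustration; the point is only that the same seven digits can be counted as seven units or as two chunks.

```python
def chunk(items, sizes):
    """Group a flat sequence into consecutive chunks of the given sizes."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(items[start:start + size])
        start += size
    return chunks


digits = "4620584"
as_units = list(digits)           # seven separate units
as_chunks = chunk(digits, [3, 4])  # two chunks

print(len(as_units), as_units)
print(len(as_chunks), as_chunks)
```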
Long Term Memory Involvement in Recoding
Long Term Memory assists in recoding by supplying a
scheme to organize the recoding :
Mnemonic Device : a rehearsal or remembering strategy
Very Simple : Rephrasing statements into our own words.
More Complex: Storing information in mental maps, the
"peg" method.
Decay of Short Term Memory
The Brown - Peterson Task (1958)
They explored how the passage of time and learning new
information affected items held in short term memory.
Subjects see two visual stimuli :
First, a three letter sequence XPG
Second, a three number sequence 698
Subjects have to then count backwards by three from the
presented digits (twice per sec)
At some point after they begin counting, subjects are
prompted to recall the initial three letter string "XPG"
Recall dropped to about 50 % with only 3 seconds of
backward counting !
The Brown Peterson Task :
Shows how quickly items decay from short term memory
when rehearsal is interfered with.
If we do not constantly "refresh" the contents of our short
term memory, they can be quickly overwritten by subsequent
processing.
Does the Brown Peterson task support simple decay or
retroactive interference ?
Brown & Peterson believed that because letters and digits
are categorically distinct, their results were evidence of
simple decay.
Waugh & Norman (1965) felt that retroactive interference
was what produced the memory decrement.
In their experiments, they presented 16 digits, either 1 digit
per sec, or 4 digits per sec. (auditory presentation)
Subjects were cued to recall 1 of the 16 digits
If simple decay occurred, as in the Brown Peterson task,
then memory should be much better in the 4 digits per sec
group than in the 1 digit per sec group.
However, recall rates were similar in the two groups.
Therefore, it seems that interference from subsequent items
was the more likely cause of forgetting from short term
memory, rather than just a passive decay.
The more cognitively demanding the interference task is, the
greater the disruption.
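The logic of the Waugh & Norman comparison can be laid out as a small Python sketch. It contains no recall data; it only computes the two quantities the competing accounts care about. The function names and the choice of 8 intervening digits are assumptions for illustration.

```python
def elapsed_seconds(intervening_items, items_per_second):
    """Time between hearing a digit and being probed for it."""
    return intervening_items / items_per_second


def compare_rates(intervening_items, rates=(1, 4)):
    """For each presentation rate, list what each account blames for forgetting."""
    return {
        rate: {
            # Interference account: forgetting depends on this count.
            "intervening_items": intervening_items,
            # Decay account: forgetting depends on this duration.
            "elapsed_seconds": elapsed_seconds(intervening_items, rate),
        }
        for rate in rates
    }


summary = compare_rates(intervening_items=8)
print(summary)
```

With the same number of intervening digits, the fast rate leaves far less time for decay. Similar recall at both rates is therefore what the interference account predicts, not the decay account.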
Interference in Short Term Memory
Proactive Interference (PI): When something learned first
interferes with your ability to learn something new.
Retroactive Interference (RI) : Newly learned information
interferes with your ability to recall older information.
Wickens (1963) : discovered that Brown-Peterson task
performance depended upon which trial the subject was on.
On the first trial, performance is very good, about 90%.
By the 4th trial, recall of the three letter sequence is about
40%.
If, however, the to-be-remembered category is shifted on the
fourth trial, to words instead of letters, performance returns
to 90%.
Wickens labeled this "Release from PI".
Short Term Memory and Recall
Different Recall Instructions :
Free Recall : recall the items in any order
Serial Recall: recall the items in order, from first to last.
Serial Recall is more difficult than free recall.
Free Recall encourages the recency effect, as the last few
items are discharged from memory as soon as the recall cue is
given.
Serial Position curves show the recall as a function of the
item number in the list.
Serial Position Curves are typically higher in the beginning
and the end. (Thus, the curve)
Serial Position Curves (continued)
At the beginning of the curve , we get a :
Primacy effect : Better recall for items at beginning of list.
Primacy effects are due to the increased rehearsal of the
first few items.
At the end of the curve, we have our :
Recency Effect : Better recall for items at the end of the list.
Recency Effects are primarily due to our long lasting echoic
memory, and can be erased by a spoken suffix (see the Modality Effect).
Rehearsal Within Short Term Memory
Rehearsal strategies are vital in both keeping information
active in short term memory and for transferring information
into Long Term memory.
Short Term Memory was once described as a ‘rehearsal
buffer’ and Rehearsal is a major function of our short term
memory system.
Rehearsal strategies are learned, not automatic:
Seventh Graders will spend more time rehearsing items than
Fifth Graders, and Fifth Graders rehearse items more than
Third Graders. (Kellas et al., 1975)
Maintenance Rehearsal : Simple repeating of items in
short term memory.
Elaborative Rehearsal : tying new information to an existing
knowledge structure.
How fast can we search Short Term Memory?
Sternberg (1966) developed "additive factors logic" to help
study separate cognitive processes.
Three Hypothesized Mental Processes
RT = A + B + C
A = Encoding, B = Search for match, C = Output
If I vary the number of times Process B has to be used, while
holding A and C constant, I should be able to identify how
long process B takes, by examining the difference in
Reaction Time (RT).
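The subtraction described above can be sketched in Python. The stage durations below are hypothetical values picked only to show the arithmetic; they are not experimental results.

```python
def stage_b_duration(rt_few, rt_many, uses_few, uses_many):
    """Estimate the time for one use of stage B from two reaction times (ms)."""
    return (rt_many - rt_few) / (uses_many - uses_few)


# Hypothetical stage durations in ms (assumed for illustration).
A, B, C = 250, 40, 150


def reaction_time(uses_of_b):
    """RT = A + B + C, with stage B repeated uses_of_b times."""
    return A + B * uses_of_b + C


estimate = stage_b_duration(reaction_time(1), reaction_time(4), 1, 4)
print(estimate)
```

Because A and C are the same in both conditions, they cancel in the subtraction, leaving only the time per use of B.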
Sternberg’s Memory Scanning Task
1.Subjects are shown a memory set.
The memory set can have from 1 to 7 letters.
2.After a short pause, subjects are presented with a probe
item.
The probe item may or may not have been in the
memory set.
3.The subjects must compare the probe item to the memory
set in short term memory to determine whether or not a
match exists, and reply "yes" or "no".
4.The amount of time it takes to answer, and the different
memory set sizes are then used to determine how fast the
comparison process (stage B) takes place.
If memory set size is 1, only one comparison between the
probe item and the memory set is required.
Sternberg’s Task (continued)
Results of Sternberg’s experiment help to answer two
important questions :
Is the memory search of short term memory done serially
(one comparison at a time) or in parallel (multiple
comparisons take place at one time) ?
If the search process is parallel, then adding additional items
to the memory set should have no effect on the RT to
answer.
Does the memory scanning process stop once we have
found a match (self-terminating search) ?
If the search is self-terminating,
then "yes" responses should generally be faster than "no"
responses.
The pattern obtained indicates that we scan short term
memory in a serially exhaustive manner: RT = 400 ms + 38 ms
for each item in the memory set.
This memory scanning rate of 38ms per item is for
individuals with normal cognitive capabilities.
People with Parkinson’s disease or mental retardation have
slower memory scanning capabilities.
A Mnemonist, someone very skilled at using mnemonic
strategies, can speed up the memory scan a little bit.
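The predictions of the three candidate search types can be compared in a short Python sketch. The 400 ms base time and the 38 ms scanning rate come from the notes above. Two things are assumptions made only for illustration: that a self-terminating search finds its match, on average, halfway through the set, and that a parallel search costs one comparison time regardless of set size.

```python
BASE_MS = 400  # intercept from the notes (everything except the search)
SCAN_MS = 38   # comparison time per memory-set item from the notes


def serial_exhaustive_rt(set_size):
    """Every item is compared, whether or not the probe is present."""
    return BASE_MS + SCAN_MS * set_size


def serial_self_terminating_rt(set_size, probe_present):
    """Search stops at the match; assumed to be found halfway on average."""
    comparisons = (set_size + 1) / 2 if probe_present else set_size
    return BASE_MS + SCAN_MS * comparisons


def parallel_rt(set_size):
    """All comparisons happen at once, so set size does not matter."""
    return BASE_MS + SCAN_MS


for n in (1, 4, 7):
    print(n, serial_exhaustive_rt(n),
          serial_self_terminating_rt(n, probe_present=True),
          parallel_rt(n))
```

Only the serial exhaustive version predicts the same increase per item for "yes" and "no" responses, which is the pattern described above.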
How is information stored in Short Term Memory?
Several Different Code Types:
Verbal Codes : acoustic-articulatory codes tell us about
both the sound, and how to reproduce a meaningful sound.
Phonological recall errors demonstrate the availability of
these codes to short term memory.
Semantic Codes : meaning based coding.
Wickens’s release from PI experiments showed how
changing the category of to-be-remembered items
dramatically affected recall in Brown-Peterson type recall
tasks.
Visual Codes : recognizing visual objects is a fundamental
cognitive process.
Dual Task experiments in which both tasks are visually
based show substantial task performance decrements.
Other codes in Short Term Memory
Short Term Memory can also hold information concerning
physical movement.
When we conduct memory tests on people with ASL
knowledge (American Sign Language), many of their errors
are cherologically similar. (They recall a similar hand
movement)
Olfactory and Gustatory codes are also available to short
term memory, but very little research is conducted with these
senses.
These multiple codes indicate that short term memory is
much more complex than originally thought. Part of this
bias was due to the fact that the vast majority of studies
concentrated on auditory information, thus the early
emphasis on the acoustic-articulatory coding in short term
memory.
Visual Sensation
How does the eye translate light into information the brain
can use ?
The eye is actually organized backwards
When light enters the eyeball, it is focused and inverted and
projected onto the retina.
Most light is never processed by the eye.
The surface of the retina has three basic neuronal levels :
Rods and Cones : rods are responsible for black and white
vision, and cones are responsible for color vision.
Bipolar Cells : Collect information from Rods and Cones
and pass this information to the ganglion cells.
Ganglion Cells : Pass information to the optic nerve,
which carries information to the visual cortex, for further
processing.
The process of vision
Visual Sensation : The reception of stimulation from the
environment and the initial encoding of that stimulation into
the nervous system.
Rods: About 120 million rods are located in the eyeball.
Responsible for black and white vision and largely
responsible for night vision. Multiple rods link to a single
bipolar cell in a process known as compression.
Compression : The analysis and summarizing of visual
sensation.
Cones : Much less numerous, only about 7 million cones
per eye. Cones give us our most accurate and precise vision
because there is one to one mapping of cones with bipolar
cells. (No compression of cone data takes place)
Ganglion Cells : About 1 million ganglion cells are
responsible for summarizing visual sensation from the
bipolar cells and passing this information to the optic nerve.
Once the information arrives in the visual cortex, we can
begin visual processing and decide what objects exist in the
environment.
Visual Perception : the process of interpreting and
understanding sensory information.
Visual perception is not a continuous process. It is more
analogous to film, where we have consecutive visual
"frames".
Each movement of the eye is called a saccade.
Each time the eye pauses to take in visual sensations, this is
called a fixation
Saccades and Fixations
Our visual system takes in input during fixations, but
suppresses input during saccades.
Any time you are voluntarily moving your eye, no visual
sensations are being processed. It’s almost as if you are
blind during the 50-100 ms it takes for a single saccade.
While the eye movement itself is fairly fast, it takes about
200 ms to initiate each saccade.
During this approx. 200 ms fixation is when visual sensation
information is processed.
This gives us about 4 visual cycles per second.
We perceive a continuous visual environment, however, not
discrete frames.
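The "about 4 visual cycles per second" figure follows from the timings above: one cycle is a fixation of roughly 200 ms plus a saccade of 50 to 100 ms. A minimal Python sketch of that arithmetic (the function name is an assumption):

```python
def visual_cycles_per_second(fixation_ms=200, saccade_ms=50):
    """Number of fixation-plus-saccade cycles that fit into one second."""
    return 1000 / (fixation_ms + saccade_ms)


print(visual_cycles_per_second(saccade_ms=50))
print(visual_cycles_per_second(saccade_ms=100))
```

With the shortest saccade the result is exactly 4 cycles per second; with the longest it falls to a little over 3.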
Visual Sensory Memory
Unusual circumstances can give us insight into the visual
sensation process.
Lightning Bolts : Even though we perceive a continuous
lightning flash, there are actually three or four separate
lightning bolts which last about 1 ms each and are separated
by about 50 ms .
Visual Persistence : The apparent persistence of a visual
stimulus beyond its physical duration.
Iconic Memory : the temporary visual buffer that holds
visual information for short periods of time.
Capacity and Duration of Iconic Memory
Sperling (1960) Used a tachistoscope to explore the capacity
and duration of iconic memory.
Tachistoscope : Essentially a slide projector which can
control the time of exposure, and what is seen before and
after the experimental stimulus (preexposure and
postexposure field control)
T-scopes control both exposure time and where the visual
pattern is placed on the eye.
Using a t-scope, Sperling discovered that Iconic Memory
holds far more than subjects are able to report.
Sperling’s Paradigm
A U K Q
N W P D
E J L R
Whole Report : After a brief look at screen, participant must
recall all letters on the chart.
Typical whole report is 4 ½ letters
Partial Report : After a brief visual exposure, participants
receive a tone cue which tells them which row to report.
Sperling’s Experiment (1960)
After seeing the visual stimulus for 50 ms, subjects had to
report everything they saw (whole report condition).
As long as there were fewer than five letters, recall was
essentially perfect.
For twelve item displays, they still only recalled about 4.5
items, or about 37% recall.
This recall rate remained the same even when the duration of the
stimulus was increased to 500 ms.
Span of Apprehension : the number of individual items
recallable after any short display (span of attention or span
of immediate memory)
The Partial Report Discovery
Sperling thought that maybe all the visual information was
initially present, but just faded too fast for all the information
to be extracted from the visual scene.
Now, subjects were instructed to report only the row or
column of letters identified by an audio tone.
Now, recall went back up to 76% when the tone immediately
followed the scene.
With a one second delay, performance went down to 36%
(comparable to whole report)
Since subjects could not predict beforehand which items
they would have to report, Sperling concluded that Iconic
Memory encoded the entire visual scene, and the recall
limitation was due to the span of apprehension.
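The reasoning behind the partial report estimate can be sketched as arithmetic in Python. Because the cued row is unpredictable, the proportion correct on that row is taken as the proportion available for the whole display. The percentages are the ones quoted in these notes; the function names are assumptions.

```python
DISPLAY_SIZE = 12  # letters in the 3 x 4 display


def whole_report_proportion(items_recalled, display_size=DISPLAY_SIZE):
    """Proportion of the display recalled when everything must be reported."""
    return items_recalled / display_size


def estimated_items_available(cued_proportion_correct, display_size=DISPLAY_SIZE):
    """Scale performance on the cued row up to the whole display."""
    return cued_proportion_correct * display_size


print(whole_report_proportion(4.5))     # whole report
print(estimated_items_available(0.76))  # tone immediately after display
print(estimated_items_available(0.36))  # tone after a one second delay
```

On this estimate roughly 9 of the 12 letters are available immediately after the display, falling to about 4 after a one second delay, which is close to the whole report figure.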
Sperling (1960) also manipulated the preexposure and
postexposure visual fields.
When bright light was used before and after the letter
display, performance dropped.
The bright light following the letter display "overwrote" the
icon of the letter display, eliminating the extended visual
persistence of the icon that is present when the postexposure
field is a dark, blank screen.
Masking Patterns : "Nonsense" visual (or auditory) patterns
that cognitive psychologists use to control for the effect of
iconic memory in visual experiments.
An Icon lasts up to ¼ of a second.
An Icon is worth about 100 ms of additional exposure time.
Erasure of Visual Information from Iconic Memory
The mere passage of time degrades the quality of the last
icon taken.
Information on subsequent Icons which falls on the same
geographic region of the eye can wipe out memory for
information at that previous spot, a phenomenon known as
Backward Masking.
Averbach & Coriell (1961): Replicated Sperling’s Experiment,
but used a visual cue to trigger partial report :
50 ms exp.
E H C S L Z I P
R W Z Q G F K A
A blank postexposure field (varying in duration) followed by
O (or) I
Iconic Memory
Seen as the initial step in visual processing.
Under normal viewing conditions, subsequent Icons wipe out
previous Icons, but due to the suppression of information
during saccades, we perceive a continuous visual
environment.
The duration of the Icon is about ¼ sec., because we need it
for the entire saccade.
However, the functional utility of the Icon is about 100 ms
worth of additional processing time, because it takes time to
access the Icon and identify specific features.
Neisser (1967) Focal Attention : the mental process of visual
attention.
Visual attention is the bridge which creates the continuity
from discrete Icon to discrete Icon.
Auditory Sensation and Perception
How basic audition works :
1.Sound waves are funneled into the ear.
2.These sound waves cause the eardrum, or tympanic
membrane, to vibrate.
3.These vibrations cause the small bones in the middle ear to
move, which displaces the fluid in the inner ear.
4.The fluid moves the tiny hairs along the basilar
membrane, which recognize different frequencies.
5.This information is now sent from the basilar membrane
along the auditory nerve to the auditory cortex of the
brain.
6.The auditory cortex is responsible for translating auditory
sensation into meaningful auditory perceptions.
Human Hearing
We hear a frequency range from 20 – 20,000 Hz (or cycles
per second)
Dogs, for example, can hear much higher frequencies than
humans; this is why dog whistles seem almost silent.
We are particularly sensitive to the frequency range from
3,000 to 8,000 Hz.
Most human speech sounds fall into this range.
Auditory Sensory Memory
Echoic Memory : the brief memory system that receives
auditory stimuli and preserves them for some amount of
time.
Because auditory information is spread out over time, the
echoic memory trace must be of longer duration than iconic
memory, so that we can extract meaningful information from
auditory stimuli.
Darwin, Turvey, & Crowder (1972) devised the "Three-eared
Man" experiment to examine the time course and capacity of
echoic memory.
Three simultaneous auditory signals were played to the
subject, who then had to recall as many of the nine stimuli
as possible.
Left Ear Only (B 7 L)
Right Ear Only (D 5 P)
Both Ears (sounds like coming from center of head) (Q 9 K)
Three Eared Man Experiment (continued)
In the whole report condition, where the subject had to recall
as many of the 9 distinct stimuli as possible, performance
hovered around 4 items (44%).
To create a partial report condition, the subjects were given a
visual cue following the auditory signal which would direct
them to report only items from Left, Right, or ‘Center’ audio
track.
When the visual cue is given immediately after the auditory
signal, proper recall rose above 50%.
The partial report advantage was still present with a 4
second delay between the auditory signal and the visual cue.
This led Darwin et al. to conclude that echoic memory
could last at least 4 seconds. The simpler the stimuli, the
longer the echoic trace lasts.
Persistence and Erasure of Auditory Information
If we do not attend to the auditory trace, it will gradually fade
from memory.
Crowder & Morton (1969, 70, 72)
Presented 9 digits at a rate of 2 per second
Silent Vocalization : Subjects are required to read digits
silently as they appear on the screen.
Passive Vocalization : An audio tape plays the digits aloud
as they appear on the monitor.
Active Vocalization : Subjects must say the digits aloud as
they appear on the screen.
Recall of ninth item was almost perfect in the P.V. and A.V.
conditions. (60% in S.V.)
Crowder & Morton called this the Modality Effect
Modality Effect : Superior Recall of the end of a list when
the auditory mode is used instead of the visual mode of
presentation
Crowder claimed his experimental results supported two
conclusions :
1.The existence of auditory traces in
sensory memory, which Crowder named Precategorical
Acoustical Storage (PAS)
2.The persistence of auditory traces across a short interval
of time.
Eliminating the Modality Effect
Crowder presented an auditory list of nine digits, followed by
a suffix:
No suffix : Subjects were given a visual cue which began their recall.
Zero Suffix : Subjects heard the word "zero" which signaled the recall. Performance on 9th item dropped to 60 %
PAS gets more complicated, because a
White Noise Suffix : Did not eliminate the modality effect.
This shows that the type of suffix is as important as the
existence of the suffix. Why ?
Greene & Crowder suggest that PAS also holds the
articulatory codes (or gestural codes) necessary for later
reproducing these digits, and the suffix must wipe out the
last articulatory code to eliminate the modality effect.
Pattern Recognition in Visual Perception
How do we recognize written language ?
Template Theory : we have a stored model of all
categorizable patterns …. Kind of an extremely large photo
album.
Scanners recognize Bar Codes through template matching.
However, we can recognize a wide variety of typefaces and
handwriting styles.
Template Theory seems to violate the principle of Cognitive
Economy, because the number of different templates which
would have to be stored in memory would be enormous.
Your entire life would be spent learning all these different
templates.
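The weakness described above shows up quickly in a toy Python sketch of template matching. The 3 x 3 grids and the two stored templates are hypothetical; the sketch only illustrates that an exact-match scheme needs a separate template for every variation of a pattern.

```python
# Hypothetical stored templates: '#' is ink, '.' is blank.
TEMPLATES = {
    "T": ("###",
          ".#.",
          ".#."),
    "L": ("#..",
          "#..",
          "###"),
}


def match_template(pattern):
    """Return the label whose stored template equals the pattern exactly."""
    for label, template in TEMPLATES.items():
        if template == pattern:
            return label
    return None  # no stored template fits


clean_t = ("###", ".#.", ".#.")
sloppy_t = ("##.", ".#.", ".#.")  # same letter, slightly shorter top bar

print(match_template(clean_t))
print(match_template(sloppy_t))
```

The slightly different "T" is not recognized at all, so handling it would require storing yet another template.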
Visual Feature Detection
Feature Analysis (Feature Detection) : The visual cortex
"breaks down" each visual pattern into its basic components
in order to recognize the overall pattern.
Diagonals, Horizontal Lines, Vertical Lines, Open Faced
Circles, Filled Circles, are all individual features which are
combined to form meaningful stimuli, i.e. letters.
The Pandemonium Model of Selfridge (1959)
According to Selfridge, feature detection is a chaotic process
where all of the activated features clamor for attention as
they attempt to identify the visual pattern.
Selfridge included four levels of ‘demons’ which coordinate
visual language recognition :
Data Demons : responsible for encoding the visual stimuli.
Analogous to Iconic memory.
Computational Demons : This level identifies the simple
features contained in the Icon.
Cognitive Demons : Try to match the whole letter pattern
based upon the excitation of the computational demons.
Decision Demon : Has to decide which of the cognitive
demons is telling the truth, and identifies the pattern
completely.
All of these demons work at the same time, an idea called
Parallel Processing.
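The four levels of demons can be sketched as four small Python functions. This is a loose illustration, not Selfridge's implementation: the feature inventory, the three letters, and the scoring rule for how loudly a cognitive demon "shouts" are all assumptions. The functions also run one after another here, whereas the model assumes the demons work in parallel.

```python
# Hypothetical features each cognitive (letter) demon listens for.
LETTER_FEATURES = {
    "A": {"diagonal_left", "diagonal_right", "horizontal"},
    "H": {"vertical_left", "vertical_right", "horizontal"},
    "T": {"vertical_center", "horizontal_top"},
}


def data_demon(stimulus):
    """Encode the raw stimulus (here, simply a frozen copy of its strokes)."""
    return frozenset(stimulus)


def computational_demons(icon):
    """Each feature demon reports whether its feature is in the icon."""
    all_features = set().union(*LETTER_FEATURES.values())
    return {feature: feature in icon for feature in all_features}


def cognitive_demons(feature_reports):
    """Each letter demon shouts louder the better the features fit it."""
    present = {f for f, is_on in feature_reports.items() if is_on}
    return {
        letter: len(expected & present) - len(present - expected)
        for letter, expected in LETTER_FEATURES.items()
    }


def decision_demon(shouts):
    """Pick the cognitive demon that is shouting the loudest."""
    return max(shouts, key=shouts.get)


def recognize(stimulus):
    icon = data_demon(stimulus)
    return decision_demon(cognitive_demons(computational_demons(icon)))


print(recognize({"vertical_left", "vertical_right", "horizontal"}))
```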
Supporting Physiological Evidence for Feature
Detection :
The visual cortex has highly specialized cells for pattern
recognition :
Simple Cells : Identify visual activity in specific geographic
regions.
Complex Cells : Fire only when they detect a specific
feature : Vertical Line, Horizontal, Diagonals. Feature
Specific Cells
Hypercomplex Cells : Are both feature specific and
location specific.
These visual cortex cell types were identified by Hubel and
Wiesel (1962), who were implanting microelectrodes in cats’
brains.
The problem with feature analysis
Feature Detection models cannot always adequately
account for the influence of Top-Down (or Conceptually
Driven) processing.
Feature Detection models, by definition, are Bottom-Up
(Data Driven) processing models.
However, experiments have shown that we can "turn off" or
pay selective attention to certain feature detectors, as the
task demands.
Our cognitive processes can control our basic pattern
recognition system.