Part B - Cognitive Aspects

Sensation, Perception, and Attention

Describe the Visual, Auditory, and Haptics Sensory Systems
Describe the Neuron with its Transmission Mechanisms

"Nothing is in the understanding, which was not first perceived by some of the senses" John Locke

Vision | Hearing | Touch | Motion | Neuroscience | Attention | Potentiation | References

We receive information from the physical world through our senses of sight, hearing, touch, smell, and taste.  Most of the information that we receive when interacting with computers comes through sight, hearing, and touch.  These sensations taken together support our sense of motion.  We call the initial detection of energy from the physical world sensation.  We collect this information in our sensory memories and call the process of interpreting it perception.

The scientific basis for models of memory and mental processing exists in cognitive neuroscience.  Our brains and our central nervous systems provide the anatomical support for perception, attention, and potentiation.  It is through attention that we select the information to bring to consciousness so that we don't overload our working memories.  It is through potentiation that we form memories.

In this chapter, we describe elements of the visual, auditory, and haptic systems from both sensory and perceptual perspectives.  We describe the central nervous system and comment on possible explanations for attention and long-term memory.


Vision is the primary source of information in HCI.  Light that has reflected off objects in the distance enters our eyes.  Our eyes convert this light into electrical signals.  Cells within the eye, called ganglion cells, transmit these signals to the brain.  The visual cortex in the brain makes sense of the signals received.

The Visual Cortex (source: Washington Irving Wikipedia 2007 CC-BY-SA)

The Human Eye

The human eye is a sophisticated organ that connects directly to the brain through the optic nerve.  The main components of the eye are

  • cornea - covers the pupil and the iris
  • sclera - the white covering of the eye
  • iris - controls the amount of light entering the eye
  • pupil - the aperture through which light enters the eye
  • lens - focuses the light onto the retina
  • retina - the sensory part of the eye

The interior of the eye is filled with a viscous fluid called the vitreous humor.  The cornea and lens focus the light entering the eye so that it falls on the retina at the back of the eye.  The image on the retina is upside down relative to the scene outside the eye.  A small region near the center of the retina, called the fovea, plays an important role in interpreting the light.

The Human Eye (source: Wikipedia 2008 PD)

There is a blind spot on the retina where the optic nerve leaves the eye.  Our perceptual system compensates for this blind spot so that it has no noticeable effect.

The Retina

The retina is light sensitive.  Light striking it causes chemical and electrical reactions within.  The retina consists of a large number of photoreceptor cells that contain a protein called opsin.  There are two types of photoreceptor cells:

  • rod cells or rods
  • cone cells or cones

When an opsin absorbs a photon of light, it triggers a signal in its photoreceptor cell.  We have two types of opsins:

  • rhodopsins found in rod cells
  • iodopsins found in cone cells

The Human Retina (source: Wikipedia 2009 PD)

There are 90-120 million rod cells or rods dispersed throughout the retina.  These rods are highly sensitive to light and are concentrated towards the edges of the retina so that they capture peripheral images.  They are insensitive to fine detail and saturate easily.  They are responsible for night vision.  Saturation explains the temporary blindness that we experience when moving from a dark room into bright sunlight.

There are about 6 million cone cells or cones within the retina, most of which are in the fovea.  The cones are about 100 times less sensitive to light than the rods.  There are three types of cones: L, M and S.  Each type is sensitive to a certain range of wavelengths of light (S - blue, M - green, and L - red).  The cones are also sensitive to light intensity.  Because each cone responds to both wavelength and intensity, only the combined response of all three types can identify the color of the impinging light.  In other words, each cone on its own is colour blind.

Ganglion Cells (source: Wikipedia 2007 PD)

The rods and cones lie in the outermost layer of the retina, furthest from the incoming light.  They connect through intermediate (bipolar) cells to ganglion cells, which lie in the innermost layer of the retina and connect to the brain.  The ganglion cells are the output neurons of the retina and we can divide them into

  • X ganglion cells - numerous with narrow receptive fields
  • Y ganglion cells - sparse with wide receptive fields

X-cells are mostly in the fovea and detect patterns.  Y-cells are widely distributed throughout the retina and are responsible for early detection of movement.


Our eyes can distinguish size, depth, brightness and color. 

Size and Depth

All images that enter the eye project upside down on the retina.  Our eyes measure size and depth in terms of visual angle; that is, the angle that an object subtends at the eye, which determines the fraction of the retina covered by the projected image.  Visual angle is measured in degrees, minutes, and seconds of arc, where 1 degree is 60 minutes and 1 minute is 60 seconds.
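
As a rough illustration of how visual angle relates an object's size to its distance, the sketch below uses the standard geometric relation angle = 2 atan(size / (2 distance)) and splits the result into degrees, minutes, and seconds.  The function names and the sample object are assumptions made for illustration only.

```python
import math


def visual_angle_degrees(size, distance):
    """Angle (in degrees) subtended at the eye by an object.

    size and distance must use the same unit (e.g. both in cm).
    """
    return math.degrees(2 * math.atan(size / (2 * distance)))


def to_degrees_minutes_seconds(angle_degrees):
    """Split a decimal angle into whole degrees, whole minutes, and seconds."""
    degrees = int(angle_degrees)
    minutes_full = (angle_degrees - degrees) * 60   # 1 degree = 60 minutes
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60         # 1 minute = 60 seconds
    return degrees, minutes, seconds


# Hypothetical example: a 2 cm icon viewed from 50 cm and from 100 cm.
near = visual_angle_degrees(2, 50)
far = visual_angle_degrees(2, 100)
print(to_degrees_minutes_seconds(near))
print(to_degrees_minutes_seconds(far))
```

Doubling the viewing distance roughly halves the visual angle, which is why the same object covers less of the retina when it is further away.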

Visual Angle (source: Wikipedia 2008 CC-A Ojosepa)

Visual acuity is the ability to distinguish fine detail.  The normal human eye can identify a single line at 0.5 seconds of arc and a space between two lines at 30 seconds of arc. 

Our eyes identify depth through overlap and familiarity.  The law of size constancy states that we perceive an object as constant in size even as it moves further away and its visual angle shrinks.


Brightness is a subjective measure of the level of light.  The corresponding physical measure is luminance.  Luminance depends on the amount of light incident on an object and the reflective properties of the object.

Contrast compares the luminance of an object with the luminance of the background to the object. 
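
One common way to turn this comparison into a number is Weber contrast: the difference between object and background luminance, divided by the background luminance.  The text does not say which definition it intends, so treat the formula, the function name, and the sample luminance values below as assumptions.

```python
def weber_contrast(object_luminance, background_luminance):
    """Weber contrast: relative luminance difference between object and background.

    Both luminances must be in the same unit (e.g. cd/m^2) and the
    background luminance must be positive.
    """
    if background_luminance <= 0:
        raise ValueError("background luminance must be positive")
    return (object_luminance - background_luminance) / background_luminance


# Hypothetical luminances in cd/m^2.
dark_text_on_light = weber_contrast(object_luminance=10, background_luminance=200)
light_text_on_dark = weber_contrast(object_luminance=200, background_luminance=10)
print(dark_text_on_light)   # negative: object darker than background
print(light_text_on_dark)   # positive: object brighter than background
```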

Our visual system compensates for changes in brightness so that most scenes look alike.  As brightness decreases, the rods dominate and we lose color vision.

Visual acuity increases with luminance.  As luminance increases so does flicker.  Flicker can be avoided by refreshing the display at a rate greater than 50 Hz.  Hz stands for Hertz or cycles per second.  Flicker is more noticeable in peripheral vision.  This is why large displays appear to flicker more.


We measure colour in terms of hue, intensity, and saturation.  An average person can distinguish up to 150 hues.  Intensity is a measure of brightness.  Saturation is a measure of the amount of whiteness in the colour.  Combinations of hue, intensity and saturation can create in the order of 7 million different colours.  The untrained eye might only distinguish about 10 different colors.
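
The effect of saturation can be sketched with Python's standard `colorsys` module.  This assumes that the HSV model (hue, saturation, value) is an acceptable stand-in for the hue, saturation, and intensity described above; the two models are related but not identical.

```python
import colorsys

# Pure red: hue 0.0 at full value (brightness).
hue, value = 0.0, 1.0

for saturation in (1.0, 0.5, 0.0):
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
    # As saturation falls, the green and blue channels rise toward the red
    # channel, i.e. white is mixed in until the colour becomes pure white.
    print(saturation, (r, g, b))
```

The hue stays fixed throughout; only the amount of white mixed into the colour changes.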

There are relatively few S (blue) cones in the fovea (3%-4% of all of the cones in the fovea).

Color blindness affects about 8% of males and 1% of females.  Total color blindness (monochromacy) is much less common than partial color blindness.  We classify partial color blindness into:

  • red-green
    • dichromacy (protanopia (red absent) and deuteranopia (green absent))
    • anomalous trichromacy (protanomaly (red shifted) and deuteranomaly (green shifted))
  • blue-yellow
    • dichromacy (tritanopia (blue absent))
    • anomalous trichromacy (tritanomaly (blue shifted))

In dichromacy, one of the cone pigments is missing.  In anomalous trichromacy, the spectral sensitivity of one of the cone pigments is altered.

Classification of Color Deficiencies
Ishihara 9 (source: Wikipedia 2009 PD)

Color Scheme Designer (simulates schemes as seen by color-blind persons)

Optical Illusions

Our visual processing system compensates for movement, using expectations and context to resolve ambiguities.  Our systems interpret what we expect to see, not what is actually there.  This is the source of our optical illusions.


We recognize words at about the same rate as we recognize individual characters.  We recognize words in a process that consists of jerky movements (saccades) followed by fixations.  94% of our perception occurs during these fixations.  Our eyes move backwards over the text as well as forwards.  We call backward movement regression.  More complicated sentences involve more regressions. 

Fixation (source: Hans-Werner34 2008 CC-BY-SA)

The reading process itself consists of three distinct stages:

  1. perception of the visual pattern of a word
  2. decoding of the visual pattern using an internal representation of language
  3. syntactic and semantic analysis of the phrases of the sentence


On average, we

  • read about 250 words per minute
  • find fonts between 9 and 12 point equally legible
  • find lines between 2.3 and 5.2 inches (58 and 132 mm) easy to read

Point: there are 72 points in an inch (the modern definition of a point).  12 points constitute 1 pica.
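
The unit relationships and the reading rate quoted above can be checked with a few lines of arithmetic.  This is only a sketch: the function names and the 1000-word page are assumptions, and 25.4 mm per inch is the standard conversion.

```python
POINTS_PER_INCH = 72      # modern definition of a point
POINTS_PER_PICA = 12
MM_PER_INCH = 25.4


def points_to_mm(points):
    """Convert a length in points to millimetres."""
    return points / POINTS_PER_INCH * MM_PER_INCH


def inches_to_picas(inches):
    """Convert a length in inches to picas."""
    return inches * POINTS_PER_INCH / POINTS_PER_PICA


def reading_time_minutes(word_count, words_per_minute=250):
    """Estimated reading time at the average rate quoted above."""
    return word_count / words_per_minute


print(points_to_mm(9), points_to_mm(12))            # legible font sizes in mm
print(inches_to_picas(2.3), inches_to_picas(5.2))   # easy-to-read line lengths in picas
print(reading_time_minutes(1000))                   # hypothetical 1000-word page
```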

Evidence suggests that we read more slowly from computer displays because

  • the line length is too long or too short
  • the medium is unfamiliar
  • the contrast is insufficient

Dark characters on a light background provide higher luminance and higher acuity, but such images are more prone to flicker.

In practice, dark letters on a light background are preferred and increase reading accuracy.


Eye fatigue results from letters with too low a resolution:

  • professional printers use 1200 dpi since 600 dpi will cause fatigue
  • displays have about 100 dpi
  • scalable fonts are preferable
  • flicker increases fatigue

Here is a 4-minute video on Vision.

Here is a 1-minute video on Vision.


Our hearing includes the ability to distinguish types of sounds as well as the source of those sounds.  The human auditory system consists of the human ear, the auditory pathway, and the primary auditory cortex. 

Primary Auditory Cortex (source: Jimhutchins Wikipedia 2008 CC-BY-SA)

Human Ear

The human ear consists of three parts:

  1. outer ear
  2. middle ear
  3. inner ear

Human Ear (source: Chittka, Brockmann, Wikipedia 2009 CC-BY)

Outer Ear

The outer ear in turn consists of two parts:

  • pinna
  • auditory canal

The pinna protects the auditory canal and amplifies the sounds.

The auditory canal is covered in wax and provides further protection against dust, insects and other intrusions.

Auditory Canal (source: Iain Wikipedia 2003 CC-BY-SA)

Middle Ear

The middle ear consists of two parts:

  1. the tympanic membrane or ear drum
  2. the ossicles (malleus, incus, stapes)

Ossicles (source: Arcadian Wikipedia 2006 PD)

The ear drum vibrates in response to changes in air pressure and the three tiny bones of the middle ear transmit this vibration to the inner ear.  These bones, called ossicles, concentrate and amplify the vibrations of the tympanic membrane.  They are the smallest bones in our bodies.

Middle Ear (source: Pngbot, Wikipedia 2007 PD)

Inner Ear

The inner ear is a labyrinth, which consists of three parts:

  1. the semi-circular canals
  2. the vestibule
  3. the cochlea - the auditory component

The vestibular system - the canals and the vestibule - provides our sense of balance and communicates with the brain through the vestibular nerve.  This system works with our visual system to keep objects in focus. 

Inner Ear (source: Krimpet, Wikipedia 2007 PD)

The cochlea communicates with the brain through the auditory nerve, which is separate from the vestibular nerve.  The cochlea is filled with a fluid that moves in response to the movement of the ossicles. 

Cross-Section through Cochlea (source: Oarih, Wikipedia 2004 CC-BY-SA)

The core component of the cochlea is the Organ of Corti.  This is the sensory organ of hearing.  It contains about 15,000 to 20,000 nerve receptors.  Each receptor has its own hair cell with cilia.  The cilia detect vibrations within the organ and trigger the release of chemical transmitters that signal the auditory nerve.  By the time the sound waves reach the Organ of Corti, their pressure amplitude is 20 times that of the air impinging on the pinna.

Section through Organ of Corti (source: Madhero88, Wikipedia 2009 CC-BY-SA)

Here is a 1-minute video on The Sense of Hearing.


We describe auditory processing in terms of:

  • pitch or frequency
  • loudness
  • timbre - tone quality or tone color

The human ear can capture sounds in the range of 20 Hz to 15 kHz and can detect subtle changes in pitch.  The American Standards Association defines timbre as

"[...] that attribute of sensation in terms of which a listener can judge that two sounds having the same loudness and pitch are dissimilar".

Our auditory processing system filters out noise and can focus selectively on particular sounds.

In HCI, music and speech

  • enrich the user's experience
  • provide the user with more information
  • help users with poor vision


Our sense of touch allows us to distinguish hot from cold, grasp objects without crushing them, and respond to pain. 

Human Skin

Our sense of touch is not localized, but is distributed throughout our skin.  Some areas of the skin are highly sensitive while other areas are much less sensitive.  Sensitivity depends upon the density of receptors within the skin, which varies throughout the body.  Areas of the skin that have no hair are called glabrous skin.

Cross-section of Skin (source: Wikipedia 2008 CC-BY-SA)

The two-point threshold test measures the degree of sensitivity to touch of any part of the skin. 

Three types of receptors support our sense of touch:

  1. thermoreceptors - sensitive to heat and cold
  2. nociceptors - sensitive to intense pressure and pain
  3. mechanoreceptors - sensitive to pressure and motion - four types
    • Meissner's corpuscles - rapidly adapting type I - respond to light pressure and low frequency vibrations with a narrow receptive field
    • Pacinian corpuscles - rapidly adapting type II - respond to deep pressure and high frequency vibrations with a large receptive field
    • Merkel's discs - slowly adapting type I - respond to superficial mechanical deflection
    • Ruffini corpuscles - slowly adapting type II - respond to sustained mechanical deflection and sustained pressure

The rapidly adapting receptors respond to immediate pressure but do not react to prolonged pressure.  The slowly adapting receptors respond to continuously acting pressure.  Haptic memory may be related to the slowly adapting mechanoreceptors.

Here is a 2-minute video on Touch.


We sense motion through our visual, auditory, and haptic systems and in our joints. 

Visual, Auditory, and Haptic

The time we take to respond to a stimulus can be divided into reaction time and movement time.

Reaction Time

Reaction time is the time we take to detect a stimulus and begin to respond.  Typical reaction times differ by type of signal:

  • auditory signals - 150 ms
  • visual signals - 200 ms
  • pain signals - 700 ms

A combination of these signals yields a faster reaction than any single signal.

Movement Time

Consider the case in which a user receives some sort of signal and must respond by hitting a button.  The movement time depends upon how big the target is and how far the user has to move.  One common law for calculating the movement time is Fitts' law, in which movement time is given by the expression

 movement time = a + b log2(displacement/size + 1)

where a and b are empirically determined constants.

In other words, targets should be large while displacements should be small.  Controls should be placed close to one another to minimize movements.  Controls should be large enough so that they can be accurately hit with little effort.  Frequently used menu items should be closer to one another. 
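
The expression above translates directly into code.  In this sketch the constants a and b are made-up values chosen only to show the trend; real values have to be measured for a given device and user population.

```python
import math


def movement_time(displacement, size, a, b):
    """Fitts' law: a + b * log2(displacement / size + 1).

    displacement and size share one unit; a and b are empirically
    determined constants (the values used below are made up).
    """
    return a + b * math.log2(displacement / size + 1)


# Hypothetical constants, in seconds and seconds per bit.
A, B = 0.1, 0.15

small_far = movement_time(displacement=400, size=20, a=A, b=B)
large_near = movement_time(displacement=100, size=50, a=A, b=B)
print(round(small_far, 3), round(large_near, 3))
```

The large, nearby target gives the shorter predicted movement time, which is the basis of the placement guidelines above.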


We also have receptors within the joints between our bones.  These receptors support proprioception.  Proprioception is the unconscious perception, arising from within the body, of movement and of change in the spatial orientation of our body.

There are three types of proprioceptors:

  1. rapidly acting
  2. slowly acting
  3. positional


In 1909, Korbinian Brodmann divided the human cerebral cortex into 52 different areas, each with a distinct cellular structure.  These areas are known as Brodmann areas and are still considered to be quite accurate.  Santiago Ramon y Cajal (1852-1934) discovered that neurons are discrete cells and that they transmit electrical signals in a single direction only.  His discovery forms part of what is known as the neuron doctrine.  This doctrine is central to modern neuroscience.


A neuron is the fundamental cellular element of our nervous system.  This cell transmits information through electrochemical signaling.  Two types of neurons concern us here: sensory and motor.  Sensory neurons respond to external stimuli such as light, sound, and touch, which impinge upon our sensory organs.  These neurons transmit signals to our brains.  Motor neurons transmit signals from our brains to our muscles and glands.

Each neuron consists of

  • a soma (~20 micrometers in diameter) with a tree of dendrites, which receive signals from other neurons
  • a set of axon terminals, which contain synapses that transmit neurochemicals to neighbouring cells, which may be dendrite branches of a neighbouring neuron or the cells of the target muscle or gland
  • an axon (~1 micrometer in diameter) - a cable-like structure that can extend hundreds of times the diameter of the soma and transmits the signal from the soma to the axon terminals as an electrical impulse carried by the movement of ions

(source: Quasar Jarosz Wikipedia 2009 CC-BY-SA)

A signal passes from one neuron to its target through the release of neurochemical molecules from the synapses of the axon terminals.  The gap between an axon terminal and the neighbouring receptor is called the synaptic cleft.

(source: Wikipedia 2008 PD)

The biological changes that support memory processing are attributed to events and states at these neuronal synapses. 

Short-Term Memory

Short-term memory may be explained biologically by the "prolonged firing of neurons which depletes the Readily Releasable Pool (RRP) of neurotransmitter vesicles at presynaptic terminals" (Tarnow 2008).


Attention is the cognitive process by which we select the information that is of interest to us.  In William James' time, psychologists studied attention through introspection.  Once the dominant view shifted away from behaviorism during the cognitive revolution, attention became a legitimate object of scientific inquiry and researchers started to study the cocktail party problem experimentally: how we select the conversation we listen to and ignore the rest.

In the 1960s, Robert Wurtz tied attention to neural activity.  He showed that enhanced firing of neurons is directly correlated with an increase in attention.

Donald Broadbent (1926-1992) was a British psychologist who worked at the Applied Psychology Research Unit and developed theories of selective attention and short-term memory.  His Filter Model fits the multi-store view of memory later formalized by Atkinson and Shiffrin: a selective filter prevents the overloading of the limited capacity of short-term memory.  He asserted that the selective filter allows information to come into working memory from only one channel at a time.

Anne Treisman (1935-2018) was a British-born psychologist who worked at Princeton University's Department of Psychology and who identified some problems with Broadbent's early-selection theory.  This eventually led to the Deutsch-Norman late-selection model in 1968.  In that model, no signal is filtered out until the point of activating its stored representation in memory.  At that attentional bottleneck, only one signal can be selected.  In the late-selection theory, visual perception is automatic and does not depend on attention.

Some Videos

Here is a 1-minute video on Selective Attention.

Here is a 2-minute video on Paying Attention.

Try this 2-minute Awareness Test.

Long-Term Potentiation

Long-term potentiation is a long-lasting improvement in signal transmission between two neurons.  It is now considered the neuroscientific basis of learning, higher-level cognition, and long-term memory.

Santiago Ramon y Cajal was amongst the first to suggest that learning does not require the formation of new neurons.  He proposed that memories are formed by improving the effectiveness of communication between neighbouring neurons. 

Donald Hebb (1904-1985) was a Canadian psychologist who worked in the Department of Psychology at McGill University and developed Hebbian theory, often summarized as "Neurons that fire together wire together".  He stated it as follows:

"When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased"

This principle proved to be the basis of long-term potentiation.  Hebb called a combination of neurons that could be grouped together as one processing unit a cell assembly and asserted that the combination of connections among cell assemblies made up the ever-changing algorithm which determines the response to stimuli.
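
Hebb's postulate is often abstracted as a weight-update rule in which the strength of a connection grows in proportion to the joint activity of the two cells.  The sketch below shows that textbook abstraction, not a model taken from Hebb's own text; the learning rate, the activity values, and the function name are assumptions.

```python
def hebbian_update(weight, pre_activity, post_activity, learning_rate=0.1):
    """Return the connection strength after one Hebbian step.

    The weight grows only when the presynaptic cell (A) and the
    postsynaptic cell (B) are active at the same time.
    """
    return weight + learning_rate * pre_activity * post_activity


weight = 0.0
# Each pair is (activity of cell A, activity of cell B); 1 = firing, 0 = silent.
history = [(1, 1), (1, 0), (0, 1), (1, 1), (1, 1)]
for pre, post in history:
    weight = hebbian_update(weight, pre, post)

print(weight)  # strengthened only by the three steps where both cells fired
```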

In 1966, Terje Lomo observed that "a high-frequency stimulus could produce a long-lived enhancement in the postsynaptic cells' response to subsequent single-pulse stimuli".  This increased efficacy arising from repeated and persistent stimulation of post-synaptic cells by pre-synaptic ones is called long-term potentiation.

Here is a 1-minute video on LTP.

Procedural Learning

Damage to the basal ganglia and cerebellum has been shown to directly affect procedural learning.

(source: Wikipedia 2009 PD)