While it is easy to recognise the necessary role that vision plays in reading, it is important to know that many other senses and systems contribute to the acquisition, refinement and maintenance of reading skills, including hearing, touch and motor coordination.
Oral language is ubiquitous across cultures throughout human history. However, this is not the case for written language, which is, in an evolutionary sense, relatively new. As such, the human brain does not have a dedicated “reading system” and relies instead upon established neural networks and circuitry to build and refine the skills required for reading and writing (see (1) for a discussion of the neuronal recycling hypothesis for arithmetic and reading).
Sight, sound and touch
Here we focus on three senses that are critical to the acquisition of reading skills, namely sight, sound and the somatosensory system (touch).
To appreciate how these senses contribute to a coordinated reading process, let us first consider a child who is learning to read. In the early stages of reading instruction, highly familiar words are used (eg, dog, ball), and children are instructed to read aloud (ie, overtly) to allow the instructor (eg, a parent or teacher) to monitor the process of letter/word identification and decoding. Unlike oral language, reading does not develop spontaneously; children do not learn to read in the absence of explicit instruction.
Eventually, with repeated practice, the beginning reader associates the visual letter forms with their corresponding sounds, and with the somatosensory information required to physically articulate (produce) the sounds.
In other words, the initial stages of the reading acquisition process involve encoding and coupling the visual, auditory (phonological) and somatosensory (articulatory) representations of letters and words by repeatedly activating sensory and associative regions of the brain that are specialised to process and integrate visual, auditory and somatosensory information (2-4). For successful and fluent reading to emerge, the coupling of these sensory systems must be timely, consistent and robust (3, 5).
The print-to-speech model
The print-to-speech model (6), an adaptation of the well-known Directions into Velocity of Articulators (DIVA) model of speech production (7, 8), describes how beginning readers link these visual, auditory and articulatory representations, and use feedforward and feedback systems to generate speech motor commands that eventually become automatised and drive fluent reading.
In the print-to-speech model, the feedback system facilitates the identification and production of novel or unfamiliar words via their visual, phonological and articulatory representations, and generates motor representations (consisting of a motor plan and a motor command). Importantly, the motor representations generated by this multisensory coupling arise regardless of whether the word is read aloud (9). In other words, speech motor commands are activated even during silent reading.
Feedback system
For example, to generate motor representations to read the nonsense word wint, a reader must activate visual and auditory representations linking the letters w-i-n-t with their corresponding sounds, along with articulatory representations that provide information about how to produce these sounds.
After the word is read, auditory and somatosensory feedback is used to compare the actual production to the anticipated (target) sound, refining the motor representation for future attempts (see (7, 8) for an explanation of this process in speech production).
In the print-to-speech model, the motor representation is also refined for future productions based on external feedback from the instructor about how accurately the word was read. There is also evidence for refinement via internal (reader-initiated) feedback using strategies like mispronunciation correction (10).
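To make the logic of this feedback loop concrete, here is a minimal sketch in Python. It treats the target auditory representation and the motor representation as toy numeric vectors and refines the motor plan using the mismatch between the anticipated and actual output; all names, values and the update rule are illustrative assumptions, not the model's formal implementation.

```python
import random

def produce(motor_rep, noise=0.3):
    """Articulate the current motor plan; a beginner's output is noisy."""
    return [m + random.uniform(-noise, noise) for m in motor_rep]

def refine(motor_rep, target, actual, rate=0.5):
    """Feedback step: nudge the motor plan by the target-vs-actual mismatch."""
    return [m + rate * (t - a) for m, t, a in zip(motor_rep, target, actual)]

# Toy target auditory representation for the novel word "wint",
# one number per segment w-i-n-t (values are arbitrary).
target = [0.9, 0.2, 0.6, 0.4]
motor_rep = [0.0, 0.0, 0.0, 0.0]  # initial, unrefined motor plan

for attempt in range(1, 6):
    actual = produce(motor_rep)                    # overt reading attempt
    mismatch = sum(abs(t - a) for t, a in zip(target, actual))
    motor_rep = refine(motor_rep, target, actual)  # feedback-driven refinement
    print(f"attempt {attempt}: mismatch = {mismatch:.2f}")
```

Across attempts, the mismatch shrinks toward the noise floor, mirroring how repeated overt attempts, guided by auditory and somatosensory feedback, refine the motor representation.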
Feedforward system
The feedforward system in the print-to-speech model drives the identification and articulation of well-learned words. For example, when an experienced reader is presented with the visual stimulus flip, a regular word with predictable letter-sound correspondences, the motor representations needed to produce the spoken word “flip” are automatically generated, regardless of whether the word is actually spoken aloud (9). The feedforward system also drives the production of motor representations for words with irregular letter-sound correspondences (eg, yacht). These words cannot be successfully read using letter-by-letter decoding alone and instead enter the system by way of memorisation and partial decoding (10).
Similarly, to read pseudohomophones (nonsense spellings that sound like real words) such as leev, readers again combine visual, phonological and articulatory representations to generate a motor representation.
However, the motor representation for pseudohomophones is familiar and already stored in the feedforward system as the word leave.
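As a companion to the feedback sketch above, the following hypothetical Python snippet contrasts the two routes described in this section: familiar spellings (flip, yacht) retrieve a stored motor plan directly, a pseudohomophone (leev) reaches the stored plan for leave via decoding, and a novel string (wint) falls back to the feedback system. The lexicon contents and function names are assumptions for illustration only.

```python
# Hypothetical feedforward store; contents are illustrative only.
FEEDFORWARD_STORE = {
    "flip": "motor plan for spoken 'flip'",    # regular, well-learned word
    "yacht": "motor plan for spoken 'yacht'",  # irregular; memorised and partially decoded
    "leave": "motor plan for spoken 'leave'",  # also reached by the spelling "leev"
}

def sound_form(letters: str) -> str:
    """Toy stand-in for phonological decoding (letter string -> sound form)."""
    return {"leev": "leave"}.get(letters, letters)

def read(letters: str) -> str:
    # Familiar spelling: the feedforward system supplies the motor plan directly.
    if letters in FEEDFORWARD_STORE:
        return FEEDFORWARD_STORE[letters]
    # Pseudohomophone: decoding yields a sound form whose plan is already stored.
    decoded = sound_form(letters)
    if decoded in FEEDFORWARD_STORE:
        return FEEDFORWARD_STORE[decoded]
    # Novel string: no stored plan, so the feedback system must assemble one.
    return "no stored plan; engage the feedback system"

print(read("flip"))   # feedforward retrieval
print(read("leev"))   # pseudohomophone resolved to the stored plan for "leave"
print(read("wint"))   # novel nonword handled by the feedback route
```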
After several years of targeted instruction, reading shifts from an overt practice to a silent (ie, covert) one. Much evidence suggests that the visual, phonological, articulatory and motor representations required to successfully read aloud remain tightly coupled even after overt production is no longer required (4, 6, 11-13).
Notably, as with the DIVA model of speech production, both the feedforward and feedback systems remain operative in the print-to-speech model even after literacy has been established. In other words, the feedback system continues to contribute actively to the reading process even in highly fluent and experienced readers.
The print-to-speech model and automatised reading
Eventually, the act of reading becomes automatised: the coupling between the sensory components is so tight that successful, automatic reading can unfold even when only a small portion of the information is presented. This automaticity becomes so strong that skilled adult readers cannot stop themselves from reading. In Stroop tasks, for example, individuals are asked to name the colour of the font rather than read the word (eg, the word red printed in blue ink); reading interferes with the task at hand, demonstrating its automaticity (14). Moreover, seeing a single word excites the entire reading circuitry (15, 16).
The print-to-speech model describes how, in typical readers, visual information in the form of printed words is repeatedly paired (via explicit reading instruction) with auditory and somatosensory feedback that results from overt reading, leading to the development of visual, auditory and articulatory representations for words, part-words and letters. These representations are coupled with one another and with a corresponding motor representation that is used to produce the word overtly (read aloud).
In most cases, with repeated exposure and in response to external and internal feedback, reading becomes highly automatised. Automaticity, which is the hallmark of fluent, skilled reading, is possible because visual information is overlaid on established neural systems for speech production. Crucially, this neural system for reading is optimal when visual sensory information is paired with auditory and somatosensory input, resulting in robust sensory associations.
For references, see:
This work is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International.