The DIVA Model of Speech Motor Control

DIVA (Directions Into Velocities of Articulators) is a neural network model of speech motor skill acquisition and speech production. In computer simulations, the model learns to control the movements of a simulated vocal tract in order to produce speech sounds. The model’s neural mappings are tuned during a babbling phase, in which auditory feedback from self-generated speech sounds is used to learn the relationship between motor actions and their acoustic and somatosensory consequences. After learning, the model can produce arbitrary combinations of speech sounds, even in the presence of constraints on the articulators.
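The idea of learning from babbling and then producing sounds under articulator constraints can be illustrated with a deliberately tiny sketch. This is not the DIVA implementation: it assumes a hypothetical linear "vocal tract" with two articulators and one auditory dimension, and every name in it (TRUE_WEIGHTS, auditory_output, babble, produce) is invented for illustration. Babbling estimates the motor-to-auditory mapping with a simple error-correcting (delta) rule, and production converts a desired direction in auditory space into articulator velocities using the pseudoinverse of the learned mapping.

```python
import random

# Hypothetical toy "vocal tract": two articulator positions map linearly onto
# one auditory dimension. The learner never reads these weights directly.
TRUE_WEIGHTS = [0.8, -0.5]


def auditory_output(articulators):
    """Auditory consequence of an articulator configuration (toy linear model)."""
    return sum(w * a for w, a in zip(TRUE_WEIGHTS, articulators))


def babble(n_trials=2000, lr=0.1, seed=0):
    """Estimate the motor-to-auditory mapping from random self-generated movements."""
    rng = random.Random(seed)
    learned = [0.0, 0.0]
    for _ in range(n_trials):
        start = [rng.uniform(-1, 1) for _ in learned]
        delta = [rng.uniform(-1, 1) for _ in learned]  # random motor command
        moved = [s + d for s, d in zip(start, delta)]
        heard_change = auditory_output(moved) - auditory_output(start)
        predicted_change = sum(w * d for w, d in zip(learned, delta))
        error = heard_change - predicted_change
        # Delta rule: nudge the estimate to reduce the prediction error.
        learned = [w + lr * error * d for w, d in zip(learned, delta)]
    return learned


def produce(target, weights, blocked=(), steps=200, gain=0.2):
    """Move toward an auditory target, leaving any blocked articulator fixed."""
    position = [0.0] * len(weights)
    free = [i for i in range(len(weights)) if i not in blocked]
    norm = sum(weights[i] ** 2 for i in free)
    for _ in range(steps):
        direction = target - auditory_output(position)  # desired auditory change
        for i in free:
            # Pseudoinverse of a row vector restricted to the free articulators.
            position[i] += gain * direction * weights[i] / norm
    return position


learned = babble()
unconstrained = produce(0.4, learned)
constrained = produce(0.4, learned, blocked=(0,))  # first articulator held still
print(learned, unconstrained, constrained)
```

In this toy setting both calls to produce reach the same auditory target with different articulator configurations, which is a simple analogue of motor equivalence. The linear mapping, the single auditory dimension, and the learning rule are simplifying assumptions chosen to keep the sketch short.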

A schematic of the model is provided below, and MATLAB/Simulink code implementing the model is available on our Software page. Each block in the diagram corresponds to a hypothesized map of neurons in a particular region of the brain. DIVA provides unified explanations for a number of long-studied speech production phenomena, including motor equivalence, contextual variability, speaking rate effects, anticipatory coarticulation, and carryover coarticulation. Because the model’s components are localized to particular stereotactic coordinates in the brain, it can account for a wide range of neuroimaging studies of speech production.

Schematic of the DIVA model (Guenther, 2016, Chapter 3).