Speech Interaction
Currently, the speech recognition abilities of computers are limited in practice, due to
- inter-individual phonetic differences
- intra-individual phonetic differences (emotions, colds, time of day)
- ambient noise
- continuous speech
- interjections (ahem, hm), breaks, repetitions
Partial remedies
- train the system on speech data of the individual user (speaker-dependent recognition, e.g. in dictation software)
- leave the system in control of the interaction (e.g., by asking questions or telling the user what to say)
- limit what the user can say at each step (e.g., to "yes" and "no")
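The last two remedies can be combined in a single system-directed dialogue step: the system asks a question and only accepts answers from a small, fixed vocabulary. A minimal sketch, where `recognize` is an assumed stand-in for a real speech recognizer returning its best hypothesis as a string:

```python
def ask_constrained(prompt, allowed, recognize):
    """Ask `prompt` and keep re-asking until the recognized word
    is one of `allowed` (e.g. {"yes", "no"})."""
    while True:
        # The system stays in control: it tells the user what to say.
        print(prompt + " Please say " + " or ".join(sorted(allowed)) + ".")
        heard = recognize()  # best hypothesis from the recognizer (assumed)
        if heard in allowed:
            return heard
        print("Sorry, I did not catch that.")

# Usage with a simulated recognizer that first mishears, then succeeds:
answers = iter(["maybe", "yes"])
result = ask_constrained("Confirm order?", {"yes", "no"},
                         lambda: next(answers))  # returns "yes"
```

Restricting the vocabulary at each step shrinks the recognizer's search space, which is why simple telephone menus remain robust even in noisy conditions.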
Some design recommendations for speech dialogues
- verify the correct understanding of every user input
- abbreviate instructions after the user has heard them repeatedly ("tapering")
- allow users to request a human agent (e.g., by saying "agent")
- redirect to a human agent in case of repeated recognition failure
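The four recommendations above can be sketched as one dialogue step. This is an illustrative assumption, not a standard API: `recognize` stands in for a real speech recognizer, and the prompts, the "agent" keyword, and the failure threshold are hypothetical choices.

```python
FULL_PROMPT = "Say the name of the city you want to fly to."
SHORT_PROMPT = "Which city?"  # tapered wording for repeat users
MAX_FAILURES = 3              # assumed threshold before redirecting

def dialogue_step(recognize, heard_before=False):
    # Tapering: shorten the instruction once the user knows it.
    prompt = SHORT_PROMPT if heard_before else FULL_PROMPT
    for _ in range(MAX_FAILURES):
        print(prompt)
        heard = recognize()
        if heard == "agent":          # user may request a human at any time
            return ("human", None)
        if heard:                     # non-empty result: verify understanding
            print(f"Did you say {heard}?")
            if recognize() == "yes":
                return ("ok", heard)
    # Repeated recognition failure: redirect to a human agent.
    return ("human", None)

# Usage with a simulated recognizer: input is heard and then confirmed.
answers = iter(["Berlin", "yes"])
status, city = dialogue_step(lambda: next(answers))  # ("ok", "Berlin")
```

Each recommendation maps to one branch of the loop: confirmation guards against misrecognition, the "agent" check gives the user an escape hatch, and the bounded retry count prevents the dialogue from trapping a frustrated user.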