Speech Interaction
Currently, the speech recognition abilities of computers are limited in practice, due to
- inter-individual phonetic differences
- intra-individual phonetic differences (emotions, colds, time of day)
- ambient noise
- continuous speech
- interjections (ahem, hm), breaks, repetitions
Partial remedies
- train the system on speech data of the individual user (speaker-dependent recognition, e.g. in dictation software)
- leave the system in control of the interaction (e.g., by asking questions or telling the user what to say)
- limit what the user can say at each step (e.g., to "yes" and "no")
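The last two remedies can be combined in a single system-directed dialogue step: the system asks a question and only accepts answers from a small, fixed vocabulary. A minimal sketch, where `recognize` is an assumed stand-in for a real speech recognizer returning its best hypothesis as a string:

```python
def ask_constrained(prompt, allowed, recognize):
    """Ask `prompt` and keep re-asking until the recognized word
    is one of `allowed` (e.g. {"yes", "no"})."""
    while True:
        # The system stays in control: it tells the user what to say.
        print(prompt + " Please say " + " or ".join(sorted(allowed)) + ".")
        heard = recognize()  # best hypothesis from the recognizer (assumed)
        if heard in allowed:
            return heard
        print("Sorry, I did not catch that.")

# Usage with a simulated recognizer that first mishears, then succeeds:
answers = iter(["maybe", "yes"])
result = ask_constrained("Confirm order?", {"yes", "no"},
                         lambda: next(answers))  # returns "yes"
```

Restricting the vocabulary at each step shrinks the recognizer's search space, which is why simple telephone menus remain robust even in noisy conditions.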
Some design recommendations for speech dialogues
- verify the correct understanding of every user input
- abbreviate instructions after the user has heard them repeatedly ("tapering")
- allow users to request a human agent (e.g., by saying "agent")
- redirect to a human agent in case of repeated recognition failure
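The four recommendations above can be sketched as one dialogue step. This is an illustrative assumption, not a standard API: `recognize` stands in for a real speech recognizer, and the prompts, the "agent" keyword, and the failure threshold are hypothetical choices.

```python
FULL_PROMPT = "Say the name of the city you want to fly to."
SHORT_PROMPT = "Which city?"  # tapered wording for repeat users
MAX_FAILURES = 3              # assumed threshold before redirecting

def dialogue_step(recognize, heard_before=False):
    # Tapering: shorten the instruction once the user knows it.
    prompt = SHORT_PROMPT if heard_before else FULL_PROMPT
    for _ in range(MAX_FAILURES):
        print(prompt)
        heard = recognize()
        if heard == "agent":          # user may request a human at any time
            return ("human", None)
        if heard:                     # non-empty result: verify understanding
            print(f"Did you say {heard}?")
            if recognize() == "yes":
                return ("ok", heard)
    # Repeated recognition failure: redirect to a human agent.
    return ("human", None)

# Usage with a simulated recognizer: input is heard and then confirmed.
answers = iter(["Berlin", "yes"])
status, city = dialogue_step(lambda: next(answers))  # ("ok", "Berlin")
```

Each recommendation maps to one branch of the loop: confirmation guards against misrecognition, the "agent" check gives the user an escape hatch, and the bounded retry count prevents the dialogue from trapping a frustrated user.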