The speech signal is remarkably rich. As discussed by Munson, Edwards, and Beckman (2012), a single production of the word cat can index not only the regular semantic features of Felis catus, but also the word's position in the utterance's larger prosodic structure, the speaker's stance toward the topic being discussed, the speaker's intentions for how the word should be interpreted relative to the ongoing discourse, and aspects of the speaker's social identity (such as their gender and sexuality) and emotional state. Humans and automatic speech processing systems alike must be able to unpack these different messages from this complex signal. In this talk, I discuss how different types of information interact in speech production and perception. I give special attention to contrasting typical speakers and listeners with atypical populations, i.e., populations other than native-speaking adults with no history of speech, language, or hearing impairments. Together, the results I present are a call to action for the INTERSPEECH community to consider a broader set of sources of variability when modeling spoken language production and comprehension.
Bibliographic reference. Munson, Benjamin (2013): "On the interaction of social and linguistic factors in phonetic variation in typical and atypical speakers", in INTERSPEECH 2013 (abstract).