13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

A Random, Semantically Appropriate Sentence Generator for Speaker Verification

Jason Lilley (1), Amanda Stent (2), Ilija Zeljkovic (2)

(1) Department of Linguistics and Cognitive Science, University of Delaware, Newark, DE, USA
(2) AT&T Labs - Research, Florham Park, NJ, USA

In this paper, we describe two systems for automatically generating English sentences, and evaluate the suitability of their output for speaker verification. The first system, SUSGen, generates grammatical but semantically anomalous sentences of controlled length, vocabulary and phonetic content. The second system, SASGen, extends SUSGen to generate a greater variety of sentences, most of which are semantically acceptable. We demonstrate that sentences generated by SASGen are significantly more readable and meaningful than those generated by SUSGen. Although sentences generated by SASGen were not judged to be as readable or meaningful as human-generated sentences, the additional control SASGen provides over sentence length, vocabulary and phonetic content makes it more suitable for speaker verification and other voice collection purposes.

Bibliographic reference. Lilley, Jason / Stent, Amanda / Zeljkovic, Ilija (2012): "A random, semantically appropriate sentence generator for speaker verification", In INTERSPEECH-2012, 739-742.